Real-Time Pose-Based Attention Analysis

Built an embedded system that captures live video of a student, runs on-device pose estimation to extract 17 body keypoints, applies attention heuristics (head tilt, drooping, hand-raise), and flags “Attentive” vs. “Not Attentive” in real time, all on a Raspberry Pi.


Overview

The Real-Time Pose-Based Attention Analysis project delivers an end-to-end solution for monitoring student engagement using only a Raspberry Pi 4B+ and a USB camera. The system captures live video, runs on-device pose estimation via a lightweight TensorFlow Lite MoveNet model, applies a set of human-interpretable heuristics to assess attention, and logs all relevant data for future refinement.
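To make the inference stage concrete, here is a minimal sketch of loading the TFLite MoveNet model and extracting keypoints from a single frame. The model filename is a placeholder, and the snippet assumes the `tflite_runtime` package (`tf.lite.Interpreter` exposes the same interface):

```python
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter  # tf.lite.Interpreter also works

interpreter = Interpreter(model_path="movenet_lightning_uint8.tflite")  # placeholder path
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def estimate_pose(frame_bgr):
    """Run MoveNet on one BGR frame; returns a (17, 3) array of (y, x, score)."""
    size = int(input_details[0]["shape"][1])         # e.g. 192 for MoveNet Lightning
    img = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (size, size))
    tensor = np.expand_dims(img, axis=0).astype(input_details[0]["dtype"])
    interpreter.set_tensor(input_details[0]["index"], tensor)
    interpreter.invoke()
    # MoveNet output has shape [1, 1, 17, 3]: normalized y, x and a confidence score
    return interpreter.get_tensor(output_details[0]["index"])[0, 0]
```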

Features

  • Real-Time Inference: Pose estimation (17 keypoints) and attention analysis at ~15 FPS on the Pi
  • Attention Heuristics: Head-tilt, drooping, and hand-raise detection (sketched after this list)
  • On-Device Processing: All computation performed on the Raspberry Pi
  • Lightweight Model: MoveNet, optimized for embedded systems
  • Visual Overlay: Live video annotated with keypoints and attention status
  • Data Logging: Captures and stores attention data for later analysis
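
The heuristics operate on the 17 COCO-order keypoints MoveNet returns. Here is a minimal sketch with illustrative thresholds; the project's tuned values are not published here, so `tilt_deg_max` and `droop_ratio` are assumptions:

```python
import numpy as np

# COCO keypoint indices used by MoveNet
NOSE, L_EYE, R_EYE = 0, 1, 2
L_SHOULDER, R_SHOULDER, L_WRIST, R_WRIST = 5, 6, 9, 10

def attention_status(kp, tilt_deg_max=20.0, droop_ratio=0.15):
    """kp: (17, 3) MoveNet output of normalized (y, x, score) keypoints."""
    # Head tilt: angle of the eye-to-eye line relative to horizontal.
    dy = kp[R_EYE, 0] - kp[L_EYE, 0]
    dx = kp[R_EYE, 1] - kp[L_EYE, 1]
    tilt = abs(np.degrees(np.arctan2(dy, dx)))
    tilt = min(tilt, 180.0 - tilt)                 # fold into [0, 90] degrees

    # Drooping: the nose sinks toward the shoulder line (y grows downward).
    shoulder_y = (kp[L_SHOULDER, 0] + kp[R_SHOULDER, 0]) / 2.0
    drooping = kp[NOSE, 0] > shoulder_y - droop_ratio

    # Hand raise: either wrist above its shoulder (smaller y = higher).
    hand_raised = (kp[L_WRIST, 0] < kp[L_SHOULDER, 0] or
                   kp[R_WRIST, 0] < kp[R_SHOULDER, 0])

    attentive = tilt < tilt_deg_max and not drooping
    return "Attentive" if (attentive or hand_raised) else "Not Attentive"
```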

Technologies Used

Programming Language: Python

Libraries: imutils, OpenCV, TensorFlow Lite

Machine Learning Model: MoveNet

Hardware: Raspberry Pi 4B+, USB Camera


My Role

Architected end-to-end pipeline: video capture → preprocessing → TFLite inference → feature extraction → attention analysis → data logging

Integrated threading (imutils) for smooth frame rates
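
A condensed sketch of how these stages fit together in the threaded main loop; `estimate_pose` and `attention_status` are the hypothetical helpers sketched earlier, and `log_row` is sketched under the next point:

```python
import time
import cv2
from imutils.video import VideoStream

stream = VideoStream(src=0).start()              # threaded USB-camera capture
time.sleep(2.0)                                  # let the camera warm up
try:
    while True:
        frame = stream.read()                    # latest frame, non-blocking
        keypoints = estimate_pose(frame)         # TFLite MoveNet inference
        status = attention_status(keypoints)     # heuristic attention check
        log_row(keypoints, status)               # CSV logging for later training
        cv2.putText(frame, status, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        cv2.imshow("Attention", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
finally:
    stream.stop()
    cv2.destroyAllWindows()
```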

Implemented CSV logging of timestamp, head-tilt, attention status, feedback, and raw keypoints for future model training
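
A minimal sketch of that logging step, matching the columns listed above; the filename and the flattened keypoint layout are assumptions:

```python
import csv
import os
import time

LOG_PATH = "attention_log.csv"                   # hypothetical filename
HEADER = (["timestamp", "head_tilt_deg", "status", "feedback"]
          + [f"kp{i}_{c}" for i in range(17) for c in ("y", "x", "score")])

def log_row(keypoints, status, tilt=None, feedback=""):
    """Append one frame's features and label to the CSV log."""
    write_header = not os.path.exists(LOG_PATH)
    with open(LOG_PATH, "a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(HEADER)              # write the header once
        writer.writerow([time.time(), tilt, status, feedback]
                        + keypoints.flatten().tolist())
```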

Optimized dtype and preprocessing to match the quantized MoveNet model

Challenges and Solutions

Pi-Level Performance: Baseline was ~5 FPS; threaded capture raised it to ~15 FPS.

Model Dtype Mismatch: Switched preprocessing from normalized float32 to raw uint8 to match the quantized TFLite model's input signature (see the sketch below).
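
A sketch of the resulting fix, reusing `interpreter` from the inference snippet above and assuming `resized_rgb` holds the resized RGB frame; whether a float export expects 0-1 normalized input depends on the specific model, so the `else` branch is illustrative:

```python
import numpy as np

detail = interpreter.get_input_details()[0]
batch = np.expand_dims(resized_rgb, axis=0)      # 1 x H x W x 3 RGB frame
if detail["dtype"] == np.uint8:
    tensor = batch.astype(np.uint8)              # raw 0-255 pixels, no scaling
else:
    tensor = batch.astype(np.float32) / 255.0    # float path: normalize
interpreter.set_tensor(detail["index"], tensor)
```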

Heuristic Tuning: Iteratively adjusted tilt/position thresholds on real data to minimize false positives

Outcomes

Prototype Delivered: Complete end-to-end demo running on Pi

Dataset Collected: 10,000+ frames logged with features and labels for future classifier training

Performance Benchmarks: ~15 FPS real-time inference with 17 keypoints per frame

Future Enhancements

Custom Classifier: Train an ML model on the logged data to replace hand-tuned rules

Additional Cues: Add body-lean, eye-gaze, facial-expression analysis

User Interface: Develop a web dashboard for real-time monitoring and analytics

Cloud Integration: Stream data to a server for centralized monitoring and analytics

Repository Link

Explore the code and data in the GitHub repository: GitHub - Real-Time Pose-Based Attention Analysis on Raspberry Pi