Raw Record Source

{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreiggeuwowz35d7g5f42cpotxbwoy27k7lrdeauoprfvmyk3wjli6ha",
    "uri": "at://did:plc:pgryn3ephfd2xgft23qokfzt/app.bsky.feed.post/3mkubvnajq3j2"
  },
  "path": "/t/real-time-exercise-form-analysis-with-mediapipe-looking-for-advice/175699#post_2",
  "publishedAt": "2026-05-02T08:57:48.000Z",
  "site": "https://discuss.huggingface.co",
  "tags": [
    "MediaPipe Pose Landmarker",
    "MediaPipe Pose Landmarker overview",
    "MediaPipe Pose Landmarker Python guide",
    "MediaPipe Pose Landmarker Web guide",
    "MediaPipe Pose Landmarker Android guide",
    "Legacy MediaPipe Pose docs",
    "MediaPipe GitHub repo",
    "Fit3D",
    "Fit3D homepage",
    "Fit3D dataset page",
    "Fit3D code page",
    "Fit3D license/legal page",
    "AIFit CVPR paper",
    "AIFit PDF",
    "Real-Time Fitness Exercise Classification and Counting Using a Bidirectional LSTM",
    "Pose landmark jitter issue",
    "Landmark visibility/presence discussion",
    "Occlusion / hallucinated landmarks issue",
    "Pose accuracy issues with non-standing / rotated poses",
    "MediaPipe Web synchronous detect/detectForVideo performance issue",
    "scikit-learn common pitfalls: data leakage",
    "scikit-learn GroupShuffleSplit",
    "scikit-learn getting started",
    "MediaPipe pose classification and repetition counting guide",
    "ML Kit pose classification guide",
    "Build an AI Fitness Trainer Using MediaPipe for Squat Analysis",
    "AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training",
    "Real-Time Fitness Exercise Classification and Counting Using BiLSTM",
    "BlazePose paper",
    "BlazePose GHUM Holistic",
    "ExercisePoseCorrection",
    "AI Push-Up Trainer",
    "Deadlift posture-correction system",
    "Pose Estimation for Fitness Exercise Analysis",
    "Exercise-Correction",
    "Workout-Trainer",
    "MediaPipe Web Pose Landmarker guide",
    "MediaPipe samples web repo",
    "Pose Landmarker worker sample",
    "@mediapipe"
  ],
  "textContent": "For now, I’ve gathered some existing resources that might be useful:\n\n* * *\n\nYour project idea is feasible, but only if you keep the scope tight.\n\nYou are not really building one system; you are building several systems stacked together:\n\n\n    camera video\n    → pose estimation\n    → landmark cleanup\n    → feature extraction\n    → exercise recognition\n    → repetition segmentation\n    → form feedback\n    → real-time UI\n\n\nFor a one-month course project with three people and no previous full CV-pipeline experience, I would **not** try to build a general-purpose “AI personal trainer” for many exercises. I would build a **narrow, explainable, real-time prototype** that works well for 2 exercises.\n\nMy recommended version of **FormAI**:\n\n\n    Phone/webcam camera\n    → MediaPipe Pose Landmarker\n    → landmark quality checks\n    → normalized joint-angle features\n    → manual exercise mode or lightweight exercise classifier\n    → state-machine repetition counting\n    → exercise-specific form rules\n    → real-time overlay + after-rep feedback\n\n\nThe most important design choice:\n\n> Use ML for exercise recognition or phase recognition, but use rule-based / phase-aware logic for form feedback first.\n\nThat will be easier to build, easier to debug, easier to demo, and easier to explain in your report.\n\n* * *\n\n## 1. Why MediaPipe is a good backbone\n\nMediaPipe Pose Landmarker is a good fit because it detects body landmarks in images/video and outputs both image-coordinate landmarks and 3D world-coordinate landmarks. 
It is designed for tasks like posture analysis and movement categorization.\n\nUseful official links:\n\n  * MediaPipe Pose Landmarker overview\n  * MediaPipe Pose Landmarker Python guide\n  * MediaPipe Pose Landmarker Web guide\n  * MediaPipe Pose Landmarker Android guide\n  * Legacy MediaPipe Pose docs\n  * MediaPipe GitHub repo\n\n\n\nFor a student project, MediaPipe should be treated as the **pose-estimation backend**, not as the full exercise coach. MediaPipe gives you landmarks. Your project still has to decide:\n\n  * which landmarks are reliable,\n  * which angles matter,\n  * where a rep starts and ends,\n  * whether the current exercise phase is valid,\n  * what feedback should be shown,\n  * whether the camera view is acceptable.\n\n\n\nSo the core of your project is not “running MediaPipe.” The core is everything you build **on top of MediaPipe**.\n\n* * *\n\n## 2. The main warning: do not start with a big correct/incorrect classifier\n\nYour original plan is:\n\n\n    MediaPipe keypoints\n    → joint angles\n    → classifier for exercise type\n    → classifier for correct/incorrect form\n\n\nThe exercise-type classifier is reasonable.\n\nThe correct/incorrect classifier is risky.\n\nWhy? Because “incorrect form” is not one thing.\n\nA squat can be wrong because of:\n\n  * not enough depth,\n  * excessive torso lean,\n  * knees caving inward,\n  * left/right asymmetry,\n  * heels lifting,\n  * bad camera angle,\n  * missing landmarks.\n\n\n\nA bicep curl can be wrong because of:\n\n  * partial range of motion,\n  * not extending fully,\n  * elbow drifting,\n  * shoulder swinging,\n  * torso momentum,\n  * occluded wrist/elbow.\n\n\n\nA binary classifier may output:\n\n\n    incorrect\n\n\nBut the user needs something like:\n\n\n    \"Your elbow is drifting forward. 
Keep your upper arm more stable.\"\n\n\nSo instead of training a vague correct/incorrect classifier first, define **specific feedback rules**.\n\nBetter structure:\n\n\n    exercise classifier:\n        squat / bicep_curl / push_up / unknown\n\n    form analyzers:\n        squat_depth\n        squat_torso_lean\n        squat_knee_symmetry\n        curl_range_of_motion\n        curl_elbow_drift\n        curl_shoulder_swing\n\n\nThis is more explainable and easier to evaluate.\n\n* * *\n\n## 3. Best scope for one month\n\nI would build only:\n\n\n    1. Squat\n    2. Bicep curl\n\n\nOptional third exercise:\n\n\n    3. Push-up\n\n\nBut only add push-up if squat and curl already work.\n\n### Why squat + bicep curl?\n\nExercise | Why it is good | What you can detect | Main risk\n---|---|---|---\nSquat | Common, visual, uses lower-body landmarks | depth, torso lean, knee symmetry | camera view matters\nBicep curl | Simpler upper-body motion | range of motion, elbow drift, shoulder swing | wrist/elbow occlusion\nPush-up | Good demo exercise | depth, hip sag, elbow angle | floor/prone pose is harder\n\nA robust 2-exercise system is much better than a weak 8-exercise system.\n\nDo not try to support every exercise. For a course project, “we do two exercises well and discuss how to extend it” is a strong result.\n\n* * *\n\n## 4. Use Fit3D carefully\n\nFit3D is a strong dataset choice. The dataset includes exercise videos, multiple camera views, 3D skeletons, meshes, exercise labels, and repetition information. 
The Fit3D homepage describes the broader AIFit system and dataset context.\n\nAlso read:\n\n  * Fit3D dataset page\n  * Fit3D homepage\n  * Fit3D code page\n  * Fit3D license/legal page\n  * AIFit CVPR paper\n  * AIFit PDF\n\n\n\nFit3D is useful for:\n\n\n    exercise labels\n    repetition intervals\n    multi-view exercise videos\n    3D skeleton reference data\n    offline experiments\n\n\nBut do not assume it directly gives you all the form-error labels you want, such as:\n\n\n    squat_not_deep_enough\n    squat_knees_caving\n    curl_elbow_drift\n    pushup_hip_sag\n\n\nYou may need to define those errors yourself with rules.\n\n### Important Fit3D advice\n\nTrain and test your deployed pipeline using **MediaPipe-extracted landmarks from Fit3D RGB videos**, not only Fit3D’s clean 3D ground-truth skeletons.\n\nUse:\n\n\n    Fit3D RGB video\n    → MediaPipe Pose\n    → MediaPipe landmarks\n    → joint-angle features\n    → classifier/rules\n\n\nDo not only use:\n\n\n    Fit3D ground-truth skeleton\n    → classifier\n\n\nWhy?\n\nBecause your live app will receive noisy MediaPipe predictions, not perfect motion-capture skeletons. MediaPipe can have jitter, missing landmarks, occlusion errors, left/right confusion, and depth instability. Your training/evaluation features should resemble your real deployment features.\n\n* * *\n\n## 5. Good final architecture\n\nA clean architecture:\n\n\n    1. Camera input\n    2. MediaPipe Pose Landmarker\n    3. Landmark quality checker\n    4. Pose normalization\n    5. Feature extraction\n    6. Temporal smoothing\n    7. Exercise module\n    8. Rep phase detector\n    9. Form feedback rules\n    10. 
UI overlay + logging\n\n\nEach part should be testable independently.\n\nExample directory structure:\n\n\n    formai/\n      app/\n        webcam_demo.py\n        overlay.py\n      pose/\n        mediapipe_runner.py\n        landmark_utils.py\n        quality_checks.py\n      features/\n        angles.py\n        normalization.py\n        window_features.py\n      exercises/\n        squat.py\n        bicep_curl.py\n        pushup.py\n      ml/\n        train_exercise_classifier.py\n        evaluate.py\n      data/\n        process_fit3d.py\n      configs/\n        exercises.yaml\n\n\n* * *\n\n## 6. Pose normalization\n\nDo not feed raw pixel coordinates directly into your classifier.\n\nRaw coordinates depend on:\n\n  * camera distance,\n  * user height,\n  * frame resolution,\n  * where the person stands,\n  * crop/zoom,\n  * phone orientation.\n\n\n\nNormalize landmarks.\n\nBasic normalization:\n\n\n    hip_center = midpoint(left_hip, right_hip)\n    shoulder_center = midpoint(left_shoulder, right_shoulder)\n    scale = distance(hip_center, shoulder_center)\n    normalized_landmark = (landmark - hip_center) / scale\n\n\nThis makes the pose representation less sensitive to body size and camera distance.\n\n* * *\n\n## 7. 
Joint-angle features\n\nJoint angles are a very good starting point because they are:\n\n  * interpretable,\n  * fast,\n  * simple,\n  * easy to debug,\n  * easy to explain in a report.\n\n\n\nFor a joint angle:\n\n\n    A — B — C\n\n\nThe angle is at point `B`.\n\nExample:\n\n\n    hip → knee → ankle = knee angle\n    shoulder → elbow → wrist = elbow angle\n    shoulder → hip → knee = hip angle\n\n\n### Squat features\n\nFeature | Landmarks | Why useful\n---|---|---\nKnee angle | hip-knee-ankle | squat depth and phase\nHip angle | shoulder-hip-knee | hip hinge / lower-body pattern\nTorso angle | shoulder center to hip center | forward lean\nHip vertical movement | hip center y | depth\nLeft/right knee difference | left knee angle vs right knee angle | asymmetry\nKnee tracking | knee vs ankle/hip x-position | knee cave, front view only\n\n### Bicep curl features\n\nFeature | Landmarks | Why useful\n---|---|---\nElbow angle | shoulder-elbow-wrist | rep phase and range\nElbow drift | elbow relative to shoulder/torso | upper-arm stability\nShoulder motion | shoulder/upper-arm movement | swinging/cheating\nWrist path | wrist relative to elbow | curl motion\nLeft/right elbow difference | both elbows | symmetry if two-arm curl\n\n* * *\n\n## 8. Use temporal windows, not single frames\n\nExercise is motion. A single frame is often ambiguous.\n\nA squat, lunge, and deadlift can look similar in one frame. A curl midpoint can look like many other arm motions. Use time.\n\nA useful paper here is Real-Time Fitness Exercise Classification and Counting Using a Bidirectional LSTM, which uses temporal pose features over frame sequences. 
You do not need to implement BiLSTM first, but the principle is important:\n\n\n    use sequences/windows, not isolated frames\n\n\nSimple temporal-window features:\n\n\n    window = last 30 frames\n\n    for each angle:\n        mean\n        min\n        max\n        range\n        standard deviation\n        velocity\n\n\nFor squat:\n\n\n    knee_angle_min\n    knee_angle_max\n    knee_angle_range\n    hip_y_range\n    torso_angle_max\n    left_right_knee_difference_mean\n\n\nFor curl:\n\n\n    elbow_angle_min\n    elbow_angle_max\n    elbow_angle_range\n    elbow_velocity_mean\n    elbow_position_drift\n    shoulder_angle_range\n\n\nThis gives you motion information without needing a deep temporal model.\n\n* * *\n\n## 9. Repetition counting: use a state machine\n\nDo not count reps like this:\n\n\n    if knee_angle < 100:\n        count += 1\n\n\nThat will overcount.\n\nUse a state machine.\n\n### Squat state machine\n\n\n    standing\n    → descending\n    → bottom\n    → ascending\n    → standing\n\n\nSimple version:\n\n\n    if state == \"standing\" and knee_angle < down_threshold:\n        state = \"bottom\"\n\n    if state == \"bottom\" and knee_angle > up_threshold:\n        count += 1\n        state = \"standing\"\n\n\nBetter version:\n\n\n    require threshold for several frames\n    require valid landmarks\n    require minimum time between reps\n    ignore low-quality frames\n    smooth angles before decisions\n\n\n### Bicep curl state machine\n\n\n    extended\n    → curling_up\n    → top\n    → lowering\n    → extended\n\n\nExample thresholds:\n\n\n    extended: elbow_angle > 150°\n    top: elbow_angle < 60°\n    rep: extended → top → extended\n\n\nDo not treat those numbers as universal. Use them as starting points and tune them from your videos.\n\n* * *\n\n## 10. 
Form feedback should be rep-level, not frame-level\n\nFrame-level feedback is noisy.\n\nBad:\n\n\n    Frame 101: torso bad\n    Frame 102: torso okay\n    Frame 103: torso bad\n    Frame 104: torso okay\n\n\nBetter:\n\n\n    Rep 3:\n    - depth: too shallow\n    - torso lean: acceptable\n    - symmetry: acceptable\n\n    Feedback:\n    \"Rep 3: try to squat slightly deeper.\"\n\n\nRecommended feedback behavior:\n\n\n    During the rep:\n        show light live cues\n\n    After the rep:\n        show one main correction\n\n\nUse cooldowns:\n\n\n    Do not repeat the same feedback every frame.\n    Only show a warning if the issue persists for N frames or appears in a significant part of the rep.\n\n\n* * *\n\n## 11. Suggested rules\n\n### Squat rules\n\nUse side view first.\n\nRequired landmarks:\n\n\n    shoulders\n    hips\n    knees\n    ankles\n\n\nMain rep signal:\n\n\n    average knee angle\n    or hip height relative to knee\n\n\nRules:\n\nRule | Signal | Feedback\n---|---|---\nNot deep enough | min knee angle or hip/knee height | “Try to squat deeper.”\nTorso leaning too far | torso angle during descent/bottom | “Keep your chest more upright.”\nLeft/right asymmetry | difference between left and right knee angles | “Try to move both legs evenly.”\nKnee cave | knee position relative to ankle/hip line | “Avoid letting your knees collapse inward.”\n\nImportant: knee-cave detection is mainly a **front-view** problem. Squat depth and torso lean are mainly **side-view** problems. 
Do not claim all squat errors can be detected from any camera angle.\n\n### Bicep curl rules\n\nRequired landmarks:\n\n\n    shoulder\n    elbow\n    wrist\n    hip/torso reference\n\n\nMain rep signal:\n\n\n    elbow angle\n\n\nRules:\n\nRule | Signal | Feedback\n---|---|---\nIncomplete curl | min elbow angle too large | “Curl higher.”\nIncomplete extension | max elbow angle too small | “Extend your arm more at the bottom.”\nElbow drift | elbow moves relative to shoulder/torso | “Keep your elbow stable.”\nShoulder swing | shoulder/upper arm moves too much | “Avoid swinging your shoulder.”\n\n* * *\n\n## 12. Camera-view constraints are not optional\n\nA phone camera cannot reliably detect every form issue from every angle.\n\nIssue | Best camera view\n---|---\nSquat depth | side view\nSquat torso lean | side view\nKnees caving inward | front view\nBicep curl elbow drift | side or front upper-body view\nBicep curl shoulder swing | side view\nPush-up hip sag | side view\nPush-up elbow flare | front/diagonal view\n\nYour app should guide the user:\n\n\n    \"For squat depth and torso analysis, place the camera to your side.\"\n    \"For knee-cave analysis, use a front view.\"\n    \"For bicep curls, keep shoulder, elbow, and wrist visible.\"\n\n\nThis makes the system more honest and more reliable.\n\n* * *\n\n## 13. 
Landmark quality checks\n\nBefore giving form feedback, check that the pose is usable.\n\nQuality checks:\n\n\n    one person detected\n    required landmarks visible\n    full body inside frame\n    landmarks inside image bounds\n    limb lengths reasonable\n    angle changes not physically impossible\n    landmarks stable for several frames\n    camera view suitable for selected exercise\n\n\nIf the input is bad, do not say “bad form.” Say:\n\n\n    \"Move farther from the camera.\"\n    \"Make sure your full body is visible.\"\n    \"Improve lighting.\"\n    \"Use side view for squat analysis.\"\n    \"Only one person should be in frame.\"\n\n\nUseful MediaPipe issue links for real-world pitfalls:\n\n  * Pose landmark jitter issue\n  * Landmark visibility/presence discussion\n  * Occlusion / hallucinated landmarks issue\n  * Pose accuracy issues with non-standing / rotated poses\n  * MediaPipe Web synchronous detect/detectForVideo performance issue\n\n\n\nTakeaway: a robust app needs **input-quality warnings**, not only form warnings.\n\n* * *\n\n## 14. 
Evaluation: avoid fake accuracy\n\nDo not randomly split frames.\n\nBad:\n\n\n    frame 1 from video A → train\n    frame 2 from video A → test\n    frame 3 from video A → train\n    frame 4 from video A → test\n\n\nThis leaks information because neighboring frames are nearly identical.\n\nBetter:\n\n\n    recording A → train\n    recording B → test\n\n\nBest:\n\n\n    subject A/B/C → train\n    subject D → test\n\n\nRead:\n\n  * scikit-learn common pitfalls: data leakage\n  * scikit-learn GroupShuffleSplit\n  * scikit-learn getting started\n\n\n\nUse `GroupShuffleSplit` or `GroupKFold` with:\n\n\n    group = subject_id\n\n\nor:\n\n\n    group = recording_id\n\n\nReport multiple metrics:\n\nComponent | Metric\n---|---\nExercise classifier | accuracy, macro F1, confusion matrix\nRep counter | absolute count error\nForm rules | manual agreement on selected clips\nRuntime | FPS, average latency\nRobustness | failure cases by lighting/camera/occlusion\n\nDo not only report:\n\n\n    accuracy = 98%\n\n\nReport:\n\n\n    exercise classifier macro F1\n    rep-counting error\n    FPS\n    failure cases\n\n\nThat will make your report much more credible.\n\n* * *\n\n## 15. Similar projects worth studying\n\n### Official / high-value guides\n\n  * MediaPipe pose classification and repetition counting guide\nVery relevant. 
Shows pose classification and repetition counting with push-ups/squats using k-NN.\n\n  * ML Kit pose classification guide\nUseful mobile-oriented explanation of pose classification and rep counting.\n\n  * Build an AI Fitness Trainer Using MediaPipe for Squat Analysis\nPractical squat-focused MediaPipe example with feedback logic.\n\n\n\n\n### Research references\n\n  * AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training\nClosest research-grade version of your idea: 3D pose, rep segmentation, trainer/trainee comparison, interpretable feedback.\n\n  * AIFit PDF\n\n  * Real-Time Fitness Exercise Classification and Counting Using BiLSTM\nGood for the idea that temporal windows matter.\n\n  * BlazePose paper\nBackground on the pose-estimation model family behind MediaPipe Pose.\n\n  * BlazePose GHUM Holistic\nUseful for 3D landmark / on-device pose-estimation background.\n\n\n\n\n### GitHub projects\n\n  * ExercisePoseCorrection\nPush-ups, squats, and bicep curls with real-time form feedback.\n\n  * AI Push-Up Trainer\nGood example of using a state machine before counting reps.\n\n  * Deadlift posture-correction system\nUseful example of stage-based feedback: setup, lifting, lockout.\n\n  * Pose Estimation for Fitness Exercise Analysis\nUses MediaPipe + scikit-learn for exercise phase classification, rep counting, and quality assessment.\n\n  * Exercise-Correction\nUseful for stage-dependent exercise error logic.\n\n  * Workout-Trainer\nGood example of exercise-specific metrics such as elbow flexion, shoulder drift, squat depth, and chest angle.\n\n\n\n\nUse GitHub projects as engineering references, not as proof that the problem is solved. Many hobby/student projects have weak evaluation.\n\n* * *\n\n## 16. 
Recommended tech stack\n\n### Fastest prototype\n\n\n    Python\n    OpenCV\n    MediaPipe\n    NumPy\n    Pandas\n    scikit-learn\n    Matplotlib\n    Joblib\n\n\nUse this for:\n\n\n    video reading\n    pose extraction\n    angle calculation\n    CSV generation\n    model training\n    evaluation plots\n    debugging\n\n\n### Optional web demo\n\n\n    JavaScript / TypeScript\n    @mediapipe/tasks-vision\n    HTML Canvas\n    Web Worker\n\n\nRelevant links:\n\n  * MediaPipe Web Pose Landmarker guide\n  * MediaPipe samples web repo\n  * Pose Landmarker worker sample\n\n\n\nThe web version is good if you want a phone-browser demo, but be careful with performance. Pose detection in the browser can block the main thread unless you throttle it or run it in a worker.\n\n### Optional Android demo\n\n\n    Kotlin or Java\n    MediaPipe Tasks Android\n    CameraX\n    Canvas overlay\n\n\nOnly do Android if someone on your team already knows Android.\n\n* * *\n\n## 17. Suggested one-month plan\n\n### Week 1: Pose pipeline\n\nGoal:\n\n\n    camera/video\n    → MediaPipe\n    → landmarks\n    → joint angles\n\n\nDeliverables:\n\n  * webcam or video input,\n  * pose overlay,\n  * angle calculation,\n  * CSV export,\n  * basic landmark quality checks.\n\n\n\nDo not start with a complex model yet.\n\n### Week 2: One exercise end-to-end\n\nGoal:\n\n\n    squat works from camera to feedback\n\n\nDeliverables:\n\n  * squat rep counter,\n  * squat state machine,\n  * 2–3 squat feedback rules,\n  * angle smoothing,\n  * after-rep feedback.\n\n\n\nAt the end of Week 2, you should already have a demo.\n\n### Week 3: Second exercise + classifier\n\nGoal:\n\n\n    bicep curl support + exercise classifier baseline\n\n\nDeliverables:\n\n  * curl rep counter,\n  * curl feedback rules,\n  * Fit3D subset processing,\n  * exercise classifier baseline,\n  * confusion matrix.\n\n\n\nStart with:\n\n\n    Random Forest\n    SVM\n    k-NN\n    Logistic Regression\n\n\nDo not start with LSTM 
unless the simple pipeline is already working.\n\n### Week 4: Integration and polish\n\nGoal:\n\n\n    stable final demo + honest evaluation\n\n\nDeliverables:\n\n  * clean UI,\n  * final demo video,\n  * FPS measurement,\n  * evaluation metrics,\n  * failure cases,\n  * report,\n  * presentation.\n\n\n\n* * *\n\n## 18. Team division\n\nWith three teammates, divide the work like this.\n\n### Teammate 1: Real-time pipeline\n\nResponsibilities:\n\n\n    camera input\n    MediaPipe setup\n    landmark drawing\n    FPS/latency measurement\n    UI overlay\n\n\nDeliverables:\n\n\n    live skeleton demo\n    real-time angle display\n    recorded demo video\n\n\n### Teammate 2: Dataset and ML\n\nResponsibilities:\n\n\n    Fit3D subset\n    landmark extraction\n    feature CSV\n    exercise classifier\n    train/test split\n    evaluation\n\n\nDeliverables:\n\n\n    features.csv\n    trained classifier\n    classification report\n    confusion matrix\n\n\n### Teammate 3: Rep counting and feedback\n\nResponsibilities:\n\n\n    angle logic\n    state machines\n    squat rules\n    curl rules\n    feedback messages\n    failure-case documentation\n\n\nDeliverables:\n\n\n    rep counter\n    form analyzer\n    feedback engine\n    rule documentation\n\n\nThis gives everyone a clear subsystem.\n\n* * *\n\n## 19. What your final report should say\n\nAvoid saying only:\n\n\n    We used MediaPipe and trained a classifier.\n\n\nSay something like:\n\n\n    We built a modular real-time exercise-form analysis pipeline. MediaPipe Pose Landmarker was used to extract body landmarks from camera/video input. We normalized landmarks, computed interpretable joint-angle features, smoothed temporal signals, counted repetitions with exercise-specific state machines, and generated corrective feedback from phase-aware rules. 
We evaluated exercise classification with a subject/video-level split and measured rep-counting accuracy, runtime FPS, and common failure cases.\n\n\nSuggested report structure:\n\n\n    1. Introduction\n       - problem\n       - motivation\n       - goal\n\n    2. Background\n       - human pose estimation\n       - MediaPipe Pose\n       - exercise classification\n       - form feedback\n\n    3. Dataset\n       - Fit3D overview\n       - selected exercises\n       - preprocessing\n       - train/test split\n\n    4. Method\n       - pose extraction\n       - landmark normalization\n       - joint-angle features\n       - temporal smoothing\n       - rep state machine\n       - form-feedback rules\n       - exercise classifier\n\n    5. Implementation\n       - real-time pipeline\n       - UI\n       - performance considerations\n\n    6. Experiments\n       - exercise classification results\n       - rep-counting results\n       - runtime FPS\n       - failure cases\n\n    7. Discussion\n       - limitations\n       - camera-view constraints\n       - dataset limitations\n       - future work\n\n    8. Conclusion\n\n\n* * *\n\n## 20. 
Pitfalls to avoid\n\n### Pitfall 1: Too many exercises\n\nBad:\n\n\n    We support 12 exercises.\n\n\nBetter:\n\n\n    We support 2 exercises robustly and explainably.\n\n\n### Pitfall 2: Binary correct/incorrect form\n\nBad:\n\n\n    The model says correct or incorrect.\n\n\nBetter:\n\n\n    The system detects specific issues:\n    - shallow squat\n    - excessive torso lean\n    - elbow drift\n    - partial curl\n\n\n### Pitfall 3: Frame-level random split\n\nBad:\n\n\n    random train_test_split over all frames\n\n\nBetter:\n\n\n    split by subject_id or recording_id\n\n\n### Pitfall 4: No camera setup\n\nBad:\n\n\n    Analyze from any camera angle.\n\n\nBetter:\n\n\n    Use side view for squat depth and torso lean.\n    Use front/side upper-body view for bicep curls.\n\n\n### Pitfall 5: No smoothing\n\nBad:\n\n\n    one-frame warning\n\n\nBetter:\n\n\n    warning only after the condition persists across several frames or across a rep\n\n\n### Pitfall 6: Overusing z-depth\n\nBad:\n\n\n    precise 3D biomechanics from one phone camera\n\n\nBetter:\n\n\n    2D angle features with constrained camera view; optional 3D/world features for experiments\n\n\n### Pitfall 7: Overclaiming safety\n\nAvoid:\n\n\n    prevents injuries\n    guarantees safe form\n    replaces a trainer\n\n\nSay:\n\n\n    provides basic real-time feedback on visible form deviations\n\n\n* * *\n\n## 21. 
My final recommended FormAI MVP\n\n\n    Input:\n    - phone/webcam video\n\n    Pose:\n    - MediaPipe Pose Landmarker\n\n    Exercises:\n    - squat\n    - bicep curl\n\n    Features:\n    - normalized landmarks\n    - joint angles\n    - temporal window statistics\n\n    Exercise recognition:\n    - manual exercise selection for demo\n    - optional Random Forest/k-NN/SVM classifier for experiment\n\n    Rep counting:\n    - state-machine based\n\n    Feedback:\n    - squat depth\n    - squat torso lean\n    - squat asymmetry\n    - curl range of motion\n    - curl elbow drift\n    - curl shoulder swing\n\n    Evaluation:\n    - Fit3D subset\n    - subject/video-level split\n    - confusion matrix\n    - rep-counting error\n    - FPS\n    - failure cases\n\n\n* * *\n\n## Short summary\n\n  * The project is feasible if you narrow the scope.\n  * Use MediaPipe for pose landmarks, not for the whole coaching logic.\n  * Use Fit3D for exercise videos, rep intervals, and offline experiments.\n  * Train on MediaPipe-extracted landmarks from Fit3D videos, not only clean ground-truth skeletons.\n  * Use ML for exercise recognition or phase recognition.\n  * Use rules for form feedback first.\n  * Use state machines for rep counting.\n  * Use temporal smoothing and rep-level feedback.\n  * Split train/test by subject or recording, not by frame.\n  * Build squat + bicep curl well before adding anything else.\n  * Make the final claim modest: FormAI gives basic real-time feedback on visible form deviations; it does not replace a trainer or guarantee injury prevention.\n\n",
  "title": "Real-time exercise form analysis with MediaPipe, looking for advice"
}