To be more active during working hours, I decided to create an app that disables my keyboard and mouse after a while and forces me to do some squats. This blog post shows how to build it yourself using Arduino, OpenCV, and MediaPipe.
Materials needed
To replicate this project you will need the following materials:
- USB webcam
- USB hub that you can modify (cut the 5 V line and splice the relay in)
- Arduino microcontroller with USB serial communication
- 5 V relay module
- 10 kΩ resistor
- Pushbutton (optional)
- Soldering iron, wires, and other miscellaneous supplies
Assembly
Once you have the materials, modify the USB hub first. Open up the hub and cut the 5 V wire that comes from the computer. Then attach an additional wire to each cut end so that both can be screwed into the COM and NC (normally closed) terminals of the relay module. Wired this way, the hub stays powered while the relay is off.
When the USB hub is done, wire the Arduino and the other components following this diagram, and flash the code located on GitHub.

Working principles
The whole package works in a simple way. When the Python app starts, it starts a timer. By default the timer runs for 45 minutes; when it expires, the app switches on the relay, which cuts power to the USB hub and turns off all connected devices. The app then turns on the webcam and starts feeding MediaPipe with the captured frames. Once MediaPipe has received a frame and detected the joints (landmarks) of the person in it, the app calculates the angles of the knee joints.
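The timer part of this flow can be sketched as follows. This is only a minimal illustration, not the code from the repository: the class name `WorkTimer` and the injectable `clock` parameter are assumptions made here so the logic can be tried without waiting 45 minutes.

```python
import time


class WorkTimer:
    """Tracks whether the work interval has elapsed (hypothetical helper)."""

    def __init__(self, interval_s=45 * 60, clock=time.monotonic):
        self.interval_s = interval_s  # 45 minutes by default, as in the post
        self._clock = clock           # injectable so the timer can be tested
        self._started = clock()

    def expired(self):
        # True once at least interval_s seconds have passed since the last reset
        return self._clock() - self._started >= self.interval_s

    def reset(self):
        # Call this after a squat session to start the next work interval
        self._started = self._clock()
```

In a main loop you would poll `expired()`, trigger the relay and the squat session when it returns `True`, and call `reset()` afterwards. `time.monotonic` is used instead of `time.time` so that changes to the system clock do not shorten or lengthen the interval.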
The app assumes that the person is squatting if both knee angles are <= 90 degrees, and standing if the angle is >= 175 degrees.
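To make the thresholds concrete, here is a rough sketch of how a knee angle could be computed from three points (hip, knee, ankle) and how squats could be counted from it. The function and class names are hypothetical, and requiring both knees to be >= 175 degrees for "standing" is an assumption; the real app may do this differently.

```python
import math

SQUAT_MAX_DEG = 90    # both knees at or below this: squatting (from the post)
STAND_MIN_DEG = 175   # both knees at or above this: standing (assumed to need both)


def joint_angle(a, b, c):
    """Angle in degrees at point b between segments b->a and b->c.

    Each point is an (x, y) tuple, e.g. hip, knee, ankle.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return None  # degenerate case: two points coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


def classify(left_deg, right_deg):
    """Return "squat", "stand", or None for anything in between."""
    if left_deg <= SQUAT_MAX_DEG and right_deg <= SQUAT_MAX_DEG:
        return "squat"
    if left_deg >= STAND_MIN_DEG and right_deg >= STAND_MIN_DEG:
        return "stand"
    return None


class SquatCounter:
    """Counts one squat for every squat -> stand transition."""

    def __init__(self):
        self.count = 0
        self._down = False

    def update(self, posture):
        if posture == "squat":
            self._down = True
        elif posture == "stand" and self._down:
            self.count += 1
            self._down = False
        return self.count
```

The gap between the two thresholds acts as a dead zone: frames where the knees are somewhere between 90 and 175 degrees change nothing, so a jittery angle near one threshold cannot be counted as several squats.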
If the user wants to skip the current squat session, they can press the button attached to the Arduino, but then the next squat session will have more squats to do.
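The skip penalty could be modelled along these lines. The post does not say how many squats a session has or how much a skip adds, so the numbers below (10 squats, 5 extra per skip) and the `SquatPlanner` class are purely illustrative assumptions, as is the choice to clear the penalty once a session is completed.

```python
class SquatPlanner:
    """Decides how many squats the next session needs (hypothetical helper)."""

    def __init__(self, base=10, penalty=5):
        self.base = base        # assumed squats per normal session
        self.penalty = penalty  # assumed extra squats per skipped session
        self.skips = 0

    def target(self):
        # Every skipped session adds `penalty` squats to the next one
        return self.base + self.skips * self.penalty

    def skip(self):
        # Called when the button on the Arduino is pressed
        self.skips += 1

    def complete(self):
        # Called when the user finishes a session; clears the penalty
        self.skips = 0
```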
Python code explanation
This section simplifies the whole code and focuses only on image capture and pose estimation. If you want the full picture, you can go and look at the code on GitHub.
First we need to retrieve the image from the camera. We will do that with OpenCV. The code to get the image looks something like this:
import cv2
cap = cv2.VideoCapture(0)  # camera index; if you have multiple cameras you might need to increase this
while True:
    ret, image = cap.read()  # read a frame from the camera
    if not ret:  # stop if no frame could be read
        break
    cv2.imshow("Image", image)  # show the image
    if cv2.waitKey(1) & 0xFF == ord('q'):  # quit when the 'q' key is pressed
        break
cap.release()  # release the camera
cv2.destroyAllWindows()  # close all windows
Once we can receive images, we can start processing them with MediaPipe. Before we do that, the image needs to be converted from the BGR to the RGB color space, because OpenCV delivers frames in BGR order while MediaPipe expects RGB. To do that, simply call the following function:
imageRGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
Once the image has been converted, we can add the additional imports and get pose estimates from MediaPipe like so:
import mediapipe as mp
mpPose = mp.solutions.pose
pose = mpPose.Pose()  # create a new instance of the pose detector
mpDraw = mp.solutions.drawing_utils  # drawing utilities for the overlay, so we can see whether detection works
...
# insert the code below inside the while loop, right after the image capture and conversion
results = pose.process(imageRGB)  # find all the landmarks visible in the image
if results.pose_landmarks:  # check whether any landmarks were detected
    mpDraw.draw_landmarks(image, results.pose_landmarks, mpPose.POSE_CONNECTIONS)  # draw the landmark overlay
To get the individual landmarks (joints) from the result, we can iterate over the landmarks returned by MediaPipe like so:
for id, lm in enumerate(results.pose_landmarks.landmark):
    h, w, c = image.shape
    cx, cy = int(lm.x * w), int(lm.y * h)  # landmark coordinates are normalized, so scale them to pixels
    print(id, cx, cy)
    if id == mpPose.PoseLandmark.NOSE:
        cv2.circle(image, (cx, cy), 15, (255, 0, 255), cv2.FILLED)  # draw a filled circle on the nose
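The same landmark list can be used to get the knee angles mentioned earlier. The sketch below is an assumption about how that step could look, not the code from the repository. The index constants match MediaPipe's `PoseLandmark` enum (hips 23/24, knees 25/26, ankles 27/28); the helper names and the stand-in landmarks at the end are made up for illustration.

```python
import math
from types import SimpleNamespace

# Indices from MediaPipe's PoseLandmark enum (33-landmark pose model)
LEFT_HIP, RIGHT_HIP = 23, 24
LEFT_KNEE, RIGHT_KNEE = 25, 26
LEFT_ANKLE, RIGHT_ANKLE = 27, 28


def to_pixels(lm, width, height):
    # Landmarks are normalized to [0, 1]; scale them so that angles are not
    # distorted when the image is not square
    return lm.x * width, lm.y * height


def angle_at(a, b, c):
    """Angle in degrees at b between b->a and b->c, folded into [0, 180]."""
    ang = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    ang = abs(ang) % 360
    return 360 - ang if ang > 180 else ang


def knee_angles(landmarks, width, height):
    """Return (left, right) knee angles from a sequence of pose landmarks."""
    pts = [to_pixels(lm, width, height) for lm in landmarks]
    left = angle_at(pts[LEFT_HIP], pts[LEFT_KNEE], pts[LEFT_ANKLE])
    right = angle_at(pts[RIGHT_HIP], pts[RIGHT_KNEE], pts[RIGHT_ANKLE])
    return left, right


# Stand-in landmarks for trying this out without a camera:
# a straight left leg and a right leg bent at a right angle
fake = [SimpleNamespace(x=0.5, y=0.5) for _ in range(33)]
fake[LEFT_HIP] = SimpleNamespace(x=0.4, y=0.2)
fake[LEFT_KNEE] = SimpleNamespace(x=0.4, y=0.5)
fake[LEFT_ANKLE] = SimpleNamespace(x=0.4, y=0.8)
fake[RIGHT_HIP] = SimpleNamespace(x=0.3, y=0.5)
fake[RIGHT_KNEE] = SimpleNamespace(x=0.6, y=0.5)
fake[RIGHT_ANKLE] = SimpleNamespace(x=0.6, y=0.8)
left_deg, right_deg = knee_angles(fake, width=640, height=480)
print(left_deg, right_deg)
```

With a real frame you would call `knee_angles(results.pose_landmarks.landmark, w, h)` inside the `if results.pose_landmarks:` branch.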
If you have followed along this far, the whole code should look like this:
import cv2
import mediapipe as mp
import time
cap = cv2.VideoCapture(0)
mpPose = mp.solutions.pose
pose = mpPose.Pose()
mpDraw = mp.solutions.drawing_utils
pTime = 0
cTime = 0
while True:
    success, image = cap.read()
    if not success:  # stop if no frame could be read
        break
    # get poses
    imageRGB = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = pose.process(imageRGB)
    if results.pose_landmarks:
        for id, lm in enumerate(results.pose_landmarks.landmark):
            h, w, c = image.shape
            cx, cy = int(lm.x * w), int(lm.y * h)
            print(id, cx, cy)
            if id == mpPose.PoseLandmark.NOSE:
                cv2.circle(image, (cx, cy), 15, (255, 0, 255), cv2.FILLED)
        mpDraw.draw_landmarks(image, results.pose_landmarks, mpPose.POSE_CONNECTIONS)
    # calculate and draw frames per second
    cTime = time.time()
    fps = 1/(cTime - pTime)
    pTime = cTime
    cv2.putText(image, str(round(fps)), (10, 70), cv2.FONT_HERSHEY_COMPLEX, 3, (255, 255, 255), 3)
    cv2.imshow("Image", image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# release resources
cap.release()
cv2.destroyAllWindows()
Activity reminder on YouTube
I also made a YouTube video about this a while back that covers the what, the why, and my thought process behind all this. The video is rough in quality, but it is what it is… we all have to learn somehow 😉
Conclusion
This project was really fun. Detecting poses was super easy thanks to MediaPipe and some simple math, but there are glitches: if you stand too close to the camera, some detections might get counted as squats, so the pose estimator may need additional tuning.
As for the usefulness of this activity reminder… It works, but it gets quite annoying when you are in the middle of something and the keyboard gets disabled. I know you can always press the button to skip the squats, but it feels like cheating and my conscience doesn’t allow that. I have been using this for a few months as of the time of writing. I have removed the USB hub from the equation, but I still use the timer functionality, and for some reason I still do squats every time the camera turns on.