A simple Windows UI controller using hand gestures. The program supports the following actions:
- scroll up/down
- swipe left/right
- close/open
- volume up/down
- mouse drag
which allows the user to control actions such as swiping through a photo gallery, scrolling a PDF or a favorite website, closing the currently opened app and opening a new one, dragging the mouse, or adjusting the volume WITHOUT any use of a keyboard or mouse, relying only on the camera and hand gestures.
The project covers aspects of machine vision and image-recognition models, using external Python libraries: MediaPipe Hands for recognising hand landmarks and OpenCV for camera image processing. The hand landmarks are then classified into gestures that are mapped to system actions, allowing keyboardless and mouseless system control.
- Hold your hand in the upper half of the camera frame.
- Extend your index finger upward.
- Move your fingertip smoothly upward in a single, continuous motion.
- Begin the movement above the middle vertical zone of the screen.
- Finish with your fingertip reaching the upper zone to trigger the scroll-up gesture.
- Hold your hand in the upper or middle part of the frame.
- Extend your index finger downward.
- Move your fingertip smoothly downward in one motion.
- Begin the movement below the middle vertical zone.
- Finish with your fingertip reaching the lower zone to trigger the scroll-down gesture.
- Position your index fingertip on the right side of the camera frame.
- Extend your index finger forward.
- Move your fingertip leftwards toward the center of the frame.
- The gesture triggers once the finger crosses into the middle horizontal zone.
- Perform the motion within the allowed time window to complete the swipe.
- Position your index fingertip on the left side of the screen.
- Extend your index finger forward.
- Move your fingertip rightwards toward the center.
- The gesture is detected when the finger enters the middle region.
- Perform the movement smoothly and within the gesture timing limit.
- Fully open your hand and face the palm toward the camera.
- Spread your fingers so that the distances between the fingertips are large and clearly visible.
- Hold the open hand steady—the gesture is detected instantly.
- Make a quick closing movement with your hand (as if forming a loose fist).
- Ensure your ring finger moves close to the wrist.
- Briefly open the hand again.
- Perform a second closing motion quickly.
- When both steps are detected in sequence, the close gesture is triggered.
- Bring your index finger and thumb close together to create a pinch shape.
- Keep the middle, ring, and pinky fingers slightly curled so that the distances between them stay small.
- Once the pinch is recognized, move your hand; the mouse cursor will follow the fingertip.
- Maintaining the pinch continues the drag action.
- Hold your thumb horizontally, aligned with the wrist.
- Keep your hand very still for ~2 seconds to activate volume mode.
- After activation, move your thumb upward to increase volume.
- Move your thumb downward to decrease volume.
- The direction is determined by comparing thumb position to its initial reference point.
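As an illustration, the kind of landmark-based heuristics described above (pinch distances, vertical zones) could be sketched roughly as below; the thresholds and zone boundaries are assumptions, not the project's actual values:

```python
# Illustrative heuristics only; the thresholds and zone boundaries are
# assumptions, not the values used by the project.
import math
import mediapipe as mp

mp_hands = mp.solutions.hands

def landmark_distance(a, b):
    """Euclidean distance between two normalized MediaPipe landmarks."""
    return math.hypot(a.x - b.x, a.y - b.y)

def is_pinch(landmarks, threshold=0.05):
    """Pinch = thumb tip and index fingertip close together (drag gesture)."""
    thumb = landmarks[mp_hands.HandLandmark.THUMB_TIP]
    index = landmarks[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    return landmark_distance(thumb, index) < threshold

def vertical_zone(landmarks):
    """Rough vertical zone of the index fingertip: 'upper', 'middle' or 'lower'.
    MediaPipe y-coordinates are normalized, with 0.0 at the top of the frame."""
    y = landmarks[mp_hands.HandLandmark.INDEX_FINGER_TIP].y
    if y < 0.33:
        return "upper"
    if y > 0.66:
        return "lower"
    return "middle"
```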
pip install opencv-python mediapipe pyautogui
python main.py --debug
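A minimal sketch of how main.py might handle the --debug flag (hypothetical; only the flag itself is documented here):

```python
# Hypothetical flag handling; --debug is the only flag documented above.
import argparse

parser = argparse.ArgumentParser(description="Hand-gesture Windows UI controller")
parser.add_argument("--debug", action="store_true",
                    help="show the camera window, hand landmarks and gesture logs")
args = parser.parse_args()
DEBUG = args.debug  # without the flag, the program runs silently
```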
Debug mode shows:
- camera window
- hand landmarks
- printed gesture logs
Normal mode runs silently, performing system actions.
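The debug overlay could be drawn with OpenCV and MediaPipe's drawing utilities, roughly like the sketch below (the window name and label placement are assumptions):

```python
# Hypothetical debug overlay: draw hand landmarks and the last detected gesture.
import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

def show_debug(frame, hand_landmarks, gesture_name):
    """Draw the hand skeleton and gesture label onto the BGR frame and show it."""
    if hand_landmarks is not None:
        mp_drawing.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
    if gesture_name:
        cv2.putText(frame, gesture_name, (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 255, 0), 2)
        print("gesture:", gesture_name)  # printed gesture log
    cv2.imshow("debug", frame)
    cv2.waitKey(1)
```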
I use the MediaPipe Hands model to recognise hand landmarks in camera frames captured with OpenCV; the landmarks are then processed and classified into gestures.
Pipeline:
- Capture frame using OpenCV
- Detect hand landmarks via MediaPipe
- Classify pose + movement into gesture
- Execute system action using PyAutoGUI
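A minimal sketch of how these four steps might be wired together; classify_gesture() and perform_action() are placeholders for the project's actual classification and action code, not its real function names:

```python
# Minimal sketch of the capture -> detect -> classify -> act loop.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

def classify_gesture(landmarks):
    # Placeholder: the real classifier combines pose, zones and movement timing.
    return None

def perform_action(gesture):
    # Placeholder: the real mapping to PyAutoGUI calls is listed below.
    pass

def run():
    cap = cv2.VideoCapture(0)                               # 1. capture frame (OpenCV)
    with mp_hands.Hands(max_num_hands=1,
                        min_detection_confidence=0.7) as hands:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = hands.process(rgb)                     # 2. detect landmarks (MediaPipe)
            if results.multi_hand_landmarks:
                landmarks = results.multi_hand_landmarks[0].landmark
                gesture = classify_gesture(landmarks)        # 3. pose + movement -> gesture
                if gesture:
                    perform_action(gesture)                  # 4. system action (PyAutoGUI)
    cap.release()
```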
Gesture map:
- scroll gesture -> pyautogui.scroll(y / -y)
- swipe gesture -> pyautogui.hotkey("left" / "right")
- open gesture -> pyautogui.doubleClick()
- close gesture -> pyautogui.hotkey("alt", "f4")
- mouse drag gesture -> pyautogui.moveTo(mouse_coords.x * screen_width, mouse_coords.y * screen_height)
- volume gesture -> pyautogui.press("volumeup" / "volumedown")
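One way to express this mapping is a small dispatch function; the sketch below uses the pyautogui calls listed above, while the gesture names and the scroll step are assumptions:

```python
# Hypothetical dispatch from a classified gesture to a pyautogui action.
import pyautogui

SCROLL_STEP = 300  # assumed scroll amount per gesture

def perform_action(gesture, fingertip=None):
    screen_w, screen_h = pyautogui.size()
    if gesture == "scroll_up":
        pyautogui.scroll(SCROLL_STEP)
    elif gesture == "scroll_down":
        pyautogui.scroll(-SCROLL_STEP)
    elif gesture == "swipe_left":
        pyautogui.hotkey("left")
    elif gesture == "swipe_right":
        pyautogui.hotkey("right")
    elif gesture == "open":
        pyautogui.doubleClick()
    elif gesture == "close":
        pyautogui.hotkey("alt", "f4")
    elif gesture == "drag" and fingertip is not None:
        # fingertip carries normalized MediaPipe coordinates (0..1)
        pyautogui.moveTo(fingertip.x * screen_w, fingertip.y * screen_h)
    elif gesture == "volume_up":
        pyautogui.press("volumeup")
    elif gesture == "volume_down":
        pyautogui.press("volumedown")
```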
This program can execute system-level actions such as Alt + F4, global hotkeys, and mouse control. The author is not responsible for any damage, data loss, or unintended actions caused by using this software. Use at your own risk and test it in a safe environment before everyday use.