Detect Body and Hand Pose with Vision

Written by Cihat Gündüz

Description: Explore how the Vision framework can help your app detect body and hand poses in photos and video. With pose detection, your app can analyze the poses, movements, and gestures of people to offer new video editing possibilities, or to perform action classification when paired with an action classifier built in Create ML. And we’ll show you how you can bring gesture recognition into your app through hand pose, delivering a whole new form of interaction. To understand more about how you might apply body pose for Action Classification, be sure to also watch the "Build an Action Classifier with Create ML" and "Explore the Action & Vision app" sessions. And to learn more about other great features in Vision, check out the "Explore Computer Vision APIs" session.

  • 21 points on the hand are recognized
  • 4 points per finger plus one for the wrist
  • Use VNDetectHumanHandPoseRequest
  • Default maximum hand count is 2
  • Multiple bodies supported, too
  • 5 points for the face: nose, eyes, and ears
  • 3 points per arm
  • 6 points for the torso (shoulder points are shared with the arms)
  • 3 points per leg (hip points are shared with the torso)
  • Can also be used for offline analysis, e.g. of a photo collection
  • Can be combined with Create ML action classification
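A minimal sketch of the hand pose points above, using `VNDetectHumanHandPoseRequest` (available from iOS 14 / macOS 11); the `cgImage` parameter and the confidence threshold are assumptions for illustration:

```swift
import Vision

// Detect hand poses in a still image. Assumes `cgImage` holds the image to analyze.
func detectHandPose(in cgImage: CGImage) throws {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 2 // the default; raise it to detect more hands

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    try handler.perform([request])

    for observation in request.results ?? [] {
        // 21 recognized points per hand: 4 per finger plus one for the wrist.
        let points = try observation.recognizedPoints(.all)
        if let indexTip = points[.indexTip], indexTip.confidence > 0.3 {
            // Locations are normalized (0...1) with a lower-left origin.
            print("Index fingertip at \(indexTip.location)")
        }
    }
}
```

For live video, the same request can be run per frame via `VNImageRequestHandler` on the camera buffer, which is how gesture-driven interactions are built.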
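Body pose works the same way through `VNDetectHumanBodyPoseRequest`, with the joint groups matching the list above (face, torso, arms, legs); again, `cgImage` and the threshold are illustrative assumptions:

```swift
import Vision

// Detect body poses in a still image. Assumes `cgImage` holds the image to analyze.
func detectBodyPose(in cgImage: CGImage) throws {
    let request = VNDetectHumanBodyPoseRequest() // multiple bodies are supported

    let handler = VNImageRequestHandler(cgImage: cgImage, orientation: .up, options: [:])
    try handler.perform([request])

    for observation in request.results ?? [] {
        // Query a single joint group, e.g. the torso (shoulders, neck, hips, root).
        let torso = try observation.recognizedPoints(.torso)
        for (jointName, point) in torso where point.confidence > 0.3 {
            print("\(jointName): \(point.location)")
        }
    }
}
```

The per-frame observations can then be fed into an action classifier built in Create ML for gesture or activity recognition.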

This note was originally published at

Missing anything? Corrections? Contributions are welcome 😃


Written by

Cihat Gündüz

Native iOS & Android developer who loves to share reusable work, like BartyCrouch, AnyLint, HandySwift & more. Founder of Flinesoft, an app company preparing for a "better future".