Upper Body Pose Estimation with Temporal Sequential Forests
In Proceedings British Machine Vision Conference 2014
Abstract
Our objective is to efficiently and accurately estimate human upper body pose in gesture videos. To this end, we build on the recent successful applications of random forest (RF) classifiers and regressors, and develop a pose estimation model with the following novelties: (i) the joints are estimated sequentially, taking account of the human kinematic chain, so we do not have to make the simplifying assumption of most previous RF methods that the joints are estimated independently; (ii) by combining both classifiers (as a mixture of experts) and regressors, we show that the learning problem is tractable and that more context can be taken into account; and (iii) dense optical flow is used to align multiple expert joint position proposals from nearby frames, thereby improving the robustness of the estimates. The resulting method is computationally efficient and can overcome a number of the errors (e.g. confusing left/right hands) made by RF pose estimators that infer joint locations independently. We show that we improve over the state of the art on upper body pose estimation for two public datasets: the BBC TV Signing dataset and the ChaLearn Gesture Recognition dataset.
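The sequential estimation described in (i) can be illustrated with a toy sketch. This is not the paper's implementation: the chain order, the function names, and the fixed-offset "predictors" are all hypothetical stand-ins for the trained forests, chosen only to show how each joint estimate can be conditioned on the previous one instead of being inferred independently.

```python
from typing import Callable, Dict, Tuple

Point = Tuple[float, float]

# Hypothetical chain order; the abstract only states that joints are
# estimated sequentially along the human kinematic chain.
CHAIN = ["head", "shoulder", "elbow", "wrist"]


def estimate_sequentially(
    root: Point,
    predictors: Dict[str, Callable[[Point], Point]],
) -> Dict[str, Point]:
    """Estimate each joint from the previously estimated joint."""
    pose = {CHAIN[0]: root}
    previous = root
    for joint in CHAIN[1:]:
        # Each predictor sees the location of the joint before it in the
        # chain, so estimates are not made independently.
        previous = predictors[joint](previous)
        pose[joint] = previous
    return pose


def fixed_offset(dx: float, dy: float) -> Callable[[Point], Point]:
    """Toy stand-in for a learned predictor: adds a constant offset."""
    return lambda p: (p[0] + dx, p[1] + dy)


# Assumed toy predictors; a real system would use image features here.
toy_predictors = {
    "shoulder": fixed_offset(20.0, 30.0),
    "elbow": fixed_offset(5.0, 40.0),
    "wrist": fixed_offset(-10.0, 35.0),
}

pose = estimate_sequentially((100.0, 50.0), toy_predictors)
```

In this sketch an error in an early joint propagates down the chain, which is one reason a method of this kind might combine several proposals, as in (iii), before committing to an estimate.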
Files
Extended Abstract (PDF, 1 page, 573K)
Paper (PDF, 12 pages, 1.1M)