K. Tanaka

Kindai University (JAPAN)
When sports teachers help their students understand proper form in a target sport, a greater effect is obtained if video of the students' own poses is presented for their observation. A viewpoint-change function is desirable when a learner observes forms in video. Recently, TV sports programs have employed multiple-camera systems, which enable smooth switching of viewpoint. On the other hand, bringing multiple cameras into an ordinary gymnasium is difficult because of cost and operational complexity. The objective of this research is to provide sports teachers with application software that generates a 3D human model of a player (i.e., a virtual player) from a single video camera, thereby enabling observation of the virtual player's poses from any point of view. The recent availability of inexpensive, readily obtainable RGB-D cameras (e.g., Kinect by Microsoft Corp.) has facilitated 3D motion capture from a single view. However, the depth sensors in RGB-D cameras impose strict limits on measurable distance. Therefore, this study has been developing a method that estimates the 3D poses of players in 2D images captured by an ordinary single video camera. For the estimation, the method utilizes geometric constraints on the field of play (e.g., a tennis court) and geometric constraints on the line of sight from the camera to each of the player's joints. As a first step, the study focused on karate teaching and developed a semiautomatic estimation method in which joints in the 2D image are specified manually. Recent studies on joint detection have proposed detection methods based on machine learning; the second step of this study takes advantage of such a method. This paper describes the method developed in the first step.
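To illustrate the line-of-sight constraint mentioned above, the following is a minimal sketch (not the authors' implementation) of one standard building block: a 2D joint location plus a calibrated camera defines a viewing ray, and intersecting that ray with a known plane of the field of play (e.g., the ground on which a foot rests) fixes the joint's 3D position. The projection matrix `P` and the function name are assumptions for illustration only; in practice `P` would be obtained by calibrating against court or floor markings.

```python
import numpy as np

def backproject_to_plane(P, uv, plane_z=0.0):
    """Return the 3D world point on the plane z = plane_z that projects to pixel uv.

    P  : assumed 3x4 camera projection matrix, P = [M | p4]
    uv : (u, v) pixel coordinates of a joint specified in the 2D image
    The pixel defines a line of sight (a ray from the camera centre);
    intersecting that ray with the known ground plane fixes the depth.
    """
    P = np.asarray(P, dtype=float)
    M, p4 = P[:, :3], P[:, 3]
    C = -np.linalg.solve(M, p4)                         # camera centre in world coords
    d = np.linalg.solve(M, np.array([uv[0], uv[1], 1])) # ray direction through the pixel
    t = (plane_z - C[2]) / d[2]                         # ray parameter where z = plane_z
    return C + t * d                                    # 3D point on the plane
```

For joints not touching the ground, the same ray constraint is combined with other geometric information (e.g., known limb lengths) rather than a fixed plane; the sketch shows only the simplest grounded case.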