LiftPose3D: Turning 2D images into 3D models

EPFL scientists have developed a deep learning-based method called LiftPose3D, which can reconstruct 3D animal poses using only 2D poses from one camera. The method could have an impact on neuroscience and bioinspired robotics.

Nik Papageorgiou 20.08.2021

“When people perform experiments in neuroscience they have to make precise measurements of behavior,” says Professor Pavan Ramdya at EPFL’s School of Life Sciences, who led the study. His group has now published a paper in Nature Methods presenting new software that can simplify one of neuroscience’s most crucial yet laborious tasks: capturing 3D models of freely moving animals. The tool allows researchers to study the brain mechanisms that control body movements, and this kind of reverse-engineering of biological behavior has far-reaching applications in robotics and AI.

“In the past, we used a deep neural network to perform this kind of ‘pose estimation’ in animals,” says Ramdya, referring to the process by which a computer can predict the positions of body parts in camera images. “Each camera acquired a single image of an animal, and multiple images across different cameras could then be triangulated to calculate 3-dimensional positions or poses.” But this triangulation of images requires multiple, synchronized cameras and elaborate calibration protocols, making it hard to adopt for neuroscientific studies of small animals.
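To make the multi-camera step concrete, here is a minimal sketch of triangulating one body part from two views. It assumes each calibrated camera has already been reduced to a center and a viewing ray through the detected 2D key point, and it uses the simple midpoint method. The function name `triangulate_midpoint` and its inputs are hypothetical, chosen for illustration, and are not taken from the study's software.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two camera rays (midpoint method).

    c1, c2: camera centers as (x, y, z).
    d1, d2: ray directions from each camera through the detected key point.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(c1, c2)]  # vector from camera 2 to camera 1
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")

    s = (b * e - c * d) / denom  # position along ray 1
    t = (a * e - b * d) / denom  # position along ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Two cameras looking at the same key point from different positions.
point = triangulate_midpoint((0, 0, 0), (1, 2, 3), (10, 0, 0), (-9, 2, 3))
```

The sketch also shows why the approach is demanding: both rays must come from synchronized images, and the camera centers and directions are only known after calibration.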

In 2019, Ramdya’s group introduced DeepFly3D, another deep learning-based software that uses multiple cameras to quantify the movements of a fruit fly in 3D space. Now, the researchers have leapt ahead with LiftPose3D, a neural network that does away with the need for multiple cameras, by being trained to map 2D poses of a freely moving animal into a 3D model.

“The challenges we wanted to overcome here were, first, to reduce the number of cameras needed to perform 3D pose estimation,” says Ramdya. “Second, we wanted to address the problem of occlusions, where one of an animal’s body parts can move in front of another, obstructing the camera’s view and making full triangulation impossible.”

Animals are generally very predictable in their behavioral patterns. If an animal, such as a mouse, performs a certain movement, it is very likely to repeat it in the same or at least a similar way the next time. This reproducibility allowed the scientists to train a neural network to map 2D poses onto 3D positions, reducing the number of cameras needed and overcoming the problem of occlusions. “We use deep networks that track 2D poses from each camera view, and then another network that maps these 2D positions or key points to a library of 3D poses.”
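The idea of exploiting repeatable movements can be illustrated with a toy example. The sketch below is not the LiftPose3D network: it swaps the trained neural network for a plain nearest-neighbor lookup, on the assumption that a small library of previously recorded 2D poses paired with 3D poses is available. The name `lift_pose` and all the numbers are hypothetical and purely illustrative.

```python
def lift_pose(pose_2d, library):
    """Return the 3D pose whose stored 2D pose is closest to pose_2d.

    pose_2d: list of (x, y) key points from a single camera.
    library: list of (stored_2d, stored_3d) pairs from earlier recordings.
    This nearest-neighbor lookup is a stand-in for a trained lifting network.
    """
    def squared_distance(p, q):
        return sum((px - qx) ** 2 + (py - qy) ** 2
                   for (px, py), (qx, qy) in zip(p, q))

    _, best_3d = min(library, key=lambda pair: squared_distance(pose_2d, pair[0]))
    return best_3d


# Hypothetical library: two leg postures, each with two key points.
library = [
    ([(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]),  # leg extended
    ([(0.0, 0.0), (0.4, 0.3)], [(0.0, 0.0, 0.0), (0.4, 0.3, 0.9)]),  # leg raised
]

# A new single-camera observation close to the "leg raised" posture.
pose_3d = lift_pose([(0.0, 0.0), (0.45, 0.28)], library)
```

Because the animal tends to repeat its movements, a new 2D observation usually resembles something already recorded in 3D, which is what lets a single view recover depth. A trained network generalizes this by interpolating between examples rather than returning one stored pose.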

“Now we can use data from previous experiments, where people performed 3D pose estimation on animals, to train our network,” says Ramdya. “We also performed a couple of tricks to be able to generalize that mapping across datasets from different experimental systems and different laboratories. For example, in another lab, they might place their cameras in slightly different positions. So we trained our network to be able to generalize across these potential variations.”

Another advantage of LiftPose3D is that it works with animals that are moving freely rather than tethered in a limited space, which is the usual practice in pose estimation studies. “To understand the nervous system, one must also take into account the biomechanics involved in real behaviors. For example, when a cockroach runs up a hill, physical interactions between the animal and its environment are critical but cannot be captured if the animal is tethered. Now, with LiftPose3D, we can record 3D poses in freely behaving animals, capturing these body-environment interactions.”

Since it needs no specific hardware, LiftPose3D can make pose-estimation studies much easier and cheaper to perform. “Our vision is to reverse engineer the nervous system and behavior in order to inform the design of robotics controllers,” says Ramdya. “The most effective way to do that is using experimentally accessible animal models. We designed LiftPose3D specifically to probe these models with fewer cameras, allowing us to get closer to the goal of reverse engineering the mechanisms that give rise to their complex behaviors.”