Breathing life into video pixels

Autonomous virtual humans that move and behave naturally are Siyu Tang’s vision. One source of inspiration for the computer scientist is human behavioural patterns. Collaboration with architects and surgeons provides further input and also reveals the enormous potential of virtual people.
At ETH Zurich, Siyu Tang has found the ideal environment in which she can make a difference that goes beyond scientific knowledge. (Photograph: ETH Zurich / Nicola Pitaro)

By collaborating with ETH Zurich architecture professors Fabio Gramazio and Matthias Kohler on the Flight Assembled Architecture (FAA) Revisited project at the Guggenheim Museum Bilbao, Siyu Tang, who leads the Computer Vision and Learning Group at the ETH Institute for Visual Computing, is able to reunite two great fascinations from her youth: “When deciding on a course of study, I had to choose between architecture and computer science. Being a computer scientist at ETH Zurich means I can enjoy interdisciplinary collaboration with world-class architects. That’s a perfect combination for me.”

Together with luminaries from other fields

Tang, a specialist in perceiving and modelling humans from visual input, has been working as an assistant professor at ETH Zurich for two and a half years. For her, the fact that the ETH environment makes such collaboration possible is one of the university’s major plus points. “There are world-renowned luminaries working in many areas of research here,” she says. This paves the way for interdisciplinary projects that otherwise wouldn’t be possible at this high level.

For the FAA Revisited project, Tang’s team populated the drone-built vertical pedestrian city with autonomous avatars (see video). In doing so, the computer scientists were able to build on their previous research on modelling virtual humans moving naturally through their environment for extended periods of time: “The big difference between research projects and real-world issues is the requirement for the technologies to be stable and generalisable in all conditions. The avatars in the FAA Revisited project must be able to move through the city essentially forever and in all possible situations.”

Avatars for the drone city: Naturally moving avatars bring “life” to the virtual version of the Flight Assembled Architecture project’s spectacular drone city. (Video: ETH Zurich)

Behavioural beats bring stability

To make this possible, Tang’s team drew on insights from behavioural biology. The longer their previous algorithms ran, the more their directional decisions would go astray, as in long-term weather forecasting, given the practically infinite number of possibilities on offer. This resulted in unnatural-looking movement patterns and logical conflicts.

Tang’s team has now adopted the 0.25 seconds it takes for a human to consciously perceive something and react – perhaps with a movement – as the interval at which the motion algorithm makes directional decisions. This sequencing of movement into these behavioural “beats”, combined with the introduction of a statistical random component into each directional decision, has profound consequences. For one thing, it stabilises movement in the long term, and for another, it gives that movement lifelike variability.
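The idea of behavioural beats can be illustrated with a toy example. The following is a minimal sketch only, not the team's actual motion algorithm: it assumes a single agent on a flat plane that picks a new heading every 0.25 seconds by adding Gaussian noise to its previous heading. The function name `simulate_walk`, the walking speed and the noise level are hypothetical choices made for illustration; only the 0.25-second interval comes from the article.

```python
import math
import random

# Decision interval from the article: roughly the time a human needs
# to consciously perceive something and react.
BEAT_SECONDS = 0.25


def simulate_walk(duration_s, speed_m_per_s=1.4, turn_noise_rad=0.3, seed=None):
    """Return the (x, y) positions of a toy agent, one per beat.

    Hypothetical illustration: at every beat the agent makes one
    directional decision, namely its previous heading plus a random
    perturbation drawn from a Gaussian distribution.
    """
    rng = random.Random(seed)
    x, y, heading = 0.0, 0.0, 0.0
    path = [(x, y)]
    n_beats = int(round(duration_s / BEAT_SECONDS))
    for _ in range(n_beats):
        # Statistical random component of the directional decision.
        heading += rng.gauss(0.0, turn_noise_rad)
        # Move in a straight line until the next beat.
        x += speed_m_per_s * BEAT_SECONDS * math.cos(heading)
        y += speed_m_per_s * BEAT_SECONDS * math.sin(heading)
        path.append((x, y))
    return path


if __name__ == "__main__":
    path = simulate_walk(10.0, seed=1)
    print(f"{len(path) - 1} directional decisions in 10 seconds")
```

Running the sketch with different seeds yields different paths, which mirrors in a very simplified way the variability the random component is meant to provide. The stabilising effect described in the article depends on the team's learned motion model and is not reproduced here.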

Simulations for training surgeons

Populating the architectural model is not the only interdisciplinary project Tang has tackled in Switzerland so far. Since the spring, her team has been working on a second ambitious project under the auspices of the Kantonsspital St. Gallen, the Centre Hospitalier Universitaire Vaudois (CHUV) and Balgrist University Hospital: ETH Zurich and ZHAW (Zurich University of Applied Sciences) are working with industry partners to develop simulation tools that are intended to raise the training of surgeons to a new level (see video). The goal is to develop standardised courses with a performance record in a format comparable to pilot training. This means surgeons, too, will be able to perfect their craft using immersive simulators in future. To date, their training has been conducted almost exclusively during actual operations or on cadavers, which massively limits individuals’ opportunities to practise what they have learned.

Digital transformation of surgical education: Lifelike surgical modelling provides the basis for future simulator training of surgeons. (Video: ETH Zurich / Industry Relations)

Building on her mother’s work

The visual data processing specialists are still in the early stages of their work, and the challenges are great: How can heavy bleeding, surgeons’ rapid movements, or different tissue textures be rendered realistically in AR/VR devices? Tang expects that to do this, her team will need to analyse images from multi-view cameras filming actual operations and integrate them into a model that can be efficiently learned and rendered in real time.

For the young scientist, who grew up in China, there is an additional personal motivation in this project: her mother, an ophthalmologist, went to Mauritania for two years to treat patients and train local colleagues there when Tang was little. She can now build on that work: “Training on simulators can enable tremendous advances, especially for medicine in developing countries. The fact that I can indirectly continue my mother’s work makes it even more fulfilling for me.”

From university to industry and back again

Tang radiates a great passion for her field and her team. And her career is evidence that she doesn’t shy away from practical challenges. After graduating from Zhejiang University with a Bachelor’s degree in computer science, she didn’t immediately pursue an academic path. Rather, she moved from the elite Chinese university to the private sector, developing software for consumer electronic devices. “But after two years of doing that I got bored,” she remarks with a laugh.

Opting to do a Master’s degree next, she decided she wanted to study not in China but in Europe: “Many of the high-quality devices and instruments that my mother, a doctor, and my father, a mechanical engineer, used came from Germany. Plus the country is located in the middle of Europe, which makes it ideal as a base for getting to know the different countries.”

The move to Zurich from the Max Planck Institute for Intelligent Systems in Tübingen, Germany, where she led her first research group, is a stroke of luck for Tang, and not just from a scientific perspective. “For our four-and-a-half-year-old son, this city is a paradise, with the lake, all the playgrounds, the surrounding forests and the nearby mountains. We can do so much together as a family here.”

An egocentric path to natural behaviour

Tang has also set her sights high when it comes to her research projects. In addition to giving virtual people natural movement, in future she wants them to behave naturally, too. In the Flight Assembled Architecture project model, they will then no longer just move randomly through the rooms, but instead interact with one another and the environment like real people.

Tang sees giving avatars an egocentric view as the key to achieving natural interactions. Like real people, they should create a model of themselves and of their environment from their first-person perspective and learn individual behaviour from it.

Impact beyond science

Architecture isn’t the only application area where introducing virtual humans that behave naturally will yield novel options. It will be interesting to see which interdisciplinary projects next let Tang’s virtual people realise their potential: “It’s hard to imagine a better working environment than the one here at ETH. I can inspire my research team with my vision, and together we can work with other research groups to make the kind of difference that goes beyond scientific knowledge.”