Understanding the dynamics of the epidemic

Numerous factors affect the spread of coronavirus. To better predict the progression of the epidemic, researchers at ETH Zurich and the University of California, Los Angeles have started a “datathon”. Participants from around the globe are called on to develop models based on publicly accessible data.
Logo of the “real time epidemic datathon”. (Image: ETH Zurich)

Politicians across the world are currently trying to contain coronavirus using measures such as sweeping restrictions on freedom of movement. The progression of an epidemic is affected by numerous factors. This is why ETH scientists are calling on researchers throughout the world to develop new prediction models for the spread of the virus. The aim of the “epidemic datathon”, which began last week, is to better understand the dynamics of the epidemic. The term “datathon” is derived from “hackathon” and refers to a challenge whereby participants use data to find new solutions for existing problems within a short time frame.

The datathon is organised by an interdisciplinary team at ETH: in addition to Nino Antulov-Fantulin and Dirk Helbing from the Professorship of Computational Social Science, the team comprises Lucas Böttcher from the Institute for Theoretical Physics, ETH, and the Department of Computational Medicine, University of California, Los Angeles (UCLA), and Zhang Ce and David Dao from ETH’s Department of Computer Science.

Publicly available data forms the basis for the datathon – for example, data related to case and testing figures or to the mobility of the population. “The participants develop models that they then verify in real time,” explains co-organiser Antulov-Fantulin. The datathon participants check the accuracy of their predictions several times, after a few days and again after weeks, using real data; for example, by comparing the predicted number of infected persons in a country with the actual number of cases. Datathon participants can thus determine which of their models are the most effective. For ethical reasons, the organisers have deliberately decided not to award prizes.
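The comparison between predicted and actual case numbers can be sketched in a few lines of Python. This is a minimal illustration only: the article does not say which error measure the datathon uses, so the mean absolute percentage error below is an assumption, and the model names and case numbers are hypothetical placeholders rather than real data.

```python
from typing import Sequence


def mean_absolute_percentage_error(
    predicted: Sequence[float], actual: Sequence[float]
) -> float:
    """Average of |predicted - actual| / actual over all days, in percent."""
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty series of equal length")
    if any(a <= 0 for a in actual):
        raise ValueError("actual case counts must be positive")
    total = sum(abs(p - a) / a for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)


# Hypothetical numbers for illustration only, NOT real case data.
reported_cases = [100, 200, 400]
forecasts = {
    "model_a": [110, 180, 400],  # hypothetical model name
    "model_b": [150, 300, 600],  # hypothetical model name
}

# Score every model against the reported figures; lower error is better.
scores = {
    name: mean_absolute_percentage_error(prediction, reported_cases)
    for name, prediction in forecasts.items()
}
best_model = min(scores, key=scores.get)
print(scores, best_model)
```

Repeating this check each time new case figures are published would show how a model's accuracy holds up over days and weeks.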

Behaviour of the population is crucial

Epidemiological models that describe the progression of the epidemic already exist; however, these are meaningful only to a limited extent since they do not take various factors into account. How quickly the coronavirus spreads is not purely a biological question, but also depends on the measures taken to contain it. The behaviour of the population is also crucial. “All these factors should ideally flow into the models,” says Antulov-Fantulin.
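To illustrate why containment measures and behaviour matter, the following sketch uses a textbook SIR (susceptible, infected, recovered) model with a contact rate that drops once measures begin. The article does not name any specific model, so the choice of SIR, the function name and every parameter value here are assumptions made purely for illustration; they are not fitted to Covid-19 data.

```python
def simulate_sir(
    beta: float,
    gamma: float,
    days: int,
    population: float = 1_000_000,
    initial_infected: float = 10,
    contact_reduction: float = 0.0,
    measures_start_day: int = 0,
) -> list[float]:
    """Discrete-time SIR model with one-day steps.

    beta is the daily transmission rate, gamma the daily recovery rate.
    From measures_start_day onwards, beta is scaled by (1 - contact_reduction)
    to mimic containment measures or changed behaviour.
    Returns the number of currently infected persons for each day.
    """
    susceptible = population - initial_infected
    infected = initial_infected
    infected_over_time = [infected]
    for day in range(days):
        effective_beta = beta
        if day >= measures_start_day:
            effective_beta = beta * (1.0 - contact_reduction)
        new_infections = effective_beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        infected_over_time.append(infected)
    return infected_over_time


# Hypothetical parameters, chosen only to show the qualitative effect.
without_measures = simulate_sir(beta=0.3, gamma=0.1, days=365)
with_measures = simulate_sir(
    beta=0.3, gamma=0.1, days=365, contact_reduction=0.5, measures_start_day=30
)
print(max(without_measures) > max(with_measures))
```

In this toy setting, halving contacts from day 30 lowers the peak number of simultaneously infected persons, which is the kind of effect a purely biological description would miss.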

The datathon is globally oriented and interdisciplinary: in addition to medical professionals and epidemiologists, economists, social scientists and machine learning experts are also in demand. Accordingly, the datathon’s advisory board consists of international experts from a variety of specialist fields.

More accurate prediction models could help us to understand how different containment measures will affect the spread of Covid-19 in the short and medium term. Bottlenecks in healthcare could also be anticipated on this basis. However, Antulov-Fantulin warns against expecting too much: “This is an open-ended project and we still don’t know if or when valid results will be available.”

Whether the datathon will be fruitful largely depends on its participants. ETH students in the field of data science have already made a start with the first prediction models. The datathon has also been open to international participants since 30 March and several teams from renowned universities have already registered.