Data-driven Modeling of Complex Physical Systems (DMP)

The DMP group is focused on the development of robust data-based methods for modeling, analysis, and control of complex dynamical systems. We seek to combine tools from a broad variety of fields, including machine learning, numerical linear algebra, and control theory. Practical application to physical systems on the atomistic or molecular scale is also part of our research agenda.

Computer simulations of dynamical systems are widely used to aid our understanding of complex real-world processes ranging across the natural and engineering sciences. However, three fundamental limitations apply to many complex systems of practical interest. The first problem is the highly non-linear dependence of the dynamics on a large number of parameters (dimensionality problem). The second is the presence of a broad range of characteristic timescales in the system (timescale problem). As a consequence, simulations of the system are forced to employ fairly small time steps, while time horizons of practical interest often exceed the basic simulation time step by many orders of magnitude. A third problem lies in the fact that many computational models of complex systems are already derived as approximations to governing equations stemming from first principles. The ability of these approximate models to recover at least certain properties of the first principles model then needs to be assessed (model building problem).

Prototypical examples of systems subject to these issues are molecular dynamics (MD) simulations of biological macromolecules. At atomistic resolution, MD simulations model the temporal dynamics of 10^5 to 10^7 particles. In a standard setup, the integration time step is on the order of femtoseconds (10^{-15} s), whereas interesting biological phenomena occur on the timescale of milliseconds (10^{-3} s) and beyond, so that reaching a single millisecond requires on the order of 10^{12} integration steps. Moreover, MD simulations seek to approximate the fundamental quantum mechanical description of a molecular system by a classical surrogate model, such as Hamiltonian or Langevin dynamics. The energy function (often called force field) is typically composed of physically intuitive terms to ensure transferability to a broad range of systems. However, testing the quality of MD force fields against a quantum description of the system is highly challenging.

The constantly increasing amount of available computational power offers great opportunities for data-driven modeling of physical systems such as those described above. However, the challenges outlined previously will not be overcome by investing vast computational resources alone. The DMP group seeks to develop innovative algorithmic tools that accompany the entire model building process. Our main theoretical tool at this time is Koopman theory, which provides a linear statistical description of a nonlinear dynamical system on an infinite-dimensional function space. The main goal in this context is to develop efficient and statistically robust estimation methods equipped with a mathematical error theory. To this end, we draw on elements of statistical learning and optimization, numerical linear algebra, stochastic analysis, and control theory.
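As a rough illustration of the linear description that Koopman theory provides, the following minimal sketch estimates a finite-dimensional Koopman matrix from data by least squares. It uses extended dynamic mode decomposition (EDMD), one common estimator; the text above does not say which estimators the group employs. The toy map `step`, the observables in `dictionary`, and the parameter values are all assumptions chosen for illustration, selected so that the dictionary spans an exactly invariant subspace.

```python
import numpy as np

def step(x, a=0.9, b=0.5):
    """One step of a toy nonlinear map (illustrative assumption, not from the text)."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([a * x1, b * x2 + (a**2 - b) * x1**2], axis=-1)

def dictionary(x):
    """Hypothetical dictionary of observables: psi(x) = (x1, x2, x1^2)."""
    x1, x2 = x[..., 0], x[..., 1]
    return np.stack([x1, x2, x1**2], axis=-1)

def edmd(X, Y):
    """Least-squares estimate of K such that psi(Y) is approximately psi(X) @ K."""
    PX, PY = dictionary(X), dictionary(Y)
    K, *_ = np.linalg.lstsq(PX, PY, rcond=None)
    return K

# Sample initial states and their images under one step of the dynamics.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
Y = step(X)

K = edmd(X, Y)

# For this toy map the dictionary is closed under the dynamics, so the
# estimated eigenvalues are close to b, a^2 and a (0.5, 0.81, 0.9).
eigvals = np.sort(np.linalg.eigvals(K).real)
print(eigvals)
```

Although the map is nonlinear in the state, its action on the three observables is linear, which is why a single matrix captures it here. For realistic systems no finite dictionary is exactly invariant, and the estimate carries both approximation and sampling error.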

In more detail, our aims are:

1. Analyze the effect of the amount of data, the distribution of the data, the complexity of the model class, and the underlying structure of the system, on the accuracy of the resulting data-based model and on quantities derived from the latter.

2. Develop robust tools for the identification of reduced variables, while keeping control of the error incurred on quantities of interest. We then seek to leverage these collective variables for the purpose of accelerated sampling of the original system dynamics.

3. Utilize Koopman-based models to identify the dynamical laws of a system in such a way that certain dynamical properties are optimally recovered.

