Nova In Silico is a health tech company that develops jinkō, an in silico clinical trial platform that simulates drug efficacy and optimizes clinical development using virtual patients and disease modeling. As an innovative company, we offer a dynamic work environment distinct from that of larger, established organizations. Interns take on significant responsibilities and benefit from a steep learning curve, supported by a highly motivated team. Learn more at www.novainsilico.ai.

Keywords

Surrogate Model, Gaussian Process, Neural Networks, Optimization, Classification, PyTorch

Background

Quantitative Systems Pharmacology and its Challenges

Quantitative Systems Pharmacology (QSP) is a critical discipline in modern drug development. It involves creating complex, mechanistic mathematical models that describe the dynamic interactions between a drug and a biological system. These models integrate pathophysiology and pharmacology to predict a drug’s effect, safety, and efficacy across diverse patient populations. At Nova In Silico, our R&D efforts are focused on building and applying these high-fidelity QSP models.

A significant challenge arises when fitting these models to real-world clinical data. To account for variability between individuals, QSP models are often formulated as Non-Linear Mixed-Effects (NLME) models. Parameter estimation for NLME models, which is typically performed via Maximum Likelihood Estimation (MLE), is a difficult and computationally intensive task. Traditional estimation algorithms can take hours or even days to converge, creating a substantial bottleneck in the R&D pipeline.

Current Surrogate Models and Limitations

To address this computational bottleneck, Nova In Silico has successfully developed surrogate models for some of our key QSP models. These surrogates are lightweight, fast-to-execute approximations of the full, complex QSP models, built using the PyTorch framework. Their purpose is to capture the essential input-output relationship of the original model while reducing computation time by orders of magnitude. This speed-up has been instrumental, enabling us to efficiently perform parameter estimation using standard algorithms like Expectation-Maximization (EM).

Our current generation of surrogate models is primarily based on Gaussian Processes (GPs). GPs are a powerful non-parametric statistical method, well-regarded for their ability to interpolate in low-dimensional spaces and, crucially, to provide a principled measure of uncertainty for their predictions.
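As a rough illustration of the uncertainty estimate described above, here is a minimal exact GP regression sketch in PyTorch. It is not Nova In Silico's implementation: the squared-exponential kernel, the hyperparameter values, the toy sine target standing in for an expensive simulator, and the function names rbf_kernel and gp_posterior are all assumptions made for the example.

```python
import torch


def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    # Squared-exponential kernel between point sets of shape (n, d) and (m, d).
    sq_dist = torch.cdist(a, b).pow(2)
    return variance * torch.exp(-0.5 * sq_dist / lengthscale**2)


def gp_posterior(x_train, y_train, x_test, lengthscale=0.2, noise=1e-4):
    # Exact GP regression: posterior mean and variance at the test points.
    n = x_train.shape[0]
    k_tt = rbf_kernel(x_train, x_train, lengthscale)
    k_tt = k_tt + noise * torch.eye(n, dtype=x_train.dtype)
    k_ts = rbf_kernel(x_train, x_test, lengthscale)
    k_ss_diag = rbf_kernel(x_test, x_test, lengthscale).diagonal()

    # The Cholesky factorization is the step whose cost grows cubically with n.
    chol = torch.linalg.cholesky(k_tt)
    alpha = torch.cholesky_solve(y_train.unsqueeze(-1), chol)
    mean = (k_ts.T @ alpha).squeeze(-1)

    v = torch.linalg.solve_triangular(chol, k_ts, upper=False)
    var = k_ss_diag - v.pow(2).sum(dim=0)
    return mean, torch.clamp(var, min=0.0)


# Toy data: eight "simulator runs" on [0, 1] (hypothetical stand-in target).
x_train = torch.linspace(0.0, 1.0, 8, dtype=torch.float64).unsqueeze(-1)
y_train = torch.sin(6.0 * x_train).squeeze(-1)

# One test point inside the training range, one far outside it.
x_test = torch.tensor([[0.5], [3.0]], dtype=torch.float64)
mean, var = gp_posterior(x_train, y_train, x_test)
print("posterior mean:", mean)
print("posterior variance:", var)
```

In this sketch the predictive variance is small at the point surrounded by training data and returns to the prior variance far away from it, which is the kind of principled uncertainty signal a GP surrogate provides.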

However, GPs also present significant limitations. Their computational cost scales poorly with the size of the training dataset (exact GP inference grows cubically with the number of training points), making them cumbersome for large-scale simulations. More importantly, they are inherently designed for continuous, smooth functions. This underlying assumption makes it challenging to model the complex realities of clinical data and QSP simulations, which are often neither simple nor smooth.

Flexible ML-Based Surrogate Models

Machine learning and deep learning models, which can be implemented seamlessly within our existing PyTorch ecosystem, offer the flexibility to address the shortcomings of GPs directly. They bring two key advantages:

  • Handling heterogeneous inputs (continuous data, categorical data, time-varying covariates)
  • Modeling complex and discontinuous outputs (e.g. drug dose driven by administration events)
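To make the two points above concrete, the following is a minimal PyTorch sketch of a neural surrogate that mixes continuous and categorical inputs and is fitted to a target containing a jump. Everything in it is hypothetical: the class name MixedInputSurrogate, the layer sizes, and the synthetic data are assumptions chosen for illustration, not one of the architectures the internship will settle on.

```python
import torch
from torch import nn


class MixedInputSurrogate(nn.Module):
    """Small MLP taking continuous features plus one categorical covariate."""

    def __init__(self, n_continuous, n_categories, embed_dim=4, hidden=32):
        super().__init__()
        # Learned embedding turns each category index into a dense vector.
        self.embed = nn.Embedding(n_categories, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(n_continuous + embed_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x_cont, x_cat):
        features = torch.cat([x_cont, self.embed(x_cat)], dim=-1)
        return self.net(features).squeeze(-1)


torch.manual_seed(0)

# Synthetic stand-in for simulator outputs: two continuous parameters and
# one categorical covariate with three levels.
x_cont = torch.rand(256, 2)
x_cat = torch.randint(0, 3, (256,))
# The last term introduces a discontinuity at x_cont[:, 1] = 0.5.
y = torch.sin(3.0 * x_cont[:, 0]) + x_cat.float() + (x_cont[:, 1] > 0.5).float()

model = MixedInputSurrogate(n_continuous=2, n_categories=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

initial_loss = loss_fn(model(x_cont, x_cat), y).item()
for _ in range(200):
    optimizer.zero_grad()
    loss = loss_fn(model(x_cont, x_cat), y)
    loss.backward()
    optimizer.step()
final_loss = loss_fn(model(x_cont, x_cat), y).item()

print(f"training MSE: {initial_loss:.3f} -> {final_loss:.3f}")
```

The embedding layer is one common way to feed categorical covariates to a network; time-varying covariates would need a sequence-aware architecture, which this sketch does not attempt.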

Objective

Develop a new suite of ML-based surrogate models. The intern will implement and train several promising architectures, leveraging data generated from our internal QSP models. This work will culminate in a rigorous evaluation of the models, comparing their accuracy, training speed, and, most importantly, their flexibility in handling complex data structures.

You are

  • A team player, a good listener, and an effective communicator
  • Curious and proactive, ready to face real-life engineering challenges 
  • Autonomous and self-motivated with strong analytical and problem-solving skills
  • Eager to learn mathematical modeling and simulations of biological systems
  • Willing to explore the latest advances in science and technology
  • Responsive and capable of tackling time-sensitive issues with agility

You will

  • Review the scientific literature on relevant machine learning methods
  • Prototype several machine learning-based surrogate models
  • Evaluate their accuracy, training speed, and flexibility in handling complex data structures
  • Integrate solutions into Nova’s simulation platform

Methodology and technical skills

We are looking for people who know some of the following or are eager to learn and work with them:

  • Machine learning in Python, PyTorch
  • Statistical modeling, NLME models

A professional English level (written and oral) is required for this role.

Practical information

  • Salary: Competitive
  • Start date: Flexible