I am a computational scientist passionate about using machine learning and high-performance computing to answer biological questions at scale. I have a strong foundation in probabilistic modeling, algorithm design, and software engineering, and I am experienced with analyzing large-scale omic datasets. My current research focuses on designing interpretable, multi-modal machine learning algorithms integrating genomic, transcriptomic, and proteomic blood plasma data to improve our understanding of disease and identify novel biomarkers.
Technical Skills
- Programming languages: Python, C++, Rust, JavaScript, HTML, CSS, Workflow Description Language, Snakemake.
- Data analysis and visualization: Pandas, Polars, scikit-learn, TensorFlow, Pyro, NetworkX, Matplotlib, D3.js.
- Molecular data analysis: Illumina (genomic, metagenomic, transcriptomic), PacBio (genomic), Olink (proteomics).
- Software: Linux, Git, Google Cloud Platform, PostgreSQL, Adobe Illustrator.
Professional Experience
Broad Institute of MIT and Harvard, Cambridge, MA
- Computational Scientist II, Data Sciences Platform, Sep 2024 - Current
- Enabled the characterization of structurally diverse genomic regions in Plasmodium falciparum by redesigning a hidden Markov model for recombination-aware sequence alignment, resulting in 360x faster inference.
- Developed a data pre-processing and quality control pipeline, curating disease phenotypes from electronic health records for 10,000 participants in a national biobank and assessing the quality of associated genomic, transcriptomic, and proteomic samples.
- Designing interpretable, multi-modal machine learning models to cluster biobank participants, integrating genomic, transcriptomic, and proteomic blood plasma data to improve our understanding of disease and identify novel biomarkers.
- Computational Associate II, Bacterial Genomics Lab, Nov 2017 - Aug 2024
- Designed and implemented a new optimal partial order alignment algorithm in Rust that was, on average, 4.1x faster than similar tools, enabling DNA sequence-to-graph alignments that were previously impossible.
- Led the development of Python and C++-based software specifically designed to track low-abundance (>0.1%) bacterial strains in complex microbial communities (e.g., the human gut microbiome) using whole metagenome sequencing data.
- Developed a cloud-based pipeline using the Workflow Description Language (WDL) and Docker to enable fast and reproducible characterization of thousands of metagenomic samples.
- Obtained detailed insight into E. coli strain-level dynamics in a year-long longitudinal microbiome study of women with recurrent urinary tract infections (UTIs), revealing unexpected similarities with a healthy control group and showing that the UTI-causing strain is rarely cleared from the gut after antibiotics.


DSM-firmenich, Delft, The Netherlands
- Intern, Jun 2016 - Oct 2016
- Reviewed literature and proposed a plan to analyze a large-scale protein production problem with Bacillus subtilis using genome-scale metabolic models.
Studio bereikbaar, Rotterdam, The Netherlands
- Software Engineer, May 2013 - Jun 2016
- Led the development of web-based geographic information systems (GIS) tools using Django and PostgreSQL, facilitating collaborative project planning and design between engineering colleagues.
- These tools helped win several multi-million-euro infrastructure construction contracts with the government.
Thales, Delft, The Netherlands
- Intern, Sep 2012 - Nov 2012
- Developed a GPU-accelerated tool to analyze the radar reflectivity of navy ships using NVIDIA's OptiX CUDA ray tracing library.
Extracurricular Projects
Python Software Foundation (VisPy Google Summer of Code 2015)
- Implemented a high-performance graph visualization system in Python and OpenGL, including several automatic graph layout algorithms.
- Contributed the open-source code upstream to VisPy, a high-performance scientific data visualization software library.


Delft University of Technology, "Helios 3D team"
- Built a large 3x3x1 meter 3D RGB LED cube display as part of a student team.
- Architected and implemented the C++ embedded software responsible for receiving the image to display over Wi-Fi and driving individual LEDs to the correct color.
Education
- PhD Bioinformatics. Delft University of Technology. 2025.
- PhD in the bacterial genomics lab at the Broad Institute in collaboration with the bioinformatics lab at TU Delft.
- MSc Computer Science. Delft University of Technology. 2017.
- BEng Electrical Engineering. The Hague University of Applied Sciences. 2013.