Our group’s research programs aim to explore and expand upon how advances in causal inference; semi-parametric estimation, especially causal (i.e., de-biased, targeted) machine learning; statistical machine learning; and computational statistics may catalyze discovery in the biomedical and public health sciences. The methodological research programs emphasize assumption-lean and model-agnostic frameworks, adopting a translational perspective to formulate causal-analytic and statistical methods tailored to answering substantive questions arising in the applied sciences. Broadly, our approach draws upon principles from causal inference to translate scientific questions into precise, interpretable statistical estimands, which one can then aim to accurately and efficiently learn from data generated by observational studies or randomized controlled trials via analytic methods that
- avoid imposing restrictions not justified by available domain knowledge;
- incorporate flexible, adaptive modeling strategies (e.g., machine learning); and
- apply semi-parametric efficiency theory for best-in-class uncertainty quantification.
Thus, thematically, our research programs integrate core aspects of causal inference, to target interpretable estimands, with tools and techniques from non-parametric estimation and statistical machine learning, to avoid restrictive modeling assumptions, using semi-parametric theory to ensure asymptotically efficient estimation. This line of work has yielded novel insights applicable to causal (i.e., de-biased, targeted) machine learning (e.g., targeted minimum loss estimation, sieve estimation); non-parametric causal mediation analysis to study questions of mechanism using novel direct and indirect effect estimands; treatment effect heterogeneity and effect modification analyses to inform stratified medicine; variance-moderated semi-parametric estimation for flexible and stable biomarker discovery; corrections necessary to reliably draw accurate inferences from data collected using two-phase (auxiliary- or outcome-dependent) sampling designs; and causal effect estimands for continuous exposures based on flexible intervention regimes.
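To make the general recipe concrete, here is a minimal sketch of one textbook instance of it: an augmented inverse probability weighted (AIPW) estimator of the average treatment effect in a simulated randomized trial, with a standard error based on the estimated efficient influence function. This illustrates the general idea only and is not an implementation of any of the group's methods or software. The function name `aipw_ate`, the simulated data, the known randomization probability of 0.5, and the linear working models for the outcome are all assumptions made for illustration; in practice the outcome regressions could be replaced by flexible machine learning fits, typically combined with sample splitting.

```python
import numpy as np


def aipw_ate(y, a, x, propensity):
    """AIPW estimate of the average treatment effect (hypothetical helper).

    y: outcomes, a: binary treatment indicator, x: a single covariate,
    propensity: known probability of treatment (as in a randomized trial).
    """
    n = len(y)
    design = np.column_stack([np.ones(n), x])

    # Outcome regressions fit separately in each arm (linear working models).
    beta1, *_ = np.linalg.lstsq(design[a == 1], y[a == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(design[a == 0], y[a == 0], rcond=None)
    m1 = design @ beta1  # predicted outcome under treatment
    m0 = design @ beta0  # predicted outcome under control

    # Uncentered efficient influence function values: regression contrast
    # plus inverse-probability-weighted residual corrections.
    phi = (
        m1 - m0
        + a * (y - m1) / propensity
        - (1 - a) * (y - m0) / (1 - propensity)
    )

    estimate = phi.mean()
    # Standard error from the empirical variance of the influence function.
    std_error = phi.std(ddof=1) / np.sqrt(n)
    return estimate, std_error


# Simulated trial (assumed data-generating process; true effect is 2).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
a = rng.binomial(1, 0.5, size=n)
y = 2.0 * a + x + rng.normal(size=n)

estimate, std_error = aipw_ate(y, a, x, propensity=0.5)
print(f"ATE estimate: {estimate:.3f} (SE {std_error:.3f})")
```

Because the randomization probability is known here, the estimator remains consistent even if the linear outcome models are wrong; the outcome regressions serve only to reduce variance, which is where flexible modeling and efficiency theory meet.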
We are often interested in and open to working in new substantive areas—wherever the application of rigorous statistical thinking (inclusive of novel techniques and tools) is welcome.
Here are a few highlights from research projects completed over the last few years:
A secondary theme of our research centers on the role of high-performance numerical computing and the development of open-source software tools for statistical science. While distinct, these areas are unified by the overarching aims of pushing the boundaries of statistical methods development and promoting reproducibility and transparency in the practice of applied statistics. Consistent with our commitment to open science, new methods developed by members of the lab are accompanied by open-source software implementations both to ensure replicability of the reported work and to facilitate widespread use of the proposed techniques.
Browse more about our work on statistical methods innovations, their use in the applied sciences, and the development of free and open-source software for statistical science.



