About


I’m a statistician in Methodology Research, led by Keaven M. Anderson, at Merck & Co., Inc., Rahway, NJ, USA. I work on problems at the intersection of statistical methodology and research software engineering. My focus is on building software infrastructure that makes late-stage clinical development efficient and reliable.

I serve on the R Consortium Infrastructure Steering Committee and contribute to the R Submissions Working Group, which pioneered the first successful open-source R submission pilot to the FDA. I’m a regular contributor to pharmaverse, an ecosystem of open-source tools for clinical reporting.

My research interests include sparse linear models, representation learning, and developer tooling. I build software in R, Python, and Rust. Projects I maintain include ggsci, pkglite, rtflite, tinytopics, msaenet, and revdeprun.

Previously, I was a data scientist at Seven Bridges in Boston, building genomics platforms. I studied human genetics in Matthew Stephens’s lab at the University of Chicago. I have a Ph.D. in Statistics from Central South University, where I developed machine learning methods for high-dimensional data with Qing-Song Xu.