Category: Uncategorized

  • How to conduct a synthetic data experiment

    Simulation studies (sometimes called synthetic data experiments or Monte Carlo simulations) are useful tools for generating evidence about whether a statistical claim is true. Recently I taught a tutorial on the basics of simulation studies for undergraduate students as part of the USC JumpStart program. I taught the basics…

  • PRESTO accepted to ICML 2023

    I’m excited to announce that “Predicting Rare Events by Shrinking Towards Proportional Odds” has been accepted to the Fortieth International Conference on Machine Learning (ICML 2023)! In the paper, we propose PRESTO, a novel method for improving classification in the class imbalance setting. You can read my brief summary of the paper on Twitter.

  • cssr R Package

    In a 2022 research paper that I wrote with my advisor Jacob Bien, we proposed a novel feature selection method called cluster stability selection. Cluster stability selection is a method for identifying features that are useful for predicting a response variable. It has applications in medical research (including genomics and genetics), economics, analyzing survey data,…