**Alt text:** Slide titled “A Possible Solution: Leveraging Data From More Common Outcomes.” On the left is a block of explanatory text: * “Same data as before, same red decision boundary as before. Class 2 is an intermediate outcome between class 1 and class 3. (In previous slide, classes 1 and 2 were combined.)” * “Black dashed line: Bayes decision boundary between classes 1 and 2.” * “Abundant data near decision boundary between classes 1 and 2 ⇒ easier to learn.” * “The estimated decision boundary between classes 1 and 2 (orange) does seem to fit much better.” On the right is a scatter plot with horizontal axis labeled “x1” (from –1.5 to 1.0) and vertical axis “x2” (from –1.2 to 1.0). Each point is colored by its true class: blue for class 1, purple for class 2, and red for class 3. A black dashed line slopes down from upper-left to lower-right marking the theoretical Bayes boundary between classes 1 and 2. A gold-orange dashed line, nearly parallel but shifted, shows the model’s estimated boundary between classes 1 and 2. A faint red dashed line in the lower-left corner indicates the original boundary between classes 2 and 3 from the previous slide. To the right of the plot is a legend mapping colors to classes 1 (blue), 2 (purple), and 3 (red). Beneath the plot, in italic dark-red text: “IDEA: Maybe we can leverage our more precise estimate of the decision boundary between classes 1 and 2 to better estimate the decision boundary between classes 2 and 3 ⇒ better estimated probabilities of lying in class 3.”

Talk on PRESTO

On Monday I was lucky to give a presentation on “Predicting Rare Events by Shrinking Towards Proportional Odds” at the 90/30 Club. I’m grateful to the current organizers Ryo Sakai, Logan Graves, and Max Phelps for having me, as well as the founder of the group, Lydia Nottingham.

I figured this is as good a time as any to gather the resources I have related to PRESTO in one place. So here they are:

  • The published paper
  • The arXiv version, which is less “smushed” and also has a bit more content in the main paper that had to be relegated to the appendix in the published version
  • A 5-minute video I recorded for ICML to summarize the paper, with slides
  • Our poster from ICML walking through the paper
  • The GitHub repo for the paper
  • A video of a talk I gave at DataCon LA 2023 that was mostly about the conceptual background and motivation for PRESTO, but also touched on PRESTO itself towards the end
  • Slides and code from the DataCon LA talk