In a 2022 research paper, my advisor Jacob Bien and I proposed a novel feature selection method called cluster stability selection, which identifies features that are useful for predicting a response variable. It has applications in medical research (including genomics and genetics), economics, survey data analysis, psychology, and more.
I recently created an R package called cssr that implements cluster stability selection. I designed it to be user-friendly, reliable, and efficient. It’s well tested (with over 3,300 tests) and uses parallel processing for speed. Read the post to learn more about it and try it out yourself!