Learning with Stakes: A Cost-Sensitive Framework for Classification
Room 236
Presenter: Kabir Kang
Modality: Traditional Talk
Abstract
This talk describes a framework in which each training example is assigned a cost reflecting how consequential it would be to get wrong. In data labeled by multiple annotators, the clearer the agreement on an example, the higher the cost of misclassifying it. The framework is useful in any setting where obvious cases matter more than borderline ones. In my work so far, I have found that cost-aware training improves the cost-aware score even when plain accuracy barely changes; in other words, if you can define a reasonable per-example cost, you can align your model with what actually matters. I'll walk through three simple training setups: a standard classifier, a cost-weighted version that pays more attention to high-cost examples, and a model that first predicts a margin/confidence signal and then classifies. I'll evaluate them in two ways: the familiar accuracy, and a cost-aware score that asks, of all the potential cost in the dataset, how much did our mistakes actually incur? The key message is that this is a practical retrofit that can reduce the impact of the most harmful errors when the cost signal is at least moderately predictable. I'll also be clear about the limits of the methodology: noisy or biased cost signals blunt the benefits, and distribution shift can make costs misaligned. My research is ongoing within the Data Centric ML group at Georgia Tech, so there may be further findings to discuss.
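To make the two evaluation views concrete, here is a minimal sketch of the cost-weighted setup and the cost-aware score described above. The helper names (`cost_aware_score`, `fit_logistic`), the gradient-descent classifier, and the synthetic cost signal are all illustrative assumptions, not the talk's actual implementation:

```python
import numpy as np

def cost_aware_score(y_true, y_pred, costs):
    # Of all the potential cost in the dataset, how much did our
    # mistakes actually incur? 1.0 = no costly errors, 0.0 = all cost incurred.
    incurred = costs[y_true != y_pred].sum()
    return 1.0 - incurred / costs.sum()

def fit_logistic(X, y, sample_weight=None, lr=0.5, steps=500):
    # Minimal logistic regression by gradient descent; per-example weights
    # stand in for "pay more attention to high-cost examples".
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    sw = np.ones(n) if sample_weight is None else sample_weight / sample_weight.mean()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = sw * (p - y)          # weighted logistic-loss gradient signal
        w -= lr * (X.T @ g) / n
        b -= lr * g.mean()
    return w, b

def predict(X, w, b):
    return ((X @ w + b) > 0).astype(int)

# Toy data: two overlapping clusters. The hypothetical cost signal grows with
# distance from the class boundary, mimicking "clear agreement = high cost".
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (200, 2)), rng.normal(1.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
costs = np.abs(X[:, 0]) + 0.1

w0, b0 = fit_logistic(X, y)                        # standard classifier
w1, b1 = fit_logistic(X, y, sample_weight=costs)   # cost-weighted classifier

acc_base = (predict(X, w0, b0) == y).mean()
acc_cost = (predict(X, w1, b1) == y).mean()
score_base = cost_aware_score(y, predict(X, w0, b0), costs)
score_cost = cost_aware_score(y, predict(X, w1, b1), costs)
print(f"accuracy:   base={acc_base:.3f}  cost-weighted={acc_cost:.3f}")
print(f"cost score: base={score_base:.3f}  cost-weighted={score_cost:.3f}")
```

On data like this, where costly examples sit far from the boundary, the two training setups often land on similar accuracy while the cost-aware score separates them; the point of the metric is exactly that it can move when accuracy barely does.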