Directional Tests
Directional tests attempt to test if a model is actually learning sensible information from the data. Chip Huyen gives an example of house price prediction [1]. If the square footage increases while all other features are held constant, then surely the price should not decrease.
For our example, we will increase the study time feature to be maximum, meaning all students study a lot and compare the model's performance against the baseline on the entire untampered dataset.
We see that accuracy drops quite a bit! Recall drops heavily while the f-1 score also drops. On the other hand, precision goes up significantly to almost 100%. What's going on here? If students study longer, they are more likely to fail? That doesn't make any sense. Clearly, there are some issues that require further study.
References
[1] Huyen, Chip (2022). Designing Machine Learning Systems