Abstract
Selecting the right performance metric is crucial for evaluating machine learning models. This study provides empirical evidence comparing accuracy, precision, recall, F1-score, and Matthews Correlation Coefficient (MCC) across various classification scenarios. Using advanced statistical methods, machine learning techniques, and explainable AI (XAI), we demonstrate when each metric is most appropriate and how metric selection impacts model evaluation and decision-making.
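As a rough illustration of why metric choice matters, the sketch below computes the five metrics named above from the counts of a binary confusion matrix. It is not taken from the study: the function name `classification_metrics`, the zero-division convention (returning 0.0 when a denominator is zero), and the example counts are all assumptions chosen for illustration.

```python
import math


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumption: any metric whose denominator is zero is reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC uses all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical imbalanced data set: 5 positives, 95 negatives.
# A classifier that always predicts "negative" gets every positive wrong.
always_negative = classification_metrics(tp=0, fp=0, fn=5, tn=95)
print(always_negative)
# accuracy is 0.95, while recall, F1, and MCC are all 0.0
```

In this made-up case accuracy looks strong even though the classifier never finds a positive example, whereas recall, F1-score, and MCC all report 0.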
Citation
@article{SujonEtAl:2025,
Author = {Sujon, K. M. and Hassan, R. and Choi, K. and Samad, M. A. and others},
Title = {Accuracy, precision, recall, {F1}-score, or {MCC}? {E}mpirical evidence from advanced statistics, {ML}, and {XAI}},
Journal = {Journal of Big Data},
Volume = {12},
Pages = {268},
Year = {2025},
Doi = {10.1186/s40537-025-01313-4}
}