From Measurements to Patients: Data Aggregation in Supervised Classification of X-Ray Diffraction Datasets

Date: May 15, 2026

Authors: Alexander Alekseev, Keith Rogers, Lev Mourokh, Pavel Lazarev

This article examines how aggregation strategies can improve supervised machine learning for medical diagnostics using X-ray diffraction (XRD) datasets. The work focuses on the transition from individual measurements to patient-level diagnosis, a critical step when multiple measurements are collected from the same patient or sample.

The authors applied aggregation both before machine learning modeling (pre-model aggregation) and after it (post-model aggregation) to two XRD datasets: human breast biopsy samples and canine claw samples. Random Forest and Logistic Regression classifiers were evaluated using ROC-AUC and balanced accuracy.
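As a rough illustration of aggregating before modeling, the sketch below collapses several measurements per patient into one feature vector by averaging. This summary does not state which pre-model operation the authors used, so the mean, the function name `aggregate_features`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def aggregate_features(X, patient_ids, labels):
    """Average all measurement rows that belong to the same patient.

    Assumption: every measurement of a patient carries the same label,
    and the mean is used only as an illustrative aggregation choice.
    """
    X = np.asarray(X, dtype=float)
    rows = {}           # patient id -> list of row indices
    patient_label = {}  # patient id -> class label
    for i, pid in enumerate(patient_ids):
        rows.setdefault(pid, []).append(i)
        patient_label[pid] = labels[i]
    patients = list(rows)  # keeps first-seen order
    X_agg = np.vstack([X[rows[pid]].mean(axis=0) for pid in patients])
    y_agg = np.array([patient_label[pid] for pid in patients])
    return patients, X_agg, y_agg

# Hypothetical toy data: three measurements, two patients, two features.
X = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0]]
patient_ids = ["A", "A", "B"]
labels = [1, 1, 0]

patients, X_agg, y_agg = aggregate_features(X, patient_ids, labels)
print(patients, X_agg, y_agg)
```

A classifier would then be trained on `X_agg` and `y_agg`, so each patient contributes exactly one row.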

Across both datasets, aggregation improved classification performance, with post-model aggregation generally providing stronger results. For human breast samples, Random Forest with logit aggregation achieved an ROC-AUC above 0.9. For canine samples, both Random Forest with logit aggregation and Logistic Regression using the median cancer probability reached an ROC-AUC of about 0.85.
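A minimal sketch of post-model aggregation follows, under two assumptions: that "logit aggregation" means averaging the log-odds of the measurement-level cancer probabilities and mapping the mean back to a probability, and that the median rule takes the median of those probabilities per patient. The function names and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math
from statistics import median

def logit(p, eps=1e-6):
    # Clip to avoid infinite log-odds at exactly 0 or 1.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def aggregate_predictions(probs, patient_ids, method="logit"):
    """Combine measurement-level probabilities into one score per patient."""
    groups = {}
    for pid, p in zip(patient_ids, probs):
        groups.setdefault(pid, []).append(float(p))
    result = {}
    for pid, ps in groups.items():
        if method == "logit":
            # Assumed rule: mean of log-odds, converted back to a probability.
            result[pid] = sigmoid(sum(logit(p) for p in ps) / len(ps))
        elif method == "median":
            result[pid] = median(ps)
        else:
            raise ValueError(f"unknown method: {method}")
    return result

# Hypothetical classifier outputs for six measurements from two patients.
probs = [0.9, 0.8, 0.4, 0.2, 0.3, 0.6]
patient_ids = ["A", "A", "A", "B", "B", "B"]

by_logit = aggregate_predictions(probs, patient_ids, method="logit")
by_median = aggregate_predictions(probs, patient_ids, method="median")
print(by_logit, by_median)
```

The patient-level scores can then be thresholded or passed to a metric such as ROC-AUC, with one score per patient instead of one per measurement.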

The study demonstrates that simple, interpretable aggregation methods can substantially improve patient-level classification in X-ray diffraction diagnostics, which supports future clinical applications of XRD-based structural biomarkers.

Keywords: machine learning; aggregation; supervised classification; X-ray diffraction

https://doi.org/10.3390/ijtm6020022
