Clinicians often produce large amounts of data, from patient metrics to drug component analyses. Classical statistical analysis can offer a first look at how variables interact, but in many cases machine learning (ML) can provide additional insight into new features. CLASSify gives these clinicians a way to automatically train many different machine learning models on tabular data and find which one produces the best results.
The interface provides an explainability score for each feature, indicating how much that feature contributes to the model's predictions. Users can see how each column of the data affects the model and can gain new insights into the data itself.
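To make the idea of a per-feature contribution concrete, here is a minimal sketch of exact Shapley values, the quantity that SHAP-style explainability scores estimate. It is not CLASSify's implementation: the function `shapley_values`, the `toy_model`, and the feature names (age, dose, weight) are hypothetical, and a feature is treated as "absent" by replacing it with a baseline value, which is one common convention among several. The brute-force loop over all feature subsets is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley value of each feature for one prediction.

    A feature that is "absent" from a coalition takes its baseline value.
    Cost grows as 2**n, so this is for illustration only.
    """
    n = len(instance)

    def value(present):
        # Model output when only the features in `present` use real values.
        mixed = [instance[j] if j in present else baseline[j] for j in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi[i] += weight * (value(present | {i}) - value(present))
    return phi


def toy_model(row):
    # Hypothetical model: weight has no effect, age and dose interact.
    age, dose, weight = row
    return 0.5 * age + 2.0 * dose + 0.1 * age * dose


instance = [10.0, 3.0, 70.0]   # hypothetical patient: age, dose, weight
baseline = [0.0, 0.0, 0.0]     # reference point for "feature absent"
scores = shapley_values(toy_model, instance, baseline)
for name, score in zip(["age", "dose", "weight"], scores):
    print(f"{name}: {score:.2f}")
```

The scores sum to the difference between the prediction for the instance and the prediction for the baseline, and a feature the model ignores (weight here) receives a score of zero. Those two properties are what make the scores readable as contributions.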
CLASSify also provides tools for synthetic data generation. Clinical datasets frequently have imbalanced class labels or contain protected information, both of which call for synthetically generated data that follows the same patterns and trends as the real data.
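As a rough illustration of the class-imbalance case, the sketch below tops up minority classes with synthetic rows until every class is the same size. It is a deliberately simple stand-in, not the generator CLASSify uses: it fits an independent Gaussian to each numeric column within each class, so it ignores correlations between columns, and the function name `balance_with_synthetic_rows` and the sample data are hypothetical.

```python
import random
import statistics
from collections import defaultdict


def balance_with_synthetic_rows(rows, labels, seed=0):
    """Add synthetic rows so that every class has as many rows as the largest.

    Each synthetic row is drawn column by column from a Gaussian fitted to
    the real rows of that class (columns are assumed numeric and independent).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = list(rows), list(labels)

    for label, group in by_class.items():
        columns = list(zip(*group))
        means = [statistics.fmean(col) for col in columns]
        # stdev needs at least two points; fall back to no spread otherwise.
        stdevs = [statistics.stdev(col) if len(col) > 1 else 0.0 for col in columns]
        for _ in range(target - len(group)):
            out_rows.append([rng.gauss(m, s) for m, s in zip(means, stdevs)])
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical imbalanced dataset: four rows of class 0, two of class 1.
rows = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5], [1.1, 10.5], [3.0, 20.0], [3.2, 21.0]]
labels = [0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = balance_with_synthetic_rows(rows, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

This sketch only addresses imbalance. Replacing protected records entirely with synthetic ones needs a model of the joint distribution, which is the kind of task dedicated synthetic data libraries are built for.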
CLASSify currently provides ten unique supervised ML models to train and evaluate:
Additionally, it provides three unsupervised learning techniques:
For each model, users can set parameters manually at submission, keep the defaults, or enable automatic parameter tuning via Optuna.
CLASSify is available on an individual basis. Before you can get started, an administrator must grant you the necessary permissions.
Contact ai@uky.edu for access or submit the collaboration intake form.
A tutorial video explaining CLASSify and how to use it is available.
Examples of projects using CLASSify:
"CLASSify: A Web-Based Tool for Machine Learning" was accepted to AMIA in 2023 (PubMed).
Additional tools and programs used in CLASSify include ClearML for job queueing, Optuna for parameter tuning, and S3 for secure storage. All training and evaluation run on the DGX cluster. The Synthetic Data Vault (SDV) library provides the synthetic data generation models. Explainability scores are calculated using the SHAP algorithm.
CLASSify is private but not HIPAA compliant. HIPAA-compliant instances can be created on request.