Introduction To Machine Learning
Expert-defined terms from the Advanced Skill Certificate in AI for Healthcare Leaders course at London School of Planning and Management. Free to read, free to share, paired with a professional course.
Algorithm #
Algorithm
Definition #
A step‑by‑step computational method for solving a problem or performing a task. Examples: Gradient descent for optimizing model parameters; k‑nearest neighbors for classification. Applications: Predicting patient readmission risk, optimizing treatment pathways. Challenges: Choosing the right algorithm for high‑dimensional clinical data; balancing accuracy with interpretability.
Artificial Neural Network (ANN) #
Artificial Neural Network (ANN)
Definition #
A network of interconnected nodes (neurons) that mimics the brain’s structure to learn patterns from data. Examples: A multilayer perceptron that predicts disease onset from electronic health records (EHR). Applications: Imaging diagnostics, genomics‑based risk stratification. Challenges: Requires large labeled datasets; risk of overfitting; limited transparency for clinical decision‑making.
Bias #
Bias
Definition #
Systematic error introduced by assumptions in the learning process, leading to inaccurate predictions. Examples: A model trained on predominantly male patients may underperform for female patients. Applications: Bias detection tools help maintain equitable AI in population health management. Challenges: Identifying hidden biases in complex datasets; mitigating bias without sacrificing performance.
Classification #
Classification
Definition #
Assigning input data to predefined categories based on learned patterns. Examples: Classifying chest X‑rays as “normal” or “pneumonia.”
Applications #
Triage systems, disease coding automation. Challenges: Imbalanced class distributions; need for high specificity in medical contexts.
Cross‑Validation #
Cross‑Validation
Definition #
A technique for assessing model performance by partitioning data into training and testing subsets multiple times. Examples: 5‑Fold cross‑validation on a dataset of diabetic patients to evaluate a predictive model. Applications: Ensuring robust performance before deployment in clinical decision support. Challenges: Computationally intensive for large datasets; preserving temporal integrity in longitudinal health data.
Dataset #
Dataset
Definition #
A collection of data points used to train, validate, or test machine‑learning models. Examples: The MIMIC‑III intensive care database containing vital signs, lab results, and outcomes. Applications: Model development for sepsis prediction, medication adherence monitoring. Challenges: Data quality, missing values, heterogeneity across institutions, privacy regulations.
Deep Learning #
Deep Learning
Definition #
A subset of machine learning that uses deep neural networks with many hidden layers to automatically extract hierarchical features. Examples: Convolutional neural networks (CNNs) detecting retinal disease from fundus images. Applications: Radiology image analysis, pathology slide interpretation, speech‑based symptom extraction. Challenges: Requires extensive computational resources; interpretability concerns; limited labeled data in niche clinical domains.
Dimensionality Reduction #
Dimensionality Reduction
Definition #
Techniques that reduce the number of variables while preserving essential information. Examples: Principal Component Analysis (PCA) compressing high‑dimensional gene expression data. Applications: Visualization of patient clusters, speeding up model training. Challenges: Potential loss of clinically relevant features; selecting the appropriate number of components.
Ensemble Learning #
Ensemble Learning
Definition #
Combining multiple models to improve predictive performance and robustness. Examples: Random Forests aggregating decision trees for mortality prediction. Applications: Risk scoring systems, predictive maintenance of medical equipment. Challenges: Increased complexity, longer inference times, difficulty in interpreting combined predictions.
Feature Engineering #
Feature Engineering
Definition #
The process of creating, selecting, and transforming variables to improve model performance. Examples: Deriving body mass index (BMI) from height and weight, encoding ICD‑10 codes as binary flags. Applications: Enhancing predictive models for chronic disease progression. Challenges: Requires domain expertise; risk of leakage if future information is inadvertently used.
Feature Selection #
Feature Selection
Definition #
Identifying the most relevant variables for a model while discarding redundant or noisy ones. Examples: Using recursive feature elimination to select top 20 lab markers for heart failure prediction. Applications: Streamlining models for real‑time clinical decision support. Challenges: Balancing model simplicity with predictive power; handling correlated clinical variables.
Gradient Descent #
Gradient Descent
Definition #
An iterative optimization algorithm that minimizes a loss function by moving in the direction of steepest descent. Examples: Training a logistic regression model to predict hospital readmission. Applications: Parameter tuning for a wide range of models in health analytics. Challenges: Choosing appropriate learning rates; avoiding local minima in non‑convex loss landscapes.
Hyperparameter Tuning #
Hyperparameter Tuning
Definition #
The process of selecting the best configuration of model settings that are not learned from data. Examples: Adjusting the number of trees in a Random Forest to improve prediction of surgical complications. Applications: Optimizing AI tools for precision medicine. Challenges: Computational cost; risk of over‑fitting to validation data.
Imbalanced Data #
Imbalanced Data
Definition #
Situations where some classes have far fewer observations than others, leading to biased model performance. Examples: Rare disease detection where positive cases constitute <1% of the dataset. Applications: Early detection of rare cancers, adverse event monitoring. Challenges: Standard accuracy metrics become misleading; need for specialized sampling or loss functions.
Inference #
Inference
Definition #
The process of applying a trained model to new, unseen data to generate outputs. Examples: Using a trained neural network to classify new MRI scans in a hospital PACS system. Applications: Real‑time alerts for sepsis, automated triage. Challenges: Ensuring low latency, maintaining data privacy, handling distribution shift over time.
K‑Nearest Neighbors (KNN) #
K‑Nearest Neighbors (KNN)
Definition #
A non‑parametric algorithm that classifies a sample based on the majority label among its k closest neighbors. Examples: Predicting medication adherence by comparing a patient’s behavior to similar historical profiles. Applications: Simple baseline models for patient similarity analysis. Challenges: Sensitive to feature scaling; computationally expensive with large health datasets.
Logistic Regression #
Logistic Regression
Definition #
A statistical model that estimates the probability of a binary outcome using a logistic function. Examples: Estimating the likelihood of hospital readmission within 30 days. Applications: Baseline risk models, interpretable clinical scoring systems. Challenges: Linear decision boundary may not capture complex interactions; need for feature engineering.
Loss Function #
Loss Function
Definition #
A mathematical formulation that quantifies the error between predicted and true values, guiding model training. Examples: Cross‑entropy loss for classification of disease categories. Applications: Optimizing predictive accuracy for diagnostic AI. Challenges: Selecting appropriate loss for imbalanced clinical outcomes; ensuring smooth gradients for deep networks.
Machine Learning (ML) #
Machine Learning (ML)
Definition #
A field of computer science that enables systems to learn patterns from data without explicit programming. Examples: Predictive analytics for patient length of stay. Applications: Clinical decision support, resource allocation, personalized treatment recommendations. Challenges: Data heterogeneity, regulatory compliance, explainability for clinicians.
Model Interpretability #
Model Interpretability
Definition #
The degree to which a human can understand the internal mechanics and predictions of a model. Examples: Using SHAP values to illustrate why a model flagged a patient as high‑risk for stroke. Applications: Building trust in AI‑driven diagnostics, meeting regulatory requirements. Challenges: Trade‑off between accuracy and transparency; explaining deep‑learning models to non‑technical stakeholders.
Natural Language Processing (NLP) #
Natural Language Processing (NLP)
Definition #
Techniques for analyzing and deriving meaning from unstructured textual data. Examples: Extracting medication information from clinical notes using named‑entity recognition. Applications: Automated chart review, sentiment analysis of patient feedback. Challenges: Handling medical jargon, ensuring de‑identification, dealing with misspellings and abbreviations.
Neural Network Architecture #
Neural Network Architecture
Definition #
The structural design of a neural network, including the number and type of layers, connections, and activation functions. Examples: A CNN with pooling layers for processing CT scans. Applications: Tailoring models to specific imaging modalities or time‑series data. Challenges: Selecting architectures that balance performance with computational feasibility in hospital environments.
Overfitting #
Overfitting
Definition #
When a model learns noise and specific patterns from training data, resulting in poor generalization to new data. Examples: A model that perfectly predicts training set outcomes but fails on external test cohorts. Applications: Recognizing overfitting helps maintain reliable AI tools for patient care. Challenges: Detecting subtle overfitting in high‑dimensional health data; applying appropriate regularization techniques.
Precision #
Precision
Definition #
The proportion of true positive predictions among all positive predictions made by the model. Examples: In cancer detection, precision measures how many flagged lesions are truly malignant. Applications: Critical for diagnostic tools where false positives lead to unnecessary procedures. Challenges: Balancing precision with recall, especially in low‑prevalence diseases.
Recall #
Recall
Definition #
The proportion of actual positive cases that the model correctly identifies. Examples: Detecting all sepsis cases in an ICU dataset. Applications: High recall is essential for safety‑critical alerts. Challenges: Increasing recall may reduce precision; requires careful threshold selection.
Regularization #
Regularization
Definition #
Techniques that add a penalty to the loss function to discourage overly complex models and reduce overfitting. Examples: L2 regularization (ridge) applied to logistic regression for readmission prediction. Applications: Stabilizing models on sparse clinical datasets. Challenges: Choosing the right regularization strength; interpreting the effect on feature coefficients.
Reinforcement Learning (RL) #
Reinforcement Learning (RL)
Definition #
A learning paradigm where an agent learns to make sequential decisions by receiving rewards or penalties. Examples: Optimizing treatment dosing schedules to maximize patient outcomes. Applications: Adaptive therapy planning, robotic surgery assistance. Challenges: Defining appropriate reward functions; ensuring safety during exploration phases.
Sampling Bias #
Sampling Bias
Definition #
Systematic error introduced when the sampled data does not reflect the target population. Examples: Training a model on data from a single tertiary hospital may not generalize to community clinics. Applications: Awareness of sampling bias improves model transferability across health systems. Challenges: Acquiring diverse multi‑site datasets; correcting bias post‑hoc.
Scalability #
Scalability
Definition #
The ability of an algorithm or system to maintain performance as data volume or computational resources grow. Examples: Deploying a Spark‑based pipeline for processing millions of EHR records. Applications: Nationwide disease surveillance, population‑level predictive analytics. Challenges: Managing latency, ensuring data security across distributed environments.
Segmentation #
Segmentation
Definition #
Dividing an image or dataset into meaningful parts, often used in medical imaging to isolate structures. Examples: Segmenting tumor boundaries in MRI scans using U‑Net architectures. Applications: Quantifying lesion volume for treatment monitoring. Challenges: Obtaining accurate ground‑truth annotations; handling variability across imaging protocols.
Supervised Learning #
Supervised Learning
Definition #
Learning from input‑output pairs where the desired outcome is known, enabling the model to predict labels for new data. Examples: Predicting 30‑day mortality using labeled historical patient records. Applications: Risk stratification, outcome prediction. Challenges: Limited labeled data for rare conditions; high annotation costs.
Support Vector Machine (SVM) #
Support Vector Machine (SVM)
Definition #
A classification algorithm that finds the hyperplane maximizing the margin between classes, optionally using kernel functions for non‑linear separation. Examples: Classifying ECG signals as normal or arrhythmic. Applications: High‑accuracy binary classifiers in limited‑sample settings. Challenges: Sensitivity to parameter selection; computationally intensive with large feature sets.
Temporal Data #
Temporal Data
Definition #
Data points collected over time, capturing trends, cycles, and temporal dependencies. Examples: Daily blood glucose measurements for diabetic patients. Applications: Forecasting disease progression, detecting early deterioration. Challenges: Missing timestamps, irregular sampling intervals, concept drift.
Transfer Learning #
Transfer Learning
Definition #
Leveraging knowledge from a model trained on one task to improve performance on a related task with limited data. Examples: Fine‑tuning a ResNet model pretrained on ImageNet for histopathology slide classification. Applications: Accelerating development of AI tools for niche medical imaging domains. Challenges: Domain mismatch; ensuring transferred features are clinically relevant.
Underfitting #
Underfitting
Definition #
When a model is too simple to capture underlying patterns, leading to poor performance on both training and test data. Examples: Using a linear model for a highly non‑linear relationship between biomarkers and disease risk. Applications: Identifying underfitting prompts model complexity adjustments. Challenges: Detecting underfitting early; balancing model complexity against interpretability.
Unsupervised Learning #
Unsupervised Learning
Definition #
Learning patterns from data without explicit labels, uncovering hidden structures. Examples: Clustering patients based on multi‑omics profiles to discover disease subtypes. Applications: Phenotype discovery, anomaly detection in vital sign streams. Challenges: Validating clusters without ground truth; ensuring clinical relevance.
Validation Set #
Validation Set
Definition #
A subset of data used to fine‑tune model parameters and assess performance before final testing. Examples: Reserving 20% of a hospital dataset for validation of a predictive sepsis model. Applications: Preventing overfitting during model development. Challenges: Maintaining temporal integrity; avoiding data leakage.
Variance #
Variance
Definition #
The sensitivity of a model to fluctuations in the training data; high variance models capture noise. Examples: Decision trees that change drastically with small changes in patient records. Applications: Diagnosing high variance guides the use of ensemble methods. Challenges: Reducing variance without sacrificing model flexibility.
Weighted Loss #
Weighted Loss
Definition #
Adjusting the loss function to give more importance to under‑represented or critical classes. Examples: Assigning higher weight to mortality outcomes in a model predicting ICU discharge. Applications: Improving detection of rare but severe events. Challenges: Determining appropriate weight ratios; preventing over‑compensation that harms overall performance.
Word Embedding #
Word Embedding
Definition #
A representation of words as dense vectors that capture semantic relationships. Examples: Using Word2Vec to encode clinical note terminology for downstream classification. Applications: Enhancing symptom extraction, medication reconciliation from free‑text. Challenges: Domain‑specific vocabularies; handling out‑of‑vocabulary medical terms.
Zero‑Shot Learning #
Zero‑Shot Learning
Definition #
Enabling a model to recognize classes it has never seen during training by leveraging semantic information. Examples: Detecting a newly emerging disease pattern in imaging without explicit labeled examples. Applications: Rapid response to emerging health threats, such as novel pathogens. Challenges: Requires robust auxiliary information; risk of false positives in safety‑critical settings.