Artificial Intelligence And Machine Learning
Expert-defined terms from the Professional Certificate in GDPR and AI Data Privacy Compliance course at London School of Planning and Management. Free to read, free to share, paired with a professional course.
Algorithm #
Algorithm
Concept #
A defined set of steps for solving a problem. Related terms: Model, Optimization, Heuristic. Explanation: In AI, an algorithm processes data to produce predictions or decisions. Example: Decision‑tree learning algorithm splits data based on feature values.
Practical application #
Used to classify emails as spam or not. Challenges: Selecting the right algorithm for a given data set and avoiding bias.
Artificial Intelligence (AI) #
Artificial Intelligence (AI)
Concept #
The simulation of human intelligence by machines. Related terms: Machine Learning, Deep Learning, Automation. Explanation: AI enables systems to perceive, reason, learn, and act. Example: Virtual assistants interpret voice commands and schedule meetings.
Practical application #
Customer‑service chatbots reduce response time. Challenges: Transparency, ethical use, and compliance with data‑protection laws.
Artificial Neural Network (ANN) #
Artificial Neural Network (ANN)
Concept #
A computational model inspired by biological neurons. Related terms: Deep Learning, Layers, Activation Function. Explanation: ANNs consist of interconnected nodes that transform inputs into outputs through weighted connections. Example: Image‑recognition network classifies pictures of cats.
Practical application #
Facial‑recognition systems for secure access. Challenges: Need for large labeled datasets and interpretability concerns.
Automated Decision‑Making (ADM) #
Automated Decision‑Making (ADM)
Concept #
Decision processes carried out by algorithms without human intervention. Related terms: Profiling, Transparency, Accountability. Explanation: ADM can affect individuals’ rights, making privacy impact assessments essential. Example: Credit‑score algorithms approve or reject loan applications.
Practical application #
Real‑time fraud detection in banking. Challenges: Ensuring fairness, avoiding discrimination, and meeting GDPR “right to explanation” expectations.
Backpropagation #
Backpropagation
Concept #
A training method for adjusting neural‑network weights. Related terms: Gradient Descent, Loss Function, Learning Rate. Explanation: Errors propagate backward from output to input layers, updating weights to minimize loss. Example: Training a speech‑recognition model reduces transcription errors.
Practical application #
Improves accuracy of voice‑controlled devices. Challenges: Vanishing gradients in deep networks and computational cost.
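The backward flow of errors described above can be sketched on a deliberately tiny network. This is a minimal illustration, not a training framework: the two-weight network, the squared-error loss, and all names (`forward`, `backward`, `w1`, `w2`) are assumptions chosen for the example.

```python
import math

def forward(w1, w2, x):
    """Tiny network: one tanh hidden unit followed by a linear output."""
    h = math.tanh(w1 * x)
    y_hat = w2 * h
    return h, y_hat

def loss(w1, w2, x, y):
    """Squared-error loss for a single example."""
    _, y_hat = forward(w1, w2, x)
    return 0.5 * (y_hat - y) ** 2

def backward(w1, w2, x, y):
    """Apply the chain rule from the output back toward the input."""
    h, y_hat = forward(w1, w2, x)
    d_y_hat = y_hat - y            # dL/dy_hat
    d_w2 = d_y_hat * h             # gradient for the output weight
    d_h = d_y_hat * w2             # error passed back to the hidden unit
    d_z = d_h * (1.0 - h ** 2)     # through the tanh derivative
    d_w1 = d_z * x                 # gradient for the input weight
    return d_w1, d_w2

w1, w2, x, y = 0.5, -0.3, 1.2, 0.8
g1, g2 = backward(w1, w2, x, y)

# Finite-difference check: the analytic gradients should match closely.
eps = 1e-6
num_g1 = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
num_g2 = (loss(w1, w2 + eps, x, y) - loss(w1, w2 - eps, x, y)) / (2 * eps)
print(g1, num_g1)
print(g2, num_g2)
```

Comparing against a numerical gradient is a common sanity check when backpropagation is written by hand.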
Bias (Algorithmic) #
Bias (Algorithmic)
Concept #
Systematic error that skews outcomes. Related terms: Fairness, Discrimination, Data Quality. Explanation: Bias can arise from unrepresentative training data or flawed model design. Example: Facial‑recognition system misidentifies darker‑skinned individuals.
Practical application #
Auditing AI pipelines to detect bias before deployment. Challenges: Identifying hidden bias, mitigating it, and documenting remediation steps for regulators.
Big Data #
Big Data
Concept #
Extremely large and complex data sets. Related terms: Volume, Velocity, Variety. Explanation: AI models often require big data to achieve high performance. Example: Social‑media platforms analyze billions of posts for sentiment trends.
Practical application #
Predictive maintenance in manufacturing based on sensor streams. Challenges: Ensuring lawful processing under GDPR, anonymization, and storage security.
Classification #
Classification
Concept #
Assigning items to predefined categories. Related terms: Supervised Learning, Labels, Confusion Matrix. Explanation: A model learns from labeled examples to predict categories of new data. Example: Email classifier tags messages as “spam” or “inbox”.
Practical application #
Medical diagnosis tools categorizing images as benign or malignant. Challenges: Class imbalance, overfitting, and requirement for high‑quality labeled data.
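As a rough illustration of the confusion matrix named in the related terms, the sketch below counts outcomes for a binary spam classifier. The labels and predictions are made-up toy data, and `confusion_counts` is a hypothetical helper, not a library function.

```python
from collections import Counter

def confusion_counts(y_true, y_pred, positive="spam"):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

# Toy labels, invented for illustration only.
y_true = ["spam", "inbox", "spam", "inbox", "inbox", "spam"]
y_pred = ["spam", "inbox", "inbox", "spam", "inbox", "spam"]

c = confusion_counts(y_true, y_pred)
precision = c["tp"] / (c["tp"] + c["fp"])
recall = c["tp"] / (c["tp"] + c["fn"])
print(dict(c), precision, recall)
```

With imbalanced classes, precision and recall are usually more informative than plain accuracy, which is why the counts are kept separate here.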
Clustering #
Clustering
Concept #
Grouping similar data points without pre‑assigned labels. Related terms: Unsupervised Learning, K‑means, Silhouette Score. Explanation: Algorithms discover natural structures in data, useful for segmentation. Example: Customer segmentation clusters shoppers by purchasing behavior.
Practical application #
Targeted marketing campaigns based on cluster profiles. Challenges: Determining the optimal number of clusters and interpreting results.
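The K‑means idea mentioned above can be sketched in one dimension. This is a simplified version under stated assumptions: the spend values and starting centroids are invented, the number of clusters is fixed at two, and a fixed iteration count stands in for a proper convergence test.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical monthly spend per shopper.
spend = [12.0, 15.0, 14.0, 80.0, 85.0, 90.0]
centroids, clusters = kmeans_1d(spend, centroids=[0.0, 100.0])
print(centroids, clusters)
```

The result depends on the initial centroids and on the chosen number of clusters, which is the difficulty noted under Challenges.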
Computer Vision #
Computer Vision
Concept #
Enabling machines to interpret visual information. Related terms: Image Recognition, Convolutional Neural Network, Object Detection. Explanation: Techniques extract features from pixels to identify objects, scenes, or actions. Example: Self‑driving cars detect pedestrians and traffic signs.
Practical application #
Automated quality inspection on production lines. Challenges: Managing privacy of captured images and ensuring compliance with GDPR’s “data minimization” principle.
Confidential Computing #
Confidential Computing
Concept #
Protecting data while it is being processed. Related terms: Trusted Execution Environment, Homomorphic Encryption, Secure Enclave. Explanation: Hardware‑based isolation prevents unauthorized access during AI inference. Example: Financial analytics run inside an enclave to keep client data private.
Practical application #
Collaborative AI training across organizations without exposing raw data. Challenges: Limited hardware support, performance overhead, and verification of enclave integrity.
Cross‑Validation #
Cross‑Validation
Concept #
Technique for assessing model performance on unseen data. Related terms: Train‑Test Split, K‑fold, Overfitting. Explanation: Data is partitioned into multiple folds; each fold serves as a test set once. Example: 5‑fold cross‑validation evaluates a churn‑prediction model.
Practical application #
Selecting the best hyperparameters for a classifier. Challenges: Increased computational time and possible data leakage if folds are not independent.
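The fold partitioning described above can be sketched as follows. It is a minimal version that assumes samples are already shuffled and independent; `k_fold_indices` is a hypothetical helper written for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

If records from the same individual can land in both the training and test folds, the leakage risk noted under Challenges applies; grouping by individual before splitting is one way to reduce it.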
Data Anonymization #
Data Anonymization
Concept #
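Removing or transforming personally identifiable information (PII) so that individuals can no longer be identified. Related terms: Pseudonymization, De‑identification, Re‑identification Risk. Explanation: Truly anonymized data cannot be linked back to an individual and falls outside the scope of GDPR, which facilitates lawful AI training; data that can still be re‑linked using additional information is only pseudonymized and remains personal data. Example: Patient records are stripped of names and IDs, and remaining quasi‑identifiers are generalized, before model training.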
Practical application #
Public health research using AI while respecting privacy. Challenges: Ensuring true irreversibility and balancing utility versus privacy.
Data Governance #
Data Governance
Concept #
Framework for managing data availability, usability, integrity, and security. Related terms: Stewardship, Policy, Metadata. Explanation: Strong governance supports compliance with GDPR and AI ethics. Example: Enterprise data catalog tracks lineage of training data.
Practical application #
Auditable pipelines for regulated industries (e.g., pharma). Challenges: Coordinating across departments, maintaining up‑to‑date documentation, and handling cross‑border data transfers.
Data Minimization #
Data Minimization
Concept #
Collecting only data necessary for a specific purpose. Related terms: Purpose Limitation, Retention, GDPR Principle. Explanation: AI projects must justify each data element used in model development. Example: A recommendation engine stores only product‑view events, not full browsing history.
Practical application #
Reducing storage costs and exposure risk. Challenges: Determining the minimal set that still yields acceptable model performance.
Data Provenance #
Data Provenance
Concept #
Record of data origin, transformations, and ownership. Related terms: Lineage, Audit Trail, Metadata. Explanation: Provenance supports transparency and accountability in AI pipelines. Example: Each training sample logs the source system, timestamp, and cleaning steps.
Practical application #
Demonstrating compliance during regulator inspections. Challenges: Capturing provenance at scale and integrating with existing data‑management tools.
Data Protection Impact Assessment (DPIA) #
Data Protection Impact Assessment (DPIA)
Concept #
Systematic process to evaluate privacy risks of a processing activity. Related terms: Risk Assessment, GDPR Article 35, Mitigation Measures. Explanation: Required when AI introduces high‑risk processing, such as large‑scale profiling. Example: Before launching a facial‑recognition system in a public space, a DPIA identifies potential discrimination.
Practical application #
Provides documented evidence of privacy‑by‑design. Challenges: Accurately forecasting risks, involving stakeholders, and updating DPIA as models evolve.
Data Quality #
Data Quality
Concept #
Accuracy, completeness, consistency, and timeliness of data. Related terms: Validation, Cleansing, Garbage In‑Garbage Out. Explanation: High‑quality data is essential for reliable AI outcomes and GDPR compliance. Example: Missing values in sensor logs are imputed before model training.
Practical application #
Improves predictive maintenance accuracy. Challenges: Detecting subtle errors, handling noisy data, and maintaining quality over time.
Data Subject Rights #
Data Subject Rights
Concept #
Rights granted to individuals under GDPR. Related terms: Right to Access, Right to Erasure, Right to Explanation. Explanation: AI systems must be designed to honor requests such as data deletion or model‑output explanations. Example: A user requests removal of their profile from a recommendation engine.
Practical application #
Implementing automated pipelines to locate and delete all personal data. Challenges: Tracing data across distributed training pipelines and ensuring downstream models no longer rely on deleted records.
Deep Learning #
Deep Learning
Concept #
Subset of machine learning using multi‑layer neural networks. Related terms: ANN, Convolutional Neural Network, Representation Learning. Explanation: Deep models automatically learn hierarchical features from raw data. Example: Transformer models generate human‑like text.
Practical application #
Language translation services. Challenges: High computational demand, opacity, and difficulty meeting GDPR “explainability” requirements.
Dimensionality Reduction #
Dimensionality Reduction
Concept #
Reducing the number of variables while preserving essential information. Related terms: Principal Component Analysis, t‑SNE, Feature Selection. Explanation: Simplifies models, speeds up training, and mitigates overfitting. Example: PCA compresses image data from 1024 dimensions to 50 principal components.
Practical application #
Visualizing high‑dimensional customer data. Challenges: Loss of interpretability and potential removal of subtle privacy‑relevant signals.
Edge AI #
Edge AI
Concept #
Deploying AI inference on devices at the network edge. Related terms: On‑Device Inference, Latency, Federated Learning. Explanation: Edge AI processes data locally, reducing data transfer and enhancing privacy. Example: Smartphone camera applies object detection without sending images to the cloud.
Practical application #
Real‑time video analytics in surveillance. Challenges: Limited compute resources, model compression, and secure updates.
Explainable AI (XAI) #
Explainable AI (XAI)
Concept #
Techniques that make AI decisions understandable to humans. Related terms: Transparency, Model Interpretability, SHAP, LIME. Explanation: XAI helps satisfy GDPR’s accountability and “right to explanation” obligations. Example: SHAP values highlight which features influenced a loan‑approval score.
Practical application #
Auditing credit‑scoring algorithms for fairness. Challenges: Balancing explanation depth with model performance and protecting proprietary algorithms.
Federated Learning #
Federated Learning
Concept #
Training AI models across multiple devices without sharing raw data. Related terms: Decentralized Training, Privacy Preservation, Aggregation. Explanation: Each participant computes local model updates; a central server aggregates them. Example: Mobile keyboards improve predictive text while keeping typing data on the device.
Practical application #
Collaborative health‑research models across hospitals. Challenges: Communication overhead, handling heterogeneous data, and ensuring convergence.
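The aggregation step described above can be sketched as a weighted average of client weights, in the spirit of federated averaging. The weight vectors and example counts are invented, and a real deployment would add secure transport and, typically, further privacy protections that this sketch omits.

```python
def federated_average(client_updates):
    """Average client weight vectors, weighted by local example counts.

    client_updates: list of (weights, n_examples) tuples.
    """
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [sum(w[i] * n for w, n in client_updates) / total
            for i in range(dim)]

# Two hypothetical clients; only weights and counts leave each device.
updates = [([1.0, 2.0], 10), ([3.0, 4.0], 30)]
global_weights = federated_average(updates)
print(global_weights)
```

Weighting by example count lets clients with more data influence the global model more, which is one of several possible design choices.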
Feature Engineering #
Feature Engineering
Concept #
Process of creating informative input variables for models. Related terms: Feature Extraction, Transformation, Scaling. Explanation: Good features improve model accuracy and reduce training time. Example: Deriving “average purchase value” from transaction logs.
Practical application #
Fraud detection models using engineered risk scores. Challenges: Manual effort, risk of leaking sensitive information, and maintaining consistency across pipelines.
Feature Selection #
Feature Selection
Concept #
Choosing a subset of relevant features for model building. Related terms: Dimensionality Reduction, Wrapper Methods, Filter Methods. Explanation: Reduces overfitting and improves interpretability. Example: Recursive Feature Elimination selects top 10 predictors for churn.
Practical application #
Lightweight models for embedded devices. Challenges: Computational cost for large feature sets and potential loss of predictive power.
Generative Adversarial Network (GAN) #
Generative Adversarial Network (GAN)
Concept #
Two neural networks (generator and discriminator) competing to produce realistic data. Related terms: Synthetic Data, Deepfakes, Unsupervised Learning. Explanation: GANs can create artificial datasets that mimic real data distributions. Example: GAN generates realistic face images for training without using actual persons.
Practical application #
Data augmentation for rare medical conditions. Challenges: Ethical misuse (deepfakes), difficulty controlling output quality, and privacy concerns when synthetic data resembles real individuals.
General Data Protection Regulation (GDPR) #
General Data Protection Regulation (GDPR)
Concept #
EU law governing personal data processing. Related terms: Lawful Basis, Data Controller, Data Processor. Explanation: AI systems that handle personal data must comply with GDPR principles. Example: Company documents the lawful basis for using customer data in a recommendation engine.
Practical application #
Establishing contracts with AI service providers that include GDPR clauses. Challenges: Interpreting vague provisions (e.g., “legitimate interests”) in AI contexts.
Gradient Descent #
Gradient Descent
Concept #
Optimization algorithm that iteratively adjusts model parameters to minimize loss. Related terms: Learning Rate, Convergence, Stochastic Gradient Descent. Explanation: Parameters move opposite to the gradient of the loss function. Example: SGD updates weights after each mini‑batch during neural‑network training.
Practical application #
Training large‑scale language models. Challenges: Choosing appropriate learning rates, avoiding local minima, and handling non‑convex loss surfaces.
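The update rule described above, moving parameters opposite to the gradient, can be sketched on a one‑parameter problem. The quadratic loss, learning rate, and step count are assumptions picked so the behavior is easy to verify by hand.

```python
def loss(w):
    """Convex toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        w = w - learning_rate * grad(w)   # step against the gradient
    return w

w_final = gradient_descent(w=0.0)
print(w_final, loss(w_final))
```

On this convex loss any small enough learning rate converges; on the non‑convex surfaces mentioned under Challenges, the outcome can depend on the starting point and learning rate.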
Hyperparameter Tuning #
Hyperparameter Tuning
Concept #
Process of selecting optimal settings that govern model training. Related terms: Grid Search, Bayesian Optimization, Validation Set. Explanation: Hyperparameters (e.g., tree depth, regularization strength) are not learned from data but set before training. Example: Random search finds the best number of trees for a random‑forest classifier.
Practical application #
Improves model performance without altering data. Challenges: Computational expense and risk of over‑optimizing on validation data.
Inference #
Inference
Concept #
Using a trained model to make predictions on new data. Related terms: Deployment, Latency, Batch Processing. Explanation: Inference can occur in real time or offline, depending on use case. Example: API endpoint returns sentiment score for incoming text.
Practical application #
Real‑time recommendation engines. Challenges: Scaling to high request volumes, ensuring low latency, and protecting inference data from leakage.
Instance Segmentation #
Instance Segmentation
Concept #
Computer‑vision task that classifies each pixel and groups them into object instances. Related terms: Semantic Segmentation, Mask R‑CNN, Bounding Box. Explanation: Provides detailed shape information, useful for precise analysis. Example: Mask R‑CNN identifies and outlines each vehicle in a traffic video.
Practical application #
Automated inventory counting in warehouses. Challenges: High computational load and need for extensive annotated data.
Interpretability #
Interpretability
Concept #
Degree to which a human can understand the cause of a model’s output. Related terms: Explainable AI, Model Transparency, Feature Importance. Explanation: Interpretable models help stakeholders trust and verify AI decisions. Example: Decision‑tree paths show why a loan was denied.
Practical application #
Regulatory reporting for high‑risk AI systems. Challenges: Trade‑off between interpretability and predictive power, especially for deep models.
Joint Data Controller #
Joint Data Controller
Concept #
Two or more entities jointly determine the purposes and means of processing. Related terms: Data Sharing Agreement, Responsibility, GDPR Article 26. Explanation: Joint controllers share accountability for compliance. Example: Two hospitals collaborate on a shared AI model for disease prediction.
Practical application #
Formalizing roles in multi‑party AI projects. Challenges: Coordinating DPIAs, ensuring consistent data‑subject communications, and allocating liability.
K‑Anonymity #
K‑Anonymity
Concept #
Privacy technique that makes each record indistinguishable from at least k‑1 others with respect to its quasi‑identifiers. Related terms: L‑Diversity, Differential Privacy, De‑identification. Explanation: Reduces re‑identification risk by generalizing or suppressing quasi‑identifiers. Example: Dataset is transformed so that each combination of quasi‑identifier values (such as generalized zip code and age band) appears in at least 5 records.
Practical application #
Publishing health statistics while protecting patient privacy. Challenges: Balancing data utility with anonymity and handling background knowledge attacks.
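A dataset can be checked against the k‑anonymity condition by counting how many records share each combination of quasi‑identifier values. The sketch below assumes records are plain dictionaries; the field names and values are hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Invented records with already-generalized zip codes and age bands.
records = [
    {"zip": "021**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "021**", "age": "30-39", "diagnosis": "asthma"},
    {"zip": "021**", "age": "40-49", "diagnosis": "flu"},
]

print(is_k_anonymous(records, ["zip"], k=3))          # zip alone
print(is_k_anonymous(records, ["zip", "age"], k=2))   # zip and age together
```

Adding a second quasi‑identifier shrinks the groups, which shows why the check must cover every attribute an attacker could plausibly combine.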
Kernel Method #
Kernel Method
Concept #
Technique that maps data into higher‑dimensional space to make it linearly separable. Related terms: Support Vector Machine, Radial Basis Function, Feature Space. Explanation: Kernels compute inner products without explicit transformation. Example: RBF kernel enables SVM to separate non‑linear data.
Practical application #
Text classification where linear separation is insufficient. Challenges: Choosing appropriate kernel and managing computational complexity.
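The claim that kernels compute inner products without an explicit transformation can be checked numerically. The sketch below uses a degree‑2 polynomial kernel, because its feature map is small enough to write out, and includes an RBF kernel for comparison; the function names and the `gamma` default are assumptions.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel: an inner product in feature space."""
    return dot(x, y) ** 2

def poly2_features(x):
    """Explicit 2-D feature map that poly2_kernel avoids constructing."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel; its implicit feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [3.0, 4.0]
implicit = poly2_kernel(x, y)
explicit = dot(poly2_features(x), poly2_features(y))
print(implicit, explicit, rbf_kernel(x, y))
```

Both routes give the same value for the polynomial kernel, but the kernel never builds the expanded feature vector.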
Knowledge Distillation #
Knowledge Distillation
Concept #
Transferring knowledge from a large “teacher” model to a smaller “student” model. Related terms: Model Compression, Transfer Learning, Pruning. Explanation: Student learns to mimic teacher’s output, achieving comparable performance with reduced size. Example: Distilled BERT model runs on mobile devices with limited memory.
Practical application #
Deploying AI on edge devices while preserving accuracy. Challenges: Maintaining fidelity, avoiding loss of critical features, and verifying privacy compliance of distilled models.
Label Leakage #
Label Leakage
Concept #
Situation where target information unintentionally appears in features. Related terms: Data Contamination, Overfitting, Validation Leakage. Explanation: Causes inflated performance metrics and misleading model evaluation. Example: Training set includes a column that directly encodes the outcome.
Practical application #
Auditing data pipelines to detect and remove leakage before model training. Challenges: Identifying subtle leakage sources, especially in complex feature engineering.
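One simple audit for the leakage described above is to flag any low‑cardinality feature whose values determine the target exactly. This is only a heuristic sketch: it will miss subtler leaks, and the column names and rows are invented for illustration.

```python
def leaking_columns(rows, target):
    """Flag features whose values map deterministically onto the target."""
    n_classes = len({r[target] for r in rows})
    leaks = []
    for col in rows[0]:
        if col == target:
            continue
        mapping = {}
        # setdefault records the first target seen for each feature value.
        deterministic = all(
            mapping.setdefault(r[col], r[target]) == r[target] for r in rows
        )
        # Ignore ID-like columns, which are trivially deterministic.
        if deterministic and len(mapping) <= n_classes:
            leaks.append(col)
    return leaks

# Hypothetical churn data; account_closed is recorded after the outcome.
rows = [
    {"tenure": 3, "account_closed": 1, "churned": 1},
    {"tenure": 24, "account_closed": 0, "churned": 0},
    {"tenure": 3, "account_closed": 0, "churned": 0},
    {"tenure": 12, "account_closed": 1, "churned": 1},
]
suspects = leaking_columns(rows, target="churned")
print(suspects)
```

A flagged column is a prompt for manual review, not proof of leakage.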
Latent Variable #
Latent Variable
Concept #
Hidden variable inferred from observed data. Related terms: Hidden Markov Model, Autoencoder, Probabilistic Model. Explanation: Captures underlying structure that explains observable phenomena. Example: Autoencoder learns a low‑dimensional latent representation of images.
Practical application #
Anomaly detection by reconstructing normal behavior and flagging deviations. Challenges: Interpreting latent space and ensuring it does not encode personal attributes inadvertently.
Legal Basis (GDPR) #
Legal Basis (GDPR)
Concept #
Justification for processing personal data under GDPR. Related terms: Consent, Contractual Necessity, Legitimate Interest. Explanation: Each AI processing activity must be matched to an appropriate legal basis. Example: Consent obtained for using facial data in a security system.
Practical application #
Documenting lawful basis in model‑deployment documentation. Challenges: Maintaining consent records, handling withdrawal, and demonstrating proportionality.
Loss Function #
Loss Function
Concept #
Metric that quantifies error between predicted and true values. Related terms: Objective Function, Cross‑Entropy, Mean Squared Error. Explanation: Training aims to minimize the loss across the dataset. Example: Cross‑entropy loss measures classification error in a neural network.
Practical application #
Guiding the optimizer during model training. Challenges: Selecting an appropriate loss for imbalanced data and coping with non‑convex loss surfaces.
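The two losses named in the related terms can be written out directly. This is a plain‑Python sketch for small lists; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference, typical for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood for 0/1 labels."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])
good = binary_cross_entropy([1, 0], [0.9, 0.1])   # confident and right
bad = binary_cross_entropy([1, 0], [0.1, 0.9])    # confident and wrong
print(mse, good, bad)
```

Cross‑entropy penalizes confident wrong predictions much more heavily than confident correct ones, which is why it is a common choice for classifiers.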
Machine Learning (ML) #
Machine Learning (ML)
Concept #
Field of AI that enables systems to learn patterns from data. Related terms: Supervised Learning, Unsupervised Learning, Reinforcement Learning. Explanation: ML algorithms improve performance with experience without explicit programming. Example: Random‑forest classifier predicts churn based on historical usage.
Practical application #
Dynamic pricing adjustments in e‑commerce. Challenges: Data bias, model drift, and compliance with privacy regulations.
Model Drift #
Model Drift
Concept #
Degradation of model performance over time due to changing data distributions. Related terms: Concept Drift, Monitoring, Retraining. Explanation: Continuous monitoring is required to detect and address drift. Example: Spam filter accuracy declines as spammers adopt new tactics.
Practical application #
Scheduled retraining pipelines to keep models current. Challenges: Detecting subtle drift, balancing retraining frequency with resource cost, and ensuring new models respect GDPR.
Model Explainability #
Model Explainability
Concept #
Ability to describe how a model arrives at a specific output. Related terms: XAI, Transparency, SHAP Values. Explanation: Explainability supports accountability and user trust. Example: Local Interpretable Model‑agnostic Explanations (LIME) highlights influential words in a text classification.
Practical application #
Providing customers with reasons for a credit‑score change. Challenges: Scaling explanations to large models and protecting trade secrets.
Model Governance #
Model Governance
Concept #
Policies and procedures overseeing model lifecycle from design to retirement. Related terms: Model Registry, Audit, Compliance. Explanation: Governance ensures models meet ethical, legal, and performance standards. Example: Model registry records version, training data, and approved use cases.
Practical application #
Centralized oversight for AI in regulated sectors (e.g., finance). Challenges: Keeping documentation up‑to‑date, integrating with CI/CD pipelines, and handling cross‑jurisdictional regulations.
Model Monitoring #
Model Monitoring
Concept #
Ongoing observation of model performance and behavior in production. Related terms: Drift Detection, Alerting, Metrics Dashboard. Explanation: Monitoring identifies anomalies, bias emergence, and security threats. Example: Real‑time dashboard tracks prediction latency and accuracy for a recommendation engine.
Practical application #
Automated alerts trigger retraining when accuracy falls below a threshold. Challenges: Defining appropriate metrics, avoiding false positives, and ensuring monitoring data itself complies with privacy rules.
Model Retraining #
Model Retraining
Concept #
Updating a model with new data to maintain or improve performance. Related terms: Incremental Learning, Transfer Learning, Continuous Integration. Explanation: Retraining can be scheduled or triggered by drift detection. Example: Weekly batch jobs incorporate latest transaction data into fraud model.
Practical application #
Keeping AI services accurate in dynamic environments. Challenges: Managing version control, ensuring new data respects consent, and validating post‑retraining compliance.
Model Validation #
Model Validation
Concept #
Systematic assessment of model suitability before deployment. Related terms: Test Set, Cross‑Validation, Performance Metrics. Explanation: Validation checks for accuracy, fairness, robustness, and legal compliance. Example: Bias audit reveals gender disparity in hiring recommendation model.
Practical application #
Generating validation reports for regulator review. Challenges: Simulating real‑world conditions, covering edge cases, and documenting validation procedures.
Natural Language Processing (NLP) #
Natural Language Processing (NLP)
Concept #
AI subfield focused on understanding and generating human language. Related terms: Tokenization, Transformer, Sentiment Analysis. Explanation: NLP models convert text into structured representations for downstream tasks. Example: BERT fine‑tuned to classify customer support tickets.
Practical application #
Automated routing of emails to appropriate departments. Challenges: Handling ambiguous language, protecting sensitive information in text, and meeting GDPR “right to access” for generated content.
Neural Architecture Search (NAS) #
Neural Architecture Search (NAS)
Concept #
Automated process of designing optimal neural‑network structures. Related terms: AutoML, Hyperparameter Optimization, Search Space. Explanation: NAS evaluates many architectures to discover high‑performing models. Example: NAS discovers a compact CNN for on‑device image classification.
Practical application #
Reducing manual engineering effort for specialized AI tasks. Challenges: Computational expense, reproducibility, and ensuring searched architectures respect privacy constraints.
Non‑Negative Matrix Factorization (NMF) #
Non‑Negative Matrix Factorization (NMF)
Concept #
Decomposes a matrix into two lower‑rank non‑negative matrices. Related terms: Dimensionality Reduction, Topic Modeling, Latent Factor Model. Explanation: NMF reveals additive components, useful for interpretability. Example: Extracting topics from a corpus of news articles.
Practical application #
Recommender systems that suggest items based on latent factors. Challenges: Convergence to local minima and sensitivity to initialization.
On‑Device Inference #
On‑Device Inference
Concept #
Running AI models directly on user hardware. Related terms: Edge AI, Model Compression, Privacy‑by‑Design. Explanation: Eliminates need to transmit raw data to cloud services. Example: Smartwatch detects arrhythmias locally.
Practical application #
Health monitoring with minimal data exposure. Challenges: Limited memory, power constraints, and secure model updates.
Overfitting #
Overfitting
Concept #
Model learns noise in training data, reducing generalization. Related terms: Regularization, Cross‑Validation, Model Complexity. Explanation: Overfitted models perform well on training data but poorly on unseen data. Example: Decision tree with depth 20 perfectly fits training set but misclassifies test samples.
Practical application #
Applying early stopping to prevent overfitting in deep learning. Challenges: Detecting overfitting early and balancing model capacity.
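Early stopping can be sketched as a loop over validation losses that halts once no improvement has been seen for a set number of epochs. The loss values and the `patience` setting are made up for illustration.

```python
def early_stopping(val_losses, patience=2):
    """Return (best_epoch, stopped_at) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    stopped_at = -1
    waited = 0
    for epoch, current in enumerate(val_losses):
        stopped_at = epoch
        if current < best_loss:
            best_loss, best_epoch, waited = current, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break   # validation loss has stalled; stop training
    return best_epoch, stopped_at

# Invented validation losses: improvement stalls after epoch 2.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.50]
best_epoch, stopped_at = early_stopping(val_losses, patience=2)
print(best_epoch, stopped_at)
```

Training stops at epoch 4 and never sees the lower loss at epoch 5, which illustrates the trade‑off in choosing `patience`.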
Parameter (Model) #
Parameter (Model)
Concept #
Internal variable that the learning algorithm adjusts. Related terms: Weight, Bias, Hyperparameter. Explanation: Parameters define the model’s behavior after training. Example: Weights in a linear regression model indicate how strongly each feature influences the prediction (they are comparable as importance scores only when features are on the same scale).
Practical application #
Storing trained parameters for deployment. Challenges: Managing large parameter sets (e.g., billions in language models) and protecting them from theft.
Privacy‑Enhancing Technologies (PETs) #
Privacy‑Enhancing Technologies (PETs)
Concept #
Tools that protect personal data while enabling useful processing. Related terms: Differential Privacy, Secure Multiparty Computation, Homomorphic Encryption. Explanation: PETs help reconcile AI innovation with GDPR compliance. Example: Adding calibrated noise to model gradients for differential privacy.
Practical application #
Collaborative analytics across competitors without revealing proprietary data. Challenges: Trade‑offs between privacy guarantees and model utility.
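The example of adding calibrated noise to gradients can be sketched as clipping followed by Gaussian noise. This is a simplified, single‑gradient illustration: practical differentially private training clips per‑example gradients and tracks a privacy budget, neither of which is shown, and the parameter names (`max_norm`, `noise_multiplier`) are assumptions.

```python
import math
import random

def clip_gradient(grad, max_norm):
    """Scale the gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def noisy_gradient(grad, max_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise calibrated to the clipping bound."""
    clipped = clip_gradient(grad, max_norm)
    sigma = noise_multiplier * max_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]

rng = random.Random(0)           # fixed seed so the sketch is repeatable
grad = [3.0, 4.0]                # L2 norm is 5
clipped = clip_gradient(grad, max_norm=1.0)
private = noisy_gradient(grad, max_norm=1.0, noise_multiplier=1.1, rng=rng)
print(clipped, private)
```

Clipping bounds how much any one gradient can contribute, and the noise scale is tied to that bound; more noise gives stronger privacy at the cost of model utility, the trade‑off noted under Challenges.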
Processing (GDPR) #
Processing (GDPR)
Concept #
Any operation performed on personal data, including collection, storage, and analysis. Related terms: Data Subject, Controller, Processor. Explanation: AI training, inference, and monitoring all count as processing activities. Example: Training a recommendation model on user clickstream data.
Practical application #
Documenting processing activities in a register. Challenges: Mapping all AI‑related processes to GDPR definitions and ensuring lawful bases.
Profiling #
Profiling
Concept #
Automated processing that evaluates personal aspects to predict behavior. Related terms: Automated Decision‑Making, DPIA, GDPR Article 22. Explanation: Profiling is regulated under GDPR; where it feeds solely automated decisions with legal or similarly significant effects, Article 22 applies, and explicit consent is one of the permitted grounds. Example: Targeted advertising platform scores users for purchase propensity.
Practical application #
Personalizing product offers. Challenges: Demonstrating fairness, providing explanations, and respecting opt‑out rights.
Public‑Sector AI #
Public‑Sector AI
Concept #
AI applications deployed by government or public institutions. Related terms: Transparency, Public Interest, Accountability. Explanation: Must balance innovation with citizens’ rights and public trust. Example: AI system predicts traffic congestion for city planning.
Practical application #
Efficient allocation of public resources. Challenges: Higher scrutiny, stricter data‑sharing rules, and need for open‑source documentation.
Quantum Machine Learning #
Quantum Machine Learning
Concept #
Integration of quantum computing principles with ML algorithms. Related terms: Quantum Supremacy, Variational Quantum Circuits, Qubit. Explanation: Promises speedups for certain optimization problems. Example: Quantum‑enhanced kernel methods for high‑dimensional data.
Practical application #
Early‑stage research for cryptographic analysis. Challenges: Limited hardware, noise, and regulatory uncertainty regarding data protection on quantum platforms.
Random Forest #
Random Forest
Concept #
Ensemble method that builds multiple decision trees and aggregates their predictions. Related terms: Bagging, Feature Importance, Out‑of‑Bag Error. Explanation: Reduces variance and improves robustness compared to a single tree. Example: Predicting equipment failure using sensor readings.
Practical application #
Credit‑risk scoring with interpretable feature importance. Challenges: Large models may be memory‑intensive and harder to explain fully.
Reinforcement Learning (RL) #
Reinforcement Learning (RL)
Concept #
Learning paradigm where an agent interacts with an environment to maximize cumulative reward. Related terms: Policy, Reward Function, Exploration‑Exploitation. Explanation: RL agents learn optimal actions through trial and error. Example: AlphaGo learns to play Go by self‑play.
Practical application #
Dynamic pricing agents adjusting offers in real time. Challenges: Defining safe reward functions, preventing unintended behaviors, and ensuring compliance with GDPR when personal data informs rewards.
Responsible AI #
Responsible AI
Concept #
Framework ensuring AI systems are ethical, transparent, and aligned with societal values. Related terms: Fairness, Accountability, Human‑in‑the‑Loop. Explanation: Combines technical safeguards with governance and stakeholder engagement. Example: Company adopts a responsible‑AI charter covering bias mitigation and privacy.
Practical application #
Building trust with customers and regulators. Challenges: Translating high‑level principles into concrete processes and measuring impact.
Risk Assessment (GDPR) #
Risk Assessment (GDPR)
Concept #
Systematic evaluation of potential harms to data subjects. Related terms: DPIA, Threat Modeling, Mitigation. Explanation: AI projects must identify privacy, security, and ethical risks. Example: Assessing risk of re‑identification from model outputs.
Practical application #
Documented risk matrix maintained as part of the DPIA and made available to the supervisory authority on request or during prior consultation. Challenges: Quantifying abstract risks like discrimination and ensuring mitigation measures are effective.
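A risk matrix of this kind can be sketched as a likelihood-times-severity lookup. GDPR does not prescribe a scoring scale, so the three-point scales, thresholds, and example risks below are purely hypothetical.

```python
# Hypothetical three-point scales; real assessments define their own.
LIKELIHOOD = {"remote": 1, "possible": 2, "likely": 3}
SEVERITY = {"limited": 1, "significant": 2, "severe": 3}

def risk_level(likelihood, severity):
    """Map a likelihood/severity pair to a coarse risk level."""
    score = LIKELIHOOD[likelihood] * SEVERITY[severity]
    if score >= 6:
        return "high"
    if score >= 3:
        return "medium"
    return "low"

# Illustrative entries only, not findings about any real system.
risks = [
    ("re-identification from model outputs", "possible", "severe"),
    ("discriminatory outcomes", "possible", "significant"),
    ("over-retention of training data", "remote", "limited"),
]

register = {name: risk_level(likelihood, severity)
            for name, likelihood, severity in risks}
for name, level in register.items():
    print(f"{level:>6}  {name}")
```

A numeric score is only a prioritization aid; hard-to-quantify harms such as discrimination still need qualitative judgment.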
Secure Multiparty Computation (SMC) #
Secure Multiparty Computation (SMC)
Concept #
Allows parties to jointly compute a function over their inputs while keeping those inputs private. Related terms: Secret Sharing, Homomorphic Encryption, Federated Learning. Explanation: Enables collaborative AI without exposing raw data. Example: Two banks compute fraud‑score aggregates without revealing customer lists.
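One building block named above, secret sharing, can be sketched with additive shares. This is a minimal illustration of a secure sum, not a complete or hardened protocol; the two input values and the modulus are assumptions.

```python
import secrets

PRIME = 2**61 - 1  # all arithmetic is done modulo this prime

def share(value, n_parties):
    """Split value into n shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

# Hypothetical private inputs, e.g. each bank's count of flagged accounts.
bank_inputs = [17, 25]
n = len(bank_inputs)

# Bank i sends its j-th share to party j and keeps the rest secret.
all_shares = [share(value, n) for value in bank_inputs]

# Each party adds up the shares it received; a single share reveals nothing.
partial_sums = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]

# Only the combined partial sums reveal the aggregate.
total = sum(partial_sums) % PRIME
print(total)
```

The communication overhead mentioned in the challenges is visible even here: every party must send one share to every other party.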
Practical application #
Joint market‑analysis across competitors. Challenges: Communication overhead, protocol complexity, and integrating with existing AI pipelines.
Self‑Supervised Learning #
Self‑Supervised Learning
Concept #
Learning paradigm where models generate their own supervision signals from raw data. Related terms: Contrastive Learning, Pretraining, Unlabeled Data. Explanation: Reduces reliance on costly labeled datasets. Example: Masked language modeling trains BERT on massive text corpora.
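The way masked language modeling derives labels from raw text can be sketched as follows. This is a simplified illustration (actual BERT pretraining uses a more elaborate masking scheme); the sentence, masking probability, and function name are assumptions.

```python
import random

MASK = "[MASK]"

def make_masked_example(tokens, mask_prob=0.3, seed=0):
    """Hide some tokens; the hidden originals become the training targets."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(token)   # the model must predict this token
        else:
            inputs.append(token)
            targets.append(None)    # no loss is computed at this position
    return inputs, targets

tokens = "the controller shall notify the supervisory authority".split()
inputs, targets = make_masked_example(tokens)
print(inputs)
print(targets)
```

No human annotation is involved: the supervision signal is the text itself, which is also why personal information in the corpus can end up encoded in the model.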
Practical application #
Pretraining domain‑specific models for legal document analysis. Challenges: Ensuring that pretraining data complies with GDPR and does not embed hidden personal information.
Service Level Agreement (SLA) #
Service Level Agreement (SLA)
Concept #
Contractual terms defining performance and availability expectations. Related terms: Data Processing Agreement, Liability, Uptime. Explanation: AI service providers must specify data‑handling obligations in SLAs. Example: Cloud AI vendor guarantees 99.9% uptime and encryption at rest.
Practical application #
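Negotiating AI outsourcing contracts. Challenges: Aligning SLA clauses with GDPR’s breach‑notification duties: processors must notify the controller without undue delay, and controllers must notify the supervisory authority within 72 hours where feasible.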
Signal Processing #
Signal Processing
Concept #
Analysis, modification, and synthesis of signals such as audio, video, or sensor data. Related terms: Fourier Transform, Filtering, Feature Extraction. Explanation: Cleans raw signals and extracts the features that AI models consume in domains like speech recognition. Example: A noise‑reduction filter improves microphone input before automatic speech recognition (ASR).
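A moving average is among the simplest noise-reduction filters and gives a feel for the filtering step. This is a minimal sketch; production audio pipelines typically use more sophisticated filters, and the sample values are invented.

```python
def moving_average(signal, window):
    """Smooth a signal by averaging each run of `window` consecutive samples."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    return [sum(signal[i:i + window]) / window
            for i in range(len(signal) - window + 1)]

# Hypothetical noisy samples oscillating around a slow trend.
noisy = [0.0, 1.2, -0.8, 1.1, -0.9, 1.0]
smoothed = moving_average(noisy, window=3)
print(smoothed)
```

A wider window removes more noise but also blurs genuine detail, which is the trade-off behind the challenge of preserving diagnostic information.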
Practical application #
Enhancing medical imaging quality for diagnostic AI. Challenges: Preserving diagnostic information while applying transformations and respecting privacy.
Singular Value Decomposition (SVD) #
Singular Value Decomposition (SVD)
Concept #
Factorization of a matrix into singular vectors and singular values. Related terms: Dimensionality Reduction, Latent Semantic Analysis, Low‑Rank Approximation. Explanation: Captures principal components for compression and noise reduction. Example: Reducing dimensionality of user‑item rating matrix in collaborative filtering.
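Written out, the factorization and the low-rank approximation it enables take the following standard form. The symbols are not from the original entry: r = min(m, n), u_i and v_i are the columns of U and V, and k is the chosen rank.

```latex
A = U \Sigma V^{\top}, \qquad
A \in \mathbb{R}^{m \times n},\quad
U^{\top} U = I,\quad
V^{\top} V = I,\quad
\Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_r),\quad
\sigma_1 \ge \dots \ge \sigma_r \ge 0

A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top}, \qquad
\lVert A - A_k \rVert_F^{2} = \sum_{i=k+1}^{r} \sigma_i^{2}
```

Keeping only the k largest singular values gives the best rank-k approximation in the Frobenius norm, and the discarded singular values measure exactly how much is lost.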
Practical application #
Recommender systems with sparse data. Challenges: Computational cost for large matrices and handling missing entries.
Software‑as‑a‑Service (SaaS) AI #
Software‑as‑a‑Service (SaaS) AI
Concept #
Cloud‑based AI functionality delivered via subscription. Related terms: API, Multi‑Tenant Architecture, Data Residency. Explanation: SaaS providers handle infrastructure, but customers remain responsible for data compliance. Example: CRM platform offers AI‑driven lead scoring via REST API.
Practical application #
Quick integration of AI features without in‑house expertise. Challenges: Ensuring contractual clauses cover GDPR responsibilities and data transfer safeguards.
Supervised Learning #
Supervised Learning
Concept #
Learning from labeled examples where input–output pairs are known. Related terms: Classification, Regression, Training Set. Explanation: Model learns a mapping from features to target variables. Example: Linear regression predicts house prices from square footage.
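The house-price example can be sketched with ordinary least squares on a single feature. The data points are made up and deliberately lie on a straight line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled examples: square footage (input) and sale price (target).
square_feet = [1000, 1500, 2000, 2500]
prices = [200_000, 275_000, 350_000, 425_000]

slope, intercept = fit_line(square_feet, prices)
predicted = slope * 1800 + intercept  # price estimate for an unseen 1800 sq ft house
print(slope, intercept, predicted)
```

The labeled pairs are what make this supervised: the model is fitted to reproduce known prices before it is asked about an unseen house.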
Practical application #
Demand forecasting for inventory management. Challenges: Acquiring high‑quality labels and avoiding label leakage.
Synthetic Data #
Synthetic Data
Concept #
Artificially generated data that mimics statistical properties of real data. Related terms: GAN, Data Augmentation, Privacy Preservation. Explanation: Enables model training while reducing reliance on actual personal data. Example: GAN creates realistic medical images for rare disease research.
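The core idea, sampling new records from a model fitted to real data, can be sketched with a single Gaussian. This is a far simpler stand-in for the GAN mentioned above, and fitting a distribution does not by itself guarantee privacy; the ages and function name are invented.

```python
import random
import statistics

def synthesize(real_values, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to the real ones."""
    rng = random.Random(seed)
    mu = statistics.mean(real_values)
    sigma = statistics.stdev(real_values)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Hypothetical "real" attribute values, e.g. patient ages.
real_ages = [34, 45, 29, 52, 41, 38, 47, 33]
synthetic_ages = synthesize(real_ages, n=1000)

print(round(statistics.mean(real_ages), 1), round(statistics.mean(synthetic_ages), 1))
```

Comparing summary statistics, as in the last line, is a basic utility check; checking that no synthetic record is too close to a real individual is a separate verification step.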
Practical application #
Training autonomous‑vehicle perception systems without exposing real driver footage. Challenges: Verifying that synthetic data does not inadvertently reveal real individuals and maintaining utility.
Target Variable #
Target Variable
Concept #
The outcome that a predictive model aims to forecast. Related terms: Label, Dependent Variable, Response. Explanation: In supervised learning, the target guides model optimization. Example: Churn indicator (yes/no) serves as target for retention model.
Practical application #
Predicting loan default risk. Challenges: Defining appropriate targets that align with business goals and regulatory constraints.
Technical and Organizational Measures (TOMs) #
Technical and Organizational Measures (TOMs)
Concept #
Security and governance controls that GDPR requires to protect personal data, notably under Articles 25 and 32. Related terms: Encryption, Access Control, Incident Response. Explanation: TOMs must be appropriate to the risk level of the AI processing. Example: Role‑based access limits who can view training data.
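The role-based access example can be sketched as a deny-by-default permission check. The roles, permission names, and functions are hypothetical and would map onto an organization's own access policy.

```python
# Hypothetical role-to-permission mapping.
ROLE_PERMISSIONS = {
    "data_scientist": {"read_features"},
    "ml_engineer": {"read_features", "deploy_model"},
    "dpo": {"read_features", "read_raw_training_data", "read_audit_log"},
}

def is_allowed(role, permission):
    """Deny by default: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_PERMISSIONS.get(role, set())

def read_raw_training_data(role):
    if not is_allowed(role, "read_raw_training_data"):
        raise PermissionError(f"role {role!r} may not view raw training data")
    return "raw training records"  # placeholder for the real data access

print(is_allowed("dpo", "read_raw_training_data"))
print(is_allowed("ml_engineer", "read_raw_training_data"))
```

Keeping the mapping in one place makes it easier to show during an audit who could access what.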
Practical application #
Demonstrating compliance during audits. Challenges: Keeping measures up‑to‑date with evolving threats and scaling them for large AI projects.
Temporal Data #
Temporal Data
Concept #
Data points associated with timestamps or sequences. Related terms: Time Series, Sequence Modeling, Recurrent Neural Network. Explanation: Temporal patterns are crucial for forecasting and anomaly detection. Example: Hourly electricity consumption used to predict peak load.
Practical application #
Predictive maintenance based on sensor time series. Challenges: Handling missing timestamps, seasonality, and ensuring storage complies with data‑retention policies.
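Detecting missing timestamps, the first challenge listed, can be sketched by walking the expected grid of an hourly series. The readings and the one-hour step are assumptions for illustration.

```python
from datetime import datetime, timedelta

def find_missing(timestamps, step=timedelta(hours=1)):
    """Return the expected timestamps between first and last that are absent."""
    present = set(timestamps)
    missing = []
    current, end = min(timestamps), max(timestamps)
    while current <= end:
        if current not in present:
            missing.append(current)
        current += step
    return missing

# Hypothetical hourly sensor readings with the 02:00 value missing.
readings = {
    datetime(2024, 1, 1, 0): 310.0,
    datetime(2024, 1, 1, 1): 295.5,
    datetime(2024, 1, 1, 3): 330.2,
    datetime(2024, 1, 1, 4): 341.7,
}

gaps = find_missing(list(readings))
print(gaps)
```

Once gaps are known they can be interpolated, flagged, or excluded, depending on what the downstream model tolerates.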
Transfer Learning #
Transfer Learning
Concept #
Reusing a pretrained model for a new, related task. Related terms: Fine‑Tuning, Domain Adaptation, Pretraining. Explanation: Saves resources and often improves performance with limited data. Example: Fine‑tuning a BERT model for legal contract classification.
Practical application #
Rapid deployment of NLP solutions in niche domains. Challenges: Verifying that source data respects privacy and that transferred knowledge does not encode prohibited personal information.