The Compendium · 2026 Edition

A full map of modern AI & data science — with each chapter its own essay.

Eighteen parts, from the mathematics underneath to the governance questions overhead. Each chapter is meant to stand alone as an essay, long enough to teach and short enough to read in an evening. What's ready is marked; everything else is on the way.

What this is

A working table of contents for a longer project. Each part below groups a handful of closely related chapters, and each chapter — when written — will explain its topic from first principles, with worked examples, diagrams where they help, and pointers to the canonical references for going deeper.

Topics are ordered roughly from the foundations up: first the mathematics and programming that underpin everything, then the classical and modern machine-learning methods built on top, then the application areas and infrastructure, and finally the hard questions of safety and governance. Skip around freely; the parts are designed to be readable out of order.

Part I

Mathematical Foundations

  1. Linear Algebra (Available)
    vectors, matrices, decompositions, eigenvalues
  2. Calculus & Differential Equations (Available)
    multivariable calculus, ODEs, PDEs as relevant to physics-informed ML
  3. Optimization Theory (Available)
    convexity, gradient descent, Lagrangians, constrained optimization
  4. Probability Theory (Available)
    random variables, distributions, expectation, concentration inequalities
  5. Statistics & Statistical Inference (Available)
    frequentist inference, hypothesis testing, regression, experimental design
  6. Information Theory (Available)
    entropy, KL divergence, mutual information, compression
  7. Bayesian Reasoning (Available)
    Bayes' theorem, priors, posteriors, conjugacy, hierarchical models
  8. Signal Processing (Available)
    Fourier transforms, convolution, filtering, sampling; prerequisite for audio and time series

Part II

Programming & Software Engineering

  1. Python for Data Science (Available)
    Python idioms, data manipulation with pandas, NumPy
  2. Scientific Computing (Available)
    SciPy, numerical methods, linear algebra libraries
  3. Algorithms & Data Structures (Available)
    complexity, trees, graphs, hashing: what ML practitioners actually need
  4. Software Engineering Principles (Available)
    clean code, testing, design patterns, documentation
  5. Databases & SQL (Available)
    relational databases, query optimization, NoSQL overview
  6. Version Control & Collaborative Development (Available)
    Git, code review, branching strategies

Part III

Data Engineering & Systems

  1. Data Collection & Acquisition (Available)
    web scraping, APIs, data procurement, synthetic data
  2. Data Storage & Warehousing (Available)
    data lakes, warehouses, columnar formats, Parquet, Delta Lake
  3. Data Pipelines & Orchestration (Available)
    Airflow, Prefect, dbt, batch vs. stream pipelines
  4. Streaming & Real-Time Data (Available)
    Kafka, Flink, event-driven architectures
  5. Distributed Computing (Available)
    Spark, MapReduce, distributed data processing
  6. Cloud Platforms & Infrastructure (Available)
    AWS, GCP, Azure: services relevant to data and ML
  7. Data Quality, Governance, & Metadata (Available)
    data contracts, lineage, cataloging, observability

Part IV

Classical Machine Learning

  1. Supervised Learning: Regression (Available)
    linear and polynomial regression, regularization, generalized linear models
  2. Supervised Learning: Classification (Available)
    logistic regression, decision trees, Naive Bayes, kNN
  3. Ensemble Methods (Available)
    bagging, boosting, random forests, gradient boosting, XGBoost
  4. Unsupervised Learning: Clustering (Available)
    k-means, DBSCAN, hierarchical clustering, Gaussian mixture models
  5. Dimensionality Reduction (Available)
    PCA, ICA, t-SNE, UMAP, autoencoders
  6. Probabilistic Graphical Models (Available)
    Bayesian networks, Markov random fields, HMMs
  7. Kernel Methods & Support Vector Machines (Available)
    the kernel trick, SVMs, Gaussian processes seen from above
  8. Feature Engineering & Selection (Available)
    encoding, interaction terms, mutual information, wrapper and filter methods
  9. Model Evaluation & Selection (Available)
    cross-validation, metrics, calibration, overfitting, leakage

Part V

Deep Learning Foundations

  1. Neural Network Fundamentals (Available)
    perceptrons, backpropagation, activation functions, MLPs
  2. Training Deep Networks (Available)
    optimizers (SGD, Adam, scheduling), initialization, batch size
  3. Regularization & Generalization (Available)
    dropout, weight decay, data augmentation, early stopping
  4. Convolutional Neural Networks (Available)
    convolutions, pooling, receptive fields, classic architectures
  5. Sequence Models (Available)
    RNNs, LSTMs, GRUs, vanishing gradients, sequence-to-sequence
  6. Attention Mechanisms (Available)
    soft and hard attention, self-attention, cross-attention, multi-head attention
  7. Transfer Learning & Pretraining (Available)
    fine-tuning, domain adaptation, representation learning

Part VI

Natural Language Processing & Large Language Models

  1. NLP Fundamentals (Available)
    tokenization, morphology, POS tagging, parsing, linguistic structure
  2. Classical NLP (Available)
    bag of words, TF-IDF, n-grams, named entity recognition, information extraction
  3. Word Embeddings & Distributional Semantics (Available)
    Word2Vec, GloVe, fastText, contextualized representations
  4. The Transformer Architecture (Available)
    encoder, decoder, positional encoding, layer norm, architecture variants
  5. Pretraining Paradigms (Available)
    masked LM, causal LM, encoder-only, decoder-only, encoder-decoder
  6. Large Language Models: Scale & Emergent Capabilities (Available)
    scaling laws, emergent behaviors, capabilities and limitations
  7. Instruction Tuning & Alignment (Available)
    RLHF, DPO, Constitutional AI, preference learning
  8. Fine-Tuning & Parameter-Efficient Adaptation (Available)
    full fine-tuning, LoRA, prefix tuning, adapters, model merging
  9. Retrieval-Augmented Generation (Available)
    dense retrieval, hybrid search, RAG architectures, long-context tradeoffs
  10. LLM Evaluation (Available)
    benchmarks, contamination, human evaluation, critique of leaderboards

Part VII

Computer Vision

  1. Image Representation & Classical Vision (Available)
    pixel statistics, color spaces, edge detection, classical feature descriptors
  2. Modern Image Classification & Architectures (Available)
    ResNets, EfficientNets, Vision Transformers, scaling
  3. Object Detection & Instance Segmentation (Available)
    YOLO, Faster R-CNN, DETR, SAM
  4. Video Understanding (Available)
    temporal modeling, optical flow, action recognition, video transformers
  5. 3D Vision & Spatial Understanding (Available)
    depth estimation, point clouds, NeRF, 3D reconstruction
  6. Vision-Language Models (Available)
    CLIP, image captioning, visual question answering, grounding

Part VIII

Speech, Audio & Music

  1. Audio Signal Processing (In progress)
    waveforms, spectrograms, MFCCs, mel filterbanks
  2. Automatic Speech Recognition (In progress)
    CTC, attention-based, Whisper, streaming ASR
  3. Text-to-Speech & Voice Synthesis (In progress)
    WaveNet, Tacotron, neural vocoding, voice cloning
  4. Speaker Recognition & Diarization (In progress)
    speaker embeddings, verification, who-spoke-when
  5. Audio Classification & Sound Understanding (In progress)
    environmental sound, music tagging, sound event detection
  6. Music Generation & Music AI (In progress)
    symbolic music, audio generation, MusicLM-style models

Part IX

Reinforcement Learning

  1. RL Fundamentals (In progress)
    MDPs, Bellman equations, value functions, policies, exploration
  2. Tabular RL (In progress)
    Q-learning, SARSA, dynamic programming, model-based planning
  3. Deep Q-Networks & Value-Based Methods (In progress)
    DQN, double DQN, dueling networks, Rainbow
  4. Policy Gradient & Actor-Critic Methods (In progress)
    REINFORCE, A3C, PPO, SAC, TD3
  5. Model-Based RL & World Models (In progress)
    Dyna, Dreamer, MBPO, planning with learned models
  6. Multi-Agent Reinforcement Learning (In progress)
    cooperative, competitive, emergent behavior
  7. Offline RL & Imitation Learning (In progress)
    behavior cloning, inverse RL, conservative Q-learning
  8. Preference Learning & RLHF (In progress)
    reward modeling, human feedback, RLAIF

Part X

Generative Models

  1. Variational Autoencoders (In progress)
    ELBO, reparameterization, disentanglement, latent spaces
  2. Generative Adversarial Networks (In progress)
    training dynamics, mode collapse, StyleGAN, progressive training
  3. Normalizing Flows (In progress)
    change of variables, RealNVP, Glow, discrete flows
  4. Diffusion Models (In progress)
    DDPM, score matching, classifier-free guidance, latent diffusion
  5. Autoregressive Generative Models (In progress)
    PixelCNN, WaveNet, GPT as generative model
  6. Image & Video Generation (In progress)
    Stable Diffusion, DALL-E, Sora-style video, consistency models
  7. 3D & Multimodal Generation (In progress)
    3D-aware generation, NeRF-based synthesis, any-to-any models
  8. Multimodal Foundation Models (In progress)
    GPT-4V, Gemini, Flamingo: architectures that jointly process modalities

Part XI

AI Agents & Autonomous Systems

  1. Agent Fundamentals (In progress)
    sense-plan-act loops, agent taxonomies, environments, PDDL
  2. LLM-Based Agents (In progress)
    ReAct, chain-of-thought, tool-augmented agents, cognitive architectures
  3. Tool Use & Function Calling (In progress)
    APIs, code execution, browser use, structured outputs
  4. Memory & Knowledge Management (In progress)
    episodic, semantic, working memory; RAG vs. in-context vs. parametric
  5. Planning & Reasoning (In progress)
    tree-of-thought, MCTS, decomposition, verification
  6. Multi-Agent Systems (In progress)
    coordination, communication, role specialization, debate, emergent behavior
  7. Agent Evaluation & Benchmarking (In progress)
    task success, efficiency, safety, trajectory evaluation

Part XII

Robotics & Embodied AI

  1. Robot Perception & Sensing (In progress)
    cameras, lidar, IMU fusion, SLAM, sensor calibration
  2. Motion Planning & Control (In progress)
    path planning, trajectory optimization, PID, model predictive control
  3. Learning from Demonstration & Imitation (In progress)
    behavior cloning, DAgger, teleoperation datasets
  4. Sim-to-Real Transfer (In progress)
    domain randomization, physics simulators, gap mitigation
  5. Foundation Models for Robotics (In progress)
    RT-2, generalist manipulation policies, vision-language-action models
  6. Autonomous Vehicles (In progress)
    perception stack, prediction, planning, safety, regulatory context

Part XIII

Specialized ML Methods

  1. Time Series Analysis & Forecasting (In progress)
    ARIMA, exponential smoothing, temporal CNNs, Transformers for time series
  2. Anomaly Detection (In progress)
    statistical methods, isolation forest, autoencoders, contextual vs. collective anomalies
  3. Causal Inference (In progress)
    potential outcomes, DAGs, do-calculus, IV methods, difference-in-differences
  4. Causal Machine Learning (In progress)
    causal discovery, uplift modeling, double ML, heterogeneous treatment effects
  5. Graph Neural Networks (In progress)
    message passing, GCN, GAT, GraphSAGE, heterogeneous graphs
  6. Survival Analysis & Event Modeling (In progress)
    Kaplan-Meier, Cox regression, neural survival models
  7. Bayesian Deep Learning (In progress)
    Bayesian neural nets, Monte Carlo dropout, deep GPs, Laplace approximation
  8. Meta-Learning & Few-Shot Learning (In progress)
    MAML, prototypical networks, in-context learning as meta-learning
  9. Continual & Lifelong Learning (In progress)
    catastrophic forgetting, EWC, progressive networks, replay methods
  10. Federated Learning & Privacy-Preserving ML (In progress)
    federated averaging, differential privacy, secure aggregation
  11. Neurosymbolic AI (In progress)
    logic plus learning, knowledge graphs, program synthesis, neuro-symbolic reasoning

Part XIV

Applied Domains

  1. Recommender Systems (In progress)
    collaborative filtering, content-based, matrix factorization, sequential recommendation
  2. Search & Information Retrieval (In progress)
    BM25, dense retrieval, learning to rank, neural search
  3. Financial ML & Quantitative Methods (In progress)
    alpha research, risk modeling, high-frequency, fraud detection
  4. Healthcare & Clinical AI (In progress)
    medical imaging, EHR modeling, clinical NLP, trial design, regulatory considerations
  5. AI for Cybersecurity (In progress)
    intrusion detection, malware classification, adversarial robustness in security contexts
  6. AI for Education & Personalization (In progress)
    knowledge tracing, adaptive learning, intelligent tutoring
  7. AI for Manufacturing & Operations (In progress)
    predictive maintenance, quality control, supply chain optimization
  8. Human-AI Interaction & UX (In progress)
    interface design, cognitive load, trust calibration, feedback collection

Part XV

AI for Science

  1. Scientific Machine Learning (In progress)
    data-driven discovery, surrogate models, physics-informed neural networks
  2. AI for Biology & Genomics (In progress)
    sequence modeling, variant effect prediction, single-cell analysis
  3. AI for Drug Discovery & Molecular Design (In progress)
    molecular representations, generative chemistry, docking, ADMET prediction
  4. AI for Protein Science (In progress)
    AlphaFold, structure prediction, protein design, function prediction
  5. AI for Climate & Earth Systems (In progress)
    weather forecasting, climate emulators, remote sensing
  6. AI for Physics, Materials & Astronomy (In progress)
    neural operators, materials property prediction, simulation surrogates

Part XVI

MLOps & Production ML

  1. Experiment Tracking & Reproducibility (In progress)
    MLflow, W&B, DVC, determinism, environment management
  2. Feature Stores & Data Management for ML (In progress)
    online/offline stores, point-in-time correctness, Feast, Tecton
  3. Model Deployment & Serving (In progress)
    REST, gRPC, batch vs. real-time, model registries, containerization
  4. Model Monitoring & Drift Detection (In progress)
    data drift, concept drift, shadow deployment, alerting
  5. CI/CD for Machine Learning (In progress)
    automated retraining, testing for ML, MLOps pipelines
  6. A/B Testing & Causal Experimentation in Production (In progress)
    randomization, CUPED, multi-armed bandits
  7. Responsible Release & Deployment Practices (In progress)
    staged rollouts, kill switches, incident response, documentation

Part XVII

AI Infrastructure & Systems

  1. Hardware for ML (In progress)
    GPUs, TPUs, NPUs, memory bandwidth, roofline model
  2. Distributed Training (In progress)
    data parallelism, model parallelism, pipeline parallelism, ZeRO, FSDP
  3. Model Compression (In progress)
    pruning, quantization, knowledge distillation, structured vs. unstructured
  4. Inference Optimization (In progress)
    batching, KV caching, speculative decoding, FlashAttention, serving frameworks
  5. AI Chips & Custom Silicon (In progress)
    ASIC design philosophy, photonics, neuromorphic computing, the competitive landscape

Part XVIII

AI Safety, Alignment & Governance

  1. AI Safety Fundamentals (In progress)
    problem framing, threat models, instrumental convergence, Goodhart's law
  2. Technical Alignment Methods (In progress)
    scalable oversight, debate, amplification, interpretability-based approaches
  3. Robustness & Adversarial ML (In progress)
    adversarial examples, certified defenses, distribution shift, red-teaming
  4. Mechanistic Interpretability (In progress)
    circuits, features, superposition, probing, causal tracing
  5. Explainability for Practitioners (In progress)
    SHAP, LIME, saliency maps, counterfactuals, when each method applies
  6. Fairness, Bias & Equity (In progress)
    sources of bias, fairness definitions and tensions, auditing, mitigation
  7. Privacy in ML (In progress)
    differential privacy, membership inference, model inversion, data deletion
  8. AI Governance, Policy & Regulation (In progress)
    EU AI Act, executive orders, standards bodies, liability, international coordination

The compendium is a work in progress — chapters will land as they're written, and the table above will update with each release. If you have corrections, suggestions, or just want to tell me which chapter should be written next, you know where to find me.

— Alex