Machine Learning

Algorithms that improve at a task by learning patterns from data instead of being explicitly programmed.

Machine Learning is one of the core areas in the AI University map of AI. Explore the diagram, then dive into each topic — every subtopic grows into its own deep-dive over time.

flowchart TB
  D[(Data)] --> P{Learning paradigm}
  P --> S[Supervised]
  P --> U[Unsupervised]
  P --> SS[Self-supervised]
  P --> R[Reinforcement]
  S --> M[Train model]
  U --> M
  SS --> M
  R --> M
  M --> E[Evaluate] --> DEP[[Deploy]]

Key topics

Supervised learning

Learn a mapping from inputs to labelled outputs, as in classification and regression.

Unsupervised learning

Find structure in unlabelled data: clustering, dimensionality reduction, density estimation.

Self-supervised learning

Create supervision from the data itself (e.g. predict the next token); the engine behind modern foundation models.

Reinforcement learning

Learn by trial and error from rewards; covered in depth in its own area.

Core algorithms

Linear/logistic regression, decision trees, SVMs, k-NN, naive Bayes, and ensembles like random forests and gradient boosting.

Feature engineering

Turning raw data into informative inputs: scaling, encoding, selection, and extraction.

Training & evaluation

Loss functions, gradient descent, regularization, cross-validation, and the bias–variance trade-off between underfitting and overfitting.
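
To make the training vocabulary concrete, here is a minimal sketch of gradient descent on a mean-squared-error loss with an optional L2 penalty. The function name `fit_linear_gd`, the toy data, and the learning rate and step count are all assumptions chosen for illustration, not a recommended recipe.

```python
def fit_linear_gd(xs, ys, lr=0.05, l2=0.0, steps=2000):
    """Fit y ~ w*x + b by gradient descent on MSE + l2 * w**2 (hypothetical helper)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on every training example.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of (1/n) * sum(err^2) + l2 * w^2 with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs)) + 2.0 * l2 * w
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 3x + 1 with no noise (an assumption for the demo).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x + 1.0 for x in xs]

w_plain, b_plain = fit_linear_gd(xs, ys)
w_reg, b_reg = fit_linear_gd(xs, ys, l2=0.5)
print(f"no penalty: w={w_plain:.3f}, b={b_plain:.3f}")
print(f"l2=0.5:     w={w_reg:.3f}, b={b_reg:.3f}")
```

Without the penalty the fit recovers the slope and intercept that generated the data. With the penalty the slope is pulled toward zero, which is the sense in which regularization trades some fit on the training data for a simpler model.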

The learning loop

Almost every ML project — from a spam filter to a frontier model — runs the same loop:

flowchart LR
  DATA[(Data)] --> SPLIT[Train / val / test split]
  SPLIT --> FIT[Fit model<br/>minimize loss]
  FIT --> EVAL[Evaluate on held-out data]
  EVAL -->|not good enough| TUNE[Tune features / model]
  TUNE --> FIT
  EVAL -->|good| SHIP[[Deploy]]

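The discipline is in the held-out data. A model that memorizes its training set looks perfect in training and fails in the real world. Tune against the validation split and reserve the test split for a single final check, so the reported score is not biased by your own tuning. The whole game is generalization: performing well on data you have never seen.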
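
The loop in the diagram can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions: the synthetic data, the 60 / 20 / 20 split, the one-parameter ridge model, and the helper names `fit_ridge_slope` and `mse` are all made up for the example.

```python
import random


def fit_ridge_slope(points, lam):
    """Closed-form fit of y ~ w*x (no intercept) with an L2 penalty lam * w**2."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / (sxx + lam)


def mse(points, w):
    """Mean squared error of the model y = w*x on the given points."""
    return sum((w * x - y) ** 2 for x, y in points) / len(points)


# Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(100):
    x = rng.uniform(-3.0, 3.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))

# Train / val / test split (60 / 20 / 20) after shuffling.
rng.shuffle(data)
train, val, test = data[:60], data[60:80], data[80:]

# Fit on train, evaluate on val, and keep the best hyperparameter.
candidates = [0.0, 0.1, 1.0, 10.0, 100.0]
best_lam, best_w, best_val = None, None, float("inf")
for lam in candidates:
    w = fit_ridge_slope(train, lam)
    val_loss = mse(val, w)
    if val_loss < best_val:
        best_lam, best_w, best_val = lam, w, val_loss

# The test set is used once, after every tuning decision has been made.
test_loss = mse(test, best_w)
print(f"chosen lam={best_lam}, val MSE={best_val:.3f}, test MSE={test_loss:.3f}")
```

The "tune" step here is only a sweep over one hyperparameter. In a real project it would also cover features and model families, but the structure stays the same: decisions are made on validation data, and the test score is read once at the end.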

Choosing an algorithm

There is no single best algorithm; the right choice depends on the data and the problem.

If you have…	A good first choice
Tabular data, need a strong baseline fast	Gradient boosting (XGBoost / LightGBM)
A simple linear relationship	Linear / logistic regression
Need interpretability	Decision tree or linear model
Images, audio, or text	A neural network (deep learning)
No labels, want structure	Clustering (k-means) or dimensionality reduction (PCA)
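
A small experiment can illustrate why the right choice depends on the data. The sketch below compares a least-squares line with 1-nearest-neighbour regression on two hand-made targets. The helper names (`fit_line`, `fit_nearest_neighbour`, `heldout_mse`, `compare`) and both toy datasets are assumptions for illustration only.

```python
def fit_line(points):
    """Ordinary least squares for y ~ w*x + b; returns a prediction function."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return lambda x: w * x + b


def fit_nearest_neighbour(points):
    """1-NN regression: predict the label of the closest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def heldout_mse(predict, points):
    """Mean squared error of a prediction function on held-out points."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


def compare(target):
    """Train both models on even x values and score them on odd x values."""
    train = [(float(x), target(x)) for x in range(0, 20, 2)]
    held_out = [(float(x), target(x)) for x in range(1, 20, 2)]
    return {
        "line": heldout_mse(fit_line(train), held_out),
        "1-NN": heldout_mse(fit_nearest_neighbour(train), held_out),
    }


linear_scores = compare(lambda x: 2.0 * x + 1.0)          # straight-line relationship
step_scores = compare(lambda x: 0.0 if x < 10 else 10.0)  # abrupt jump at x = 10
print("linear target:", linear_scores)
print("step target:  ", step_scores)
```

On the straight-line target the linear model has the lower held-out error. On the step target the nearest-neighbour model does, because a single line cannot follow the jump. Neither model is better in general, which is the point of the table above.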

The unglamorous truth

On tabular business data, gradient-boosted trees still often match or beat deep learning, and they are usually faster and cheaper to train. Reach for the fanciest tool only when simpler ones stall.

Overfitting and the bias–variance trade-off

Two ways a model fails:

Underfitting (high bias) — too simple to capture the pattern. Bad on both training and test data.
Overfitting (high variance) — memorized noise in the training data. Great on training, poor on test.

The cure for overfitting is regularization (penalizing complexity), more data, or a simpler model. The cure for underfitting is the opposite: a richer model or better features. Comparing training and validation performance tells you which problem you have. A large gap between the two points to overfitting, while poor scores on both point to underfitting.
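
The diagnosis can be sketched with k-nearest-neighbour regression, where `k` controls model complexity: `k=1` memorizes the training set, and `k` equal to the dataset size predicts one global average. The `diagnose` helper, its thresholds, and the hand-built toy data are assumptions for illustration; sensible thresholds are problem-specific.

```python
def knn_predict(train, x, k):
    """Average the labels of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def knn_mse(train, points, k):
    """Mean squared error of k-NN (fit on train) evaluated on points."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)


def diagnose(train_err, val_err, acceptable, max_gap):
    """Rough rule of thumb; both thresholds are assumptions, not universal values."""
    if train_err > acceptable:
        return "underfitting"   # poor even on the data it was fitted to
    if val_err - train_err > max_gap:
        return "overfitting"    # fine on training data, much worse on held-out data
    return "ok"


# Toy data: y = x plus a hand-built +/-2 "noise" term so the run is reproducible.
train = [(float(i), i + (2.0 if i % 2 == 0 else -2.0)) for i in range(30)]
val = [(i + 0.5, i + 0.5 + (2.0 if (i // 2) % 2 == 0 else -2.0)) for i in range(29)]

report = {}
for k in (1, 4, 30):
    train_err = knn_mse(train, train, k)
    val_err = knn_mse(train, val, k)
    report[k] = diagnose(train_err, val_err, acceptable=6.0, max_gap=2.0)
    print(f"k={k:2d}  train MSE={train_err:6.2f}  val MSE={val_err:6.2f}  -> {report[k]}")
```

With `k=1` the training error is exactly zero while the validation error is not, the signature of memorized noise. With `k=30` both errors are large because the model ignores the input entirely. The middle setting lands between the two extremes.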

