
Machine Learning

Algorithms that improve at a task by learning patterns from data instead of being explicitly programmed.

Machine Learning is one of the core areas in the AI University map of AI. Explore the diagram, then dive into each topic — every subtopic grows into its own deep-dive over time.

flowchart TB
  D[(Data)] --> P{Learning paradigm}
  P --> S[Supervised]
  P --> U[Unsupervised]
  P --> SS[Self-supervised]
  P --> R[Reinforcement]
  S --> M[Train model]
  U --> M
  SS --> M
  R --> M
  M --> E[Evaluate] --> DEP[[Deploy]]

Key topics

  • Supervised learning
    Learn a mapping from inputs to labelled outputs: classification and regression.

  • Unsupervised learning
    Find structure in unlabelled data: clustering, dimensionality reduction, density estimation.

  • Self-supervised learning
    Create supervision from the data itself (e.g. predict the next token); the engine behind modern foundation models.

  • Reinforcement learning
    Learn by trial and error from rewards; covered in depth in its own area.

  • Core algorithms
    Linear/logistic regression, decision trees, SVMs, k-NN, naive Bayes, and ensembles like random forests and gradient boosting.

  • Feature engineering
    Turning raw data into informative inputs: scaling, encoding, selection, and extraction.

  • Training & evaluation
    Loss functions, gradient descent, regularization, cross-validation, and the bias–variance trade-off between underfitting and overfitting.
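
To make the training vocabulary concrete, here is a minimal sketch of gradient descent on a mean-squared-error loss with an optional L2 (ridge) penalty. It is an illustration, not a reference implementation: the function names `fit_linear_gd` and `mse`, the toy data, the learning rate, and the penalty strength are all assumptions chosen for the example.

```python
# Minimal sketch: batch gradient descent for a one-feature linear model
# y = w * x + b, with an optional L2 penalty on the slope.
# Data, learning rate, and penalty strength are illustrative assumptions.

def fit_linear_gd(xs, ys, lr=0.05, l2=0.0, steps=2000):
    """Return (w, b) minimizing mean((w*x + b - y)^2) + l2 * w^2."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on the training data.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the loss; the penalty applies to w only, not the bias.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n + 2.0 * l2 * w
        grad_b = 2.0 * sum(errors) / n
        # Step downhill.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(w, b, xs, ys):
    """Mean squared error of the model on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1, no noise

w_plain, b_plain = fit_linear_gd(xs, ys)          # no regularization
w_ridge, b_ridge = fit_linear_gd(xs, ys, l2=1.0)  # penalized slope
```

On this toy data the unpenalized fit recovers a slope of 2 and an intercept of 1, while the penalized fit shrinks the slope toward zero (to about 1.33) and pays for it with a higher training error. That is the trade regularization makes: a worse fit to the training data in exchange for a simpler model.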

The learning loop

Almost every ML project — from a spam filter to a frontier model — runs the same loop:

flowchart LR
  DATA[(Data)] --> SPLIT[Train / val / test split]
  SPLIT --> FIT[Fit model<br/>minimize loss]
  FIT --> EVAL[Evaluate on held-out data]
  EVAL -->|not good enough| TUNE[Tune features / model]
  TUNE --> FIT
  EVAL -->|good| SHIP[[Deploy]]

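The discipline is in the held-out data. A model that memorizes its training set looks perfect in training and fails in the real world. Tune against the validation set and save the test set for one final check, so the reported score stays an honest estimate. The whole game is generalization: performing well on data the model has never seen.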
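
The loop can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not a recipe: the data generator `make_data`, the 60/20/20 split, and the candidate values of k are all made up for illustration, and a k-nearest-neighbours classifier stands in for the "fit" step (it stores the training set and does not minimize a loss explicitly).

```python
import random


def make_data(n, seed=0):
    """Hypothetical toy data: one feature, two classes with overlapping noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(4.0 * label, 1.0)  # class 0 near 0, class 1 near 4
        data.append((x, label))
    return data


def split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once, then carve out train / validation / test partitions."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, rows, k):
    """Fraction of rows whose label the k-NN model predicts correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)


data = make_data(200)
train, val, test = split(data)

# Tune step: pick the k that scores best on validation data only.
candidates = [1, 3, 5, 9, 15]
best_k = max(candidates, key=lambda k: accuracy(train, val, k))

# Evaluate step: touch the test set once, after tuning is finished.
test_acc = accuracy(train, test, best_k)
print(f"best k = {best_k}, test accuracy = {test_acc:.2f}")
```

The design choice worth noticing is that `test` never appears inside the tuning step. If it did, the final number would be optimistically biased, because the model choice would already have been fitted to it.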

Choosing an algorithm

There is no single best algorithm; the right choice depends on the data and the problem.

If you have…                              | A good first choice
------------------------------------------|------------------------------------------------------
Tabular data, need a strong baseline fast | Gradient boosting (XGBoost / LightGBM)
A simple linear relationship              | Linear / logistic regression
Need interpretability                     | Decision tree or linear model
Images, audio, or text                    | A neural network (deep learning)
No labels, want structure                 | Clustering (k-means) or dimensionality reduction (PCA)

The unglamorous truth

On tabular business data, gradient-boosted trees still often match or beat deep learning, and they are usually faster and cheaper to train. Reach for the fanciest tool only when simpler ones stall.

Overfitting and the bias–variance trade-off

Two ways a model fails:

  • Underfitting (high bias) — too simple to capture the pattern. Bad on both training and test data.
  • Overfitting (high variance) — memorized noise in the training data. Great on training, poor on test.

Overfitting is cured by regularization (penalizing complexity), more data, or a simpler model. Underfitting calls for the opposite: a richer model or better features. The gap between training and validation performance tells you which problem you have: a large gap points to overfitting, while poor scores on both with little gap point to underfitting.
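
As a rough rule of thumb, the diagnosis can be written down as a small function. This is only a sketch: the name `diagnose` and both thresholds (`target_error`, `max_gap`) are hypothetical values picked for illustration, and sensible cut-offs depend on the task and the metric.

```python
def diagnose(train_error, val_error, target_error=0.10, max_gap=0.05):
    """Classify a fit from its training and validation error rates.

    Both thresholds are illustrative assumptions, not standard values.
    """
    gap = val_error - train_error
    # High training error means the model cannot even fit the data it saw.
    if train_error > target_error:
        return "underfitting"
    # Low training error but much worse validation error: memorized noise.
    if gap > max_gap:
        return "overfitting"
    return "ok"


print(diagnose(0.30, 0.32))  # bad on both, small gap
print(diagnose(0.01, 0.25))  # great on training, poor on validation
print(diagnose(0.06, 0.08))  # acceptable error, small gap
```

The three calls print `underfitting`, `overfitting`, and `ok` in that order. The training error is checked first because a model that cannot fit its own training data needs more capacity or better features, whatever the gap looks like.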


