
Neural Networks

The computational foundation of modern AI

deep-learning · neurons · backpropagation

Overview

A neural network is a layered system of mathematical functions loosely inspired by biological neurons. Each layer transforms its input through weighted connections and non-linear activation functions, learning representations of increasing abstraction. Deep neural networks—those with many layers—power virtually every modern AI system from image recognition to language models.
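The layered transformation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the layer sizes (4 inputs, 8 hidden units, 3 outputs) are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # Non-linearity applied element-wise after each weighted sum
    return np.maximum(0.0, x)

def forward(x, W1, b1, W2, b2):
    # Layer 1: weighted connections, then activation
    h = relu(x @ W1 + b1)
    # Layer 2: another weighted transformation producing the output
    return h @ W2 + b2

# Arbitrary example sizes: 4 inputs -> 8 hidden units -> 3 outputs
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

x = rng.normal(size=(2, 4))          # a batch of 2 input examples
y = forward(x, W1, b1, W2, b2)
print(y.shape)                        # (2, 3)
```

Stacking more such layers, each feeding the next, is what makes a network "deep": every additional layer can build on the representations learned by the one before it.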

Key Concepts

  • Forward pass: input data flows through layers, each computing a weighted sum then applying an activation function
  • Activation functions (ReLU, sigmoid, softmax): introduce non-linearity that allows the network to learn complex patterns
  • Backpropagation: computes gradients of the loss function with respect to each weight using the chain rule
  • Gradient descent: iteratively adjusts weights in the direction that reduces the loss
  • Convolutional layers (CNNs): detect spatial patterns in images; recurrent layers (RNNs/LSTMs): model sequential dependencies
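The first four concepts fit together in a single training loop: a forward pass, a loss, backpropagation via the chain rule, and a gradient-descent update. Below is a hedged sketch that learns the toy function y = 2x with one hidden ReLU layer; the hidden size (16), learning rate (0.1), and step count (500) are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy regression data: the target function is y = 2x
X = rng.normal(size=(64, 1))
Y = 2.0 * X

W1, b1 = rng.normal(size=(1, 16)) * 0.5, np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)) * 0.5, np.zeros(1)
lr = 0.1

for step in range(500):
    # Forward pass: weighted sums and a ReLU activation
    z = X @ W1 + b1
    h = np.maximum(0.0, z)
    pred = h @ W2 + b2
    loss = np.mean((pred - Y) ** 2)      # mean squared error

    # Backpropagation: apply the chain rule layer by layer
    d_pred = 2.0 * (pred - Y) / len(X)   # dL/d(pred)
    dW2 = h.T @ d_pred
    db2 = d_pred.sum(axis=0)
    d_h = d_pred @ W2.T                  # gradient flowing back through layer 2
    d_z = d_h * (z > 0)                  # ReLU passes gradient only where z > 0
    dW1 = X.T @ d_z
    db1 = d_z.sum(axis=0)

    # Gradient descent: step each weight against its gradient
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(loss)   # should be close to zero after training
```

Note how the backward pass mirrors the forward pass in reverse: each gradient is computed from the one downstream of it, which is exactly the chain rule mentioned above.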

Key Facts

  • The universal approximation theorem states a single hidden layer can approximate any continuous function—but deep networks are far more efficient
  • AlexNet (2012) sparked the deep learning revolution by winning the ImageNet challenge with a top-5 error of 15.3%, roughly 10 points better than the best classical method
  • Modern large language models contain billions of learned parameters—GPT-3 has 175 billion
  • Dropout regularisation randomly zeroes neurons during training to prevent the network from memorising rather than generalising
  • Batch normalisation, introduced in 2015, stabilises training and allows much higher learning rates
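The dropout fact above is easy to see concretely. The sketch below uses the common "inverted dropout" formulation, where surviving activations are rescaled during training so that the expected activation is unchanged at inference time; the drop probability of 0.5 is an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def dropout(h, p, training=True):
    # Inverted dropout: zero each unit with probability p during
    # training and rescale survivors by 1/(1-p); do nothing at inference
    if not training:
        return h
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

h = np.ones((4, 10))
out = dropout(h, p=0.5)
# Every unit is now either dropped (0.0) or rescaled (2.0),
# so the expected value of each activation is still 1.0
print(np.unique(out))
```

Because a different random mask is drawn every step, no single neuron can be relied upon, which is what pushes the network toward generalising rather than memorising.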