Understanding Residual Connections in Neural Networks
The problem of vanishing gradients
Residual connections were introduced in 2016 in the paper “Deep Residual Learning for Image Recognition” by He, Zhang, Ren, and Sun, published at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). By letting gradients flow smoothly through the network during training, residual connections mitigate the vanishing gradient problem. In this post, we will look at how residual connections work and the role they played in advancing computer vision.
The vanishing gradient problem
The vanishing gradient problem is a challenge that arises when training deep neural networks, and it becomes more severe as the number of layers grows. Imagine a convolutional network as a sequence of layers, where each layer processes the data and transforms it to learn useful features. During training, the network adjusts its internal parameters (weights) to make more accurate predictions. This adjustment is guided by gradients: values that indicate how much each parameter should change to reduce the difference between the predicted output and the actual output.
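To make this concrete, here is a minimal sketch (assuming PyTorch; the two-layer model, dummy data, and learning rate are made up for illustration) of how backpropagation computes a gradient for each weight and how those gradients nudge the weights toward a lower loss:

```python
import torch
import torch.nn as nn

# A made-up two-layer model and a dummy batch, purely for illustration.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
x = torch.randn(8, 16)       # batch of 8 input samples
target = torch.randn(8, 1)   # corresponding dummy targets

prediction = model(x)
loss = nn.functional.mse_loss(prediction, target)  # difference between prediction and target
loss.backward()              # backpropagation fills each parameter's .grad

with torch.no_grad():
    for p in model.parameters():
        p -= 0.01 * p.grad   # move each weight a small step opposite its gradient
        p.grad = None        # clear the gradient before the next training step
```

In practice an optimizer such as SGD or Adam performs this update for you, but the manual loop shows what the gradients are actually used for: they are the only signal telling each weight which way to move.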
The vanishing gradient problem occurs because, as gradients are propagated backward through the layers during training, they can become extremely small…
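The effect is easy to measure. The sketch below (again assuming PyTorch; the depth of 30 blocks, width of 32, and sigmoid activations are illustrative choices, not values from the paper) stacks identical blocks, backpropagates from the output, and prints the gradient norm that reaches the very first layer, once for a plain stack and once for a stack with skip connections:

```python
import torch
import torch.nn as nn

class PlainBlock(nn.Module):
    """A plain block: the input only passes through the layer."""
    def __init__(self, width):
        super().__init__()
        self.layer = nn.Sequential(nn.Linear(width, width), nn.Sigmoid())

    def forward(self, x):
        return self.layer(x)

class ResidualBlock(nn.Module):
    """A residual block: the input is added back to the layer's output."""
    def __init__(self, width):
        super().__init__()
        self.layer = nn.Sequential(nn.Linear(width, width), nn.Sigmoid())

    def forward(self, x):
        return x + self.layer(x)   # skip connection

def first_layer_grad_norm(block_cls, depth=30, width=32):
    blocks = nn.Sequential(*[block_cls(width) for _ in range(depth)])
    x = torch.randn(4, width)
    blocks(x).sum().backward()
    # Gradient that reaches the earliest layer after backpropagating through all blocks.
    return blocks[0].layer[0].weight.grad.norm().item()

print("plain:   ", first_layer_grad_norm(PlainBlock))
print("residual:", first_layer_grad_norm(ResidualBlock))
```

On a typical run, the gradient reaching the first layer of the plain stack is many orders of magnitude smaller than in the residual stack, which is precisely the behavior that residual connections are designed to counteract.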