Note

CS Degree Day 81

CS Degree in 100 Days

29 Aug'25

What I did today

  • Lecture 11: Convexity - convex sets, convex functions
  • Lecture 12: Gradient descent - derivation, step size, convergence
  • Lecture 13: Stochastic gradient descent

This is the connection I was waiting for. Gradient descent is how neural networks learn. The loss function is a surface in a high-dimensional space. The gradient tells you the direction of steepest ascent, so you step the opposite way. You walk downhill. The learning rate determines the step size. Too large and you overshoot. Too small and you never arrive. I have used gradient descent in PyTorch without understanding what it was doing. Now I do.
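The loop is small enough to write by hand. Here is a minimal sketch in plain Python (no PyTorch) on a toy convex function f(x) = (x - 3)^2 of my own choosing, just to see the step-size tradeoff; the learning rates are arbitrary picks, not values from the lectures.

```python
# Minimal gradient descent on f(x) = (x - 3)^2, whose minimum is at x = 3.
# The gradient f'(x) = 2(x - 3) points uphill, so we subtract it to walk downhill.

def gradient_descent(lr, steps=100, x0=0.0):
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 3)  # direction of steepest ascent at x
        x -= lr * grad      # step downhill, scaled by the learning rate
    return x

print(gradient_descent(lr=0.1))  # small enough: converges close to 3
print(gradient_descent(lr=1.1))  # too large: each step overshoots and diverges
```

With lr = 0.1 the error shrinks by a factor of 0.8 each step; with lr = 1.1 it grows by 1.2 each step, which is exactly the "too large and you overshoot" failure mode.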