1. You'll use Matlab in any other linear algebra or numerical methods course as well as in any digital signals course.
2. Optimization algorithms, of which gradient descent is a subset, are deployed in production in many languages, very often not Python.
3. There is almost nothing to learn. For the programming assignments in the course, Octave is used as a succinct DSL for matrix math. The assignments were to simply write the math in a computer and watch what happens when you run the computations.
4. You wouldn't learn Python by completing the programming assignments because you're just calling numerical routines, not dealing with anything else. Writing the code in Python simply adds more opportunity for error with no pedagogical benefit.