It is called the Deep Galerkin Method [1]. In a nutshell, the method directly minimizes the L2 error of the PDE residual together with penalty terms for the boundary and initial conditions. The integral defining that error is intractable, though, so it is estimated with a Monte Carlo approximation.
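To make the idea concrete, here is a toy sketch of the residual-minimization loss (my own construction, not from the paper): the Deep Galerkin Method uses a neural network, but if we swap in a cubic polynomial as the trial function, the Monte Carlo least-squares problem becomes linear and can be solved in closed form. The example ODE u'(x) = u(x), u(0) = 1 is my choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Trial function u(x) = c0 + c1*x + c2*x^2 + c3*x^3 (a stand-in for the
# neural network). Target problem: u'(x) = u(x) on [0, 1], u(0) = 1,
# whose exact solution is e^x.
xs = rng.uniform(0.0, 1.0, size=200)  # Monte Carlo collocation points

phi = np.stack([np.ones_like(xs), xs, xs**2, xs**3], axis=1)          # basis
dphi = np.stack([np.zeros_like(xs), np.ones_like(xs), 2 * xs, 3 * xs**2],
                axis=1)                                               # basis'

# Each sampled point contributes one residual equation: u'(x) - u(x) = 0.
A = dphi - phi
b = np.zeros(len(xs))

# Append the initial condition u(0) = 1 as an extra, more heavily
# weighted equation (the analogue of the boundary penalty term).
w = 10.0
A = np.vstack([A, w * np.array([1.0, 0.0, 0.0, 0.0])])
b = np.append(b, w * 1.0)

# Minimizing the Monte Carlo L2 loss is now an ordinary least-squares solve.
c, *_ = np.linalg.lstsq(A, b, rcond=None)

def u(x):
    return c[0] + c[1] * x + c[2] * x**2 + c[3] * x**3

print(abs(u(0.5) - np.exp(0.5)))  # small: the cubic tracks e^x closely
```

The actual method replaces the polynomial with a deep network and the least-squares solve with stochastic gradient descent, but the loss being minimized has the same shape.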
As part of the course I got introduced to the FEniCS project [1].
Their Python code looked very much like the math equations; at runtime it generated C++ code, compiled it into a Python module, and dynamically loaded and executed it. This way they got speeds that rivaled or surpassed handwritten C++, since the generated code could be optimized around the specific problem, while keeping the superior ergonomics of writing the equations almost directly.
It really blew my mind. I had heard about Java doing JIT but this was on another level for me. Not terribly fancy these days but at the time it really helped me expand my thinking about how to solve problems.
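For readers who haven't seen this pattern, here is a minimal sketch of the generate-compile-load idea in pure Python (my own illustration, not FEniCS's actual code generator, which emits specialized C++ from the symbolic form):

```python
import types

def jit_polynomial(coeffs):
    """Generate source code for a polynomial evaluator specialized to
    `coeffs`, compile it at runtime, and load it as a module."""
    # Unroll the polynomial into straight-line code: this specialization
    # to the concrete problem is what the comment above is describing.
    terms = " + ".join(f"{c!r} * x**{i}" for i, c in enumerate(coeffs))
    src = f"def f(x):\n    return {terms}\n"
    mod = types.ModuleType("generated")
    exec(compile(src, "<generated>", "exec"), mod.__dict__)
    return mod.f

f = jit_polynomial([1.0, 3.0, 2.0])  # 1 + 3x + 2x^2
print(f(2.0))  # 15.0
```

FEniCS does the same dance one level down: the generated artifact is C++ compiled to a native extension, so the specialized code runs at machine speed.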
Still don't quite understand why you can't just use Runge-Kutta methods to numerically solve these problems. I became quite good at manipulating the symbols to derive variational solutions while having absolutely no idea what any of it meant.
* It can handle PDEs on domains with complicated geometries, while finite differences really prefer rectangular domains. This consideration doesn't apply to ODEs, which are always solved on one-dimensional intervals.
* For any numerical approximation it is important to have convergence guarantees, and as the blog post mentions, the analysis is much better understood for finite elements, particularly on irregular geometries. Strang and Fix's 1973 book is the classic reference here.
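To show what the finite element machinery looks like in its simplest form, here is a 1D sketch (my own construction, not from the post): solving -u'' = 2 on [0, 1] with u(0) = u(1) = 0 using piecewise-linear "hat function" elements, assembled element by element. The exact solution is u(x) = x(1 - x).

```python
import numpy as np

n = 8                      # number of elements
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)

# Assemble the global stiffness matrix and load vector from per-element
# contributions (dense for clarity; real codes use sparse matrices).
K = np.zeros((n + 1, n + 1))
F = np.zeros(n + 1)
k_local = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h  # int of phi_i' phi_j'
f_local = np.array([1.0, 1.0]) * h                  # int of 2 * phi_i
for e in range(n):
    idx = [e, e + 1]
    K[np.ix_(idx, idx)] += k_local
    F[idx] += f_local

# Enforce the Dirichlet conditions by solving only on interior nodes.
interior = slice(1, n)
u = np.zeros(n + 1)
u[interior] = np.linalg.solve(K[interior, interior], F[interior])

print(abs(u[n // 2] - 0.25))  # compare against the exact u(0.5) = 0.25
```

On a general 2D or 3D mesh the loop over elements and the local-to-global assembly look exactly the same; only the element geometry and the local integrals change, which is why the method adapts so well to complicated domains.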