I am waiting for this uneducated drivel of explaining NN performance by their 'universal function approximator' property to stop. There are tons of other schemes that are also universal approximators, and they were known before NNs were a thing. Why don't we use those? Why don't they work as well?
Learning from examples and generalizing is a very different problem from function approximation.
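
To make that distinction concrete, here is a minimal sketch; the target function, the sample count, and the helper names (`runge`, `lagrange_interpolant`) are my own choices for illustration. Polynomials are universal approximators on a closed interval (Weierstrass), yet a polynomial that fits 11 evenly spaced samples of a smooth function exactly can still be badly wrong between those samples:

```python
# Illustrative sketch only: the function and helper names are assumptions.

def runge(x):
    """Smooth target function, the classic Runge example."""
    return 1.0 / (1.0 + 25.0 * x * x)


def lagrange_interpolant(xs, ys):
    """Return the unique polynomial of degree len(xs) - 1 through the points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


# "Training set": 11 evenly spaced samples on [-1, 1].
n = 11
train_x = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
train_y = [runge(x) for x in train_x]
model = lagrange_interpolant(train_x, train_y)

# Worst-case error on the samples the model was fitted to.
train_err = max(abs(model(x) - y) for x, y in zip(train_x, train_y))

# "Test set": a dense grid over the same interval.
test_x = [-1.0 + 2.0 * i / 1000 for i in range(1001)]
test_err = max(abs(model(x) - runge(x)) for x in test_x)

print("max error on training points:", train_err)
print("max error on dense test grid:", test_err)
```

The fit on the training points is exact, while the error near the ends of the interval is larger than the function's own range. Being able to represent the target says nothing about what you get back from a finite set of examples.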