That's rather a bold claim given that artificial neural networks are universal function approximators.
It's perhaps not terribly surprising that it becomes possible with unlimited width or depth (or an arbitrarily complex activation function).
https://en.wikipedia.org/wiki/Universal_approximation_theore...
The universal approximation theorem only applies to continuous functions. Discontinuous functions can only be approximated to the extent that they belong to the same "class" as the activation function.
Additionally, the theorem only proves that, for any given continuous function, there exists a particular NN with particular weights that approximates that function to a given precision. It says nothing about whether training can find those weights, and the same NN isn't guaranteed to approximate any other function to some desired precision.
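The existence half of the theorem can even be made constructive for simple targets. As a hedged sketch (the target, knot count, and names are my own choices), here is a one-hidden-layer ReLU network whose weights are written down by hand, with no training involved, approximating sin on [0, 2π] by piecewise-linear interpolation:

```python
import numpy as np

# Hand-constructed one-hidden-layer ReLU network approximating
# f(x) = sin(x) on [0, 2*pi]: weights chosen by construction
# (piecewise-linear interpolation at knots), no training involved.
f = np.sin
knots = np.linspace(0.0, 2.0 * np.pi, 50)   # breakpoints
vals = f(knots)
slopes = np.diff(vals) / np.diff(knots)     # slope on each segment

def relu(z):
    return np.maximum(z, 0.0)

def net(x):
    # First hidden unit carries the initial slope; each later unit
    # adds the slope *change* at its interior knot.
    out = vals[0] + slopes[0] * relu(x - knots[0])
    for k in range(1, len(slopes)):
        out = out + (slopes[k] - slopes[k - 1]) * relu(x - knots[k])
    return out

xs = np.linspace(0.0, 2.0 * np.pi, 1000)
err = np.max(np.abs(net(xs) - f(xs)))
print(err)  # sup-norm error of the interpolant, well under 0.01 here
```

The point of the sketch: the theorem's "there exist weights" is easy to witness for a smooth target, but nothing here involved, or guaranteed anything about, gradient-based training.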
It seems pretty obvious to me that most interesting behaviors in the real world can't be modelled by a mathematical function at all (that is, with a single output for each input); the class shrinks even further once we restrict to continuous functions, or step functions, or whatever restriction we get from our chosen activation function.
Yes, and?
> Training is not necessarily possible
That would be surprising, do you have any examples?
> and the same NN isn't guaranteed to approximate any other function to some desired precision.
Well duh. Me speaking English doesn't mean I can tell 你好[0] from 泥壕[1] when spoken.
> It seems pretty obvious to me that most interesting behaviours in the real world can't be modelled by a mathematical function at all (that is, for each input having a single output)
I think all of physics would disagree with you there, what with it being built up from functions where each input has a single output. Even Heisenberg uncertainty and the quantised results of the Stern-Gerlach setup can be modelled that way in silico with high correspondence to reality, despite Bell-inequality tests ruling out local hidden variables.
[0] Nǐ hǎo, meaning "hello"
[1] Ní háo, which Google says means "mud trench", but I wouldn't know
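For what it's worth, the "single output per input" framing survives quantum randomness if the output is a probability distribution rather than an outcome. A toy sketch (the angle convention is the usual one for a spin-1/2 state; the function name is my own) of Stern-Gerlach outcome probabilities:

```python
import numpy as np

# Quantum measurement as a single-valued function: for the state
# |psi> = cos(theta/2)|up> + sin(theta/2)|down>, the *outcome* of a
# Stern-Gerlach measurement is random, but the map from state to
# outcome distribution is an ordinary one-output-per-input function.
def stern_gerlach_distribution(theta):
    p_up = np.cos(theta / 2.0) ** 2     # Born rule
    return {"up": float(p_up), "down": float(1.0 - p_up)}

d = stern_gerlach_distribution(np.pi / 3.0)
print(d)  # p_up = cos(pi/6)^2 = 0.75
```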
It means that there is no guarantee that, given a discontinuous function f(x), there exists an NN that approximates it over its entire domain to within some precision p.
> That would be surprising, do you have any examples?
Do you know of a universal algorithm that takes a continuous function and a target precision, and returns an NN architecture (number of layers, neurons per layer), a starting set of weights, and a training set, such that training the NN is guaranteed to reach the approximating state?
All I'm claiming is that there is no known algorithm of this kind, and also that the existence of such an algorithm is not guaranteed by any known theorem.
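To make the existence/training distinction concrete, here is a hedged sketch (the architecture, seed, learning rate, and iteration count are arbitrary choices of mine, not an algorithm with guarantees): plain gradient descent on a tiny tanh network fitting sin. It typically works in practice, but nothing in the approximation theorem says this loop must reach the weights whose existence the theorem promises:

```python
import numpy as np

# Plain full-batch gradient descent on a one-hidden-layer tanh net
# fitting sin(x). Illustrative only: the theorem guarantees good
# weights *exist*, not that this procedure finds them.
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

H = 20                                    # hidden width (arbitrary)
W1 = rng.normal(0.0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, (H, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

lr = 0.05
_, pred = forward(x)
loss0 = np.mean((pred - y) ** 2)          # loss before training
for _ in range(5000):
    h, pred = forward(x)
    g = 2.0 * (pred - y) / len(x)         # dL/dpred for MSE
    gW2 = h.T @ g; gb2 = g.sum(axis=0)
    gh = (g @ W2.T) * (1.0 - h ** 2)      # backprop through tanh
    gW1 = x.T @ gh; gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

_, pred = forward(x)
loss = np.mean((pred - y) ** 2)
print(loss0, loss)                        # loss drops substantially
```

Note that the hyperparameters above were picked by hand; no theorem told us this width, this learning rate, or this initialization would converge, which is exactly the gap being pointed out.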
> Well duh. Me speaking English doesn't mean I can tell 你好[0] from 泥壕[1] when spoken.
My point was relevant because we are discussing whether an NN might be equivalent to the human brain, and using the Universal Approximation Theorem to try to decide this. So what I'm saying is that even if "knowing English" were a continuous function and "knowing Mandarin" were a continuous function, so that by the theorem there are NNs approximating either one, there is no guarantee that a single NN exists which approximates both. There might or might not be one, but the theorem doesn't promise one must exist.
> I think all of physics would disagree with you there, what with it being built up from functions where each input has a single output.
It is built up from them, but there doesn't exist a single function that represents all of physics; you have different functions for different parts of physics. I'm not saying a single function couldn't be defined, but I also don't think it's proven that all of physics can be represented by one.