Don't forget to use a validation set for model and hyperparameter selection, though!
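To make that concrete, here's a rough sketch of what I mean, not anyone's actual setup. Everything in it is made up for illustration: the target (divisible by a hypothetical N = 3), the toy "model" (a lookup table from `x % m` to the majority label), and the hyperparameter (the modulus `m`). The point is only the split: fit on train, choose `m` on validation, and touch the test set once at the end.

```python
import random
from collections import Counter, defaultdict

N = 3  # hypothetical target: is x divisible by N?

def make_split(seed=0, n=3000):
    """Shuffle 0..n-1 and split 60/20/20 into train/validation/test."""
    rng = random.Random(seed)
    xs = list(range(n))
    rng.shuffle(xs)
    data = [(x, x % N == 0) for x in xs]
    a, b = int(0.6 * n), int(0.8 * n)
    return data[:a], data[a:b], data[b:]

def fit(train, m):
    """Toy model: for each residue x % m, remember the majority label."""
    votes = defaultdict(Counter)
    for x, y in train:
        votes[x % m][y] += 1
    return {r: c.most_common(1)[0][0] for r, c in votes.items()}

def accuracy(table, m, data):
    """Fraction of examples whose predicted label matches the true one."""
    return sum(table.get(x % m, False) == y for x, y in data) / len(data)

def select_modulus(train, val, candidates):
    """Pick the hyperparameter m by validation accuracy, never test accuracy."""
    scores = {m: accuracy(fit(train, m), m, val) for m in candidates}
    best = max(candidates, key=lambda m: scores[m])
    return best, scores

train, val, test = make_split()
best_m, val_scores = select_modulus(train, val, [2, 4, 3, 5, 7])

# The test set is used exactly once, after the choice has been made.
test_acc = accuracy(fit(train, best_m), best_m, test)
print(best_m, val_scores, test_acc)
```

If you pick `m` by looking at test accuracy instead, the number you report at the end is no longer an honest estimate, which is the whole reason for keeping the third split.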
As with many models, I suspect that this network really learned some other property that has almost-but-not-quite the same pattern as divisible-by-N.
[1] https://en.wikipedia.org/wiki/Universal_approximation_theore...