by trying to learn the general rules that explain the dataset and minimise its loss. That's what machine learning is about; it's not called machine memorising.
that would be the optimal rule :) We usually optimise to learn better and better rules, hopefully approximating the optimal rule after some iterations. There's still a gap between the learned rule and the optimal one, but hopefully it can be closed by improving the models, training algorithms, etc.
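
To make the "better and better rules" point concrete, here is a minimal sketch, assuming a toy dataset generated by the rule y = 2x + 1 and a plain gradient-descent loop. The data, the function names (`mse`, `train`) and the learning rate are all made up for illustration; nothing here is specific to any particular model or library.

```python
# Toy data produced by a known "optimal rule" y = 2x + 1 (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def mse(w, b):
    """Mean squared error of the candidate rule y = w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(steps, lr=0.05):
    """Run `steps` iterations of gradient descent on the MSE, starting from w = b = 0."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the MSE with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    for steps in (0, 10, 100, 2000):
        w, b = train(steps)
        print(f"steps={steps:5d}  w={w:.4f}  b={b:.4f}  loss={mse(w, b):.6f}")
```

Each extra batch of iterations moves (w, b) closer to (2, 1) and lowers the loss. Here the model family contains the optimal rule and the data is noise-free, so the gap closes completely; in real problems neither is guaranteed, which is where the remaining gap comes from.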