I found this article helpful when I had that same question. Basically BER has some less rigid specifications for how to encode the data that can be convenient for the implementation. Such as symbol-terminated sequences instead of having to know their length ahead of time. But this means that there are many equivalent serializations of the same underlying object, which is problematic for cryptography, so DER is an unambiguous subset of BER which will have only one correct possible serialization for a given object.
https://luca.ntop.org/Teaching/Appunti/asn1.html