I know that GMX, for example, has had a leak at some point (or sold data), as an email address I created there ages ago was used in phishing. Okay, that's lame, but the phishers also used the fake name I had given to GMX, spelled perfectly. I've never used that name anywhere else when signing up, so it must have come from their database.