That wouldn’t work for PDF’s use case of being an arbitrary paper-like format, because the various Unicode and OpenType algorithms don’t provide sufficient functionality for rendering arbitrary text: there are no one-size-fits-all rules! The standards are a set of generic “best effort” guidelines for lowest-common-denominator text layout, and they are constantly being extended.
Even for English, the exact tweaking of line breaking and hyphenation is a problem that requires manual intervention from time to time. In mathematics research papers it’s not uncommon to see symbols that haven’t yet made it into Unicode. Look at the state of text on the web and you’ll encounter all these problems; even Google Docs gave in and now renders to a canvas.
PDF’s Unicode handling is indeed a big mess, but it does have the ability to associate any glyph with an arbitrary Unicode string for text extraction purposes. So there’s nothing to stop the program that generates the PDF from mapping the fi ligature glyph to the two-character string “fi”.
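To make that concrete, here is a rough sketch of how a generator might emit such a mapping. As far as I know, PDF does this through a font's ToUnicode CMap, where a `bfchar` entry pairs a glyph code with a UTF-16BE hex string. The function name `to_unicode_bfchar` and the glyph code `0x0105` are made up for illustration, and a real CMap also needs the surrounding header and codespace range, which this leaves out.

```python
def to_unicode_bfchar(mapping, code_bytes=2):
    """Build only the bfchar section of a ToUnicode CMap.

    mapping: glyph code (int) -> Unicode string it should extract as.
    code_bytes: assumed width of a glyph code in bytes.
    """
    width = code_bytes * 2  # hex digits per glyph code
    lines = [f"{len(mapping)} beginbfchar"]
    for code, text in sorted(mapping.items()):
        src = format(code, f"0{width}X")
        # Destination strings are written as UTF-16BE hex, so a single
        # glyph can map to more than one character.
        dst = text.encode("utf-16-be").hex().upper()
        lines.append(f"<{src}> <{dst}>")
    lines.append("endbfchar")
    return "\n".join(lines)


# Hypothetical glyph code for the fi ligature in some embedded font.
print(to_unicode_bfchar({0x0105: "fi"}))
```

The interesting part is the destination: `<00660069>` is two UTF-16 code units, “f” then “i”, so copy-paste and search see plain “fi” even though the page draws a single ligature glyph.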