I think the reason this method is not popular is that there are still many ways to encode a semantic object graphically. A sentence can be broken down into words or letters. Table lines can be formed from multiple smaller lines, etc. But, as mentioned by the parent, rule based systems works reasonably well for reasonably focused problems. But you will never have a general purpose extractor since rules needs to be written by humans.
[1] https://poppler.freedesktop.org/ [2] https://gitlab.freedesktop.org/poppler/poppler/-/blob/master...