> The reason that’s okay is because you aren’t competing against the initial source material.
I don't mean to claim that search engine image thumbnailing is like-for-like in every respect, just that it demonstrates no "human spirit" is required for a use to qualify as "transformative" as far as fair use is concerned. Search engine image thumbnailing has been found to be transformative, for instance in Perfect 10, Inc. v. Amazon.com, Inc.: "Google's use of thumbnails is highly transformative."
And, though I'm probably being pedantic here, I think it's important to note that the other fair use factor you allude to is not whether you're "competing against" the original work, but specifically the effect of your use on the market for, or value of, that original work. For example, if your documentary uses a clip from a TV show and also happens to air in the same time slot as that show, the extent to which you compete with the show in general (as you would even had you not included the clip) is not what's under consideration. What matters is only the additional extent to which you displace its market specifically because you included that clip.
Because of that, I'd claim that a machine-learning-based tool that partially displaces the market for a work it was trained on (for instance, Google Translate displacing the market for a translated version of a book) might still be viewed reasonably favorably under the market-effect factor, so long as the extent to which it displaces that work is largely independent of whether it was trained on that work specifically (for example, if the translation tool could already produce a decent translation of the original book before ever training on its translated version).