I'm writing a textbook on a topic no existing book covers: the internals of a web browser.
On your broader question: I understand what you're saying, but it's very difficult to edit someone else's writing. That's where "committee voice" comes from: it's the lowest common denominator of multiple authors working together. And how I visualize something usually comes from how I think about it, so coming up with a visualization for how someone else thinks about it is hard.
Take the OP as an example. It's a long blog post on gears in general, but animated by the specific question "what shape are gear teeth?" If I were writing a blog post about gears, I wouldn't start from that place. And then imagine if this blog post had started out text-only instead of visual. "Involute" would now be described with algebra, not a picture. The algebra is complex (compare the Wikipedia article at https://en.wikipedia.org/wiki/Involute), and that algebra itself would need pictures. Illustrations and explorables aren't, ideally, something you sprinkle onto existing text.