The problem with doing it via downloads or plugins is that it's OS-specific and maybe browser-specific (have fun making your plugin for Firefox on Windows, Chrome on Mac, etc.), and it also adds friction.
The reason I want to remote control someone's computer is that talking them through the actions is too tedious. The last thing I want to do is talk them through downloading and installing some browser plugin first.
Security and scamming are obviously concerns, but let's not pretend they're impossible to solve. People thought the Fullscreen API shouldn't be done because of security concerns, but that's laughable now.
As an initial step, they could at least support showing a "laser pointer" on other people's screens, so you can say "click here" instead of "up a bit, no... go back... no, third from the bottom, yeah, that one". That has zero security implications.
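To make the laser pointer idea concrete, here's a rough sketch of the coordinate handling, assuming the helper sends positions as fractions of the shared view and the other side only draws a dot. All names here (`PointerMessage`, `encodePointer`, `decodePointer`) are made up for illustration and aren't part of any existing browser API.

```typescript
// Hypothetical sketch: none of these types or functions exist in any browser API.
interface Size { width: number; height: number; }

// The only thing sent over the wire: a position as fractions of the shared view.
interface PointerMessage { kind: "pointer"; x: number; y: number; }

// Keep a value inside [0, 1] so a bad message can't place the dot off-screen.
function clamp01(v: number): number {
  return Math.min(1, Math.max(0, v));
}

// Helper's side: convert a pixel position in their view into a fraction.
function encodePointer(px: number, py: number, view: Size): PointerMessage {
  if (view.width <= 0 || view.height <= 0) {
    throw new Error("view must have a positive size");
  }
  return { kind: "pointer", x: clamp01(px / view.width), y: clamp01(py / view.height) };
}

// Viewer's side: convert the fraction back into pixels to position an overlay dot.
// Nothing here clicks, types, or otherwise generates input events.
function decodePointer(msg: PointerMessage, screen: Size): { left: number; top: number } {
  return {
    left: Math.round(clamp01(msg.x) * screen.width),
    top: Math.round(clamp01(msg.y) * screen.height),
  };
}
```

So `encodePointer(480, 270, { width: 960, height: 540 })` gives `x: 0.5, y: 0.5`, and decoding that on a 1920×1080 screen puts the dot at (960, 540). The receiving side only ever draws something; the actual clicking is still done by the person at the keyboard.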
So now, without the humor: how would you design the system to prevent abuse, remote code execution and such? Because if that part isn't clear, the idea should probably be shelved.