This was my approach too and it's been working great. Nowadays data isn't rendered directly into HTML anymore; it gets downloaded from some JSON API endpoint. So I use network monitoring tools to see where it's coming from and then interface with the endpoint directly. I essentially wrote custom clients for someone else's site. One of my scrapers is actually just curl piped into jq. Sometimes they change the API and I have to adapt, but that's fine.
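For anyone who wants more than a shell one-liner, here is a rough Python sketch of the same idea: fetch the JSON endpoint directly, then pick out the fields you care about. Everything specific in it is made up for illustration: the `API_URL` value, the `my-scraper/0.1` user agent, and the `items`, `id`, and `title` keys are assumptions, and a real site's response will be shaped differently.

```python
import json
import urllib.request

# Hypothetical endpoint: substitute the URL found with network monitoring tools.
API_URL = "https://example.com/api/items?page=1"


def fetch_json(url, timeout=10.0):
    """Download and decode a JSON document, roughly what `curl -s URL` does."""
    request = urllib.request.Request(
        url,
        # Placeholder headers; some endpoints may expect others.
        headers={"Accept": "application/json", "User-Agent": "my-scraper/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_items(payload):
    """Keep only the fields of interest, like `jq '.items[] | {id, title}'`.

    The "items", "id" and "title" keys are assumed, not taken from a real API.
    Missing fields come back as None instead of raising.
    """
    return [
        {"id": item.get("id"), "title": item.get("title")}
        for item in payload.get("items", [])
    ]


def scrape(url):
    """Fetch the endpoint and return the trimmed-down records."""
    return extract_items(fetch_json(url))
```

Calling `scrape(API_URL)` would return a list of small dicts. Keeping `extract_items` separate from the network call means that when the API changes, usually only that one function needs adapting, and it can be checked against a saved response without hitting the site.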
> I understand companies can put roadblocks to hinder this
Can you elaborate? I haven't run into any roadblocks yet, but I'm not scraping big sites or sending a massive number of requests.