This was especially apparent when I tried to get my hands on some weights during the early corona pandemic. All webshops were out of stock, but in about 80% of them the schema markup indicated otherwise.
Edit: I think it was from this discussion: https://news.ycombinator.com/item?id=24057228
As far as selectors go, we're currently working through how to support more specific selection when there are multiple HTML element alternatives for a single visual element (e.g., nested tags with no padding/margin).
This isn't too big of an issue for scraping text, but it comes into play when pulling attributes, or when using the selectors to modify the page structure. (Our tool lets you place buttons, panels, etc., copying the style/structure from existing elements.)
For our product (PixieBrix) we actually generally grab the data directly from the front-end framework (e.g., React props). It's a bit less stable since it's effectively an internal API, but it means you can grab a lot of data with a single selector and can generally avoid parsing values out of text.
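To illustrate what reading props off the framework can look like, here is a minimal sketch. It is not PixieBrix's implementation. It assumes React 17+, which (as an internal detail, not a public API) stores a host element's latest props on the DOM node under a key starting with "__reactProps$" followed by a random suffix. The function name readReactProps is hypothetical, and the function takes a plain object so it can be tried outside a browser.

```typescript
// Assumption: React 17+ attaches the props of a host element to its DOM node
// under an own property whose key starts with this prefix. This is internal
// to React and may change between releases.
const REACT_PROPS_PREFIX = "__reactProps$";

// Hypothetical helper: return the props object stored on the node, if any.
function readReactProps(node: object): Record<string, unknown> | undefined {
  // The suffix is random per page load, so look the key up by prefix.
  const key = Object.keys(node).find((k) => k.startsWith(REACT_PROPS_PREFIX));
  if (key === undefined) {
    return undefined;
  }
  return (node as Record<string, unknown>)[key] as Record<string, unknown>;
}

// Usage with a stand-in for a DOM node (in a page you would pass an element
// obtained from document.querySelector).
const fakeNode = { "__reactProps$k3x9": { userId: 7, label: "Save" }, other: 1 };
const props = readReactProps(fakeNode);
console.log(props?.userId);
```

Note this only reaches the props of the DOM element itself. Props of the surrounding component would have to come from React's internal fiber tree, which is even less stable.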
I would assume this might change on recompiles or at least library updates, never mind internal code changes. Do you find that it works in practice?
As I mentioned in the post, dynamic CSS classnames are also tricky depending on how much gets mangled. We have some techniques in the pipeline for better handling those.
The most popular framework we haven't implemented support for yet is Angular. (AngularJS, the old version, is straightforward.) Anything built with a compiler that mangles identifiers, e.g., Google Closure Compiler, is difficult. I suspect Svelte might also be tricky, but we haven't tried that yet.
At the end of the day though, every framework has to write to the DOM and be accessible. So you can use selectors, or in the worst case OCR/computer vision. (IIRC, FB actively inserts dummy elements to try to prevent structural scraping).
Singular elements: data-test-save-button, data-test-name-input
Elements that are a part of a list: data-test-user={user.id}, data-test-listing={listing.id}
This allows us to name our elements with data test attributes, but also provide values to them where applicable.
I have also created a testSelector function that takes id and value, and spits out either [data-test-${id}="${value}"] or [data-test-${id}].
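A minimal sketch of what such a helper could look like, going only by the description above. The escaping of quotes in the value is my own addition and not something the original function is known to do.

```typescript
// Sketch of a testSelector helper: builds a CSS attribute selector for a
// data-test-* attribute, with or without a value.
function testSelector(id: string, value?: string | number): string {
  if (value === undefined) {
    // Singular element, e.g. data-test-save-button
    return `[data-test-${id}]`;
  }
  // Assumption: escape backslashes and double quotes so the value cannot
  // break out of the quoted attribute string.
  const escaped = String(value).replace(/["\\]/g, "\\$&");
  // List element, e.g. data-test-user="42"
  return `[data-test-${id}="${escaped}"]`;
}

console.log(testSelector("save-button")); // [data-test-save-button]
console.log(testSelector("user", 42));    // [data-test-user="42"]
```

The result can be passed straight to document.querySelector or to whatever selector API the test runner exposes.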
We have also experimented with letting shared components populate their own data-test-* attribute automatically based on other props. For example, our modal component sets data-test-modal={title}. Compare data-test-delete-user-modal with data-test-modal="Delete user": in the latter case, the dev does not need to provide the data-test-* attribute manually, since the component takes care of it.
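Roughly, the idea is that the shared component derives the attribute from a prop it already receives. A framework-agnostic sketch, where ModalProps and modalTestAttributes are hypothetical names and not our actual component code:

```typescript
// Hypothetical props for a shared modal component.
type ModalProps = { title: string };

// Derive the data-test attribute from the title, so callers never have to
// pass a data-test-* attribute themselves.
function modalTestAttributes(props: ModalProps): Record<string, string> {
  return { "data-test-modal": props.title };
}

// In a JSX-based component the result would be spread onto the root element,
// e.g. <div {...modalTestAttributes(props)}>. Here we just print it.
console.log(modalTestAttributes({ title: "Delete user" }));
```

A test can then target the modal with [data-test-modal="Delete user"] without anyone having named it explicitly. The tradeoff is that the selector now changes whenever the title copy changes.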