It's usable, but very aggravating and uncomfortable to use.
But it seemed very solid overall, especially compared to the Selenium monstrosities I remember fighting (and giving up on) some years prior ...
I've never used Puppeteer, but having used a lot of US-hosted web services from Asia, I've seen plenty of latency-sensitive bugs (or at least annoyances not present for NYC-based users).
That's the best one. Of all the browser automators, Playwright is the most reliable: I've never had a wait fail or anything like that. Maybe I just got lucky, but if you're looking to do something with browser automation, try Playwright first, then look elsewhere.
Was it definitely not that there were inconsistencies in the pages that you were interacting with?
Curious as someone who has done some browser automation, but not in a while and never with Puppeteer.
As a web dev generalist, I can usually understand how most things work under the hood.
But playing with chrome.browserless.io breaks that. You're streaming the web page in a <canvas> element, but how can I highlight text? When I load a youtube video page are you literally proxying the video through your infra, through <canvas> pixels to my browser?
Who dictates which IP the headless Chrome is assigned? Do you have a lot of IPs? I noticed on some pages I'd get the Cloudflare captcha, which makes sense if browserless has to cycle through a limited pool of IPs that other people have already used to scrape other Cloudflare-protected pages.
As far as the hovering goes, the canvas element is “mirroring” interactions back through to the underlying page. When DevTools is active, this triggers Chromium to render hover effects in its GUI, which then get sent back to the canvas element in the debugging page.
It’s a lot of network traffic and synchronization... but once everything is set up it works fairly seamlessly.
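The mirroring can be sketched roughly like this (my guess at the wiring, not browserless's actual code; `canvas` and `cdp` are hypothetical handles to the mirror element and a DevTools session):

```javascript
// Translate a client-space pointer position into remote-page coordinates,
// accounting for where the canvas sits and any scaling of the screencast.
function toPageCoords(clientX, clientY, canvasRect, pageWidth, pageHeight) {
  const scaleX = pageWidth / canvasRect.width;
  const scaleY = pageHeight / canvasRect.height;
  return {
    x: (clientX - canvasRect.left) * scaleX,
    y: (clientY - canvasRect.top) * scaleY,
  };
}

// In the browser, the wiring would look something like this: forward each
// mouse move to the remote browser over the Chrome DevTools Protocol, so
// headless Chrome renders the hover state and streams the frame back.
//
// canvas.addEventListener('mousemove', (e) => {
//   const { x, y } = toPageCoords(e.clientX, e.clientY,
//     canvas.getBoundingClientRect(), remoteWidth, remoteHeight);
//   cdp.send('Input.dispatchMouseEvent', { type: 'mouseMoved', x, y });
// });
```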
> The application is written in TypeScript, and produces a static asset in the static directory once built.
What should I do with said artifact? How do I put it to use?
https://github.com/browserless/chrome/blob/5627f1ef041ec23f3...
The tour is still up at [2]. The servers that actually run the Remote Browser have since gone down, but interestingly you can still run the tour. That's because if you don't change the code in the REPL window, you get cached results (except step 7/7 which scrapes Hacker News and won't work). To get those results, we built a little tour "recorder" that would be run on every release. If I remember correctly, we allowed some dynamic ES6 imports through a custom Babel compiler for the code that's input, which also allows first level async stuff, which still works :)
[1]: https://github.com/intoli/remote-browser [2]: https://intoli.com/tour/
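The "first level async" part can be sketched like this (a hedged guess at the approach, not Intoli's actual compiler): wrap the REPL input in an async IIFE before evaluating it, so `await` and `return` work at the top of the user's snippet and the REPL gets back a promise it can render once it settles.

```javascript
// Wrap user-supplied REPL source in an async IIFE. Evaluating the result
// yields a promise, so top-level `await` inside the snippet just works.
function wrapReplCode(source) {
  return `(async () => {\n${source}\n})()`;
}

// Usage: `await` and `return` are legal at the top of the snippet.
const wrapped = wrapReplCode("const x = await Promise.resolve(41);\nreturn x + 1;");
eval(wrapped).then((result) => console.log(result)); // logs 42
```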
Is Puppeteer running on a web server with the REPL connecting to it? Or is Puppeteer completely contained within each user's browser?
I’m curious whether it’s possible to proxy the network requests so that, for example, they would use the browser’s IP address instead of the server’s.
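On the proxying question: Chromium accepts a `--proxy-server` switch, so a server-hosted Puppeteer could in principle route page traffic through a proxy running on the user's end (getting that proxy reachable from the server, e.g. via a reverse tunnel, is the hard part). A minimal sketch of just the Chrome-side configuration, with placeholder names:

```javascript
// Build Puppeteer launch options that point Chrome at an upstream proxy.
// If the proxy ran on the user's machine, page requests would egress from
// the user's IP rather than the server's. The URL below is a placeholder.
function launchOptionsWithProxy(proxyUrl) {
  return {
    headless: true,
    args: [`--proxy-server=${proxyUrl}`], // standard Chromium switch
  };
}

// e.g. const browser = await puppeteer.launch(
//   launchOptionsWithProxy('http://127.0.0.1:8080'));
```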