We are early enough in this evolution to help direct the ship in a way that serves the end user, web owners/creators, and the agent.
It’s almost inevitable since everyone wants more growth and advertising is almost always seen as free money left on the table by decision makers.
This is fine until the agent decides to order something the customer did not want. This is inherent to the concept of an agent. Due to the probabilistic nature of LLMs, and the fact that no agent will ever be able to predict exactly what you want at the time you want it, this scenario is inevitable.
For the shop owner, this would result in an increased number of returns. You could recommend that the user approve each purchase, but given that you do not define these agents, there is no way for you to ensure that the user is actually following your advice.
My entire extended family has two YubiKeys: my key and my spare key.
I realize there's a strong impulse to not "reinvent the wheel," but what we have currently is unsustainable. Specifically, every service exposes a slightly different REST API and its own unique authentication & authorization workflow. That worked fine in the days when application developers would spend a few weeks on each new integration, but it totally breaks down when you want to orchestrate an agent across many user-defined services.
I think a simple protocol based on JSON and bog-standard public key encryption could allow agents to coordinate and spend credits/money based on human-defined budgets.
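To make that idea concrete, here is a rough sketch of what a signed JSON spend request checked against a human-defined budget might look like. Everything in it is an assumption for illustration: the field names (`agent_id`, `payee`, `amount_cents`, `nonce`), the `BudgetGate` class, and the message envelope are all made up. Note also that Python's standard library has no asymmetric signing, so this sketch swaps in HMAC-SHA256 with a shared key as a stand-in; a real protocol along these lines would use a public-key signature such as Ed25519 so the verifier never holds the signing key.

```python
from __future__ import annotations

import hashlib
import hmac
import json
from dataclasses import dataclass


def canonical(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(payload: dict, key: bytes) -> str:
    # ASSUMPTION: HMAC-SHA256 with a shared key stands in for a real
    # public-key signature (e.g. Ed25519), which the stdlib does not provide.
    return hmac.new(key, canonical(payload), hashlib.sha256).hexdigest()


def make_request(agent_id: str, payee: str, amount_cents: int, nonce: str, key: bytes) -> str:
    """Build a hypothetical signed spend request as a JSON string."""
    payload = {
        "agent_id": agent_id,
        "payee": payee,
        "amount_cents": amount_cents,
        "nonce": nonce,
    }
    return json.dumps({"payload": payload, "signature": sign(payload, key)})


@dataclass
class Budget:
    limit_cents: int      # set by the human
    spent_cents: int = 0  # running total of approved spend


class BudgetGate:
    """Hypothetical verifier: checks signature, replay, and remaining budget."""

    def __init__(self, keys: dict, budgets: dict):
        self.keys = keys          # agent_id -> verification key
        self.budgets = budgets    # agent_id -> Budget
        self.seen_nonces = set()  # (agent_id, nonce) pairs already approved

    def authorize(self, message: str) -> tuple:
        envelope = json.loads(message)
        payload = envelope["payload"]
        agent_id = payload["agent_id"]
        key = self.keys.get(agent_id)
        budget = self.budgets.get(agent_id)
        if key is None or budget is None:
            return False, "unknown agent"
        if not hmac.compare_digest(sign(payload, key), envelope["signature"]):
            return False, "bad signature"
        if (agent_id, payload["nonce"]) in self.seen_nonces:
            return False, "replayed request"
        amount = payload["amount_cents"]
        if amount <= 0 or budget.spent_cents + amount > budget.limit_cents:
            return False, "over budget"
        self.seen_nonces.add((agent_id, payload["nonce"]))
        budget.spent_cents += amount
        return True, "approved"
```

With a 5000-cent budget, a first 3000-cent request is approved, resending the same message is rejected as a replay, and a second 3000-cent request with a fresh nonce is rejected as over budget. The point is only that the budget lives with the verifier, not with the agent.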
1) long tail of websites that don't have APIs, so the only way for an agent to interact with them on the user's behalf is to log in more conventionally, and
2) even if a website has APIs, there may be tasks to be done that are outside the scope of the provided APIs.
Thoughts?
I was an early engineer at Plaid and I think it's an interesting parallel: financial data aggregators used to rely more on a screenscraping model of integration, but over the past 5+ years they've moved almost fully to OAuth integrations. I would expect the adoption curve here to be much steeper than that. banks are notoriously slow, so I would expect tech companies to move even more quickly towards OAuth and APIs for agents.
another dimension of this is that it's quite easy to block AI agents that screenscrape. we're able to identify OpenAI's Operator, Anthropic's computer use API, Browserbase, etc. with almost 100% accuracy, so some sites might choose to block agents from screenscraping and require the API path.
all of this is still early too, so excited to see how things develop!
I've tried making a Firefox extension that fills web forms using an LLM, and the things website makers come up with to break their own forms for both humans and agents are just insane.
To name just one example, there are probably over 1,000 different ways to ask for someone's address that an agent (and/or human) would struggle to understand.
I think agents will be able to get through them easily, but NOT because website makers are going to do a better job of making their forms easier to use.
You can find its dogfooding demo on the Show HN [2].
- The human wouldn't need to share their password information with the agent
- Services would be able to block or ask for approval when agents take sensitive actions. Maybe an e-commerce site is happy to let an agent browse and add items to a cart, but wants a human in the loop for checkout.
- Services would be able to attribute any actions taken to the agent on behalf of the user. Did Joe approve this expense report, or did Joe's agent approve this expense report?
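As a rough illustration of the last two points, a service-side check could look something like the sketch below. The action names, the `AGENT_POLICY` table, and the `Actor` type are all hypothetical; the only idea taken from the list above is that the service can tell "Joe" apart from "Joe's agent", and can allow, block, or require human approval per action.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human_approval"
    BLOCK = "block"


@dataclass(frozen=True)
class Actor:
    user: str
    agent: Optional[str] = None  # None means the human acted directly

    def label(self) -> str:
        if self.agent is None:
            return self.user
        return f"{self.agent} on behalf of {self.user}"


# Hypothetical policy for an e-commerce site: agents may browse and fill a
# cart, checkout needs a human in the loop, account changes are blocked.
AGENT_POLICY = {
    "browse": Decision.ALLOW,
    "add_to_cart": Decision.ALLOW,
    "checkout": Decision.NEEDS_HUMAN,
    "change_password": Decision.BLOCK,
}


def decide(actor: Actor, action: str) -> Decision:
    """Humans are allowed through; agents follow the per-action policy."""
    if actor.agent is None:
        return Decision.ALLOW
    # Actions the policy does not list default to asking the human.
    return AGENT_POLICY.get(action, Decision.NEEDS_HUMAN)


def audit_entry(actor: Actor, action: str) -> str:
    """Attribute the action to the human or to the agent acting for them."""
    return f"{actor.label()} -> {action}: {decide(actor, action).value}"
```

Defaulting unlisted actions to human approval is just one possible choice here; a stricter service might block them instead.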
But parsing documentation? And believing it blindly? hah. Maybe resurrect the Semantic Web as well.
This gave me a chuckle. I believe the current hype term along this line is "ontologies".
This shows we need to build better approaches to agent interactions that are not at the level of "run a virtual browser", but that encode much more of the available workflows than raw APIs do today.
If you think AI has agency then you must think all software has agency. AI is just software.
To those of you who say humans are just software: try deactivating a human and see what happens. Note that this is a different experience than deactivating AI.