But this is where things break down.
Most modern apps don’t have fine-grained permissions.
Concrete example: Vercel. If I want an agent to read logs or inspect env vars, I have to give it a token that also allows it to modify or delete things. There’s no clean read-only or capability-scoped access.
And this isn’t just Vercel. I see the same pattern across cloud dashboards, CI/CD systems, and SaaS APIs that were designed around trusted humans, not autonomous agents.
So the real question:
How are people actually restricting AI agents in production today?
Are you building proxy layers that enforce policy? Wrapping APIs with allowlists? Or just accepting the risk?
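To make the "proxy with an allowlist" option concrete, here is a minimal sketch of the policy check such a proxy might run before forwarding an agent's request with the real, over-privileged token. The routes below are hypothetical placeholders, not actual Vercel endpoints, and query parameters are ignored, which may be too permissive for a real API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    method: str
    pattern: str  # "*" matches exactly one path segment


# Hypothetical read-only policy; these paths are illustrative only.
READ_ONLY_RULES = (
    Rule("GET", "/projects/*/logs"),
    Rule("GET", "/projects/*/env"),
)


def _segments(path: str) -> list:
    return path.strip("/").split("/")


def _matches(rule: Rule, method: str, path: str) -> bool:
    if method.upper() != rule.method:
        return False
    want, got = _segments(rule.pattern), _segments(path)
    if len(want) != len(got):
        return False
    # A wildcard never matches empty or dot segments, to block traversal.
    return all(
        w == g or (w == "*" and g not in ("", ".", ".."))
        for w, g in zip(want, got)
    )


def is_allowed(method: str, path: str, rules=READ_ONLY_RULES) -> bool:
    """Default deny: a request passes only if some rule matches it."""
    path = path.split("?", 1)[0]  # query string is not checked in this sketch
    return any(_matches(rule, method, path) for rule in rules)
```

The agent only ever gets credentials for the proxy. The upstream token stays on the proxy side and is attached only when `is_allowed` returns true.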
It feels like we’re trying to connect autonomous systems to infrastructure that was never designed for them.
Curious how others are handling this in real setups, not theory.
I use strong reasoning models to understand a codebase or plan changes, then switch to faster and cheaper models for implementation and refactors. In practice this means mixing providers like Anthropic, OpenAI, and Google, because different tasks need different capabilities.
So why do AI coding platforms insist on a single model, often from a single provider, for the entire workflow?
Why burn expensive reasoning tokens while writing boilerplate? Why should planning, coding, and review all be done by the same “brain”? Why do users have to manually glue models together when platforms already have full task context?
This feels less like a technical limitation and more like a product decision.
Maybe multi-model coordination is genuinely hard. Maybe handoffs lose context. Maybe this breaks the “one AI engineer” narrative that demos well.
But engineers already do this themselves today.
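For reference, the manual glue is roughly a routing table plus an explicit handoff object. This is only a sketch: the provider and model names are placeholders, and `call_model` is a hypothetical callable you would supply to wrap whichever SDKs you actually use.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"


@dataclass(frozen=True)
class ModelChoice:
    provider: str
    model: str


# Placeholder names; swap in whatever providers and models you really use.
ROUTES = {
    Stage.PLAN: ModelChoice("provider-a", "reasoning-large"),
    Stage.IMPLEMENT: ModelChoice("provider-b", "fast-small"),
    Stage.REVIEW: ModelChoice("provider-c", "reasoning-medium"),
}


@dataclass
class Handoff:
    """Context carried between stages so later models see earlier output."""
    task: str
    notes: list = field(default_factory=list)


def run_pipeline(task, call_model):
    """Run plan -> implement -> review, each stage on its routed model.

    call_model(choice, stage, handoff) is supplied by the caller and
    returns that stage's text output.
    """
    handoff = Handoff(task)
    for stage in (Stage.PLAN, Stage.IMPLEMENT, Stage.REVIEW):
        output = call_model(ROUTES[stage], stage, handoff)
        handoff.notes.append(f"[{stage.value}] {output}")
    return handoff
```

The `Handoff` object is where context loss would show up: anything a stage does not write into `notes` is invisible to the next model.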
Has anyone run multi-model workflows on real repos? Did it fail in non-obvious ways?
Interested in real experience, not demos.
I’m working on AI agents used for software development. These agents automatically spin up short-lived app instances – for example per pull request, per task, or per experiment – each with its own temporary URL.
Auth is handled in the standard way:
- OAuth2 / OIDC
- external identity provider
- redirect URLs must be static and registered in advance
This clashes badly with short-lived apps:
- URLs are dynamic and unpredictable
- redirect URLs can’t realistically be pre-registered
- auth becomes the only non-ephemeral part of an otherwise fully automated workflow
What I see teams doing instead:
- disabling real auth in preview environments
- routing all callbacks through a single stable environment
- using wildcard redirects or proxy setups that feel like hacks
This gets especially awkward for AI dev agents, because they assume infrastructure is disposable and fully automated – no manual IdP config in the loop.
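For context, the "single stable callback" workaround usually looks something like the sketch below. The preview app puts its own URL into a signed OAuth `state` value, the IdP redirects to the one registered callback, and that callback verifies the signature and host before bouncing the user back. The secret, the domain suffix, and the function names are all assumptions for illustration. Expiry, binding the state to a browser session, and getting tokens into the preview app safely are left out.

```python
import base64
import hashlib
import hmac
import json
import secrets
from urllib.parse import urlsplit

SECRET = b"replace-me"                   # hypothetical key held by the stable callback
ALLOWED_SUFFIX = ".preview.example.com"  # hypothetical preview domain


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode().rstrip("=")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str, secret: bytes) -> str:
    return _b64(hmac.new(secret, payload.encode(), hashlib.sha256).digest())


def make_state(return_to: str, secret: bytes = SECRET) -> str:
    """Build the OAuth `state` value carrying the preview URL."""
    body = {"return_to": return_to, "nonce": secrets.token_urlsafe(16)}
    payload = _b64(json.dumps(body).encode())
    return f"{payload}.{_sign(payload, secret)}"


def resolve_return_to(state: str, secret: bytes = SECRET):
    """Return the preview URL to redirect to, or None if state is invalid."""
    try:
        payload, sig = state.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(sig.encode(), _sign(payload, secret).encode()):
        return None
    return_to = json.loads(_unb64(payload))["return_to"]
    parts = urlsplit(return_to)
    host = parts.hostname or ""
    # Only bounce to HTTPS hosts under the preview domain (no open redirect).
    if parts.scheme != "https" or not host.endswith(ALLOWED_SUFFIX):
        return None
    return return_to
```

Only the stable callback URL is registered with the IdP. Each preview instance calls `make_state` with its own URL when starting the login, and the callback uses `resolve_return_to` to decide where to send the user next.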
So I’m curious:
1. If you use short-lived preview apps, how do you handle real auth?
2. Are there clean OAuth/OIDC patterns that work with dynamic URLs?
3. Is the static redirect URL assumption still the right model here?
4. What actually works in production?
Looking for real setups and failure stories, not theory.