NBenkovich on Hacker News

Ask HN: How do you give AI agents access without over-permissioning?

To make AI agents more efficient, we need to build feedback loops with real systems: deployments, logs, configs, environments, dashboards.

But this is where things break down.

Most modern apps don’t offer fine-grained permissions for the tokens you hand to an agent.

Concrete example: Vercel. If I want an agent to read logs or inspect env vars, I have to give it a token that also allows it to modify or delete things. There’s no clean read-only or capability-scoped access.

And this isn’t just Vercel. I see the same pattern across cloud dashboards, CI/CD systems, and SaaS APIs that were designed around trusted humans, not autonomous agents.

So the real question:

How are people actually restricting AI agents in production today?

Are you building proxy layers that enforce policy? Wrapping APIs with allowlists? Or just accepting the risk?
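
For concreteness, here is a minimal sketch of the allowlist idea: a default-deny check that a proxy could run before forwarding an agent's request upstream with the real token. The `Rule` type, the `is_allowed` helper, and the paths are all hypothetical and are not any provider's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    method: str        # HTTP method the agent may use
    path_pattern: str  # path template; "*" matches exactly one segment


# Hypothetical read-only policy. The paths are illustrative only.
READ_ONLY_POLICY = (
    Rule("GET", "/deployments/*/logs"),
    Rule("GET", "/projects/*/env"),
)


def _matches(pattern: str, path: str) -> bool:
    """Compare segment by segment so "*" cannot swallow extra path levels."""
    p_parts = pattern.strip("/").split("/")
    s_parts = path.strip("/").split("/")
    if len(p_parts) != len(s_parts):
        return False
    for p, s in zip(p_parts, s_parts):
        if p == "*":
            # Reject empty and dot segments to avoid traversal tricks.
            if s in ("", ".", ".."):
                return False
        elif p != s:
            return False
    return True


def is_allowed(method: str, path: str, policy=READ_ONLY_POLICY) -> bool:
    """Default deny: a request passes only if some rule matches it."""
    method = method.upper()
    return any(r.method == method and _matches(r.path_pattern, path) for r in policy)
```

The point of the design is that the agent never holds the over-permissioned token. Only the proxy does, and anything not explicitly listed is refused.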

It feels like we’re trying to connect autonomous systems to infrastructure that was never designed for them.

Curious how others are handling this in real setups, not theory.

Ask HN: Why do AI coding platforms force to use one model for a task?

When I use AI for coding, I don’t use one model.

I use strong reasoning models to understand a codebase or plan changes, then switch to faster and cheaper models for implementation and refactors. In practice this means mixing providers like Anthropic, OpenAI, and Google, because different tasks need different capabilities.

So why do AI coding platforms insist on a single model, often from a single provider, for the entire workflow?

Why burn expensive reasoning tokens while writing boilerplate? Why should planning, coding, and review all be done by the same “brain”? Why do users have to manually glue models together when platforms already have full task context?
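
To make the idea concrete, a rough sketch of per-phase routing might look like the following. The model tier names, the `Phase` enum, and the `call_model` callable are placeholders I made up for illustration; a real setup would plug in actual provider clients.

```python
from enum import Enum
from typing import Callable, Dict


class Phase(Enum):
    PLAN = "plan"
    IMPLEMENT = "implement"
    REVIEW = "review"


# Hypothetical tier names, not real model identifiers.
ROUTES: Dict[Phase, str] = {
    Phase.PLAN: "reasoning-large",
    Phase.IMPLEMENT: "fast-small",
    Phase.REVIEW: "reasoning-large",
}


def pick_model(phase: Phase, default: str = "fast-small") -> str:
    """Choose a model tier for a workflow phase."""
    return ROUTES.get(phase, default)


def run_workflow(task: str, call_model: Callable[[str, str], str]) -> Dict[str, str]:
    """Run plan -> implement -> review, passing earlier output forward.

    call_model(model, prompt) is injected so any provider can sit behind it.
    """
    plan = call_model(pick_model(Phase.PLAN), f"Plan the change: {task}")
    patch = call_model(
        pick_model(Phase.IMPLEMENT),
        f"Task: {task}\nPlan:\n{plan}\nWrite the patch.",
    )
    review = call_model(
        pick_model(Phase.REVIEW),
        f"Task: {task}\nPlan:\n{plan}\nPatch:\n{patch}\nReview it.",
    )
    return {"plan": plan, "patch": patch, "review": review}
```

Each handoff here carries only the text of the previous step, which is exactly where context can get lost between models.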

This feels less like a technical limitation and more like a product decision.

Maybe multi-model coordination is genuinely hard. Maybe handoffs lose context. Maybe this breaks the “one AI engineer” narrative that demos well.

But engineers already do this themselves today.

Has anyone run multi-model workflows on real repos? Did it fail in non-obvious ways?

Interested in real experience, not demos.

Ask HN: How do you handle auth when AI dev agents spin up short-lived apps?

Hi HN,

I’m working on AI agents used for software development. These agents automatically spin up short-lived app instances – for example per pull request, per task, or per experiment – each with its own temporary URL.

Auth is handled in the standard way:

- OAuth2 / OIDC

- external identity provider

- redirect URLs must be registered in advance and be static

This clashes badly with short-lived apps:

- URLs are dynamic and unpredictable

- redirect URLs can’t realistically be pre-registered

- auth becomes the only non-ephemeral part of an otherwise fully automated workflow

What I see teams doing instead:

- disabling real auth in preview environments

- routing all callbacks through a single stable environment

- using wildcard redirects or proxy setups that feel like hacks
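
As an illustration of the "single stable environment" approach, here is a minimal sketch in which the stable callback host carries the preview URL inside a signed OAuth `state` value and only redirects back to hosts under an allowed suffix. The secret, the `.preview.example.com` suffix, and the helper names are assumptions for the example. It also leaves out binding `state` to the user's browser session, which a real flow needs for CSRF protection.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional
from urllib.parse import urlsplit

SECRET = b"demo-only-secret"              # assumption: shared with the relay
ALLOWED_SUFFIX = ".preview.example.com"   # assumption: preview domain


def _sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def encode_state(return_to: str, ttl: int = 600, now: Optional[float] = None) -> str:
    """Pack the preview URL and an expiry into a signed state string."""
    now = time.time() if now is None else now
    payload = json.dumps({"return_to": return_to, "exp": int(now) + ttl}).encode()
    body = base64.urlsafe_b64encode(payload).decode()
    return f"{body}.{_sign(payload)}"


def decode_state(state: str, now: Optional[float] = None) -> Optional[str]:
    """Return the preview URL if the state is valid, otherwise None."""
    now = time.time() if now is None else now
    try:
        body, sig = state.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(body.encode())
    except ValueError:
        return None
    if not hmac.compare_digest(sig.encode(), _sign(payload).encode()):
        return None
    data = json.loads(payload)
    if data["exp"] < now:
        return None
    url = urlsplit(data["return_to"])
    host = url.hostname or ""
    # Only redirect to HTTPS hosts under the preview domain.
    if url.scheme != "https" or not host.endswith(ALLOWED_SUFFIX):
        return None
    return data["return_to"]
```

With this shape, the IdP only ever sees one registered redirect URL (the relay), and the per-PR URL travels in `state` rather than in the IdP configuration.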

This gets especially awkward for AI dev agents, because they assume infrastructure is disposable and fully automated – no manual IdP config in the loop.

So I’m curious:

1. If you use short-lived preview apps, how do you handle real auth?

2. Are there clean OAuth/OIDC patterns that work with dynamic URLs?

3. Is the static redirect URL assumption still the right model here?

4. What actually works in production?

Looking for real setups and failure stories, not theory.

Recent submissions

Show HN: Agyn, an open-source Kubernetes runtime for AI agents

Show HN: Fixing Claude Code's amnesia with persistent memory

Ask PH: Worktrees or isolated sandboxes for multi-agent AI workloads?

Show HN: We achieved 72.2% issue resolution on SWE-bench Verified using AI teams

Show HN: Measuring how AI agent teams improve issue resolution on SWE-Verified

Ask HN: How do you give AI agents access without over-permissioning?

Ask HN: Why do AI coding platforms force to use one model for a task?

Ask HN: How do you handle auth when AI dev agents spin up short-lived apps?

Show HN: Gh-PR-review – CLI tool for LLMs to create, read, comment PRs