The biggest bottleneck for this for the past two years imo wasn't the models, but the engineering and infra around them, and the willingness of companies to work with OpenAI directly. Now that they've grown and have a decent userbase, companies are much more willing to pay and/or involve themselves in these efforts.
This has eventual implications beyond user-heavy internet use (once we see more things built on the SDK): we're gonna see a fork in web traffic between human-centric workflows through chat and an SEO-filled, chat/agent-optimized web catered only to agents. (crossposted)
Buying plane tickets for example. It’s not even that I don’t trust the AI or that I’m afraid it might make a mistake. I just inherently want to feel like I’m in control of these processes.
It’s the same reason I’m more afraid of flying than driving despite flying being a way safer mode of travel. When I’m flying I don’t feel like I’m in control.
It could even work against the dynamic pricing algorithms airlines use to maximize revenue: if I have a tireless assistant exploring every possible combination to find the cheapest ticket, it’ll probably do a much better job than I ever could.
Booking an emergency flight last time I had a family issue was a mind-fucking experience. I had to go through 10 screens trying to sell me stuff and constantly hiding the skip button in different places. Maybe HN will say that I "shouldn't have had a family emergency in the first place" but reality is reality.
And honestly it's not just booking websites, it's anything tech that they do. For example, the last check-in kiosk I used had an incredibly convoluted path for the case where someone else had booked my luggage but it was a different size.
Right now I can't imagine an AI (esp. chat) being more convenient for me than Skyscanner or Google Hotels, but maybe I'm missing the imagination.
Currently GPT gets you better answers than Google so people are gonna be going there first.
If (when) companies want their things to be present in ChatGPT replies, they need to provide an AI-compatible way to get it. Just shoving a full-ass web page at it is inefficient and error-prone.
They have to either build a version of their site that's AI-accessible or provide an API (or MCP) for it to access the data.
Now that the API is built and the cost is paid, we can use it for non-AI uses.
This experience is 10x better than online alternatives. AI agents can replicate this at marginal cost.
I understand an argument can be made that google is doing similar, but at least you can still search and end up on an actual site, rather than just play telephone via chatgpt. This concept is horrifying for so many reasons.
Even in that dire circumstance, I wish that the web versions keep up/are maintained, instead of being slowly deprecated, which happened for a lot of mobile-native versions of applications.
Going back to first principles, we need to recall that the internet is for the dissemination of cat pictures, and at the end of the day every technical and organizational change must be analyzed through the lens of its impact on the effective throughput of these pictures.
I suspect our future is going to be a lot more frustrating, both from AI screwups and the atrophied skills of humans
I just can't let anything AI make decisions that have consequences, like spending money, buying anything, planning vacations, flights etc. It's so bad now (I've just tried) that I'm not sure if it will ever gain my trust.
ChatGPT has become one of the top-most browsed websites, and they want to capitalize on it even if 2% of the people actually trust the new integrations.
When we launched our mobile banking platform, one of the PMs there swore up and down that we should be piloting banking by text message. He was fabulously wrong at the time, but in the end he got a lot of things right.
There are a lot of applications that could fit in a text box, provided that you're not doing the work yourself but rather delegating it.
So perhaps chatbots are an excellent method for building out a prototype in a new field while you collect usage statistics to build a more refined UX - but it is bizarre that so many businesses seem to be discarding battle tested UXes for chatbots.
-diehard CLI user
and if the apps are trusting ChatGPT to send them users based on those sort of queries, it's only a matter of time before ChatGPT brings the functionality first-party and cuts out the apps - any app who believes chat is the universal interface of the future and exposes their functionality as a ChatGPT app is signing their own death warrant.
It's just like Google and websites, but much more insidious. If they can get your data, they'll subsume your function (and revenue stream).
This is exactly the same playbook that has already been run multiple times in the past (and is currently playing out) by existing companies.
These companies initially laid out red carpets for such builders, but once they had enough apps themselves, they started to tighten the screws, and then gradually shifted to complete 100% control and extortion in the name of "security" or some other made-up excuse.
No more walled gardens. If something like this has to come (and I truly believe it's helpful), it should be built on the open web and open protocols, not controlled by a single for-profit company (ironic, since OpenAI is technically a non-profit).
I'm not sure that claim is justified. The primary agentic use case today is code generation, and the target demographic is used to IDEs/code editors.
While that's probably a good chunk of total token usage, it's not representative of the average user's needs or desires. I strongly doubt that the chat interface would have become so ubiquitous if it didn't have merit.
Even for more general agentic use, a chat interface allows the user the convenience of typing or dictating messages. And it's trivially bundled with audio-to-audio or video-to-video, the former already being common.
I expect that even in the future, if/when richer modalities become standard (and the models can produce video in real-time), most people will be consuming their outputs as text. It's simply more convenient for most use-cases.
I could see chat apps becoming dominant in Slack-oriented workplaces. But, like, chatting with an AI to play a song is objectively worse than using Spotify. Dynamically-created music sounds nice until one considers the social context in which non-filler music is heard.
One way to consider it that I like, as an EE working in the energy model realm: consider the geometry of an oscilloscope.
Electromagnetism to be carved up into equations that recreate it.
Geometric generators that create bulk structure and allow for changing min/max parameters to achieve desired result.
Consider a hardware system that boots and offers little more than blender and photoshop like parameter UI widgets to manipulate whatever segment of the geometry that isn't quite right.
Currently we rely on an OS paradigm that is basically a virtual machine to noodle strings. The future will be a vector virtual machine that lets users noodle coordinates.
Way less resource intensive to think of it all as sync of memory matrix to display matrix and jettison all the syntax sugar developers stuck with string munging OS of history.
Other app-like interfaces like NotebookLM can be useful, for me one or two real uses a week.
Then there is engineering small open models into larger systems to do structured data extraction, etc.
I am skeptical about the current utility of agentic systems, MCP, etc. - even though I like to experiment.
Someone else said that at least they didn't go on and on about AGI today, which is a nice thing. FOMO chasing ASI and AGI will drive us bankrupt, and produce some useful results.
I’m building a tool that helps you solve any type of questionnaire (https://requestf.com) and I just can’t imagine how I could leverage Apps.
It would be awesome to get the distribution, but it has to also make sense from the UX perspective.
Out of curiosity, why iff?
e.g. Coursera can send back a video player
I remember reading some not-Neuromancer book by William Gibson where one of his near-future predictions was print magazines but with custom printed articles curated to fit your interests. Which is cool! In a world where print magazines were still dominant, you could see it as a forward iteration from the magazine status quo, potentially predictive of a future to come. But what happened in reality was a wholesale leapfrogging of magazines.
So I think you sometimes get leapfrogging rather than iteration, which I suspect is in play as a possibility with AI driven apps. I don't think apps will ever literally be replaced but I think there's a real chance they get displaced by AI everything-interfaces. I think the mitigating factor is not some foundational limit to AI's usefulness but enshittification, which I don't think used to consume good services so voraciously in the 00s or 2010s as it does today. Something tells me we might look back at the current chat based interfaces as the good old days.
We are at a moment where we're trying to figure out how to design good interfaces, but very soon after that the moment of "okay, now let's start selling with them" will come and that's really what we're going to be left with.
In that regard, things like ad blockers, which nowadays can be used to mitigate some of these defects you talk about, are probably going to be much more difficult to implement in a chat-app interface. What are we going to do when we ask an agent for something and it responds with an ad rather than the relevant information we're seeking? It seems to me like it's going to be even more difficult for the user to stay in control.
I imagine the Star Trek vision is pretty accurate. You occasionally talk to the computer when it makes sense, but more often than not you’re still interacting with a GUI of some kind.
I’m not very bullish on people wanting to live in the ChatGPT UI, specifically, but the concept of dynamic apps embedded into a chat-experience I think is a reasonable direction.
I’m mostly curious about if and when we get an open standard for this, similar to MCP.
What users want, which various entities religiously avoid providing to us, is a fair price comparison and discovery mechanism for essentially everything. A huge part of the value of LLMs to date is in bypassing much of the obfuscation that exists to perpetuate this, and that's completely counteracted by much of what they're demonstrating here.
The former is like a Waymo, the latter is like my car suddenly and autonomously deciding that now is a good time to turn into a Dollar Tree to get a COVID vaccine when I'm on my way to drop my kid off at a playdate.
The problem with this approach is precisely that these apps/widgets have hard-coded input and output schema. They can work quite well when the user asks something within the widget's capabilities, but the brittleness of this approach starts showing quickly in real-world use. What if you want to use more advanced filters with Zillow? Or perhaps cross-reference with StreetEasy? If those features aren't supported by the widget's hard-coded schema, you're out of luck as a user.
What I think is much more exciting is the ability to completely create generative UI answers on the fly. We'll have more to say on this soon from Phind (I'm the founder).
That said, I used it a lot more a year ago. Lately I’ve been using regular LLMs since they’ve gotten better at searching.
For a concrete example, think a search result listing that can be broken down into a single result or a matrix to compare results, as well as a filter section. So you could ask for different facets of your current context, to iterate over a search session and interact with the results. Dunno, I’m still researching.
Have you written somewhere about your experience with Phind in this area?
Now that models have gotten much more capable, I'd suggest to give the executing model as much freedom with setting (and even determining) the schema as possible.
Chat paired with pre-built and on-demand widgets addresses this limitation.
For example, in the keynote demo, they showed how the chat interface lets you perform advanced filtering that pulls together information from multiple sources, like filtering only Zillow houses near a dog park.
The only place I can see this working is if the LLM is generating a rich UI on the fly. Otherwise, you're arguing that a text-based UX is going to beat flashy, colourful things.
Conversational user interfaces are opaque; they lack affordances. https://en.wikipedia.org/wiki/Affordance
I immediately knew the last generation of voice assistants was dead garbage when I realized there was no way to know what they could do; they just expected you to try 100 things until one randomly worked.
Personally, I hope that's not the future.
For a large number of tasks that cleanly generalize into a stream of tokens, command line or chat is probably superior. We'll get some affordances like tab auto-completion to help remember the names of certain bots or MCP endpoints that can be brought in as needed...
But for anything that involves discovery, graphical interaction feels more intuitive and we'll probably get bespoke interfaces relevant to that particular task at hand with some sort of partially hidden layers to abstract away the token stream?
I was really hoping Apple would make some innovations on the UX side, but they certainly haven’t yet.
They want to be the platform in which you tell what you want, and OAI does it for you. It's gonna connect to your inbox, calendar, payment methods, and you'll just ask it to do something and it will, using those apps.
This means OAI won't need ads. Just rev share.
If OpenAI thinks there’s sweet, sweet revenue in email and calendar apps, just waiting to be shared, their investors are in for a big surprise.
Ads are definitely there. Just hidden deep in the black box that generates the useful tips :)
In my (non-lawyer) understanding, each message potentially containing sponsored content (which would be every message, if the bias is encoded in the LLM itself) would need to be marked as an ad individually.
That would make for an odd user interface.
You may have started seeing this when LLMs seem to promote things based entirely on marketing claims and not on real-world functionality.
More or less, SEO spam V2.
They obviously want both. In fact they are already building an ad team.
They have money to burn, so it makes sense to throw all the scalable business models in history (e.g. app store, algo feed) at the wall and see what sticks.
[0] https://www.nber.org/system/files/working_papers/w34255/w342...
A lot of the fundamental issues with MCP are still present: MCP is pretty single-player, users must "pull" content from the service, and the model of "enabling connections" is fairly unintuitive compared to "opening an app."
Ideally apps would have a dedicated entry point, be able to push content to users, and have some persistence in the UI. And really the primary interface should be HTML, not chat.
As such I think this current iteration will turn out a lot like GPTs.
Once a service can actively involve you and/or your LLM in ongoing interaction, MCP servers start to get real sticky. We can safely assume the install/auth process will also get much less technical as pressure to deliver services to non-technical users increases.
Is there any progress on that front? That would unlock a lot of applications that aren't feasible at the moment.
Edit: Sampling is a piece of the puzzle https://modelcontextprotocol.io/specification/2025-03-26/cli...
I also see a lot of discussion on Github around agent to agent (a2a) capabilities. So it's a big use case, and seems obvious to the people involved with MCP.
Why would I use a chat to do what could be done quicker with a simple and intuitive button/input UX (e.g. Booking or Zillow search/filter)? Chat also has really poor discoverability of what I can actually do with it.
In 2024, iOS App Store generated $1.3T in revenue, 85% of which went to developers.
Edit: yes I understand it is correct, but still it sounds like an insane amount
Connecting these apps will, at times, require authentication. Where it does not require payment, it's a fantastic distribution channel.
Another commenter suggested a hotel search function:
> Find me hotels in Capetown that have a pool by the beach. Should cost between 200 dollars to 800 dollars a night
ChatGPT can already do this. Similarly, their own pizza lookup example seems like it would exist or nearly exist with current functionality. I can't think of a single non-trivial app that could be built on this platform - and if there are any, I can't think of any that would be useful or not in immediate danger of being swallowed by advances to ChatGPT.
I built this 18 months ago at an OTA platform. We parse the query and identify which terms are locations, which are hotel features, which are room amenities etc. Then we apply those filters (we have thousands of attributes that can be filtered on, but cannot display all of them in the UI) and display the hotel search results in the regular UI. The input query is also through the normal search box.
This does not need and should not be done in a chatbot UX. All the implementation is on the backend and the right display is the already existing UI. This is semantic search and it comes as a standard capability in ElasticSearch, Supabase etc. Though we built our own version.
e.g. if the user asks "Find hotels in Capetown [...] that have availability for this Christmas or New Year": if your backend (or the response format you're forcing the LLM into) can't express an OR over date ranges, you can't return the results the user wants. The LLM then does the best it can, and the user ends up with only hotels available for both Christmas and New Year (missing those available for one or the other), or the LLM does some other unwanted thing. For us, users would even ask "June or August", and then got July included, because that was the closest thing the backend/UI could do.
So this approach is actually less flexible than a chat interface, where the LLM can figure out "Ah, I need to do two separate hotel search MCP calls, and then merge the results to not show the same hotel twice".
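A toy sketch of that "two calls and merge" pattern. Everything here is hypothetical: `search_hotels` stands in for a real hotel-search MCP tool, and the data is fake; the point is only that the planner can cover an OR over date ranges with two single-range calls and deduplicate.

```python
def search_hotels(city, check_in, check_out):
    # Stub standing in for the real MCP tool call; returns (hotel_id, name) pairs.
    fake_index = {
        ("Cape Town", "2025-12-24"): [(1, "Seaside Inn"), (2, "Harbour Lodge")],
        ("Cape Town", "2025-12-31"): [(2, "Harbour Lodge"), (3, "Table Mtn Hotel")],
    }
    return fake_index.get((city, check_in), [])

def search_either_date(city, ranges):
    # One backend call per date range, then merge, deduplicating by hotel id —
    # the plan an LLM would produce for "Christmas OR New Year".
    seen, merged = set(), []
    for check_in, check_out in ranges:
        for hotel_id, name in search_hotels(city, check_in, check_out):
            if hotel_id not in seen:
                seen.add(hotel_id)
                merged.append((hotel_id, name))
    return merged

results = search_either_date(
    "Cape Town",
    [("2025-12-24", "2025-12-26"), ("2025-12-31", "2026-01-02")],
)
print(results)  # all three hotels, Harbour Lodge listed only once
```

A backend locked to a single date-range parameter can never return this union in one call, which is the flexibility gap being described.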
They also have this new design gui for visual programming of agents, with boxes and arrows.
It's going to be a hybrid of all these. Obviously the more explicit work done for interoperability, the easier it is, but the gaps can be bridged with the common sense of the AI at the expense of more time and compute. It's like, a self driving car can detect red lights and speed limit signs via cameras but if there are structured signals in smart infrastructure, then it's simpler and better.
But it's always interesting to see this dance between unstructured and structured. Apparently any time one gets big, the other is needed. When there's tons of structured code, we want AI common sense to cut through it, because even when it's structured it's messy and too complicated; so we generate the code. Now that we have natural language code generators, we want to impose structure onto how they work, which we express in markup languages, then small scripts, then large scripts that are too complex and have too much boilerplate, so we need AI to generate them from natural language, etc., etc.
I tried buying a special kind of lamp this weekend, all LLMs and google sucked at this. The conversation did not help in finding more fine grained results.
Convenience-wise probably this model is more viable, and things will get centralized to the AI apps. And the nested utilities will be walled gardens on steroids. Using custom software and general computing (in the manner of the now discontinued sideloading on Android) will get even further away for the average person.
This time will be different?
I personally prefer well curated information.
The LLM will do the curation.
Custom GPTs (and Gemini gems) didn't really work because they didn't have any utility outside the chat window. They were really just bundled prompt workflows that relied on the inherent abilities of the model. But now with MCP, agent-based apps are way more useful.
I believe there's a fundamentally different shift going on here: in the endgame that OpenAI, Anthropic et al. are racing toward, there will be little need for developers for the kinds of consumer-facing apps that OpenAI appears to be targeting.
OpenAI hinted at this idea at the end of their Codex demo: the future will be built from software built on demand, tailored to each user's specific needs.
Even if one doesn't believe that AI will completely automate software development, it's not unreasonable to think that we can build deterministic tooling to wrap LLMs and provide functionality that's good enough for a wide range of consumer experiences. And when pumping out code and architecting software becomes easy to automate with little additional marginal cost, some of the only moats other companies have are user trust (e.g. knowing that Coursera's content is at least made by real humans grounded in reality), the ability to coordinate markets and transform capital (e.g. dealing with three-sided marketplaces on DoorDash), switching costs, or ability to handle regulatory burdens.
The cynic in me says that today's announcements are really just a stopgap measure to:
- Further increase the utility of ChatGPT for users, turning it into the de facto way of accessing the internet for younger users, à la how Facebook was (is?) in developing countries
- Pave the way for commoditizing OpenAI's complements (traditional SaaS apps) as ChatGPT becomes more capable as a platform with first-party experiences
- Increase the value of the company to acquire more clout with enterprises and other business deals
But cynicism aside, this is pretty cool. I think there's a solid foundation here for the kind of intent-based, action-oriented computing that I think will benefit non-technical people immensely.
The docs mention returning resources, and the example is returning a rust file as a resource, which is nonsensical.
This seems similar to MCP UI in result but it's not clear how it works internally.
More: https://github.com/openai/openai-apps-sdk-examples?tab=readm...
In the current implementation, it makes an iframe (or webview on native) that loads a sandboxed environment which then gets another iframe with your html injected. Your html can include meta field whitelisted remote resources.
I hope their GUI integration will be eventually superseded by native UI integration. I remember such well thought out concepts dating back to 2018 (https://uxdesign.cc/redesigning-siri-and-adding-multitasking...).
"Find me hotels in Capetown that have a pool by the beach. Should cost between 200 dollars to 800 dollars a night"
However, it might be useful for people who do want to use that instead.
I don't see how this is a significant upgrade over the many existing hotel-finder tools. At best it slightly augments them as a first pass, but I would still rather look at an actual map of options than trust a stream of generated, ad-augmented text.
The UI 'cards' will naturally keep growing, and soon you end up back with a full app within ChatGPT, or ChatGPT just becomes an app launcher.
The only advantage I can see is if ChatGPT can use data from other apps/ chats in your searches e.g. find me hotels in NYC for my upcoming trip (and it already knows the types of hotels you like, your budget and your dates)
Instead, the model will provide you with a list of (in chat) “apps” that can fulfill your request. SEO becomes AISO (AI Search Optimization). Sites can partly expose data to entice you to choose them.
Ideally, users should be able to describe a task, and the AI would figure out which tools to use, wire them together, and show the result as an editable workflow or inline canvas the user can tweak. Frameworks like LlamaIndex’s Workflow or LangGraph already let you define these directed graphs manually in Python where each node can do something specific, branch, or loop. But the AI should be able to generate those DAGs on the fly, since it’s just code underneath.
And given that LLMs are already quite good at generating UI code and following a design system (see v0.app), there’s not much reason to hardcode screens at all. The model can just create and adapt them as needed.
Really hope Google doesn’t follow OpenAI down this path.
(Also read the documentation, they specifically mention that you can tell it to create new flow paths)
Of course ads will be there and this is good. A bad thing would be if they took a bunch of traffic from google and then gave no way to promote your products.
That would lead to companies closing and layoffs and economy decline.
Instead of the user wasting time, ChatGPT can come up with the recommendations.
Lots of folks (myself included) are reporting it doesn't: https://github.com/openai/openai-apps-sdk-examples/issues/1
Sure, this helps app partners access their large user base and grows their functionality too - but the end game has to be lock-in with a 30% tax right?
Can’t say I'm unhappy to see the authoritarian duopoly of the existing app stores challenged.
One question that comes to mind is how multiple providers of similar products and services will be recommended/discovered. Perhaps they won't be recommended, but just listed instead, as currently done by search engines. Is AISO (AI Search Optimization) our future?
While Apps do sound and look like the future, I feel like we're headed down the same road as the App and Google Play stores with this. Sooner or later OpenAI is going to use this to take a cut $$ of the payments going through the system. Which they most likely need and deserve, but still any time you close off part of the web it makes the web less open and free.
so, best of luck to OAI. we'll see how this plays out
To me it seems like a strategic shift from pure AI research and the AGI snake oil to other supposed tangible stuff.
In short, the AI revolution is mostly over, and we seem to be back in the realm of software.
It has the potential to bridge the gap between pure conversation and the functionality of a full website.
I can block ads on a search engine. I cannot prevent an LLM from having hidden biases about what the best brand of vodka or car is.
For example, React and TypeScript were hard to set up initially. I deferred learning them for years until the tooling improved and they were clearly here to stay. Likewise, I'm glad I didn't dive into tech like LangChain and CoffeeScript, which came and went.
I'd much rather see a thriving ecosystem full of competition and innovation than a more stagnant alternative.
On a more serious note, it remains to be seen if this even sticks / is widely embraced.
Of course, part of it was due to the fact that the out-of-the-box models became so competent that there was no need for a customized model, especially when customization boiled down to barely more than some kind of custom system prompt and hidden instructions. I get the impression that's the same reason their fine-tuning services never took off either, since it was easier to just load necessary information into the context window of a standard instance.
Edit: In all fairness, this was before most tool use, connectors or MCP. I am at least open to the idea that these might allow for a reasonable value add, but I'm still skeptical.
It feels like OpenAI's mission has changed from "We want to do AGI" to
"it'll be easier to do AGI with a lot of money, so let's make a lot of money first" to
"we have a shot at becoming bigger than Google and stealing their revenue. Let's do that and maybe do AGI if that ever works out"
“CEO” Fidji Simo must really need something to do.
Maybe I’m cynical about all of this, but it feels like a whole lot of marketing spin for an MCP standard.
I'mma call it now just for the fun of it: This will go the way of their "GPT" store.
There are plenty of brokers that will add immense value to ChatGPT for free and if users go there looking for something, it's only a matter of time.
Right now, I only like using the chat interface to answer questions I can't quite form into searches, and I don't go directly to a chat bot to book dinner reservations. However, if I'm using the service to riff on ideas for a romantic thing to do with my partner, and it somehow leads me to restaurant reservations, I do think I would engage with it and come back to ChatGPT in the future for novel interactions like that.
MCP standardizes how LLM clients connect to external tools—defining wire formats, authentication flows, and metadata schemas. This means apps you build aren't inherently ChatGPT-specific; they're MCP servers that could work with any MCP-compatible client. The protocol is transport-agnostic and self-describing, with official Python and TypeScript SDKs already available.
That said, the "build our platform" criticism isn't entirely off base. While the protocol is open, practical adoption still depends heavily on ChatGPT's distribution and whether other LLM providers actually implement MCP clients. The real test will be whether this becomes a genuine cross-platform standard or just another way to contribute to OpenAI's ecosystem.
The technical primitives (tool discovery, structured content return, embedded UI resources) are solid and address real integration problems. Whether it succeeds likely depends more on ecosystem dynamics than technical merit.
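To make the "self-describing" part concrete, here is a toy registry in the spirit of MCP's tool discovery and invocation (an illustration only, not the official SDK; the tool name, schema, and data are made up):

```python
import json

TOOLS = {}

def tool(name, description, input_schema):
    # Register a handler along with machine-readable metadata, so a client
    # can discover what exists and how to call it without prior knowledge.
    def register(fn):
        TOOLS[name] = {"description": description,
                       "inputSchema": input_schema,
                       "handler": fn}
        return fn
    return register

@tool("search_hotels",
      "Search hotels by city and nightly price cap",
      {"type": "object",
       "properties": {"city": {"type": "string"},
                      "max_price": {"type": "number"}}})
def search_hotels(city, max_price):
    # Stub data; city is ignored in this toy version.
    return [h for h in [("Seaside Inn", 250), ("Harbour Lodge", 900)]
            if h[1] <= max_price]

# Analogous to a tools/list response: what any client sees on discovery.
listing = [{"name": n, "description": t["description"],
            "inputSchema": t["inputSchema"]} for n, t in TOOLS.items()]
print(json.dumps(listing, indent=2))

# Analogous to a tools/call request: invoke by name with structured arguments.
print(TOOLS["search_hotels"]["handler"]("Cape Town", 800))
```

The real protocol adds a wire format, transports, and auth on top, but the core contract — metadata-first discovery, then structured calls — is the piece that makes the same server usable from any MCP-compatible client.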