So without further ado:
* If LLMs can indeed produce wholly novel research independently, without any external sources, then prove it. Cite sources, unlike the chatbot that told you it can do that thing. Show us actual results from said research, or products that were made from it. We keep hearing that these things exponentially increase the speed of research and development, but seemingly nobody has proof of this that’s uniquely specific to LLMs and doesn’t rely on older, proven ML techniques or concepts.
* If generative AI really can output Disney quality at a fraction of the cost, prove it with clips. Show me AI output that can animate on 2s, 4s, and 1s in a single video and knows when to use any of the above for specific effects. Show me output that’s as immaculate as old Disney animation, or heck, even modern ToonBoom-like animation. Show me the tweens.
* Prove your arguments. Stop regurgitating hypeslop from CEBros, actually cite sources, share examples, demonstrate its value relative to humanity.
All that people like us (myself and the author) have been politely asking for since this hype bubble inflated is for boosters to show actual evidence of their claims. Instead, we just get carefully curated sizzle reels and dense research papers making claims, rather than actual, tangible evidence that we can attempt to recreate for ourselves to validate the claims in question.
Stop insulting us and show some f*king proof, or go back to playing with LLMs until you can make them do the things you claim they can do.
I was actually thinking about it, and there could be a simple test: remove all knowledge of X from the training corpus and train an LLM on that corpus. X could be anything: differential calculus, logarithms, the Riemann Hypothesis, the special theory of relativity, Fermat's theorems, ... Then ask the AI the questions that actually led to the discovery of X.
If AI is able to rediscover X while not knowing about X, we can say it is proof of intelligence.
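The hardest part of that test is the scrubbing step itself. A minimal sketch of what it might look like, assuming a keyword blocklist is a good-enough proxy for "all knowledge of X" (in practice it is not: paraphrases and derived results would leak through, so this is illustrative only, and the terms below are my own example for X = the Riemann Hypothesis):

```python
# Hypothetical corpus filter: drop any document mentioning the target
# concept before training. Keyword matching is a crude proxy; real
# leakage (paraphrases, derived results) needs far stronger filtering.

BLOCKLIST = {"riemann", "zeta function", "critical line"}

def scrub(corpus):
    """Return only documents containing no blocklisted term (case-insensitive)."""
    return [
        doc for doc in corpus
        if not any(term in doc.lower() for term in BLOCKLIST)
    ]

corpus = [
    "The Riemann Hypothesis concerns zeros of the zeta function.",
    "Prime numbers thin out as numbers grow larger.",
]
clean = scrub(corpus)  # only the second document survives
```

Even this toy version shows why the test is expensive: you would have to re-train from scratch for every choice of X.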
I don't think this makes any sense and suspect it requires a deep misunderstanding of “research” to even consider this an issue.
Research is inherently something done in response to, and grounded in, the external.
Everything revolving around these LLMs so far has been tech hype culture and similar "think of the future" vibes. IMO we never see proof of this because right now it simply doesn't exist.
compare with a headline from today:
>OpenAI’s ChatGPT to hit 700 million weekly users, up 4x from last year
I don't think there was that much hype when ChatGPT launched. Just an awful lot of people using it because it's kind of cool.
The critics seem to do a certain amount of goalpost moving: pointing out that it's doing well, with unprecedented user growth, gets met with "f*king prove LLMs can indeed produce wholly novel research." Is anyone actually claiming they produce wholly novel research?
There is a similar effect with the featured blog post, where the guy makes some perfectly reasonable arguments for why he doesn't really like LLMs and doesn't want to work on them, and then, instead of titling it "Why I hate LLMs", goes with "Why I hate AI." But there's some quite cool stuff in non-LLM AI, like AlphaFold and trying to cure diseases. If you are talking about LLMs, why not put LLM in the title?
It was quite convincing, and I could see lower-budget studios trying to make it work. (There is a truckload of garbage tier animation on all platforms.)
The person who submitted it is an experienced producer who used something like 600 prompts to generate the end result, so it's not exactly few-shot prompting from novices with no film experience. But it happened.
Then again, the astroturfing done (presumably) by big LLM is off the charts, so who knows if this was actually what happened.
But it's not copying it. That is the entire point. It's using the training data to adjust floating-point numbers. If you train on a single piece of data over and over again, then yes, it can replicate it, just like you can memorize lines of a school play, but it's still not copied/compressed in the traditional, deterministic sense.
You can't argue "we don't know how they work, or our own brains work with any certainty" and then over-trivialize what they do on the next argument.
People suffer brain damage and come out the other side with radically different personalities. What happened to their "qualia" or "sense of self"? Where is their "soul"? It's just a mechanistic emergent property of their biological neural network.
Who is to say our brains aren't just very highly parameterized biological floating-point machines? That is the true Occam's Razor here, as uncomfortable as that might make people.
I believe it's quite possible that what is happening during training is in certain ways similar to what is happening to a child learning the world, although there are many practical differences (and I don't even mean the difference between human neurons and the ones in a neural network).
Is there anything to feel uncomfortable about? It's been a long time since people started discussing the concept of "a self doesn't exist, we're just X" where X was the newest concept popular during that time. I'm 100% sure LLMs are not the last one.
(BTW as for LLMs themselves, there are still two big engineering problems to solve: quite small context windows and hallucinations. The first requires a lot of money to solve, the second needs special approaches and a lot of trial and error to solve, and even then the last 1% might be almost impossible to get working reliably.)
Humans mis-remember and make up things all the time, completely unintentionally. It could be a fundamental flaw in large neural networks. Impressive data compression and ability to generalize, but impossible to make "perfect".
If AI becomes cheap and fast enough, it's likely a simple council of models will be enough to alleviate 99% of the problem here.
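A minimal sketch of what such a council might look like, with stub functions standing in for real, independently trained models (the function names and the toy question are mine, not any vendor's API; the idea is just majority voting over independent answers):

```python
from collections import Counter

# Stub "models": in a real council these would be calls to separate LLMs.
# One of the three hallucinates; the majority vote suppresses it.
def model_a(q): return "Paris"
def model_b(q): return "Paris"
def model_c(q): return "Lyon"   # the hallucinating council member

def council_answer(question, models):
    """Ask every model, return the most common answer and its vote share."""
    votes = Counter(m(question) for m in models)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(models)

ans, agreement = council_answer("Capital of France?", [model_a, model_b, model_c])
# ans == "Paris"; two of the three models agree
```

The catch, of course, is that this only helps when the models' errors are uncorrelated, which is far from guaranteed when they share training data.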
It may very well be the case that Apple, too, finds themselves pressured into going all out on LLMs.
Besides my stance that LLMs can serve specific tasks very well and are likely going to take a place similar to spreadsheets and databases over the coming years, hasn’t Apple already? Rarely has Apple tried to appear so unified on one goal across their product stack as they did with Apple Intelligence, the vast majority of which is heavily LLM focused.
The author appears to fully skip over their attempt and subsequent failure, which leaves the entire point the piece is trying to make rather unsubstantiated, and made me check whether this wasn't posted in 2022. That goes even more for someone like myself who is also very confident that there is a large chasm between LLMs and whatever AGI may end up being.
AI is one in a long long long line of new technologies. It is generating a lot of investment, new corporate processes and directives, declarations like "new era" and "civilizational milestone," etc.
If someone thinks any of the above are wrong or misguided, it's a mistake to "blame" or look to AI as the primary cause.
The primary cause is our system: humans are actors in the US economic system and when a new technology is rolling out, usually the response is the same and differs only in magnitude.
Don't hate the player, hate the game.
Just because the author was unable to wrangle LLM to do novel research doesn't mean that it's impossible. We already have examples of LLMs either doing or aiding significantly with novel research.
I'm also a researcher and agree wholeheartedly with the article. LLMs can maybe help you sift through existing literature or help with creative writing; at most, they can be used for background research in hypothesis generation, finding pairs of related terms in the literature that can be put together into a network of relationships. They can help with a few tasks suitable for an undergrad research assistant.
> we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas while being judged slightly weaker on feasibility.
It's a bit better than just finding related pairs. And that's with sonnet 3.5 which is basically ancient at this point.
The article says:
> Yet, every time I tried to get LLMs to perform novel research, they fail because they don’t have access to existing literature on the topic.
You say:
> LLMs can maybe help you sift through existing literature
> they can be used for background research in hypothesis generation by finding pairs of related terms in the literature
As far as I can see, these two positions are mutually exclusive. Aren’t you disagreeing with the article?
Researchers using GPT to summarize papers may be helping humans create novel research, but it certainly isn't GPT doing any such thing itself.
I see the claims being levied against LLMs, but in the generative media world these models are nothing short of revolutionary.
In addition to being an engineer, I'm also a filmmaker. This tech brings orders-of-magnitude changes to the production cycle:
- Films can be made 5,000x cheaper (a $100M Disney film will be matched by small studios on budgets of $20,000.)
- Films can be made 5x faster (end-to-end, not accounting for human labor hour savings. A 15 month production could feasibly be done in 3 months.)
- Films can be made with 100x fewer people. (Studios of the future will be 1-20 people.)
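For what it's worth, the claimed multipliers are at least internally consistent with the figures given. A quick sanity check (figures are the ones stated above, not independently verified):

```python
# Check the claimed ratios against the concrete figures given.
disney_budget, indie_budget = 100_000_000, 20_000
assert disney_budget / indie_budget == 5_000   # "5,000x cheaper"

traditional_months, ai_months = 15, 3
assert traditional_months / ai_months == 5     # "5x faster"
```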
Disney and Netflix are going to be facing a ton of disruptive pressure. It'll be interesting to see how they navigate.
Advertising and marketing? We've already seen ads on TV that were made over a weekend [1] for a few thousand dollars. I've talked to customers that are bidding $30k for pharmaceutical ad spots they used to bid $300k for. And the cost reductions are just beginning.
[1] https://www.npr.org/2025/06/23/nx-s1-5432712/ai-video-ad-kal...
In theory (idk it probably exists already) you can generate a script and feed it into an AI that generates a film. Novelty aside, who is going to watch it? And what if you generate a hundred films a day? A thousand?
This probably isn't a hypothetical scenario, as low-effort / generated content is already a thing in writing, video, and music alike. It's an enormous long tail on e.g. YouTube, Amazon, etc., relying on people passively consuming content without paying too much attention to it. The background muzak of everything.
As someone smarter than me summarized, AI-generated stuff is content, not art. AI-generated films will be content, not art. There may be something compelling in there, but ultimately it'll flood the market, become ubiquitous, and disappear into the background as AI-generated noise that only a few people will seek out or watch intentionally.
That's not fair. Do you know how many dreamers and artists and great ideas wither away on the vine? It's tragic.
Movies are going to be like books today. And that's not a bad thing.
Distribution is always the hard part. Indie games, indie music. You've still got to market yourself and find your audience.
But the difference is that now it's possible. And you don't have to obey some large capital distributor and mind their oversight and meddling.
How does this work? If the quality ads are easier to produce, wouldn't there be more competition for the same spot with more leftover money for bidding? Why would this situation reduce the cost of a spot?
> Using AI-powered tools, they were able to achieve an amazing result with remarkable speed and, in fact, that VFX sequence was completed 10 times faster than it could have been completed with traditional VFX tools and workflows
> The cost of [the special effects without AI] just wouldn’t have been feasible for a show in that budget
— https://www.theguardian.com/media/2025/jul/18/netflix-uses-g...
Discussed on Hacker News here: https://news.ycombinator.com/item?id=44602779
Personally, I'm not particularly impressed. Yes, I'm impressed by the technology and the fact that we've reached a point where something like this is even possible, but in my opinion, it's soulless and suffers from the same problems as other AI videos. More emphasis was placed on length than on quality, and I've seen shorter, traditionally produced videos that had more heart. That's probably because these videos were created by amateurs who thought the AI would fill in all the gaps, but that only underscores the need for human artists with a keen eye.
I do not believe this is true.
These aren't prompted end-to-end. There's a tremendous amount of work being done.
For end-to-end, go to Show Runner AI. Or look up SpongeBob AI on YouTube.
Because AI can inadvertently say nasty things, which could damage the company's image.