Great questions!
> Why are you going all in on world models instead of basing everything on top of a 3D engine that could be manipulated / rendered with separate models?
I absolutely think there are going to be super cool startups that accelerate film and game dev as it is today, inside existing 3D engines. Those workflows could be made much faster with generative models.
That said, our belief is that model-imagined experiences are going to become a totally new form of storytelling, and that the heuristics and limitations of existing 3D engines might keep these experiences from being as weird and wacky as they could be. This is our focus, and why the model is video-in and video-out.
Plus, you've got the very large challenge of learning a rich, high-quality 3D representation from a very small pool of 3D data. The volume of 3D data is just so small compared to the volumes generative models really need to begin to shine.
> Additionally, curious about what exactly the difference between the new mode of storytelling you’re describing and something like a crpg or visual novel
To be clear, we don't yet know what shape these new experiences will take. I'm hoping we can avoid an awkward initial phase where these experiences resemble traditional game mechanics too much (although we have much to learn from them), and fast-forward to enabling totally new experiences that just aren't feasible with existing technologies and budgets. Let's see!
> is your hope that you can just bake absolutely everything into the world model instead of having to implement systems for dialogue/camera controls/rendering/everything else that’s difficult about working with a 3D engine?
Yes, exactly. The model just learns better this way (instead of breaking it down into discrete components) and I think the end experience will be weirder and more wonderful for it.