1) a pretty user interface, a backend service layer, all the normal things you’d need for a desktop app or saas.
2) a backend that can actually do work.
I’m going to go out on a limb here and say building an MVP of the UI is a waste of your time.
…because unless you know how to build the backend (i.e. part 2), you actually have no idea what you need to build for part 1.
You can copy the UI from the Devin videos.
You can build your own langchain framework.
You can fine tune an open model on GitHub issues.
…but just like building a gpt4 is harder than just “add more params”, building something that works like Devin appears to work requires a reasonably sharp “step up” in capability between what literally everyone has been doing with gpt4 until now, and being able to turn that into a useful framework for solving actual engineering tasks.
So… don’t hold your breath. If you see someone building a UI (like this https://github.com/OpenDevin/OpenDevin/tree/main/frontend, https://github.com/stitionai/devika/tree/main/ui; just read the commit log, it’s basically just ui) it means they’re doing the easy work (part 1) because they don’t know how to do the hard part (part 2).
…so, interesting, but this doesn’t smell like a really serious effort (at least yet).
I guess you could argue that it’s important “setup infrastructure” stuff that any project starts with… but I’m just sceptical.
I can draw pictures of a Devin too. A serious effort would be trying to replicate what Devin does, not what it looks like.
Yes. If you link to the UI sections of a repo you will likely see "basically just ui" commit history.
> I guess you could argue that it’s important “setup infrastructure” stuff that any project starts with… but I’m just sceptical.
Of course it's an early stage pilot, it's only been developed since the Devin hype, why should it compete in quality at this very early time?
When Zuck releases Llama 3 in July or whenever we might have something to plug this into.
We're very aware that we'll need great agents to be able to compete with Devin and others. We're currently setting up evaluation pipelines to evaluate various agents against SWE-bench.
Our thesis is that a community experimenting with various agents and agent architectures will outpace a private company on a single track. We're building the notion of an "agent hub" out of the gates--anyone can plug into the Agent interface and contribute their work. We're also discussing how to build a meta-agent, which farms out specific tasks to sub-agents.
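To make the "agent hub" and meta-agent idea concrete, here is a minimal sketch in Python. Every name in it (`Agent`, `AgentHub`, `MetaAgent`, `EchoAgent`, the `run` method) is hypothetical and chosen for illustration; it is not OpenDevin's actual Agent interface, and a real sub-agent would call a model rather than echo its input.

```python
from typing import Dict, List, Protocol, Tuple


class Agent(Protocol):
    """Hypothetical plug-in contract: anything with run(task) -> str qualifies."""

    def run(self, task: str) -> str: ...


class EchoAgent:
    """Stand-in sub-agent that only labels the task it was given."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self, task: str) -> str:
        return f"[{self.name}] {task}"


class AgentHub:
    """Registry mapping a skill name to the agent that handles it."""

    def __init__(self) -> None:
        self._agents: Dict[str, Agent] = {}

    def register(self, skill: str, agent: Agent) -> None:
        self._agents[skill] = agent

    def get(self, skill: str) -> Agent:
        # Raises KeyError if nobody has registered an agent for this skill.
        return self._agents[skill]


class MetaAgent:
    """Farms out each (skill, subtask) pair to the matching sub-agent."""

    def __init__(self, hub: AgentHub) -> None:
        self.hub = hub

    def run(self, plan: List[Tuple[str, str]]) -> List[str]:
        return [self.hub.get(skill).run(subtask) for skill, subtask in plan]


hub = AgentHub()
hub.register("code", EchoAgent("coder"))
hub.register("test", EchoAgent("tester"))
results = MetaAgent(hub).run([("code", "write parser"), ("test", "run pytest")])
```

The plan is passed in by hand here; deciding how to split a task into sub-tasks is the hard part, and this sketch does not attempt it.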
It's early days though--we've only just gotten things wired together in a sort-of working demo. Stay tuned!
OpenAI already did spend quite a bit on software development reinforcement learning and we already have models with huge context windows.
We've already seen OSS projects where ai agents collaborate to create code projects.
If you use CodeLlama and throw some engineering manpower at creating a feedback loop with compilers, linters and other tooling to improve the generated code, you can probably match Devin's performance. I can get some pretty good results feeding the compiler's output back to ChatGPT; if that were automated, my impression of ChatGPT would be much better.
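A rough sketch of what automating that loop could look like, in Python. The checker here is just Python's built-in `compile()` standing in for a real compiler or linter, and `fake_model` is a hypothetical placeholder for a call to CodeLlama, ChatGPT or whatever model you use; none of this is taken from any of the projects discussed.

```python
from typing import Callable, Optional, Tuple


def check_python(source: str) -> Optional[str]:
    """Return None if the source compiles, otherwise a short error message."""
    try:
        compile(source, "<generated>", "exec")
        return None
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"


def repair_loop(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    check: Callable[[str], Optional[str]] = check_python,
    max_rounds: int = 3,
) -> Tuple[str, int, Optional[str]]:
    """Generate code, feed checker errors back, stop once the check passes.

    Returns (last candidate, rounds used, remaining error or None).
    """
    feedback: Optional[str] = None
    code = ""
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        code = generate(task, feedback)  # feedback is None on the first round
        feedback = check(code)
        if feedback is None:
            break
    return code, rounds, feedback


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical model stub: broken output first, fixed once it sees an error."""
    if feedback is None:
        return "def add(a, b)\n    return a + b\n"  # missing colon
    return "def add(a, b):\n    return a + b\n"


code, rounds, error = repair_loop(fake_model, "write an add function")
```

Passing a syntax check only shows the code parses. A serious version would also run tests and linters and feed those results back in the same way.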
So my bet is that an open source Devin is absolutely possible.
Did you watch the YouTube videos?
I’m also skeptical it can do all the things it appears to be able to do, but I think it’s undeniable that what they show it doing (totally unverified as it is)…
…is significantly more capable than any other programming agent I’ve seen.
If OpenAI had that capability, they would be selling it.
They are not. I doubt very much that they’re just hoarding it and keeping it to themselves; gpt4 just can’t do it out of the box.
It is not that simple.
It is not just a fancy prompt for gpt4.
If it was that easy, someone would have done it a year ago.
> If you use CodeLlama
Oh come on. These pathetic little open source models are shit compared to the vendor offerings.
This is daydreaming about what might be, not what actually exists.
It is hard to build something like that.
If it were just a GPT wrapper, we would have tons of indie hackers building competitors and making 10k MRR by now.
The agents, prompts and orchestration are difficult problems to solve.
On the other hand, you have people thinking it's impossible to build something like that. I also disagree.
From a complexity perspective, coding a piece of software is no harder than driving an autonomous vehicle, since the logic is fixed and the environment is deterministic. If we can get to L5 for AV, I don't see why we can't get to L5 for coding.
If they let me I'll get to work on the "hard part."
>Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. Devika aims to be a competitive open-source alternative to Devin by Cognition AI.
  (base) OpenDevin % docker login ghcr.io
  Authenticating with existing credentials...
  Login Succeeded
  (base) OpenDevin % docker pull ghcr.io/opendevin/sandbox:v0.1
  Error response from daemon: denied
  error:
    File "/Users/meek/anaconda3/lib/python3.11/site-packages/docker/models/containers.py", line 876, in run
      self.client.images.pull(image, platform=platform)
    File "/Users/meek/anaconda3/lib/python3.11/site-packages/docker/models/images.py", line 464, in pull
      pull_log = self.client.api.pull(
                 ^^^^^^^^^^^^^^^^^^^^^
    File "/Users/meek/anaconda3/lib/python3.11/site-packages/docker/api/image.py", line 429, in pull
      self._raise_for_status(response)
    File "/Users/meek/anaconda3/lib/python3.11/site-packages/docker/api/client.py", line 267, in _raise_for_status
      raise create_api_error_from_http_exception(e) from e
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/Users/meek/anaconda3/lib/python3.11/site-packages/docker/errors.py", line 39, in create_api_error_from_http_exception
      raise cls(e, response=response, explanation=explanation) from e
  docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.44/images/create?tag=v0.1&fromImage=ghcr.io%2Fopendevin%2Fsandbox: Internal Server Error ("denied")