Launch HN: CodeViz (YC S24) – Visual maps of your codebase in VS Code

189 points | LiamPrevelige | 1y ago | 89 comments

Hey HN — we’re Liam and Will from CodeViz (https://codeviz.ai). We're building a VS Code extension that generates interactive diagrams of codebases, from system architecture down to function call graphs. Here’s a demo where we analyze OpenHands, uv, and webviz: https://www.youtube.com/watch?v=fgfDXUtWzRk.

The extension is public if you want to try it on your own repos: https://marketplace.visualstudio.com/items?itemName=CodeViz....

Will and I started CodeViz because we wanted more intuitive representations of software. During our time at Tesla, we encountered a common problem: software engineers spend very little time actually typing code. Most development time was spent navigating convoluted files and building a mental map for each task. At the same time, whiteboard sessions were proof that code could be expressed intuitively.

We started with autogenerated technical documentation. Of course, long markdown docs are not a good solution for long files of code. We realized we needed diagrams that (a) help grasp large quantities of code and (b) can be filtered according to the developer’s task. So, we built a graph-based VS Code extension. It generates diagrams directly within VS Code, illustrating connections between functions and providing overviews of system architecture. These visualizations update as code changes.

CodeViz appears as a side panel in VS Code with two views:

(1) Call graph: as you click on functions, we show a chain of upstream and downstream references. You can navigate your codebase using the call stack and see, in one view, everywhere your functions are called. We generate this call graph using the language servers developers have already installed in VS Code.

(2) Architecture diagram: we create a C4 diagram of your system, so you can see a top-level view of your codebase and click into the component layer. We were surprised to find that a small fraction of the code is enough to generate a very accurate representation of the system. We detect these important files, then use LLMs to build nested architecture diagrams at the container and component level.
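To make the "upstream and downstream references" idea in view (1) concrete, here is a minimal sketch of a call graph that answers both questions with a breadth-first walk. It is an illustration only, not CodeViz's implementation: the `CallGraph` class, its method names, and the example function names are all hypothetical, and the edges are hard-coded here rather than collected from a language server.

```python
from collections import defaultdict, deque


class CallGraph:
    """Toy call graph (hypothetical). Edges point from caller to callee."""

    def __init__(self):
        self.callees = defaultdict(set)  # caller -> functions it calls
        self.callers = defaultdict(set)  # callee -> functions that call it

    def add_call(self, caller, callee):
        # Store the edge in both directions so either lookup is cheap.
        self.callees[caller].add(callee)
        self.callers[callee].add(caller)

    def _reach(self, start, edges):
        # Breadth-first walk that collects every node reachable from start.
        # The start node itself is excluded, even if a cycle leads back to it.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen and nxt != start:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def upstream(self, fn):
        """Everything that directly or indirectly calls fn."""
        return self._reach(fn, self.callers)

    def downstream(self, fn):
        """Everything that fn directly or indirectly calls."""
        return self._reach(fn, self.callees)


# Example edges (made up for illustration).
graph = CallGraph()
graph.add_call("main", "load_config")
graph.add_call("main", "run")
graph.add_call("run", "parse")
graph.add_call("run", "load_config")

print(sorted(graph.upstream("load_config")))  # ['main', 'run']
print(sorted(graph.downstream("run")))        # ['load_config', 'parse']
```

Storing each edge in both directions is the design choice that lets "who calls this?" and "what does this call?" share one traversal.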

Developers are mainly using our extension to navigate spaghetti code, onboard new devs, and interpret open source repos. We're still figuring out our pricing. Currently, we offer basic features for free, with a paid tier for more resource-intensive tools like detailed architecture diagrams. Open to suggestions on this approach.

CodeViz is in active development and our main focus over the next couple of days is to make the call graph much easier to view and navigate. We're continuously working to make it better, so your honest feedback, suggestions, and wishes would be very helpful. Looking forward to hearing any and all thoughts, whether about the current extension, general problem space, or something else!

89 comments

pthangeda | 1y ago

Congratulations on the launch - this looks great and I've been waiting for years for something like this! As a researcher who mostly uses Python, and explores/navigates a large number of repos for a short time, often written by other researchers not necessarily trained in software best practices, I was always frustrated (and surprised) that there was no VS Code extension or tool that gave me a quick overview/visualization to get a high level gist of different modules and code/data flow!

I tried this with a bunch of small open-source repos and it works great! I imagine using an LLM might be a hard no for some people/enterprises - any plans to offer stand-alone licenses with small local models? It seems like for what the LLM is doing here (if I understand it right, helping label the modules in natural language and perhaps helping organize them into this hierarchy/modules) you don't necessarily need a SoTA model, right?

Also, this could be coming from the LLMs, but I see that the visualizations are biased towards terminology used in web-dev? (For example, one of my robot-related repos was organized into front-end, back-end, etc., which I guess is kinda right but not exactly lol.) It would be nice to see an interactive visualization where I can iterate on the initial viz with information I know, e.g., I drag and drop a module or rename it, and then you do another pass with this feedback and the LLMs and update my overall visualization with more domain-specific labels and partitions?

Edit: Exploring CodeViz on a few more repos, it seems like you have a set of hardcoded labels for the highest level of the hierarchy in the architecture diagram? (So far, I've only seen Users, Databases, Backend, Frontend, and Shared Components.) I am guessing this is something passed in through the prompts? It'd be nice to allow users to define their own set of labels/partitions at one or more levels and then try to create an architecture visualization that fits into these labels/constraints (although I am guessing at some point you have to be wary of hallucinations?)

WillMcCall | 1y ago

Re: Edit

The top-level categorizations are indeed fixed; however, the nodes themselves can be arbitrary. We've found this helps with grouping and organization while still allowing for the flexibility required to accommodate different systems. I'm curious, are there any categories missing here that could be added?

Currently, we categorize by: Frontend (UI/UX elements), Backend (API/Business/Data Access), DB (persistent storage), External Services (backends maintained outside the codebase), Shared Components (internally maintained libraries and helpers).
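As a rough illustration of "fixed top-level categories, arbitrary nodes", the sketch below models the five categories listed above as a closed enum while leaving node names free-form. This is an assumption about how such a constraint could be expressed, not CodeViz's actual data model; `Category`, `Diagram`, `add_node`, and the node names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    """The fixed top-level groups named in the comment above."""
    FRONTEND = "Frontend"
    BACKEND = "Backend"
    DB = "DB"
    EXTERNAL_SERVICES = "External Services"
    SHARED_COMPONENTS = "Shared Components"


@dataclass
class Diagram:
    # One list of free-form node names per fixed category.
    nodes: dict = field(default_factory=lambda: {c: [] for c in Category})

    def add_node(self, name, category):
        # Category(...) raises ValueError for labels outside the fixed set,
        # so only the top level is constrained; node names are arbitrary.
        self.nodes[Category(category)].append(name)


diagram = Diagram()
diagram.add_node("motion_planner", "Backend")
diagram.add_node("telemetry_dashboard", "Frontend")

try:
    diagram.add_node("lidar_driver", "Hardware")  # not one of the five groups
except ValueError as err:
    print("rejected:", err)
```

Supporting user-defined labels, as suggested earlier in the thread, would amount to replacing the closed enum with a list supplied by the user.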

zingar | 1y ago

I don’t think I’d draw a diagram by hand with an explicit “frontend” label; I’d most likely leave it without a label. If I had to choose, I would label it “UI”.


0xEF | 1y ago

I like this perspective. I am just starting to dip my toes into software development, and one thing I love about the industry as a whole is that it allows for and often encourages new ways of doing a thing, which in turn promotes new ways of thinking about that thing. So many industries I interact with are stuck in a "that's the way it's always been done" loop, and it is maddening sometimes.