Based on modern metrics for code quality, almost nobody will realize that they're looking at bad code. I've seen a lot of horrible codebases which looked pretty good superficially: good linting, consistent naming, functional programming, static typing, etc. But architecturally they're shockingly bad; they're designed such that you need to refactor constantly, there's no clear business layer, and business logic traverses every component, including the supposedly generic ones.
With bad code, any business requirement change requires a deep refactoring... And people will be like "so glad we use TypeScript so that I don't accidentally forget to update a reference across the 20 different files this refactoring touches." Newsflash: your tiny business requirement change requires you to update 20+ files because your code sucks! Sure, TypeScript helps in this case, but type safety should be the least of your concerns. If code is well architected, complex abstractions don't generally end up stretching across more than one or two files.
There's a reason we say "leaky abstraction": if a complex abstraction leaks through many file boundaries, it's an abstraction and it's leaky!
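To make the contrast concrete, here's a sketch of "leaky" vs. "contained" using a hypothetical discount rule (all names made up for illustration):

```python
# Leaky: the business rule is re-derived wherever it's needed, so changing
# "10% off orders over $100" means hunting through every caller.
def checkout_total_leaky(cents: int) -> int:
    return cents * 9 // 10 if cents > 10000 else cents

def invoice_total_leaky(cents: int) -> int:
    return cents * 9 // 10 if cents > 10000 else cents  # duplicated rule

# Contained: one business-layer function owns the rule; callers depend only
# on its signature, so a requirement change touches one file.
def apply_discount(cents: int) -> int:
    """Business rule: 10% off orders over $100. Change it here, once."""
    return cents * 9 // 10 if cents > 10000 else cents

def checkout_total(cents: int) -> int:
    return apply_discount(cents)

def invoice_total(cents: int) -> int:
    return apply_discount(cents)
```

Trivial on purpose: the point is that in the leaky version a one-line requirement change fans out to N call sites, and in the contained one it doesn't.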
I wonder if the main problem was all the min-maxing of interview patterns that rewarded algorithm puzzle solvers from the 2010s onwards.
People applied for software engineering jobs because they wanted to play with tech, not because they wanted to solve product problems (which should have a direct correlation with revenue impact)
Then you have the ego-boosting blog post era, where everyone wanted to explain how they used Kafka and DDD and functional programming to solve a problem. If you start reading some of those posts, you'll understand that the underlying problem was actually not well understood (especially the big picture).
This led developers down a wild goose chase (willingly), where they ended up burning through tons of engineering time which arguably could have been better spent understanding the domain.
This is not the case for everyone, but the exceptions are few.
It makes me wonder if the incentives are misaligned, and engineering contributing to revenue ends up not translating to hard cash, promos and bonuses.
In this new AI era, you can see the craftsman-style devs going full luddite mode, IMO due to what I've mentioned above. As a craftsman-style dev myself, I can only set up the same async job queue pattern so many times. I'm actually enjoying the rubber-ducking with the AI more and more, mostly for digging into the domain and potential approaches for simplification (or even product refinement).
Painful for me because I excel at architecture. My puzzle-solving skills are actually good too, but unfortunately, not under time constraints! Sometimes I feel like there's been an industry-wide conspiracy against the software architect archetype!
Ever since I first learned to code at a young age, I wanted to be a software architect, and I was shocked to learn that this skill was rarely appreciated in the industry. I became convinced that the software developer role had become a kind of 'bullshit job' to meet the needs of the reserve bank's job-creation agenda.
I suppose the silver lining is that at least now LLMs have a bias towards puzzle-solving and so lead most codebases astray... This increases my value as a software architect or 'craftsman' in your words.
I think you make a good argument there, and you can extrapolate it to almost every aspect of society. From the moment you start school, everything is geared towards measuring thinking speed... We've been using thinking speed as the definition of intelligence... You know who else, besides high-IQ individuals, is good at thinking fast? LLMs!
It's kind of interesting and fitting though that the AI agents we invented have the same biases as the humans at the top of our organizations!
I feel like the whole "there is only one kind of intelligence" belief which was pervasive in big tech has been thoroughly debunked by now.
This is a naive metric since it's satisfied by putting the entire code base into a single file.
Part of the reason that business requirement changes to modern web dev code bases require changes to so many files is because web devs are encouraged to restrict the scope of any one file as much as possible.
I can't tell if you're complaining about that specifically or if you think it's possible to have both a highly modularized code base & still restrict business requirement changes to only a couple files.
If the latter, then I'd love to know guidelines for doing so.
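One guideline that seems to square the two is to modularize by feature ("vertical slices") rather than by technical layer. A hypothetical sketch, with all names invented: everything the orders feature needs lives in one small module behind one narrow public function, so an orders requirement change stays in one place.

```python
# orders.py -- validation, business rule, and persistence call for one
# feature, behind a single narrow public function.
_DB: dict[int, dict] = {}  # stand-in for a real data store

def place_order(order_id: int, items: list[str]) -> dict:
    if not items:
        raise ValueError("order must contain at least one item")
    order = {"id": order_id, "items": items, "status": "placed"}
    _DB[order_id] = order  # persistence detail stays inside the slice
    return order

# Other features import only place_order, never _DB or the internal
# shape of the order beyond what place_order returns.
```

The codebase stays highly modular (many small feature modules), but a change to how orders work edits one file, not a controller file plus a service file plus a repository file plus a DTO file.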
I said 90% in my comment but that's from my professional experience which is probably biased towards complex projects where maintainability is more important.
What does this program accomplish? How does it accomplish it? Walk me through the boot sequence. Where does it do ABC?
I work in a company where I frequently interact with adjacent teams' code bases. When working on a ticket that touches another system, I'll typically tell it what I'm working on and ask it to point me to areas in the code that are responsible for that capability and which tests exercise that code. This is a great head start for me. I then start "in the ball park".
I would not recommend having it make diagrams for you. I don't know what it is, but LLMs just aren't great at converting information into diagram form. I've had one explain, quite impressively, parts of code, and when I ask it to turn that into a diagram it comes up short. Must be low on training data expressing itself in that medium. It's an okay way to get the syntax for a diagram started, however.
I wish you an auspicious time in your new role!
I built my own node graph utility to do this for my code, after using Unreal's blueprints for the first time. Once it clicked for me that the two are different views of the same codebase, I was in love. It's so much easier for me to reason about node graphs, and so much easier for me to write code as plain text (with an IDE/language server). I share your wish that there were a more general utility for it, so I could use it for languages other than js/ts.
Anyway, great job on this!
https://en.wikipedia.org/wiki/Doxygen#/media/File:Doxygen-1....
This kind of approach might be what (finally) unlocks visual programming?
I feel like most good programmers are like good chess players. They don't need to see the board (code). But for inputting the code transformation into the system this might be a good programmer's chessboard.
Though to make it work concretely for arbitrary codebases I feel like a coding agent behind the scenes is 100% required.
As a bonus, porting Doom to it should be "trivial".
A specific type or area of developers, I'd say. There are many types and not all of them require understanding sizeable code bases to do their work well.
How would you do it today?
But telling people that isn't helpful. I try at the beginning to give a more step-by-step account of how I would get into understanding the code base if I didn't already know these kinds of shortcuts. (I'm not sure I could write those down; they're just know-how and heuristics, like how a missing ; can take much longer to spot when you're starting to code than after you've been programming for a while.)
There should be more writing and discussion in this area, for several reasons. The simplest is that we're curious about how others do this. But it's also an interesting topic, IMHO, because layers of abstraction--code designed to run other code--can be difficult to talk about, because the referents get messy. How do you rhetorically step through layers of abstraction?
When I have a codebase I don't know or haven't touched in some time and there's a bug, the first step is to reproduce it, then set a breakpoint early on somewhere, grab some coffee, and spend some time stepping through it looking at state until I know what's happening. From there it's usually kind of obvious.
Why would one need a graph view to learn a codebase when you can just slap a red dot next to the route and step a few times?
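A minimal sketch of that workflow, with a hypothetical parse_price helper standing in for the unfamiliar code:

```python
def parse_price(raw: str) -> int:
    """Convert a price string like '$1,234' or '$19.99' to integer cents."""
    cleaned = raw.replace("$", "").replace(",", "")
    dollars, _, frac = cleaned.partition(".")
    return int(dollars) * 100 + int(frac.ljust(2, "0") or "0")

def reproduce_bug():
    # 1. Reproduce the bug deterministically with the failing input.
    raw = "$19.99"
    # 2. Set the breakpoint early (the "red dot"), then step with n/s and
    #    inspect state with p <expr> in pdb until the cause is obvious:
    # breakpoint()
    return parse_price(raw)
```

(The breakpoint() call is commented out so the sketch runs straight through; uncomment it to drop into pdb at that point.)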
https://heyes-jones.com/externalsort/treeofwinners.html
Take this example. I can step through the algorithm, view the data structure and see a narration of each step.
A debugger is useful for debugging edge cases but it is very difficult to learn a complex system by stepping through it.
And for huge git repos I always like to generate a Gource animation to understand how the repo grew, when big rearrangements and refactors happened, what the most active parts of the codebase are, etc.
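Something like the following, run from inside the repo (flags from Gource's documentation; check `gource --help` for your version):

```shell
# Replay the repo's commit history as an animation; slow the clock down
# and skip over idle periods so big refactors stand out.
gource --seconds-per-day 0.2 --auto-skip-seconds 1 --title "repo history" .
```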
Think of an on-caller who wants to quickly pinpoint a problem. Visualization could help one understand the nature of the problem before reading the code. Then you could select a part of the visualization and ask the computer to tell you what that part does, if there are any recent changes to it, etc.
How is it different from regular code browser/indexers?
I'd add one more technique that's worked well for me: trace a single request from HTTP endpoint to database and back. In a FastAPI app, that means starting at the route handler, following the dependency injection chain, seeing how the ORM/query layer works, and understanding the response serialization. You touch every layer of the stack by following one real path instead of trying to understand the whole codebase at once.
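A stripped-down sketch of the layers that one trace passes through, with plain functions standing in for FastAPI's machinery (all names hypothetical):

```python
FAKE_DB = {42: {"id": 42, "name": "Ada"}}

def get_db():                       # the dependency-injection chain
    return FAKE_DB

def query_user(db, user_id: int):   # the ORM/query layer
    return db.get(user_id)

def serialize(user) -> dict:        # response serialization
    return {"id": user["id"], "name": user["name"]}

def get_user_handler(user_id: int) -> dict:  # the route handler
    db = get_db()
    user = query_user(db, user_id)
    if user is None:
        return {"error": "not found"}
    return serialize(user)
```

Stepping through one call like get_user_handler(42) touches every layer in order, which is the whole point of the technique: one real path, end to end.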
Visualizers are nice for the "big picture" but they rarely help you understand why the code works the way it does. The why is in the git history and the closed issues, not in a dependency graph.
I created Intraview for VS Code, Cursor, etc., which makes it easy to create code tours with your coding agent by simply saying, "Create a tour that helps me understand how to get started with this repository."
It has other features, but it was designed for the problem of getting into new code bases, and it allows tours to be saved in the repo as flat JSON files. You can re-open or share tours with new folks, and if the code changes the system tells you how to ask your agent to update the tour.
Just a thought.