For example:
// How: This filters records by MyType of J and if they have children
// Why: We want Jobs (J) when those Jobs have Children. Otherwise, we'll get JobGroups or Jobs without any Children we need to process below.
.Where(T => T.MyType == "J" && T.Child.Count() > 0);
WHYs are, by their very nature, business rules or explanations of a thought process. If the original developer of the code moves on or is unavailable, that information sometimes can't be found anywhere else.

Really? You don't think it's a matter of updating the prompt and expanding the context to include more of the code?
I don't know exactly how they generated these comments, but I think focusing a prompt on a single method, with the entire codebase in context, would yield much better results.
* Past incidents
* Regulations
* What the code used to do, and why that was bad
The whole point of a really good comment is that you can't infer it just by looking at the code. That's why it's there at all!
AI isn't magic. If the "why" isn't in the code, there's no way for it to magically synthesize it.
Comments explaining "there's a check for this edge case because we saw a problem in production when client X did something, see JIRA-1234 for more details" are what you should be aspiring to. If a human doesn't write that down, the AI can't invent it.
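The kind of comment being described might look something like this sketch (the class, method, and "client X" scenario are made up for illustration; JIRA-1234 is the ticket ID from the comment above):

```java
import java.util.List;

public class OrderValidator {
    /**
     * Reject orders with zero line items up front.
     * Why: client X's integration sent empty orders during a production
     * incident, which crashed the pricing service downstream.
     * See JIRA-1234 for details.
     */
    static boolean isAcceptable(List<String> lineItems) {
        return lineItems != null && !lineItems.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable(List.of()));        // false
        System.out.println(isAcceptable(List.of("sku-1"))); // true
    }
}
```

The check itself is trivial to read; the comment earns its keep because nothing in the code explains why an empty order is worth a dedicated guard.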
And, funnily enough, it's bad advertising. How could a startup focused on code think that making a PR with 3,158 changed files is acceptable?
I just experimented a little, and telling ChatGPT to assume a self-documenting approach and skip comments wherever possible produces pretty minimalist output. Add a request to note which non-obvious aspects might be noteworthy based on other projects, and I bet ChatGPT could build a list of things to look for, check whether each applies, and then rewrite in the appropriate format. For example, if the function is about sorting, it could figure out that whether the sort is stable is an important marker, and determine that. Something like that could be useful.
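To make the stability example concrete: Java's `List.sort` is documented to be stable, so a comment worth generating (or keeping) is one that records that callers rely on that property. A minimal sketch, with made-up names:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StableSortDemo {
    record Ticket(String customer, int priority) {}

    public static void main(String[] args) {
        List<Ticket> tickets = new ArrayList<>(List.of(
                new Ticket("alice", 2),
                new Ticket("bob", 1),
                new Ticket("carol", 2)));
        // List.sort is guaranteed stable, so tickets with equal priority
        // keep their original arrival order -- the queue depends on that
        // for fairness among same-priority customers.
        tickets.sort(Comparator.comparingInt(Ticket::priority));
        for (Ticket t : tickets) {
            System.out.println(t.customer());
        }
    }
}
```

Here bob (priority 1) sorts first, and alice stays ahead of carol because they tied on priority and alice arrived first; the comment documents the non-obvious property the code depends on, not what the code does.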
isUserAMonkey
Added in API level 8
public static boolean isUserAMonkey ()
Returns "true" if the user interface is currently being messed with by a monkey.
- https://developer.android.com/reference/android/app/Activity...

What is this code for? How would they even know it's a monkey?!
I'm sure it's a joke of sorts, or a clever moniker if you want to be charitable, but it's unclear to me in any case.
A product that scans your repo and identifies mis-named functions would be compelling, though.
It reflects back on them really well.
I've never used Spring Boot, but I have to give them credit for this. I don't know that I'd have similar restraint
I’d be surprised if their team wasn’t incredibly professional.
Things like a method called Foo that takes a Bar so the comment says (in 3 or 4 lines)
“Foo takes a Bar”
Wow! The future is here!
One is that (glancing through the comments) they're actually significantly more thorough than what is already there and they're at the level of thoroughness that a new user may actually want. Phil Webb is quite right; this is the level of detail someone new to the codebase could use, and if you could generate it dynamically on the fly as a "Help me understand what I'm looking at" tool that'd be really nice-to-have.
Second: I've definitely worked at places so tight-ass about code documentation that they do want "makeAppleRed(): Makes the apple red." Mostly because they're using doc engines that do a bad job of letting you know a function exists at all if it isn't documented. I have no doubt Codemaker is going to make its money on those places.
Consider this real example from the pull request:
/**
* Returns the plugin version property.
* @return the plugin version property
*/
@Input
public Property<String> getPluginVersion() {
return this.pluginVersion;
}

To me this seems less like a useful comment and more like a joke highlighting the absurdity of aiming for complete javadoc coverage. But contrast it with this example: https://github.com/spring-projects/spring-boot/pull/39754/fi...
A function named `apply` with no documentation; you have to go up to the class to find the docs. And yes, it's not DRY to repeat the docs, but the fact the tool can pull relevant context into this location is helpful if it could be used as an on-the-fly doc generator (assuming, of course, it gives true output).
People (understandably) don't want to repeat themselves but they also can't predict what entrypoint a reader enters their class from. I've definitely bounced off APIs before because something like `apply` isn't doc'd even though it's doc'd in source twenty lines away (because I'm busy; I'm not scanning entire source files on the off chance that the thing I'm looking for is nearby).
Here's a sample of the crap you get: 3000+ files, multiple comments per file, all of it this inane, stuff that anyone with two brain cells could figure out just by reading the name of the function. If the project didn't already have these sorts of comments, then adding them now is a choice for the maintainers of the project's standards to make, not some random contributor.
The second point doesn't seem like a positive on the whole, except maybe for the company selling it, in the "there's good money to be made in being part of the problem" sense.
Edit: I noticed one comment that seemed like it added some useful context: For CheckAdditionalSpringConfigurationMetadata.Report::iterator it says "Each line represents a source code file and its associated problems." However, this turns out to be completely wrong, if I'm reading the code right, because a line could represent a filename, or a problem, or just be a blank line. The comment gives the appearance of being helpful while actually just adding confusion and wasting everyone's time.
No argument there (I don't miss those firms' doc standards). But you can make a lot of money being part of the problem in the business universe.
> The comment gives the appearance of being helpful while actually just adding confusion and wasting everyone's time.
That's definitely a concern; if the tool doesn't work it doesn't work, potential or no.
I think the first comment (https://github.com/spring-projects/spring-boot/pull/39754/fi...) looks helpful. Although it repeats info you can find in the class docs, the fact that you have to go to the class docs to find it is itself an issue. A tool that could generate a "Give me context for this function" answer on the fly (and give correct output) would be pretty useful for comprehending a codebase.
Always up-to-date, succinct, auto-generated LLM 'documentation' that describes a method in terms of who calls it (and perhaps why?), plus the behaviour of the whole deep call graph inside the function, seems like something that could be quite useful when looking at a method that has no human-written documentation and isn't obvious... but it shouldn't masquerade as actual API documentation that describes a durable contract.
If this synthesised analysis could flag incompatibilities between the contract described in API documentation and the actual code behaviour that could be useful (similar to what tools do today for nullable/non-nullable-annotated args/returns, except on fuzzier natural language descriptions of higher level behaviours). That seems like a much harder reasoning problem for the LLM to solve, though.
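As a toy illustration of the kind of contract-vs-behaviour mismatch such a tool would flag (all names here are hypothetical, and this shows only the mismatch itself, not the checker):

```java
import java.util.HashMap;
import java.util.Map;

public class ContractCheck {
    /**
     * Documented contract: never returns null; unknown keys yield "".
     */
    static String lookup(Map<String, String> settings, String key) {
        // Actual behaviour: Map.get returns null for unknown keys,
        // contradicting the documented contract above.
        return settings.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("region", "eu-west-1");
        String result = lookup(settings, "missing-key");
        System.out.println(result == null
                ? "contract violated: null returned for unknown key"
                : "contract holds");
    }
}
```

Nullability annotations let today's tooling catch exactly this case mechanically; the harder problem the comment describes is doing the same for fuzzier natural-language claims like "each line represents a source file and its problems."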
Note: I'm not talking about things you know just by looking at the method's signature, but about things that only become obvious once you read the method's code.
/** This will return the data */
Data getData() {...}
This is useless. But:

/** This will return true iff it's not empty */
boolean isValid() {return !empty();}

This is useful. This is what good documentation should provide, and having documentation in the code itself (which you can later extract to an HTML page or elsewhere) is far better than keeping it on a separate platform that you have to remember to update.

Edit: remember that this is a library other people will use; it's not an internal tool that only your team knows about.