toast0 12d ago

> It's just one darn hallucinated citation for heaven's sake, not fraud or something.

It is fraud.

> It doesn't account for the substance or quality of their work at all.

References are part of the work. If you're making up the references, what else are you making up?

> People make mistakes and a good fraction of them can learn from those mistakes. There's no need to permanently cripple someone's ability to progress their life or contribute to humanity just because an AI hallucinated a reference one time in their life.

A one year ban is not permanent. Having a negative consequence for making poor decisions seems like an inducement to learn from the mistake?

In an ideal world, one would be keeping notes on references used while doing the research that led to writing the paper. Choosing not to do that is one poor decision.

Taking a charitable view: if one asks an AI to suggest references that may have been missed, one should at least verify that the references exist and are relevant. Choosing not to do that is also a poor decision, even if one did take notes on references used while researching.

godelski 12d ago

  > In an ideal world, one would be keeping notes on references used

In a far less than ideal world, authors are referencing papers whose title and abstract they've at least read. In an ideal world, authors would only reference works they have read in their entirety. I don't think we need to live in the ideal world[0], but let's also not pretend the ideal world is even remotely out of reach. Let's also be honest that, in the current setting, a lot of citations are used more to help a work get accepted than for their utility to the paper. The average ML paper now is 8 pages and has >50 citations. That's crazy.

[0] References can be entire textbooks, which is potentially too high of a bar

withinboredom 11d ago

Even as a human, you can still fuck up references.

I submitted a paper with a reference author as Elisio because I couldn’t read my own handwriting. After submitting, I double checked all the references through an LLM. It pointed out that their name was actually Enrique. Yes, you should probably double check your references before submitting, not after.

Point is, I didn’t even trust the LLM at first. But after verifying the mistake, I was embarrassed af. I resubmitted with the fixes before it went live, but ultimately, what’s the difference between “mistake” and “hallucination”?

emil-lp 11d ago

Sounds like you could use a tool like Zotero.

With proper bibliography management tools, everything (that has one) is centered around the DOI.

In fact, if a DOI is present, it's trivial to verify authors, title, venue, year, pages etc.

Of course, some older and more obscure papers won't have a DOI, but the vast majority of research work has one.
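
For what it's worth, here is a minimal sketch of that kind of check in Python. It assumes the public Crossref REST API (`https://api.crossref.org/works/<doi>`) as the metadata source. The helper names `fetch_crossref_record` and `check_citation`, the shape of the `cited` dict, and the sample record are all made up for illustration, and the name comparison is deliberately naive (exact match after lowercasing).

```python
import json
import urllib.parse
import urllib.request


def fetch_crossref_record(doi):
    """Fetch metadata for a DOI from the public Crossref REST API (needs network)."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["message"]


def _norm(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def check_citation(cited, record):
    """Compare a cited entry with a Crossref-style record; return a list of mismatches.

    `cited` is a hypothetical dict: {"title": str, "authors": [str], "year": int}.
    """
    problems = []

    # Crossref returns the title as a list of strings.
    titles = record.get("title") or [""]
    if _norm(cited["title"]) != _norm(titles[0]):
        problems.append(f"title mismatch: record has {titles[0]!r}")

    # Authors are dicts with "given" and "family" keys.
    known = {
        _norm(f"{a.get('given', '')} {a.get('family', '')}")
        for a in record.get("author", [])
    }
    for name in cited["authors"]:
        if _norm(name) not in known:
            problems.append(f"author not in record: {name!r}")

    # The publication date is nested as issued -> date-parts -> [[year, month, day]].
    date_parts = record.get("issued", {}).get("date-parts") or [[None]]
    year = date_parts[0][0]
    if cited["year"] != year:
        problems.append(f"year mismatch: record has {year!r}")

    return problems


# Made-up record in the Crossref shape, so the example runs offline.
record = {
    "title": ["A Placeholder Title"],
    "author": [{"given": "Enrique", "family": "Placeholder"}],
    "issued": {"date-parts": [[2020, 5, 1]]},
}
cited = {"title": "A placeholder title", "authors": ["Elisio Placeholder"], "year": 2020}
print(check_citation(cited, record))
```

In real use, `record = fetch_crossref_record(doi)` would replace the made-up dict. A misread first name like the one described above shows up as an "author not in record" entry.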

lhoff 11d ago

I assume they won’t ban anyone automatically without a way to object. Using your example, I wouldn’t assume they would enforce the ban if you object and explain your typo, and if the corrected citation actually says what you cited. Mistakes like these are explainable; a completely hallucinated citation usually is not.

godelski 11d ago

Given their examples and examples I've seen Thomas talk about in the past, I doubt a typo like that would be grounds for the ban.

Perhaps the issue is that people aren't logged in or using xcancel, so they are missing part of the tweet thread. Here's an important line:

> If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper.

Followed by:

> Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM ("here is a 200 word summary; would you like me to make any changes?"; "the data in this table is illustrative, fill it in with the real numbers from your experiments")

I wouldn't look at your case and read that as "incontrovertible evidence". They are looking for the absolutely brain-dead, no-one-at-the-wheel type of errors. They're looking for things like your paper saying "As an AI language model". There will be real papers containing that exact phrase, though, so it should get flagged, not auto-banned.

rossjudson 12d ago

If you write your own paper (mostly) and choose your own references (because you've actually read the papers) you won't have a problem.

ksd482 12d ago

> It is fraud.

I think we are talking semantics here.

While fraud does require intention to deceive, I get the sentiment that hallucinated citations shouldn't be dismissed as simply carelessness. It should be something stronger than that: gross negligence or something MUCH stronger! There should absolutely be repercussions for this.

But let's not call it fraud. That word is reserved for something specific.

EDIT: someone else said "reckless disregard" equals intent or something to that effect. So I looked it up.

It appears that is the case. "Reckless Disregard Equals Intent" in legal language.

But I am not sure if this particular clause should apply here. Perhaps it depends on what kind of research is being published? For example, if it is related to medical science and has real consequences for people's health, we can then apply this?

eqvinox 12d ago

I do believe this policy is appropriate to deal with the reckless disregard of posting hallucinated references.

It's a conscious decision to not take the time to check your AI output, and instead waste a whole bunch of other people's time letting them essentially do that for you in duplicate.

Feels like that should disqualify you from participation for a bit. Intent or no intent.

InfiniteAscent 11d ago

100% agreed.

Doing your job poorly means giving more work to others and, consequently, stealing their time, their most precious asset.

Many here don't agree with this ban because they work in IT, where this immoral and antisocial behavior is normalized.

dataflow 12d ago

> Feels like that should disqualify you from participation for a bit. Intent or no intent.

Exactly! For a bit!

Yet this is not for a bit! This is a lifetime disqualification, and that's been my entire gripe the whole time! Is nobody reading this?

"The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue."

gpm 12d ago

The intent to deceive is there. The deception is representing, when you submit it, that it is a scholarly piece of work in which, among many other things, you know the citations are accurate. This false representation was knowingly and intentionally made at the time of submission.

The citation being incorrect is merely the proof of deception, not the (relevant) deception itself.

Fraud is the correct description provided (and this is practically a guarantee) you intended to benefit from the submission of the paper (e.g. by bolstering your resume).

fc417fc802 12d ago

If I violate the letter of the ToS when clicking submit you can correctly argue that I have technically committed fraud! Yet that is almost never what anyone actually means when having discussions like this one.

Fraud in a scientific context generally refers to fabricated research results. At least personally I agree with GP that hallucinated citations are generally something akin to laziness thus not fraud but rather some sort of professional negligence.

fc417fc802 12d ago

I think (though I might well be misunderstanding) that reckless disregard is taken to be an intentional choice, but that it does not imply the outcome itself was intentional. It's the difference between intentionally doing something you know has a high risk of failure, without being able to predict the outcome, and intentionally seeking a particular legally disallowed outcome.

But what LPisGood was saying is that reckless disregard (as opposed to explicit intent) is sufficient to meet the legal bar for fraud.

jruohonen 12d ago

> In an ideal world, one would be keeping notes on references used while doing the research that led to writing the paper. Choosing not to do that is one poor decision.

In this book

https://news.ycombinator.com/item?id=44022957

there is this passage on p. 127:

"Any author citing another paper should be required to provide proof that they a) possess a copy of that paper, b) have read that paper, c) have read the paper carefully."

dataflow 12d ago

> It is fraud.

No, it is emphatically not. Fraud requires intent to deceive.

> A one year ban is not permanent.

...what text are you reading? Nobody was calling the one-year ban permanent, or even against it. I was literally in favor of it in my comment. I explicitly said it is already plenty sufficient. What I said is there's no need to go beyond that. My entire gripe was that they very much are going beyond that with a permanent penalty. Did you completely miss where they said "...followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue"?

LPisGood 12d ago

Fraud requires intent to deceive _or_ reckless disregard (sometimes called “conscious indifference”) for the veracity of the statement asserted.

dataflow 12d ago

No. One single hallucinated citation on a document with you as an author is not evidence of your reckless disregard for anything. These exaggerations are crazy and you would absolutely deny such accusations if you missed your co-author's AI hallucinating a citation on your manuscript too. At best it would be careless, if you really relish extrapolating from one data point and smearing people's character based on that. Not reckless. It's quite literally the difference between going five miles per hour over the speed limit versus fifty.

zeusdclxvi 12d ago

If you are using AI-hallucinated references in scientific papers, then there is some obvious intent to deceive there.

NiloCK 12d ago

> No, it is emphatically not. Fraud requires intent to deceive.

I'm about as pro AI-as-a-research-and-writing-assistant and anti AI-witchhunt as they come, but I simply cannot parse what I've quoted here.

Posting slop to arxiv is blatant deception. Posting an article is an attestation that the article is a genuine engagement with the literature. If you're posting things to arxiv that are not sincere engagements with the literature, you are attempting to deceive.

protocolture 12d ago

> I'm about as pro AI-as-a-research-and-writing-assistant and anti AI-witchhunt as they come, but I simply cannot parse what I've quoted here.

Ditto. And it's only 1 year. Like, it's about the most reasonable thing they could have done.

fc417fc802 12d ago

You are equating cutting corners (i.e. laziness) with intentional deception and not being genuine. That doesn't seem accurate to me. In most contexts I think cutting corners would be taken to be some form of negligence or recklessness.

Regardless of terminology, I agree that it's certainly punishable and certainly a serious problem.

toast0 (OP) 12d ago

> followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue"?

This part seemed reasonable too. I'm not in academia, but my understanding is most people writing papers intend for them to be accepted by reputable peer-reviewed venues, but post to arXiv because those venues don't always allow for simple distribution.

If your papers aren't going to be accepted at reputable venues and you posted slop to arXiv before (and they noticed it!), seems reasonable that they only want reputable stuff from you in the future?

blazespin 12d ago

It's very silly, but not a big deal. Arxiv is becoming irrelevant these days anyways.

In fact, it would be better if they just banned AI, so we could just get off the luddite platforms.

Automated research is the future, end of story. And really it couldn't have come out at a better time, given the increasingly diminishing returns on human powered research.

AnimalMuppet 12d ago

If automated research is the future, it has to be research, not making stuff up.

Which of those two does "hallucinated references" fit into?

andrepd 12d ago

Poe's law striking hard.
