Claiming we can tell that there's a distinction that merits saying people are reasoning and LLMs are not is "hallucination" to me. It's a claim for which there is insufficient evidence to make a reasoned statement.
EDIT: Ironically, on feeding ChatGPT (w/GPT4) my comment and your reply and asking it to "compose a reply on behalf of 'vidarh'" it produced a reply that was far more willing to accept your claim that there is a fundamental difference (while otherwise giving a reasonable reaffirmation of my argument that reinforcement of the boundaries of its knowledge would reduce its "hallucinations")
E.g. ChatGPT can (badly) perform maths or symbolically evaluate code that is guaranteed not to be in its input in a way that would be very hard to explain as "just stringing words together".
Here is an example where I loosely describe a language that is a mix of an INTERCAL-inspired "come from" construct and Ruby, and then give it an example program and ask it what it will output:
> We're going to write a program in a "new" language that is a mix of Ruby and INTERCAL. We're going to take the "come from" statement and use it to allow "hijacking" the return of a function. Furthermore, we're going to do it conditionally. "come from <method> if <condition>" will execute the following block if <method> was executed and <condition> is true. In <condition>, "result" can be used to refer to the result of executing <method>.
>
> Can you deduce what this program will print, and why?
>
> def foo(str)
> str
> end
>
> come from foo if result == "hi" do
> print "foo returned hi"
> end
>
> puts "1:"
> puts foo("some string")
> puts "2:"
> foo("hi")
I explicitly avoided passing strings that'd indicate the expected output. There's an ambiguity introduced by passing a block rather than just letting the statement stand by itself (passing a block might indicate I'd like the statement to return at the end of the block). This was unintended from my side. I intended execution to just continue after the "come from" block, as that is consistent with INTERCAL. I'll get back to that ambiguity.
Here's ChatGPT (GPT4)'s output:
> Sure, let's analyze this program step by step. The new language is a mix of Ruby and INTERCAL, using the "come from" statement to conditionally hijack the return of a function.
>
> Here's the program:
[snipped a verbatim copy of the program I input for brevity]
> 1. The `foo` method is defined, taking a single argument `str` and returning it.
>
> 2. The "come from" statement is introduced with a condition to hijack the return of the `foo` method if the `result` is equal to "hi".
>
> 3. The program outputs "1:".
>
> 4. The program calls `foo("some string")` and outputs its result, which is "some string". The "come from" statement is not executed because the result is not equal to "hi".
>
> 5. The program outputs "2:".
>
> 6. The program calls `foo("hi")`. Since the result is "hi", the "come from" statement is executed, and "foo returned hi" is printed. Note that the return value of the `foo` method is not printed here, as there's no `puts` statement wrapping the call to `foo("hi")`.
>
> So, the output of the program will be:
>
> ```
> 1:
> some string
> 2:
> foo returned hi
> ```
This is what I intended.
This isn't a very complex problem, but it's the kind of problem that requires stepping outside how you're used to thinking, to the point where I'd like to pose it as an interview question. I'd worry about using it, though, as I know from conversations with people about INTERCAL's "come from" that a surprising number of people struggle to reason about it.
I asked a followup question:
> "come from" alters the control flow, and prevents a method call from returning to its call site. Suggest a couple of implementation and specification issues specific to dealing with the control flow and its implications.
It produced a detailed explanation of how it'd require stack unwinding, pointed out that you'd need to define how it is affected by dynamic dispatch in Ruby (e.g. which method should it actually intercept?), and gave an example of a scenario where "come from" could affect exception handling that I frankly hadn't thought of. If someone gave an answer like that in a technical interview, I'd hire them.
I then followed up with one more:
> Are there other options to unwinding the stack? E.g. when the block passed to "come from" exits, what are our options for where code execution will continue?
It correctly ascertained that returning to the original call site is another option, and then deduced that this would make "come from" behave like an aspect-oriented "around" mechanism. It also produced several more esoteric suggestions and explained why they'd be a bad idea.
Looking through this, I recognised that the block I'd added to the "come from" syntax made it somewhat ambiguous and might hint at different semantics, so I asked this:
> Does the fact I passed a block to the "come from" statement give any hints as to which of these options I might have intended?
ChatGPT replied:
> Yes, the fact that you passed a block to the "come from" statement does provide some hints about your intended behavior. It suggests that you might want the "come from" statement to behave like a method interceptor or an "around" aspect, executing the block of code when the specified condition is met and then returning control back to the original call site.
[snipped it pointing out that this matches one of the options it had described before and copying its previous explanation]
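For reference, the "around" reading ChatGPT describes can be approximated in plain Ruby. This is only a rough sketch under my own assumptions: `come_from` and `Demo` are hypothetical names made up for illustration, the condition is passed as a lambda rather than parsed from new syntax, and it only covers the "return to the call site" option, not the INTERCAL-style one where control never returns.

```ruby
# Hypothetical helper (not part of Ruby): emulates
#   come from <method> if <condition> do ... end
# under the "return to the original call site" reading.
def come_from(owner, method_name, condition, &block)
  interceptor = Module.new do
    define_method(method_name) do |*args, &blk|
      result = super(*args, &blk)                  # run the original method first
      block.call(result) if condition.call(result) # run the hook only if the condition holds
      result                                       # then return to the caller as usual
    end
  end
  owner.prepend(interceptor)                       # the interceptor now sits in front of the original
end

# Hypothetical class, used so the hook is not prepended to Object.
class Demo
  def foo(str)
    str
  end
end

come_from(Demo, :foo, ->(result) { result == "hi" }) do |_result|
  print "foo returned hi"
end

demo = Demo.new
puts "1:"
puts demo.foo("some string")
puts "2:"
demo.foo("hi")
```

Under this reading the program prints the same four lines ChatGPT predicted ("1:", "some string", "2:", "foo returned hi"), because nothing after the final call depends on whether control returns to the call site.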
I was hopeful it'd pick up on the hint in the block when specifically asked, but I'm frankly surprised it did. It's a subtlety of how idiomatic Ruby differs from INTERCAL, and something I didn't think about myself when I wrote the example. Given I told it the "come from" was inspired by INTERCAL, it was right to assume at first that my intent was for it to pass through. But it's also right to recognise that the block might imply other intent (and indeed either this construct should be named something else, like "around" or "after", or the block ought to go).
If this is just "stringing words together", then I know a lot of developers who are bad at just stringing words together.