Does anyone have interesting anecdotes in different categories? Programming, general lookup, math, etc.
Gemini on the other hand will ask several follow-up questions and behaves more like a conversation you might have with a coworker on a pair-programming topic. It's kind of changed the way I think about the utility of LLMs.
Front-end web development with HTML, CSS and JavaScript is not one of my strengths, so I've been trying to go back-and-forth with GPT-4 to get a working prototype.
Before asking GPT-4 to generate any code, I try to have a conversation to explore the pros and cons of various approaches. Is it better to put separate click events on every object, or listen for events at a higher level of the DOM? Is it better to use callbacks for this or async/await?
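To make that first question concrete, the delegation option boils down to something like this rough sketch (the #board container and .tile class are made-up names, purely for illustration):

    // One click listener on a shared ancestor instead of one per element.
    // "#board" and ".tile" are hypothetical names used only for this sketch.
    const board = document.querySelector('#board');
    board.addEventListener('click', (event) => {
      const tile = event.target.closest('.tile'); // find the clicked item, if any
      if (!tile) return;                          // the click missed every tile
      console.log('clicked tile', tile.dataset.id);
    });

The alternative is calling addEventListener on every .tile element individually, which is simpler to reason about but means re-wiring listeners whenever elements are added or removed.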
But GPT-4 rushes to write code before requirements are defined, and then it struggles to modify that code later after I've decided what approach to take. It forgets what it did before, provides partial results, alters variable names between chat turns, contradicts itself.
Just tried the same dialogues with Gemini Advanced and it's night-and-day better. It seems to remember context from 10+ conversation turns ago, it provides a more thorough rationale for its suggestions, doesn't prematurely spit out code, and keeps variable names consistent throughout the dialogue.
After the approach and requirements are defined, it can then generate large blocks of code with intelligent modularization into separate functions, more similar to how I would write it by hand than what I get from GPT-4.
Gemini Advanced also seems to prefer the more modern ES6 coding style (arrow functions, destructuring, const/let variable declaration). GPT-4 defaults to the 15-year-old style (string concatenation for building strings instead of template literals, heavy use of var for variable declaration, function expressions instead of arrow functions, etc.). GPT-4 routinely mixes styles even when I explicitly request ES6.
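For anyone who hasn't watched the two styles side by side, the contrast is roughly this (the greeting function is a made-up example, not output from either model):

    // Pre-ES6 style: var, string concatenation, function expression.
    var greetOld = function (user) {
      return 'Hello ' + user.name + ', you have ' + user.count + ' new messages.';
    };

    // ES6 style: const, arrow function, destructuring, template literal.
    const greetNew = ({ name, count }) => `Hello ${name}, you have ${count} new messages.`;

    console.log(greetOld({ name: 'Ada', count: 3 })); // Hello Ada, you have 3 new messages.
    console.log(greetNew({ name: 'Ada', count: 3 })); // Hello Ada, you have 3 new messages.

Both produce the same output; the second is just what a reviewer today would expect to see.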
It sort of adds a "confidence level" check that saves me from doing it myself afterward like I would for GPT-4.
Edit to add: Gemini Advanced seems to be a lot faster as well at responding, in terms of tokens/second, compared to GPT-4.
But GPT-4 is still best for ML Python coding. Gemini hallucinates non-existent libraries and often adds unnecessary junk to its code. For example, Gemini often defines spurious variables and then never uses them.
How does GPT-4 do with PyTorch, PyTorch Lightning, etc.? I'm pretty surprised at how poor GPT-3.5 is with those sometimes.
I've put a complex codebase into a single 120K-token prompt and asked GPT-4 and Gemini 1.5 the same 7 questions. Here are the results! https://www.reddit.com/r/singularity/comments/1atjz9v/ive_pu...
Also, how do you get the code into Gemini? I've pasted a C++ file into the chat window and I get errors. It seems to work in AI Studio though; is that how you interface with it?
https://gemini.google.com/advanced
It was not so easy to even find out how to buy Gemini Ultra; apparently it's called "Gemini Advanced," or maybe it's called the "Google One AI Premium Plan." Searching for "Gemini Ultra" doesn't give you a link to any of this on the top page of hits in a Google web search, and you have to dig deep into announcements to find it.
I've been very dissatisfied with all the free Google options, but this post has given me the kick to see if the bigger models can compete.
I'll find out next billing cycle, but it'd be pretty misleading otherwise, and they'd obviously be incentivized to hook you in with the dark pattern of an "auto-renewing free trial subscription."
Response was that you have to request access, either through 1) your account representative, or 2) "Apply through Google: If you don't have an account representative, you may be able to apply for access through Google's channels (a specific form or process might be available)"
I don't have an account representative, of course (who does?!)
But the really interesting thing is that there was an additional annotation added to option 2, about the other channels, that said "Google Search didn’t find relevant content. Consider researching further to assess the statement." which sounds about right.
So, the model itself seems OK, if Google can figure out how to get out of its own way.
1. I asked it to do an image search and it responded with in-line images throughout its response. This was a nice "wow" moment and very neat.
2. Its drawing/illustration style is different than DALL-E, so I use both.
3. Quality of general text responses is comparable, though I prefer ChatGPT.
I imagine you can probably just use one rather than have both. I am still primarily using ChatGPT.
I just tried asking it again and it seems to work now too.
> Image generation in Gemini Apps is available in most countries, except in the European Economic Area (EEA), Switzerland, and the UK. It’s only available for English prompts.
Came up with 4 images.
You raise a good point, though... I've also asked it to use its "web search" capability for tasks and it says it doesn't have that capability, but when I ask it by implying it should do a web search, it goes ahead and does it. Weird!
For human writing, Ultra is w-a-y beyond what ChatGPT 4 is capable of, IMHO. GPT-4 sounds like an AI wrote it, even if you put a lot of prompting into establishing the right tone. With Ultra, it is getting hard to tell. Which is great! I think of them like people: each has its own opinion and talents in different areas. Given they are $20/month, I am happy to pay for all of them. It takes them about 5 minutes to pay for themselves.
I passed this text through Ultra and this is what I got:
Absolutely! Here's a rephrasing that focuses on a more natural, flowing conversational style while still conveying the core ideas:
I use GitHub Copilot, GPT-4 (and CodyAI), and Gemini Ultra regularly, and I've noticed a big difference in how they approach tasks. For coding, Copilot and GPT-4 are my top picks. However, when it comes to writing that needs to feel truly human, Gemini Ultra is hands-down the winner. It's getting to the point where I often can't tell the difference between its output and something a real writer would create. That's a huge difference from ChatGPT-4, which still can have that telltale "AI feel" even with careful prompting. These tools each have such unique strengths that I find them all valuable – the $20/month investment for each pays off in just minutes.
Changes Made:
* No Visual Cues: Removed formatting elements (bold, bullets) that can scream "AI-assisted".
* Conversational Flow: Uses phrases like "I've noticed a big difference" and "hands-down the winner" to feel more like someone speaking their experience.
* Softened Comparison: Keeps the idea that Ultra surpasses ChatGPT-4 for a certain type of writing, but tones down the strong declaration for a more subjective feel.
Let me know if you want it even more casual or have another specific tone in mind!
It also fails at more complex tasks that I would actually use AIs for. E.g., I'm trying to learn Lagrangian mechanics and keep running into dead ends with Gemini in cases where ChatGPT works.
I have faith that Google will make it better in the near future (unless they get bored and move on to something else) so I'm hoping it will be worth paying for. But right now, I'm going to keep my ChatGPT subscription since that is actually useful.
Math is exact. It's a bad fit for a machine-that-guesses-words.
But you're right that someone who doesn't understand a subject should be wary of trusting LLMs to teach the subject.
But I am a UK iPhone user, so Gemini Advanced is only available to me via a website rather than an app, and there's no image generator. So, for me:
ChatGPT-4 Pro > Copilot Pro > Gemini Advanced
I then asked it to generate a portrait of a woman and it generated a white woman, a woman that to my eye looks East Asian, and a black woman. If you consider a random sample of all skin tones and regions on earth, this distribution seems pretty normal.
I’ve seen some people get very annoyed when people who don’t look like them get represented somewhere. Maybe those people were being extra loud about the representation. Or maybe I’ve not been able to generate the purported weird distributions you have heard of.