Google Gemini Pro API Available Through AI Studio

(ai.google.dev)

185 points · sam1234apter · 2y ago · 109 comments

gregsadetsky2y ago

I used this just-released Gemini Pro API with multimodal input to test some of the things from the infamous Gemini demo. You can see my GPT-4 recreation of that ad, which went viral, here: https://www.youtube.com/watch?v=__nL7Vc0OCg

Gemini Pro is... not great. In one test, I asked what gesture I was making (while showing a thumbs up) -- it said thumbs down and "The image is a commentary on the changing nature of truth".

I just made a head-to-head comparison -- you can watch it here: https://www.youtube.com/watch?v=1RrkRA7wuoE

Code is here: https://github.com/gregsadetsky/sagittarius

dopb2y ago

I think the fair comparison would be GPT3.5 (if image inputs were supported) vs Gemini Pro. It would be great to compare this with Gemini Ultra next year.

chamoda2y ago

Free quota looks reasonable with 60 queries per minute. On the other hand data from free quota requests will be used to improve the product.

https://ai.google.dev/pricing

civilitty2y ago

It’s far more than reasonable, it might be Google’s saving grace.

I wasn’t going to bother even testing Google’s AI products unless everyone started gushing about how much better they are than GPT4 but with 60 free queries per minute? That’s worth exploring even if only to find out shortly that it’s not worth paying for.

matsemann2y ago

Typical Google behavior, where it's cheap/free in the beginning, and when you've built on top of it prices increase drastically? Like gmaps pricing.

summerlight2y ago

This must be a significant investment to pick up the hype; 1 qps cannot be sustainable with a free pricing tier unless their resource efficiency is 10x better than competitors.

tobyjsullivan2y ago

No doubt it’s a loss leader to some degree but, in practice, very, very few customers will have sustained request rates of 60 requests per minute. Their actual usage averaged across all users will be a tiny fraction of that.

For users that get close to that sustained rate, they’re just as likely to exceed it and actually pay.

isalmon2y ago

I know it's just an anecdote, but my biggest problem with Google's Bard/Gemini is that the moment I tried to ask a question about something - I started getting ads all over the internet and social media related to that.

Doing this with ChatGPT 4.0 for months and months did not cause this type of behavior.

rany_2y ago

Does that happen even with Bard Activity turned off? It's kind of silly of Google because the types of queries I would send to Bard are the type that I wouldn't care to see as adverts anyway!

pesfandiar2y ago

I like that they have a "blog post creator"[1] in their examples. There's no hope for the future of the web when the self-proclaimed stewards of its quality encourage AI spam.

[1] https://makersuite.google.com/app/prompts/blog-post-creator

pphysch2y ago

There are legitimate use cases, like "create a press release with these facts".

zeven72y ago

I would take a list of the facts over garbage filler paragraphs any day.

This is a problem with human writers too.

dudus2y ago

This is still not Gemini Ultra. That's the one they said was above state of the art. Still waiting for that one.

sam1234apterOP2y ago

Developers can start building with our first version of Gemini Pro through Google AI Studio at ai.google.dev

Developers have a free quota and access to a full range of features including function calling, embeddings, semantic retrieval, custom knowledge grounding, chat functionality and more. It supports 38 languages across 180+ countries.

georgehill2y ago

> Access restricted You do not have permission to view this page.

Wait only in the US?

Edit: I can access it through the Google Cloud Console.

https://imgur.com/a/NXAgvFb

ianbicking2y ago

Some thoughts comparing this to the GPT API (from a thread: https://hachyderm.io/@ianbicking/111574983914336748):

It looks like a fairly easy swap-in for GPT. "messages" becomes "content". Some of the configuration parameters are slightly different (topP/etc), but I have never put in the effort to understand the practical effect of those so I never tweak their values.

The messages themselves are a list of "parts", which allows mixed media messages. This feels a little cleaner than how GPT has handled messages being extended.

Instead of role: "assistant" they use role: "model". There's no role: "system" – presumably you just shove everything into user messages. You can also leave off the role... and I assume that means default to "user" but it's not clear if it's 100% equivalent...?
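A rough sketch of that mapping in plain JavaScript. `toGeminiContents` is a hypothetical helper, not part of either SDK, and folding system text into the first user turn is only one assumption about how to cope with the missing "system" role.

```javascript
// Sketch: convert OpenAI-style chat messages into Gemini-style "contents".
// Hypothetical helper for illustration; it does not call any API.
function toGeminiContents(messages) {
  // Assumption: there is no "system" role, so collect system text separately.
  const systemText = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");

  // "assistant" becomes "model"; everything else is treated as "user".
  const contents = messages
    .filter((m) => m.role !== "system")
    .map((m) => ({
      role: m.role === "assistant" ? "model" : "user",
      parts: [{ text: m.content }],
    }));

  if (systemText) {
    const firstUser = contents.find((c) => c.role === "user");
    if (firstUser) {
      // Prepend the system text as an extra part of the first user message.
      firstUser.parts.unshift({ text: systemText });
    } else {
      contents.unshift({ role: "user", parts: [{ text: systemText }] });
    }
  }
  return contents;
}
```

Each message ends up as a role plus a list of parts, which is where an image part could sit next to the text.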

There's a bunch of moderation parameters, which seems like a good idea. OpenAI has a moderation endpoint you can use to preflight check your input, but doing it all at once makes more sense. There are four categories and you can adjust your sensitivity to each (and turn off blocking entirely). The sensitivity is not about how extreme the violation is, but how likely it is a violation. So it's not like a G/PG/PG-13/etc rating. Just a question of how many false positives/negatives you want.
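To make the "likelihood, not severity" point concrete, here is a small sketch. The probability levels, threshold names, and category names are the ones that appeared in the Gemini API docs at the time, so treat them as assumptions and check the current reference. `isBlocked` is a hypothetical helper that only models the comparison; it does not call the API.

```javascript
// Probability levels, from least to most likely to be a violation (assumed names).
const PROBABILITIES = ["NEGLIGIBLE", "LOW", "MEDIUM", "HIGH"];

// For each threshold, the lowest probability level that gets blocked (assumed names).
const LOWEST_BLOCKED = {
  BLOCK_NONE: null,
  BLOCK_ONLY_HIGH: "HIGH",
  BLOCK_MEDIUM_AND_ABOVE: "MEDIUM",
  BLOCK_LOW_AND_ABOVE: "LOW",
};

// Hypothetical helper: would content rated `probability` be blocked at `threshold`?
function isBlocked(probability, threshold) {
  if (!PROBABILITIES.includes(probability)) {
    throw new Error(`unknown probability: ${probability}`);
  }
  if (!Object.prototype.hasOwnProperty.call(LOWEST_BLOCKED, threshold)) {
    throw new Error(`unknown threshold: ${threshold}`);
  }
  const lowest = LOWEST_BLOCKED[threshold];
  if (lowest === null) return false; // blocking turned off entirely
  return PROBABILITIES.indexOf(probability) >= PROBABILITIES.indexOf(lowest);
}

// Example per-category settings in the shape the request appears to expect.
const safetySettings = [
  { category: "HARM_CATEGORY_HARASSMENT", threshold: "BLOCK_ONLY_HIGH" },
  { category: "HARM_CATEGORY_DANGEROUS_CONTENT", threshold: "BLOCK_NONE" },
];
```

A stricter threshold means more false positives, a looser one more false negatives, which is the trade-off described above.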

There's functions, though they are in beta (whatever that means): https://ai.google.dev/docs/function_calling – they look very very similar to GPT functions. They don't have the "JSON response" that GPT has, but that seems mostly redundant with functions anyway.

I have no idea how well prompts translate, but it feels like the API is an easy translation. And importantly everything is semantically equivalent, you don't have to make one pretend it is the other, like turning a completion API into a chat API.

Given the generous free tier I feel fairly motivated to swap in Gemini and try to ship experiments that I've sat on until now.

brrrrrm2y ago

why on earth did they design the Node.js and Web APIs to be slightly different and incompatible? (edit: this might just be a bug/oversight on the landing page?)

Node.js:

   const model = genAI.getGenerativeModel({ model: "gemini-pro-vision"});

   const result = model.generateContent({ contents: [{parts: [
      {text: "What’s in this photo?"},
      {inlineData: {data: imgBase64, mimeType: 'image/png'}}
    ] }] })

Web:

    const model = genAI.getGenerativeModel({ model: "gemini-pro-vision"});

    const result = await model.generateContent([
      "What’s in this photo?",
      {inlineData: {data: imgDataInBase64, mimeType: 'image/png'}}
    ]);

miguelramos2y ago

Hey! This is Miguel from Google, working on these SDKs. I'm confused about this comment; both Web and Node.js are the same. Can you clarify where you see the difference?

brrrrrm2y ago

as documented it looks different (async vs sync + the necessity of `text:` annotations in the Node.js version?)

updated my comment to paste in what's written in the docs

miguelramos2y ago

Thanks, that helps to understand the confusion.

The web and Node.js are the same, each of them has many function overloads so users can call functions with simplified or complex arguments, as they prefer. We are going to fix the doc and code snippets so both Web and Node.js consistently show the same code to avoid misunderstandings. Thanks a lot!
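One way to picture those overloads: the simplified forms can be reduced to the full request shape before sending. This is only an illustration of the idea, not the SDK's actual code, and `normalizeRequest` is a hypothetical name.

```javascript
// Hypothetical sketch: reduce the "simplified" argument forms to the full
// { contents: [{ parts: [...] }] } shape shown in the Node.js snippet.
function normalizeRequest(input) {
  // A bare string becomes a single text part.
  if (typeof input === "string") {
    return { contents: [{ parts: [{ text: input }] }] };
  }
  // An array may mix bare strings and ready-made parts such as inlineData.
  if (Array.isArray(input)) {
    const parts = input.map((p) => (typeof p === "string" ? { text: p } : p));
    return { contents: [{ parts }] };
  }
  // Anything else is assumed to already be in the full form.
  return input;
}
```

Under that reading, the Web snippet's array and the Node.js snippet's object describe the same request.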

magemgem2y ago

What do you mean? They look exactly the same to me.

aidabbler2y ago

You may be a host

lovasoa2y ago

You can make 1 query per second to it for free, including large queries that contain images ? This is crazy !

I will happily let google buy me for that price.

https://ai.google.dev/pricing

thedangler2y ago

Wow. what a crap site. I clicked on the option for prompt thinking I could go back and request an API key. Boy was I wrong. No matter what I do it takes me to the prompt console where I get Access Denied and it hijacked my back button.

legendofbrando2y ago

When I try to create an API key it says that "We are sorry, but you do not have access to Early Access Apps" yet my domain does allow access to early access apps....

AlmostSchurLie2y ago

When I try to create an API key, all I see is "an internal error occured". Still waiting for Gemini Ultra though.

vibhajaiman2y ago

Make a questionnaire senior secondary school students and mobile phone impact reply

vibhajaiman2y ago

Make a questionnaire senior secondary school students and mobile phone impact

andre-z2y ago

See how to use new Gemini Embeddings with Qdrant Vector Database https://qdrant.tech/documentation/embeddings/gemini/

SubiculumCode2y ago

I'd like to see this benchmarked on humaneval for coding.

roschdal2y ago

How can I use Google Gemini in a Java application?

magemgem2y ago

https://cloud.google.com/vertex-ai/docs/generative-ai/multim...

prakhar8972y ago

Can someone recreate the Google Demo of gemini?

ziga92y ago

Anyone else having access restricted problem?

zlg_codes2y ago

I'd like to know why the name of this AI product coincides with the alternative in-between-HTTP-and-Gopher Gemini protocol.

I'm sure it's just an accident.

krapp2y ago

"Gemini" is a very common name (being the name of a constellation) which has been used by countless products, companies and endeavors over the years. Almost no one outside of Hacker News and a small core of misanthropic anarchists knows about, much less cares about, the Gemini protocol.

In the case of this specific Gemini, it's apparently the result of there being two teams involved, and it's a reference to the Gemini space program[0].

[0]https://twitter.com/JeffDean/status/1733580264859926941

zlg_codes2y ago

Interestingly, were it trademarked you wouldn't have this snarky attitude about how a name doesn't matter unless people care about it. I'm not talking on other forums right now either, so who cares about outside of HN while talking on HN?

krapp2y ago

¯\_(ツ)_/¯ You asked and I answered, and provided a reference for the answer, which was outside of HN.

And as far as trademarks go, it still wouldn't matter. As I mentioned, countless entities were already using "Gemini" before the protocol was created, and Google's AI isn't even in the same domain.

tanyongsheng2y ago

The pricing is attractive.

replwoacause2y ago

This news doesn’t excite me at all after trying Bard Gemini Pro in the browser.

fotcorn2y ago

I can only access https://makersuite.google.com/ when using a VPN to the US. Also, it spams popups that get blocked by Firefox.

Some basic prompts, which are answered correctly most of the time by ChatGPT4:

There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

> 29 books

Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

> Three

If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

> Under the plate in the kitchen.

So, not great.

tr3ntg2y ago

Since the Maker Suite doesn't have the same prompting as something like Bard or ChatGPT, I gave Gemini Pro a bit more context:

> Consider the following user chat. Reason out loud as needed to provide a reasonable response. User message: "Julia has three brothers, each of them has two sisters. How many sisters does Julia have?"

Gemini Pro:

> Julia's brothers have nothing to do with how many sisters she has. Julia's sisters are also her brothers' sisters, so the number of sisters Julia has is the same as the number of sisters her brothers have. Julia's brothers have two sisters each, so if we add up the number of sisters for each brother, we get: 2 + 2 + 2 = 6. Therefore, Julia has 6 sisters. The answer is 6.

staticman22y ago

Mixtral 8x7b only gets the sister question correct when I ask it to show its work, but also shows that it's aware of the brain teaser:

https://poe.com/s/5gtLvuWT52o9LEBAJouo

GPT 4 gets it right.

GPT 3.5 gets it hilariously wrong if I change the question slightly:

User: Julia has 8 brothers, each of them has 5 sisters. How many sisters does Julia have?

Assistant:

Julia has 8 brothers, and each of them has 5 sisters. Since Julia is one of those sisters, she has 8 brothers + 1 (herself) = 9 sisters.

https://chat.openai.com/share/3a49800f-7747-44d9-90e7-c1e037...

Also on 3.5 I tried this:

User: Julia has 8 brothers, each of them has 5 sisters. How many sisters does Julia have? Show your work step by step.

Assistant: Julia has 8 brothers, and each of them has 5 sisters, including Julia herself. So, Julia is one of those 5 sisters for each of her 8 brothers.

To calculate how many sisters Julia has, you can simply multiply the number of brothers by the number of sisters each brother has:

8 brothers * 5 sisters per brother = 40 sisters

Therefore, Julia has 40 sisters in total.

angm1282y ago

Mistral 7B (mistral-7b-instruct-v0.2.Q6_K.gguf): Let's break down this problem step by step. First, we know that Julia has three brothers. Then it is stated that each of her brothers has two sisters. This means there are two sisters for each brother. Since Julia is one of the sisters for her brothers, she also counts as one of their sisters. However, when counting the number of sisters for herself, she should not be included in the count. Therefore, Julia has three brothers and two sisters (excluding herself), so she has a total of five sisters.

hatsunearu2y ago

Isn't 6 still incorrect? Or am I dumb?

tr3ntg2y ago

This made me laugh. It's very wrong. Unless I'm dumb, the answer we're looking for should be 1.

Sounds like the confidence of Gemini worked on you, though.

vasco2y ago

The sum is incorrect and the statement of "Julia's sisters are also her brothers' sisters, so the number of sisters Julia has is the same as the number of sisters her brothers have" is also incorrect. The brothers each have one more sister than Julia, since she is one of the sisters so she only has one sister. It's just reworded but still wrong.

skywhopper2y ago

You're not dumb. But it's an ambiguous question as written. Presumably, we are to assume that all siblings involved are still alive, that all are full siblings (i.e. no half- or step-siblings), that everyone is cisgender, and that Julia is female. In which case, Julia would have one sister. Each of her three brothers would have two sisters, one of whom was Julia herself.
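Under those assumptions the arithmetic is tiny, and the number of brothers never enters into it. A sketch, with `sistersOfJulia` as a made-up name:

```javascript
// Every brother's sisters are all the girls in the family, Julia included,
// so the number of girls equals "sisters per brother".
function sistersOfJulia(sistersPerBrother) {
  const girlsInFamily = sistersPerBrother;
  return girlsInFamily - 1; // Julia does not count herself
}
```

The same function covers the "8 brothers, 5 sisters each" variant tried elsewhere in the thread.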

shawabawa32y ago

Yes, making it reason made it even more wrong

ajhai2y ago

Gemini Pro compares to GPT 3.5.. their biggest model that competes with GPT-4 is called Gemini Ultra and they say that's coming early next year.

isoprophlex2y ago

Asking these to GPT3.5 has been an utterly frustrating experience, lol. I guess gemini is at this level of intelligence right now, not GPT4... rigged demos notwithstanding;)

FergusArgyll2y ago

I asked this to mini-orca 3b and here was its brilliant answer.

> If you read 2 books over the weekend, then there are 31 books in your house. However, if you only read one book, then there would be only 25 books left in your house.

BlindEyeHalo2y ago

No joke, even when trying to correct GPT3.5 it still gives a nonsense answer:

> I apologize for any confusion caused by my previous response. Reading a book doesn't physically remove it from your house. The assumption I made was a misunderstanding. If you read a book, it is still in your house unless you lend it, give it away, or otherwise remove it.

> So, if you started with 31 books and read 2 over the weekend, you would have 31 - 2 = 29 books still in your house.

la647102y ago

To the second question chatGPT answered: The orange is in the kitchen, as that is where the plate, along with the orange, was moved.

Now a human can reason because a human can visualize in their mind that a plate is being moved without moving the orange. How is that kind of reasoning possible in LLMs, which in my very basic understanding are just predicting the next token?

tagraves2y ago

Here's the response chatGPT 4 gave for me:

> The orange remains in the living room. Moving the plate to the kitchen does not affect the location of the orange, since it was placed below the plate but not attached to it. Therefore, the orange stays where it was originally placed, which is in the living room.

You don't need to visualize it in your mind to understand the relationship between being _below_ and being _moved with_. Keep in mind that many people cannot visualize anything in their mind!

la647102y ago

I have wondered how do people blind from birth create their mental world? Is it all dark with no color or light and only sound? No shapes? Or do they still form mental images from non visual sensory inputs?

luma2y ago

Make sure you're including the GPT version, there's a wide gulf between GPT4 and prior models. GPT-4 answers:

Q. If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

A. The orange remains in the living room. Moving the plate to the kitchen does not affect the position of the orange.

----

Q. There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

A. There are still 31 books in your house. Reading books does not change their quantity

----

Q. Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

A. Julia has one sister. Each of her three brothers has two sisters, and Julia is one of them. The other one is her sister, making it a total of one sister for Julia.

lern_too_spel2y ago

In order to predict next tokens in text with greater and greater accuracy, some parameters of the language model must represent a model of the world.

amf122y ago

FWIW, Gemini Pro is equivalent to GPT 3.5, so expected

EZ-E2y ago

GPT4 gets the second question wrong for me, but in a weird way.

> Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

Julia has two sisters. Each brother having two sisters refers to the same two sisters, Julia and another sister.

Then if I ask to verify its answer, it realizes the issue.

> Apologies for any confusion. If Julia has three brothers and each of them has two sisters, then Julia is one of the sisters. The other sister is common among all three brothers. Therefore, Julia has only one sister.

ryoshu2y ago

Mistral 7b gets the first two wrong, but gets the third one right: "The orange is still in the living room. When you moved the plate, you didn't move the orange. It remains in its original location until you pick it up and move it to a different place"

strobe2y ago

mistralai/Mixtral-8x7B-Instruct-v0.1 responses - which seems much better:

>>There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

You still have 31 books in your house. The number of books doesn't change because you finished reading two of them; they're just no longer unread.

>>Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

Julia has one sister.

Here's the logic behind this:

1. We know that Julia has three brothers.
2. Then we are told that each of her brothers has two sisters.
3. Since all four children (Julia and her brothers) share the same parents, they also share the same siblings.
4. Therefore, if each of Julia's brothers has two sisters, then there must be only one other sister besides Julia herself.

>> If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

The orange is still in the living room, as moving the plate does not affect the location of the orange.

robertlagrant2y ago

Wow - that is good.

dhoe2y ago

Great questions. Gpt4 got the second wrong in my test, but corrected itself when I told it to review for errors. It got the other two right at first try.

vbezhenar2y ago

GPT4 insisted on wrong answer in the second test: https://chat.openai.com/share/301cc597-929b-43d6-9685-2aad3e...

kuratkull2y ago

My attempt had GPT4 fail on Julia, it was adamant that Julia has two sisters. It took a couple of back and forths for it to understand.

bool3max2y ago

Why would you expect an LLM to answer those questions correctly?

imdsm2y ago

Typical Google UX.

Get API key, takes me to makersuite, where I get a create API key button that errors. Then when I reload the page, I get a straight forbidden page.

HP said it best, you have to isolate the team from the bigger company to allow them to work as an effective startup. How can solo-preneurs provide better UX & onboarding while doing 16 other jobs than Google can with multi-billion dollar budgets?

epolanski2y ago

You reminded me of how much hatred I had for Google binding all their products' language (including Workspaces) to my account language, with no chance to change it (even if I updated the account settings).

How can they be so unaware of the fact that people will often prefer english because that's the language with most tutorials/guides/resources and makes interoperability in cross-country remote companies simpler?

Don't they want to sell cloud products to global companies?

How am I supposed to help or receive help from my coworkers?

I have lost days and days trying to set Google Sheets in English and I have been stuck with the Italian version no matter how many changes I did to my Sheets or Google account settings.

There's a 5000+ comments/upvotes discussion on their forums and they simply don't give two damns, I don't think humans even see those threads.

I haven't felt so stressed using software since programming in Liferay professionally or trying to figure out Autodesk products a decade ago for hobby 3D modelling.

slig2y ago

Their language handling was fubar'd years ago. There was a time that you could open the Google.com home page passing some query strings and it would let you search in English with non localized results BS. Years later, and Google/YouTube don't care. YouTube's search is abysmal and will shove shitty results that have nothing to do with your search and the language that your Google account is set up.

unsupp0rted2y ago

Ugh Google Flights.

Hey maybe try using, I dunno, the currency of my logged in account / country? Or the currency I used last time? Or the currency of my VPN country?

Just because I’m traveling in Uzbekistan for a few days doesn’t mean I want to buy tickets denominated in Uzbek Som.

jgilias2y ago

It’s not about budgets, it’s about incentives and the Dunbar limit. A solo-preneur, or a small organisation has all the incentives to have great onboarding experiences. And they’re small enough for everyone to know each other and have a coherent vision.

At a huge organisation like Google there’s a plethora of people all working according to their incentives. These organisations pay lip service to customer experience and excellence, but when push comes to shove it’s the flashy projects that get rewarded.

So nobody has a great overview of everything going on to the necessary level of detail, and nobody actually cares about onboarding UX.

Microsoft has the same problem, I want to shoot myself every time I need to get something done for a kid’s account on Xbox.

skywhopper2y ago

> How can solo-preneurs provide better UX & onboarding while doing 16 other jobs than Google can

Solo-preneurs don't have to spend time coordinating with the 39 other people assigned to the project plus the 20 execs hovering over their shoulder, all with different ideas and agendas about what the product should be, while navigating the politics to get appropriate resources assigned without upsetting the hundreds of other folks with competing interests.

kossTKR2y ago

This reminds me of the year i had to use GCloud/Ads/Analytics and their respective interfaces which seemingly had no connections and changed randomly hour to hour.

Simple features so bizarrely complex and hidden that turning an instance off, checking visitors for a page or other trivial matters seriously required hiring a full time "google interface understander", i kid you not.

Kafka on steroids.

cccybernetic2y ago

Similar thing just happened to me.

Click on link > Get API Key > "We are sorry, but you do not have access to Early Access Apps"

Since I'm the admin, I checked and confirmed that I do have access. It's enabled for both my account and the entire organization.

Whatever.

witort2y ago

I had the same problem until I also enabled "Core Data Access Permissions"

code512y ago

I know and agree about the UX problem.

But I wanted to note API key generation works when you pick an existing Google Cloud project.

nextworddev2y ago

GCP console’s UX is somehow worse than AWS’s, which is pretty crazy

verdverm2y ago

Cross posting some links from another post that HNers found helpful

- https://cloud.google.com/vertex-ai (marketing page)

- https://cloud.google.com/vertex-ai/docs (docs entry point)

- https://console.cloud.google.com/vertex-ai (cloud console)

- https://console.cloud.google.com/vertex-ai/model-garden (all the models)

- https://console.cloud.google.com/vertex-ai/generative (studio / playground)

VertexAI is the umbrella for all of the Google models available through their cloud platform.

You want the last link if you are looking for a ChatGPT like experience, with the ability to also adjust the parameters, so more like a UI on top of the API

pvg2y ago

Just link your other comment rather than repaste. One reason is it makes merging related threads harder.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

verdverm2y ago

It is not a direct copy and paste, the other words around it are contextualized to the posts, which I do not expect to be merged, as they are different stories (language model vs image model). Having to make fewer clicks is also typically appreciated.

alexb_2y ago

When I enter into the AI, Firefox blocks an insane amount of popups. The counter for blocked pop ups quickly reaches >100 where it stops counting. What is it trying to do?

werdnapk2y ago

It might just be trying the same pop up over and over again each time it's blocked

skywhopper2y ago

Get you to use Chrome.

behnamoh2y ago

Doesn't matter—it's already available on Bard and it's not good.

georgehill2y ago

Bard? It is powered by Palm-2, not Gemini.

chamoda2y ago

As of now Bard is powered by Gemini Pro.

https://bard.google.com/updates

georgehill2y ago

Okay, from my end, I am still seeing Palm (EU).

debugnik2y ago

In my language/country I see no mention of Gemini there. Or any model name, really.

martythemaniak2y ago

This is very good:

- 60 queries per minute free

- about 1/5th the price of GPT3.5 Turbo

- priced per char, not per token

- same image pricing as GPT4 150x150

mil222y ago

60 QPM free is great, but the pay-as-you-go pricing is the same.

Courtesy of GPT4:

"To determine which option is cheaper, Gemini Pro or GPT-3.5 Turbo, we need to consider the average length difference between tokens and characters and the pricing structure for each.

Gemini Pro Pricing:

- Input: $0.00025 per 1,000 characters

- Output: $0.0005 per 1,000 characters

GPT-3.5 Turbo Pricing:

- Input: $0.0010 per 1,000 tokens

- Output: $0.0020 per 1,000 tokens

Average Length Difference Between Tokens and Characters: A token in GPT-3.5 can be a single word, part of a word, or a special character. On average, a token in GPT-3 models is roughly equivalent to 4 characters (this is a rough estimate as the length of tokens can vary significantly).

Given this information, let's calculate the effective cost per 1,000 characters for both Gemini Pro and GPT-3.5 Turbo. For GPT-3.5 Turbo, since 1 token ≈ 4 characters, the cost per 1,000 characters would be a quarter of the cost per 1,000 tokens. We'll calculate the cost for both input and output for each and compare.

The total cost per 1,000 characters for both Gemini Pro and GPT-3.5 Turbo, considering both input and output, is the same at $0.00075. Therefore, based on the provided pricing and the average token-to-character ratio, they are equally cost-effective."
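The arithmetic in that answer is easy to check by hand. A short sketch using only the prices quoted above and the rough 4-characters-per-token estimate, which is itself an assumption that varies with language and content:

```javascript
// Rough estimate from the quoted answer; real tokenization varies.
const CHARS_PER_TOKEN = 4;

// Quoted prices in USD.
const geminiPro = { input: 0.00025, output: 0.0005 }; // per 1,000 characters
const gpt35Turbo = { input: 0.001, output: 0.002 }; // per 1,000 tokens

// Convert a per-1,000-token price into a per-1,000-character price.
function perThousandChars(usdPerThousandTokens, charsPerToken) {
  return usdPerThousandTokens / charsPerToken;
}

const geminiTotal = geminiPro.input + geminiPro.output;
const gptTotal =
  perThousandChars(gpt35Turbo.input, CHARS_PER_TOKEN) +
  perThousandChars(gpt35Turbo.output, CHARS_PER_TOKEN);

console.log(geminiTotal.toFixed(5), gptTotal.toFixed(5));
```

Both totals come out equal at this ratio, so the comparison hinges entirely on how many characters a token averages for your text.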

WiSaGaN2y ago

I am wondering why it would price them in characters but not tokens? Are they processing characters directly as tokens without tokenizer?

abeshkek9192y ago

Maybe because it's easier this way to estimate the data size before you send it to the API.
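If that is the reason, the estimate needs nothing but the string itself. A minimal sketch, assuming characters are counted as Unicode code points (how the billing side actually counts is not stated in this thread); `estimateCharCost` is a hypothetical helper:

```javascript
// Hypothetical helper: estimate the USD cost of a text under per-character pricing.
function estimateCharCost(text, usdPerThousandChars) {
  // Array.from counts Unicode code points, not UTF-16 code units.
  const charCount = Array.from(text).length;
  return (charCount / 1000) * usdPerThousandChars;
}
```

With per-token pricing the same estimate would need the provider's tokenizer.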

yeldarb2y ago

We put the image portion through its paces and compared it with GPT-V here: https://blog.roboflow.com/first-impressions-with-google-gemi...

theusus2y ago

We have GPT 5 ready?

_Algernon_2y ago

*4V

yeldarb2y ago

OpenAI calls it GPT-V https://help.openai.com/en/articles/8555496-gpt-v-api

j / k navigate · click thread line to collapse

109 comments

gregsadetsky2y ago

Gemini Pro is... not great. In one test, I asked what gesture I was making (while showing a thumbs up) -- it said thumbs down and "The image is a commentary on the changing nature of truth".

I just just made a heads-to-heads comparison -- you can watch it here: https://www.youtube.com/watch?v=1RrkRA7wuoE

Code is here: https://github.com/gregsadetsky/sagittarius

dopb2y ago

I think the fair comparison would be GPT3.5 (if image inputs were supported) vs Gemini Pro. It would be great to compare this with Gemini Ultra next year.

chamoda2y ago

Free quota looks reasonable with 60 queries per minute. On the other hand data from free quota requests will be used to improve the product.

https://ai.google.dev/pricing

civilitty2y ago

It’s far more than reasonable, it might be Google’s saving grace.

matsemann2y ago

Typical Google behavior, where it's cheap/free in the beginning, and when you've built on top of it prices increases drastically? Like gmaps pricing.

summerlight2y ago

This must be a significant investment to pick up the hype; 1 qps cannot be sustainable with a free pricing tier unless their resource efficiency is 10x better than competitors.

tobyjsullivan2y ago

For users that get close to that sustained rate, they’re just as likely to exceed it and actually pay.

isalmon2y ago

Doing this with ChatGPT 4.0 for months and months did not cause this type of behavior.

rany_2y ago

Does that happen even with Bard Activity turned off? It's kind of silly of Google because the types of queries I would send to Bard are the type that I wouldn't care to see as adverts anyway!

pesfandiar2y ago

I like that they have a "blog post creator"[1] in their examples. There's no hope for the future of the web when the self-proclaimed stewards of its quality encourage AI spam.

[1] https://makersuite.google.com/app/prompts/blog-post-creator

pphysch2y ago

There are legitimate use cases, like "create a press release with these facts".

zeven72y ago

I would take a list of the facts over garbage filler paragraphs any day.

This is a problem with human writers too.

dudus2y ago

This is still not Gemini Ultra. That's the one they said was above state of the art. Still waiting for that one.

sam1234apterOP2y ago

Developers can start building with our first version of Gemini Pro through Google AI Studio at ai.google.dev

georgehill2y ago

> Access restricted You do not have permission to view this page.

Wait only in the US?

Edit: I can access it through the Google Cloud Console.

https://imgur.com/a/NXAgvFb

ianbicking2y ago

Some thoughts comparing this to the GPT API (from a thread: https://hachyderm.io/@ianbicking/111574983914336748):

The messages themselves are a list of "parts", which allows mixed media messages. This feels a little cleaner than how GPT has handled messages being extended.

Given the generous free tier I feel fairly motivated to swap in Gemini and try to ship experiments that I've sat on until now.

brrrrrm2y ago

why on earth did they design the Node.js and Web APIs to be slightly different and incompatible? (edit: this might just be a bug/oversight on the landing page?)

Node.js:

    const model = genAI.getGenerativeModel({ model: "gemini-pro-vision"});

    const result = model.generateContent({ contents: [{parts: [
      {text: "What’s in this photo?"},
      {inlineData: {data: imgBase64, mimeType: 'image/png'}}
    ] }] })

Web:

    const model = genAI.getGenerativeModel({ model: "gemini-pro-vision"});

    const result = await model.generateContent([
      "What’s in this photo?",
      {inlineData: {data: imgDataInBase64, mimeType: 'image/png'}}
    ]);

miguelramos2y ago

Hey! This is Miguel from Google, working on these SDKs. I'm confused by this comment; both Web and Node.js are the same. Can you clarify where you see the difference?

brrrrrm2y ago

as documented it looks different (async vs sync + the necessity of `text:` annotations in the Node.js version?)

updated my comment to paste in what's written in the docs
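For what it's worth, the two documented call shapes can describe the same request. The sketch below shows one way the shorthand array form could be expanded into the long `contents`/`parts` form. The function `toContents` is hypothetical and written for illustration only; it is not taken from the SDK source, and whether the SDK actually normalizes input this way is an assumption.

```javascript
// Hypothetical normalizer (NOT from the SDK): expands the shorthand forms
// into the long { contents: [{ parts: [...] }] } form.
function toContents(input) {
  if (typeof input === "string") {
    return { contents: [{ parts: [{ text: input }] }] };
  }
  if (Array.isArray(input)) {
    // Bare strings become { text } parts; objects are assumed to be parts already.
    const parts = input.map((p) => (typeof p === "string" ? { text: p } : p));
    return { contents: [{ parts }] };
  }
  return input; // assumed to be in the long form already
}

const image = { inlineData: { data: "iVBORw==", mimeType: "image/png" } };

const shortForm = toContents(["What's in this photo?", image]);
const longForm = toContents({
  contents: [{ parts: [{ text: "What's in this photo?" }, image] }],
});

console.log(JSON.stringify(shortForm) === JSON.stringify(longForm));
```

Under that assumption, the remaining difference in the documented snippets would be the missing `await`, not the request shape.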

miguelramos2y ago

Thanks, that helps to understand the confusion.

magemgem2y ago

What do you mean? They look exactly the same to me.

aidabbler2y ago

You may be a host

lovasoa2y ago

You can make 1 query per second to it for free, including large queries that contain images? This is crazy!

I will happily let Google buy me for that price.

https://ai.google.dev/pricing
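To stay under a 60-requests-per-minute quota from the client side, one simple approach is to space calls at least one second apart. The sketch below is a minimal illustration of that idea; the class `MinIntervalLimiter` and its `reserve` method are hypothetical names, and how the service actually enforces the quota (for example, whether short bursts are allowed) is not stated in this thread.

```javascript
// Hypothetical client-side limiter: spaces requests evenly so that no more
// than `requestsPerMinute` are started in any minute.
class MinIntervalLimiter {
  constructor(requestsPerMinute) {
    this.intervalMs = 60000 / requestsPerMinute; // 60 per minute -> 1000 ms apart
    this.nextSlot = 0; // earliest time (ms) at which the next request may start
  }

  // Given the current time in ms, returns how long the caller should wait
  // before sending, and books that slot.
  reserve(now) {
    const start = Math.max(now, this.nextSlot);
    this.nextSlot = start + this.intervalMs;
    return start - now;
  }
}

const limiter = new MinIntervalLimiter(60);
const waits = [limiter.reserve(0), limiter.reserve(0), limiter.reserve(500)];
console.log(waits);
```

In real use, the caller would pass `Date.now()` and sleep for the returned number of milliseconds before issuing the request.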

thedangler2y ago

legendofbrando2y ago

When I try to create an API key it says that "We are sorry, but you do not have access to Early Access Apps" yet my domain does allow access to early access apps....

AlmostSchurLie2y ago

When I try to create an API key, all I see is "an internal error occured". Still waiting for Gemini Ultra though.

vibhajaiman2y ago

Make a questionnaire senior secondary school students and mobile phone impact reply

andre-z2y ago

See how to use new Gemini Embeddings with Qdrant Vector Database https://qdrant.tech/documentation/embeddings/gemini/

SubiculumCode2y ago

I'd like to see this benchmarked on humaneval for coding.

roschdal2y ago

How can I use Google Gemini in a Java application?

magemgem2y ago

https://cloud.google.com/vertex-ai/docs/generative-ai/multim...

prakhar8972y ago

Can someone recreate the Google demo of Gemini?

ziga92y ago

Anyone else having the "access restricted" problem?

zlg_codes2y ago

I'd like to know why the name of this AI product coincides with the alternative in-between-HTTP-and-Gopher Gemini protocol.

I'm sure it's just an accident.

krapp2y ago

In the case of this specific Gemini, it's apparently the result of there being two teams involved, and it's a reference to the Gemini space program[0].

[0]https://twitter.com/JeffDean/status/1733580264859926941

zlg_codes2y ago

krapp2y ago

¯\_(ツ)_/¯ You asked and I answered, and provided a reference for the answer, which was outside of HN.

And as far as trademarks go, it still wouldn't matter. As I mentioned, countless entities were already using "Gemini" before the protocol was created, and Google's AI isn't even in the same domain.

tanyongsheng2y ago

The pricing is attractive.

replwoacause2y ago

This news doesn’t excite me at all after trying Bard Gemini Pro in the browser.

fotcorn2y ago

I can only access https://makersuite.google.com/ when using a VPN to the US. Also, it spams popups that get blocked by Firefox.

Some basic prompts, which are answered correctly most of the time by ChatGPT4:

There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

> 29 books

Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

> Three

If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

> Under the plate in the kitchen.

So, not great.

tr3ntg2y ago

Since the Maker Suite doesn't have the same prompting as something like Bard or ChatGPT, I gave Gemini Pro a bit more context:

Gemini Pro:

staticman22y ago

Mixtral 8x7b only gets the sister question correct when I ask it to show its work, but it also shows that it's aware of the brain teaser:

https://poe.com/s/5gtLvuWT52o9LEBAJouo

GPT 4 gets it right.

GPT 3.5 gets it hilariously wrong if I change the question slightly:

User: Julia has 8 brothers, each of them has 5 sisters. How many sisters does Julia have?

Assistant:

Julia has 8 brothers, and each of them has 5 sisters. Since Julia is one of those sisters, she has 8 brothers + 1 (herself) = 9 sisters.

https://chat.openai.com/share/3a49800f-7747-44d9-90e7-c1e037...

Also on 3.5 I tried this:

User: Julia has 8 brothers, each of them has 5 sisters. How many sisters does Julia have? Show your work step by step.

Assistant: Julia has 8 brothers, and each of them has 5 sisters, including Julia herself. So, Julia is one of those 5 sisters for each of her 8 brothers.

To calculate how many sisters Julia has, you can simply multiply the number of brothers by the number of sisters each brother has:

8 brothers * 5 sisters per brother = 40 sisters

Therefore, Julia has 40 sisters in total.

angm1282y ago

hatsunearu2y ago

Isn't 6 still incorrect? Or am I dumb?

tr3ntg2y ago

This made me laugh. It's very wrong. Unless I'm dumb, the answer we're looking for should be 1.

Sounds like the confidence of Gemini worked on you, though.

vasco2y ago

skywhopper2y ago

shawabawa32y ago

Yes, making it reason made it even more wrong

ajhai2y ago

Gemini Pro compares to GPT 3.5. Their biggest model, the one that competes with GPT-4, is called Gemini Ultra, and they say that's coming early next year.

isoprophlex2y ago

Asking these of GPT3.5 has been an utterly frustrating experience, lol. I guess Gemini is at this level of intelligence right now, not GPT4... rigged demos notwithstanding ;)

FergusArgyll2y ago

I asked this to mini-orca 3b and here was its brilliant answer.

> If you read 2 books over the weekend, then there are 31 books in your house. However, if you only read one book, then there would be only 25 books left in your house.

BlindEyeHalo2y ago

No joke, even when I try to correct GPT3.5, it still gives a nonsense answer:

> So, if you started with 31 books and read 2 over the weekend, you would have 31 - 2 = 29 books still in your house.

la647102y ago

To the second question ChatGPT answered: The orange is in the kitchen, as that is where the plate, along with the orange, was moved.

tagraves2y ago

Here's the response chatGPT 4 gave for me:

You don't need to visualize it in your mind to understand the relationship between being _below_ and being _moved with_. Keep in mind that many people cannot visualize anything in their mind!

la647102y ago

luma2y ago

Make sure you're including the GPT version, there's a wide gulf between GPT4 and prior models. GPT-4 answers:

Q. If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

A. The orange remains in the living room. Moving the plate to the kitchen does not affect the position of the orange.

----

Q. There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

A. There are still 31 books in your house. Reading books does not change their quantity

----

Q. Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

A. Julia has one sister. Each of her three brothers has two sisters, and Julia is one of them. The other one is her sister, making it a total of one sister for Julia.
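The reasoning behind the expected answer can be written out as a tiny worked example. This is only a sketch of the arithmetic, and it assumes the usual reading of the puzzle: all siblings share the same parents, and Julia is one of the sisters her brothers have. The function name `sistersOfJulia` is made up for illustration.

```javascript
// ASSUMPTION: every sibling shares the same parents, and Julia is female,
// so each brother's sisters are exactly the girls in the family.
function sistersOfJulia(sistersPerBrother) {
  const girlsInFamily = sistersPerBrother; // includes Julia herself
  return girlsInFamily - 1; // Julia is not her own sister
}

// The number of brothers never enters the calculation.
console.log(sistersOfJulia(2)); // original puzzle: each brother has two sisters
console.log(sistersOfJulia(5)); // the "8 brothers, 5 sisters" variant
```

So the original puzzle gives 1 and the variant gives 4, neither of which is 9 or 40.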

lern_too_spel2y ago

In order to predict next tokens in text with greater and greater accuracy, some parameters of the language model must represent a model of the world.

amf122y ago

FWIW, Gemini Pro is equivalent to GPT 3.5, so expected

EZ-E2y ago

GPT4 gets the second question wrong for me, but in a weird way.

> Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

Julia has two sisters. Each brother having two sisters refers to the same two sisters, Julia and another sister.

Then if I ask to verify its answer, it realizes the issue.

ryoshu2y ago

strobe2y ago

mistralai/Mixtral-8x7B-Instruct-v0.1 responses, which seem much better:

>>There are 31 books in my house. I read 2 books over the weekend. How many books are still in my house?

You still have 31 books in your house. The number of books doesn't change because you finished reading two of them; they're just no longer unread.

>>Julia has three brothers, each of them has two sisters. How many sisters does Julia have?

Julia has one sister.

Here's the logic behind this:

>> If you place an orange below a plate in the living room, and then move the plate to the kitchen, where is the orange now?

The orange is still in the living room, as moving the plate does not affect the location of the orange.

robertlagrant2y ago

Wow - that is good.

dhoe2y ago

Great questions. GPT-4 got the second wrong in my test, but corrected itself when I told it to review for errors. It got the other two right on the first try.

vbezhenar2y ago

GPT4 insisted on wrong answer in the second test: https://chat.openai.com/share/301cc597-929b-43d6-9685-2aad3e...

kuratkull2y ago

My attempt had GPT4 fail on Julia; it was adamant that Julia has two sisters. It took a couple of back-and-forths for it to understand.

bool3max2y ago

Why would you expect an LLM to answer those questions correctly?

imdsm2y ago

Typical Google UX.

Get API key, takes me to makersuite, where I get a create API key button that errors. Then when I reload the page, I get a straight forbidden page.

epolanski2y ago

Don't they want to sell cloud products to global companies?

How am I supposed to help or receive help from my coworkers?

I have lost days and days trying to set Google Sheets to English, and I have been stuck with the Italian version no matter how many changes I made to my Sheets or Google account settings.

There's a 5000+ comments/upvotes discussion on their forums and they simply don't give two damns, I don't think humans even see those threads.

I haven't felt this stressed using software since programming in Liferay professionally or trying to figure out Autodesk products a decade ago for hobby 3D modelling.

slig2y ago

unsupp0rted2y ago

Ugh Google Flights.

Hey maybe try using, I dunno, the currency of my logged in account / country? Or the currency I used last time? Or the currency of my VPN country?

Just because I’m traveling in Uzbekistan for a few days doesn’t mean I want to buy tickets denominated in Uzbek Som.

jgilias2y ago

So nobody has a great overview of everything going on to the necessary level of detail, and nobody actually cares about onboarding UX.

Microsoft has the same problem, I want to shoot myself every time I need to get something done for a kid’s account on Xbox.

skywhopper2y ago

> How can solo-preneurs provide better UX & onboarding while doing 16 other jobs than Google can

kossTKR2y ago

This reminds me of the year I had to use GCloud/Ads/Analytics and their respective interfaces, which seemingly had no connections and changed randomly hour to hour.

Kafka on steroids.

cccybernetic2y ago

Similar thing just happened to me.

Click on link > Get API Key > "We are sorry, but you do not have access to Early Access Apps"

Since I'm the admin, I checked and confirmed that I do have access. It's enabled for both my account and the entire organization.

Whatever.

witort2y ago

I had the same problem until I also enabled "Core Data Access Permissions"

code512y ago

I know and agree about the UX problem.

But I wanted to note API key generation works when you pick an existing Google Cloud project.

nextworddev2y ago

GCP console’s UX is somehow worse than AWS’s, which is pretty crazy

verdverm2y ago

Cross posting some links from another post that HNers found helpful

- https://cloud.google.com/vertex-ai (marketing page)

- https://cloud.google.com/vertex-ai/docs (docs entry point)

- https://console.cloud.google.com/vertex-ai (cloud console)

- https://console.cloud.google.com/vertex-ai/model-garden (all the models)

- https://console.cloud.google.com/vertex-ai/generative (studio / playground)

VertexAI is the umbrella for all of the Google models available through their cloud platform.

You want the last link if you are looking for a ChatGPT-like experience with the ability to also adjust the parameters, so more like a UI on top of the API.

pvg2y ago

Just link your other comment rather than repaste. One reason is it makes merging related threads harder.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

verdverm2y ago

alexb_2y ago

When I enter into the AI, Firefox blocks an insane amount of popups. The counter for blocked pop ups quickly reaches >100 where it stops counting. What is it trying to do?

werdnapk2y ago

It might just be trying the same pop up over and over again each time it's blocked

skywhopper2y ago

Get you to use Chrome.

behnamoh2y ago

Doesn't matter—it's already available on Bard and it's not good.

georgehill2y ago

Bard? It is powered by Palm-2, not Gemini.

chamoda2y ago

As of now Bard is powered by Gemini Pro.

https://bard.google.com/updates

georgehill2y ago

Okay, from my end, I am still seeing Palm (EU).

debugnik2y ago

In my language/country I see no mention of Gemini there. Or any model name, really.

martythemaniak2y ago

This is very good:

- 60 queries per minute free

- about 1/5th the price of GPT3.5 Turbo

- priced per char, not per token

- same image pricing as GPT4 150x150

mil222y ago

60 QPM free is great, but the pay-as-you-go pricing is the same.

Courtesy of GPT4:

"To determine which option is cheaper, Gemini Pro or GPT-3.5 Turbo, we need to consider the average length difference between tokens and characters and the pricing structure for each.

Gemini Pro Pricing:

- Input: $0.00025 per 1,000 characters

- Output: $0.0005 per 1,000 characters

GPT-3.5 Turbo Pricing:

- Input: $0.0010 per 1,000 tokens

- Output: $0.0020 per 1,000 tokens
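To put the per-character and per-token prices on the same footing, the comparison can be sketched as below. The prices are the ones quoted above; the figure of roughly 4 characters per token is an assumption (a common rule of thumb for English text, not something stated in this thread), and the result shifts with the language and content being processed.

```javascript
// Prices quoted in the thread, in USD.
const GEMINI_PRO = { inputPer1kChars: 0.00025, outputPer1kChars: 0.0005 };
const GPT35_TURBO = { inputPer1kTokens: 0.001, outputPer1kTokens: 0.002 };

// ASSUMPTION: about 4 characters per token for English text. This ratio is
// not from the thread and varies by language and tokenizer.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Converts a per-1,000-characters price into a per-1,000-tokens equivalent.
function perThousandTokens(pricePer1kChars, charsPerToken = ASSUMED_CHARS_PER_TOKEN) {
  return pricePer1kChars * charsPerToken;
}

const geminiInput = perThousandTokens(GEMINI_PRO.inputPer1kChars);
const geminiOutput = perThousandTokens(GEMINI_PRO.outputPer1kChars);

console.log("input :", geminiInput.toFixed(4), "vs", GPT35_TURBO.inputPer1kTokens.toFixed(4));
console.log("output:", geminiOutput.toFixed(4), "vs", GPT35_TURBO.outputPer1kTokens.toFixed(4));
```

Under the 4-characters-per-token assumption the two price lists come out equal, which matches the claim that the pay-as-you-go pricing is the same. Text that averages fewer characters per token would make the per-character pricing cheaper by comparison.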

WiSaGaN2y ago

I wonder why they price by characters rather than tokens. Are they processing characters directly, without a tokenizer?

abeshkek9192y ago

Maybe because it's easier this way to estimate the data size before you send it to the API.

yeldarb2y ago

We put the image portion through its paces and compared it with GPT-V here: https://blog.roboflow.com/first-impressions-with-google-gemi...

theusus2y ago

We have GPT 5 ready?

_Algernon_2y ago

*4V

yeldarb2y ago

OpenAI calls it GPT-V https://help.openai.com/en/articles/8555496-gpt-v-api
