The challenge in making something like this, or Copilot / Ghostwriter, work well is meeting users where they are. Spreadsheet users don't want to deal with API keys or know what temperature is - but anyone (like this tweet) can set up direct API use with generic models in 10 minutes. This document has all the code to do so ;). [1]
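For the curious, here's a minimal sketch of that direct-API setup in Python. Everything below (the helper names, the prompt format, the model name) is illustrative, and it assumes an `OPENAI_API_KEY` environment variable and the `/v1/completions` endpoint:

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/completions"

def build_prompt(instruction: str, cell_value: str) -> str:
    # One prompt per cell: the column-level instruction plus that row's value.
    return f"{instruction}\n\nInput: {cell_value}\nOutput:"

def gpt3(instruction: str, cell_value: str) -> str:
    # Send one completion request per spreadsheet cell.
    payload = {
        "model": "text-davinci-003",  # illustrative; any completions model works
        "prompt": build_prompt(instruction, cell_value),
        "max_tokens": 64,
        "temperature": 0,  # data cleanup wants determinism, not creativity
    }
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["choices"][0]["text"].strip()
```

Loop that over a column and you've reproduced the core of the tool - which is exactly the point: the hard part isn't the ten lines of glue, it's the packaging.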
For non-engineers - or folks who need a reliable and familiar syntax to use at scale and across their org - promptloop [2] is the best way to do that. All the comments in here are great though. We have been live with users since the summer - no waitlist. And as a note: despite the name, "prompt engineering" has almost nothing to do with making this work at scale.
[1] https://docs.google.com/spreadsheets/d/1lSpiz2dIswCXGIQfE69d... [2] https://www.promptloop.com/
Maybe not good to reveal customer names this way, unless they already disclosed it publicly
There must be a set of projects which are cost-prohibitive now because they require paying humans, but which will become feasible exactly because of this tech. For a good portion of these, a higher-than-human error rate will also be tolerable, or at least correctable via a small degree of human intervention.
There's also increased variance in human accuracy. You might be able to train 100 people, but 10k? A model is consistent all the way through.
It only works the opposite way, where machines and AI handle the trivial cases and humans handle the non-trivial ones. Many people actually genuinely like to solve hard problems which require thinking and skill, most people strongly dislike mundane repetitive tasks.
Makes total sense to me.
Human-first scenarios will be rarer, and probably limited to where a human has to do it by law. Made-up example: border control checking that passport photos match faces. A human checks, and if they click OK, the AI double-checks.
But when the AI is capable of something the person can't do (like Stable Diffusion creating images compared to me) the AI should take first chair.
So the question isn't which one you can add to your tool faster. The question is, if I already have this AI tool setup, is it worth setting up the USPS API to go from 95% accuracy to 99.9% accuracy. For countless applications, it wouldn't be. Obviously if you need to scale and need to ensure accuracy, it's a different story.
Better one would be "based on these three columns, generate a cold outbound email for the person..."
It would suck to be on the receiving end of those, but the use case makes much more sense.
And yet if P(someone unknown is a robot) gets too large, it's going to be a weird adjustment period.
Non-exact outputs are actually a feature and not a bug for other use cases - but this takes a bit of use to really see.
I get the feeling that my visual system and the language I use are respectively pretty bad at processing and conveying precise information from a plot (beyond simple descriptors like "A is larger than B" or "f(x) has a maximum"). I guess I would find it mildly surprising if any Vision-Language model were able to perform those tasks very well, because the representations in question seem pretty poorly suited.
I get that popular diffusion models for image generation are doing a bad job composing concepts in a scene and keeping relationships constant over the image--even if Stable Diffusion could write in human script, it's a bad bet that the contents of a legend would match a pie chart that it drew. But other Vision-Language models, designed for image captioning or visual question answering, rather than generating diverse, stylistic images, are pretty good at that compositional information (up to, again, the "simple descriptions" level of granularity I mentioned before.)
Note: I'm the founder :) Happy to answer any questions.
Reply below with some sample data/problem and I'll reply with a demo to see if we can solve it out of the box!
> 0 day 7 hour 31 min 42 sec
I've never seen rolling waitlists, it's kind of strange tbh
Me too - different project and different labelling company, but the same conclusion: it's better to do it in house. Labelling is hard. You need to see, talk with, and train your labelling team.
"I tried parsing your messy input. Here's what I came up with. Please make sure it's correct then proceed with the checkout."
Maybe 1 in my past 2y of many, many spreadsheets has been financing related. I think you might be overgeneralizing to an ungeneralizably large group -- the set of all human spreadsheets.
"I need to input a number of variables and find their sum and average [or even more]. And I need to see how the outputs change if I change an input...".
Is there any reason to think the situation has substantially improved since then?
Formal | Informal
Lane, Thomas | Tommy Lane
Brooks, Sarah | Sarah Brooks
Yun, Christopher |
Doe, Kaitlyn |
Styles, Chris |
…
Automating something like this is extremely hard with an algorithm and extremely easy with ML. Even better, many people who use spreadsheets aren’t very familiar with coding and software, so they do things manually even in cases where the formula is simple.

> Lane, Thomas => Thomas Layne
> Brooks, Sarah => Sarah Brooksy
> Yun, Christopher => Chris Yun
> Doe, Kaitlyn => KD
> Styles, Chris => Chris Spice, Chris Chasm
I'm sure the bot overcomplicated an otherwise simple task, but I think there's always gonna be some creative error if we rely on things like that. It's funny though because these results are plausible for what a real person might come up with as informal nicknames for their friends.
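For contrast, the mechanical half of that column really is formula territory. Here's a sketch (function name is made up) of the deterministic part - the nickname step, Christopher => Chris, is the part that genuinely needs ML:

```python
def informal(formal: str) -> str:
    # Handles only the mechanical "Last, First" -> "First Last" reordering;
    # choosing a nickname (Christopher -> Chris) is where ML earns its keep.
    last, first = (part.strip() for part in formal.split(",", 1))
    return f"{first} {last}"

informal("Yun, Christopher")  # -> "Christopher Yun", not "Chris Yun"
```

Which is also why the creative errors above sting: the model got the easy, formula-able part wrong ("Layne", "Brooksy") while freelancing on the hard part.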
And for the second Kindle review, it summarized one point from the actual review, then completely made up two additional points!
Really impressive Sheets extension, but you'd have to be so careful what you applied this to.
I wonder if this means the AI is dumb or that the AI is smart enough to notice that humans just make shit up sometimes, like when they're not reading carefully or when they need filler.
Generates Python, then executes it.
I'm sure the USPS is already doing this and more, and if not, there's probably some AI jobs lined up for it :)
Here volume matters, and all misses are just lost data which I'm fine with. The general purpose nature of the tool makes it tremendous. There was a time when I would have easily paid $0.05 / query for this. The only problem with the spreadsheet setting is that I don't want it to repeatedly execute and charge me so I'll be forced to use `=GPT3()` and then copy-paste "as values" back into the same place which is annoying.
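One way around that re-execution problem, sketched in plain Python rather than an actual sheet extension (the wrapper name and cache size are made up): memoize on the cell's input, so recalculation re-bills nothing.

```python
import functools

def make_cached(completion_call):
    # Wrap a paid completion call so an identical prompt is billed once,
    # no matter how many times the sheet recalculates.
    @functools.lru_cache(maxsize=4096)
    def cached(prompt: str) -> str:
        return completion_call(prompt)
    return cached

# Usage, with a stand-in for the real API call:
calls = []
def fake_api(prompt):
    calls.append(prompt)  # each append = one billable request
    return f"completion for {prompt!r}"

gpt3 = make_cached(fake_api)
gpt3("clean this address")
gpt3("clean this address")  # cache hit: no second charge
# len(calls) == 1
```

It's the programmatic equivalent of the copy-paste "as values" trick, minus the annoyance.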
Like: give me a list of all customers from London who purchased a laptop with more than 16GB of RAM in January and used a coupon between 10% and 25%. Sort it by price paid.
Just ran your exact query through OpenAI's Codex (model: code-davinci-002), and this was the result:
SELECT * FROM customers WHERE city = 'London' AND purchase_date BETWEEN '2019-01-01' AND '2019-01-31' AND product_name = 'laptop' AND product_ram > 16 AND coupon_percentage BETWEEN 10 AND 25 ORDER BY price_paid DESC;
I'd say it's pretty damn accurate.
Agreed. Sooner or later a company is going to do this with its customers, in ways that are fine 95% of the time but cause outrage or even harm on outliers.
And if that company is anyone like Google, it'll be almost impossible for the customers to speak to a human to rectify things.
Also, once this is normal and ubiquitous, people will come along who game it, and the AI will be too dumb to recognize them. The real humans will all have been fired - game over, we're stuck with shitty systems, and everyone goes crazy.
I just lost the game.
But if you allow some false negatives, such as trying to detect if a bot is a bot, I think that could work? But I feel like the technology to write fake text is inevitably going to outpace the ability to detect it.
But if someone uses this to do 90% of the work and then just edits it to make it personal and sound like themselves, then it's just a great time saving tool.
I mean, in this exact example, 70 years ago you'd have had to address each thank-you card by hand, from scratch. 10 years ago you could use a spreadsheet just like this to automatically print off mailing labels from your address list. It didn't make things worse, just different.
This is just the next step in automation.
This is still way too optimistic. Reading through something that's "almost right", seeing the errors when you already basically know what it says / what it's meant to say, and fixing them, is hard. People won't do it well, and so even in this scenario we often end up with something much worse than if it was just written directly.
There is a lot of evidence for this, from the generally low quality of lightly-edited speech-to-text material, to how hard it is to look at a bunch of code and find all of the bugs without any extra computer-generated information, to how hard editing text for readability can be without serious restructuring.
I could extrapolate in my extremely judgmental way that the person who does that probably has a grandiose sense of how valuable their own time is, first of all, and secondly an impractical and sheepishly obedient devotion to big weddings with guest-lists longer than the list of people they actually give a shit about. Increase efficiency in your life further upstream, by inviting fewer people! (Yeah right, might as well tell them to save money by shopping less and taking fewer trips. Like that would ever work!)
But I digress, and anyway don't take any of that too seriously, as 20 years ago I was saying the same kinds of things about mobile phones... like "Who do you think you are, a surgeon, with that phone?" Notice it's inherently a scarcity-based viewpoint, based on the previous however-many years when mobile phones really were the province only of doctors and the like. Now they're everywhere... So, bottom line, I think the thank-you notes are a lousy use of the tech, but just like the trivial discretionary conversations I hear people having on their mobile phones now that they're ubiquitous, this WILL be used for thank-you notes!
The software would save people 80% of the work, and most are lazy enough to release it as is instead of fixing the remaining 20%. That laziness will end up forcing legislation to flag and eventually ban or deprioritize all GPT content, which will result in a war of adversarial behaviors trying to hide generated stuff among the real. Can’t have nice things!
Let alone flagging/deprioritizing it via some draconian legislation?
Cue Fry "I'm scare-roused" meme...
The hilarious one is changing the zip code to 90210. The AI is basically accusing you of a typo, because you obviously meant that more famous zip code.
General purpose AIs in situations where more targeted, simpler solutions are needed are going to be incredibly dangerous. Sure this AI can fly a plane 99.999% of the time, but every once in a while it does a nose dive because of reasons we cannot possibly understand or debug.
So of course a human developer made an AI that makes bad data.
Only they aren't. Check the video again, they come out fine.
Edit: Oh dang, you're all right, several of them have wrong digits. :l
I remember in like 2007 or something, in the early days of Facebook, someone made a CLI interface to the FB API. And I wrote a random-timed daily cron job that ran a Bash script that checked "which of my FB friends have their birthday today", went through that list, selected a random greeting from like 15 different ones I'd put into an array, and posted this to the wall of person $i. Complete with a "blacklist" with names of close friends and family, where the script instead sent me an email reminder to write a manual, genuine post.
I used to have a golfed version of that script as my Slashdot signature.