It leaves Claude and ChatGPT's coding looking like they are from a different century. It's hard to believe these changes are coming on the scale of weeks and months. Last month I could not believe how good Claude was. Today I'm not sure how I could continue programming without Google Gemini in my toolkit.
Gemini AI Studio is such a giant leap ahead in programming I have to pinch myself when I'm using it.
Apart from the apologising. It's silly when the AI apologises with ever more sincere apologies. There should be no apologies from AIs.
The Omega Directive: https://snth.prose.sh/the_omega_directive
Each new release is “game changing”.
The implication being the last release y’all said was “game changing” is now “from a different century”.
Do you see it?
For this to be an accurate and true assessment means you were wrong both before and wrong now.
Are you suggesting that a rush to hyperbole which you don't like means advances in a technology aren't groundbreaking?
Or is it that if there is more than one impressive advance in a technology, any advance before the latest wasn't worthy of admiration at the time?
Not really my idea of good.
Every time, in the last three or four weeks, that there is a post here about Gemini, the top comment or one of the top comments is something along these lines. And every time I spend a few minutes running empirical tests to check whether I made a mistake in cancelling my paid Gemini account after giving up on it...
So I just did a couple of tests sending the same prompt on some AWS-related questions to Gemini Pro 2.5 (free) and paid Claude, and no, Claude is still better.
Anyone else concerned about these kinds of statements? Make no mistake, everyone: we are living in an LLM bubble (not an AI bubble, as none of these companies are actually interested in AI as such, i.e. moving towards AGI). They are all trying to commercialise LLMs with some minor tweaks. I don't expect LLMs to make the kind of progress made by the first 3 iterations of GPT. And when the insanely hyped overvaluations crash, the bubble WILL burst. You'd better hope there is some money left to run these kinds of tools at a profit, or you will be back at Stack Overflow trying to relearn all the skills you lost using generative coding tools.
I couldn’t find a way to use Gemini like a prepaid plan. I ain’t giving my credit card to Google for an LLM that can easily charge me hundreds or thousands of EUR.
// Moved to foo.ts
Ok, great. That’s what git is for.
// Loop over the users array
Ya. I can read code at a CS101 level, thanks.
https://developers.google.com/gemini-code-assist/docs/overvi...
They all seem to work remarkably well writing TypeScript or Python, but in my experience they fall short when it comes to shell and, more broadly, DevOps.
I want something running in a VM I can safely let all tools execute without human confirmation and I want to write my own tools and plug them in.
Right now a pro max subscription with Claude code plus my MCP servers seems to be the sweet spot, and a cursory look at the Google ecosystem didn’t identify anything like it. Am I overlooking something?
It's my daily driver so far. I switch between the Claude and Gemini models depending on the type of work I'm doing. When I know exactly what I want, I use Claude. When I'm experimenting and discovering, I use Gemini.
Would you expect that to be Google employing cost-saving measures?
That doesn't mean it's worse than the others just not much better. I haven't found anything that worked better than o1-preview so far. How are you using it?
It's pretty useful as long as you hold it back from writing code too early, or too generally, or sometimes at all. It's a chronic over-writer of code, too. Ignoring most of what it attempts to write and using it to explore the design space without ever getting bogged down in code and other implementation details is great though.
I've been doing something that's new to me but is going to be all over the training data (subscription service using stripe) and have often been able to pivot the planned design of different aspects before writing a single line of code because I can get all the data it already has regurgitated in the context of my particular tech stack and use case.
> For fun, I tried asking a bunch of AI models to figure out what shapes those arrays have. Here were the results:
Based on the results from the top 8 state-of-the-art AI models, Gemini is the best and consistently got the right results:
[1] I don't like NumPy (204 comments):
https://news.ycombinator.com/item?id=43996431
[2] I don't like NumPy: I don’t like NumPy indexing:
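The arrays from the linked post aren't reproduced here, but the flavor of the problem (NumPy's advanced-indexing shape rules) can be sketched with made-up shapes:

```python
import numpy as np

# Hypothetical arrays; the linked post's originals aren't shown here.
A = np.zeros((10, 20, 30))

# A slice plus a single advanced (list) index keeps dims in place:
print(A[:, [0, 1, 2], :].shape)    # (10, 3, 30)

# But two advanced indices separated by a slice are broadcast together,
# and their result dimension moves to the FRONT -- the classic gotcha:
print(A[[0, 1], :, [0, 1]].shape)  # (2, 20)
```

Models that "consistently got the right results" have to get cases like the second one right, which is exactly where humans slip too.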
Is there any concrete example that makes it really obvious? I had no such success with it so far and i would really like to see the clear cut between the gemini and the others.
Essentially we were hoping to tie that to data inputs and have a system to regularly output the visualisation but with dynamic values. I bet my colleague it would one shot it: it did.
What I’ve also found is that even a sloppy prompt still somehow is reading my mind on what to do, even though I’ve expressed myself poorly.
Inversely, I’ve really found myself rejecting suggestions from ChatGPT, even o4-mini-high. It’s just doing so much random crap I didn’t ask and the code is… let’s say not as “Gemini” as I’d prefer.
But more seriously, they need to uncap temperature and allow more samplers if they want to really flex on their competition.
But seriously, yeah, Gemini is pretty great.
Most of my roles were in small teams building quick ad-hoc analyses for business leaders in large multi-billion-dollar businesses. For example, one DB was Oracle E-Business Suite. It had been set up ~20 years prior, with enhancements along the way. There were only a handful of people in the company who knew what the fields helpfully named like ATTR_000349857 meant. Everyone was overworked with urgent requests (and occasional layoffs), and no one bothered to spend time on documenting the database. I suppose this topic fits under "Provide business-specific context". Great, hire someone to understand and document all that crap. Ain't going to happen.
The other roles also fit this pattern, different systems but urgent needs, always a drought of people who understood both the business and the database.
Occasionally we'd get some "AI team" come looking for "data". After many mindless meetings with no clear objective except "increase profits", they'd quietly disappear.
I use Stack Overflow a lot to pull SQL examples and tweak them. A lot of AI talk feels like hype. To start with, I'd suggest not redefining words like "hallucination", "intelligence" etc. which mean something else in English. Maybe call it "advanced algorithms" and stop with the hype. Also the surveillance and extraction of user data for advertising. Thank you.
> fields helpfully named like ATTR_000349857
> Everyone was overworked
> no one bothered to spend time on documenting the database.
I can't blame AI for not being helpful here, nothing short of divine intervention can fix that.
I feel like that's actually true now with LLMs -- if some query I write doesn't get one-shotted, I don't bother with a galaxy-brain prompt; I just shelve it 'til next month and the next big OpenAI/Anthropic/Google model will usually crush it.
One month to write some code with an LLM: that's quite the opposite of the promised productivity gain.
Feels like innovation in AI is rapidly changing from paradigm-shifting to incremental.
But I don't see how this is good news at all from a societal POV.
The last 15 or so years have seen an unprecedented rise in salaries for engineers, especially software engineers. This has brought interest in the profession from people who would normally not have considered SW as a profession. I think this is both good and bad: it has brought new-found wealth to more people, but it may have also diluted the quality of the talent pool. That said, I think it was mostly good.
Now with this game-changing efficiency from these AI tools, I'm sure we've seen an end to the glory days in terms of salaries for the SW profession.
With this gone, where else could relatively normal people achieve financial independence? Definitely not in the service industry.
Very sad.
But I don't see how this is good news at all from a societal POV.
Think about all the lamplighters who lost their jobs. Streetlights just turn on now? Lamplighting used to be considered a stable job! And what about the ice cutters…
For real tho, it's not like there's nothing left to do — we still have potholes to fix, t-shirts to fold and illnesses to cure. Just the fact that many people continue to believe that wars are justified by resource scarcity shows we need technological progress.
Only one of these things interests me. The hype of AI is threatening to kill something I actually enjoy doing. If the hype actualises, I'll likely find myself having to do something I don't enjoy. That being said, if programming can be automated, then probably every white collar job is under serious threat.
These days not so much.
Learning comes through struggle and it's too easy to bypass that struggle now. It's so much easier to get the answers from AI.
The amount and complexity of software will expand to its very outer bounds for which specialists will be required.
There are plenty of folks making a living using platforms like Salesforce and “clicks not code,” but it never led to an implosion of the SE job market. Just expanded the tech job pool. And it’s hard to imagine how that would have happened if everything needed to be coded.
Like how a growth in medical-paraprofessionals didn’t negate the need for doctors and nurses.
I personally don't think we are ever going to get to the point where I can give a simple prompt and have an LLM generate a complex app ready to run. Think about what that would require:
1. The LLM would have to read my mind and extrapolate all the minute decisions I would make to implement the app based on the prompt.
2. Assuming the LLM can get past (1), it would have to basically be AGI to be able to implement pretty much whatever I can dream up.
3. If 1 and 2 above are somehow achieved, it would be economically very valuable, and you can bet that functionality is not going to be casually enabled in LLMs for just anyone to use.
(Also, just by market logic, rare skills in demand are always paid more; I'm not sure why you're calling it an "exclusion". The education system in a lot of places might have that function, but that's a separate issue not helped by LLMs writing SQL?)
With all this money sloshing around, it takes only a little imagination to think of ways of channeling some of it to working people without employing them to write pointless (or in some cases actively harmful) software.
It's the cleanest way to give the right context and the best place to pull a human in the loop.
A human can validate and create all important metrics (e.g. what does "monthly active users" really mean) then an LLM can use that metric definition whenever asked for MAU.
With a semantic layer, you get the added benefit of writing queries in JSON instead of raw SQL. LLMs are much more consistent at writing a small JSON query vs. hundreds of lines of SQL.
We[0] use cube[1] for this. It's the best open-source semantic layer, but there are a couple of closed-source options too.
My last company wrote a post on this in 2021[2]. Looks like the acquirer stopped paying for the blog hosting, but the HN post is still up.
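To make the idea concrete: a minimal sketch of a semantic layer, with hypothetical metric and table names (this is not cube's actual schema format). The human-validated definition lives in one place, and the LLM only ever emits the small JSON query:

```python
# Sketch of the semantic-layer idea: a human validates the metric
# definition once; the LLM only emits a small JSON query against it.
# All names and the schema shape here are hypothetical.
METRICS = {
    "monthly_active_users": {
        "sql": "COUNT(DISTINCT user_id)",
        "table": "events",
        "time_dimension": "events.created_at",
    }
}

def compile_query(query: dict) -> str:
    """Compile a tiny JSON-style query into SQL using the metric store."""
    m = METRICS[query["measure"]]
    start, end = query["dateRange"]
    return (
        f"SELECT {m['sql']} AS {query['measure']} "
        f"FROM {m['table']} "
        f"WHERE {m['time_dimension']} BETWEEN '{start}' AND '{end}'"
    )

print(compile_query({
    "measure": "monthly_active_users",
    "dateRange": ["2024-01-01", "2024-01-31"],
}))
```

Whatever "monthly active users" really means is decided once by a human in `METRICS`, and every generated query inherits it.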
I’m sorry, I can’t. The tail is wagging the dog.
dang, can you delete my account and scrub my history? I’m serious.
{
  "dimensions": [
    "users.state",
    "users.city",
    "orders.status"
  ],
  "measures": [
    "orders.count"
  ],
  "filters": [
    {
      "member": "users.state",
      "operator": "notEquals",
      "values": ["us-wa"]
    }
  ],
  "timeDimensions": [
    {
      "dimension": "orders.created_at",
      "dateRange": ["2020-01-01", "2021-01-01"]
    }
  ],
  "limit": 10
}
than this:

SELECT
  users.state,
  users.city,
  orders.status,
  sum(orders.count)
FROM orders
CROSS JOIN users
WHERE
  users.state != 'us-wa'
  AND orders.created_at BETWEEN '2020-01-01' AND '2021-01-01'
GROUP BY 1, 2, 3
LIMIT 10;

^ kids, this is what AI-induced brainrot looks like.
You should have written your comment in JSON instead of raw English.
In all seriousness, I have some complaints about SQL (I think LINQ’s reordering of it is a good idea), but there’s no need to invent another layer in order for LLMs to be able to wrangle it.
But I would never use one that forced me to express my queries in JSON. The best implementations integrate right into the database so they become an integral part of your regular SQL queries, and as such are also available to all your tools.
In my experience, from using the Exasol Semantic Layer, it can be a totally seamless experience.
If you had something on the other side to hallucinate the API itself you could have a program that dreams itself into existence as you use it.
Then it apologizes and gives the right answer. It's weird. We really need a new word for what they're doing, 'cos it ain't thinking.
A writer won't think that LLMs are good at creative writing. In fact, I'm pretty sure they'd think LLMs are terrible at creative writing.
In other words, to an expert in their field, they're not that good - at least not yet.
But to someone who is not an expert, they're unbelievably good - they're enabled to do something they had zero ability to do before.
Sometimes when I want to fine-tune a query, I challenge the AI to provide a better solution. I give it the already-optimized query and ask for better. I have never got a better answer, sometimes because the AI is hallucinating, sometimes because the changes it proposes don't work in a way that is beneficial. It is like an idiot parrot telling you what it overheard in the brothel: good info if it is a war brothel frequented by enemy officers in 1916, but not these days.
AI is just increasing the frequency of things turning to custard :)
That's what read replicas with read-only access are for. Production db servers should not be open to random queries and usage by people. That's only for the app to use.
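The connection-level version of that idea can be sketched with SQLite's read-only open mode standing in for a real read replica or read-only database role:

```python
import os
import sqlite3
import tempfile

# Sketch: humans (and LLM-generated queries) get a read-only handle;
# only the app writes. SQLite's mode=ro stands in for a real replica.
path = os.path.join(tempfile.mkdtemp(), "app.db")
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE users (id INTEGER)")
rw.commit()
rw.close()

ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
ro.execute("SELECT * FROM users")               # reads are fine
try:
    ro.execute("INSERT INTO users VALUES (1)")  # writes are refused
except sqlite3.OperationalError as e:
    print(e)  # attempt to write a readonly database
```

In production the same split is done with a read replica plus a role that only has SELECT grants, but the principle is identical: the risky surface simply isn't writable.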
This was my experience as well. However, I have observed that things have been improving in this regard. Newer LLMs do perform much better. And I suspect they will continue to get better over time.
At least for the only OLAP DB I use often, Amazon Redshift, that's a solved problem with Workload Management Queues. You can restrict those users' ability to consume too many resources.
For queries that are used for OLTP, I usually try to keep things relatively simple. If there is a need for read queries that consume more resources, those go to read replicas when strong consistency isn't required.
[1] https://www.malloydata.dev/ [2] https://docs.malloydata.dev/documentation/user_guides/malloy... [3] https://github.com/malloydata/publisher
LLMs make some things that were difficult very easy now.
Good article!
I don't need AI to generate perfect SQL, because I am never going to trust the output enough to copy/paste it — the risk of subtle semantic errors is too high, even if the code validates.
Instead, I find it helpful for AI to suggest approaches — after which I will manually craft the SQL, starting from scratch.
I also tend to turn to AI for advising me on difficult use cases, and most of the time it's for production code rather than one-offs. The easy cases, I just write myself because it's more mental effort to review code for subtle errors than it is to write it.
It seems to me that this skeptical mindset is consonant with handling AI output with care.
Do you need an expert to verify if the answer from AI is correct? How is it time saved refining prompts instead of SQL? Is it typing time? How can you know the results are correct if you aren't able to do it yourself? Why should a junior (sorcerer's apprentice) be trusted in charge of using AI?

No matter the domain, from art to code to business rules, you still need an expert to verify the results. Would they (and their company) be in a better place designing a solution to a problem themselves, knowing their own assumptions? Or just checking off a list of happy-path results without FULL knowledge of the underlying design?

This is not just a change from hand-crafting to line-production, it's a change from deterministic problem-solving to near-enough-is-good-enough, sold as the new truth in problem-solving. It smells wrong.
We recently did the first speed run where Louie.ai beat teams of professional cybersecurity analysts in an open competition, Splunk's annual Boss of the SOC. Think writing queries, wrangling Python, and scanning through 100+ log sources to answer frustratingly sloppy database questions:
- We get 100% correct on the basic stuff in the first half, where most people take 5-15 minutes per question, and 50% correct in the second half, where most people take 15-45+ minutes per question and most teams time out.
- ... Louie does a median 2-3 min per question irrespective of the expected difficulty, so about 10X faster than a team of 5 (wall clock) and 30X less work (person-hours). Louie isn't burnt out at the end ;-)
- This doesn't happen out-of-the-box with frontier models, including fancy reasoning ones. Likewise, letting the typical tool here burn tokens until it finds an answer would cost more than a new hire, which is why we measure it as a speed run rather than a deceptively uncapped auto-solve count.
- The frontier models DO have good intuition, understand many errors, and, for popular languages, DO generate good text2query. We are generally happy with OpenAI, for example, so it's more about how Louie and the operator use it.
- We found we had to add in key context and strategies. You see a bit of this in Claude Code and Cursor, except those are quite generic, so they would have failed here as well. Intuitively, in coding you want to use types/lint/tests; database work has similar-but-different issues. But there is a lot more, by domain, in my experience, so expecting tools to just work is unlikely to pan out. Having domain-relevant patterns baked in, which you can extend, is key, and so are learning loops.
A bit more on louie's speed run here: https://www.linkedin.com/posts/leo-meyerovich-09649219_genai...
This is our first attempt at the speed run. I expect Louie to improve: my answers represent the current floor, not the ceiling of where things are (dizzyingly) going. Happy to answer any other q's where data might help!
> Do you need an expert to verify if the answer from AI is correct?
If the underlying data has a quality issue that is not obvious to a human, the AI will miss it too. Otherwise, the AI will catch it for you. But I would argue it's highly probable that your expert would have missed it too... So no, it's not a silver bullet yet; the AI model often lacks the context humans have, and the capacity to take a step back.
> How is it time saved refining prompts instead of SQL?
I wouldn't call that "prompting". It's just a chat. I'm at least ~10x faster (for reasonably complex & interesting queries).
There isn't one perfect solution to SQL queries against complex systems.
A sudoku has one solution.
A reasonably well-optimised SQL solution is what the good use of SQL tries to achieve. And it can be the difference between a total lock-up and a fast running of a script that keeps the rest of a complex system from falling over.
So for example, I was mucking around with ffmpeg and mkv files, and instead of searching for the answer to my thought-bubble (which I doubt would have been "quick" or "productive" on google), I straight up asked it what I wanted to know;
> are there any features for mkv files like what ffmpeg does when making mp4 files with the option `--movflags faststart`?
And it gave me a great answer! (...the answer happened to be based upon our prior conversation of av1 encoding, and so it told me about increasing the I-frame frequency).
Another example from today: I was trying to build mp4v2 but ran into drama because I don't want to take the easy road and install all the programs needed to "build" (I've taken to doing my hobby coding as if I'm on a corporate PC without admin rights (Windows)). I also don't know about "cmake" and stuff, but I went and downloaded the portable zip and moved the exe to my `%user-path%/tools/` folder, but it gave an error. I did a quick search, but the Google results were grim, so I went to ChatGPT. I said;

> I'm trying to build this project off github, but I don't have cmake installed because I can't, so I'm using a portable version. It's giving me this error though: [*error*]
And the aforementioned error was pretty generic, but chat-gpt still gave a fantastic response along the lines of; > Ok, first off, you must not have all the files that cmake.exe needs in the same folder, so to fix do ..[stuff, including explicit powershell commands to set PATH variables, as I had told it I was using powershell before].
> And once cmake is fixed, you still need [this and that].
> For [this], and because you want portable, here's how to setup Ninja [...]
> For [that], and even though you said you dont want to install things, you might consider ..[MSVC instructions].
> If not, you can ..[mingw-w64 instructions].

Another example, this one not about coding. I asked;

> I'm wondering if it would be beneficial to add an electric-assist motor to an existing petrol vehicle. There are some 2010-era SUVs that have relatively uneconomical petrol engines, which may be good candidates. That is because some of them are RWD, whilst some are AWD. The AWD gearbox and transfer case could be fitted to the RWD, leaving the transfer's front "output" unconnected. Could an electric motor then be connected to this shaft, hence making it an input?
It gave a decent answer, but it was focused on the "front diff" and "front driveshaft" and stuff like that. It hadn't quite grasped what I was implying, although it knew what it was talking about! It brought up various things that I knew were relevant (the "domain knowledge" aspect), so I brought some of those things into my reply (like about the viscous coupling and torque split);

> I mentioned the AWD gearbox+transfer into a RWD-only vehicle, thus keeping it RWD only. Thus both petrol+electric would be "driving" at the same time, but I imagine the electric would reduce the effort required from the petrol. The transfer case is a simple "differential" type, without any control or viscous couplings or anything - just simple gear ratio differences that normally torque-split 35% to the front and 65% to the rear. So I imagine the open differential could handle the 2 different input speeds and "combine" them to 1 output?
That was enough to "fix" its answer (see below). And IMO, it was a good answer!

I'm posting this because I read a thread on here yesterday/2-days-ago about people struggling with their AI's context/conversation getting "poisoned" (their word). So whilst I don't use AI that much, I also haven't had issues with it, and maybe that's because of the way I converse with it?
---------
"Edit": Well, the conversation was too long for HN, so I put it here - https://gist.github.com/neRok00/53e97988e1a3e41f3a688a75fe3b...
Is it to build a copilot for a data analyst or to get business insight without going through an analyst?
If it’s the latter - then imho no amount of text to sql sophistication will solve the problem because it’s impossible for a non analyst to understand if the sql is correct or sufficient.
These don’t seem like text2sql problems:
> Why did we hit only 80% of our daily ecommerce transactions yesterday?
> Why is customer acquisition cost trending up?
> Why was the campaign in NYC worse than the same in SF?
Correct, but I would propose two things to add to your analysis:
1. Natural language text is a universal input to LLM systems
2. text2sql makes the foundation of retrieving the information that can help answer these higher-level questions
And so in my mind, the goals for text2sql might be a copilot (near-term), but the long-term is to have a good foundation for automating text2sql calls, comparing results, and pulling them into a larger workflow precisely to help answer the kinds of questions you're proposing.
There's clearly much work needed to achieve that goal.
But ofc the real issue is that if your report metrics change last minute, you're unlikely to get a good report. That's a symptom of not thinking much about your metrics.
Also, reports/analyses generally take time because the underlying data are messy, lots of business knowledge is encoded "out of band", and the data infrastructure is poor. The smarter analytics leaders will use the AI push to invest in the foundations.
I assume a useful goal would be to guide development of the system in coordination with experts, test it, have the AI explain all trade offs, potential bugs, sense check it against expected results etc.
Taste is hard to automate. Real insight is hard to automate. But a domain expert who isn’t an “analyst” can go extremely far with well designed automation and a sense of what rational results should look like. Obviously the state of the art isn’t perfect but you asked about goals, so those would be my goals.
> awesome-Text2SQL: https://github.com/eosphoros-ai/Awesome-Text2SQL
> Awesome-code-llm > Benchmarks > Text to SQL: https://github.com/codefuse-ai/Awesome-Code-LLM#text-to-sql
My recent endeavour was with Gemini 2.5:
- Write me a simple todo app on cloudflare with auth0 authentication.
- Here's a simple todo on cloudflare. We import the @auth0-cloudflare and...
- Does that @auth0-cloudflare exist?
- Oh, it doesn't. I can give you a walkthrough on how to set up an account on auth0. Would you like me to?
- Yes, please.
- Here. I'm going to write the walkthrough in a document... (proceed to create an empty document)
- That seems to be an empty document.
- Oh, my bad. I'll produce it once more. (proceed to create another empty document)
- Seems like your md parsing library is broken, can you write it in chat instead?
- Yes... (your gemini trial has expired, would you like to pay $100 to continue?)

Claude and Gemini are pretty decent at providing a small and tight function definition with well-defined parameters and output, but anything big and it starts losing shit left and right.
All vibecoding sessions I've seen have been pretty dead-easy stuff with a lot of boilerplate; maybe I'm weird for just not writing a lot of boilerplate and relying on well-built expressive abstractions...
Quote: "While sampling, after every token, our inference engine will determine which tokens are valid to be produced next based on the previously generated tokens and the rules within the grammar that indicate which tokens are valid next. We then use this list of tokens to mask the next sampling step, which effectively lowers the probability of invalid tokens to 0. Because we have preprocessed the schema, we can use a cached data structure to do this efficiently, with minimal latency overhead."
I.e., mask any tokens that would produce something that isn't valid SQL in the given dialect, or further, isn't a valid query for the given schema. I assume some structured-outputs capability is built into most assistants nowadays, so they have probably already explored this.
[1] https://openai.com/index/introducing-structured-outputs-in-t...
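A toy sketch of the mechanism the quote describes, with a made-up five-state grammar and token list standing in for a real SQL grammar and real model logits:

```python
import math
import random

# Toy sketch of constrained decoding: after each token, mask the logits
# of tokens the grammar disallows (probability forced to 0).
# The "grammar" is a tiny state machine, not a real SQL grammar.
GRAMMAR = {
    "start": {"SELECT": "cols"},
    "cols":  {"*": "from", "id": "from"},
    "from":  {"FROM": "table"},
    "table": {"users": "done", "orders": "done"},
    "done":  {},
}
VOCAB = ["SELECT", "FROM", "*", "id", "users", "orders", "DROP"]

def sample_constrained(logits_fn, rng):
    state, out = "start", []
    while GRAMMAR[state]:
        allowed = GRAMMAR[state]
        logits = logits_fn(out)
        # Mask step: invalid tokens get weight 0, valid ones keep exp(logit).
        weights = [math.exp(l) if t in allowed else 0.0
                   for t, l in zip(VOCAB, logits)]
        tok = rng.choices(VOCAB, weights=weights)[0]
        out.append(tok)
        state = allowed[tok]
    return out

uniform = lambda prefix: [0.0] * len(VOCAB)  # stand-in for model logits
print(" ".join(sample_constrained(uniform, random.Random(0))))
# e.g. "SELECT id FROM orders" -- never "DROP", whatever the model prefers
```

A real implementation precomputes which tokens are valid per grammar state (as the quote notes, from a preprocessed schema) so the mask lookup adds minimal latency per sampling step.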
At the moment GCP are at 76%, humans are at 93%.
Thus, this is mainly just a tool for true experts to do less work and still get paid the same, not a tool for beginners to rise to the level of experts.
Obviously being able to at least read a bit of SQL and understanding the basic idea of relational databases helps loads.
But how do you know if the SQL is correct, or just happened to return results that match for one particular case?
Sounds like a bunch of bespoke not-AI work is being done to make up for LLM limitations that point blank can’t be resolved.
I wish developers would make use of long table names and column names. For example, pcat_extension could have been named release_schema_1_0.product_category_extension. And cat_id2 could have been named category_id2.
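One low-cost way to retrofit readable names without touching the tables is a view layer. A sketch using the names from the comment above (SQLite here purely for demonstration; the extra column name is a guess):

```python
import sqlite3

# Sketch: when renaming columns in place isn't possible, a view can
# expose readable names over the legacy ones.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pcat_extension (cat_id2 TEXT, attr1 TEXT)")
con.execute("INSERT INTO pcat_extension VALUES ('books', 'x')")
con.execute("""
    CREATE VIEW product_category_extension AS
    SELECT cat_id2 AS category_id2,
           attr1   AS attribute_1
    FROM pcat_extension
""")
row = con.execute(
    "SELECT category_id2 FROM product_category_extension").fetchone()
print(row)  # ('books',)
```

Both humans and LLMs can then query `product_category_extension.category_id2` while the legacy schema stays untouched.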
I’m curious to know what people are doing to measure whether the customer got what they were looking for. Thumbs up/down seems insufficient to me.
The ability of the LLM to perform purely depends on having good knowledge of what is going to get asked and how, which is more complex than it sounds.
What techniques are people having success with?
Works pretty well for me, where you can typically get within the range of human2human variance.
> If the user is a technical analyst or a developer asking a vague question, giving them a reasonable, but perhaps not 100% correct SQL query is a good starting point
> Out of the box, LLMs are particularly good at tasks like creative writing, summarizing or extracting information from documents.
I don't -think- this was written by an LLM, but it really pulls me out of the technical article.
I see the promise for green-field projects.
I'm certain they'll get there soon, they're just not there yet.
As a quick aside there's one thing I wish SQL had that would make writing queries so much faster. At work we're using a DSL that has one operator that automatically generates joins from foreign key columns, just like
credit.CLIENT->NAME
And you get the clients table automatically joined into the query. Having to write ten to twenty joins for every query is by far the worst thing; everything else about writing SQL is not that bad.

It's not like it's some obscure thing, it's absolutely ubiquitous.
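For illustration, a hedged sketch of how an arrow operator like `credit.CLIENT->NAME` could expand into a join from declared foreign keys (all table and column names are hypothetical):

```python
# Sketch: expand a path like credit.CLIENT->NAME into a join clause.
# Foreign keys are declared once; everything here is hypothetical.
FOREIGN_KEYS = {
    ("credit", "CLIENT"): ("clients", "id"),  # credit.CLIENT -> clients.id
}

def expand_path(base_table: str, path: str) -> tuple:
    """Turn 'CLIENT->NAME' on base_table into (join_clause, select_expr)."""
    fk_col, target_col = path.split("->")
    target_table, target_pk = FOREIGN_KEYS[(base_table, fk_col)]
    join = (f"JOIN {target_table} ON "
            f"{base_table}.{fk_col} = {target_table}.{target_pk}")
    return join, f"{target_table}.{target_col}"

join, expr = expand_path("credit", "CLIENT->NAME")
print(f"SELECT {expr} FROM credit {join}")
# SELECT clients.NAME FROM credit JOIN clients ON credit.CLIENT = clients.id
```

The whole win of the DSL is that the `FOREIGN_KEYS` knowledge is declared once instead of being re-spelled in every query.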
Relatively speaking it's not very complicated, it's widely documented, has vast learning resources, and has some of the best ROI of any DSL. It's funny to joke that it looks like line noise, but really, there is not a lot to learn to understand 90% of the expressions people actually write.
It takes far longer to tell an AI what you want than to write a regex yourself.
With an AI prompt you'll have to do the same thing, just more verbosely.
You will have to do what every programmer hates: write a full formal specification in English.
https://blog.codinghorror.com/regular-expressions-now-you-ha...
[0]: https://blog.cloudflare.com/details-of-the-cloudflare-outage...
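The point in miniature: the full English specification is longer than the regex it describes (a made-up example):

```python
import re

# The English spec:
#   "match a version string: one or more digits, followed by zero or
#    more groups of a dot and one or more digits, and nothing else"
# ...is longer than the regex itself:
VERSION = re.compile(r"^\d+(\.\d+)*$")

print(bool(VERSION.match("1.24.0")))  # True
print(bool(VERSION.match("1..2")))    # False
```

Either way you end up writing the precise spec; with the regex you at least write it in a notation built for the job.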
Step 2...