That's the buried lede.
Yes, there was a vulnerability in psql... but that's so much less a problem than the huge gaping hole of allowing users to directly interact with psql.
No DB can be safe if you are turning untrusted user commands into psql executions. It'd be like giving untrusted users ssh access and then complaining when they find a privilege elevation exploit.
They had one job, and they failed at it. This amateur-level mistake should sink the entire company.
The problem is that getting information security right is a matter of process control, which everyone hates, and so CEOs are absolute suckers for being sold a product which magically "adds on" security. This is like trying to buy "anti-lead-paint" rather than actually remove all your existing lead paint.
We now have one job, to ask for accountability, and we will not do it.
The author discovered quite a few... well, let's just call them policy errors in postgres, but had a hard time filing bug reports, mainly because the response was usually an incredulous "why on earth would you even do that in the first place?"
But the author has fun with this. There is a trophy in the game that you can only get by putting your name in the trophy table.
Systems are built on a set of expectations. Undermine the expectations and you undermine the system.
This is a 101, rookie-level approach to SQL injection defense.
It's dumb for exactly the same reason why this is dumb:
"SELECT * FROM foo WHERE bar=" + sanitize(userInput)
The correct way to do something like this will always be parameterized input, which looks something like this: "SELECT * FROM foo WHERE bar=?"
bindParameter(1, userInput);
Why? Because the postgres protocol splits out the command and the data for the command in a way that can't be injected, something that should be viewed as impossible to do when data and command are merged into one string.
If this company wanted to build dynamic queries, then the only correct way to do that is to limit input to only valid variables, i.e. "isValidColumnName(userInput)" before sending the request. And even then, you'd not use psql to do that.
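A minimal sketch of both points, assuming Python's built-in sqlite3 as a stand-in for a real postgres driver (the placeholder idea is the same, the wire protocol is not). The table foo, the ALLOWED_COLUMNS set and find_rows are hypothetical names for illustration, not anything from the product discussed here.

```python
import sqlite3

# Hypothetical allowlist: the only column names a caller may filter on.
ALLOWED_COLUMNS = {"bar", "baz"}


def find_rows(conn, column, value):
    """Filter foo by one allowlisted column, passing the value as a parameter."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError("unknown column: " + repr(column))
    # The column name comes from the allowlist; the value never touches the SQL text.
    query = "SELECT * FROM foo WHERE " + column + " = ?"
    return conn.execute(query, (value,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT, baz TEXT)")
conn.executemany("INSERT INTO foo VALUES (?, ?)", [("a", "1"), ("b", "2")])

hostile = "a' OR '1'='1"

# Parameterized: the hostile string is compared as plain data and matches nothing.
rows_param = find_rows(conn, "bar", hostile)

# Concatenated: the same string rewrites the WHERE clause and matches every row.
rows_concat = conn.execute(
    "SELECT * FROM foo WHERE bar='" + hostile + "'"
).fetchall()
```

The allowlist matters because placeholders only cover values; identifiers such as column names cannot be bound, so they have to be checked against a fixed set.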
You simply can't use a generalized sanitizer and expect good results.
I don't think we appreciate how much of a wild west things are with the incredible mix of hugely complex and powerful tools available trivially to developers and the concept of "move fast, break things."
Especially as corporate sees devs like they see salesmen (big moneymakers who deserve perks, exceptions) and top-down security culture as a cost center.
The other buried ledes are that postgres allows emojis (not sure if that's intended but it works) and that you can just run system commands and scripts directly from postgres cli. I imagine a lot of eyes are going to be on new hardening guidelines for postgres now.
I also imagine the first high performance enterprise friendly drop-in db written in something like rust is going to one day be a big deal.
You’re onto something here. People perceive the world through the lens of their education and environment. Sales, legal, finance, are all easy constructs for a business leader to view the rest of the world through. The secret of the game isn’t to have the best tech or to code the most, it’s to “outsell” your competing business unit.
This is not due diligence. In-band messaging of user-controlled data has been proven to be bad for security, and this is not the first time "escaping" user-controlled data for SQL has been done incorrectly.
One of the nice things about modern ORMs like SQLAlchemy 2 is that they force you to use prepared statements even when calling raw queries.
I've avoided SQL injection since ancient times not by escaping strings, but by transforming any given input via "0x" + bin2hex(value) and plopping that into the query.
No quotes needed, no code buried deep in included libraries needed, handles any kind of data possible, and also no funny business sneaking in based on how you may have mangled the input.
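For anyone who hasn't seen the trick, a rough Python equivalent of the PHP "0x" + bin2hex(value) idea. Assumptions: the 0x... form is a MySQL-style hex literal and other databases may read it differently, and hex_literal is a made-up helper name. The data still travels inside the query text, so this is a sketch of the comment's approach, not a replacement for bound parameters.

```python
import re


def hex_literal(value: str) -> str:
    """Encode value as a MySQL-style hexadecimal literal (0x...)."""
    data = value.encode("utf-8")
    if not data:
        # A bare "0x" is not handled here; how a database treats it is not assumed.
        raise ValueError("empty value has no hex literal form in this sketch")
    return "0x" + data.hex()


hostile = "x' OR '1'='1"
literal = hex_literal(hostile)

# Whatever the input was, only the characters 0-9, a-f and the leading "0x" remain.
query = "SELECT * FROM foo WHERE bar=" + literal
```

The point is that the output alphabet is fixed, so no quote or backslash from the input can survive into the query.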
I was a bit perplexed by the final destination of command line for data and/or queries -- seems like an odd choice when they could've just interfaced directly with the database like a civilized human hahaha
- why a no side-effects function on a database can be used to get lateral access to the whole database instance
- why do you need to validate strings on the database itself and not on the client anyway; heck, why is there no type-safe way of doing it
- why would you want to execute shell commands from the database itself
- Even if there's a real use case for executing commands like that why is it enabled by default on a regular connection to the database without you specifying a THIS_IS_REALLY_DANGEROUS_BUT_I_PINKY_PROMISE_I_KNOW_WHAT_IM_DOING flag to the connection handshake.
It's not always PHP, but there are some quirks that are shrugged off in PHP that make me really concerned about the reliability of projects coded with it.
What account were they authenticating with when attaching to psql?
If you have the connection string why does psql even matter, couldn’t you use any client? Or is this a case of your input being forwarded to a running, already authenticated, psql instance?
And finally, why do we need unicode support for schema? I assume it’s because the schema is itself data?
It's not so much about Turing as it is libraries and patterns
After all, as I understand it this very issue was caused by escaping SQL rather than parameterizing it
That's a neat tidbit.
From the adoption page: Each adoption comes with a digital badge and certificate that you can proudly display.
SELECT * FROM users WHERE username = [<LEN_OF_TEXT>]raw-text-of-len-not-parsed-at-all
E.g. SELECT * FROM users WHERE username = [21]flyin' and wavin' guy
                                              ^^^^^^^^^^^^^^^^^^^^^
                                              these 21 chars are NOT parsed AT ALL, just taken as data
I am not very familiar with SQL so you might need a different prefix but hopefully the idea is obvious.
The basic idea behind your proposal exists and is called prepared statements. It's actually, I hope, the normal way to write queries these days.
You write your query like: "SELECT * FROM users WHERE username = ?" and execute your query like "execute(query, username)".
The problem? It's optional.
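To make the connection concrete, here is a toy parser for the hypothetical [N] length-prefix syntax proposed above. It is only a sketch: the syntax is the commenter's invention, split_length_prefixed is a made-up name, and no real database accepts this form. It turns the prefixed query into exactly what a prepared statement already takes, a command with placeholders plus a separate list of data.

```python
import re

# Matches a length marker such as "[21]" at a given position.
MARKER = re.compile(r"\[(\d+)\]")


def split_length_prefixed(query: str):
    """Split a [N]-prefixed query into (command with ? placeholders, list of data)."""
    command = []
    params = []
    i = 0
    while i < len(query):
        m = MARKER.match(query, i)
        if m:
            n = int(m.group(1))
            start = m.end()
            data = query[start:start + n]
            if len(data) < n:
                raise ValueError("length prefix runs past the end of the query")
            params.append(data)   # taken verbatim, never scanned for quotes or markers
            command.append("?")
            i = start + n
        else:
            command.append(query[i])
            i += 1
    return "".join(command), params


command, params = split_length_prefixed(
    "SELECT * FROM users WHERE username = [21]flyin' and wavin' guy"
)
```

Here command comes out as the placeholder query and params holds the raw username, apostrophes included, which is the same split that execute(query, username) makes for you.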
"Prepared statements" sounds EXACTLY like what I was thinking! I don't understand why people would ever use anything else.
For a moment I had a dev-flashback to the problem of utf8mb4 versus (broken) utf8 in MySQL. Easy now, this is Postgres, everything is safe(er)...
Unicode is a surprisingly elegant system but also an open invitation for all kinds of abuse.
s/elegant/clever
What could go wrong? I bet unicode is how AGI escapes and enslaves humanity.
Working with Unicode is anything but elegant, but that’s another story.
I had not considered it as an injection attack.
https://www.youtube.com/watch?v=ekr2nIex040
ROSÉ & Bruno Mars - APT. (Official Music Video)
https://xkcd.com/3054/ (the scream cypher)
And halfway through implementing it, I realized that these combining characters will attach to any letter, and it would be easy to have crappy steganography on top of your crappy Caesar cipher.
Probably nothing new, but I had fun.
D̊o̝ n̝o̊ť f̔o͞r̊g̝e͞t̍ t̊h̠e̗ milk
import string
import sys


def combine_list():
    """Return the combining marks U+0300..U+036F, skipping U+034F."""
    l = []
    for c in range(0x0300, 0x0370):
        if c != 0x034f:  # COMBINING GRAPHEME JOINER, which has no visible glyph
            l.append(chr(c))
    return l


class codebook_a():
    def __init__(self):
        # cb maps each printable character to one combining mark; d is the inverse
        self.cb = dict(zip(string.printable, combine_list()))
        self.cb[''] = ''
        self.d = {}
        for c in self.cb:
            self.d[self.cb[c]] = c

    def encode(self, message, fake):
        # only the non-space characters of fake can carry a mark
        slots = sum(1 for c in fake if c != ' ')
        if len(message) > slots:
            raise ValueError('fake needs at least as many non-space characters as message')
        ml = list(message)
        fl = list(fake)
        encode = []
        while fl:
            c = fl.pop(0)
            if c != ' ':
                if ml:
                    m = ml.pop(0)
                else:
                    m = ''
                encode.append(c + self.cb[m])
            else:
                encode.append(c)
        return ''.join(encode)

    def decode(self, message):
        # keep only the characters that are marks from the codebook
        plain = []
        for c in message:
            p = self.d.get(c)
            if p is not None:
                plain.append(p)
        return ''.join(plain)


def cli(args):
    cb = codebook_a()
    if len(args) == 0:
        print('message fake : two args')
        print('a message with no combining characters (diacritics) will be coded onto fake')
        print('with combining characters will extract the real message')
        return 1
    m = args[0]
    # check for coded text
    for c in m:
        if cb.d.get(c):
            print('decode')
            print(cb.decode(m))
            return 0
    if len(args) < 2:
        print('encoding needs a second argument: the fake text')
        return 1
    print('encode')
    print(cb.encode(m, args[1]))
    return 0


if __name__ == '__main__':
    sys.exit(cli(sys.argv[1:]))

> The PQescapeStringInternal method doesn’t actually validate that the string it is parsing with pg_utf_mblen is valid Unicode. So, instead, it just takes the length of 2, and grabs the next byte.
So the bug was a shitty function in a generic open source library which was probably never properly tested or fuzzed, which ended up letting attackers move laterally through the database. And this is one reason you want full test coverage; tiny stupid functions matter.
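A toy model of the behaviour described in that quote, to show why one unvalidated length lookup is enough. This is not libpq's code: naive_escape, strict_escape and utf8_len_from_lead are made-up names, and the only thing taken from the quote is the idea that the escaper trusts the length implied by the lead byte and copies the next byte without checking it.

```python
def utf8_len_from_lead(b: int) -> int:
    """Length a UTF-8 sequence claims to have, judged from its first byte alone."""
    if b & 0xE0 == 0xC0:
        return 2
    if b & 0xF0 == 0xE0:
        return 3
    if b & 0xF8 == 0xF0:
        return 4
    return 1


def naive_escape(raw: bytes) -> bytes:
    """Double single quotes, but trust the lead byte and skip the bytes after it."""
    out = bytearray()
    i = 0
    while i < len(raw):
        n = utf8_len_from_lead(raw[i])
        if n == 1:
            out += b"''" if raw[i] == 0x27 else raw[i:i + 1]
        else:
            out += raw[i:i + n]   # continuation bytes copied without validation
        i += n
    return bytes(out)


def strict_escape(raw: bytes) -> bytes:
    """Reject invalid UTF-8 first, then double every single quote."""
    raw.decode("utf-8")           # raises UnicodeDecodeError on invalid input
    return raw.replace(b"'", b"''")


# 0xC0 claims to start a two-byte character, so the quote after it is swallowed.
payload = b"\xc0'; DROP TABLE foo; --"
leaked = naive_escape(payload)
```

With the naive version the quote comes out unescaped because it was treated as the second half of a character; the strict version refuses the input before any escaping happens.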
(Another fix for this is to enforce at the boundaries of every function that the input data has been "blessed" or sanitized by some other function whose purpose is just to validate that the data is what it's supposed to be. That would have to happen before escaping, and every function that uses that data would need to confirm that it got blessed. Basically you want a home-rolled strong-typing system with types (or data classes?) for all your data. But that's a lot of work, I don't expect many would do that for most apps)
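One way the "blessed" data idea could look in Python, as a sketch only. ValidText and escape_literal are hypothetical names, the quoting rule is the simplest one (double the single quote), and a runtime isinstance check stands in for the strong typing the comment asks for; a static checker would catch the same mistake earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidText:
    """Text that has already passed strict UTF-8 validation."""
    value: str

    @classmethod
    def from_bytes(cls, raw: bytes) -> "ValidText":
        # Strict decoding is the validation step; invalid bytes raise UnicodeDecodeError.
        return cls(raw.decode("utf-8"))


def escape_literal(text: ValidText) -> str:
    """Quote validated text as a SQL string literal; refuse anything not blessed."""
    if not isinstance(text, ValidText):
        raise TypeError("escape_literal requires ValidText, got " + type(text).__name__)
    return "'" + text.value.replace("'", "''") + "'"


quoted = escape_literal(ValidText.from_bytes(b"it's"))
```

Every function downstream of the boundary takes ValidText instead of bytes or str, so data that skipped validation fails loudly instead of reaching the escaper.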
* No static type-checking for Unicode data