Hidden Messages in Emojis and Hacking the US Treasury

(slamdunksoftware.substack.com)

209 points · nickagliano · 1y ago · 78 comments

cogman10 1y ago

> Now, all of this might have been fine had Beyond Trust not written a feature which allowed users to directly, programmatically interact with psql (the postgres command line interface).

That's the buried lede.

Yes, there was a vulnerability in psql... but that's so much less a problem than the huge gaping hole of allowing users to directly interact with psql.

No DB can be safe if you are turning untrusted user commands into psql executions. It'd be like giving untrusted users ssh access and then complaining when they find a privilege elevation exploit.

gowld 1y ago

Let's be clear: Beyond Trust is not a company that wrote a database-backed web app and made the all-too-common mistake of writing insecure code that tickled a bug in the database that allowed privilege escalation. Beyond Trust is a company whose entire contribution is adding a security layer to prevent privilege escalation, and their solution here was to bypass Postgres's standard functionality and use this weird `psql` hack instead.

They had one job, and they failed at it. This amateur-level mistake should sink the entire company.

pjc50 1y ago

Didn't happen when Crowdstrike broke all their customers.

The problem is that getting information security right is a matter of process control, which everyone hates, and so CEOs are absolute suckers for being sold a product which magically "adds on" security. This is like trying to buy "anti-lead-paint" rather than actually removing all your existing lead paint.

1oooqooq 1y ago

but they got the right sales people to get to the IRS and they hired the "right" (wrong) certification company. so that's three jobs at least.

we now have one job to ask for accountability and will not do it.

somat 1y ago

There is an interesting project https://github.com/Abstrct/Schemaverse where the whole game takes place entirely within a postgres database that the players connect directly to.

The author discovered quite a few... well, let's just call them policy errors in postgres, but had a hard time filing bug reports, mainly because the response was usually an incredulous "why on earth would you even do that in the first place?"

But the author has fun with this: there is a trophy in the game that you can only get by putting your name in the trophy table.

0xbadcafebee 1y ago

The only reason BeyondTrust implemented that was that it wasn't untrusted user commands. They sanitized the data, so it should have been fine. The unfortunate problem was that the sanitizer didn't sanitize.

Systems are built on a set of expectations. Undermine the expectations and you undermine the system.

cogman10 1y ago

> They sanitized the data, so it should have been fine.

This is a 101 rookie-level approach to SQL injection defense.

It's dumb for exactly the same reason why this is dumb:

    "SELECT * FROM foo WHERE bar=" + sanitize(userInput)

The correct way to do something like this will always be parameterized input, which looks something like this:

    "SELECT * FROM foo WHERE bar=?"
    bindParameter(1, userInput);

Why? Because the postgres protocol splits out the command and the data for the command in a way that can't be injected. That separation should be viewed as impossible to achieve when data and command are merged into one string.

If this company wanted to build dynamic queries, then the only correct way to do that is to limit input to only valid variables, i.e. "isValidColumnName(userInput)" before sending the request. And even then, you'd not use psql to do that.

You simply can't use a generalized sanitizer and expect good results.
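To make the two points above concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a real Postgres driver. The table `foo`, the allow-list `ALLOWED_COLUMNS`, and the helper `find_rows` are all hypothetical names made up for illustration. The value travels as a bound parameter and the column name is only accepted if it is on a fixed allow-list.

```python
import sqlite3

# Hypothetical allow-list: the only identifiers a caller may select on.
ALLOWED_COLUMNS = {"bar", "baz"}


def find_rows(conn, column, user_input):
    """Look up rows where `column` equals `user_input`."""
    # Identifiers cannot be bound as parameters, so they are checked
    # against a fixed allow-list instead of being "sanitized".
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"column not allowed: {column!r}")
    sql = f"SELECT * FROM foo WHERE {column} = ?"
    # The value is handed over separately from the SQL text, so it is
    # never parsed as part of the command.
    return conn.execute(sql, (user_input,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (bar TEXT, baz TEXT)")
conn.execute("INSERT INTO foo VALUES (?, ?)", ("x", "y"))

# A classic injection string is just an odd value that matches nothing.
hostile = "x' OR '1'='1"
print(find_rows(conn, "bar", "x"))
print(find_rows(conn, "bar", hostile))
```

The same shape applies to other drivers, although the placeholder syntax differs between them.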

zoeysmithe 1y ago

Just a guess, but this looks like politically powerful dev culture overriding cybersecurity culture, demanding and thus getting an exception from management for 'productivity' and 'being agile.'

I don't think we appreciate how much of a wild west things are with the incredible mix of hugely complex and powerful tools available trivially to developers and the concept of "move fast, break things."

Especially as corporate sees devs like they see salesmen (big moneymakers who deserve perks, exceptions) and top-down security culture as a cost center.

The other buried ledes are that postgres allows emojis (not sure if that's intended but it works) and that you can just run system commands and scripts directly from postgres cli. I imagine a lot of eyes are going to be on new hardening guidelines for postgres now.

I also imagine the first high performance enterprise friendly drop-in db written in something like rust is going to one day be a big deal.

Terr_ 1y ago

Hey now, a large portion of developers are seen as cost-centers too! Not everybody has the skill of flattering managers into approving greenfield projects, and then transferring away before they break horribly. :p

randmeerkat 1y ago

> Especially as corporate sees devs like they see salesmen…

You’re onto something here. People perceive the world through the lens of their education and environment. Sales, legal, finance, are all easy constructs for a business leader to view the rest of the world through. The secret of the game isn’t to have the best tech or to code the most, it’s to “outsell” your competing business unit.

singpolyma3 1y ago

Giving untrusted users ssh access... You mean like every shared hosting company or shell provider?

charcircuit 1y ago

Which is why shared hosting companies get hacked relatively more.

charcircuit 1y ago

> Beyond Trust did their due diligence by properly calling a sanitization method on the user’s string input using it in a PostgreSQL query.

This is not due diligence. In-band messaging of user-controlled data has been proven to be bad for security, and this is not the first time "escaping" user-controlled data for SQL has been done incorrectly.

Spivak 1y ago

Yep, it's prepared statements or bust. But the long tail of legacy code, examples, documentation, that uses escaping is gonna take a while to get through.

One of the nice things about modern ORMs like SQLAlchemy 2 is that it forces you to use prepared statements even when calling raw queries.

1oooqooq 1y ago

without the PR spin: beyond trust did the bare minimum when implementing an open back door to a federal database

kevinsync 1y ago

Fun fact: MySQL (and I'm sure many other databases?) lets you pass in values directly as hexadecimal strings.

I've avoided SQL injection since ancient times not by escaping strings, but by transforming any given input via "0x" + bin2hex(value) and plopping that into the query.

No quotes needed, no code buried deep in included libraries needed, handles any kind of data possible, and also no funny business sneaking in based on how you may have mangled the input.
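For anyone unfamiliar with the trick, a rough sketch of what the "0x" + bin2hex(value) step produces, written in Python rather than PHP. The helper name `hex_literal` is made up here, and the result is only meant for engines that accept MySQL-style `0x...` hexadecimal literals.

```python
def hex_literal(value: str) -> str:
    """Render a value as a MySQL-style hexadecimal literal (0x...)."""
    data = value.encode("utf-8")
    if not data:
        # A bare "0x" is not a literal, so the empty string is special-cased.
        return "''"
    # Only the characters 0-9 and a-f can appear after the prefix, so no
    # quote or backslash from the input survives into the SQL text.
    return "0x" + data.hex()


query = "SELECT * FROM users WHERE username = " + hex_literal("O'Reilly")
print(query)
```

The input's quote character ends up as the digits `27` inside the literal, which is why no escaping step is involved.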

vessenes 1y ago

Cool idea. Genuinely. But it does not in any way guarantee that an injection attack like the one used here won’t work - unless it’s maintained as hex through the whole pipeline. In this case the (malformed) Unicode was sent to a command line call - if your hex text needed to be parsed to be understood on the command line, then your security plan would have failed.

kevinsync 1y ago

Totally agree on my tip not being a silver bullet for all situations, just wanted to pass it along in case somebody finds themselves needing to sanitize input for queries rather than constructing prepared statements.

I was a bit perplexed by the final destination of command line for data and/or queries -- seems like an odd choice when they could've just interfaced directly with the database like a civilized human hahaha

mtrovo 1y ago

So many questions:

- why can a no-side-effects function on a database be used to get lateral access to the whole database instance

- why do you need to validate strings on the database itself and not on the client anyway, heck why is there no type-safe way of doing it

- why would you want to execute shell commands from the database itself

- Even if there's a real use case for executing commands like that, why is it enabled by default on a regular connection to the database without you specifying a THIS_IS_REALLY_DANGEROUS_BUT_I_PINKY_PROMISE_I_KNOW_WHAT_IM_DOING flag in the connection handshake.

It's not always PHP, but there are some quirks that are shrugged off in PHP that make me really concerned about the reliability of projects coded with it.

benmmurphy 1y ago

They mentioned a PAM module so maybe the sql injection just allowed bypassing the authorization of a system that was using the PAM module. Like it’s in the realm of possibility that a PAM module that wanted to validate a user against credentials stored in a pg database might shell out to the psql command to do this. Though, the whole thing is very questionable.

rapind 1y ago

Yeah we’re missing some info.

What account were they authenticating with when attaching to psql?

If you have the connection string why does psql even matter, couldn’t you use any client? Or is this a case of your input being forwarded to a running, already authenticated, psql instance?

And finally, why do we need unicode support for schema? I assume it’s because the schema is itself data?

c45y 1y ago

In this case PAM is the name of a type of security product and not the Linux PAM system.

jarebear6expepj 1y ago

Your questions are programming language agnostic-- where did your PHP angst come in? And are there specific things in PHP that are problematic and avoidable by using a different Turing complete language?

marsovo 1y ago

PHP has grown up but in its wild youth was notorious for such gems as mysql_escape_string vs mysql_real_escape_string, rather than proper parameterization

It's not so much about Turing as it is libraries and patterns

After all, as I understand it this very issue was caused by escaping SQL rather than parameterizing it

xmprt 1y ago

I learned about the breach a few days ago but I didn't know that you could "adopt" an emoji: https://aac.unicode.org/sponsors

That's a neat tidbit.

jfengel 1y ago

Does "adopting an emoji" mean something other than "appearing on that page"?

From the adoption page: Each adoption comes with a digital badge and certificate that you can proudly display.

Implicated 1y ago

It also appears to be a $5k permanent backlink from unicode.org, custom anchor text even.

nkrisc 1y ago

It’s a fun way to donate.

thrdbndndn 1y ago

It's just a way to donate, similar to lots of "adopt a xxx" programs.

Validark 1y ago

Could someone please explain to me why "sanitizing database inputs" was ever considered a good idea? Why not just add a feature in SQL like so?

  SELECT * FROM users WHERE username = [<LEN_OF_TEXT>]raw-text-of-len-not-parsed-at-all

E.g.

  SELECT * FROM users WHERE username = [21]flyin' and wavin' guy
                                           ^^^^^^^^^^^^^^^^^^^^^ 
                          these 21 chars are NOT parsed AT ALL, just taken as data

I am not very familiar with SQL so you might need a different prefix but hopefully the idea is obvious.
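As a thought experiment, the proposed `[<LEN_OF_TEXT>]` prefix could be handled by a pre-parser that lifts the raw text out before anything else looks at it. The sketch below is purely hypothetical: the syntax is the one proposed above, not a real SQL feature, and `extract_raw_literals` is an invented name. It turns the prefixed query into a `?` template plus a list of values, which is the shape a parameterized query already has.

```python
import re

# Matches the hypothetical "[<length>]" prefix from the proposal above.
LENGTH_PREFIX = re.compile(r"\[(\d+)\]")


def extract_raw_literals(query):
    """Split a length-prefixed query into a '?' template and raw values."""
    template = []
    params = []
    pos = 0
    while True:
        match = LENGTH_PREFIX.search(query, pos)
        if match is None:
            template.append(query[pos:])
            break
        length = int(match.group(1))
        start = match.end()
        end = start + length
        if end > len(query):
            raise ValueError("length prefix runs past the end of the query")
        # Everything between start and end is copied verbatim; it is never
        # scanned for quotes, brackets, or keywords.
        template.append(query[pos:match.start()] + "?")
        params.append(query[start:end])
        pos = end
    return "".join(template), params


print(extract_raw_literals(
    "SELECT * FROM users WHERE username = [21]flyin' and wavin' guy"
))
```

Note that the length still has to be computed by trusted code, never supplied by the user, or the scheme falls apart.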

whatnow37373 1y ago

How is this not "sanitizing inputs"?

The basic idea behind your proposal exists and is called prepared statements. It's actually, I hope, the normal way to write queries these days.

You write your query like: "SELECT * FROM users WHERE username = ?" and execute your query like "execute(query, username)".

The problem? It's optional.

Validark 1y ago

To me, "sanitizing inputs" implies a transformation of the data into a string that can be "safely" evaluated as code which hopefully yields the input data. Instead you should be able to just mark a piece of the code as data, that will never be tokenized or parsed or anything, just dropped directly into a buffer.

"Prepared statements" sounds EXACTLY like what I was thinking! I don't understand why people would ever use anything else.

billpg 1y ago

Most (all?) SQL client libraries will allow you to indicate a parameter placeholder and supply that parameter value separately.

Terr_ 1y ago

> In order to do this, PQescapeStringInternal must call pg_utf_mblen.

For a moment I had a dev-flashback to the problem of utf8mb4 versus (broken) utf8 in MySQL. Easy now, this is Postgres, everything is safe(er)...

nthingtohide 1y ago

Oh, the fun we had when a single emoji crashed YouTube or WhatsApp.

https://www.youtube.com/watch?v=jC4NNUYIIdM

Bluecobra 1y ago

A while back I stumbled on a way to crash my company’s Jira server simply by sending an email to it containing an emoji. Makes me wonder if that could have been abused by malicious parties if they knew the email address we used to forward new support issues to Jira.

hnlmorg 1y ago

As someone who’s done a fair amount with parsing of Unicode strings lately, I’m not at all surprised by this bug.

Unicode is a surprisingly elegant system but also an open invitation for all kinds of abuse.

rapind 1y ago

> Unicode is a surprisingly elegant system…

s/elegant/clever

What could go wrong? I bet unicode is how AGI escapes and enslaves humanity.

hnlmorg 1y ago

The cleverness is in the simplicity of its implementation while maintaining backwards compatibility. Which satisfies the definition of “elegance”.

Working with Unicode is anything but elegant, but that’s another story.

giancarlostoro 1y ago

I haven't dug into Unicode deeply enough to know, and probably someone somewhere already made it, but I often wonder if we can do compression more efficiently by abusing Unicode somehow, especially regarding plain text.

pavel_lishin 1y ago

I swear qntm has written something about this.

dalemhurley 1y ago

I have been playing a lot with emoji smuggling with https://toolnames.com/utilities/emoji-smuggler and https://chat.full.cx inspired by https://emoji.paulbutler.org/

I had not considered it as an injection attack.
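For readers who have not seen emoji smuggling before, one known way to do it is to append invisible Unicode variation selectors to a visible character, one selector per hidden byte. The sketch below assumes that approach. Whether the linked tools work exactly this way is an assumption, and the names `hide` and `reveal` are invented for the example.

```python
def byte_to_selector(b: int) -> str:
    """Map one byte onto one of the 256 variation selector code points."""
    if b < 16:
        return chr(0xFE00 + b)        # U+FE00..U+FE0F
    return chr(0xE0100 + (b - 16))    # U+E0100..U+E01EF


def selector_to_byte(ch: str):
    """Inverse of byte_to_selector; returns None for any other character."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:
        return cp - 0xE0100 + 16
    return None


def hide(carrier: str, payload: bytes) -> str:
    """Append the payload to the carrier as a run of variation selectors."""
    return carrier + "".join(byte_to_selector(b) for b in payload)


def reveal(text: str) -> bytes:
    """Collect every variation selector in the text back into bytes."""
    decoded = (selector_to_byte(ch) for ch in text)
    return bytes(b for b in decoded if b is not None)


smuggled = hide("\U0001F600", b"hello")
print(len(smuggled), reveal(smuggled))
```

Most renderers show only the carrier emoji, which is what makes this useful for hiding data and awkward for any filter that only inspects visible text.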

verisimi 1y ago

> Out of compliance, the US Treasury posted this notice to US lawmakers, breaking the news that a “China state-sponsored Advanced Persistent Threat (APT) actor” had breached their systems.

https://www.youtube.com/watch?v=ekr2nIex040

ROSÉ & Bruno Mars - APT. (Official Music Video)

nickdothutton 1y ago

Today in “why enumerating badness is stupid”.

somat 1y ago

I got nerd sniped by a recent xkcd.

https://xkcd.com/3054/ (the scream cipher)

And halfway through implementing it, I realized that these combining characters will attach to any letter, and it would be easy to have crappy steganography on top of your crappy Caesar cipher.

Probably nothing new, but I had fun.

D̊o̝ n̝o̊ť f̔o͞r̊g̝e͞t̍ t̊h̠e̗ milk

    import string
    import sys

    def combine_list():
        # combining diacritical marks U+0300..U+036F, used as the code alphabet
        marks = []
        for c in range(0x0300, 0x0370):
            if c != 0x034F:  # skip U+034F COMBINING GRAPHEME JOINER (invisible)
                marks.append(chr(c))
        return marks

    class codebook_a():
        def __init__(self):
            # map each printable ASCII character to one combining mark
            self.cb = dict(zip(string.printable, combine_list()))
            # padding entry: cover letters past the end of the message get no mark
            self.cb[''] = ''
            # reverse map for decoding (the padding entry is not needed there)
            self.d = {mark: ch for ch, mark in self.cb.items() if mark}

        def encode(self, message, fake):
            # only the non-space characters of the cover text can carry a mark
            if len(message) > len(fake.replace(' ', '')):
                raise ValueError('fake needs at least as many non-space characters as message')
            ml = list(message)
            encoded = []
            for c in fake:
                if c != ' ':
                    m = ml.pop(0) if ml else ''
                    encoded.append(c + self.cb[m])
                else:
                    encoded.append(c)
            return ''.join(encoded)

        def decode(self, message):
            # keep only the combining marks and map them back to characters
            return ''.join(self.d[c] for c in message if c in self.d)

    def cli(args):
        cb = codebook_a()
        if len(args) == 0:
            print('message fake : two args')
            print('a message with no combining characters (diacritics) will be coded onto fake')
            print('with combining characters will extract the real message')
            return 1
        m = args[0]
        # check for coded text
        if any(c in cb.d for c in m):
            print('decode')
            print(cb.decode(m))
            return 0
        if len(args) < 2:
            print('encoding needs a second argument: the fake text')
            return 1
        print('encode')
        print(cb.encode(m, args[1]))
        return 0

    if __name__ == '__main__':
        sys.exit(cli(sys.argv[1:]))

netsharc 1y ago

Geez, any summary of this article that tells it like the reader isn't five years old?

defen 1y ago

UTF-8 encodes a unicode codepoint into 1, 2, 3, or 4 bytes. Assuming that you have a valid UTF-8 encoding of a codepoint, then the first byte tells you how many bytes are in the encoding. 0-127 inclusive means one byte, 192-223 means 2, 224-239 means 3, and 240-247 means 4. If the first byte is 0xC0 (192), then the sequence is two bytes long. However, not every 2-byte sequence that starts with 0xC0 is valid UTF-8. The uppermost bits of the second byte must be `10` in a valid 2-byte UTF-8 sequence. 0x27 does not meet that criterion, so `0xC0 0x27` is not valid UTF-8. If your escape function operates at the level of unicode codepoints but doesn't actually verify that they're valid, you end up copying a single quote into your "escaped" buffer that downstream parts of the code will hit.
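The failure mode described above can be reproduced with a toy escaper. This is a simplified model for illustration, not the actual libpq code: `utf8_len_from_lead` and `naive_escape` are invented names, and the length table simply follows the byte ranges given in the comment.

```python
def utf8_len_from_lead(byte: int) -> int:
    """Sequence length implied by the first byte alone."""
    if byte < 0x80:
        return 1
    if 0xC0 <= byte <= 0xDF:
        return 2
    if 0xE0 <= byte <= 0xEF:
        return 3
    if 0xF0 <= byte <= 0xF7:
        return 4
    return 1  # stray continuation byte or invalid lead: treat as one byte


def naive_escape(data: bytes) -> bytes:
    """Double single quotes, but trust the lead byte for sequence length."""
    out = bytearray()
    i = 0
    while i < len(data):
        n = utf8_len_from_lead(data[i])
        if n == 1:
            out += b"''" if data[i] == 0x27 else data[i:i + 1]
        else:
            # The trailing bytes are copied without checking that they are
            # continuation bytes (10xxxxxx), so a quote can hide here.
            out += data[i:i + n]
        i += n
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Strict check using Python's own decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(naive_escape(b"a'b"))         # the quote is doubled
print(naive_escape(b"\xc0'; --"))   # the quote after 0xC0 passes through
print(is_valid_utf8(b"\xc0'"))
```

Rejecting input that fails a strict validity check before escaping it would close this particular hole in the toy version.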

account42 1y ago

The funny part is that not having any Unicode support in this part of the code and treating the data as ASCII (plus mystery bytes) would have worked correctly.

0xbadcafebee 1y ago

A PHP app called a Postgres library function to "escape strings" for use in Postgres, and that called a function to get a utf8 string length, but the function was bullshit:

> The PQescapeStringInternal method doesn’t actually validate that the string it is parsing with pg_utf_mblen is valid Unicode. So, instead, it just takes the length of 2, and grabs the next byte.

So the bug was a shitty function in a generic open source library which was probably never properly tested or fuzzed, which ended up letting attackers move laterally through the database. And this is one reason you want full test coverage; tiny stupid functions matter.

(Another fix for this is to enforce at the boundaries of every function that the input data has been "blessed" or sanitized by some other function whose purpose is just to validate that the data is what it's supposed to be. That would have to happen before escaping, and every function that uses that data would need to confirm that it got blessed. Basically you want a home-rolled strong-typing system with types (or data classes?) for all your data. But that's a lot of work, I don't expect many would do that for most apps)
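A tiny sketch of what such a "blessed" type might look like, assuming Python and invented names (`ValidUtf8`, `quote_literal`). The only way to obtain a `ValidUtf8` is through a constructor that performs strict decoding, and the consumer refuses anything else. The quote doubling assumes standard SQL string literal rules and is shown only to have something that consumes the type.

```python
class ValidUtf8:
    """Hypothetical 'blessed' wrapper: holds text that passed strict decoding."""

    def __init__(self, raw: bytes):
        # bytes.decode is strict by default and raises UnicodeDecodeError
        # on malformed input such as b"\xc0'".
        self.text = raw.decode("utf-8")


def quote_literal(value: ValidUtf8) -> str:
    """Quote a validated value; unvalidated input is rejected outright."""
    if not isinstance(value, ValidUtf8):
        raise TypeError("expected ValidUtf8, got " + type(value).__name__)
    return "'" + value.text.replace("'", "''") + "'"


print(quote_literal(ValidUtf8(b"O'Brien")))
```

This moves the validation to a single choke point, which is the idea in the comment, though it is still escaping and not a substitute for parameterized queries.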

gowld 1y ago

* Custom Unicode parsing code in the middle of an "escape" function

* No static type-checking for Unicode data
