> The 90s were slowly reinventing UNIX and stuff invented at Bell Labs.
Yes, this reminds me of: "Wasn't all this done years ago at Xerox PARC? (No one remembers what was really done at PARC, but everyone else will assume you remember something they don't.)" [1]
> "Buffers that don’t specify their length"
> Is this really a common problem in web apps? Most web apps are built in languages that don't have buffer overrun problems. There are many classes of security bug to be found in web apps, some unique to web apps...I just don't think this is one of them. This was a common problem in those C/C++ programs from the 90s the author is seemingly pretty fond of. Not so much web apps built in PHP/JavaScript/Python/Ruby/Perl/whatever.
Most injection attacks are due to this; if HTML used length-prefixed tags rather than open/close tags, most injection attacks would go away immediately.
That's not really the problem. The problem is that there is no distinction between data and control, so everything comes to you in one binary stream. If the control aspect were out-of-band, the problem really would go away.
Length prefixes will just turn into one more thing to overwrite or intercept and change. That's much harder to do when you can't get at the control channel but just at the data channel. Many old school protocols worked like this.
This is the important takeaway here. Changing the encoding simply swaps out one set of vulnerabilities and attacks for another. Separating control flow and data is the actual silver bullet for this category of attacks.
Unfortunately, there’s rarely ever a totally clear logical separation between the two. Anything you want to bucket into “control”, someone else is going to want the client to be able to manipulate as data.
Granted, if you made that control channel stateful, you'd make a lot of problems go away. But you could do that with a combined control/data stream too.
What am I missing? How would an out-of-band control channel make things easier?
That said, I think many issues with the web could be solved by implementing new protocols as opposed to shoehorning everything into HTTP just to avoid a firewall...
So <html>abc</html> would go as
<html><datum 1></html>, where 'datum 1' refers to the first datum in the data stream, being 'abc'. No matter what trickery you pulled to try to put another tag, executable bit, or other such nonsense in the datum, it would never be interpreted. This blocks any and all attacks based on tricking the server (or the browser that eventually receives the two streams) into doing something active with the datum; it can only be passive data by definition.
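The scheme above can be sketched in a few lines. This is a hypothetical illustration, not an existing protocol: the control stream carries only structure with `<datum N>` slots, and the separate data stream is inserted strictly as inert text.

```python
import html

def render(control: str, data: list[str]) -> str:
    """Fill the <datum N> slots of a control template from a separate
    data stream. Data is escaped unconditionally on insertion, so it
    can never be interpreted as markup."""
    out = control
    for i, value in enumerate(data, start=1):
        out = out.replace(f"<datum {i}>", html.escape(value))
    return out

# The data channel can carry anything, including markup-looking text:
page = render("<html><datum 1></html>", ["<script>alert(1)</script>"])
# The 'attack' arrives inert, as escaped text rather than a live tag.
```

The point is that nothing in the data stream can reach the parser as control, no matter what bytes it contains.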
For comparison, take DTMF, which is in-band signalling and so easily spoofed (with a 'bluebox', additional tones may be generated that unlock interesting capabilities in systems on the line), and compare it with GSM, which does all its signalling out-of-band and so is much harder to spoof.
The web is basically like DTMF: if you can enter data into a form and that data is spat back out in some web page to be rendered by the browser later on, you have a vector to inject something malicious, and it takes a very well-thought-out sanitization process to get rid of all the ways you might do that.
If the web were more like GSM, you could sit there and inject data into the data channel until the cows came home, but it would never lead to a security issue.
No amount of extra encoding and checks will ever close these holes completely as long as the data stays 'in band' with the control information.
Or, run your data through stored procedures instead. It took me a while to figure out why stored procedures were so much more secure than regular queries. I finally figured out it was because a stored procedure does exactly what the grandparent post says: It treats all inputs as data with no possibility to run as code.
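Parameterized queries get the same property the comment describes for stored procedures: inputs are bound as values, never spliced into the SQL text. A minimal sketch using Python's sqlite3 placeholder binding (sqlite3 has no stored procedures, so this stands in for the same mechanism):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")

# Attacker-controlled input that would break a string-concatenated query:
evil = "alice' OR '1'='1"

# The ? placeholder binds the input as pure data; it can never become SQL.
rows = conn.execute("SELECT name FROM users WHERE name = ?", (evil,)).fetchall()
# The injection payload matches nothing, because it is compared as a
# literal string rather than parsed as a WHERE clause.
```

Had `evil` been concatenated into the query string instead, the `OR '1'='1'` would have matched every row.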
If this was the case, it would be near-impossible to write HTML by hand. And if you're writing HTML with a tool (React, HAML etc.), the tool could be doing HTML escaping correctly instead. This isn't an issue with HTML, it's an issue with human error.
All security issues are due to human error. Those are solved by building better tools.
> If this was the case, it would be near-impossible to write HTML by hand.
If, besides the text form, there were a well-defined length-prefixed binary representation, we could simply compile HTML to binary HTML, which would immediately make the web not only safer but also much more efficient (it's scary to think just how much parsing and reparsing goes on when displaying a web page).
My point is that there's nothing wrong with HTML. HTML isn't a tool, it's a format for storing and transmitting hypertext. If you're using React or HAML or any of the other HTML-generating tools, you're effectively immune from XSS. I'm putting forth that developers aren't using effective tools (shame on every templating engine that doesn't escape by default), and that calling the web as a platform bad is a bit nonsensical. It's like saying "folks are writing asm by hand and their code has security issues, therefore x86_64 is insecure".
How so? If you allow the user to send arbitrary data, and your handling of that data is where the problem lies, it isn't going to matter whether the client sends a length-prefixed piece of data. You still have to sanitize that data.
HTML, and whether it uses closing tags or not, is pretty much irrelevant to the way injection attacks work, as far as I can tell. Maybe I'm missing something...do you have an example or a reference to how this could solve injection attacks?
It would be interesting to see if this idea could work in practice.
I feel like this is conflating two different problems and potential solutions.
I'm not saying injection attacks aren't real. I'm saying that whether HTML uses closing tags or not is orthogonal to the solution. But, again, maybe I'm missing something obvious here. I just don't see how what you're suggesting can be done without types and I don't see how types require prefixing data size in order to work.
No it wouldn't. It wouldn't fix SQL injection, and it also wouldn't fix the path bug the OP linked.
The problem is not length; it is context-unaware strings. The problem is our obsession with the primitive types that pervade our codebases.
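The "context-unaware strings" point can be made concrete with a tiny wrapper type. This `Html` class is a hypothetical sketch, not a real library: raw `str` values are escaped the moment they cross the type boundary, so only content the type vouches for is emitted verbatim.

```python
import html

class Html:
    """A context-aware string: the type records that the content
    is already safe to emit as HTML."""
    def __init__(self, safe: str):
        self.safe = safe

    def __add__(self, other):
        # Plain str gets escaped at the type boundary;
        # Html passes through unchanged.
        if isinstance(other, Html):
            return Html(self.safe + other.safe)
        return Html(self.safe + html.escape(other))

page = Html("<p>") + "<script>alert(1)</script>" + Html("</p>")
# The untrusted middle segment is escaped; the trusted tags are not.
```

This is essentially what "escape by default" templating engines do: the default path treats input as data, and emitting raw markup requires an explicit, visible opt-in.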
Injection in general is simply a trust problem. If you can trust all inputs fully (hint: you can't, because nobody can), then you will never have an injection attack.
Obviously nobody is going to be typing length prefixes manually, so our tools are going to do it for us.
Now we're back where we started where you accidentally inline user content as HTML, except now HTML has the added cruft of someone's HN comment solution.
Or were senders always going to send true values for length and data?
Really, you can't trust any sender, so the data should be validated anyway.
There have been known attacks where a sender says "here's 400 bytes", the receiver stupidly trusts that length specifier, and the sender sends more (or fewer) crafted bytes, and BOOM!
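That failure mode, and the validation that prevents it, can be sketched in a few lines. The record format (4-byte big-endian length prefix) is made up for illustration:

```python
def read_record(buf: bytes) -> bytes:
    """Parse a 4-byte big-endian length prefix followed by a payload,
    refusing to trust the declared length beyond what was received."""
    if len(buf) < 4:
        raise ValueError("truncated header")
    declared = int.from_bytes(buf[:4], "big")
    payload = buf[4:]
    if declared != len(payload):
        # The classic bug (Heartbleed-style) is to read 'declared'
        # bytes from adjacent memory regardless of what arrived.
        raise ValueError(f"declared {declared} bytes, got {len(payload)}")
    return payload
```

A sender claiming 400 bytes while shipping 3 is rejected at the parser instead of being believed.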
Known-good data start and end specifiers, which HTML has, seem like a good answer when dealing with untrusted senders (read: everyone).