* Out-of-bounds Write
* Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
* Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection')
* Use After Free
You avoid them by using tools that make it difficult or impossible to introduce such vulnerabilities to begin with. Such as modern, memory safe programming languages.
For many decades, carpenters have been educated about table saw safety. But what finally stopped thousands of fingers getting chopped off every year was the introduction of the SawStop, and similar technologies.
Safety is a matter of using the right tools, not of "taking better care".
The reason XSS (and CORS) are tricky is because they fundamentally don't work in a world where a website may be spread over a couple different domains. I get a taste of this in my dayjob where we have to manage cookie scoping across a couple different region domains and have several different subdomains for different cookie behaviors. It's easy to be clean on paper up until you need to interface with some piece of software that insists on doing it its own way - for example the Azure excel embedded functionality requires the ID token to be passed in the request body, meaning you have to pull in the request body and parse it in your gateway layer (or delegate that to a microservice)... potentially with multi-GB files being sent in the body as well!
It's super easy on paper to start from greenfield and design something that is sane and clean, bing boom so simple. But once you acquire a couple of these fixed requirements, the cleanliness of the system degrades quite a bit, because that domain uses a format that's not shared by anything else in the system, and it's a bad one, and we can't do anything about it, and now that's a whole separate identity token that has to be managed in parallel.
Anyway, you could say that buffer overflow or use-after-free are kind of an impedence mismatch for memory management/ownership in C. Well, XSS and CORS are an impedence mismatch for domain-based scoping models in a REST-based world. Obviously the correct answer is to simply not write vulnerable systems, but is domain-based scoping making that easier or harder?
Actually the SQL one is arguably in that category too, to a lesser extent. Libraries could, and should, make it obvious how to do parametrized SQL queries in your language. I would guess that for every extra minute of their day a programmer in your language must spend to get the parametrized version to work over just lazy string mangling, you're significantly adding to the resulting vulnerability count because some of them won't bother.
Bonus points if your example code, which people will copy-paste, just uses a fixed query string because it was only an example and surely they'll change that.
Kids get into it just by having the tenacity to do whatever it takes to make it chooch. It's all that counts.
This is often ignored as it simply takes too much time and it often does not hurt much as it’s ‘internal’ (to the company using the saas or whatever).
https://security.googleblog.com/2021/02/mitigating-memory-sa...
https://www.chainguard.dev/unchained/building-the-first-memo...
(Not in use rust for everything bandwagon, genuinely curious)
Either way, they're effectively "solved" from a programmer's perspective if you're willing to adopt modern frameworks instead of string-concatenating HTML or SQL manually.
The middle two are out of reach of a typical PL or type system (there are exceptions like Ur, but I don't think it's adopted widely). It's a problem that is typically solved via libraries and Rust is not unique in terms of providing safe libraries around generating SQL or HTML.
for 4 the static analyzer should help, and, also set your pointer to NULL immediately after free too(for double free)
Changing everything to take lengths is definitely a good change - but challenging to retrofit into existing codebases. Apple has a neat idea for automatically passing lengths along via compilation changes rather than source changes, but if you want to do things in source you have to deal with the fact that there is some function somewhere that takes a void*, increments it locally, reinterpret_casts it to some type, and then accesses one of its fields and you've got a fucking mess of a refactor on your hands.
Use after free is actually gaining popularity, up 3 since last year.
- Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') (#2)
- Improper Neutralization of Special Elements used in an SQL Command ('SQL Injection') (#3)
- Improper Neutralization of Special Elements used in an OS Command ('OS Command Injection') (#4)
- Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal') (#8)
- Improper Neutralization of Special Elements used in a Command ('Command Injection') (#16)
- Improper Control of Generation of Code ('Code Injection') (#23)
All of these came from trying to avoid structured data, and instead using strings with "special characters". It's crazy how many times this mistake has been repeated: file paths, URLs, log files, CSV, HTML, HTTP (cookies, headers, query strings), domain names, SQL, shell commands, shell pipelines... One unescaped character, from anywhere in the stack, and it all blows up.
One could say "at least it's human-readable", but that's not reliable either. Take files names, for example. Two visually identical file names may map to different files (because confusables[1] or surrounding spaces), or two different names map to the same file (because normalization[2]), or the ".jpg" at the end may not actually be the extension (because right-to-left override[3]).
So the computer interpretation of a string might be wrong because a special character sneaked in. And even if everyone was perfectly careful, the human interpretation might still be wrong. For the sake of the next generations, I hope we leave strings for human text and nothing more.
[1] https://unicode.org/cldr/utility/confusables.jsp
[2] https://developer.apple.com/library/archive/qa/qa1173/_index...
[3] https://krebsonsecurity.com/2011/09/right-to-left-override-a...
But yes, URLs should have been structured. We already see paths rendered with breadcrumbs, the protocol replaced with an icon, `www` auto-inserted and hidden, and the domain highlighted. If that's not a structure, I don't know what is.
By cramming everything into the same string, we open ourselves to phishing attacks by domains like `www.google.com.evil.com`, malicious traversal, 404s from mangled relative paths, and much more.
Two visually identical file names may map to different files (because confusables[1]), or two different names map to the same file (because normalization[2]), or the ".jpg" at the end may not actually be the extension (because right-to-left override[3]),
Those are all because of Unicode, which is an even worse idea in general.
> it's just that for some reason (like basic arithmetic) people don't seem to be taught enough about it to understand.
That's the same argument used to defend manual memory management. But education is not enough. Escaping is something you have to remember to do every time*, or it'll blow up spectacularly. Even knowledgeable professionals mess it up, or it wouldn't occupy 6 of the 25 spots in this list.
> Those are all because of Unicode, which is an even worse idea in general.
What's the alternative? Japanese speakers writing file names in ASCII? Unicode is a modern marvel, it's our fault we use it where it doesn't belong.
* Not necessarily every input/output, but at least every system that interacts with it.
We make it look like it is a request response with a chat bot, but it is more realistic to say we are making a single document and having the model fill out the rest. That is, there is no out of band. There is only the document.
Thank god Spring dropped this interface in the Framework 6.x / Boot 3.x release, and the end for non-commercial support is this year for the old stuff.
https://github.com/spring-projects/spring-framework/issues/2... https://github.com/advisories/GHSA-4wrc-f8pq-fpqp
RCE via deserilaization seems valid 9.8 even if it requires the developer to use less common APIs or using them in strange ways. In the bug they have a comment that the documentation warns about these API but that doesn't really impact a CVSS score. Am I missing something about this specific CVE on why you think its unfair?
1. CWE-787 Out-of-bounds Write: C, C++, Assembly
4. CWE-416 Use After Free: C, C++
7. CWE-125 Out-of-bounds Read: C, C++
10. CWE-434 Unrestricted Upload of File with Dangerous Type: ASP.NET, PHP, Class: Not Language-Specific
12. CWE-476 NULL Pointer Dereference: C, C++, Java, C#, Go
15. CWE-502 Deserialization of Untrusted Data: Java, Ruby, PHP, Python, JavaScript
17. CWE-119 Improper Restriction of Operations within the Bounds of a Memory Buffer: C, C++, Assembly
21. CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition'): C, C++, Java
23. CWE-94 Improper Control of Generation of Code ('Code Injection'): Interpreted
In java you'll get an exception, while in C you might dissapear your cat. Those 2 are quite incomparable when talking about "dangerous-ness" of a mistake
> 21. CWE-362 Concurrent Execution using Shared Resource with Improper Synchronization ('Race Condition'): C, C++, Java
Why those languages specifically? I would say these two issues apply to all languages.
I understand memory safety is important, but still: only one in the podium (though it is first), only 3 in the top 10… clearly security is about much more than memory safety.
But it is embarrassing that we've been living with memory safety issues for 50 years and they still remain very common and very severe, despite being addressable via type systems in ways that something like a logical bug that leads to data leakage isn't.
To the extent that memory safety is slowly, oh so slowly, but steadily dropping down the list, it is because we are taking it seriously as a foundational issue and actually addressing it. To turn around and then use the success we've had as evidence that it isn't important is making a serious error.
There is no reason to use a memory unsafe language anymore, except legacy codebases, and that is also slowly but surely diminishing. I'm still yet to hear this amazingly compelling reason that you just need memory unsafe languages. In terms of cost/benefits analysis, memory unsafety is literally all costs. Even if you do have one of the rare cases when you need it, and you only need a very particular variant of it (reading bytes in memory of one type as bytes of another type, you never need to write out of the scope of an array or dereference into an unallocated memory page), you can still get it through explicit unsafe support that every language has one way or another. You do not need a language that is pervasively unsafe with every line you write so that on those three lines of code out of millions that you actually need it, you can have it with slightly less ceremony. That's just a mind-blowingly bad tradeoff and engineering decision.
How are we supposed to address the other issues from a foundation of a memory unsafe language? If we can't even have such a basic guarantee, we sure aren't going to get more complicated ones later.
If you don't think about the other classes, I'm still gonna escalate privileges, root your box, ransom your data, send spam, charge a half million dollars in cloud spend to your account, steal your customers' PII/PHI, etc etc etc. Without ever using a language specific exploit.
Escaping by default has become a standard practice with HTML templating languages, see the Go html template standard library for a very detailed breakdown of what is escaped where.
More modern PHP frameworks like Laravel provide their own templating solution in part because of this. But the vast majority of websites run on default PHP templates, so it's not surprising that these kinds of vulnerabilities are so high up in the list.
The whole problem is that you mix code and data, and that third party resource loading is 'on' by default in browsers, especially for scripts and things that can embed scripts. This is not something you can fix once and for all at the library level.
I've noticed that using Valgrind on Python systems is almost impossible because most modules have not been built with Valgrind in mind and thus you get swamped in noise.
I suppose the same is true for any large system that uses many different third party libraries.
I'm also not sure if asan has an equivalent to --leak-check=full
I have even made it recognize my custom allocators and report bugs with them too.
When combined with my second favorite tool, AFL++, I have a good shot at eliminating most memory bugs. AFL++ finds paths through the software, and I run every single one of those paths through Valgrind. It's beautiful.
From the optimistic side, it looks like the safest language to write an app today with is TypeScript.