Scaling Static Analyses at Facebook (opens in new tab)

(m-cacm.acm.org)

126 pointsdons6y ago25 comments

25 comments

I think one of the unstated problems with static analysis is just keeping track of the results. I know that when I started working with these tools, it was a huge PITA just dealing with the various output files.

That's why I created tools to convert the output from different tools into a common CSV format that can be databased and used to compare output from different tools, or from different versions of the code (e.g., after fixing errors reported by the tools).

These tools currently work with cppcheck, clang and PVS-Studio and can be found here: http://btorpey.github.io/blog/categories/static-analysis/

nh_996y ago

Interesting approach. Where I work, we use Jenkins for collecting results. That way, for each build of our application we have a history of results for static analysis. Jenkins has good tools for storing and displaying this information, as well as the ability to show trends over time.

wallstprog6y ago

If Jenkins works for you, great. It does seem to support both clang and cppcheck, although not PVS-Studio (which is one of the better tools out there in my experience).

Personally, I'm happier with plain old text files that can be manipulated with awk, grep, etc., can be databased if needed (since they're csv files) -- and can also be compared using my all-time favorite software, Beyond Compare. (http://btorpey.github.io/blog/2013/01/29/beyond-compare/).

nickpsecurity6y ago

One of the things I like about this article is that it gives another example showing how formal methods catches deep errors unlikely to be caught with human review or testing:

"Overall, the error trace found by Infer has 61 steps, and the source of null, the call to X509 _ gmtime _ adj () goes five procedures deep and it eventually encounters a return of null at call-depth 4. "

I think the example Amazon gave for TLA+ was thirty-something steps. Most people's minds simply can't track 61 steps into software. Tests always have a coverage issue.

SanchoPanda6y ago

> Zoncolan catches more SEVs than either manual security reviews or bug bounty reports. We measured that 43.3% of the severe security bugs are detected via Zoncolan. At press time, Zoncolan's "action rate" is above 80% and we observed about 11 "missed bugs."

>. For the server-side, we have over 100-million lines of Hack code, which Zoncolan can process in less than 30 minutes. Additionally, we have 10s of millions of both mobile (Android and Objective C) code and backend C++ code

> All codebases see thousands of code modifications each day and our tools run on each code change. For Zoncolan, this can amount to analyzing one trillion lines of code (LOC) per day.

11 "missed bugs" on the 100 mm server-side lines of code per run, or ever?

m0zg6y ago

Also, the main issue with static analysis tools tends to be not false negatives, but false positives. That is, they churn out tons and tons of alerts that aren't actually bugs. Some such systems alert so much that they aren't worth using.

Matthias2476y ago

Yes, that's the main culprit with traditional static analysis. No one wants to review the results, because the amount of signal to noise is far too low. And also since it's an optional thing and not enforced by the compiler.

I think this is where languages with stronger inbuilt analysis (e.g. Rust) win: The results are better, and since the analysis is always running as part of a compiler pass there are no huge jumps in indicated bugs at once (like what would happen if one would run Coverity on a legacy C++ codebase).

3 more replies

muglug6y ago

It sounds (from the article) like they have some sort of heuristic for determining potential severity, and they're ok with more false-positives in areas where the potential damage from a false-negative is very high.

1 more reply

muglug6y ago

It's explained in the article:

> We also use the traditional security programs to measure missed bugs (that is, the vulnerabilities for which there is a Zoncolan category), but the tool failed to report them. To date, we have had about 11 missed bugs, some of them caused by a bug in the tool or incomplete modeling.

A missed bug is presumably one that the tool is designed to spot, but which it didn't during the period in which it has been running.

sanxiyn6y ago

We should start to run Infer on all open source C and C++ code in existence.

dhekir6y ago

Not only Infer, but other static analyzers would also be useful.

Hopefully Software Heritage (https://www.softwareheritage.org) will help with that.

apaprocki6y ago

Coverity scans most open-source software you likely depend on: https://scan.coverity.com/projects

mhxion6y ago

Is there something wrong with acm's load balancer or whatever? First managed to read to the end of the article, but to download the PDF showed "Oops! This website is under heavy load." Now article page is under heavy load too.

Edit: It worked again right after I posted this comment.

sjtindell6y ago

Always cool to read about scale.

j / k navigate · click thread line to collapse

25 comments

wallstprog6y ago

These tools currently work with cppcheck, clang and PVS-Studio and can be found here: http://btorpey.github.io/blog/categories/static-analysis/

nh_996y ago

wallstprog6y ago

If Jenkins works for you, great. It does seem to support both clang and cppcheck, although not PVS-Studio (which is one of the better tools out there in my experience).

nickpsecurity6y ago

One of the things I like about this article is that it gives another example showing how formal methods catches deep errors unlikely to be caught with human review or testing:

I think the example Amazon gave for TLA+ was thirty-something steps. Most people's minds simply can't track 61 steps into software. Tests always have a coverage issue.

SanchoPanda6y ago

> All codebases see thousands of code modifications each day and our tools run on each code change. For Zoncolan, this can amount to analyzing one trillion lines of code (LOC) per day.

11 "missed bugs" on the 100 mm server-side lines of code per run, or ever?

m0zg6y ago

Matthias2476y ago

3 more replies

muglug6y ago

1 more reply

muglug6y ago

It's explained in the article:

A missed bug is presumably one that the tool is designed to spot, but which it didn't during the period in which it has been running.

sanxiyn6y ago

We should start to run Infer on all open source C and C++ code in existence.

dhekir6y ago