When you're writing a virtual patch you know exactly what data you're dealing with and you can allow through only what's known to be good. Any other approaches (e.g., generic rules) deal with text in bulk and are prone to false positives.
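The positive-security idea described above can be sketched in a few lines. This is a hypothetical illustration, not a real ModSecurity rule: the endpoint `/app/view.php` and the parameter `id` are made-up names, and the "known good" definition (a short numeric id) is an assumption for the example.

```python
import re

# Hypothetical virtual patch: for one known-vulnerable endpoint,
# allow through only what is known to be good and reject everything else.
ALLOWED_ID = re.compile(r"^[0-9]{1,8}$")

def virtual_patch(path: str, params: dict) -> bool:
    """Return True if the request may pass this patch."""
    if path == "/app/view.php":
        # Only a short numeric id is known-good for this endpoint.
        return ALLOWED_ID.fullmatch(params.get("id", "")) is not None
    return True  # other endpoints are unaffected by this patch

print(virtual_patch("/app/view.php", {"id": "123"}))       # True
print(virtual_patch("/app/view.php", {"id": "1 OR 1=1"}))  # False
```

Because the patch targets one endpoint and one parameter with an exact allowlist, it cannot false-positive on unrelated traffic the way a generic bulk-text rule can.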
Even with this narrower focus, it's still a difficult problem. Here's a paper I wrote on this subject a while ago: https://blog.qualys.com/wp-content/uploads/2012/07/Protocol-...
Source: I am the original author of ModSecurity (but not of any of the rules packages).
For context, I'm coming from a place of adding it to very new deployments, where the needs are constantly changing, which is why it feels a bit square-peg-round-hole, I think.
I think ideally you'd want to use the rules to create some kind of temporal risk score for a given IP / client. E.g., if a single IP hits your service several times in 5 minutes with suspicious requests, then you block it. But this isn't possible, so you basically have to ensure all your rules only look for the most obviously suspicious requests; otherwise you're going to get far too many false positives.
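The temporal risk score above could be sketched as a sliding window of per-IP scores. The window length, point values, and threshold here are all hypothetical; a real implementation would also need eviction of idle IPs and shared state across workers.

```python
import time
from collections import defaultdict, deque

# Hypothetical thresholds: each suspicious request adds points for its
# source IP; the IP is flagged only when its score within the window
# crosses the threshold, so a single odd request is never blocked.
WINDOW_SECONDS = 5 * 60
BLOCK_THRESHOLD = 50

events = defaultdict(deque)  # ip -> deque of (timestamp, score)

def record(ip, score, now=None):
    """Record a suspicious request; return True if ip should be blocked."""
    now = time.time() if now is None else now
    q = events[ip]
    q.append((now, score))
    # Drop events that have aged out of the sliding window.
    while q and q[0][0] < now - WINDOW_SECONDS:
        q.popleft()
    return sum(s for _, s in q) >= BLOCK_THRESHOLD

# One mildly suspicious hit is not enough...
print(record("203.0.113.7", 20, now=0.0))    # False
# ...but repeated hits inside five minutes cross the threshold.
print(record("203.0.113.7", 20, now=60.0))   # False
print(record("203.0.113.7", 20, now=120.0))  # True
```

The design choice is the point of the comment: the decision is made on accumulated behavior over time, not on any single request in isolation.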
The only argument I could make in favour of using it is that a lot of attacks these days are automated, and therefore quite naive, because they're simply poking around for holes.
This results in fun debugging sessions where random requests are blocked; it also often breaks redirects from Azure AD logins, where it apparently triggers on the JWT token.
This is the same kind of "safety measure actually increases attack surface" as with antivirus programs.
All additions to a system inherently increase risk and require thoughtful alterations to existing preventative maintenance and disaster recovery planning, but the price of these products and their attendant marketing often leave no room for this to actually occur.
Sorry users, the string “a > b” is not allowed any more. But fear not, “å > b” works just fine
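The "å > b" joke above is the classic failure mode of ASCII-only pattern matching: the filter rejects the legitimate string but waves through a visually near-identical non-ASCII variant. A minimal sketch, with an entirely made-up blocking pattern:

```python
import re

# A naive WAF-style filter (hypothetical rule, for illustration only):
# reject anything that looks like an ASCII identifier, a ">", and
# another ASCII identifier -- e.g. a crude "HTML injection" heuristic.
BLOCK_PATTERN = re.compile(r"[a-zA-Z0-9_]\s*>\s*[a-zA-Z0-9_]")

def is_blocked(s: str) -> bool:
    return bool(BLOCK_PATTERN.search(s))

print(is_blocked("a > b"))   # True  -- legitimate input rejected
print(is_blocked("å > b"))   # False -- near-identical input passes
```

The ASCII character class simply does not contain "å", so the pattern that punishes legitimate users is trivially sidestepped, which is the worst of both worlds: false positives and no real protection.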
(not http.request.uri.path contains "." and any(http.request.headers["content-range"][*] contains "bytes"))
My dynamic pages shouldn't contain any "." (extension), so if a request contains a Content-Range: bytes header, we challenge it. You may have to modify this for your needs.
Though I would just leave those strings out of the WAF.
Blocking responses based on the content returned is pretty silly in the first place, but the whole point is to prevent the data from leaving, not from coming in. In fact the whole reason the rules exist is to prevent the case where your database starts burping up data you don't want it to. But if you were blocking the data from being accepted in the first place you wouldn't have that data in your database to begin with.
I mean, many of these dumb mistakes that someone would want their WAF to save them from, wouldn't be for leaks of user-provided PII, but rather for leaks of ops-provided secrets (e.g. connection credentials for upstream APIs), no?
Until you have any kind of JS code where the contents of an input box are round-tripped, so that the user enters something and either the interface breaks or their input starts getting blanked out against their will.
I have worked with such filters at some point earlier in life and had completely forgotten about them. This article brought back weird memories. It seemed like a good idea at the time. I think.
1: xkcd://327