> We study the relative efficiencies of the random and systematic approaches to automated software testing. Using a simple but realistic set of assumptions, we propose a general model for software testing and define sampling strategies for random (R) and systematic (S₀) testing, where each sampling is associated with a sampling cost: 1 and c units of time, respectively. The two most important goals of software testing are: (i) achieving in minimal time a given degree of confidence x in a program's correctness and (ii) discovering a maximal number of errors within a given time bound n̂. For both (i) and (ii), we show that there exists a bound on c beyond which R performs better than S₀ on the average. Moreover for (i), this bound depends asymptotically only on x. We also show that the efficiency of R can be fitted to the exponential curve. Using these results we design a hybrid strategy H that starts with R and switches to S₀ when S₀ is expected to discover more errors per unit time. In our experiments we find that H performs similarly or better than the most efficient of both and that S₀ may need to be significantly faster than our bounds suggest to retain efficiency over R.
Also, for (say) C programs of moderate size, I believe current program analysis techniques do not achieve anything near 100% branch coverage. In the KLEE paper they manage to get ~90% on average for the GNU coreutils using concolic execution – which is really impressive! But the coreutils are tiny programs. For larger programs the cost of symbolic execution is high and random testing can often get you more bugs faster.