A typical thing that performance benchmarks do to negate guest OS caching is to process significantly more data as what the available RAM is set to. For example, if your guest OS RAM is set 512MB, process 10GB of random data. Of course then the question is how-to get random data as you don't want to end up testing the random generator ;) or your host OS caching.
Another way to make sure you test data committed to disk could be to include a "shutdown guest OS" part and measure total time until the guest has been fully shut down.
I know that at least VMware has the ability to turn off disk caching (in Fusion, select Settings, advanced "Hard Disk Buffering" <- disable). I am not aware of a similar feature in Virtual Box, it might exist though.
Even while you tested the same guest OS, we don't even know if the hard disk adapters where both using the same hard disk drivers. Performance differs between IDE/Sata/SCSI drivers. SCSI drivers have queue depths, IDE drivers have not.