The c1.xlarge referenced in the article, which has more disk than c3.xlarge, unfortunately doesn't really have room for OS cache. You can try lowering the heap to leave more for the OS, but we found that HBase needed all the heap we could give it, or it risked OOM.
I haven't used c3.xlarge in production, because it has so much less disk. With that little data per node, though, you could probably get away with less heap in the RS, leaving more for the OS. However, keep in mind that the HBase block cache is optimized for the HBase use case, whereas the OS cache is not. Some profiling has been done, and the block cache usually performs better; see http://hadoop-hbase.blogspot.com/2012/12/hbase-profiling.htm... for example. So I would favor the block cache over OS cache on a low-memory system.
The ideal, in my opinion, is i2.4xlarge. You can give the RS a 25GB heap, which is manageable with Java 7's G1 GC and provides plenty of block cache, and still have roughly 100GB to split between the DataNode and OS cache, or anything else you want to run. I'll cover that in another post.
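
For a back-of-the-envelope check on that split, here is a rough sketch in Python. Two numbers in it are assumptions not stated above: that an i2.4xlarge has 122 GiB of RAM, and that the block cache gets 40% of the RS heap (the `hfile.block.cache.size` fraction; verify the value for your HBase version). The function and key names are made up for illustration.

```python
def memory_budget(total_ram_gib, rs_heap_gib, block_cache_fraction):
    """Split node RAM into RS heap, block cache, and what is left outside the heap."""
    if rs_heap_gib > total_ram_gib:
        raise ValueError("RS heap cannot exceed physical RAM")
    if not 0.0 <= block_cache_fraction <= 1.0:
        raise ValueError("block cache fraction must be between 0 and 1")
    return {
        # Heap handed to the RegionServer JVM.
        "rs_heap_gib": rs_heap_gib,
        # Portion of that heap used as block cache (assumed fraction).
        "block_cache_gib": rs_heap_gib * block_cache_fraction,
        # Everything outside the RS heap: DataNode, OS cache, other processes.
        "left_for_datanode_and_os_gib": total_ram_gib - rs_heap_gib,
    }


# Assumed figures: 122 GiB RAM on i2.4xlarge, 25 GiB heap, 40% block cache.
budget = memory_budget(total_ram_gib=122, rs_heap_gib=25, block_cache_fraction=0.4)
for name, gib in budget.items():
    print(f"{name}: {gib:.1f}")
```

Under those assumptions, about 97 GiB stays outside the RS heap, which is where the "roughly 100GB" figure comes from. Plugging in a smaller instance's RAM shows quickly how little is left for OS cache once the heap is sized.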