I'm asking because, in my experience, the extra level of abstraction provided by Cascading, Crunch, etc. is a huge advantage, and if you're making a conscious choice to operate at a lower level, you'd better be getting something significant in return; it's not clear to me yet what that is.
But if you are thinking about learning Hadoop through the standard Hadoop API, or if your project needs to use that API for some particular reason, we recommend that you use Pangool instead.
Or, if you are considering implementing another abstraction on top of Hadoop, building it on Pangool would probably also be a good idea.
In fact, we believe that the default Hadoop API should look like Pangool.
return Pattern.compile(regex).split(this, limit);
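For anyone reading along: the line above matches the body of `String.split(String regex, int limit)` in older JDK sources, which means the regex can be recompiled on every call (newer JDKs add a fast path for simple single-character delimiters, so this does not always apply). A minimal sketch of the two approaches, assuming a comma delimiter with optional whitespace; the class and method names here are made up for illustration and do not come from the benchmark:

```java
import java.util.regex.Pattern;

public class SplitSketch {
    // Hypothetical delimiter: a comma with optional surrounding whitespace.
    private static final String DELIMITER_REGEX = "\\s*,\\s*";

    // Compiled once and reused across calls.
    private static final Pattern DELIMITER = Pattern.compile(DELIMITER_REGEX);

    // Goes through String.split, which compiles this regex on each call.
    static String[] splitPerCall(String line) {
        return line.split(DELIMITER_REGEX, -1);
    }

    // Reuses the precompiled pattern; the resulting tokens are the same.
    static String[] splitPrecompiled(String line) {
        return DELIMITER.split(line, -1);
    }

    public static void main(String[] args) {
        String line = "a , b,c";
        System.out.println(String.join("|", splitPerCall(line)));
        System.out.println(String.join("|", splitPrecompiled(line)));
    }
}
```

Both methods return the same tokens; the only difference is where the pattern compilation happens, which can matter when the split runs once per input record.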
The benchmark seems fair to me.