I'm writing this because I've done it too, for a performance-critical embedded application, and I think it's an insight worth sharing, for anybody else who comes across this and is trying to do something similar. Not to put you down or one-up you or anything; I'm very sorry to have come across that way. Can I try again?
For a domain like [0,1], the Remez algorithm will do a great job -- because the exponent is barely changing, just the mantissa. For a range like [-10, 10], you won't find any polynomial of reasonable order that has acceptable error.
But, there's a really neat trick that does let the Remez algorithm work over the entire domain of floats, and get excellent performance with just a fourth or fifth order polynomial, or even third order if you're pressed for time, and minimal extra computation.
That is, with just a little simplification: since e^x = 2^(x/ln(2)), you multiply x by (1/ln(2)), floor it, and stuff the resulting integer (plus the exponent bias) into the exponent bits. Then you take the remainder (what is left over after taking the floor, a value in [0,1)), make that your input to the polynomial, and stuff the result into the mantissa bits.
There's a little more to it; signs and NaNs need to be handled correctly, subnormal numbers can be treated as a special case with a different Remez polynomial, and a few other corner cases and exact values might be important (e^0, e^1, e^-1). But the result works wonders and it's basically just using Remez but transforming the input to a domain that behaves more like a polynomial, leveraging the magic of IEEE-754.