Briefly, about this implementation:
1. it is Grisu-based, but not exactly Grisu2;
2. for now it produces only a raw ASCII representation, e.g. -22250738585072014e-324, without a decimal point and without a terminating '\0';
3. compared to Ryu, it has a significantly smaller code size and spends fewer clock cycles per digit, but is slightly inferior overall because it generates a longer ASCII representation.
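To make the raw format concrete, here is a minimal sketch of how a caller might read such output back. The helper name parse_raw is hypothetical and is not part of erthink; it only assumes the output is a non-terminated character range in the "digits, then e, then exponent" form shown above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of erthink): parses the raw, non-terminated
// character range [begin, end) back into a double.
static double parse_raw(const char *begin, const char *end) {
  assert(begin <= end);
  // std::string appends the terminating '\0' that the raw output lacks.
  const std::string terminated(begin, end);
  // strtod accepts a mantissa without a decimal point, so the raw form
  // "-22250738585072014e-324" is read as -2.2250738585072014e-308.
  return std::strtod(terminated.c_str(), nullptr);
}
```

For the example above, the integer mantissa 22250738585072014 combined with the exponent -324 denotes the same value as 2.2250738585072014e-308, so the missing dot only shifts the exponent.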
Now I would like to get feedback, gauge how much demand there is for this, and collect suggestions for further improvements. For instance, I think it is reasonable to implement conversion with a specified precision (i.e., with a specified number of digits), but not to provide a printf-like interface.
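As a rough illustration of what "a specified number of digits" could mean on top of the raw form, here is a sketch that trims an already generated digit string. The Decimal struct and limit_digits function are hypothetical names, not the erthink API. Note that rounding the shortest decimal string is not necessarily identical to correctly rounding the exact binary value, so a real implementation would probably do this inside the digit generation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical representation of the raw output: value = digits * 10^exp10.
struct Decimal {
  std::string digits; // decimal digits only, no sign and no dot
  int exp10;          // decimal exponent applied to the integer mantissa
};

// Hypothetical helper: keeps at most `precision` leading digits,
// rounding half-up on the first dropped digit.
static Decimal limit_digits(Decimal d, std::size_t precision) {
  assert(precision > 0);
  if (d.digits.size() <= precision)
    return d; // already short enough
  const bool round_up = d.digits[precision] >= '5';
  // Every dropped digit raises the exponent by one.
  d.exp10 += static_cast<int>(d.digits.size() - precision);
  d.digits.resize(precision);
  if (!round_up)
    return d;
  // Propagate the carry from the last kept digit towards the first.
  for (std::size_t i = precision; i > 0; --i) {
    if (d.digits[i - 1] != '9') {
      ++d.digits[i - 1];
      return d;
    }
    d.digits[i - 1] = '0';
  }
  // All kept digits were '9': e.g. "999" becomes "100" with exponent + 1.
  d.digits.insert(d.digits.begin(), '1');
  d.digits.pop_back();
  d.exp10 += 1;
  return d;
}
```

With the example from the list above, limiting 22250738585072014e-324 to 5 digits would give 22251e-312.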
The benchmark source code: https://github.com/leo-yuriev/dtoa-benchmark
The d2a() implementation: https://github.com/leo-yuriev/erthink/blob/master/erthink_d2a.h
The test source code: https://github.com/leo-yuriev/erthink/blob/master/test/d2a.cxx
Any suggestions are welcome!
Actually, I made some improvements to a quicksort-based algorithm, and it works pretty well for the target scenarios (i.e. it is faster than std::sort, which is based on introsort). So I want to evaluate these improvements on a good suite of test datasets to get a better picture of the pros and cons.