Running the commit and code you pointed to, here are my new results:
$ hyperfine -N --warmup 5 './benchmark/fast_parser data/fastq_test.fastq' './benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq '
Benchmark 1: ./benchmark/fast_parser data/fastq_test.fastq
Time (mean ± σ): 675.0 ms ± 2.4 ms [User: 399.3 ms, System: 269.4 ms]
Range (min … max): 670.5 ms … 677.5 ms 10 runs
Benchmark 2: ./benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq
Time (mean ± σ): 840.8 ms ± 3.0 ms [User: 578.0 ms, System: 257.0 ms]
Range (min … max): 837.0 ms … 847.7 ms 10 runs
Summary
./benchmark/fast_parser data/fastq_test.fastq ran
1.25 ± 0.01 times faster than ./benchmark/needletail_benchmark/target/release/rust_parser data/fastq_test.fastq
This indeed shows your parser running about 25% faster than the needletail version.
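For anyone who wants to sanity-check the summary line, here is a minimal sketch that recomputes the ratio from the mean and σ values reported above. It assumes the uncertainty comes from standard first-order error propagation for a quotient of two independent measurements; I have not checked that this is exactly what hyperfine does internally, and the function name `relative_speed` is just for illustration.

```python
import math

def relative_speed(mean_fast, sd_fast, mean_slow, sd_slow):
    """Return (ratio, uncertainty) for how many times faster the fast run is.

    Assumption: the two timings are independent, so the relative errors
    add in quadrature (first-order propagation for a quotient).
    """
    ratio = mean_slow / mean_fast
    rel_err = math.sqrt((sd_fast / mean_fast) ** 2 + (sd_slow / mean_slow) ** 2)
    return ratio, ratio * rel_err

# Mean and sigma in milliseconds, copied from the hyperfine output above.
ratio, err = relative_speed(675.0, 2.4, 840.8, 3.0)
print(f"{ratio:.2f} ± {err:.2f}")  # prints "1.25 ± 0.01"

# The same result expressed as a reduction in wall-clock time.
time_saved = 1 - 675.0 / 840.8
print(f"{time_saved:.1%} less wall-clock time")
```

Note that "25% faster" here is a speed ratio; in terms of elapsed time, fast_parser finishes in roughly 20% less time than the needletail binary.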