time ./startup_time.pl 0.016s
time ./startup_time.py 0.295s
Parsing a 20 MB log file with a regex:
time ./parse_log.pl 0.683s
time ./parse_log.py 1.534s
For which metric are you claiming Perl is slower than Python?

#!/usr/bin/env perl
use 5.026;
open my $fh, '<', 'logs1.txt';
while (<$fh>) {
    chomp;
    say if /\b\w{15}\b/;
}
close $fh;

Python 3.8

#!/usr/bin/env python
import re
with open('logs1.txt', 'r') as fh:
    for line in fh:
        if re.search(r'\b\w{15}\b', line): print(line, end='')

Why isn't it a fair comparison? The startup time is generic and string parsing is a major feature of, say, web development. I didn't say Perl5 numerics match Python's, but even there Python relies on external libs.

#!/usr/bin/env python3
import re
with open('logs1.txt', 'r') as fh:
    regex = re.compile(r'\b\w{15}\b')
    for line in fh:
        if regex.search(line): print(line, end='')

Perl almost certainly does this by default for regex literals, and that's a fair advantage for the "kitchen sink" style of language design versus orthogonal features (regex library, raw strings) that Python uses.

time ./index.pl 0.258s
time ./index.py 0.609s
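
To separate startup cost from processing cost, the numbers already posted in this thread can simply be subtracted. A quick sketch of that arithmetic, using only the timings quoted above (the variable names are mine, and it assumes the standalone startup measurements carry over unchanged to the index scripts, which single noisy runs may not guarantee):

```python
# Wall-clock times reported in this thread, converted to milliseconds.
startup_ms = {"perl": 16, "python": 295}   # startup_time.pl / startup_time.py
index_ms = {"perl": 258, "python": 609}    # index.pl / index.py

# How much longer Python takes just to start.
startup_gap = startup_ms["python"] - startup_ms["perl"]

# How much longer Python takes for the whole index run.
total_gap = index_ms["python"] - index_ms["perl"]

# What is left once the startup difference is taken out.
processing_gap = total_gap - startup_gap

print(f"startup gap:    {startup_gap} ms")
print(f"total gap:      {total_gap} ms")
print(f"processing gap: {processing_gap} ms")
```
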
If you factor in that Python's startup time is 0.279s slower than Perl's, the processing differential comes down to 0.072s.

The issue here is that Python’s regex engine has overhead, and with lots of sequential calls on small strings like that, the overhead adds up.
If you batch lines together in chunks you’ll see a huge improvement in speed, but the point is that it’s not “Python vs Perl”, it’s “Python’s regex engine vs Perl’s regex engine”. Which is about as contrived and Perl-biased a benchmark as there ever was.
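
For anyone who wants to try the batching idea, here is a minimal sketch of what it could look like. It is not the code anyone in this thread ran: the function names and `batch_size` are made up for illustration, and it assumes every line still carries its trailing newline (as lines read from a file do), so that a 15-character word can never span two lines once they are joined. Whether it is actually faster on your data is something to measure, not assume.

```python
import re

# Same pattern as the scripts in this thread: a word of exactly 15 characters.
WORD15 = re.compile(r'\b\w{15}\b')

def count_per_line(lines):
    """Baseline: one regex call per line, as in the scripts above."""
    return sum(1 for line in lines if WORD15.search(line))

def count_batched(lines, batch_size=1000):
    """Join lines into chunks and scan each chunk with one finditer call.

    Assumes each line ends with a newline (only the very last may not),
    so matches cannot cross line boundaries inside a chunk.
    """
    total = 0
    for start in range(0, len(lines), batch_size):
        chunk = ''.join(lines[start:start + batch_size])
        last_line_start = -1
        for m in WORD15.finditer(chunk):
            # Find where the line containing this match begins, so a line
            # with several matches is still counted only once.
            line_start = chunk.rfind('\n', 0, m.start()) + 1
            if line_start != last_line_start:
                total += 1
                last_line_start = line_start
    return total
```

To compare the two on the real file, read it with `fh.readlines()` and time each function on the resulting list (for example with `time.perf_counter()`); both should report the same number of matching lines.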
[admin@localhost ~]$ time ./test.py
real 0m0.295s
user 0m0.158s
sys 0m0.138s
[admin@localhost ~]$ time ./test.pl
real 0m0.164s
user 0m0.158s
sys 0m0.006s
#!/usr/bin/env perl
use 5.16.3;
open my $fh, '<', 'logs1.txt';
while (<$fh>) {
    chomp;
    if (/\b\w{15}\b/) {}
}
close $fh;
#!/usr/bin/env python
import re
regex = re.compile(r'\b\w{15}\b')
with open('logs1.txt', 'r') as fh:
    for line in fh:
        if regex.search(line):
            continue

Perl5 won the dec2bin benchmark.
The other thing I learned was that PHP's binary/decimal functions are two orders of magnitude slower, despite its core interpreter performance being best-in-class.
It has even fewer backcompat problems than the native regex.