Thu, 17 Aug 2006
Perl: The next level
We have run into a problem: the Data::Dumper module escapes non-ASCII (UTF-8) characters as \x{codepoint}.
This is probably intended, as the result is then usable regardless of
whether the "use utf8" pragma is active or not. But it is not
very readable when the data contains lots of Czech text.
I have solved it by filtering the Data::Dumper output through
the following substitution:
s/\\x\{[0-9a-f]+\}/"\"$&\""/geexms
This has moved me to the next level of my Perl proficiency, as this is
probably the first time I have used "/ee" in real-world code.
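To make the two evaluation passes of "/ee" concrete, here is a minimal sketch of the filter applied to a small dump. The sample string and the variable names are assumptions made for illustration. The literal braces are escaped because newer Perl versions may warn about or reject an unescaped "{" in a pattern. A second variant without "/ee" is included for comparison; it converts the hex digits directly with hex() and chr().

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Terse = 1;    # print the bare value, without "$VAR1 = "

# Hypothetical sample: the Czech city name with U+0148 as its last letter,
# written as an escape so this file does not depend on its own encoding.
my $dump = Dumper("Plze\x{148}");    # the dump contains the text \x{148}

# Variant 1, the /ee approach. The first evaluation builds the Perl string
# literal "\x{148}" (quotes included); the second evaluates that literal,
# which yields the actual character.
(my $with_ee = $dump) =~ s/\\x\{[0-9a-f]+\}/"\"$&\""/geexms;

# Variant 2, a single /e: capture the hex digits and convert them.
(my $with_chr = $dump) =~ s/\\x\{([0-9a-f]+)\}/chr hex $1/gexms;

binmode STDOUT, ':encoding(UTF-8)';
print $with_ee, $with_chr;
```

Because the pattern only matches a backslash, "x", and hex digits in braces, the second evaluation never sees arbitrary text from the dump. Loosening the pattern would make "/ee" risky, which is one reason to consider the chr/hex variant.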
4 replies for this story:
oozy wrote:
Very nice.
Adelton wrote:
Why the "xms"?
Yenya wrote: xms
Perl Best Practices, Chapter 12 (Regular Expressions): "Always use /xms". I think Damian Conway is right on this one. It is not suitable for Perl Golf, but as a programming habit it is actually good.
Adelton wrote:
Benchmark: timing 10000000 iterations of normal, xms...
normal: 6 wallclock secs ( 6.79 usr + 0.03 sys = 6.82 CPU) @ 1466275.66/s (n=10000000)
xms: 8 wallclock secs ( 7.52 usr + 0.01 sys = 7.53 CPU) @ 1328021.25/s (n=10000000)
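The benchmarked code is not shown in the comment. A comparison of this kind could be set up roughly as follows with the core Benchmark module. The input string, the pattern, and the iteration count are assumptions, and the timings will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical input: single quotes keep the backslash, so the string
# contains the literal text \x{148} as it would appear in a dump.
my $text = 'name => "Plze\x{148}"';

# Run each match one million times and print a report in the same
# format as the one quoted above.
my $results = timethese(1_000_000, {
    normal => sub { $text =~ /\\x\{[0-9a-f]+\}/ },
    xms    => sub { $text =~ /\\x\{[0-9a-f]+\}/xms },
});
```

timethese returns a hash reference of Benchmark objects keyed by label, so the figures can also be processed further, for example with Benchmark's cmpthese.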