So, while playing with my shiny new toy and waiting for remote SAS jobs to finish, I ended up trying to install PerlIO::via::gzip, which pulls in PerlIO::Util. PerlIO::Util failed some of its tests, specifically one relating to PerlIO::tee.
Some of the failures involved an extra 0d (CR) before a 0d0a (CRLF).
These failures were perplexing to me, but at least CPAN Testers showed me that I was not alone.
On the other hand, looking at that list, there are clearly Perl installations on Windows where this does not happen. I have been unable to figure out the underlying cause, but I suspect it is at least somewhat related to the series of puzzled posts I made about UTF-8 output from perl in a cmd.exe window.
I went ahead and force-installed PerlIO::Util, just ’cause I am lazy and I wanted to use PerlIO::via::gzip. Here is a short script to start with. We are printing to scalars:
#!/usr/bin/env perl
use 5.020;
use strict;
use warnings;
use PerlIO::Util;
open my $f, '>:tee', \(my ($x, $y))
or die "tee open: $!";
binmode $f, ':crlf' or die "binmode: $!";
print $f "\n";
close $f;
say hexdump($_) for $x, $y;
# Thanks for the tip in the comments
sub hexdump { sprintf('%*v02x', ' ', $_[0]) }
And, the output is:
t.pl
0d 0a
0d 0a
or
t.pl | xxd
0000000: 3064 2030 610d 0a30 6420 3061 0d0a 0d 0a..0d 0a..
But, if I do this:
#!/usr/bin/env perl
use 5.020;
use strict;
use warnings;
use PerlIO::Util;
open my $f, '>:tee', 'x', 'y'
or die "tee open: $!";
binmode $f, ':crlf' or die "binmode: $!";
print $f "abc\n";
close $f;
I get:
xxd x
0000000: 6162 630d 0d0a abc...
xxd y
0000000: 6162 630d 0d0a abc...
That is, it looks like the "\n" above gets translated to CRLF, and then another layer translates the LF of that pair to CRLF again.
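To make the arithmetic concrete, here is a minimal sketch that mimics the suspected double translation with plain string substitution. It does not use PerlIO layers at all; `to_crlf` is a hypothetical helper that only approximates what a :crlf layer does on write.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: approximate what a :crlf layer does on write,
# i.e. turn every "\n" into a CR,LF pair.
sub to_crlf {
    my ($s) = @_;
    $s =~ s/\n/\r\n/g;
    return $s;
}

# Same hexdump trick as in the scripts above.
sub hexdump { sprintf('%*v02x', ' ', $_[0]) }

my $once  = to_crlf("abc\n");   # one translation pass
my $twice = to_crlf($once);     # a second pass over the already translated data

print hexdump($once),  "\n";    # 61 62 63 0d 0a
print hexdump($twice), "\n";    # 61 62 63 0d 0d 0a
```

The second line matches the bytes xxd shows for the files x and y, which is at least consistent with two translation passes being applied.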
It seems to my untrained eye that whatever is happening is probably happening within this function in PerlIO-Util.xs:
if(tab && tab->Open){
    f = tab->Open(aTHX_ tab, layers, i, mode,
                  fd, imode, perm, f, narg, args);
    /* apply 'upper' layers
       e.g. [ :unix :perlio :utf8 :creat ]
       ~~~~~
    */
    if(f && ++i < n){
        if(PerlIO_apply_layera(aTHX_ f, mode, layers, i, n) != 0){
            PerlIO_close(f);
            f = NULL;
        }
    }
}
A quick inspection seems to verify this. Here is the list of layers before the application of the :crlf layer:
---
- unix
- ~
- - CANWRITE
  - OPEN
  - TRUNCATE
---
- crlf
- ~
- - CANWRITE
  - FASTGETS
  - CRLF
  - TRUNCATE
---
- tee
- y
- - CANWRITE
  - FASTGETS
  - CRLF
  - TRUNCATE
And here is the list of layers after the binmode $f, ':crlf':
---
- unix
- ~
- - CANWRITE
  - OPEN
  - TRUNCATE
---
- crlf
- ~
- - CANWRITE
  - FASTGETS
  - CRLF
  - TRUNCATE
---
- tee
- y
- - CANWRITE
  - FASTGETS
  - CRLF
  - TRUNCATE
---
- crlf
- ~
- - CANWRITE
  - FASTGETS
  - CRLF
  - TRUNCATE
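For anyone who wants to produce a similar listing, here is a minimal sketch using the core PerlIO::get_layers function with details => 1. It is an assumption on my part that this is a reasonable stand-in: it uses an in-memory handle rather than :tee so that it runs without PerlIO::Util, and it prints the flags as a raw integer instead of decoding them into names like CANWRITE.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# In-memory handle stands in for the real files; no PerlIO::Util needed.
open my $fh, '>:crlf', \my $buf or die "open: $!";

# With details => 1, get_layers returns (name, argument, flags) triplets,
# bottom layer first.
my @details = PerlIO::get_layers($fh, details => 1);

my @rows;
while (my ($name, $arg, $flags) = splice @details, 0, 3) {
    push @rows, [ $name, $arg, $flags ];
    # Undefined arguments are shown as '~', as in a YAML dump.
    printf "%-8s arg=%-4s flags=0x%08x\n",
        $name // '~', $arg // '~', $flags // 0;
}

close $fh;
```

Running the same loop before and after a binmode call gives a before/after comparison of the layer stack.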
I have demonstrated in the past that I don’t necessarily understand PerlIO layers very well. But, perldoc PerlIO says:
:crlf
A layer that implements DOS/Windows like CRLF line endings. On read converts pairs of CR,LF to a single “\n” newline character. On write converts each “\n” to a CR,LF pair. Note that this layer will silently refuse to be pushed on top of itself. (emphasis mine)
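The documented behavior can be checked without :tee in the picture. This is a minimal sketch using only core Perl, with an in-memory handle as an assumption to keep it self-contained; `crlf_count` is a hypothetical helper name. It pushes :crlf onto a handle whose top layer is already :crlf and counts the :crlf layers before and after.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: how many :crlf layers are on this handle?
sub crlf_count {
    my ($fh) = @_;
    return scalar grep { $_ eq 'crlf' } PerlIO::get_layers($fh);
}

sub hexdump { sprintf('%*v02x', ' ', $_[0]) }

# The top layer is :crlf from the start.
open my $fh, '>:crlf', \my $buf or die "open: $!";
my $before = crlf_count($fh);

# Try to push :crlf directly on top of :crlf.
binmode $fh, ':crlf' or die "binmode: $!";
my $after = crlf_count($fh);

print $fh "abc\n";
close $fh;

printf "crlf layers: before=%d after=%d\n", $before, $after;
print hexdump($buf), "\n";
```

If the count stays the same and the output ends in a single 0d 0a, then the refusal works when :crlf is the top layer. The stacks above differ in that :tee sits between the two :crlf layers, so the new :crlf is not being pushed directly on top of another :crlf.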
Any ideas?