On my blog, I use the excellent highlight.js library to apply syntax highlighting to source code in the browser. This lets me copy and paste source code directly into the post (enclosed in a [% FILTER html %] block) instead of having to transform it somehow. It also keeps the number of tag-enclosed pieces of text to a minimum, which keeps the original DOM simple and, one hopes, means faster downloads and faster initial rendering.
A recent Stack Overflow question introduced me to the PPI::HTML module, which uses the amazing PPI module to parse Perl source code and associate CSS classes with the various elements.
If you ask the module to produce a complete HTML page, it will also embed the relevant CSS in the page, and will produce a pretty, colorful document. By default, the class names are rather verbose, and the module offers limited flexibility, but PPI::HTML::CodeFolder provides some enhancements that may be useful.
What if you wanted to produce a self-contained chunk of syntax-highlighted Perl without depending on external CSS or JavaScript? In that case, you can resort to a somewhat grungy technique I use when I am generating HTML email: post-process the HTML to replace the class attributes on elements with inline style attributes.
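To make the transformation concrete, here is a minimal sketch on a hard-coded fragment. It is not the approach used in the full script: it swaps in a plain regex for a real HTML tokenizer, so it only copes with attributes written exactly as class="name". The two-entry palette and the inline_styles name are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical two-entry palette; the keys mimic PPI::HTML class names.
my %colors = (
    keyword => '#0000FF',
    comment => '#008080',
);

# Simplified stand-in for a real HTML tokenizer: it only handles
# attributes written exactly as class="name".
sub inline_styles {
    my ($html, $colors) = @_;
    # Drop every class attribute; emit a style attribute in its place
    # only when the class has a known color.
    $html =~ s{ \s class="(\w+)" }{
        exists $colors->{$1} ? qq{ style="color:$colors->{$1}"} : ''
    }gex;
    return $html;
}

my $fragment = '<span class="keyword">my</span> <span class="unknown">$x</span>';
print inline_styles($fragment, \%colors), "\n";
# <span style="color:#0000FF">my</span> <span>$x</span>
```

Classes without a known color are simply dropped, so the element falls back to the color inherited from its container.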
Here is an example Perl script which generates a syntax-highlighted version of its own source code:
#!/usr/bin/env perl
use strict;
use warnings;
use PPI;
use PPI::HTML;
use HTML::TokeParser::Simple;
my %colors = (
    cast            => '#339999',
    comment         => '#008080',
    core            => '#FF0000',
    double          => '#999999',
    heredoc_content => '#FF0000',
    interpolate     => '#883333',
    keyword         => '#0000FF',
    line_number     => '#666666',
    literal         => '#999999',
    magic           => '#0099FF',
    match           => '#9900FF',
    number          => '#990000',
    operator        => '#DD7700',
    pod             => '#008080',
    pragma          => '#990000',
    regex           => '#9900FF',
    single          => '#664444',
    substitute      => '#9900FF',
    transliterate   => '#9900FF',
    word            => '#40c080',
);
my $highlighter = PPI::HTML->new(line_numbers => 0);
my $html = $highlighter->html(\ do { local $/; open 0; <0> });
print qq{<pre style="background-color:#fff;color:#000">},
    map_class_to_style($html, \%colors),
    qq{</pre>\n}
;
sub map_class_to_style {
    my $html = shift;
    my $colors = shift;
    my $parser = HTML::TokeParser::Simple->new(string => $html);
    my $out;
    while (my $token = $parser->get_token) {
        next if $token->is_tag('br');
        my $class = $token->get_attr('class');
        if ($class) {
            $token->delete_attr('class');
            if (defined(my $color = $colors->{$class})) {
                # shave off some characters if possible
                $color =~ s{
                    \A \#
                    ([[:xdigit:]])\1
                    ([[:xdigit:]])\2
                    ([[:xdigit:]])\3
                    \z
                }{#$1$2$3}x;
                $token->set_attr(style => "color:$color");
            }
        }
        $out .= $token->as_is;
    }
    $out;
}
And the output, in a rather distasteful color scheme, I admit:
The original script is 1,690 bytes. The syntax-highlighted chunk above, on the other hand, is 8,599 bytes, which is about a 409% increase (roughly five times the original size).
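With every highlighted token carrying its own style attribute, small savings add up, which is why map_class_to_style tries to collapse six-digit colors to the three-digit form. The same substitution can be tried in isolation; the sketch below just wraps it in a helper, and the shorten_hex name is mine, not part of the script.

```perl
use strict;
use warnings;

# Collapse #RRGGBB to #RGB when each channel is a repeated hex digit;
# any other value is returned unchanged.
sub shorten_hex {
    my ($color) = @_;
    $color =~ s{
        \A \#
        ([[:xdigit:]])\1
        ([[:xdigit:]])\2
        ([[:xdigit:]])\3
        \z
    }{#$1$2$3}x;
    return $color;
}

for my $color ('#FF0000', '#883333', '#DD7700', '#40c080') {
    printf "%s -> %s\n", $color, shorten_hex($color);
}
# #FF0000 -> #F00
# #883333 -> #833
# #DD7700 -> #D70
# #40c080 -> #40c080
```

Each successful match saves three bytes per style attribute. Note that the backreferences compare case-sensitively, so a mixed-case value such as #Ff0000 is left as it is.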
PS: You can discuss this post on /r/perl.