And, it will help you save time and make money ;-)
I was motivated to post this by yet another one of those Stack Overflow questions. I decided at the outset not to answer it because the poster basically wants a job done for him for free:
I need the script to get the HTML, parse the table then to save the content (User + Online time), I would also want it to run every 15 mins and to make a report in the end of the day.
However, a so-called answer stated:
in my opinion perl can get a little ugly.
does it need to be perl….if it does ot i would recommend python.
Of course, I am kinda used to people proclaiming that Perl sucks, but the supreme irony of such an ugly post asserting Perl’s ugliness motivated me to write this.
HTML::TableExtract is beautiful. Over the years, it has saved me a lot of time, and even helped me make some money.
So, consider the Personal Income table available from the Bureau of Economic Analysis.
Let’s say I want to get the Unemployment Insurance row out of that table. Here’s how you do it using HTML::TableExtract:
    #!/usr/bin/env perl
    use strict; use warnings;
    use HTML::TableExtract;

    # Only look at the table whose id attribute is 'tbl'
    my $te = HTML::TableExtract->new(
        attribs => { id => 'tbl' },
    );

    # local copy of
    # http://bea.gov/iTable/iTableHtml.cfm?reqid=9&step=3&isuri=1&903=58
    $te->parse_file('personal-income.html');

    my ($table) = $te->tables;

    for my $row ($table->rows) {
        # Skip the first cell, take the label, keep the rest as values
        my (undef, $label, @row) = @$row;
        next unless defined $label;
        if ($label eq 'Unemployment insurance') {
            print "$label\t@row\n";
        }
    }
And, here is the output:
    C:\temp> uu
    Unemployment insurance	101.1 127.9 144.8 148.7 152.8 137.4 135.8 128.7 117.5 108.8 103.0 100.1
Of course, things can be refined, but this is pretty beautiful.
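As one possible refinement, here is a rough sketch that wraps the lookup in a small function and bails out quietly if the table or the row is missing. The function name row_for_label and the inline sample HTML are made up for illustration; the sample only mimics the layout the script above assumes (a leading cell, then the label, then the values), with the first two values borrowed from the output above.

```perl
#!/usr/bin/env perl
use strict; use warnings;
use HTML::TableExtract;

# Made-up stand-in for the saved page: same table id and the same
# column layout the original script assumes. Not real BEA markup.
my $html = <<'HTML';
<table id="tbl">
<tr><td>1</td><td>Placeholder row</td><td>1.0</td><td>2.0</td></tr>
<tr><td>2</td><td>Unemployment insurance</td><td>101.1</td><td>127.9</td></tr>
</table>
HTML

# Hypothetical helper: return the cells that follow $wanted in the
# table with the given id, or an empty list if nothing matches.
sub row_for_label {
    my ($html, $id, $wanted) = @_;

    my $te = HTML::TableExtract->new(attribs => { id => $id });
    $te->parse($html);
    $te->eof;    # signal end of document, as parse_file does for us

    my ($table) = $te->tables;
    return unless $table;

    for my $row ($table->rows) {
        my (undef, $label, @values) = @$row;
        next unless defined $label;
        return @values if $label eq $wanted;
    }
    return;
}

my @values = row_for_label($html, 'tbl', 'Unemployment insurance');
print "Unemployment insurance\t@values\n" if @values;
```

To run it against the real page instead, you would read the saved file into a string (or switch back to parse_file) and keep the rest as is.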