Here are basics, with the given table fragment completed to a meaningful table.
use warnings;
use strict;
use feature 'say';
use HTML::TableExtract;
use Scalar::Util qw(looks_like_number);
my $filename = shift // die "Usage: $0 file\n";
my $te = HTML::TableExtract->new;
$te->parse_file($filename);
my ($tbl) = $te->tables; # one table in the sample file
my (@values1, @values2); # for 90.5% and such, processing options
foreach my $row ($tbl->rows) {
# Tables often come with empty fields; keep them, for counting and such
my @fields = map { defined($_) ? $_ : '--' } @$row;
printf "%8s ", $_ for @fields;
say '';
# Criteria for how to identify the number aren't explained,
# but may it be the fourth column in a row starting with 'EM'?
if ($fields[0] =~ /^\s*EM\s*$/) {
push @values1, $fields[3] =~ s/^\s*|\s*$//gr; # see note in text
}
# Or is it simply the number ending with % sign?
foreach my $fld (@fields) {
if ($fld =~ /\s*(.+)\s*%/ and looks_like_number($1)) {
push @values2, $1;
}
}
}
say "@values1";
say "@values2";
NOTE: The /r modifier in the regex was added in v5.14. If your Perl is older see footnote†
Much of processing is shown for a demo. One needn't print those values (once you figure out where the items of interest are), and I wouldn't replace undef fields with --, done for clearer printout, but rather with '' (empty string). Also, we would need one criterion, not two.
Note that in the first case we keep the % sign, and use regex to clean up the spaces; in the second case the percentile sign is left out (and spaces end up cleaned out by regex naturally, in matching). These can both be adjusted to what you actually need of course.
Since neither the actual table nor data in it, nor the exact criteria, are given I can only offer hints and examples of code. With more detail this can be made more specific.
Note, when it comes to nailing down the last details, often by regex, things tend to become picky and sensitive to details; so careful.
The html file, completed off of the fragment in the question, used above:
<html>
<style> th, td { padding: 10px } </style> <!-- to better see it -->
<table align="center" rules="all">
<tr>
<th rowspan="2">EM</th>
<th colspan="2"><a href="http://www.google.com">ggl</a></th>
<td> 90.5%</td>
</tr>
<tr>
<td>data</td> <td>more</td> <td>etc</td>
</tr>
<tr>
<th rowspan="2">EM</th>
<th colspan="2"><a href="http://www.google.com">ggl</a></th>
<td> 0.0%</td>
</tr>
<tr> <td>data</td> <td>more</td> <td>etc</td> </tr>
</table>
</html>
† The /r modifier on the substitution regex makes it return the changed string (and it leaves the original as it is). This is precisely what one wants in many situations, the one in the code here being a good example (we merely want to add the changed string to an array).
However, in Perls older than version 5.14, when this feature was introduced, one has to do otherwise: either create a new variable that has the change, if you need to keep the original unchanged, or change that original and then use it.
Since we are pruning extra spaces here it may well be best to simply change $fields[3] and then add it to the array. So instead of
if ($fields[0] =~ /^\s*EM\s*$/) {
push @values1, $fields[3] =~ s/^\s*|\s*$//gr;
}
do
if ($fields[0] =~ /^\s*EM\s*$/) {
$fields[3] =~ s/^\s*|\s*$//g; # strip leading/trailing spaces
push @values1, $fields[3];
}
tabletag and thus contains no tables. Please make a minimal but reproducible example, i.e. skip everything not actually needed to show your problem (like use of LWP) and provide a complete data file which should work in your opinion but does not with your code.$teis an "undefined value" as that error indicates then things are really not right since the good module you use couldn't even parse a table. Then, the methodrowsgoes on tables, so$table->rows. See documentation; also, there are posts here with complete examples for processing tables (for example, this one, with links in it for two more complete examples)