NAME
    Markup::Content - Extract content markup information from a markup
    document

SYNOPSIS
        my $content = Markup::Content->new(
            target => 'noname.html',
            template => 'noname.xml',
            target_options => {
                no_squash_whitespace => [qw(script style pi code pre textarea)]
            },
            template_options => {
                callbacks => {
                    title => sub {
                        print shift()->get_text();
                    }
                }
            });

        $content->extract();

        $content->tree->save_as(\*STDOUT);

DESCRIPTION
    This module uses a description of another markup page (the template)
    to match against a specified markup document (the target). The point
    is to extract formatted content from a markup page. While this module
    by itself offers a good deal of flexibility and reuse, the script
    [to be] written around this module is probably a better choice. See .

ARGUMENTS
    template
        This can be a file name, glob or internet address. If you already
        have a Markup::MatchTree you want to use as the template, you may
        set this argument to the tree instead. This argument is passed
        directly to the "set_template" method. See the section "TEMPLATES"
        for more information on what is meant by "template".

    template_options
        This HASHREF is sent directly to Markup::MatchTree as the
        "parser_options" option.

    target
        This can be a file name, glob or internet address. If you already
        have a Markup::Tree you want to use as the target, you may set
        this argument to the tree instead. This argument is passed
        directly to the "set_target" method.

    target_options
        This HASHREF is sent directly to Markup::Tree as the
        "parser_options" option.

    template_name
        The name of the template. This is unused right now, but will
        eventually be a nice-to-have-if-set option.

METHODS
    set_template(FILE|"Markup::MatchTree")
        Makes a template tree from the FILE or "Markup::MatchTree". See
        the section "TEMPLATES" for more information on what is meant by
        "template".

    set_target(FILE|"Markup::Tree")
        Makes a target tree from the FILE or "Markup::Tree".

    extract
        Based on the "template" and "target", builds a Markup::Tree
        containing just the content, accessible as $content->tree.

TEMPLATES
    A template, as wanted by this module, is nothing more than a simple
    XML document. I will try to outline the document structure below.

    The XML root node should be template.
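
    As a minimal sketch, assuming nothing beyond the root element named
    above, an empty template skeleton might look like the following. The
    comment inside it is only a placeholder and is not part of any
    documented structure.

        <?xml version="1.0"?>
        <template>
            <!-- placeholder: template content goes here -->
        </template>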