1

Here is the thing that I am trying to accomplish:

In broader sense, parse the XML data using a SAX parser and insert it into the appropriate database column in a MySQL table.

Here is sample Books.xml

 <?xml version="1.0" encoding="UTF-8"?>
<!--Sample XML file generated by XMLSpy v2009 sp1 (http://www.altova.com)-->
<bks:books xsi:schemaLocation="urn:books Untitled1.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:bks="urn:books">
        <book id="String">
                <author>String</author>
                  <authorFirstName>String</authorFirstName>
                  <authorLastName>String</authorLastName>
                <title>String</title>
                   <titleNo>3</titleNo>
                <genre>String</genre>
                <offer>String</offer>
                   <price>3.14159E0</price>
                <pub_date>1967-08-13</pub_date>
                <review>String</review>
                  <reviewsratings></reviewratings>
        </book>
</bks:books>

Perl script that uses a SAX Parser:

#!usr/bin/perl -w

use XML::SAX::ParserFactory;
use MySaxHandler;
my $handler = MySaxHandler->new();
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_uri("books.xml")

SAX Parser Handler Module MySaxHandler.pm:

package MySaxHandler;
use Data::Dumper;
use base qw(XML::SAX::Base);
my $in_books = 0;
sub start_document{
    my($self,$data) = @_;
    print "Parsing Started:\n";
}
sub start_element {
    my ($self,$data) = @_;
    my %attribs = %{$data->{'Attributes'}}; foreach( keys( %attribs )) { print " $_ = " . $attribs{$_}->{Value} . "\n"; } # -> Prints values of attributes.
    print "Starting element: ".$data->{Name}."\n\t";
    $in_books++;
    }
sub end_element {
    my($self,$data) = @_;
    print "\t Ending element:".$data->{Name}."\n";
    $in_books--;
}
sub characters{
    my($self,$data) = @_;
     if($in_productOffering){
      print "\t Element Values:".$data->{Data}."\n";
     }
}
sub end_document{
    my($self,$data) = @_;
        print "Parsing Completed\n";
}
1;

Steps:

  1. Parse XML using XML::SAX parser
  2. For each node in the XML if it has child nodes then traverse through it and once I reach the leaf node then generate insert statement dynamically for inserting that node value into Database table's column; e.g., if I have XML leaf node as price and its parent node is offer then Perl script should insert value of price node in price column of offer table in database.
  3. I have a large XML file which has many nodes and each node in turn has many child nodes and grand child nodes.

Question:

  1. How can I insert values of element tags in books.xml into the appropriate MySQL database columns for. E.g. price value should go into offer table's price column while I am parsing through the XML using the SAX Parser ?

    OR

  2. How can I generate all inserts and once I have all the insert statements then go and connect to a MySQL database and just dump them?

Tricky Part:

Tricky part is that there are some data that have interdependencies. E.g. price node in offer parent node is also related to default price node and so while generating inserts statements we have to keep in mind that values are properly inserted into the databases, but we are not allowed to use InnoDB table of MySQL but the only engine we are allowed to use is MyISAM.

What are possible suggestions in Perl to work around these issues?

9
  • 4
    Your question is quite confusing. It might help to provide some sample code, with comments explaining what you hope to accomplish at the points where you're currently stumped. Commented Nov 17, 2009 at 18:04
  • 2
    Could you possibly edit down your code and data to a small sample that exhibits the problem you are trying to solve? Commented Nov 17, 2009 at 18:49
  • 1
    It's not clear now. Condensing down the problem is necessary both for you to be able to explain the problem clearly and for everyone (including yourself) to understand it. Commented Nov 17, 2009 at 18:59
  • 1
    What you consider important parts are parts that would only be relevant to someone who had experience with almost the same exact problem as you. If you remove the java-centric terminology, trim the examples, and describe what want in a more abstract level without relying on your past experiences, your question will be more attractive to a wider audience and you might learn a more perl-appropriate solution. Commented Nov 17, 2009 at 19:10
  • 1
    Please provide a sample of the XML, along with what you expect to happen in the database. Commented Nov 17, 2009 at 21:21

2 Answers 2

2

I think your problem is step two. Based on your description of the problem, it sounds like you need to collect more information before you insert a record. Instead of inserting a record once you reach a node, wait until you collect everything you need. That might mean adding the record to a queue, for instance.

That's all I can say based on the very limited information you've provided. In you want better advice, construct a small demonstration script, perhaps with sample data, that illustrates what you are trying to do. When I have these sorts of problems, I test ideas in small programs instead of the big, production code.

Sign up to request clarification or add additional context in comments.

2 Comments

Right now am in the process of generating small program which would clearly explain what am trying to achieve. I will post my response as soon as am done with creating a sample script.
Yes. My problem is step 2. I can parse the XML but the thing I want to accomplish is to generate insert statements dynamically while am parsing through the XML. I have created modules like Offer.pm which would give me insert statements but my query is about how can I dynamically invoke those module while am parsing the XML.
0

From my pov you may try to populate the data first into a Perl internal data structure (either an object or into a hash).
I would then write the SQLs based on this data structure - thus you have decoupled the database and the XML parsing.
It depends on the amount of data, if you are better of to do a bulk insert or perform a commit after every item.

2 Comments

As mentioned am using SAX parser and so parsing is done as event gets fired. Now from this how can I generate SQLs Inserts for inserting into Database ?
SAX parsing simply means that you need to keep the history of read elements yourself - thus you need to cache all relevant elements until you move the data to the database. For this task you need a hash as a cache - after every new record you may then create the SQL based on the hash/cache and null it after each logical element.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.