Basic XML Parser skeleton (simple if you know DOM/SAX)

$100-150 USD

In Progress
Posted about 16 years ago

Paid on delivery
The purpose of this project is to create an XML parser that uses a combination of DOM and SAX to parse a specially formatted XML file, executes formatting functions for each field, and then adds the results to a database. An example of the XML format is provided below. The purpose of this assignment is to create a well-formatted skeleton class that reads the provided XML input file and gives me a place to later add special logic for processing the XML content.

## Deliverables

**XML Parser Skeleton**

**Input:** Here is an example input file:

    <xml>
      <record num="0">
        <Description> Some text here <br/>
          May include any amount <p> html </p> code, etc.
        </Description>
        <Name>
          Some text here as well
        </Name>
        <Date>
          5/5/5 something <a href="...">something</a>
        </Date>
      </record>
      <record num="1">
        <Date> 7/7/7 5:5:5 </Date>
        <Name> <div><a href="..."> Some name </a> </div>
        </Name>
        <Extra_Info> Lots of text <hr/> May include any amount <p font="font"> html </p> code, etc.
        </Extra_Info>
      </record>
    </xml>

About the format of the XML file:

1) The file is always split up into “<record>” entries. The file can be very large: thousands or even tens of thousands of records. For this reason you need to use a SAX parser to read in the individual records. However, the records themselves will never be too large, so each record can be loaded via DOM (more on this further in the document).
2) Within each record are “fields”. Each field is a top-level XML tag within the record. For example, the first record has the fields “<Description>”, “<Name>” and “<Date>”. Each field needs to be associated with a method for processing it, and the contents of the field need to be passed to the processing method as a DOM tree.
3) Within each field there can be any number of HTML/XML tags. They all need to be loaded into memory as DOM, but only the method that deals with the field will ever process them. Very often the methods processing the fields will just use “asText()” to get all the text contents, but sometimes they will need to use the DOM elements as well.

**Output:** The program's output is the processed contents of the records, added to a MySQL database. The database connection settings can be hard-coded. On startup the program needs to open a database connection; at the end of each record the formatted contents are added to the database.

**Program Requirements:**

1) The program needs to be a simple stand-alone command-line executable.
2) The name of the input file should be passed as a command-line argument.
3) The code needs to be very neatly spaced and commented. Remember you are writing a skeleton into which someone else will be adding logic, so it needs to be easy to work with.
4) Please make an “ant” build file (it will be short but we still need one).
5) Please configure log4j and set it up to log errors/warnings to standard output.

**Code Requirements**

Please split up the code into the following two classes:

“WkXMLParser”

- We will create one instance of this class.
- Before we parse the file we need to map field names to handling methods:
    - addFieldHandler(fieldName, handlerMethod)
        - Maps methods to the field name they handle.
        - It is very important that the functions are called in the same order in which they were mapped, not in the order in which the fields occur in the XML file.
    - addRecordStartHandler(handlerMethod)
    - addRecordEndHandler(handlerMethod)
- The final method would be “importXML(inputFile)”. If the start handler or end handler isn't defined, it should throw an exception.
- If during importing you encounter a field name with no method associated with it, you should log a warning but continue.

“WkImport”

- This is the class with the main(), the one we run.
- Creates the database connection.
- Instantiates WkXMLParser.
- Contains the field processing methods.
- Adds the field processing methods to WkXMLParser.
- Contains “startRecord()” and “endRecord()”.
- For the purpose of the skeleton, provide the following methods:
    - startRecord(): clears any previous data stored in the member variables (name and description).
    - fieldName(…): gets the text value of the DOM tree using asText and saves it to the member variable “name”.
    - fieldDescription(…): gets the text value of the DOM tree using asText and saves it to the member variable “description”.
    - endRecord(): creates a new record in the database with name and description.

**General Specification Points**

- The importXML method should read the input XML file using SAX, then for every record load the contents into memory using DOM.
- At the start of each record it should first call the record start handler.
- Within each record you need to call the individual field handlers. Again, it is important that you call the field handlers in the same order that they were added using the “addFieldHandler” method, and NOT in the order that they appear in the XML file. This is one reason why the record should be loaded in DOM.
- Keep track of which fields were used. If any fields are left that were not associated with a field handler, log a warning for each of them.
- Keep track of the number of the record being processed and display it in any warnings.
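
The SAX-plus-DOM flow described in the specification above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the awarded deliverable: the class name `WkXMLParserSketch`, the use of `Consumer<Element>` and `Runnable` as handler types, the `InputSource` parameter (instead of a file name), the 1-based record counter, and the `warnings` list standing in for log4j output are all hypothetical choices made to keep the example self-contained. It also assumes each field name occurs at most once per record, and it uses the standard `getTextContent()` where the posting mentions “asText()”.

```java
import java.io.StringReader;
import java.util.*;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class WkXMLParserSketch extends DefaultHandler {
    // LinkedHashMap preserves registration order, which drives dispatch order.
    private final Map<String, Consumer<Element>> fieldHandlers = new LinkedHashMap<>();
    private Runnable recordStart, recordEnd;
    private Document doc;     // factory for the DOM nodes of each record
    private Node current;     // DOM node being filled; null outside a <record>
    private int recordCount;  // 1-based number of the record being processed (assumption)
    final List<String> warnings = new ArrayList<>(); // stand-in for log4j warnings

    public void addFieldHandler(String fieldName, Consumer<Element> handler) {
        fieldHandlers.put(fieldName, handler);
    }
    public void addRecordStartHandler(Runnable handler) { recordStart = handler; }
    public void addRecordEndHandler(Runnable handler) { recordEnd = handler; }

    public void importXML(InputSource input) throws Exception {
        if (recordStart == null || recordEnd == null)
            throw new IllegalStateException("record start and end handlers must be set");
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        SAXParserFactory.newInstance().newSAXParser().parse(input, this);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (current == null && !qName.equals("record")) return; // e.g. the <xml> root
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++)
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        if (current != null) current.appendChild(e);
        current = e;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null)
            current.appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (current == null) return;
        Node parent = current.getParentNode();
        if (parent == null) processRecord((Element) current); // the <record> just closed
        current = parent; // becomes null again once the record is finished
    }

    private void processRecord(Element record) {
        recordCount++;
        Map<String, Element> fields = new LinkedHashMap<>();
        for (Node n = record.getFirstChild(); n != null; n = n.getNextSibling())
            if (n instanceof Element) fields.put(n.getNodeName(), (Element) n);
        recordStart.run();
        // Dispatch in registration order, not in document order.
        for (Map.Entry<String, Consumer<Element>> h : fieldHandlers.entrySet()) {
            Element field = fields.remove(h.getKey());
            if (field != null) h.getValue().accept(field);
        }
        // Whatever is left had no handler: warn, but keep going.
        for (String unused : fields.keySet())
            warnings.add("record " + recordCount + ": no handler for field " + unused);
        recordEnd.run();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xml><record num=\"0\"><Description>Some <p>html</p> text</Description>"
                   + "<Name>First</Name><Date>5/5/5</Date></record></xml>";
        WkXMLParserSketch parser = new WkXMLParserSketch();
        List<String> calls = new ArrayList<>();
        parser.addRecordStartHandler(() -> calls.add("start"));
        parser.addFieldHandler("Name", e -> calls.add("name=" + e.getTextContent().trim()));
        parser.addFieldHandler("Description", e -> calls.add("description=" + e.getTextContent().trim()));
        parser.addRecordEndHandler(() -> calls.add("end"));
        parser.importXML(new InputSource(new StringReader(xml)));
        System.out.println(calls);           // [start, name=First, description=Some html text, end]
        System.out.println(parser.warnings); // [record 1: no handler for field Date]
    }
}
```

Only one record's DOM subtree is held in memory at a time, which is the reason for pairing SAX with DOM here. In the demo the Name handler runs before the Description handler even though Description comes first in the input, because Name was registered first. A real WkImport class would replace the lambdas with methods that fill the name and description members and insert a database row in the record end handler.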
Project ID: 3787748

About the project

6 proposals
Remote project
Active 16 yrs ago

Awarded to:
$127.50 USD in 14 days
5.0 (123 reviews)
5.7
6 freelancers are bidding on average $105 USD for this job
$93.50 USD in 14 days
4.9 (98 reviews)
5.1

$85 USD in 14 days
4.8 (18 reviews)
4.0

$85 USD in 14 days
5.0 (2 reviews)
0.3

$110.50 USD in 14 days
5.0 (2 reviews)
0.0

$127.50 USD in 14 days
0.0 (0 reviews)
0.0

About the client

United States
4.9
48
Member since Jan 30, 2008

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)