Find Jobs
Hire Freelancers

XML Data Feed Parser

$100-400 USD

Completed
Posted almost 17 years ago

$100-400 USD

Paid on delivery
At a high level, this project involves FTP'ing a zipped XML data feed to the host server, decompressing the data file, parsing it, and inserting the data into a database. As usual, it isn't quite as simple as it sounds. :-) For example, if the product already exists in the database, you just insert a daily record with the price and other related information. Since there is no mechanism for deleting products, we also have to scan the feed for all products in the database. If a product is in the database and not in the feed, we mark it deleted in the database. Details are in the Deliverables section below. ## Deliverables 1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done. 2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables): For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment. 3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement). Detailed Project Requirements ----------------------------------- 1. Create a PHP page named "[login to view URL]" that will be called once per day as a Windows Timed Task (my server is a Windows 2003 server with PHP 5.2.1 installed). I will set up the timed task, you will just need to create the page. The [login to view URL] page will download the ClickBank data feed via an HTTP GET at: [login to view URL] This URL should be configurable. 2. Decompress the archive and save the data feed to disk into an archive directory (the location of which is also configurable). Then parse the feed data. Insert the records into our "cbproduct" and "cbproductinfo" tables, the structure of which is show below. -- -- Table structure for table `cbproduct` -- CREATE TABLE `cbproduct` ( `productID` int(11) NOT NULL auto_increment, `title` varchar(128) NOT NULL default '', `description` varchar(255) NOT NULL default '', `link` varchar(255) NOT NULL default '', `dategory` varchar(80) NOT NULL default '', `siteID` varchar(80) NOT NULL default '', `deleted` tinyint(4) NOT NULL default '0', `addDate` timestamp NOT NULL default CURRENT_TIMESTAMP, PRIMARY KEY (`productID`), KEY `title` (`title`,`siteID`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ; -- ----------------------------------------------------- -- -- Table structure for table `cbproductinfo` -- CREATE TABLE `cbproductinfo` ( `productinfoID` int(11) NOT NULL auto_increment, `productID` int(11) NOT NULL default '0', `addDate` timestamp NOT NULL default CURRENT_TIMESTAMP, `popularity` int(11) NOT NULL default '0', `gravity` float NOT NULL default '0', `earnedPerSale` float NOT NULL default '0', `percentPerSale` float NOT NULL default '0', `referred` float NOT NULL default '0', `commission` int(11) NOT NULL default '0', PRIMARY KEY (`productinfoID`), KEY `productID` (`productID`,`addDate`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ; -- -- Constraints for dumped tables -- -- -- Constraints for table `cbproductinfo` -- ALTER TABLE `cbproductinfo` ADD CONSTRAINT `cbproductinfo_ibfk_1` FOREIGN KEY (`productID`) REFERENCES `cbproduct` (`productID`) ON DELETE CASCADE; The insert process must follow these rules: a. Check if the product already exists using the Id + Category fields from the data file as a primary key. For our purposes we consider the same product in two different categories to be two separate products. b. If the product already exists, get its productID (the [login to view URL] field) and insert a new related cbproductinfo record. The cbproductinfo table contains the popularity, gravity, earnedPerSale, percentPerSale, referred, and commission fields. c. If the product does not exist, insert a new cbproduct row, get its database ID, and then insert a new related cbproductinfo row. d. The [login to view URL] field is not represented in the ClickBank data feed. In order to derive this value, 1. Emulate the search form post at: [login to view URL] The keyword value should be the Id field from the feed. 2. Grab the URL of the link in the result. This will involve parsing HTML of the page returned after the form post. e. Last but not least, iterate all the products in the cbproduct database. For each product that exists in the XML data feed, set [login to view URL] = false (0), and for each product that does not exist in the feed, set [login to view URL] = true (1). I would like this project coded using an object oriented architecture utilizing PHP classes. I would like two data access classes, named after each table (i.e. "cbproduct" and "cbproductinfo"). The [login to view URL] page can and should utilize these two classes to do its work. The classes should expose basic create, update, and delete functionality via the following interface: * Save() - Saves the current object to the database. If the object already exists, perform an update. If not, an insert. * Delete() - Deletes the current object from the database using the currently defined primary key. * Delete(pk) - Deletes the row with primary key pk from the database (does not require fetching data first) * Get(pk) - Returns a reference to an object populated with data from the row with a primary key value of pk * GetCollection() - For the cbproduct class; returns all rows not marked as deleted * GetCollection(fk) - For the cbproductinfo class; returns all rows where productID = fk The mapping of the data feed's XML to database columns will be like this: Table Column XML Element -------------------------------------------------------- cbproduct productID auto-increment identity column cbproduct title cbproduct siteID cbproduct description cbproduct link n/a (scraped from site) cbproduct category cbproduct deleted Set to 1 if prod not in feed cbproduct addDate CURRENT_TIMESTAMP cbproductinfo productInfoID auto-increment identity col cbproductinfo productID related cbproduct row cbproductinfo addDate CURRENT_TIMESTAMP cbproductinfo popularity cbproductinfo gravity cbproductinfo earnedPerSale cbproductinfo percentPerSale cbproductinfo referred cbproductinfo commission Miscellanea: 1. The DTD is included in the data feed's zip archive. 2. The DTD provides the possibility of subcategories, i.e. categories within categories. We will just append the category names, thus: category/subcategory1/subcategory2 ... would be a single category the name for which is "category/subcategory1/subcategory2". If you've handled zip archives and XML parsing before, I would think this project should be very straightforward. I've tried to be very thorough as to the requirements; there shouldn't be any significant changes. If some change does come up, I would consider it a change order and would increase pay accordingly. ## Platform Windows 2003 Server PHP 5.2.1 MySql 4.2
Project ID: 2915939

About the project

10 proposals
Remote project
Active 17 yrs ago

Looking to make some money?

Benefits of bidding on Freelancer

Set your budget and timeframe
Get paid for your work
Outline your proposal
It's free to sign up and bid on jobs
Awarded to:
User Avatar
See private message.
$297.50 USD in 44 days
5.0 (27 reviews)
4.5
4.5
10 freelancers are bidding on average $253 USD for this job
User Avatar
See private message.
$255 USD in 44 days
5.0 (142 reviews)
5.7
5.7
User Avatar
See private message.
$340 USD in 44 days
3.9 (12 reviews)
4.9
4.9
User Avatar
See private message.
$340 USD in 44 days
5.0 (33 reviews)
4.7
4.7
User Avatar
See private message.
$276.25 USD in 44 days
4.7 (13 reviews)
4.3
4.3
User Avatar
See private message.
$255 USD in 44 days
4.7 (9 reviews)
3.9
3.9
User Avatar
See private message.
$127.50 USD in 44 days
5.0 (1 review)
0.0
0.0
User Avatar
See private message.
$85 USD in 44 days
0.0 (0 reviews)
0.0
0.0
User Avatar
See private message.
$255 USD in 44 days
0.0 (0 reviews)
0.0
0.0
User Avatar
See private message.
$297.50 USD in 44 days
0.0 (0 reviews)
0.0
0.0

About the client

Flag of UNITED STATES
United States
5.0
15
Member since Nov 28, 2000

Client Verification

Thanks! We’ve emailed you a link to claim your free credit.
Something went wrong while sending your email. Please try again.
Registered Users Total Jobs Posted
Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)
Loading preview
Permission granted for Geolocation.
Your login session has expired and you have been logged out. Please log in again.