At a high level, this project involves FTP'ing a zipped XML data feed to the host server, decompressing the data file, parsing it, and inserting the data into a database.
As usual, it isn't quite as simple as it sounds. :-)
For example, if the product already exists in the database, you just insert a daily record with the price and other related information.
Since there is no mechanism for deleting products, we also have to scan the feed for all products in the database. If a product is in the database and not in the feed, we mark it deleted in the database.
Details are in the Deliverables section below.
## Deliverables
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
Detailed Project Requirements
-----------------------------------
1. Create a PHP page named "[login to view URL]" that will be called once per day as a Windows Timed Task (my server is a Windows 2003 server with PHP 5.2.1 installed). I will set up the timed task, you will just need to create the page. The [login to view URL] page will download the ClickBank data feed via an HTTP GET at:
[login to view URL]
This URL should be configurable.
2. Decompress the archive and save the data feed to disk into an archive directory (the location of which is also configurable). Then parse the feed data. Insert the records into our "cbproduct" and "cbproductinfo" tables, the structure of which is show below.
--
-- Table structure for table `cbproduct`
--
CREATE TABLE `cbproduct` (
`productID` int(11) NOT NULL auto_increment,
`title` varchar(128) NOT NULL default '',
`description` varchar(255) NOT NULL default '',
`link` varchar(255) NOT NULL default '',
`dategory` varchar(80) NOT NULL default '',
`siteID` varchar(80) NOT NULL default '',
`deleted` tinyint(4) NOT NULL default '0',
`addDate` timestamp NOT NULL default CURRENT_TIMESTAMP,
PRIMARY KEY (`productID`),
KEY `title` (`title`,`siteID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
-- -----------------------------------------------------
--
-- Table structure for table `cbproductinfo`
--
CREATE TABLE `cbproductinfo` (
`productinfoID` int(11) NOT NULL auto_increment,
`productID` int(11) NOT NULL default '0',
`addDate` timestamp NOT NULL default CURRENT_TIMESTAMP,
`popularity` int(11) NOT NULL default '0',
`gravity` float NOT NULL default '0',
`earnedPerSale` float NOT NULL default '0',
`percentPerSale` float NOT NULL default '0',
`referred` float NOT NULL default '0',
`commission` int(11) NOT NULL default '0',
PRIMARY KEY (`productinfoID`),
KEY `productID` (`productID`,`addDate`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1 ;
--
-- Constraints for dumped tables
--
--
-- Constraints for table `cbproductinfo`
--
ALTER TABLE `cbproductinfo`
ADD CONSTRAINT `cbproductinfo_ibfk_1` FOREIGN KEY (`productID`) REFERENCES `cbproduct` (`productID`) ON DELETE CASCADE;
The insert process must follow these rules:
a. Check if the product already exists using the Id + Category fields from the data file as a primary key. For our purposes we consider the same product in two different categories to be two separate products.
b. If the product already exists, get its productID (the [login to view URL] field) and insert a new related cbproductinfo record. The cbproductinfo table contains the popularity, gravity, earnedPerSale, percentPerSale, referred, and commission fields.
c. If the product does not exist, insert a new cbproduct row, get its database ID, and then insert a new related cbproductinfo row.
d. The [login to view URL] field is not represented in the ClickBank data feed. In order to derive this value,
1. Emulate the search form post at: [login to view URL]
The keyword value should be the Id field from the feed.
2. Grab the URL of the link in the result. This will involve parsing HTML of the page returned after the form post.
e. Last but not least, iterate all the products in the cbproduct database. For each product that exists
in the XML data feed, set [login to view URL] = false (0), and for each product that does not exist in the feed, set [login to view URL] = true (1).
I would like this project coded using an object oriented architecture utilizing PHP classes. I would like two data access classes, named after each table (i.e. "cbproduct" and "cbproductinfo"). The [login to view URL] page can and should utilize these two classes to do its work.
The classes should expose basic create, update, and delete functionality via the following interface:
* Save() - Saves the current object to the database. If the object already exists, perform an update. If not, an insert.
* Delete() - Deletes the current object from the database using the currently defined primary key.
* Delete(pk) - Deletes the row with primary key pk from the database (does not require fetching data first)
* Get(pk) - Returns a reference to an object populated with data from the row with a primary key value of pk
* GetCollection() - For the cbproduct class; returns all rows not marked as deleted
* GetCollection(fk) - For the cbproductinfo class; returns all rows where productID = fk
The mapping of the data feed's XML to database columns will be like this:
Table Column XML Element
--------------------------------------------------------
cbproduct productID auto-increment identity column
cbproduct title
cbproduct siteID
cbproduct description
cbproduct link n/a (scraped from site)
cbproduct category
cbproduct deleted Set to 1 if prod not in feed
cbproduct addDate CURRENT_TIMESTAMP
cbproductinfo productInfoID auto-increment identity col
cbproductinfo productID related cbproduct row
cbproductinfo addDate CURRENT_TIMESTAMP
cbproductinfo popularity
cbproductinfo gravity
cbproductinfo earnedPerSale
cbproductinfo percentPerSale
cbproductinfo referred
cbproductinfo commission
Miscellanea:
1. The DTD is included in the data feed's zip archive.
2. The DTD provides the possibility of subcategories, i.e. categories within categories. We will just append the category names, thus:
category/subcategory1/subcategory2
... would be a single category the name for which is "category/subcategory1/subcategory2".
If you've handled zip archives and XML parsing before, I would think this project should be very straightforward. I've tried to be very thorough as to the requirements; there shouldn't be any significant changes. If some change does come up, I would consider it a change order and would increase pay accordingly.
## Platform
Windows 2003 Server
PHP 5.2.1
MySql 4.2