Scrape the HP website for drivers. MUST USE: Perl, Web::Scraper
Looking for quality, clean, reusable, modern Perl here. Comments expected, so fluent English speakers only.
[url removed, login to view]
Follow these product links on the start URL:
Handheld Printing ›
Multifunction and All-in-One ›
Network Print Servers ›
For each of the above products, traverse the links under it until you reach the product page (which may be one or several levels deep). For example, follow these links:
Printers > HP LaserJet Printers > HP LaserJet P4500 Printer series > HP LaserJet P4515xm Printer (you should reach the following url if you followed the instructions correctly: [url removed, login to view])
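The traversal described above could be sketched roughly as follows. This is only a minimal sketch: `find_products`, the `$get_page` callback, its return shape, and the short paths in the demo are all hypothetical names made up for illustration. In the real scraper the callback would presumably wrap a Web::Scraper object; it is injected here so the recursion can be tried without network access.

```perl
use strict;
use warnings;

# Walk category links depth-first until product pages are reached.
# $get_page is a hypothetical callback: given a URL, it returns a hash ref
#   { is_product => BOOL, links => [ { text => ..., url => ... }, ... ] }
# The name of each product is the text of the last link followed to reach it.
sub find_products {
    my ($url, $get_page, $link_text, $seen) = @_;
    $seen ||= {};
    return if $seen->{$url}++;    # guard against link cycles

    my $page = $get_page->($url);
    return +{ name => $link_text, url => $url } if $page->{is_product};

    # Not a product page yet: descend into every link found on it.
    return map { find_products($_->{url}, $get_page, $_->{text}, $seen) }
        @{ $page->{links} || [] };
}

# Demo on an in-memory stand-in for the site (paths are placeholders).
my %site = (
    '/printers' => { links => [ { text => 'HP LaserJet Printers',             url => '/laserjet' } ] },
    '/laserjet' => { links => [ { text => 'HP LaserJet P4500 Printer series', url => '/p4500'    } ] },
    '/p4500'    => { links => [ { text => 'HP LaserJet P4515xm Printer',      url => '/p4515xm'  } ] },
    '/p4515xm'  => { is_product => 1 },
);

my @products = find_products('/printers', sub { $site{ $_[0] } }, 'Printers');
print "$_->{name} => $_->{url}\n" for @products;
```

Because the depth is not fixed, recursion with a visited-URL guard seems a natural fit; an explicit queue would work just as well if very deep trees are a concern.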
You will now be on the product page for the HP LaserJet P4515xm Printer. Set $PRODUCT_NAME1 to the text of the last link traversed to get here ('HP LaserJet P4515xm Printer'). Also set $PRODUCT_NAME2 to 'HP LaserJet P4510 Printer series', which appears on the product page itself.
The scraper must cover all languages and all operating systems. For the purposes of this explanation, make sure English (American) is selected, and select Microsoft Windows 7 (32-bit) to reach the third URL.
You should be here if you followed the instructions correctly: [url removed, login to view]
This page is where we get the rest of our variables. The first two are $TYPE and $TYPE_DESCRIPTION. For example, 'Driver - Universal Print Driver' gives $TYPE = Driver and $TYPE_DESCRIPTION = Universal Print Driver. (Note: sometimes the heading is a single word such as 'Firmware', in which case set both variables to 'Firmware' or whatever the single type is.)
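One possible way to derive the two variables from a heading is shown below. It is a sketch only: `parse_type_heading` is a hypothetical helper name, and it assumes the separator is always a dash surrounded by whitespace, as in the example heading.

```perl
use strict;
use warnings;

# Split a heading such as 'Driver - Universal Print Driver' into
# (type, description). A heading with no ' - ' separator, such as
# 'Firmware', yields the same value for both fields.
sub parse_type_heading {
    my ($heading) = @_;

    # Trim surrounding whitespace before splitting.
    $heading =~ s/\A\s+|\s+\z//g;

    # Split on the first separator only, so descriptions that contain
    # further dashes stay intact.
    my ($type, $description) = split /\s+-\s+/, $heading, 2;

    $description = $type if !defined $description || $description eq q{};

    return ($type, $description);
}

my ($type, $type_description) = parse_type_heading('Driver - Universal Print Driver');
print "$type | $type_description\n";
```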
For each ($TYPE, $TYPE_DESCRIPTION) set, we need each download and its information. For the first download on our page, we would create a row (CSV, tab-delimited, or MySQL) that looks like:
HP LaserJet P4515xm Printer,HP LaserJet P4510 Printer series,Driver,Universal Print Driver,1 - HP Universal Print Driver for Windows PCL6,[url removed, login to view],27 Jun 2012,[url removed, login to view],$DIRECT_URL_TO_DOWNLOAD
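If CSV is chosen, fields that contain commas or quotes need escaping. The sketch below hand-rolls the quoting with core Perl only so that it is self-contained; `csv_field` and `csv_row` are hypothetical helper names, and in a real project the Text::CSV module from CPAN would probably be the more usual choice. Only fields whose values are given in this description are used in the demo.

```perl
use strict;
use warnings;

# Quote one field in the usual CSV manner: wrap it in double quotes when
# it contains a comma, a double quote, or a line break, and double any
# embedded quotes. Undefined values become empty fields.
sub csv_field {
    my ($value) = @_;
    $value = q{} if !defined $value;
    if ($value =~ /[",\r\n]/) {
        $value =~ s/"/""/g;
        return qq{"$value"};
    }
    return $value;
}

# Join a list of fields into one CSV line (without a trailing newline).
sub csv_row {
    my (@fields) = @_;
    return join q{,}, map { csv_field($_) } @fields;
}

print csv_row(
    'HP LaserJet P4515xm Printer',
    'HP LaserJet P4510 Printer series',
    'Driver',
    'Universal Print Driver',
    '27 Jun 2012',
), "\n";
```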
1. If, on the product's download page, a download item's description says '(Downloadable Driver Not Available)', skip that item. (ex. [url removed, login to view])
2. Also skip the item if the download link says 'obtain software' (ex. same as above example)
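The two skip rules above might be captured in a single predicate, roughly as follows. This is a sketch under assumptions: `should_skip` and the `description` and `link_text` hash keys are hypothetical names for whatever the scraper extracts, and the case-insensitive matching is a guess at how tolerant the check should be.

```perl
use strict;
use warnings;
use 5.010;    # for the defined-or operator

# Return true when a download item should be left out of the output.
# $item is a hypothetical hash ref with 'description' and 'link_text' keys.
sub should_skip {
    my ($item) = @_;
    my $description = $item->{description} // q{};
    my $link_text   = $item->{link_text}   // q{};

    # Rule 1: the description flags that no driver can be downloaded.
    return 1 if $description =~ /\(\s*Downloadable Driver Not Available\s*\)/i;

    # Rule 2: the link only points at an 'obtain software' page.
    return 1 if $link_text =~ /\bobtain software\b/i;

    return 0;
}

my @items = (
    { description => 'HP Universal Print Driver for Windows PCL6', link_text => 'Download' },
    { description => 'Some driver (Downloadable Driver Not Available)', link_text => 'Download' },
    { description => 'Some software', link_text => 'Obtain software' },
);

my @kept = grep { !should_skip($_) } @items;
say scalar(@kept), ' item(s) kept';
```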
1. Written in Perl, using Web::Scraper
2. Be familiar with Perl best practices. Modular, documented, don't repeat yourself, etc.
3. Modern Perl please
Please read the project first and write the word "Understood". Also write your questions, steps and suggestions to complete the project with a short description of what you understood.