Scrape HP website

Scrape the HP website for drivers. MUST USE: Perl, Web::Scraper

Looking for quality, clean, reusable, modern Perl here. Comments expected, so fluent English speakers only.

Start URL:

[url removed, login to view]

We need to follow these product links on the start URL:

Handheld Printing ›

Multifunction and All-in-One ›

Network Print Servers ›

Printers ›

Second URLs:

For each of the above products, we need to traverse the links under it until we reach a product page (which may be one or several levels deep). For example, follow these links:

Printers > HP LaserJet Printers > HP LaserJet P4500 Printer series > HP LaserJet P4515xm Printer (you should reach the following URL if you followed the instructions correctly: [url removed, login to view])

You will now have arrived at the product page for the HP LaserJet P4515xm Printer. $PRODUCT_NAME1 should be set to the text of the last link we traversed to get here ('HP LaserJet P4515xm Printer'). We will also need $PRODUCT_NAME2 to be set to 'HP LaserJet P4510 Printer series', which appears on the product page itself.
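A minimal sketch of the traversal described above, using only core Perl. It assumes that a page with no further category links is a product page, which the brief does not state, and it takes the link extraction as a hypothetical callback ($get_links) so the walking logic can be exercised without network access. In the real scraper that callback would wrap a Web::Scraper object. The names find_product_pages, product_name1 and the fake site below are illustrative only.

```perl
use strict;
use warnings;
use 5.010;

# Walk links depth-first until pages with no further links are reached.
# $get_links is a hypothetical callback: given a URL it returns a list of
# { text => ..., href => ... } hashes, or an empty list on a product page
# (that stopping rule is an assumption, not something the brief specifies).
sub find_product_pages {
    my ($url, $get_links, $link_text, $seen) = @_;
    $seen //= {};
    return () if $seen->{$url}++;    # never visit the same URL twice

    my @links = $get_links->($url);

    # No further links: treat this as a product page. PRODUCT_NAME1 is the
    # text of the last link followed to get here.
    return ( +{ product_name1 => $link_text, url => $url } ) unless @links;

    my @products;
    for my $link (@links) {
        push @products,
            find_product_pages($link->{href}, $get_links, $link->{text}, $seen);
    }
    return @products;
}

# Usage with a tiny in-memory stand-in for the site (hypothetical URLs).
my %fake_site = (
    '/start'    => [ { text => 'Printers', href => '/printers' } ],
    '/printers' => [ { text => 'HP LaserJet P4515xm Printer', href => '/p4515xm' } ],
);
my @found = find_product_pages('/start', sub { @{ $fake_site{ $_[0] } || [] } });
say "$_->{product_name1} => $_->{url}" for @found;
```

Passing the link extractor in as a callback keeps the recursion independent of the page markup, so selectors can change without touching the traversal.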

We will need the scraper to cover all languages and all operating systems. For the purposes of this explanation, make sure English (American) is selected, and select Microsoft Windows 7 (32-bit) to reach the third URL.


You should be here if you followed the instructions correctly: [url removed, login to view]

Here is where we get the rest of our variables. The first two are $TYPE and $TYPE_DESCRIPTION. For example, 'Driver - Universal Print Driver' gives $TYPE = 'Driver' and $TYPE_DESCRIPTION = 'Universal Print Driver'. (Note: sometimes the heading is a single word such as 'Firmware', in which case set both variables to that word.)
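The heading rule above could be sketched as follows. This assumes the separator is always a hyphen surrounded by whitespace, as in the 'Driver - Universal Print Driver' example. The function name parse_type_heading is hypothetical.

```perl
use strict;
use warnings;

# Split a download-section heading into (TYPE, TYPE_DESCRIPTION).
# Assumption: the two parts are separated by " - ". A heading without
# that separator (e.g. 'Firmware') is used for both values.
sub parse_type_heading {
    my ($heading) = @_;
    $heading =~ s/^\s+|\s+$//g;    # trim surrounding whitespace

    my ($type, $description) = split /\s+-\s+/, $heading, 2;
    $description = $type
        unless defined $description && length $description;

    return ($type, $description);
}

# Usage
my ($type, $type_description) = parse_type_heading('Driver - Universal Print Driver');
print "$type | $type_description\n";    # prints "Driver | Universal Print Driver"
```

Limiting the split to two fields keeps any later hyphen inside the description, and requiring whitespace around the hyphen leaves names such as 'All-in-One' intact.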

For each ($TYPE, $TYPE_DESCRIPTION) pair we need to get each download and its information. For the first download on our page we could create a row (CSV, tab-delimited, or MySQL) that would look like:


HP LaserJet P4515xm Printer,HP LaserJet P4510 Printer series,Driver,Universal Print Driver,1 - HP Universal Print Driver for Windows PCL6,[url removed, login to view],27 Jun 2012,[url removed, login to view],$DIRECT_URL_TO_DOWNLOAD

Notice the last value, DRIVER_URL, shown here as $DIRECT_URL_TO_DOWNLOAD. I'm leaving that for you to figure out, as the download button uses JavaScript to construct the URL.
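One way the CSV variant of that row could be produced with core Perl only is sketched below. The helper name csv_row and the placeholder values are assumptions. The quoting follows the usual CSV convention (wrap a field in double quotes when it contains a comma, quote, or line break, and double any embedded quotes). In production code the Text::CSV module from CPAN would be the more robust choice.

```perl
use strict;
use warnings;

# Join fields into one CSV line, quoting only where needed.
sub csv_row {
    my @fields = @_;
    return join ',', map {
        my $field = defined $_ ? $_ : '';    # undefined values become empty
        if ($field =~ /[",\r\n]/) {
            $field =~ s/"/""/g;              # double embedded quotes
            $field = qq{"$field"};
        }
        $field;
    } @fields;
}

# Usage: the first four columns come from the brief's example row; the
# URL and date columns are placeholders because the brief elides them.
print csv_row(
    'HP LaserJet P4515xm Printer',
    'HP LaserJet P4510 Printer series',
    'Driver',
    'Universal Print Driver',
    'DOWNLOAD_TITLE_PLACEHOLDER',
    'DIRECT_URL_PLACEHOLDER',
), "\n";
```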


Exceptions:

1. If, on the product's download page, a download item's description says '(Downloadable Driver Not Available)', skip that item. (ex. [url removed, login to view])

2. Also skip the item if its download link says 'obtain software'. (Same example as above.)
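The two skip rules can be captured in a single predicate, sketched here. The function name and the hash keys description and link_text are hypothetical. Matching 'obtain software' case-insensitively is also an assumption, since the brief does not say how the link text is capitalised on the page.

```perl
use strict;
use warnings;
use 5.010;

# Return true when a download item should be left out of the output.
sub should_skip_download {
    my (%item) = @_;
    my $description = $item{description} // '';
    my $link_text   = $item{link_text}   // '';

    # Rule 1: the description says no downloadable driver is available.
    return 1 if index($description, '(Downloadable Driver Not Available)') >= 0;

    # Rule 2: the link asks the user to obtain the software elsewhere.
    return 1 if $link_text =~ /obtain\s+software/i;

    return 0;
}

# Usage
my @items = (
    { description => 'HP Universal Print Driver', link_text => 'Download' },
    { description => 'Basic driver (Downloadable Driver Not Available)', link_text => '' },
);
my @kept = grep { !should_skip_download(%$_) } @items;
say scalar(@kept), ' item(s) kept';
```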


Requirements:

1. Written in Perl, using Web::Scraper

2. Be familiar with Perl best practices. Modular, documented, don't repeat yourself, etc.

3. Modern Perl please

Please read the project first and write the word "Understood". Also write your questions, steps and suggestions to complete the project with a short description of what you understood.

Skills: Mobile App Development, Perl, Script Install, Software Architecture, Web Scraping


About the Employer:
( 91 reviews ) Cairo, Egypt

Project ID: #2526566

3 freelancers are bidding on average $160 for this job


Hi, check PM please. Thanks.

$200 USD in 4 days
(38 Reviews)

Please check the PMB

$30 USD in 1 day
(0 Reviews)

please check pm

$250 USD in 5 days
(0 Reviews)