SEO HTML Web Scraper/Parser and Data Extraction

Status: In progress
Bids: 21
Avg Bid (USD): $2300
Project Budget (USD): $1000 - $3000

Project Description:
This project is for a web scraper tool. The entire project can be written in Python or PHP; I'm interested in hearing which language you would recommend.

The idea is to start with a web page whose form requests an email address and a URL. Once the form is submitted, on-page SEO values such as title, H1, meta description and more would be checked for existence and character count, all via AJAX/jQuery (no page refresh).
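As a rough illustration of the existence and character-count check, here is a minimal Python sketch of the JSON body a form handler might return to the jQuery callback. The function names and the response shape (`status`, `results`, `exists`, `chars`) are assumptions made for this example, not part of the requirements, and the web framework and form handling are left out.

```python
import json


def summarize_fields(fields):
    """Report, for each on-page value, whether it exists and how long it is."""
    report = {}
    for name, value in fields.items():
        text = (value or "").strip()  # treat None and whitespace-only as missing
        report[name] = {"exists": bool(text), "chars": len(text)}
    return report


def to_json_response(fields):
    # Hypothetical response shape; the AJAX callback would render "results".
    return json.dumps({"status": "ok", "results": summarize_fields(fields)})
```

The handler would call `to_json_response` with whatever values the scraper extracted, for example `{"title": "...", "h1": None, "meta_description": "..."}`.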

Complete list of data to be analyzed and parsed:
- meta title, description, keywords, robots (count characters in title and description)
- title and description should be compared to the URL for keywords
- does a canonical URL exist? What is it?
- is the URL SEO friendly? No variables such as [url removed, login to view]
- do Facebook Open Graph tags exist?
- do Twitter Card tags exist?
- h1, h2, h3: number of times used and contents
- alt text on images, with its contents and the image filename
- existence of Facebook, Twitter, Pinterest, LinkedIn, YouTube, Google+ links
- presence of any inline code (JavaScript and CSS)
- existence of [url removed, login to view] file at root directory
- does Flash exist?
- count W3C validation errors
- count backlinks, get MozRank using SEOMoz API
- grab top keywords using SEMRush
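To make the scope of the list more concrete, here is a minimal sketch in Python covering a subset of the checks (title, meta tags, canonical URL, headings, image alt text, inline code, Open Graph and Twitter tags, URL friendliness). It uses only the standard library's `html.parser`. The class name, function names and result keys are illustrative assumptions, and the rule "SEO friendly means no query string" is one possible reading of the requirement. The W3C, SEOMoz and SEMRush items need external services and are not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class OnPageParser(HTMLParser):
    """Collects a subset of the on-page values listed above."""

    HEADINGS = ("h1", "h2", "h3")

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}            # description, keywords, robots, og:*, twitter:*
        self.canonical = None
        self.headings = {h: [] for h in self.HEADINGS}
        self.images = []          # (src, alt) pairs
        self.has_inline_code = False
        self._current = None      # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title" or tag in self.HEADINGS:
            self._current = tag
            if tag in self.HEADINGS:
                self.headings[tag].append("")
        elif tag == "meta":
            key = a.get("name") or a.get("property")
            if key:
                self.meta[key.lower()] = a.get("content") or ""
        elif tag == "link" and "canonical" in (a.get("rel") or "").lower().split():
            self.canonical = a.get("href")
        elif tag == "img":
            self.images.append((a.get("src"), a.get("alt")))
        # Inline CSS or JavaScript: <style>, style="...", or <script> without src.
        if tag == "style" or "style" in a or (tag == "script" and "src" not in a):
            self.has_inline_code = True

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current in self.HEADINGS:
            self.headings[self._current][-1] += data

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None


def is_seo_friendly_url(url):
    """Assumed rule: a URL with no query string (no ?id=1 style variables)."""
    return urlparse(url).query == ""


def analyze(html, url):
    parser = OnPageParser()
    parser.feed(html)
    parser.close()
    title = parser.title.strip()
    return {
        "title": title,
        "title_chars": len(title),
        "description_chars": len(parser.meta.get("description", "")),
        "canonical": parser.canonical,
        "heading_counts": {h: len(v) for h, v in parser.headings.items()},
        "has_open_graph": any(k.startswith("og:") for k in parser.meta),
        "has_twitter_card": any(k.startswith("twitter:") for k in parser.meta),
        "images_missing_alt": [src for src, alt in parser.images if not alt],
        "has_inline_code": parser.has_inline_code,
        "seo_friendly_url": is_seo_friendly_url(url),
    }
```

A production version would probably use a more forgiving parser for malformed markup; the standard-library parser is used here only to keep the sketch self-contained.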

Code should handle large volume as it will be used in a CRM in the future.
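One way to approach the volume requirement, sketched here in Python under the assumption that page downloads are the bottleneck, is to fetch pages in parallel with a thread pool. The names `fetch` and `fetch_many` and the default of 8 workers are illustrative choices, not requirements; retries, rate limiting and queueing are left out.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download a page body as text (no retries or rate limiting here)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def fetch_many(urls, fetcher=fetch, max_workers=8):
    """Fetch URLs in parallel; returns {url: body, or the exception raised}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetcher, u): u for u in urls}
        for fut in as_completed(futures):
            url = futures[fut]
            try:
                results[url] = fut.result()
            except Exception as exc:  # keep going if one page fails
                results[url] = exc
    return results
```

Passing the fetcher as a parameter keeps the function testable without network access and lets a different downloader be swapped in later.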

Extracted data should be storable in a MySQL database and exportable to a PDF, which is displayed and sent to the email address entered in the starting web form.
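The storage step could look roughly like the following Python sketch. To stay self-contained it uses the standard library's `sqlite3` as a stand-in for MySQL; a MySQL driver would follow the same DB-API pattern but typically uses `%s` placeholders instead of `?`. The table name, column names and function names are hypothetical, and PDF export and emailing are not shown.

```python
import json
import sqlite3


def save_report(conn, email, url, report):
    """Store one analysis result as JSON and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seo_reports ("
        "id INTEGER PRIMARY KEY, email TEXT, url TEXT, report_json TEXT)"
    )
    # Parameterized query: values are never interpolated into the SQL string.
    cur = conn.execute(
        "INSERT INTO seo_reports (email, url, report_json) VALUES (?, ?, ?)",
        (email, url, json.dumps(report)),
    )
    conn.commit()
    return cur.lastrowid


def load_report(conn, report_id):
    """Return the stored report as a dict, or None if the id is unknown."""
    row = conn.execute(
        "SELECT report_json FROM seo_reports WHERE id = ?", (report_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

For a quick trial, `sqlite3.connect(":memory:")` gives a throwaway connection to pass as `conn`.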

See these links for PHP examples:
[url removed, login to view]
http://simplehtmldom.sourceforge.net/

This is a good example of a live site:
[url removed, login to view]

Thank you for reading this project. I'm looking for a coder for a long-term relationship on future projects. An NDA will be required to start work, and I prefer US/UK developers.

Please let me know if you have any questions on the requirements.

Skills required:
PHP, Python, Software Architecture
About the employer: Verified


Bids:
$4639 in 60 days
$2222 in 25 days
$2577 in 20 days
$3092 in 20 days
$2886 in 35 days
$3092 in 35 days
$2268 in 30 days
$2000 in 45 days
$2472 in 10 days
$2631 in 35 days