Q&A extraction from FAQ pages

Status: CLOSED
Bids: 15
Avg Bid (USD): $556
Project Budget (USD): $250 - $750

Project Description:
We have a list of websites with FAQ pages.

We need all questions and answers to be extracted from those pages.

The list goes like this:

http://australia-nikefreerun.com/
http://serppagerank.com/
http://efyco.com/
http://anerkennung-in-deutschland.de/
http://escarpinslouboutincfr.com/
http://ellas-music.net/
http://cubedcherry.co.za/
http://8fig.com/
http://onlineprofitsdirect.com/
http://ekwity.com/
http://radical.net/
http://retro4airjordan.com/

Since the sites all have different structures, you need to come up with a one-size-fits-all solution, first to find their FAQ pages and then to parse them.
I need a spreadsheet file with three columns: website, q1, a1, followed by further columns for the other questions and answers.
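
As a rough illustration of the "find" step, the sketch below scores the links on a site's home page by FAQ-like keywords and returns the best candidates first. It is only one possible starting point, not a required design. The hint lists, the scoring weights, and the names LinkCollector, score_link and find_faq_candidates are assumptions made for this example, and example.com is a placeholder.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Keyword lists are assumptions; tune them against real sites.
STRONG_HINTS = ("faq", "frequently asked", "frequently-asked")
WEAK_HINTS = ("questions", "help", "support")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def score_link(href, text):
    """Returns a small heuristic score; 0 means 'probably not an FAQ link'."""
    haystack = f"{href} {text}".lower()
    score = 0
    if any(hint in haystack for hint in STRONG_HINTS):
        score += 2
    if any(hint in haystack for hint in WEAK_HINTS):
        score += 1
    return score


def find_faq_candidates(base_url, html):
    """Returns absolute URLs of likely FAQ pages, best candidates first."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    scored = []
    for href, text in parser.links:
        score = score_link(href, text)
        if score > 0:
            scored.append((score, urljoin(base_url, href)))
    # Stable sort: ties keep their order of appearance on the page.
    scored.sort(key=lambda item: -item[0])
    seen, result = set(), []
    for _, url in scored:
        if url not in seen:
            seen.add(url)
            result.append(url)
    return result
```

Fetching the pages is left out on purpose. A site with no matching link would need a fallback, for example treating the home page itself as the FAQ page.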

Please describe how you will manage to design a flexible parser to get the best result.
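
To make the expected output concrete, here is a minimal sketch of one structure-agnostic parsing heuristic together with the spreadsheet layout described above. It assumes that a question is any text block ending in "?" and that its answer is everything up to the next question. That rule is an assumption for illustration and will miss many real pages. The names BlockCollector, extract_qa, to_row and write_csv are likewise hypothetical.

```python
import csv
from html.parser import HTMLParser

# Tags treated as text-block boundaries (an assumption, not a complete list).
BLOCK_TAGS = {"p", "div", "li", "dt", "dd", "br", "tr", "td",
              "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}


class BlockCollector(HTMLParser):
    """Collects visible text, split into blocks at block-level tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = []
        self._skip = 0

    def _flush(self):
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip = max(0, self._skip - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def extract_qa(html):
    """Returns (question, answer) pairs using the 'ends with ?' heuristic."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    pairs, question, answer = [], None, []
    for block in parser.blocks:
        if block.endswith("?"):
            if question is not None:
                pairs.append((question, " ".join(answer)))
            question, answer = block, []
        elif question is not None:
            answer.append(block)
    if question is not None:
        pairs.append((question, " ".join(answer)))
    return pairs


def to_row(website, pairs):
    """Flattens pairs into one row: website, q1, a1, q2, a2, ..."""
    row = [website]
    for question, answer in pairs:
        row.extend([question, answer])
    return row


def write_csv(rows, out):
    """Writes a header plus rows, padding short rows with empty cells."""
    width = max((len(row) for row in rows), default=1)
    header = ["website"]
    for i in range(1, (width - 1) // 2 + 1):
        header += [f"q{i}", f"a{i}"]
    writer = csv.writer(out)
    writer.writerow(header)
    for row in rows:
        writer.writerow(row + [""] * (width - len(row)))
```

Calling write_csv with the rows from to_row and an open file object produces a CSV file that any spreadsheet program can open. A real solution would likely combine several heuristics, for example dt/dd pairs, heading followed by paragraph, and accordion markup, and keep whichever yields the most plausible pairs.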

There are approximately 100,000 websites on our list.

Skills required:
PHP, Software Architecture
About the employer:
Verified


Hire ithinksolutions
$600 in 10 days
$500 in 5 days
Hire robindersingh
$787 in 15 days
$350 in 10 days
$550 in 25 days
Hire d0tnet12
$300 in 10 days
Hire segpacto
$300 in 8 days
$825 in 5 days
$770 in 25 days
$770 in 25 days