Closed

Q&A extraction from FAQ pages

This project received 15 bids from talented freelancers with an average bid price of $556 USD.

Project Budget: N/A
Total Bids: 15
Project Description

We have a list of websites with FAQ pages.

We need all questions and answers to be extracted from those pages.

The list goes like this:

[12 URLs removed, login to view]

Since the sites all have different structures, you need to come up with a one-size-fits-all solution, first to find their FAQ pages and then to parse them.
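
As a rough illustration of the "find" step, the sketch below scans a site's home page for links whose URL or link text mentions an FAQ. The keyword list, the class name LinkCollector and the function name find_faq_links are assumptions made for this example, not part of the project brief, and real sites will need more hints than these two.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed keywords; extend per language and site conventions.
FAQ_HINTS = ("faq", "frequently asked")


class LinkCollector(HTMLParser):
    """Collects (href, link text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def find_faq_links(base_url, html):
    """Return absolute URLs of links that look like FAQ pages, in page order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    found = []
    for href, text in collector.links:
        haystack = f"{href} {text}".lower()
        if any(hint in haystack for hint in FAQ_HINTS):
            url = urljoin(base_url, href)
            if url not in found:  # skip repeated links (header and footer)
                found.append(url)
    return found
```

Matching on both the URL and the visible text is a deliberate choice here: some sites label the link "Help" but keep "faq" in the path, while others do the opposite.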
So I need a spreadsheet file with three columns: website, q1, a1, followed by the other questions and their answers.
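
One possible reading of that layout is a "wide" CSV with one row per website and a q/a column pair per question. The sketch below assumes that reading; the function name write_faq_rows and the q2, a2, ... column names for the further questions are illustrative guesses, not a confirmed specification.

```python
import csv
import io


def write_faq_rows(results, out):
    """Write one row per website: website, q1, a1, q2, a2, ...

    results maps a website name to a list of (question, answer) pairs;
    out is any writable text file object.
    """
    max_pairs = max((len(pairs) for pairs in results.values()), default=0)
    header = ["website"]
    for i in range(1, max_pairs + 1):
        header += [f"q{i}", f"a{i}"]

    writer = csv.writer(out)
    writer.writerow(header)
    for site, pairs in results.items():
        row = [site]
        for question, answer in pairs:
            row += [question, answer]
        # Pad sites with fewer questions so every row has the same width.
        row += [""] * (len(header) - len(row))
        writer.writerow(row)


if __name__ == "__main__":
    demo = {"example.com": [("Is it free?", "Yes.")]}
    buffer = io.StringIO()
    write_faq_rows(demo, buffer)
    print(buffer.getvalue())
```

The csv module quotes commas and line breaks inside answers, so the file opens cleanly in common spreadsheet applications.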

Please describe how you would design a flexible parser that gets the best results.
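
To make the question concrete, here is a minimal sketch of one structure-agnostic heuristic: split the page into text blocks at block-level tags, treat any block that ends with a question mark as a question, and treat the blocks that follow it as its answer. The tag sets and the names TextBlocks and extract_qa are assumptions for this example. The heuristic is only a starting point: it will miss questions that lack a question mark and may absorb footer text into the last answer.

```python
from html.parser import HTMLParser

# Assumed set of tags that start or end a text block.
BLOCK_TAGS = {
    "p", "div", "li", "dt", "dd", "br", "tr", "td", "section", "article",
    "summary", "details", "h1", "h2", "h3", "h4", "h5", "h6",
}
SKIP_TAGS = {"script", "style"}


class TextBlocks(HTMLParser):
    """Flattens an HTML page into a list of whitespace-normalised text blocks."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buf = []
        self._skip = 0

    def _flush(self):
        text = " ".join("".join(self._buf).split())
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip:
            self._skip -= 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def extract_qa(html):
    """Return (question, answer) pairs found by the question-mark heuristic."""
    parser = TextBlocks()
    parser.feed(html)
    parser.close()

    pairs = []
    question, answer = None, []
    for block in parser.blocks:
        if block.endswith("?"):
            if question is not None:
                pairs.append((question, " ".join(answer)))
            question, answer = block, []
        elif question is not None:
            answer.append(block)  # text before the first question is ignored
    if question is not None:
        pairs.append((question, " ".join(answer)))
    return pairs
```

Because it ignores class names and nesting, the same function can be run unchanged over pages built with headings, definition lists or accordions, and site-specific rules can then be layered on for the pages where it fails.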

There are approximately [url removed, login to view] websites on our list.
