Scrape Address Data and Info from Fanpages

IN PROGRESS
Bids: 21
Avg Bid (EUR): 429
Project Budget (EUR): €250 - €750

Project Description:
Dear Freelancer,

The task is to create a scraping process that includes these steps:

1) Scrape gelbeseiten - dot - de for firm data and extract as many firm addresses as possible.
2) Search for each firm's fanpage using Google with the query (site:facebook.com $firmname).

So far the data quality is lacking because not all entries match: Google returns fanpages for firms that do not have one, or returns other pages that rank higher.

Therefore:
3) Pull the firm name from the Info tab or the page title of the fanpage and match it against the firm name found on gelbeseiten - dot - de. Calculate a ratio of how many letters are similar, so that there is a linguistic parameter to estimate correctness. Express it as a number between 0 and 100 so it can be used to rank the data.
4) Pull the fan count and the "talk about" number from the Facebook fanpages, as well as whether each one is a page, wiki entry, or place.
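To make step 3 concrete, here is a minimal sketch of one possible 0 to 100 similarity score. It assumes Python and the standard-library difflib.SequenceMatcher, which is only one of several ways to measure how many letters match; the function names normalize and name_similarity are hypothetical and not part of any existing script.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name and replace punctuation with spaces,
    so trivial differences do not lower the score."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def name_similarity(directory_name: str, fanpage_title: str) -> int:
    """Return an integer from 0 (nothing in common) to 100 (identical
    after normalization). Assumption: SequenceMatcher's ratio is an
    acceptable stand-in for the 'how many letters are similar' measure."""
    a, b = normalize(directory_name), normalize(fanpage_title)
    if not a or not b:
        return 0
    return round(SequenceMatcher(None, a, b).ratio() * 100)


if __name__ == "__main__":
    print(name_similarity("Bäckerei Müller GmbH", "Bäckerei Müller GmbH"))
    print(name_similarity("Bäckerei Müller", "Autohaus Schmidt"))
```

Other measures (for example an edit distance, or token-based matching that ignores legal suffixes such as "GmbH") might rank the data better; the test run would show which one works.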

I would like to do a test run with 10,000 entries to optimize the data quality and find the best way to identify the real fanpage of any firm. On that basis we could start scraping data continuously.
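One way the test run could pick the "real" fanpage is to score every candidate that Google returned for a firm and keep the best one only if it clears a threshold. The sketch below assumes Python, candidates given as (url, page_title) pairs, and a cut-off of 60; the names score and best_fanpage and the threshold value are hypothetical and would need tuning on the 10,000 test entries.

```python
from difflib import SequenceMatcher


def score(a: str, b: str) -> int:
    """Case-insensitive similarity between two names, from 0 to 100."""
    return round(SequenceMatcher(None, a.lower(), b.lower()).ratio() * 100)


def best_fanpage(firm_name, candidates, min_score=60):
    """Return (url, score) for the best-matching candidate, or None.

    candidates: list of (url, page_title) tuples (assumed input shape).
    min_score:  assumed cut-off below which a match is rejected.
    """
    scored = [(score(firm_name, title), url) for url, title in candidates]
    if not scored:
        return None
    top_score, top_url = max(scored, key=lambda item: item[0])
    return (top_url, top_score) if top_score >= min_score else None


if __name__ == "__main__":
    found = [
        ("https://facebook.com/a", "Autohaus Schmidt"),
        ("https://facebook.com/b", "Bäckerei Müller"),
    ]
    print(best_fanpage("Bäckerei Müller", found))
```

Returning None for weak matches addresses the case where a firm has no fanpage at all and Google only returns unrelated pages.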

I will pay a fixed amount for coding the script, and then we will estimate a price for each 10,000 entries of raw data that are run through the script you have made.

Kind Regards,
Bjoern

Skills required:
Data Mining, Excel, Software Architecture, Web Scraping, Web Search
About the employer:
Verified


Hire greggfletcher
€500 in 10 days
€450 in 15 days
€250 in 10 days
€400 in 3 days
Hire phpXpertbd
€450 in 10 days
Hire creatorul
€750 in 8 days
€475 in 14 days
Hire TetySoft
€400 in 4 days
Hire softwarevamp
€699 in 7 days
€375 in 5 days