The task is to create a scraping process consisting of these steps:
1) Scrape gelbeseiten - dot - de for firm data, extracting as many firm addresses as possible.
2) Search for a fanpage for each firm name using Google with the query (site: [url removed, login to view] $firmname).
The data collected so far is of poor quality because not all entries match: Google finds fanpages for firms that do not have one, or returns other pages that are better ranked.
3) Pull the firm name from the Info tab or page title of the fanpage and match it against the firm name found on gelbeseiten - dot - de. Calculate a ratio of how many letters are similar, so there is a linguistic parameter to estimate correctness. Express it as a number between 0 and 100 so it can be used to rank the data.
4) Pull the fan count and the "talk about" number from the Facebook fanpages, as well as whether each result is a page, wiki entry or place.
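To make step 3 concrete, here is a minimal sketch of one way the 0 to 100 score could be computed. It is only an illustration: it assumes Python and uses the standard library's difflib.SequenceMatcher as the letter-similarity measure, which is one possible choice and not a requirement of the project. The function names (normalize, name_similarity, rank_candidates) are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase the name and replace punctuation with spaces,
    so that trivial spelling differences do not lower the score."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def name_similarity(directory_name: str, fanpage_name: str) -> int:
    """Return a score from 0 to 100 for how similar two firm names are.

    Assumption: SequenceMatcher.ratio() stands in for the
    "ratio of similar letters" described in step 3.
    """
    a = normalize(directory_name)
    b = normalize(fanpage_name)
    if not a or not b:
        return 0  # nothing to compare against
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def rank_candidates(directory_name: str, fanpage_names: list[str]) -> list[tuple[int, str]]:
    """Score every candidate fanpage name and sort best match first."""
    scored = [(name_similarity(directory_name, n), n) for n in fanpage_names]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Made-up example names, not real scraped data.
    for score, name in rank_candidates(
        "Bäckerei Schmidt", ["Autohaus Meier", "Bäckerei Schmidt GmbH"]
    ):
        print(score, name)
```

A cut-off score below which a fanpage is treated as a false match would still have to be chosen from the test run; the sketch only produces the ranking.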
I would like to do a test run with 10000 entries to optimize the data quality and find the best way to identify the real fanpage of any firm. On that basis we could start scraping data continuously.
I will pay one amount for coding the script, and then we will estimate a price for each [url removed, login to view] entries of the raw data that run through the script you have made.
21 freelancers are bidding on average €429 for this job
Sir, I can do your project. I have over 8 years of experience working as a Virtual Assistant, details of which will be disclosed on request. Waiting to hear from you. Thanks
I'm a very experienced software developer and an expert in data mining, web scraping and web crawling. I've already made a webscraper for gelbeseiten.de. Please check your PMB.
Dear bzmedia, Greetings from Dream Media Solutions. Thank you very much for giving us an opportunity to bid for your project. We are very much interested in this project. Waiting for your valuable reply. Thanks & Regards