I am looking for someone with experience in Beautiful Soup and using it to scrape data.
I need a large directory site scraped. The URL will be given to those who PM me and ask. Leave a dummy bid and contact me.
There are approximately 4 million records.
There is some structure to the data.
The end result is clean data in a CSV file. I need to be able to run this manually myself to get updated records.
There is lots of ongoing work if you know what you're doing and do it for a good price.
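As a rough illustration only, turning one page into a clean CSV row with Beautiful Soup might look like the sketch below. The field names and CSS selectors are made up for the example, since the real ones depend on the site given via PM.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical field names and CSS selectors; the real ones depend on the site.
FIELDS = {"name": "h1.name", "phone": "span.phone", "address": "div.address"}


def parse_record(html):
    """Pull one record out of a page, collapsing stray whitespace."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field, selector in FIELDS.items():
        node = soup.select_one(selector)
        # Missing elements become empty strings so every row has the same columns.
        record[field] = " ".join(node.get_text().split()) if node else ""
    return record


def write_csv(records, out):
    """Write records to an open text file as CSV with a header row."""
    writer = csv.DictWriter(out, fieldnames=list(FIELDS))
    writer.writeheader()
    writer.writerows(records)


# Made-up sample page, written to an in-memory buffer instead of a real file.
sample = "<h1 class='name'> Acme   Pty Ltd </h1><span class='phone'>555 0100</span>"
buffer = io.StringIO()
write_csv([parse_record(sample)], buffer)
print(buffer.getvalue())
```

Filling missing fields with empty strings keeps every row the same width, which is most of what "clean" means for a CSV.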
We will discuss this as part of your bid.
If you have another way to scrape sites that I don't know about, let me know.
OK, so I have had enough of my time being wasted.
You are all now required to give me 2000 scraped items from the website I have given you via PM. This is to show me you can even join the conversation. I am too annoyed at the people trying to fool me.
The quote will cover the following.
1. Full scrape of over 4 MILLION URLs. Seriously, you start at ID #1 and scrape to ID #4000000. Not rocket science.
2. Clean and organised data.
3. Something I can run myself. It needs to be open source and run on my server, and I need to be able to run it whenever I want updated data. This may be daily, weekly or monthly. My choice.
4. Consider the banning of IP addresses and work out how you can incorporate proxies.
5. Don't ever, ever, ever, ever, ever consider trying to give me a price based on 1000 records. It's a surefire way to get the flick. I already scraped this data 6 months ago. I know what's involved.
6. Don't tell me you can do it, and don't tell me how badly you need the job. I don't care if your neighbour's friend's brother's dog died and you need money to bury it. I'm in business. You either can or you can't, and the only way to show me you can is to do it. Give me 2000 records, show me how you did it, and show me how you plan on handling my project. Work with me and we can buy 100 dogs to replace that little one.
7. Don't try and scam me with some fake bit of data. I am not stupid.
8. I'm looking at scraping about 20-30 million web pages on various sites, so this is a long-term gig. Spend some time working through the info above and figure out if you want to deal with someone who has a clue, or if you want to move on to the next scam.
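To make points 1, 3 and 4 concrete, here is a minimal sketch of walking an ID range, rotating proxies when a request fails, and appending to a CSV so a later run picks up where the last one stopped. It is one possible shape, not a finished tool. The URL pattern, the proxy addresses, and the function names are all placeholders, and the `parse` function is whatever turns a page into a dict of fields.

```python
import csv
import itertools
import os
import urllib.request

# Hypothetical values: the real URL pattern and proxy list are not given here.
URL_TEMPLATE = "https://example.com/listing?id={id}"
PROXIES = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]


def fetch(url, proxy, timeout=30):
    """Download one page through the given HTTP proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def last_done_id(path):
    """Return the highest ID already in the CSV, or 0 if there is none."""
    if not os.path.exists(path):
        return 0
    with open(path, newline="", encoding="utf-8") as f:
        return max((int(row["id"]) for row in csv.DictReader(f)), default=0)


def scrape(path, first_id, last_id, parse, fields, fetch=fetch, max_tries=3):
    """Scrape IDs first_id..last_id into a CSV, resuming after the last saved ID.

    Returns the IDs that still failed after max_tries attempts.
    """
    start = max(first_id, last_done_id(path) + 1)
    proxies = itertools.cycle(PROXIES)
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    failed = []
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id"] + list(fields))
        if is_new:
            writer.writeheader()
        for record_id in range(start, last_id + 1):
            url = URL_TEMPLATE.format(id=record_id)
            for _ in range(max_tries):
                try:
                    html = fetch(url, next(proxies))
                except OSError:
                    continue  # Blocked or timed out: retry through the next proxy.
                writer.writerow({"id": record_id, **parse(html)})
                break
            else:
                failed.append(record_id)  # Every attempt failed; report it.
    return failed
```

A full run would be something like `scrape("records.csv", 1, 4000000, parse, fields)`, repeated on whatever schedule is wanted. Note that resuming by highest saved ID skips over IDs that failed earlier, so the returned list needs to be retried separately.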