Social Media Aggregation / Crawler

  • Status Closed
  • Budget $750 - $1500 USD
  • Total Bids 16

Project Description

The attached screenshot is from Twitter.

The area marked in RED is the information I want pulled from Twitter and stored in a database, along with the Twitter user's address. I will be able to use a cloud database.

However, the challenge is that I want to grab this data from as many Twitter users in the USA as possible (possibly around 20 million). I understand that this might have to be done via a crawler or the Twitter API, and would take a few weeks of solid crawling or require a cloud service. Twitter allows this but limits results, so gathering a lot of data would be slow.

These resources might help

[url removed, login to view]

[url removed, login to view]

The rest of the API requires OAuth, but not search.

To use the search API you can just make a request against the following URL: [url removed, login to view][keywords]

For example to search for pizza: [url removed, login to view]

You get JSON data back that you can read in any program. If you use PHP, you can use cURL to make the request and json_decode() to convert the result into an object you can iterate through in a foreach() loop.
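The same request-and-decode flow can be sketched in Python. This is only an illustration under stated assumptions: the real search URL is elided above, so `SEARCH_BASE_URL` is a hypothetical placeholder, and the `q` query parameter, the `results` key, and the `from_user` / `text` fields are assumed names, not confirmed parts of the API.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical placeholder: the actual search URL is not shown in this post.
SEARCH_BASE_URL = "https://api.example.invalid/search.json"


def build_search_url(keywords, base_url=SEARCH_BASE_URL):
    """Append the URL-encoded keywords as a query string ('q' is an assumed name)."""
    return f"{base_url}?{urlencode({'q': keywords})}"


def parse_results(body):
    """Decode a JSON response body and return the list under the assumed 'results' key."""
    payload = json.loads(body)
    return payload.get("results", [])


def fetch_results(keywords):
    """Make the HTTP request and decode the response (not called in this sketch)."""
    with urllib.request.urlopen(build_search_url(keywords), timeout=10) as resp:
        return parse_results(resp.read().decode("utf-8"))


# Made-up sample body standing in for a real response, so the loop runs offline.
SAMPLE_BODY = (
    '{"results": ['
    '{"from_user": "alice", "text": "pizza night"}, '
    '{"from_user": "bob", "text": "best pizza in town"}]}'
)

for item in parse_results(SAMPLE_BODY):
    print(item["from_user"], "->", item["text"])
```

Keeping the parsing separate from the network call means the stored fields can be checked against saved responses before any large crawl starts.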

[url removed, login to view]

The issue is that Twitter imposes certain limits:

[url removed, login to view]

So the work would have to be distributed somehow and run over a long time frame (maybe 10+ computers for a month?).
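A rough back-of-envelope sketch of that estimate, plus one simple way to split the work across machines, might look like the following. The 20 million figure comes from this post. The rate limit of 150 requests per hour and the 100 profiles per request are purely illustrative assumptions, so check the limits page for the real values. The function names are hypothetical, and the estimate only holds if the limit applies per machine or per account.

```python
import math


def crawl_days(total_users, requests_per_hour, users_per_request=1, workers=1):
    """Estimate crawl duration in days for a given rate limit and worker count."""
    requests_needed = math.ceil(total_users / users_per_request)
    hours = requests_needed / (requests_per_hour * workers)
    return hours / 24


def shard_for(user_id, workers):
    """Assign a numeric user ID to one worker so no two workers fetch the same user."""
    return user_id % workers


TOTAL_USERS = 20_000_000   # from the project description
ASSUMED_LIMIT = 150        # requests per hour per worker (illustrative assumption)
ASSUMED_BATCH = 100        # profiles returned per request (illustrative assumption)

for workers in (1, 10):
    days = crawl_days(TOTAL_USERS, ASSUMED_LIMIT, ASSUMED_BATCH, workers)
    print(f"{workers} worker(s): about {days:.1f} days")
```

Sharding by user ID modulo the worker count needs no coordination between machines, though it assumes user IDs are numeric and known in advance.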

If you think this would interest you, please let me know! I am also interested in using the Facebook, Tumblr, Instagram, and maybe LinkedIn APIs to build up a large dataset of users who have specific jobs.

I understand this is a large data-warehousing project that would also need fast search to retrieve the data.
