Data Collection from a newspaper and Congress Talks
$30-250 USD
In Progress
Posted over 12 years ago
Paid on delivery
Hello all:
I am a researcher at a university and I need someone who is experienced in crawling and data collection to help me with the following:
1. Crawl a newspaper website (I will tell you which site) and scrape (1) the text of the articles that appeared on the site in the past 1-2 years and (2) the comments that users wrote on each article.
2. Collect the information that is available on the news website for each user.
3. From the Congress database (open to the public), collect the congressional speech texts from the past 1-2 years.
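To make step 1 concrete, here is a rough sketch of how one downloaded article page could be turned into article text plus a list of comments. It is only an illustration under stated assumptions: the site has not been named, so the sample markup and the class names `article-body`, `comment`, and `comment-author` are hypothetical placeholders, and the names `ArticleParser` and `parse_article` are made up for this example. It uses only Python's standard library and does not download anything.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real site's tags and class names are unknown,
# so "article-body", "comment" and "comment-author" are placeholders.
SAMPLE_PAGE = """
<html><body>
<div class="article-body"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="comment"><span class="comment-author">alice</span><p>Nice piece.</p></div>
<div class="comment"><span class="comment-author">bob</span><p>I disagree.</p></div>
</body></html>
"""

# Elements that never have a closing tag; they must not be pushed on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ArticleParser(HTMLParser):
    """Collects article paragraphs and user comments from one page."""

    def __init__(self):
        super().__init__()
        self.article_parts = []  # text chunks found inside the article body
        self.comments = []       # one {"author": ..., "text": ...} per comment
        self._stack = []         # class attribute of every currently open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        css_class = dict(attrs).get("class") or ""
        self._stack.append(css_class)
        if css_class == "comment":
            self.comments.append({"author": "", "text": ""})

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if "comment" in self._stack and self.comments:
            # Text inside the author element fills "author", the rest fills "text".
            key = "author" if "comment-author" in self._stack else "text"
            comment = self.comments[-1]
            comment[key] = (comment[key] + " " + text).strip()
        elif "article-body" in self._stack:
            self.article_parts.append(text)


def parse_article(html):
    """Return the article text and its comments for one page of HTML."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {"text": "\n".join(parser.article_parts), "comments": parser.comments}


if __name__ == "__main__":
    print(parse_article(SAMPLE_PAGE))
```

The same per-page function could be applied to every article URL collected during the crawl. The author names gathered here would be the starting point for step 2, assuming the site exposes a profile page for each user.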
If we can manage the above, I have follow-up projects that we could potentially work on together, provided we mutually agree. I am looking for someone I can work with long term if I am happy with the work.
I have previously worked on a text-processing system that does what you describe: extracting texts from newspapers and speech libraries for later processing, clustering, mining, and analysis.
I've been working with C/C++/Python for about 6 years. I'm looking to work long term and deliver good-quality work.
This sounds like a very interesting project, and scraper programs have always been my favorite. I've written several for myself that already do what you've described (digging not only into content, but also into comments about the content). I'm sure I can do what's needed, and I can provide a desktop application for your future use if needed (we can discuss this later).
Hi,
After reading your project description, I feel that I can provide what you are looking for. I am an experienced programmer and have been programming for the past 4 years. I would like to work on this project for you.
Thanks