Project Description:
I am looking to hire an experienced programmer (with a preference for Python) for a very urgent project.
Here are the details: I have a CSV file that contains tweets published by people who ran for office during the 2012 U.S. congressional election. Each row is a unique tweet, and these are my column headers:
Candidate Name | Tweet Date | Tweet Text | Mentions? | Replies?
The "Mentions?" and "Replies?" columns each contain either a "1" or a "0". A "1" means "yes" and a "0" means "no."
Note that I have all of the tweets published by each candidate, so the same Candidate Name appears in multiple rows. There are roughly 700 unique candidate names, so the total row count is pretty large.
I need a script that will analyze the CSV file and give me the following information FOR EACH CANDIDATE:
total # of tweets
total # of mentions
total # of replies
The total # of tweets is essentially just the total number of rows that contain that specific candidate name. Ideally, the output will be in its own CSV file. So, the column headers would be something like this:
Candidate Name | Total # of Tweets | Total # of Mentions | Total # of Replies
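For reference, here is a minimal sketch of the kind of script described above, using only the Python standard library. It rests on several assumptions not stated in this posting: the file is comma-delimited, it is UTF-8 encoded, and its header row matches the column names listed above exactly. The function name summarize_tweets and its parameters are hypothetical.

```python
import csv
from collections import defaultdict

# Header names taken from the posting; assumed to match the file exactly.
NAME_COL = "Candidate Name"
MENTION_COL = "Mentions?"
REPLY_COL = "Replies?"


def summarize_tweets(input_path, output_path):
    """Count tweets, mentions, and replies per candidate; write a summary CSV."""
    # totals[name] holds [tweet count, mention count, reply count]
    totals = defaultdict(lambda: [0, 0, 0])

    with open(input_path, newline="", encoding="utf-8") as src:
        for row in csv.DictReader(src):
            name = (row.get(NAME_COL) or "").strip()
            counts = totals[name]
            counts[0] += 1  # every row is one tweet
            # Only the literal "1" counts as a yes; anything else is a no.
            counts[1] += int((row.get(MENTION_COL) or "").strip() == "1")
            counts[2] += int((row.get(REPLY_COL) or "").strip() == "1")

    with open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow([
            "Candidate Name",
            "Total # of Tweets",
            "Total # of Mentions",
            "Total # of Replies",
        ])
        # One output row per candidate, sorted by name for stable output.
        for name in sorted(totals):
            writer.writerow([name, *totals[name]])

    return {name: tuple(counts) for name, counts in totals.items()}
```

Calling summarize_tweets("tweets.csv", "candidate_totals.csv") would read the input and write the summary; both file names are placeholders. The file is read one row at a time, so a large row count should not be a memory problem.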
I need this completed ASAP. Please message me if you need further clarification.
Additional Project Description:
02/28/2013 at 12:21 MVT
To clarify:
"Mentions?" indicates whether the tweet mentions another user.
"Replies?" indicates whether the tweet is a reply to another user.
So I need to calculate how many tweets each user published in total, and of those, how many mentioned another user, and how many were a reply to someone else's tweet.
Sorry for the confusion!