Project Description:
Collusion is a Firefox add-on that visualizes which sites track you as you browse the web. For this project, we'd like a simplified version of Collusion (https://github.com/toolness/collusion), without the visualization, for a small data-gathering project.
Changes required:
1. Must read a CSV file containing a set of URLs (3 columns: URLId [bigint], SiteId [bigint], URL [string])
2. Must output two CSV files:
2.1 List of all third-party trackers encountered (TrackerId [autogenerated], TrackerDomainName [Collusion provides this])
2.2 Trackers encountered on each URL (URLId [from input CSV], TrackerId [connecting to TrackerId in 2.1], GoogleAnalyticsAnon [if Google Analytics tracks a URL, is anonymization enabled?])
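As a rough illustration of the input and output formats listed above, here is a minimal Python sketch. It is not part of Collusion. The function names, the optional header row in the input, and the use of an empty GoogleAnalyticsAnon value for non-Google-Analytics trackers are all assumptions made for the example. The tracker observations themselves would come from whatever detection logic is reused from Collusion.

```python
import csv
import io


def read_urls(fileobj):
    """Read (URLId, SiteId, URL) rows; a non-numeric first field is treated as a header (assumption)."""
    rows = []
    for record in csv.reader(fileobj):
        if not record or not record[0].strip().isdigit():
            continue  # skip blank lines and an optional header row
        url_id, site_id, url = record[:3]
        rows.append((int(url_id), int(site_id), url.strip()))
    return rows


def build_outputs(observations):
    """Turn (url_id, tracker_domain, ga_anon) observations into the two output tables.

    Returns (trackers, url_trackers):
      trackers     -> rows for file 2.1: (TrackerId, TrackerDomainName)
      url_trackers -> rows for file 2.2: (URLId, TrackerId, GoogleAnalyticsAnon)
    """
    tracker_ids = {}  # domain -> autogenerated TrackerId, starting at 1
    url_trackers = []
    for url_id, domain, ga_anon in observations:
        tracker_id = tracker_ids.setdefault(domain, len(tracker_ids) + 1)
        url_trackers.append((url_id, tracker_id, ga_anon))
    trackers = [(tracker_id, domain) for domain, tracker_id in tracker_ids.items()]
    return trackers, url_trackers


def write_csv(fileobj, header, rows):
    """Write a header line followed by the given rows."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
```

With real files, read_urls would be given open("input.csv", newline="") and write_csv would be called once per output file, for example with the header ["TrackerId", "TrackerDomainName"] for file 2.1.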
For information on checking whether Google Analytics is set to Anonymization, see: http://support.google.com/analytics/bin/answer.py?hl=en&answer=2763052.
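One possible way to fill the GoogleAnalyticsAnon column is sketched below in Python. It assumes two commonly seen markers: the page source calling the _anonymizeIp or anonymizeIp setting, and the Google Analytics tracking request carrying an aip=1 query parameter. Both markers and both function names are assumptions for illustration and should be confirmed against the Google support page linked above before being relied on.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed markers: _gaq.push(['_gat._anonymizeIp']) or ga('set', 'anonymizeIp', true)
_ANON_SNIPPET = re.compile(r"_anonymizeIp|anonymizeIp['\"]\s*,\s*true")


def page_requests_anonymization(page_source):
    """Return True if the page source appears to switch on IP anonymization."""
    return bool(_ANON_SNIPPET.search(page_source))


def beacon_is_anonymized(request_url):
    """Return True if a tracking request URL carries aip=1 (assumed marker)."""
    query = parse_qs(urlsplit(request_url).query)
    return query.get("aip") == ["1"]
```

Checking the outgoing request is likely the more reliable of the two, since a page can load its tracking snippet from an external script that a source-text search would miss.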
The project will be split into three deliverables with a percentage payment for each:
1. Freelancer provides sample output data for an input file that we will supply (15% upon approval)
2. Freelancer provides executables runnable on a Mac or Windows Server along with source code and documentation (55% upon approval)
3. Freelancer runs the tool with our input file (the same as for #1), finds the top 10 trackers, and examines whether any of them provide an anonymization option. For those that do, Freelancer will collect that information in the same fashion as for output file #2. If none of the other top 10 trackers offers such an option, Freelancer will be paid without additional work (30% upon approval).
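For the third deliverable, the top 10 trackers could be derived from the rows of output file 2.2. The Python sketch below assumes that "top" means "seen on the largest number of distinct URLs", which the description does not specify. The function name is hypothetical.

```python
from collections import Counter


def top_trackers(url_tracker_rows, limit=10):
    """Rank TrackerIds by the number of distinct URLIds they appear on.

    url_tracker_rows: iterable of (URLId, TrackerId, ...) rows from output file 2.2.
    Returns a list of (TrackerId, url_count) pairs, most frequent first.
    """
    counts = Counter()
    seen = set()
    for url_id, tracker_id, *_ in url_tracker_rows:
        if (url_id, tracker_id) not in seen:  # count each URL once per tracker
            seen.add((url_id, tracker_id))
            counts[tracker_id] += 1
    return counts.most_common(limit)
```

The resulting TrackerIds can be joined back to output file 2.1 to get the domain names to investigate.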
Additional Project Description:
03/04/2013 at 14:14 MST
Note: the project does not need to be delivered as a Firefox plug-in (we'd actually prefer a standalone tool), but if keeping it as a plug-in is easier, that's fine as well.