Hi, I've got 200,000 record MySQL database of crime reports from Denver, Colorado. It includes street addresses sometimes, and cross streets sometimes. It also includes offense, timestamp, and some other geographic information. I'd like you to build a new database that normalizes this data and adds latitude / longitude to the addresses. This would also include a cron script, preferably in python, that can look up new records entered (there are usually a couple hundred new records added every day) and add them to the database. Django familiarity is a plus, as is geopy familiarity ( [url removed, login to view] ).
**More about Django and this project:**
The initial database (where the information gets dumped to) was built based on the data that's scraped. These are the fields:
The production database will be administered by Django, and the front-end templates run through Django. I would prefer to get Django data models mapped to the production database, yes. However, for folk intimate with python and MySQL but not Django, I would be willing to have them do what they do best and build the Django data models (based on the production database design) myself.
I plan on taking care of the server administration -- what I was thinking about re: deliverables is the python files, and a database schema dump I can run to generate the tables on the server. That means using MySQL [url removed, login to view], and python 2.4, and if you need additional libraries let's talk that through at the start.
I don't have a planned schema for normalization / table structures -- Django-generated would be ideal, Django-friendly at minimum. What does "Django-friendly" mean? It means every one-to-one relationship is a field, and the many-to-many relationships use a separate table to associate the ids of the relationship.
**A few other notes:**
I'm looking for a one-time job -- I expect to get a python script I can run on a daily crontab that can pull from the scraped data and put it into the production database. The scraped data right now is in a database, but can also be provided via text file (it's pulled into the database via a LOAD DATA INFILE command).
An acceptable timeframe would be 3 weeks -- however, I'm new to Rent-A-Coder, so if communication / time-zone differences get in the way, I will understand.
I attached a dump of 1,000 records from the database, along with a dump of the db structure.