PHP or Ruby discussion board scrapers

In Progress

We're looking for a Ruby or PHP app that accepts the name or ID of a message/discussion board on any of the following sites and downloads its content into a MySQL database:

1. [url removed, login to view]

2. [url removed, login to view]

3. any Simple Machines Forum

Sites 1 and 2 both rely heavily on JavaScript to generate views of their boards. You must be willing to license your code under the GPL or a similar open license. Your code must not adversely affect the host site's performance. We are willing to break the project into milestones or into individual projects for each site.
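One way to keep the scraper from loading the host site is to enforce a minimum delay between requests. The following is a minimal sketch of that idea, not a required design: the class name PoliteThrottle, the interval value, and the injectable clock and sleeper arguments are all assumptions made for illustration.

```ruby
# Hypothetical helper: enforces a minimum delay between consecutive requests
# so the scraper does not hammer the host site.
class PoliteThrottle
  def initialize(min_interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = min_interval # seconds between requests (assumed value)
    @clock = clock               # injectable so the class can be tested
    @sleeper = sleeper
    @last_request_at = nil
  end

  # Waits until at least min_interval has passed since the previous call,
  # then runs the block and returns its value.
  def request
    if @last_request_at
      wait = @min_interval - (@clock.call - @last_request_at)
      @sleeper.call(wait) if wait > 0
    end
    @last_request_at = @clock.call
    yield
  end
end
```

A caller would wrap each page fetch, for example `throttle.request { fetch_page(url) }`, where fetch_page stands in for whatever HTTP code the scraper uses.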

The basic structure of the DB will be (your comments and suggestions welcome):

Sites: id, name, front_page_url, other_metadata

Users: id, name, join_date, running_post_total, other_metadata, site_id

Topics: id, subject, date_opened, running_post_total, other_metadata, site_id

Messages: id, content, date_posted, other_metadata, user_id, parent_message_id, topic_id, site_id
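As a starting point for discussion, the four tables above could be turned into MySQL DDL along these lines. This is only a sketch: the posting names the columns but not their types, so every column type, the InnoDB engine, and the utf8mb4 charset below are assumptions, as are the names SCHEMA and create_table_sql.

```ruby
# Column types are assumptions; the posting only lists column names.
PK   = "INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY"
FK   = "INT UNSIGNED NOT NULL"
META = "TEXT NULL"

SCHEMA = {
  "sites" => {
    "id" => PK, "name" => "VARCHAR(255) NOT NULL",
    "front_page_url" => "VARCHAR(2048) NOT NULL", "other_metadata" => META
  },
  "users" => {
    "id" => PK, "name" => "VARCHAR(255) NOT NULL", "join_date" => "DATETIME NULL",
    "running_post_total" => "INT UNSIGNED NOT NULL DEFAULT 0",
    "other_metadata" => META, "site_id" => FK
  },
  "topics" => {
    "id" => PK, "subject" => "VARCHAR(255) NOT NULL", "date_opened" => "DATETIME NULL",
    "running_post_total" => "INT UNSIGNED NOT NULL DEFAULT 0",
    "other_metadata" => META, "site_id" => FK
  },
  "messages" => {
    "id" => PK, "content" => "MEDIUMTEXT NOT NULL", "date_posted" => "DATETIME NULL",
    "other_metadata" => META, "user_id" => FK,
    "parent_message_id" => "INT UNSIGNED NULL", # NULL for a topic's first post
    "topic_id" => FK, "site_id" => FK
  }
}.freeze

# Builds a CREATE TABLE statement from a table name and a column => type hash.
def create_table_sql(table, columns)
  body = columns.map { |name, type| "  `#{name}` #{type}" }.join(",\n")
  "CREATE TABLE `#{table}` (\n#{body}\n) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;"
end
```

Printing the statements is then a one-liner: `SCHEMA.each { |table, columns| puts create_table_sql(table, columns) }`.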

Summary of relationships:

users:sites 1:1 (may turn out to be 1:many, for now assume 1:1),

topics:sites 1:1,

messages:sites 1:1,

messages:messages 1:many (replies),

users:topics 1:many,

users:messages 1:many,

topics:messages 1:many.
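To illustrate the messages:messages reply relationship, the sketch below rebuilds a thread from parent_message_id values in memory. The Message struct and the function names are hypothetical, and it assumes a top-level post has a nil parent_message_id.

```ruby
# Hypothetical in-memory stand-in for a row of the Messages table.
Message = Struct.new(:id, :parent_message_id, :content)

# Groups messages by parent id so the replies to any message can be looked up directly.
def build_reply_index(messages)
  messages.group_by(&:parent_message_id)
end

# Returns [message_id, depth] pairs in depth-first thread order.
# Assumes top-level posts have a nil parent_message_id.
def thread_order(messages)
  children = build_reply_index(messages)
  ordered = []
  walk = lambda do |parent_id, depth|
    (children[parent_id] || []).each do |msg|
      ordered << [msg.id, depth]
      walk.call(msg.id, depth + 1)
    end
  end
  walk.call(nil, 0)
  ordered
end
```

The depth value can be used to indent replies when displaying a topic.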

Given those relationships, we may not need to store site_id in every table, but the DB should allow fast querying of, for instance, topics per site.
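If site_id is kept on each table, per-site queries can be served by ordinary indexes. The sketch below generates the CREATE INDEX statements; the index names and the choice of indexed columns are assumptions, not requirements.

```ruby
# Hypothetical index names; the column choices assume the tables outlined in this posting.
INDEXES = {
  "users"    => { "idx_users_site" => %w[site_id] },
  "topics"   => { "idx_topics_site_opened" => %w[site_id date_opened] },
  "messages" => {
    "idx_messages_topic_posted" => %w[topic_id date_posted],
    "idx_messages_site" => %w[site_id]
  }
}.freeze

# Builds a MySQL CREATE INDEX statement for one index.
def create_index_sql(table, name, columns)
  cols = columns.map { |column| "`#{column}`" }.join(", ")
  "CREATE INDEX `#{name}` ON `#{table}` (#{cols});"
end
```

With the composite index on topics, a query such as `SELECT * FROM topics WHERE site_id = ? ORDER BY date_opened` can use the index for both the filter and the sort.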

Skills: JavaScript, MySQL, PHP, Ruby on Rails

About the Employer:
( 6 reviews ) Chicago, United States

Project ID: #1147486