Convert Usenet archives into "puff" format

  • Status Closed
  • Budget $250 - $750 USD
  • Total Bids 6

Project Description

You will find an archive of Usenet postings here:

[url removed, login to view]

The archive contains about 2 GB of messages.

The goal of this project is to take these messages and convert them into puffball format. The full description of puffball format can be found here:

[url removed, login to view]

The content of each message goes in the "content" field. The username is formed from the user's email address by replacing the at sign with a dot. Each message is "signed" with a key generated just for that user; we will provide the functions (in JavaScript) to sign the content. The most complicated field is "parents", which must reference every message the user is replying to (a user may have replied to more than one message). The structure of the archives, together with the header information, should make it easy to locate the messages being replied to. You should confirm this!
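As a rough illustration of the mapping described above, here is a minimal Python sketch. It assumes each archived message is available as raw RFC-style text with "From", "Message-ID", "References", and "In-Reply-To" headers, which has not been verified against the actual archive. The function names and the "message_id" key are hypothetical helpers, not part of the puffball format. Signing is left out because the signing functions are supplied separately, and the step that turns Message-IDs into whatever identifier the "parents" field actually requires is not shown.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Message-IDs are written in angle brackets, e.g. <abc123@example.com>.
MESSAGE_ID_RE = re.compile(r"<[^<>\s]+>")


def username_from_address(from_header):
    """Form the username by replacing the at sign with a dot."""
    _, address = parseaddr(from_header)
    return address.replace("@", ".")


def referenced_ids(msg):
    """Collect Message-IDs named in the reply headers, in order, without duplicates."""
    ids = []
    for header in ("References", "In-Reply-To"):
        for mid in MESSAGE_ID_RE.findall(msg.get(header, "")):
            if mid not in ids:
                ids.append(mid)
    return ids


def message_to_record(raw):
    """Build an unsigned record (hypothetical shape) from one raw message."""
    msg = message_from_string(raw)
    return {
        "username": username_from_address(msg.get("From", "")),
        "content": msg.get_payload(),
        # Assumption: these IDs still need to be mapped to puff references.
        "parents": referenced_ids(msg),
        # Kept only so that later replies can look this message up.
        "message_id": msg.get("Message-ID", "").strip(),
    }
```

By Usenet convention, the "References" header lists the whole thread ancestry with the immediate parent last, so whether "parents" should hold every listed ID or only the direct parent is something to confirm against the puffball description.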

You can write the code that parses the archives and creates the puffs in Python, PHP, Perl, JavaScript (with Node), or as a Linux shell script. We will want both the code you write and the puffs it generates.
