Data analyzer + text similarity comparison

This project was awarded to sveralex for $440 USD.

Project Budget
$250 - $750 USD
Project Description

Need to create software (a script) that will store, filter, and analyze data. The database can hold up to hundreds of thousands of records, so you should use well-optimized algorithms and development technologies. (New data will be uploaded every day.)

The workflow has the following structure:

.CSV data file -> Database and text comparison analyzer -> Processing macros and filters -> Output in .TXT format

1) Load data from a .CSV file with a fixed structure into the current database.
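A minimal sketch of the loading step, assuming SQLite as the database and a semicolon-separated file (the separator is guessed from the record structure in step 3). The table name `records`, the column names `field1..fieldN`, and the helper `load_csv` are hypothetical, chosen only for illustration:

```python
import csv
import io
import sqlite3

def load_csv(conn, csv_file, n_fields=3):
    """Append rows from a semicolon-separated CSV stream to the `records` table.

    Assumption: every valid row has at least `n_fields` columns; shorter rows are skipped.
    """
    cols = ", ".join(f"field{i} TEXT" for i in range(1, n_fields + 1))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        f"id INTEGER PRIMARY KEY, {cols}, "
        "max_similarity REAL, most_similar_id INTEGER)"
    )
    names = ", ".join(f"field{i}" for i in range(1, n_fields + 1))
    marks = ", ".join("?" for _ in range(n_fields))
    reader = csv.reader(csv_file, delimiter=";")
    rows = [row[:n_fields] for row in reader if len(row) >= n_fields]
    # executemany inserts the whole batch; the commit makes the daily upload durable.
    conn.executemany(f"INSERT INTO records ({names}) VALUES ({marks})", rows)
    conn.commit()
    return len(rows)

# Usage with an in-memory database and an in-memory "file".
conn = sqlite3.connect(":memory:")
sample = io.StringIO("John;Smith;some text\nAndy;Lee;other text\n")
loaded = load_csv(conn, sample)
```

Because the table is created only if it does not exist, the same function can be called for each daily upload and will append to the current data.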

2) After the data is uploaded, the software should compare one text field of each record with the same field of every other record in the ENTIRE DATABASE for similarity.

(!!! This is the most difficult part of the project: it needs to compare two texts (two records) and return their similarity as a percentage.)
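One possible way to express similarity as a percentage is sketched below. The project does not say which algorithm the reference sites use, so this uses `difflib.SequenceMatcher` from the Python standard library as a stand-in; the function name `similarity_percent` and the case-insensitive comparison are assumptions:

```python
from difflib import SequenceMatcher

def similarity_percent(a: str, b: str) -> float:
    """Return the similarity of two texts as a percentage from 0.0 to 100.0.

    Assumption: comparison ignores letter case. SequenceMatcher.ratio() is
    2 * matched_characters / total_characters, scaled here to percent.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(ratio * 100, 1)

same = similarity_percent("abc", "ABC")        # identical after lowercasing
partial = similarity_percent("abcd", "abcf")   # 3 of 4 characters match
different = similarity_percent("abc", "xyz")   # nothing in common
```

Here `partial` works out to 2 * 3 / 8 = 75%, since three matching characters are counted against eight characters in total.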

Examples of how this works can be found at:

[url removed, login to view]

[url removed, login to view] (The interface is in Russian; it can be translated with Google Translate.)

After the current record has been compared with all records in the DB, the software stores the MAXIMUM similarity percentage and the ID of the record it is most similar to.

This information is saved for each record in the DB.
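The search for the most similar record can be sketched as a brute-force pass over all pairs. This is only an illustration under assumptions: `best_matches` is a hypothetical helper, the similarity measure is again `difflib.SequenceMatcher`, and records are passed in as an `{id: text}` dictionary:

```python
from difflib import SequenceMatcher

def best_matches(texts):
    """Map each record ID to (max similarity percent, ID of the most similar record).

    `texts` is {record_id: text}. Every record is compared with every other
    record, so the cost grows quadratically with the number of records.
    """
    result = {}
    ids = list(texts)
    for i in ids:
        best_pct, best_id = 0.0, None
        for j in ids:
            if i == j:
                continue  # a record is never compared with itself
            pct = SequenceMatcher(None, texts[i], texts[j]).ratio() * 100
            if pct > best_pct:
                best_pct, best_id = pct, j
        result[i] = (round(best_pct, 1), best_id)
    return result

texts = {1: "red apple pie", 2: "red apple pies", 3: "blue sky"}
matches = best_matches(texts)
```

With hundreds of thousands of records the all-pairs loop means billions of comparisons, so a real implementation would likely need some way to narrow down the candidates before the detailed comparison.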

3) Each record has the following structure:

Field 1;Field 2;Field 3;...;Max percent of similarity;ID of most similar record

4) The ability to create flexible filters (macros) to sort the data. It must be possible to save filters (macros).

A macro consists of several filters (fields have different types: date, text, numerical).

For example:

Macro =

Field 1 contains "John"

Field 4 is equal to "address" OR Field 4 is equal to "Andy"


So a macro has a complex structure, with AND / OR logical relations between the filters inside it.
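A saveable macro could be represented as a nested structure of AND / OR groups, as in the sketch below. The keys `all` (AND), `any` (OR), `field`, `op`, and `value`, the operator names, and the function `record_matches` are all hypothetical. The example macro assumes the two filters above are joined by AND, which the description does not state explicitly:

```python
import json

# Supported filter operations (assumed names; only text filters are shown).
OPS = {
    "contains": lambda value, arg: arg in str(value),
    "equals": lambda value, arg: str(value) == arg,
}

def record_matches(record, node):
    """Evaluate a macro node against one record (a dict of field values)."""
    if "all" in node:   # AND group: every child must match
        return all(record_matches(record, child) for child in node["all"])
    if "any" in node:   # OR group: at least one child must match
        return any(record_matches(record, child) for child in node["any"])
    # Leaf filter: apply one operation to one field.
    return OPS[node["op"]](record.get(node["field"], ""), node["value"])

macro = {"all": [
    {"field": "field1", "op": "contains", "value": "John"},
    {"any": [
        {"field": "field4", "op": "equals", "value": "address"},
        {"field": "field4", "op": "equals", "value": "Andy"},
    ]},
]}

# Saving and restoring the macro is a plain JSON round trip.
saved = json.dumps(macro)
restored = json.loads(saved)

records = [
    {"field1": "John Smith", "field4": "Andy"},
    {"field1": "John Doe", "field4": "office"},
    {"field1": "Mary", "field4": "address"},
]
selected = [r for r in records if record_matches(r, restored)]
```

Only the first record is selected: the second fails the Field 4 group and the third does not contain "John". Date and numerical filters could be added as further entries in `OPS`.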

5) After the data has been processed by the macro, export the result to a .TXT file.
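The export step might look like the sketch below, which writes one record per line using the semicolon layout from step 3. The function `export_txt` and the field names are hypothetical, and an in-memory buffer stands in for the .TXT file:

```python
import io

def export_txt(records, out, fields):
    """Write each record as one semicolon-separated line; return the line count."""
    count = 0
    for rec in records:
        # Missing fields are written as empty strings so columns stay aligned.
        out.write(";".join(str(rec.get(f, "")) for f in fields) + "\n")
        count += 1
    return count

fields = ["field1", "field2", "max_similarity", "most_similar_id"]
rows = [
    {"field1": "John", "field2": "Smith", "max_similarity": 87.5, "most_similar_id": 42},
]
buf = io.StringIO()  # with a real file: open("output.txt", "w", encoding="utf-8")
written = export_txt(rows, buf, fields)
```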

