Completed

Need a Python-based natural language processing solution built

This project was successfully completed by rsonbol for £2000 GBP in 35 days.

Project Budget: £1500 - £3000 GBP
Completed In: 35 days
Total Bids: 11
Project Description

We will require two versions of the software. The first will be a standalone version which can be installed on a customer PC alongside its prerequisites, and should require a licence key which expires after a defined period. New licence keys can then be provided to allow the user to continue operating the software.
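One way the expiring licence key could work offline is sketched below. This is only an illustration under stated assumptions: the key format, the `SECRET` value and the function names `issue_key` and `check_key` are hypothetical and not part of the brief. The key carries a customer name and an expiry date, signed with an HMAC so that the date cannot be edited without invalidating the key.

```python
import base64
import hashlib
import hmac
from datetime import date

# Hypothetical signing secret. Shipping a shared secret inside the installed
# program is a weakness; public-key signatures would be a stronger choice.
SECRET = b"replace-with-vendor-secret"


def issue_key(customer, expires):
    """Return a key of the form <base64 payload>.<hex HMAC-SHA256 signature>."""
    payload = f"{customer}|{expires.isoformat()}".encode("utf-8")
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + signature


def check_key(key, today):
    """Return True only if the signature is valid and the key has not expired."""
    try:
        encoded, signature = key.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
        _customer, expiry_text = payload.decode("utf-8").rsplit("|", 1)
        expires = date.fromisoformat(expiry_text)
    except ValueError:
        # Malformed key: wrong shape, bad base64, bad text or bad date.
        return False
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    valid = hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
    return valid and today <= expires
```

Renewal then amounts to calling `issue_key` with a later expiry date and sending the new key to the customer. No network access is needed to check a key, but the check trusts the PC's clock.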

The second version will be cloud based, and this part of the project will take longer to develop, and is dependent upon the success of the first. This will require user level authentication, and an area where the user has access to the reports/analysis they have paid for/requested, along with a safe haven for their data. They should be able to upload their data here, analyse it and output results in whatever format they require (e.g. download/print/email).

The software must be compatible with all major operating systems (e.g. Windows, macOS, Linux).

A full list of prerequisites will be required on delivery of the solution.

Common Functionality:

Import data from .csv or .xls format into CouchDB. The data import needs to be flexible: the user should be able to specify which fields should be analysed once the data has been imported, and the system should be able to cope with many different datasets (e.g. different fields in each file).
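A minimal sketch of the flexible import step, assuming CSV input only (.xls would need a third-party reader and is not covered here). The function names and the document layout (`source`, `analyse_fields`, `analysis`) are assumptions for illustration. Each row becomes one JSON document, and the payload is shaped as `{"docs": [...]}`, which is the body CouchDB's `_bulk_docs` endpoint accepts; actually sending it is left out.

```python
import csv
import io
import json


def rows_to_docs(csv_text, analyse_fields):
    """Turn CSV rows into documents, recording which fields the user chose."""
    reader = csv.DictReader(io.StringIO(csv_text))
    available = reader.fieldnames or []
    missing = [f for f in analyse_fields if f not in available]
    if missing:
        raise ValueError(f"fields not found in file: {missing}")
    docs = []
    for row in reader:
        docs.append({
            "source": dict(row),                     # every column, whatever the dataset
            "analyse_fields": list(analyse_fields),  # user-selected text fields
            "analysis": {},                          # NLP results are written here later
        })
    return docs


def bulk_payload(docs):
    """Serialise documents in the {"docs": [...]} shape used for bulk inserts."""
    return json.dumps({"docs": docs})
```

Because the columns are read from the file header rather than fixed in code, files with different fields go through the same path.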

Perform NLP processes on the data, writing the results against the CouchDB record.

Required analysis:

Total word count

Number of different words

Complexity factor (lexical density)

Readability

Total number of characters

Number of characters without spaces

Average syllables per word

Sentence count

Average sentence length (words)

Maximum sentence length (words)

Minimum sentence length (words)

Basic analysis of the words used and sentence lengths

Sentiment analysis

Word cluster diagram, broken down into paradigmatic and syntagmatic forms so the user can choose which output to have.

Frequency and top words.
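Several of the counts listed above can be sketched with the standard library alone. The following is a rough illustration, not a finished analyser: the function names are hypothetical, sentences are split naively on `.`, `!` and `?`, syllables are estimated by counting vowel groups, and the type-token ratio is only a stand-in for lexical density, which strictly needs part-of-speech tagging. Readability is shown as Flesch Reading Ease, one of several possible measures. Sentiment analysis and word clustering are not covered.

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z']+")


def count_syllables(word):
    """Rough heuristic: one syllable per vowel group, at least one per word."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def text_stats(text, top_n=5):
    """Return basic counts for one text field as a dict."""
    words = [w.lower() for w in WORD.findall(text)]
    if not words:
        return {"word_count": 0}
    # Keep only sentence fragments that contain at least one word.
    sentences = [s for s in re.split(r"[.!?]+", text) if WORD.search(s)]
    lengths = [len(WORD.findall(s)) for s in sentences]
    n_words, n_sent = len(words), len(sentences)
    syllables = sum(count_syllables(w) for w in words)
    return {
        "word_count": n_words,
        "distinct_words": len(set(words)),
        "type_token_ratio": len(set(words)) / n_words,
        "char_count": len(text),
        "char_count_no_whitespace": len(re.sub(r"\s", "", text)),
        "avg_syllables_per_word": syllables / n_words,
        "sentence_count": n_sent,
        "avg_sentence_length": n_words / n_sent,
        "max_sentence_length": max(lengths),
        "min_sentence_length": min(lengths),
        # Flesch Reading Ease: higher scores indicate easier text.
        "flesch_reading_ease": 206.835
        - 1.015 * (n_words / n_sent)
        - 84.6 * (syllables / n_words),
        "top_words": Counter(words).most_common(top_n),
    }
```

The returned dict could be stored under the record's analysis field so that results sit alongside the imported data.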

There will need to be two versions. The first is a Python-based program which will run on a user's PC and contain all of the required elements; CouchDB and the other dependencies will also need to be installed. This version should require a licence key which expires after a fixed time period, after which a new key will need to be provided. It is anticipated that this installation will need to work on a machine which is not connected to the internet.

The next version will need to be more sophisticated, and should be cloud based, requiring user level authentication to enable users to login to their own section, upload their data and then run the analysis they require, with the results output back to them.

The cloud version will need to allow the user to register their social media accounts (Facebook, Twitter, Google+ etc.), and would then be required to monitor those feeds, providing feedback to the user when certain criteria are matched within the posts.

These criteria would include a range of standard items, e.g. profanity, extreme anger/disappointment etc.

The criteria would then allow the user to select certain words or phrases which should then be monitored constantly by the system, providing alerts as required.
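The user-selected word and phrase monitoring could be sketched as below. This assumes posts have already been fetched and are available as dicts with `id` and `text` keys; the function name `find_alerts` and that data shape are hypothetical, and fetching from the social networks themselves is not shown. Matching is case-insensitive and respects word boundaries, so a watched term does not fire on a longer word that merely contains it.

```python
import re


def find_alerts(posts, watch_terms):
    """Return one alert per post that contains any watched word or phrase."""
    patterns = {
        term: re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
        for term in watch_terms
    }
    alerts = []
    for post in posts:
        matched = [t for t, p in patterns.items() if p.search(post["text"])]
        if matched:
            alerts.append({"post_id": post["id"], "matched": matched})
    return alerts
```

Standard criteria such as a profanity list could be supplied through the same `watch_terms` argument, while anger or disappointment would need the sentiment analysis rather than plain term matching.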

Ultimately, we would like the user to be able to turn on automatic responding, whereby the system responds to comments based on an analysis of each comment.

For certain customers, we would need to provide automatic daily outputs based on the information collated over the previous day/week/month. These outputs would be in a specific format (to be provided later) and would contain some elements of the analysis.

The system needs to report automatically on the amount of data being processed per user, the type of analysis being run, the number of records read and the time taken.
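A small sketch of how per-user usage figures might be collected around each analysis run. The names `track_usage` and `usage_log`, and the in-memory list used as storage, are assumptions for illustration; a real system would presumably write these entries to the database.

```python
import time
from contextlib import contextmanager

# Hypothetical in-memory store of usage entries.
usage_log = []


@contextmanager
def track_usage(user, analysis_type):
    """Record records read, bytes processed and elapsed time for one run."""
    entry = {
        "user": user,
        "analysis": analysis_type,
        "records_read": 0,
        "bytes_processed": 0,
    }
    start = time.perf_counter()
    try:
        yield entry
    finally:
        # Always log the run, even if the analysis raised an error.
        entry["seconds"] = time.perf_counter() - start
        usage_log.append(entry)


# Example run over two records.
with track_usage("alice", "word_count") as usage:
    for record in ["Great service", "Too slow"]:
        usage["records_read"] += 1
        usage["bytes_processed"] += len(record.encode("utf-8"))
```

Summing `usage_log` entries by user and analysis type would give the per-user totals described above.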

All of the systems built must provide a high level of security for the data, and should ultimately provide auditing of the data to ensure that there is a record of who did what, and when.
