Need a Python-based natural language processing solution built

Status: In progress
Bids: 11
Avg Bid (GBP): £2145
Project Budget (GBP): £1500 - £3000

Project Description:
We require two versions of the software. The first is a standalone version which can be installed on a customer PC, alongside its prerequisites, and must include a licence key which expires after a defined period. New licence keys can then be issued to allow the user to continue operating the software.

The second version will be cloud based, and this part of the project will take longer to develop, and is dependent upon the success of the first. This will require user level authentication, and an area where the user has access to the reports/analysis they have paid for/requested, along with a safe haven for their data. They should be able to upload their data here, analyse it and output results in whatever format they require (e.g. download/print/email).

The software should be compatible with all major operating systems (e.g. Windows, Mac OS, Linux).

A full list of prerequisites will be required on delivery of the solution.

Common Functionality:

Import data from .csv or .xls format into CouchDB. The data import needs to be flexible: the user should be able to specify which fields should be analysed once the data has been imported, and the system should be able to cope with many different datasets (e.g. different fields in each file).
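The flexible import described above could be sketched roughly as follows. This is only an illustration using the standard library: `rows_to_documents`, the `analyse_fields` argument and the document layout are hypothetical names, not part of the brief. It builds JSON-ready dictionaries that could later be stored as CouchDB documents; the database write itself is not shown, and .xls files would need a separate reader.

```python
import csv
import io


def rows_to_documents(csv_text, analyse_fields):
    """Turn CSV text into JSON-ready dicts, flagging which fields to analyse.

    Hypothetical helper: the brief does not define a document layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    # Fail early if the user picked a column this file does not have.
    missing = [name for name in analyse_fields if name not in header]
    if missing:
        raise ValueError("fields not in file: " + ", ".join(missing))
    documents = []
    for row in reader:
        documents.append({
            "fields": dict(row),              # every column is kept as-is
            "analyse": list(analyse_fields),  # columns chosen by the user
            "analysis": {},                   # to be filled in by the NLP step
        })
    return documents


sample = "id,comment,region\n1,Great service,North\n2,Too slow,South\n"
docs = rows_to_documents(sample, ["comment"])
```

Because the column names are read from each file's header row, the same function copes with datasets that have different fields.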

Perform NLP processes on the data, writing the results against the CouchDB record.

Required analysis:
Total word count
Number of different words
Complexity factor (lexical density)
Readability
Total number of characters
Number of characters without spaces
Average syllables per word
Sentence count
Average sentence length (words)
Maximum sentence length (words)
Minimum sentence length (words)
Basic analysis of the words used and sentence lengths
Sentiment analysis
Word cluster diagram, broken down into paradigmatic and syntagmatic forms so the user can choose which output to have.
Frequency and top words.
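As a rough illustration of the simpler counts in this list, the sketch below uses only regular expressions and `collections.Counter`. The function name `basic_stats` and its tokenisation rules are assumptions made for the example; lexical density, readability, syllable counts, sentiment and the word cluster diagram need additional linguistic resources and are not attempted here.

```python
import re
from collections import Counter


def basic_stats(text, top_n=3):
    """Compute a few of the listed counts with simple regex tokenisation."""
    # Split after ., ! or ? followed by whitespace; a crude sentence rule.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # A word is a run of letters and apostrophes, compared in lower case.
    words = re.findall(r"[A-Za-z']+", text.lower())
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return {
        "word_count": len(words),
        "different_words": len(set(words)),
        "characters": len(text),
        # Only the space character is removed, not tabs or newlines.
        "characters_no_spaces": len(text.replace(" ", "")),
        "sentence_count": len(sentences),
        "avg_sentence_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "max_sentence_length": max(lengths, default=0),
        "min_sentence_length": min(lengths, default=0),
        "top_words": Counter(words).most_common(top_n),
    }


stats = basic_stats("The cat sat. The cat ran away!")
```

The returned dictionary is JSON-serialisable apart from the tuples in `top_words`, which become lists when encoded, so it could be written back against the source record.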

There will need to be two versions. The first is a Python-based program which runs on a user's PC and contains all of the required elements; CouchDB and the other prerequisites will also need to be installed. This version should require some form of licence key which expires after a fixed time period, after which a new key will need to be provided. It is anticipated that this installation will need to work on a machine which is not connected to the internet.
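One possible shape for an expiring licence key that can be checked without an internet connection is a signed expiry date. The sketch below is an assumption-laden illustration, not a specified design: `SECRET`, `make_key` and `key_is_valid` are hypothetical names, the customer name is assumed not to contain `|`, and the check trusts the PC's clock. A shared secret shipped inside installed software can be extracted, so a production scheme would more likely use an asymmetric signature.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret; anything embedded in an installed program can be
# extracted, so this only illustrates the idea.
SECRET = b"replace-with-vendor-secret"


def _sign(payload):
    """Return the hex HMAC-SHA256 signature of the payload text."""
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def make_key(customer, expires):
    """Build 'customer|YYYY-MM-DD|signature' for the given expiry date."""
    payload = f"{customer}|{expires.isoformat()}"
    return f"{payload}|{_sign(payload)}"


def key_is_valid(key, today):
    """Check the signature and that the expiry date has not passed."""
    try:
        customer, expires_text, signature = key.split("|")
        expires = date.fromisoformat(expires_text)
    except ValueError:
        return False  # wrong number of parts or an unparseable date
    expected = _sign(f"{customer}|{expires_text}")
    # Compare as bytes in constant time.
    same = hmac.compare_digest(signature.encode(), expected.encode())
    return same and today <= expires


key = make_key("acme", date(2014, 1, 31))
```

Issuing a new key is then just a call to `make_key` with a later date; editing the date inside an existing key invalidates its signature.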

The next version will need to be more sophisticated, and should be cloud based, requiring user level authentication to enable users to login to their own section, upload their data and then run the analysis they require, with the results output back to them.

The cloud version will need to allow the user to register their Facebook, Twitter, Google+ etc. social media accounts, and would then be required to monitor those feeds, providing feedback to the user when certain criteria are matched within the posts.

These criteria would include a range of standard items: e.g. profanity, extreme anger/disappointment etc.
The criteria would also allow the user to select certain words or phrases which should then be monitored constantly by the system, providing alerts as required.
Ultimately, we would like the user to be able to turn on automatic responding, whereby the system responds to comments based on an analysis of the response.
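The user-selected words and phrases described above could be matched against incoming posts along these lines. This is a minimal sketch: `find_alerts` is a hypothetical name, fetching posts from the social networks is out of scope, and it assumes each watched phrase begins and ends with a letter or digit so that the word-boundary check behaves as expected.

```python
import re


def find_alerts(post, watch_phrases):
    """Return the watched phrases that occur in a post, ignoring case."""
    hits = []
    for phrase in watch_phrases:
        # \b stops 'late' from matching inside 'plate'; re.escape treats
        # the phrase as literal text rather than as a pattern.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, post, flags=re.IGNORECASE):
            hits.append(phrase)
    return hits


alerts = find_alerts(
    "Delivery was LATE and the plate was broken",
    ["late", "refund", "broken"],
)
```

Criteria such as extreme anger or disappointment cannot be captured by phrase matching alone and would rely on the sentiment analysis step.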

For certain customers, we would need to provide automatic, daily outputs of data based on the information collated over the previous day/week/month in a specific format (to be provided later) which contains some elements of the analysis.

The system needs to automatically report on the amount of data being processed per user, what type of analysis is being run, the number of records read and the time taken.
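A simple way to collect these figures is to wrap each analysis run in a context manager that records them. The sketch below is illustrative only: `track_usage`, `usage_log` and the field names are hypothetical, and the in-memory list stands in for whatever persistent store the real system would use.

```python
import time
from contextlib import contextmanager

usage_log = []  # hypothetical in-memory store; a real system would persist this


@contextmanager
def track_usage(user, analysis_type):
    """Record who ran which analysis, how much was read and how long it took."""
    entry = {
        "user": user,
        "analysis": analysis_type,
        "records_read": 0,
        "characters": 0,
    }
    start = time.perf_counter()
    try:
        yield entry
    finally:
        # Logged even if the analysis raises, so failed runs are still counted.
        entry["seconds"] = time.perf_counter() - start
        usage_log.append(entry)


with track_usage("alice", "word_count") as run:
    for text in ["first record", "second record"]:
        run["records_read"] += 1
        run["characters"] += len(text)
```

Each entry in `usage_log` then holds the user, the analysis type, the volume processed and the elapsed time for one run.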

All of the systems built must provide a high level of security for the data, and should ultimately provide auditing of the data to ensure that there is a record of who did what when!

Additional Project Description:
02/07/2013 at 20:50 EET

We require a cloud based natural language processing system which can
have data imported into it for analysis, and then output back to the
user. It will require user level authentication, and an area where the
user has access to the reports/analysis they have paid for/requested,
along with a safe haven for their data. They should be able to upload
their data here, analyse it and output results in whatever format they
require (e.g. download/print/email).

The software should be compatible with all major operating systems
(e.g. Windows, Mac OS, Linux).

A full list of prerequisites will be required on delivery of the solution.

Common Functionality:

Import data from .csv or .xls format into CouchDB. The data import needs
to be flexible: the user should be able to specify which fields
should be analysed once the data has been imported, and the system
should be able to cope with many different datasets (e.g. different
fields in each file).

Perform NLP processes on the data, writing the results against the
CouchDB record.

Required analysis:
Total word count
Number of different words
Complexity factor (lexical density)
Readability
Total number of characters
Number of characters without spaces
Average syllables per word
Sentence count
Average sentence length (words)
Maximum sentence length (words)
Minimum sentence length (words)
Basic analysis of the words used and sentence lengths
Sentiment analysis
Word cluster diagram, broken down into paradigmatic and syntagmatic
forms so the user can choose which output to have.
Frequency and top words.

The user will be able to register their Facebook, Twitter, Google+ etc.
social media accounts, and the system would then be required to monitor
those feeds, providing feedback to the user when certain criteria are
matched within the posts.

These criteria would include a range of standard items: e.g. profanity,
extreme anger/disappointment etc.
The criteria would also allow the user to select certain words or
phrases which should then be monitored constantly by the system,
providing alerts as required.

The system needs to automatically report on the amount of data being
processed per user, what type of analysis is being run, the number of
records read and the time taken.

All of the systems built must provide a high level of security for the
data, and should ultimately provide auditing of the data to ensure that
there is a record of who did what when!


02/10/2013 at 13:13 EET
A fuller description of the project requirements is contained within the attachment.

Skills required:
Big Data, Natural Language, NoSQL Couch & Mongo, Python, Social Networking
Additional Files: System+Specification.doc
Project posted by:
dombov United Kingdom
Verified