Python programmer needed. Program/script needs to read/import all TXT files within directory and produce ngrams(bigrams, trigrams, etc.) This script needs to produce a report or file with the frequency distribution as shown in the example. I need to be able to simply modify the script to produce ngrams of any length easily.
This is simple and easy to do but this task serves as both a prequalifier for more complex natural language processing projects.
As an added optional bonus, I will pay $50 to the person that can acquire the entire english language book collection from project gutenberg(in txt format) and adapt the script separately to process these textbooks. A sample of a book from project gutenberg is included. I can provide a google drive folder to upload.
All code must be commented appropriately to reasonably explain what complex lines of code do(You don't need to comment simple python statements like import etc etc)
Script/program needs to be able to be multithreaded/parallel for future use when there are hundreds of thousands of books that need processing.
19 freelancers are bidding on average $159 for this job
Hi, I am a Python Programmer with 5 years of experience. I did NLP and Big Data projects using Python before. I would like to know more details about your project. Contact me :)
I have read your project details and I am very keen to take up this project. I am still as student but have done natural language processing a few times.