Word Cloud

Completed Posted Mar 29, 2014 Paid on delivery

Implement a program that reads a text document, counts the occurrences of each word, and determines the ten most frequent words in the document. Common words (also called ‘stop words’) such as “a” or “the” need to be removed from the document, as their frequency is usually not of interest. To make sure we do not artificially differentiate words based on their capitalization, normalize them by converting all word occurrences to lowercase. In addition, you should also strip all words that are fewer than three characters long, even if they do not appear on the stop word list. Your solution also needs to strip punctuation, non-ASCII characters, etc.
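The cleanup rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (WordNormalizer, normalize) are hypothetical, and every character that is not an ASCII letter (punctuation, digits, apostrophes, non-ASCII characters) is treated as a word separator, which the brief does not spell out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the names are illustrative and not part of the brief.
class WordNormalizer {

    // Splits one line of text into cleaned-up candidate words.
    static List<String> normalize(String line, Set<String> stopWords) {
        List<String> words = new ArrayList<>();
        // Assumption: any run of characters other than ASCII letters
        // separates two words, so "fox's" yields "fox" and "s".
        for (String token : line.split("[^A-Za-z]+")) {
            String word = token.toLowerCase(Locale.ROOT);
            if (word.length() < 3) {
                continue; // drops short words and empty tokens
            }
            if (stopWords.contains(word)) {
                continue; // drops stop words (expected in lowercase)
            }
            words.add(word);
        }
        return words;
    }
}
```

With a stop word set containing only "the", the line "The Quick, brown FOX is at it!" would yield the words quick, brown, and fox.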

To solve this, use one or more Tables. The Java Collections Framework (JCF) provides a number of implementations of the ADT Table, called Map in JCF terminology (HashMap, TreeMap, and LinkedHashMap). Alternatively, for the stop words, you could store them in a Set. The dominant operations for stop words are inserting them into the set when reading in the stop word file, and testing whether a candidate word is in this set. Document words are inserted into a Map together with their occurrence count; when a word occurs more than once, you need to retrieve its current count and update it.
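A minimal sketch of this Set and Map usage, assuming the words have already been lowercased and cleaned. The names WordCounter, loadStopWords, and count are hypothetical; the stop word lines could come from a call such as Files.readAllLines on whatever stop word file is used.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the names are illustrative and not part of the brief.
class WordCounter {

    // Builds the stop word set, assuming one stop word per input line.
    static Set<String> loadStopWords(List<String> lines) {
        Set<String> stopWords = new HashSet<>();
        for (String line : lines) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                stopWords.add(word); // insert operation on the Set
            }
        }
        return stopWords;
    }

    // Counts occurrences of every word that is not a stop word.
    static Map<String, Integer> count(List<String> words, Set<String> stopWords) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            if (stopWords.contains(word)) {
                continue; // membership test on the Set
            }
            // Inserts 1 for a new word, otherwise adds 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```

A HashMap is used here because the iteration order of the counts does not matter before sorting; a TreeMap would work as well but keeps keys in alphabetical order, which is not the order the task asks for.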

Finally, once you have counted the word occurrences, you need to sort all words by frequency to determine the 10 most commonly occurring ones. Use the sort method provided in the JCF for that purpose. As the sort order is atypical (i.e., the entries are not sorted alphabetically by key), you will probably need to implement your own comparison object, called a comparator.
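One way the sorting step might look, using Collections.sort with a custom Comparator over the map entries. The names TopWords, ByCountDescending, and top are hypothetical, and breaking ties alphabetically is an assumption that the brief does not specify.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the names are illustrative and not part of the brief.
class TopWords {

    // Orders entries by count, highest first.
    static class ByCountDescending implements Comparator<Map.Entry<String, Integer>> {
        @Override
        public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            if (byCount != 0) {
                return byCount;
            }
            // Assumption: equal counts are ordered alphabetically by word.
            return a.getKey().compareTo(b.getKey());
        }
    }

    // Returns up to n entries with the highest counts.
    static List<Map.Entry<String, Integer>> top(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        Collections.sort(entries, new ByCountDescending());
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }
}
```

Calling TopWords.top(counts, 10) on the finished count map would then give the ten most frequent words, or fewer if the document contains fewer than ten distinct words.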

Test the program on "[url removed, login to view]", attached within.

Engineering Java Software Architecture

Project ID: #5737251

About the project

2 proposals Remote project Active Mar 29, 2014

Awarded to:

xvxvyanzi

A valid proposal has not yet been provided

$66 USD in 1 day
(0 Reviews)
0.0

2 freelancers are bidding on average $158 for this job

wjx823

hello, client. i am interested in your job. i have seen your job post and i understand what you want to do. please give me. jixing.

$250 USD in 3 days
(2 Reviews)
3.4