In Progress

MySQL-PHP Recursive Script to Count Words & Phrases

TABLE_1 (input)

----------------------------------------------

phrases | phrase_appearance_count

Note: Each text phrase consists of one or a few words.

TABLE_2 (output/result)

----------------------------------------------

word_or_phrase | total_number_of_appearances | phrase_count

1. For each word in TABLE_1, the script calculates:

- the total number of appearances of the word in TABLE_1 (SUM of phrase_appearance_count over the phrases that contain it)

- the number of phrases in TABLE_1 that include this word.

The result is placed into TABLE_2.

2. Then, for each pair of words [within each phrase], the script calculates:

- the total appearance count of the phrases in TABLE_1 that contain both word1 and word2 (SUM of phrase_appearance_count)

- the number of phrases in TABLE_1 that contain both word1 and word2.

The result is placed into TABLE_2. Duplicate results are skipped.

3. Then, for each combination of 3 words [within each phrase], the script calculates:

- the total appearance count of the phrases in TABLE_1 that contain word1, word2 and word3 (SUM of phrase_appearance_count)

- the number of phrases in TABLE_1 that contain word1, word2 and word3.

The result is placed into TABLE_2. Duplicate results are skipped.

4. Then for each combination of 4 words......

5. Then for each combination of 5 words......

............

............

............

This repeats until all phrases, and all word combinations that exist within them, have been analyzed.

That's it.

---------------------------------------------------------------------------------

=================================================

Requirements / Notes / Clarifications:

=================================================

1) The order of words in a phrase doesn't matter:

"a b" == "b a"

"a b c" == "a c b" == "b a c" == "b c a" == "c a b" == "c b a"

etc.
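
One way to honor this requirement is to reduce every phrase or word combination to a single canonical key before comparing or storing it. The sketch below sorts the words alphabetically. The name canonical_key is hypothetical, and treating a word repeated inside one phrase as a single occurrence is an assumption that the requirements do not settle.

```php
<?php
// Hypothetical helper: map a phrase to a canonical key so that word order is ignored.
function canonical_key(string $phrase): string
{
    // Split on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($phrase), -1, PREG_SPLIT_NO_EMPTY);
    // Assumption: a word repeated inside one phrase counts once.
    $words = array_unique($words);
    // Alphabetical order makes "b a" and "a b" produce the same key.
    sort($words, SORT_STRING);
    return implode(' ', $words);
}
```

With this helper, "c a b", "b c a" and "a b c" all map to the key "a b c", so a duplicate check becomes a plain string comparison.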

2) The real data consists of ~20,000 phrases and could grow even larger. Choose carefully between relying on MySQL functions/joins, PHP array functions, or a mix of the two. Data manipulation MUST be very FAST and EFFICIENT!

3) Simple Example:

INPUT

------------------

"a", 26

"c a b", 107

"b g", 25

OUTPUT

-----------------------------

"a", 133, 2

"b", 132, 2

"c", 107, 1

"g", 25, 1

"a b", 107, 1

"a c", 107, 1

"b c", 107, 1

"b g", 25, 1

"a b c", 107, 1
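
The expected output above can be reproduced with a short in-memory sketch. It is only one possible approach, not necessarily the fastest: for every phrase it enumerates each non-empty subset of the phrase's words with a bitmask and accumulates the two counters. The function name count_combinations and the array shapes are assumptions made for illustration.

```php
<?php
// Sketch: count every word combination in memory (hypothetical function name).
// $rows is a list of [phrase, phrase_appearance_count] pairs, as in TABLE_1.
function count_combinations(array $rows): array
{
    $result = [];
    foreach ($rows as [$phrase, $count]) {
        // Unique, sorted words, so "c a b" and "a b c" yield the same keys.
        $words = array_unique(preg_split('/\s+/', trim($phrase), -1, PREG_SPLIT_NO_EMPTY));
        sort($words, SORT_STRING);
        $n = count($words);
        // Each bitmask from 1 to 2^n - 1 selects one non-empty subset of the words.
        for ($mask = 1; $mask < (1 << $n); $mask++) {
            $subset = [];
            for ($i = 0; $i < $n; $i++) {
                if ($mask & (1 << $i)) {
                    $subset[] = $words[$i];
                }
            }
            $key = implode(' ', $subset);
            if (!isset($result[$key])) {
                $result[$key] = ['total' => 0, 'phrases' => 0];
            }
            $result[$key]['total'] += $count;   // total_number_of_appearances
            $result[$key]['phrases'] += 1;      // phrase_count
        }
    }
    return $result;
}

// The sample input from the example above.
$counts = count_combinations([['a', 26], ['c a b', 107], ['b g', 25]]);
```

A phrase of n distinct words produces 2^n - 1 combinations, so this stays cheap for phrases of a few words but grows quickly for long ones.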

4) Real data will look more like:

------------------------------

"apple", 45

"apple banana cherry", 4

"apple eggplant fruit", 5

"banana grape", 16

"orange fruit tomato a potato", 1

"pears and apples", 3

..............

5) Recursion is one possible approach. A possible algorithm (just an idea/suggestion ...):

5.1) Convert TABLE_1 into WORDS_TABLE by splitting the phrases into separate words.

WORDS_TABLE (intermediate)

----------------------------------------------------------------

word_id | phrase_id | word | phrase | phrase_appearance_count

----------------------------------------------------------------

For efficiency, create the necessary indexes [or use ARRAY....]
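
As an illustration of step 5.1, the sketch below expands phrase rows into one row per (phrase, word) pair using plain PHP arrays. The function name build_word_rows is hypothetical, and treating word_id as a simple running row number is an assumption, since the layout above does not define it.

```php
<?php
// Sketch: expand TABLE_1 rows into one row per (phrase, word) pair.
// $rows is a list of [phrase, phrase_appearance_count] pairs.
function build_word_rows(array $rows): array
{
    $wordRows = [];
    $wordId = 0;
    foreach ($rows as $index => [$phrase, $count]) {
        // Assumption: a word repeated inside one phrase is stored once.
        $words = array_unique(preg_split('/\s+/', trim($phrase), -1, PREG_SPLIT_NO_EMPTY));
        foreach ($words as $word) {
            $wordRows[] = [
                'word_id' => ++$wordId,          // assumed: running row number
                'phrase_id' => $index + 1,       // 1-based position of the phrase
                'word' => $word,
                'phrase' => $phrase,
                'phrase_appearance_count' => $count,
            ];
        }
    }
    return $wordRows;
}

$wordRows = build_word_rows([['a', 26], ['c a b', 107], ['b g', 25]]);
```

If the rows are stored in MySQL instead, indexes on the word and phrase_id columns are the likely candidates, since the later steps filter and group on them.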

5.2) Group by word and compute the SUM of appearances; sort by total appearances
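
A minimal sketch of this grouping step on an in-memory copy of WORDS_TABLE follows. The function name totals_by_word is hypothetical, and the code assumes one row per (phrase, word) pair, so counting rows per word equals counting distinct phrases.

```php
<?php
// Sketch of step 5.2. A roughly equivalent query, assuming the column names
// of WORDS_TABLE, would be:
//   SELECT word, SUM(phrase_appearance_count) AS total,
//          COUNT(DISTINCT phrase_id) AS phrases
//   FROM WORDS_TABLE GROUP BY word ORDER BY total DESC
function totals_by_word(array $wordRows): array
{
    $totals = [];
    foreach ($wordRows as $row) {
        $word = $row['word'];
        if (!isset($totals[$word])) {
            $totals[$word] = ['total' => 0, 'phrases' => 0];
        }
        $totals[$word]['total'] += $row['phrase_appearance_count'];
        $totals[$word]['phrases'] += 1;
    }
    // Sort by total appearances, highest first, keeping the word keys.
    uasort($totals, function ($x, $y) {
        return $y['total'] <=> $x['total'];
    });
    return $totals;
}

$totals = totals_by_word([
    ['word' => 'a', 'phrase_id' => 1, 'phrase_appearance_count' => 26],
    ['word' => 'c', 'phrase_id' => 2, 'phrase_appearance_count' => 107],
    ['word' => 'a', 'phrase_id' => 2, 'phrase_appearance_count' => 107],
    ['word' => 'b', 'phrase_id' => 2, 'phrase_appearance_count' => 107],
    ['word' => 'b', 'phrase_id' => 3, 'phrase_appearance_count' => 25],
    ['word' => 'g', 'phrase_id' => 3, 'phrase_appearance_count' => 25],
]);
```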

5.3) Create a recursive function that takes the arguments (filter_words_array, data_table/array)

-- the function loops through each word

and recursively calls itself until all the data is analyzed.

-- during the process, the function inserts unique phrases, total_number_of_appearances and phrase_count into TABLE_2

.........
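
The recursive idea in 5.3 could be sketched as follows. This is an illustration under assumptions, not a tuned implementation: the function name count_recursive and the array shapes are invented here, and results are collected in an array rather than inserted into TABLE_2. Extending a combination only with words that sort after its last word means each combination is generated once, so duplicates never arise.

```php
<?php
// Sketch of step 5.3 (hypothetical names and array shapes).
// $prefix:  words chosen so far (the "filter words"), in sorted order.
// $phrases: phrases containing every word in $prefix, each given as
//           ['words' => sorted unique word list, 'count' => phrase_appearance_count].
// $result:  filled by reference, keyed by the space-joined combination.
function count_recursive(array $prefix, array $phrases, array &$result): void
{
    $last = $prefix ? end($prefix) : null;
    // Candidate next words: every word that sorts after the last chosen word.
    $candidates = [];
    foreach ($phrases as $p) {
        foreach ($p['words'] as $w) {
            if ($last === null || strcmp($w, $last) > 0) {
                $candidates[$w] = true;
            }
        }
    }
    foreach (array_keys($candidates) as $word) {
        $word = (string) $word; // numeric-looking words become int array keys
        // Keep only the phrases that also contain this word.
        $matching = array_filter($phrases, function ($p) use ($word) {
            return in_array($word, $p['words'], true);
        });
        $combo = array_merge($prefix, [$word]);
        $result[implode(' ', $combo)] = [
            'total' => array_sum(array_column($matching, 'count')),
            'phrases' => count($matching),
        ];
        // Recurse on the narrowed data set with the longer combination.
        count_recursive($combo, $matching, $result);
    }
}

$result = [];
count_recursive([], [
    ['words' => ['a'], 'count' => 26],
    ['words' => ['a', 'b', 'c'], 'count' => 107],
    ['words' => ['b', 'g'], 'count' => 25],
], $result);
```

Because each recursive call receives only the phrases that already match the current combination, the data set shrinks at every level.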

6) The script shall be PHP/MySQL.

e.g.:

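<?php

// Database connection.
// The mysql_* functions were removed in PHP 7, so mysqli is used here.
$hostname = "localhost";
$database = "test";
$username = "root";
$password = "";

// The "p:" host prefix requests a persistent connection, as mysql_pconnect did.
$connection = mysqli_connect("p:" . $hostname, $username, $password, $database) or die(mysqli_connect_error());

...........................

...........................

...........................

etc.

mysqli_free_result($result); // free the memory held by the result set
mysqli_close($connection);   // disconnect from DB

?>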
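
Since speed matters and the function in 5.3 writes its results to TABLE_2, one option is to collect the results in an array and write them in batches rather than one INSERT per row. The sketch below only builds the statement text and its parameter list. The helper name build_insert is hypothetical, and the column names are taken from the TABLE_2 layout at the top.

```php
<?php
// Sketch: build one multi-row INSERT for TABLE_2 (hypothetical helper).
// $counts maps word_or_phrase => ['total' => int, 'phrases' => int].
// The caller is expected to skip empty input, which would yield invalid SQL.
function build_insert(array $counts): array
{
    $placeholders = [];
    $params = [];
    foreach ($counts as $key => $c) {
        $placeholders[] = '(?, ?, ?)';
        array_push($params, (string) $key, $c['total'], $c['phrases']);
    }
    $sql = 'INSERT INTO TABLE_2 (word_or_phrase, total_number_of_appearances, phrase_count) VALUES '
         . implode(', ', $placeholders);
    return [$sql, $params];
}

[$sql, $params] = build_insert([
    'a' => ['total' => 133, 'phrases' => 2],
    'a b' => ['total' => 107, 'phrases' => 1],
]);
```

The statement and parameters can then be passed to a prepared statement. For large result sets, splitting the rows into smaller batches keeps each statement to a manageable size.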

Skills: PHP, SQL


About the Employer:
( 9 reviews ) Tustin, United States

Project ID: #530612

Awarded to:

nazmulbh

Hi, Please consider me for this job. I think this is not a hard task. I will prove. Thanks.

$30 USD in 1 day
(4 Reviews)
4.4

2 freelancers are bidding on average $30 for this job

sumonbd09

Hi, I'm very familiar with this task. I can do your task 100% efficiently and effectively. I developed lots of forms like what you are looking for. I’m a LAMP developer with 4+ years of developing experience on PHP/M...

$30 USD in 2 days
(1 Review)
0.0