I need a complete project that, to start with, supports ONLY one language: Italian.
I can't provide the datasets needed for the work, so you will have to create custom ones to complete these tasks. I think a full dataset of Italian words will be needed, with categorization, grammar rules, style, genre, synonyms and antonyms, example phrases for each word and so on, for everything to work as intended.
- Grammar correction: From the given text, the output must be an array of errors, each with its correction and % of precision, plus the corrected text with its own % of precision. It must correct grammar errors, stray spaces, capitalization errors, typos and the order/style of the sentences.
Parameters: text, min total % of confidence for the AI to fix the errors, min total % of confidence for the output.
- Text keywords: From the given text, the output must be an array of keywords, each with its % of precision. These MUST NOT be just the most repeated words in the given text, but a result of the AI, including words that may not appear in the text but are still closely tied to its meaning. For example, if the text contains several flower names but never the word "flower" or "flowers", then "flowers" must still be a high-% keyword of the text. The same goes for the name of an actor or author involved in the text, and so on. Multi-word keywords, such as full names or company names, must never be split.
Parameters: text, number of keywords, min total % for the keywords.
- Text summarization: From the given text, the output must be a summary of it. Not just the repeated sentences or the ones containing keywords: the AI must understand the meaning of the text and keep only the really important parts, such as dates, names, important facts and so on. It must return the summary with a % of precision.
Parameters: text, max total number of words, min total % for the output.
- Text rewriter: From the given text, the output must have the same meaning as the input but be fully rewritten: new sentences, a new word order and so on, while keeping the full meaning of the input. It should use the best synonyms whenever possible, change the punctuation and the order of some words, and in some cases use the reflexive form or reverse the phrase complements. It must return the output text and the %.
Parameters: text, min total % for the output.
- Random text generation: From the given keywords, it must return a randomly generated text based on the keywords, without grammar errors and with a basic meaning. It must return the generated text with a %.
Parameters: keywords, max words of the text, min total % for the output.
- Random sentence generation: From the given keywords, it must return an array of randomly generated sentences based on the keywords, without grammar errors and with a basic meaning, each with its %.
Parameters: keywords, number of sentences.
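To make the expected output shapes more concrete, here is a rough Python sketch for the grammar correction and keyword tasks. It is only an illustration: every name in it (`CorrectionError`, `CorrectionResult`, `filter_corrections`, `top_keywords`) is hypothetical, the % values are assumed to be floats from 0 to 100, and the default thresholds are placeholders, not requirements.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CorrectionError:
    """One detected error (hypothetical structure)."""
    original: str      # the wrong fragment found in the input text
    correction: str    # the proposed replacement
    confidence: float  # AI confidence for this fix, assumed 0-100


@dataclass
class CorrectionResult:
    """Full output of the grammar correction task (hypothetical structure)."""
    errors: List[CorrectionError]
    corrected_text: str
    confidence: float  # AI confidence for the whole corrected text, assumed 0-100


def filter_corrections(errors: List[CorrectionError],
                       min_error_confidence: float = 80.0) -> List[CorrectionError]:
    """Keep only the fixes whose confidence reaches the minimum threshold.

    The default of 80.0 is a placeholder, not a value from the brief.
    """
    return [e for e in errors if e.confidence >= min_error_confidence]


def top_keywords(scored: List[Tuple[str, float]],
                 number_of_keywords: int = 5,
                 min_confidence: float = 50.0) -> List[Tuple[str, float]]:
    """Return the best (keyword, confidence) pairs, highest confidence first.

    Multi-word keywords such as full names are kept as one string.
    """
    kept = [(k, c) for k, c in scored if c >= min_confidence]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:number_of_keywords]


# Usage with made-up values, only to show the shapes.
errors = [
    CorrectionError("ciao  mondo", "ciao mondo", 97.0),
    CorrectionError("un amica", "un'amica", 62.5),
]
print(filter_corrections(errors, min_error_confidence=80.0))
print(top_keywords([("fiori", 91.0), ("Leonardo da Vinci", 88.0), ("e", 12.0)], 2))
```

The other tasks could return the same kind of pair (output text plus its %), so one threshold check could be shared by all of them.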
The dataset must cover all Italian words, including insults, articles, pronouns, prepositions, verbs and nouns, for the best possible result.
By % I mean the confidence of the AI.
The work must be fully documented, and notebooks to use as tests on Google Colab must be provided.
Everything must be function-based and well commented.
Everything must have a language parameter, which defaults to "Italian" since it is the only supported language for now. Parameters about % must have a default value when not provided.
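As a rough idea of how the shared parameters could be handled, here is a small Python sketch. The names `SUPPORTED_LANGUAGES`, `DEFAULT_MIN_CONFIDENCE` and `check_common_parameters` are hypothetical, and the default of 70.0 is only a placeholder, since the actual default values are still to be decided.

```python
SUPPORTED_LANGUAGES = ("Italian",)  # the only supported language for now
DEFAULT_MIN_CONFIDENCE = 70.0       # placeholder default, not specified in the brief


def check_common_parameters(language="Italian", min_confidence=None):
    """Validate the shared parameters and fill in the defaults.

    Returns the (language, min_confidence) pair that the task should use.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language: {language!r}")
    if min_confidence is None:
        # The caller did not pass a %, so fall back to the default value.
        min_confidence = DEFAULT_MIN_CONFIDENCE
    if not 0.0 <= min_confidence <= 100.0:
        raise ValueError("min_confidence must be between 0 and 100")
    return language, float(min_confidence)


# Usage: called with no arguments, both defaults are applied.
print(check_common_parameters())
print(check_common_parameters(min_confidence=85))
```

Each task function could call a helper like this first, so that adding a second language later only means extending one tuple.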