Plagiarism detection - NEED IT IN 5 days TOPS!
Find common phrases and sentences between documents (source and suspicious). Find all plagiarized parts. There can be 4 cases:
- Copy paste
- Copy paste + word order change
- Copy paste + paraphrasing
- Copy paste + word order change and paraphrasing
I suggest using a multithreaded Needleman-Wunsch algorithm for document similarity comparison and plWordNet for synonym and paraphrase checks.
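As a rough illustration of the suggested alignment step, here is a minimal single-threaded sketch of Needleman-Wunsch global alignment over word tokens. The scoring values (match=1, mismatch=-1, gap=-1) and the function name are assumptions made for the example, not part of the task description; multithreading and plWordNet synonym lookup are not shown.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align token lists a and b; return (score, aligned pairs)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(x, y):
        # Assumed scoring: exact token equality only (no synonym handling).
        return match if x == y else mismatch

    # Fill the dynamic-programming table.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,  # token of a aligned to a gap
                score[i][j - 1] + gap,  # token of b aligned to a gap
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned = []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])):
            aligned.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned.append((a[i - 1], None))
            i -= 1
        else:
            aligned.append((None, b[j - 1]))
            j -= 1
    aligned.reverse()
    return score[-1][-1], aligned


suspicious = "the cat sat on the mat".split()
source = "the cat on the mat".split()
total, pairs = needleman_wunsch(suspicious, source)
```

The table needs time and memory proportional to the product of the two document lengths, so long documents would probably have to be split into chunks before alignment.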
Input file
There are sets of document pairs: a document suspected of plagiarism and the file it may have been plagiarized from. Both are in plain text format, named [login to view URL] and [login to view URL], where XXXX is the pair number; the first file contains something plagiarized from the second one.
Output file
If plagiarism was detected, we want to save all the information in an XML file as follows:
<?xml version="1.0" encoding="UTF-8"?>
<alignment document="[login to view URL]" source="[login to view URL]">
<passage documentFrom="123" documentTo="123" sourceFrom="123" sourceTo="123" />
<passage documentFrom="234" documentTo="234" sourceFrom="234" sourceTo="234" />
</alignment>
Each passage tag means that plagiarism was detected:
• documentFrom – beginning index of the recognized plagiarized fragment in [login to view URL],
• documentTo – ending index of the recognized plagiarized fragment in [login to view URL],
• sourceFrom – beginning index of the recognized plagiarized fragment in [login to view URL],
• sourceTo – ending index of the recognized plagiarized fragment in [login to view URL].
Save everything to suspiciousXXXX-sourceXXXX.xml. For the entire task, the result will be a set of XML files.
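The output format above could be produced along these lines. This is a sketch using Python's standard xml.etree.ElementTree module; the function names and the tuple layout of a passage are assumptions made for the example, and the file names passed in would be whatever the real input files are called.

```python
import xml.etree.ElementTree as ET


def build_alignment(document_name, source_name, passages):
    """Build an <alignment> element from (docFrom, docTo, srcFrom, srcTo) tuples."""
    root = ET.Element("alignment", document=document_name, source=source_name)
    for doc_from, doc_to, src_from, src_to in passages:
        ET.SubElement(
            root,
            "passage",
            documentFrom=str(doc_from),
            documentTo=str(doc_to),
            sourceFrom=str(src_from),
            sourceTo=str(src_to),
        )
    return root


def save_alignment(root, path):
    """Write the element to a file with a UTF-8 XML declaration."""
    ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)
```

A call such as save_alignment(build_alignment(doc_name, src_name, [(123, 456, 78, 411)]), out_path) would write one passage; the indices here are made up for illustration.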
Measures
To measure quality, I will use:
• precision: Sammut, Claude, and Geoffrey I. Webb, “Encyclopedia of Machine Learning and Data Mining”, 2017, precision,
• recall: Sammut, Claude, and Geoffrey I. Webb, “Encyclopedia of Machine Learning and Data Mining”, 2017, precision and recall,
• granularity: Potthast, Martin, et al. “An evaluation framework for plagiarism detection.” Proceedings of the 23rd international conference on computational linguistics: Posters. Association for Computational Linguistics, 2010,
• plagdet score (main score): Potthast, Martin, et al. “An evaluation framework for plagiarism detection.” Proceedings of the 23rd international conference on computational linguistics: Posters. Association for Computational Linguistics, 2010.
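For orientation, the plagdet score in the cited Potthast et al. paper combines the other three measures: it is the F1 score (harmonic mean of precision and recall) divided by log2(1 + granularity). A minimal sketch follows; the function name is an assumption, and the attached evaluation tool may aggregate per-case values differently.

```python
import math


def plagdet(precision, recall, granularity):
    """F1 of precision and recall, discounted by log2(1 + granularity)."""
    if precision + recall == 0:
        return 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return f1 / math.log2(1 + granularity)
```

With the baseline figures quoted in this posting (precision 0.861901, recall 0.123821, granularity 1.352459), the formula gives about 0.175, in line with the reported plagdet of 0.175451. A granularity of 1 (each plagiarized passage detected exactly once) leaves the F1 score unchanged.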
Trial set
The trial set is attached:
• pl/en – division between PL and EN documents,
• src (inside pl/en) – source documents,
• susp (inside pl/en) – suspicious documents,
• xml (inside pl/en) – proper answers.
Evaluation tool
It is attached as a JAR file that needs the newest Java 8.
Arguments:
• -e evaluation method,
• -i path to the ZIP file with the resulting XML files,
• -t path to folder with answers.
Example:
java -jar [login to view URL] -i c:\\[login to view URL] -t c:\\dataset -e TASK1
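Since the -i argument expects a ZIP file, the generated XML files have to be packed first. A possible sketch with Python's standard zipfile module follows; the function name is hypothetical, and placing the XML files at the root of the archive is an assumption about what the evaluation tool expects.

```python
import zipfile
from pathlib import Path


def zip_results(xml_dir, zip_path):
    """Pack every .xml file from xml_dir into zip_path; return the packed names."""
    xml_files = sorted(Path(xml_dir).glob("*.xml"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for xml_file in xml_files:
            # Assumption: files are stored flat, without their directory prefix.
            archive.write(xml_file, arcname=xml_file.name)
    return [xml_file.name for xml_file in xml_files]
```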
Baseline
The baseline solution to this task can, in general, be based on a suffix array, used to find the longest common substring between documents.
In preprocessing, the documents are processed as follows:
• Remove special characters,
• Normalize whitespace in text,
• Remove EN stop-words,
• Remove PL stop-words.
The data is then divided into 15-gram phrases and put into a suffix array. The result is as follows:
• precision: 0.861901, recall 0.123821, granularity: 1.352459, plagdet: 0.175451
Nong, Ge, Sen Zhang, and Wai Hong Chan. “Linear suffix array construction by almost pure induced-sorting.” Data Compression Conference, 2009. DCC ’09. IEEE, 2009.
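The baseline's preprocessing and n-gram steps could look roughly like the sketch below. Note that it swaps in a plain set intersection of word n-grams instead of a suffix array, so it only illustrates the idea and is not the baseline itself. The tiny stop-word set, the lowercasing step, and all function names are assumptions for the example; a real run would load full EN and PL stop-word lists.

```python
import re

# Hypothetical, deliberately tiny stop-word set (a few EN and PL words).
STOP_WORDS = {"the", "a", "an", "of", "and", "i", "w", "z", "na"}


def preprocess(text):
    """Strip special characters, normalize whitespace, drop stop-words."""
    # Assumption: lowercase first so that matching is case-insensitive.
    text = re.sub(r"[^\w\s]", " ", text.lower())
    tokens = text.split()  # split() also collapses runs of whitespace
    return [token for token in tokens if token not in STOP_WORDS]


def ngrams(tokens, n):
    """Return the set of all word n-grams in the token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(suspicious_text, source_text, n=15):
    """Word n-grams that occur in both documents after preprocessing."""
    return ngrams(preprocess(suspicious_text), n) & ngrams(preprocess(source_text), n)


common = shared_ngrams("The quick brown fox jumps.", "A quick  brown fox sleeps!", n=3)
```

This version discards character offsets; to fill in the documentFrom/documentTo attributes, each token would also need to carry its position in the original text.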
Hi
I have read your job details.
I have worked on a job similar to this one.
I also have a lot of experience in web development, web scraping and crawling, reverse engineering, and programming in languages such as C++, Python, and Java.
Please contact me.
Best regards.
bestit4u
I am very proficient in C and C++. I have 16 years of C++ development experience, and have worked for more than 7 years. My work is online game development, mainly focused on the server side, using C++ in a Linux environment. I have made many great projects using C++; for example, I made tools that convert Java code into C++, garbage collection included, which was very similar to a compiler and very complex. I also made our own mobile game using C++, and I can show you a demo of the client if you like. I am very proficient in Java as well. I have very good reviews on Freelancer.com and I never miss a project once I accept the job; you can check my reviews. Trust me, please let an expert help you.
Hi, I read your project description and I am the person you are looking for.
I have already worked on a plagiarism project for a university faculty to analyze Python code.
My previous project had the following features:
- Upload Python files and compare the faculty's sample code with the student's Python code
- After analyzing the code, display the similarity rate, errors, warnings, ...
- Faculty review the result and mark the student accordingly
- A form that displays the student and faculty files in comparison windows
I have a lot of experience in this field.
My client tried to use MOSS for this, but the service was stopped, so it can no longer be used.
Please share your work with me.
Thanks
Hello
More than 20 years of programming experience.
I would like to discuss some details to set a realistic price and timeline.
Regards.
---------------------------------------------------------------------------------------------------------------------------------------------------
Hi, dear. I am a senior software developer. I have just checked your project description, and I am able to perform this task with my developer team. I am looking forward to your proposal...
My team and I have 5 years of experience in Python/Django, iFrame/Flask/Golang, and data scraping or web crawling. We can execute this project very well and can work US hours.
Hello
My name is Ishtiyaq. I am a certified Python expert with 4+ years of experience in Python, and I have completed 100+ projects using it. Expertise: Python, Django, Django REST Framework, and many Python packages. My key skills are: Python, AngularJS, Scala, JavaScript, Go, PHP, SQL, HTML, Jython, Perl, CSS. Platforms: Linux, Amazon Web Services (AWS), Google App Engine, Windows, Mac OS X. You can test the quality of my work if needed.
Regards Ishtiyaq
5 days tops.
We're looking at $1000 for the speed required and the complexity of this task. My final year university project was building something very similar to this.
Hello,
I am Tahsinul Alam. I completed a Masters in Software Engineering and now work as one of the project managers in the Python team of Workspace Infotech Ltd,
a software/outsourcing firm located in Melbourne, Australia. We have 16 different teams working in different mobile, web, and desktop
technologies, providing quality services all over the world.
Technology:
We have an excellent and dedicated team of web developers and designers, especially in raw Python and the Django framework.
We have worked in AngularJS and JavaScript too. We use tools like Jira, Asana, Trello, and Bitbucket for our project management.
We can give you a full team or an individual (working directly under you) on a fixed salary, per-project, or hourly basis, as you require.
About your project:
We have gone through your project details and we are confident of completing the whole project in due time.
End point:
Check the reviews of our previous work, and ask any other questions that may arise.
Please send us more specific information as needed (we need to sit down for more details; after that we can give you an exact budget and timeline)
so that we can move forward easily.
We hope we can do good business in the near future.
Thanks, and waiting for your response.
Thanks
Tahsinul Alam
Director & Project Manager,
Python Team
WorkspaceIT-Australia
Hello,
Well, to brief you about me: I am a professional with 8 years of development experience and have delivered many similar projects. We have designed and developed various websites in different domains. We provide complete solutions from website scratching to website development.
I would really appreciate it if you considered my bid and gave me an opportunity here so that I can showcase my experience in specific.
I tend to deliver high-quality work within the same day with a negligible margin of changes from your end (this is guaranteed).
Looking forward to you hiring me.
Let’s discuss on Skype: jayeshsartanpara1