Mass machine translation of articles

$500-5000 USD

Closed
Posted over 15 years ago

Paid on delivery
This project is for translating a large number of articles from Spanish, French, German, Italian, Portuguese, Dutch, Swedish, Russian, Polish, Chinese, Japanese and Korean into English.

## Deliverables

This project is for translating foreign language Wikipedias from Spanish, French, German, Italian, Portuguese, Dutch, Swedish, Russian, Polish, Chinese, Japanese and Korean into English. [login to view URL] It must run automatically and continuously without human intervention.

What needs to be done, in order of processing:

P1. Set up Systran Web Translator on a server to accept article translations automatically. (We will buy a copy.)
P2. Download each non-English language Wikipedia from [login to view URL]
P3. Identify which language Wikipedias have been updated.
P4. Store each article for each language in a database (but ignore the User: namespace). MediaWiki already has tools to import from XML to SQL (see [login to view URL]:Importing_XML_dumps)
P5. Import the langlinks table [login to view URL] for the English Wikipedia. This already has the language links between articles.
P6. For each article, follow the language links, retrieving the wiki article in languages other than English.
P7. Using an installation of MediaWiki, convert this article to HTML.
P8. Machine translate the article into English and store it as HTML in a database (see fields below). You will need to find a way to automate this, as I believe the program was designed to be run manually.
P9. Retrieve images for each article.
P10. Build a very simple frontend that shows all translations to English for a particular article, with the name of the source language in the title for each. It must be able to display the images stored in P9 from the same box.
P11. Repeat steps P2-P9 for each language. Only pause if all the work is done. Only retranslate an article if more than 3% of its words have changed.

Fields to include in the translation table:

F1. source language
F2. article (just the translated version is fine)
F3. title in English (indexed)
F4. title in original language
F5. size of original language article
F6. the revision ID (this is just sitting in the dump; it may be useful later to know which revision it is for other features)
F7. date (from the timestamp field in the XML)
F8. date of langlink
F9. date article first created
F10. date article was last updated

Suggested milestones, listed in order of development:

M1: Importation of data into HTML and language links - (P2 to P5) - 10% payment
M2: Create machine translated HTML for one language with frontend just to demonstrate it - (P1, P6, P7, P8, P10) - 10% payment
M3: Retrieve images for just these articles (P9) - 10% payment
M4: Run for all languages (images and articles) - (P1 to P11) - 40% payment
M5: Demonstrate that it's running on an ongoing basis - 30% payment

Deliverables:

D1: MySQL table with all data described in F.
D2: Images stored on disk.
D3: Any code required to do this.
D4: Documentation with step-by-step instructions on how to install and run the whole system.

In your bid please include:

* a description of your experience working with large databases and files (over 10GB)
* whether you agree with the proposed milestone scheme, or what you would suggest instead
* what dates you can deliver each milestone by (be conservative if you're unsure)
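To make the retranslation rule in P11 concrete, here is a minimal Python sketch of one possible reading of "more than 3% of the words have changed". The posting does not define how changed words are counted, so everything here is an assumption: words are split on whitespace, a word-level diff is taken with the standard library's `difflib`, and the changed count is measured against the word count of the previously translated revision. The function names `changed_word_fraction` and `needs_retranslation` are hypothetical.

```python
import difflib


def changed_word_fraction(old_text: str, new_text: str) -> float:
    """Return changed words as a fraction of the old revision's word count.

    Assumption: "words" are whitespace-separated tokens, and an edit that
    replaces m old words with n new words counts as max(m, n) changed words.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    if not old_words:
        # No previous text: treat any new content as fully changed.
        return 1.0 if new_words else 0.0

    # autojunk=False stops difflib from ignoring very frequent words
    # in long articles.
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed += max(i2 - i1, j2 - j1)
    return changed / len(old_words)


def needs_retranslation(old_text: str, new_text: str, threshold: float = 0.03) -> bool:
    """True when strictly more than `threshold` of the words have changed."""
    return changed_word_fraction(old_text, new_text) > threshold
```

A word-level diff on multi-megabyte articles can be slow, so a real pipeline might first compare revision IDs (F6) and only run the diff when the revision has changed.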
Project ID: 3474299

About the project

3 proposals
Remote project
Active 15 yrs ago

3 freelancers are bidding on average $6,800 USD for this job
See private message.
$7,650 USD in 14 days
4.8 (35 reviews)
7.4
See private message.
$8,500 USD in 14 days
4.8 (55 reviews)
6.8
See private message.
$4,250 USD in 14 days
5.0 (10 reviews)
5.2

About the client

Flag of AUSTRALIA
Australia
4.8
48
Member since Apr 4, 2007

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)