OCR processing of old magazines

Completed Posted Jan 23, 2013 Paid on delivery

We need to process existing PDF files of old magazines: they must be OCRed and converted to clean HTML. Please find enclosed a sample of what you will be dealing with. If you are interested in this job, please submit a processed result of the attached sample.

Please also state what OCR software you use and which version. You MUST use OCR software; no typing is allowed.

The result MUST NOT contain headers, footers, or page numbers from the magazine, just clean text with only very basic formatting, suitable for publishing on a web site. Bold, italics, underline, paragraphs, headings, basic tables, uppercase, lowercase, superscript (upper index), and subscript (lower index) should be preserved. All other formatting (text flow, columns, unusual fonts, page numbers, etc.) should be discarded. The desired result is clean, readable text with clean formatting.
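As an illustration only, the formatting rules above could be enforced on raw OCR output with a tag whitelist. The sketch below uses Python's standard `html.parser`; the tag set `ALLOWED_TAGS` and the names `BasicFormattingFilter` and `clean_html` are assumptions made for this example, not part of the job description. It drops attributes and non-whitelisted tags but keeps their text, and it does not detect magazine headers, footers, or page numbers, which still have to be removed separately.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: one plausible reading of "very basic formatting".
ALLOWED_TAGS = {
    "p", "h1", "h2", "h3", "b", "strong", "i", "em", "u",
    "sup", "sub", "table", "tr", "th", "td",
}


class BasicFormattingFilter(HTMLParser):
    """Keeps whitelisted tags (without attributes) and all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes (fonts, classes, styles) are always discarded.
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so the output stays valid HTML.
        self.parts.append(escape(data, quote=False))


def clean_html(raw: str) -> str:
    """Return raw HTML reduced to the whitelisted basic formatting."""
    parser = BasicFormattingFilter()
    parser.feed(raw)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = '<p style="font-family:Weird"><span class="col">H<sub>2</sub>O is <b>wet</b></span></p>'
    print(clean_html(sample))
```

A whitelist is used rather than a blacklist so that any markup an OCR tool emits unexpectedly is dropped by default.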

The result must be in clean HTML (such as that produced by [url removed, login to view] - you can use that if you wish or anything else you prefer) and separated into single concise articles.

The most important consideration - the result should be as typo-free as humanly possible. You are not supposed to check for grammatical correctness, just that the recognized text is the same as in the source file. You MAY correct an obvious typo in the original text, but don't have to.

IMPORTANT:

Bid as if for 400 pages of text (the actual sizes of the files vary). The maximum acceptable price is $0.12 per page, as in the included example. If your price is higher than this, please DO NOT BID. To qualify, you need to submit a processed sample file (in clean HTML) AND your price must be less than or equal to $0.12 per page. No negotiations. Offers not meeting these criteria WILL BE IGNORED.
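For reference, the pricing rule above works out to a ceiling of 400 × $0.12 = $48 for the whole bid. The short sketch below checks a bid against that ceiling; the function names `max_bid` and `bid_qualifies` are hypothetical, and the check covers price only, not the required sample file.

```python
from decimal import Decimal

# Figures taken from the posting: bid for 400 pages at no more than $0.12 each.
BID_PAGES = 400
MAX_RATE_PER_PAGE = Decimal("0.12")


def max_bid(pages: int = BID_PAGES, rate: Decimal = MAX_RATE_PER_PAGE) -> Decimal:
    """Highest acceptable total bid in USD."""
    return pages * rate


def bid_qualifies(bid_usd, pages: int = BID_PAGES) -> bool:
    """True if the total bid does not exceed the per-page ceiling."""
    return Decimal(str(bid_usd)) <= max_bid(pages)


if __name__ == "__main__":
    for bid in (40, 48, 750):
        print(bid, bid_qualifies(bid))
```

`Decimal` is used instead of floating point so that the comparison at exactly $48 is not affected by rounding.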

We will select several providers for this job as we have many files to process. This can turn into a long-term job for you, if you can deliver.

Thank you for looking!

Data Processing OCR

Project ID: #4166338

About the project

9 proposals Remote project Active Jan 24, 2013

Awarded to:

rnatharva

Sir, I am interested in your project. I can deliver the project in specified time. Waiting to hear from you. Best regards.

$48 USD in 3 days
(1 Review)
1.4

9 freelancers are bidding on average $125 for this job

NEEMISH

sir please see pm and reply.

$48 USD in 1 day
(7 Reviews)
2.7
iqbalshuvo

Hello, I'm expert here. Look forward to working with you. Please have a look on your personal message box for more details. Thanks.

$48 USD in 3 days
(3 Reviews)
2.1
najihacyber

please see the result of my unedited OCR.

$48 USD in 1 day
(0 Reviews)
0.0
thegrey1

Good morning, today I changed my offer as I have pointed out that the work you have requested is more complex than a simple conversion. It is converted into three steps: 1 - editable text, make corrections to typograp…

$750 USD in 21 days
(0 Reviews)
0.0
web1programmer

I am very good at doing data processing jobs. You will get 100% satisfaction after seeing my work done. """"" Can start from now itself """"" Please see private message also.

$48 USD in 2 days
(0 Reviews)
0.0
karugajoe

I have extensive experience on the same. Please see attached the HTML version of your PDF file.

$48 USD in 2 days
(0 Reviews)
0.0
gjbouwhuis

Experienced in data processing (working part time at the administration of a university). You can find all the requirements in the private message.

$40 USD in 3 days
(0 Reviews)
1.7
pravinojha

Hi, I'm new on this website but still you can check my resume so that you can rely on my work. I've worked for data. I'm sending you attachment with sample work. Regards, Pravin

$44 USD in 10 days
(0 Reviews)
0.0