Proofread XML files against a PDF book and update data accordingly - repost

This project received 11 bids from talented freelancers with an average bid price of $63 USD.

Project Description

I have two versions of a book of about 400 pages. I generated a few XML files from one version. The other version is more up to date and is a PDF file. I need the XML files updated to match the data in the PDF.

This task is urgent, so you will have to work on it regularly. You also need to be accurate in proofreading and updating the data. If you need help understanding the task at any point, you can reach my developer over Skype or by email.

More detail:

The data was scraped from a website into OpenOffice document (.odt) files.
[url removed, login to view]

The web page contents are entries from an older version of the book. The goal is to make the .odt files conform to the new version of the book. The PDF version of the book will be given to the winner of the bid.

Proofreading and data editing in the .odt files must be carried out according to the three sections of the book:

1) repertory 2) materia medica 3) SRT

The procedure for editing data in section 1 and section 3 is the same.

[url removed, login to view]
A sample of how repertory data is imported is in the [url removed, login to view] file (link at the end).

An explanation of what these columns mean:

Ridx: Comma-separated values giving the index of a row, its level number, and the index of its parent row. For level 2 elements, the row index and the parent row index are the same. For example, in the row with ridx value 12006,3,12004, the row index is 12006, it is a level 3 element, and its parent row index is 12004. The indentation of an entry in the book determines which row is the parent of which. Ridx rows always have the style SK-T-L1. Press F11 if you don't see the styling (see [url removed, login to view]).
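To make the Ridx format concrete, here is a minimal Python sketch that splits a Ridx cell into its three parts and checks the level 2 rule described above. The names `Ridx` and `parse_ridx` are hypothetical and not part of the project files; the sketch assumes each cell holds exactly three integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ridx:
    """One parsed Ridx cell (hypothetical helper type)."""
    index: int   # index of this row
    level: int   # level number of this row
    parent: int  # index of the parent row


def parse_ridx(cell: str) -> Ridx:
    """Parse a cell such as '12006,3,12004' into its three parts."""
    parts = [p.strip() for p in cell.split(",")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated values, got {cell!r}")
    index, level, parent = (int(p) for p in parts)
    # Rule from the description: for level 2 elements the row index
    # and the parent row index are the same.
    if level == 2 and index != parent:
        raise ValueError(f"level 2 row {index} must be its own parent")
    return Ridx(index, level, parent)


row = parse_ridx("12006,3,12004")
print(row.index, row.level, row.parent)
```

A check like this can flag rows whose Ridx values were mistyped during editing, although it does not verify that the parent row actually exists.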

Level 2, Level 3, Level 4 and Level 5: The text of the entry at that level. These rows have styles such as SK-T-L2 and so on.

see: A book entry may contain references such as "see" (sample in [url removed, login to view]). This is a reference to another entry, so the see column contains the index of that entry, not its name as printed in the book.
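As an illustration of replacing a printed name with an index, the following Python sketch builds a lookup from entry text to row index and resolves a "see" reference through it. The function names and the sample entry names are hypothetical, and matching names case-insensitively is an assumption. Names that are missing or that match more than one row return None so they can be checked by hand against the book.

```python
def build_name_index(rows):
    """Map normalized entry text to the list of row indexes that use it.

    rows: iterable of (index, text) pairs (hypothetical input shape).
    """
    lookup = {}
    for index, text in rows:
        key = text.strip().lower()  # assumption: case-insensitive match
        lookup.setdefault(key, []).append(index)
    return lookup


def resolve_see(name, lookup):
    """Return the single matching row index, or None if the name is
    missing or ambiguous and needs manual review."""
    matches = lookup.get(name.strip().lower(), [])
    if len(matches) != 1:
        return None
    return matches[0]


# Placeholder entry names, not taken from the book.
lookup = build_name_index([(12004, "Alpha"), (12006, "Beta"), (12010, "beta ")])
print(resolve_see("alpha", lookup))
print(resolve_see("Beta", lookup))
```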

rep-ref: Applies only to the SRT section ([url removed, login to view]). The book gives the page number of an entry; rep-ref contains the index of that entry, not the page number as in the book.

compare: same as see ([url removed, login to view]).

Remedy, Gr, Constraint, Footenote, added: Every entry has some remedies. A remedy has a grade and may have a constraint. In the book, remedies of different grades have different formatting; on the website, they have different colors.

Section 2 (materia medica) will be discussed once the procedure for proofreading sections 1 and 3 is understood.

Referenced files:
[url removed, login to view]
[url removed, login to view]
[url removed, login to view]
[url removed, login to view]
