Program to create XML from OCR Document

Budget N/A
Bids 17
Average Bid $1797

I require a talented programmer to write a program that can run the following process:

1. A PDF document is OCR'd to produce an editable document.

2. The program locates certain strings of words in the OCR'd document.

3. When those strings appear, the program creates an XML file, namely an answer file for HotDocs, a document generation program ([url removed, login to view]). The same "answers" should also be able to be saved to an SQL database if necessary.
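
Steps 2 and 3 (the database option) could be sketched roughly as follows. This is only an illustration under assumptions: the labels "Vendor's agent:" and "Property address:", the answer names, and the table layout are all hypothetical and would need to be matched to the real contract and to the HotDocs template.

```python
import re
import sqlite3

# Hypothetical labels and answer names; the real contract wording must be checked.
PATTERNS = {
    "AgentName": re.compile(r"vendor's agent\s*:\s*(.+)", re.IGNORECASE),
    "PropertyAddress": re.compile(r"property address\s*:\s*(.+)", re.IGNORECASE),
}


def locate_answers(ocr_text, patterns=PATTERNS):
    """Return {answer name: text after the label} for each pattern found."""
    answers = {}
    for name, pattern in patterns.items():
        match = pattern.search(ocr_text)
        if match:
            # "." does not cross newlines, so the capture stops at end of line.
            answers[name] = match.group(1).strip()
    return answers


def save_answers(conn, document_id, answers):
    """Store the answers in a simple (assumed) key/value table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS answers ("
        "document_id TEXT, name TEXT, value TEXT, "
        "PRIMARY KEY (document_id, name))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO answers VALUES (?, ?, ?)",
        [(document_id, name, value) for name, value in answers.items()],
    )
    conn.commit()
```

SQLite is used here only because it needs no server; any SQL database with a Python driver could take its place.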

The XML answer file will be used by HotDocs to generate a written report. HotDocs allows a DLL to be created whereby answers can be "absorbed" into the HotDocs system, enabling a report to be prepared. See [url removed, login to view]
The "absorbed answers" written to the HotDocs XML answer file will be the words / phrases output by your software after the OCR process has occurred.
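
As a rough illustration of writing such an answer file, the sketch below turns a dictionary of answer names and values into XML. The element names (AnswerSet, Answer, TextValue) and the version attribute reflect my understanding of the HotDocs XML answer format and are an assumption that should be checked against the HotDocs documentation; the answer names themselves are hypothetical and must match the variable names in the actual template.

```python
import xml.etree.ElementTree as ET


def build_answer_file(answers, title=""):
    """Serialise {answer name: text value} as an answer-file style XML string.

    Assumed layout (to be verified against HotDocs documentation):
    <AnswerSet title="" version="1.1">
      <Answer name="..."><TextValue>...</TextValue></Answer>
    </AnswerSet>
    """
    root = ET.Element("AnswerSet", {"title": title, "version": "1.1"})
    for name, value in answers.items():
        answer = ET.SubElement(root, "Answer", {"name": name})
        # ElementTree escapes characters such as "&" and "<" automatically.
        ET.SubElement(answer, "TextValue").text = value
    return ET.tostring(root, encoding="unicode")
```

Only text answers are handled here; dates and numbers would likely need their own value elements.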

From a PDF document (see attached dummy contract) I want, for example, to be able to do the following on the front page:


1. Extract who the Real Estate Agent is (1st box - L J Hooker Ashfield) and put those details into the answer file

2. Extract 'identifier' details of the address of the property being bought (middle of page) including street number / name / suburb (1 Smith Street Smithville)

3. Extract details of when the contract is due to be completed (12 weeks after the date of the Contract)

4. Extract details of the date of the contract, which will usually be a handwritten date in the last box of the first page.
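
Items 3 and 4 are linked by simple date arithmetic, which might look something like the sketch below. The accepted date formats are an assumption; handwritten dates are often misread by OCR, so anything that cannot be parsed is returned as None to be flagged for manual review rather than guessed.

```python
from datetime import date, datetime, timedelta

# Assumed day-first formats; extend to suit the contracts actually received.
DATE_FORMATS = ("%d/%m/%Y", "%d %B %Y")


def parse_contract_date(text):
    """Return a date for the OCR'd text, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def completion_date(contract_date, weeks=12):
    """Completion falls a fixed number of weeks after the contract date."""
    return contract_date + timedelta(weeks=weeks)


contract = parse_contract_date("2 March 2015")
due = completion_date(contract)
days_to_complete = (due - contract).days  # 12 weeks is 84 days
```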

The info extracted would be placed as an answer in the HotDocs XML answer file. The answer file would then be used to produce a document. For example:

"Dear Sir

You as the Purchaser are required to complete the Contract and finalise the purchase of [details extracted from no 2.] in [details extracted from no 3.] days after the date of the Contract, which is [details extracted from no 4.].

You also promise to the Vendor that you have only been introduced to [details extracted from no 2.] by the Agent shown on the front page of the Contract, being [details extracted from no 1.]. If you have been introduced to the property by another agent, please advise us immediately, as you may be liable to pay the commission of another Agent."

If you look at the dummy Contract, some of the information I require may be shown not only in text but also in diagrams (see the plans attached to the dummy contract). I need to extract info from the diagrams to confirm that it matches what's on the front page of the Contract.
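
Reading text out of the plans is the hard part and is not covered here, but once a value has been read from a diagram, the consistency check itself could be sketched as below. The normalisation rules and the 0.9 similarity threshold are assumptions chosen to tolerate minor OCR errors; they would need tuning on real contracts.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case and reduce punctuation/whitespace runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def similarity(front_page_value, diagram_value):
    """Similarity in [0, 1] between two normalised strings."""
    return SequenceMatcher(
        None, normalise(front_page_value), normalise(diagram_value)
    ).ratio()


def values_match(front_page_value, diagram_value, threshold=0.9):
    """True when the two values are close enough to be treated as the same."""
    return similarity(front_page_value, diagram_value) >= threshold
```

A mismatch would be reported to a person rather than corrected automatically, since either source could be the one that was misread.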

These contracts are usually 30 - 40 pages. They will be about 2 - 3 MB in size but can be bigger or smaller. I am happy to consider using any OCR process but would prefer an open-source one to minimise cost.
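
One open-source option would be the Tesseract engine. The sketch below assumes the third-party Python packages pdf2image and pytesseract (which in turn need Poppler and the Tesseract binary installed); it is one possible approach, not a recommendation of a specific toolchain. The function name and its parameters are hypothetical, and the page renderer and recogniser can be swapped out, which also allows the function to be exercised without either package.

```python
def ocr_pdf(pdf_path, render_pages=None, recognise=None, dpi=300):
    """Return a list with the recognised text of each page of the PDF.

    render_pages: callable(path) -> iterable of page images
    recognise:    callable(image) -> text
    Both default to pdf2image / pytesseract, imported only when needed.
    """
    if render_pages is None:
        from pdf2image import convert_from_path  # third-party, needs Poppler

        def render_pages(path):
            return convert_from_path(path, dpi=dpi)

    if recognise is None:
        import pytesseract  # third-party, needs the Tesseract binary

        recognise = pytesseract.image_to_string

    return [recognise(page) for page in render_pages(pdf_path)]
```

Keeping the text per page makes it easy to restrict a search to the front page, for example `ocr_pdf("contract.pdf")[0]`.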

Bids on this Project

  • agrau


    Ciudad autónoma de Buenos Aires,  Argentina

    I'm a Visual C++ programmer with experience in audio, HD video, stereo 3D video, image processing, ray tracing, and GPU programming with CUDA and OpenCL.

    PHP, C Programming, Linux, and Engineering

  • SigmaVisual


    Taxila Cantt,  Pakistan

    Web, Bots, Crawlers, and Scrapers Development. I have expertise in automation services and I can automate any manual process.

    PHP, Python, SEO, and Data Entry

  • dasfreelancer


    Kolkata,  India

    I am an expert in many fields - My goal is to provide you with a wow experience. I am effective at finding easy and robust solutions to your problems, I am good at multitasking, I will offer you only good advice or none. I will make sure I go beyond your expectations at all times

    Java, JSP, Javascript, and XML

  • kiberg


    Moscow,  Russian Federation

    I'm a programmer. I like the C# language. I'm interested in video recognition and image processing.

    XML, .NET, Android, and Testing / QA

  • tekarcsolutions


    bangalore,  India

    Technical Architect

    PHP, Perl, ASP, and Java

  • oliveinfosys


    Kathmandu,  Nepal

    Olive Infosys Pvt. Ltd.

    PHP, C Programming, Javascript, and XML

  • esafeguard


    Zielona Gora,  Poland

    I'm a freelance web developer, designer, SEO expert (on site optimization) and translator from Poland. I have a one person EU company called e-safeguard which can be verified in EU databases, so I can issue a EU VAT invoice, if you need one. As a web developer, I'm interested in any projects in JavaScript, PHP, HTML, CSS. I specialize in interactive application development. I have experience in creating interactive maps, charts and flowcharts. I also provide services connected with SEO, DTP, typesetting, text formatting (including LaTeX) and English-Polish translations. My terms: I don't use Skype, so please don't send your contact details via Freelancer. I can't accept a project without receiving a specification, or a detailed description of the project. The specification/description should be written in full sentences and in your own words, and use links only to illustrate parts of the description.

    PHP, Javascript, XML, and Website Design

  • nittilegupta


    Indore,  India

    We will keep working on and tweaking the design until we gain your full satisfaction, with a satisfaction rating of no less than 10/10. Freelancer Preferred Badge earned by serving more than 50 clients. Excellent reputation: check the "Feedback" tab on my profile page to get an idea of the opinion of my previous satisfied customers. Affordable price range: not too high and not too cheap. I enjoy what I do and I look forward to working with you. We are a group of experienced, talented developers from India who have been delivering web solutions for the last 3 years. All of our team members have around 5 years of experience in the web development industry. Our team members are highly dedicated and work hard to deliver projects to our clients on time. We always keep a healthy relationship with our clients, provide the best service we can, and look for long-term business relationships.

    PHP, .NET, Website Design, and Internet Marketing

  • duzy


    Shenzhen,  China

    I started working in embedded Linux C/C++ software development in 2005 and in Android development in 2009, and have over 10 years of Linux experience. I'm exceptionally good at C/C++/Go and Android Java, with rich development experience in system components, frameworks/libraries, apps, and games. I mostly use Linux and Emacs and have a wide range of skills, e.g. HTTP/CGI in Golang, 2D graphics in skia/cairo, Qt, GStreamer for video, game development with Cocos2d-x, Box2D, Bullet, and OpenGL, Linux system administration, open source project porting, and lexical parsing (such as parsing a language). I work on an hourly basis and no longer take fixed budgets. The rate is negotiable based on the project.

    C Programming, Linux, Android, and Cloud Computing

  • ibapi


    Hyderabad,  India

    Freelancer with a difference. When I bid on your project, it's a bid for a new relationship with the promise of lifetime support.

    PHP, Perl, C Programming, and Java