Looking to build an automatic filing system for our scanner.
We want to keep this as simple as possible. We already have an API which connects to our database and downloads the data to an XML file.
We need a SQLite database that is populated from this XML file.
(See the attached .xml file)
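To make the XML-to-SQLite step concrete, here is a minimal sketch. It is written in Python purely for illustration (the real program is to be in C#). The element names `clients`, `client`, `name`, `tfn` and `abn`, the table layout and the sample data are all assumptions, because the attached .xml file is not reproduced here.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample; the real layout of the attached .xml may differ.
SAMPLE_XML = """
<clients>
  <client><name>Jane Citizen</name><tfn>123456789</tfn><abn></abn></client>
  <client><name>Acme Pty Ltd</name><tfn></tfn><abn>12345678901</abn></client>
</clients>
"""

def load_clients(xml_text, conn):
    """Replace the contents of the clients table with the records in xml_text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clients ("
        "name TEXT PRIMARY KEY, tfn TEXT, abn TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.iter("client"):
        name = (node.findtext("name") or "").strip()
        if not name:
            continue  # skip records without a client name
        # Store empty elements as NULL so they never match a scanned number.
        tfn = (node.findtext("tfn") or "").strip() or None
        abn = (node.findtext("abn") or "").strip() or None
        rows.append((name, tfn, abn))
    with conn:  # one transaction: wipe and reload on every sync
        conn.execute("DELETE FROM clients")
        conn.executemany("INSERT OR REPLACE INTO clients VALUES (?, ?, ?)", rows)
    return len(rows)
```

Running `load_clients` on each hourly download would keep the table in step with the XML file; the same list of client names could then drive the client folder sync.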
Here is the job flow:
1. We scan a document from our scanner.
2. The scanner uses an OCR reader on the .pdf and saves it into the “Scan Folder”.
3. The scanned items will appear in the GUI of our C# program, which will be watching the “Scan Folder” and will update as items appear. Every client in the database will have their own folder (this needs to be managed by syncing the folders with the .xml file). The updates need to happen automatically every hour.
4. All of the following will only happen once we push the “Automatch” button:
5. We then parse the .pdf with iText 7 Community .NET and save the parsed text to an array.
6. We then use regex to search for TFNs or ABNs. A TFN is 8 or 9 digits in a row, and an ABN is an 11-digit number. We then run the whole document through [login to view URL] API to identify who the main “People” or “Organisations” are in case a TFN or ABN is not present. When a TFN or ABN is not present, we will have to match the results from this against all of the customers in our database (Scan Folder). The year also needs to be extracted from the parsed text with a regex.
7. Once we have identified potential matches we will display them in the GUI.
8. The folder is just the client name. The Match needs to display on what basis there is a match. If the TFN is automatched and the date is automatched then the item can just be automatically allocated to the client folder, in which case the status updates to Complete. If only the TFN is automatched then the user needs to confirm the date.
9. If the ABN is matched, or there is a name match between the .pdf and the database, we need the possible matches to be identified, and then the user will have to select the right one.
10. Once an item has been automatched or manually matched, it needs to have the status of Complete, and we need it to be saved to the [login to view URL] file. It also needs to be moved to the correct client folder.
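As a rough illustration of steps 6 to 9, the sketch below extracts TFN, ABN and year candidates with regex and decides what the GUI should show. It is in Python for illustration only (the real program is to be in C#). The status labels other than Complete, the dictionary keys, the 20xx year pattern and the rule that the date counts as automatched when exactly one distinct year is found are all assumptions. Name matching through the entity API is left out.

```python
import re

# Step 6: a TFN is 8 or 9 digits in a row, an ABN is 11 digits in a row.
# The lookarounds stop the TFN pattern from matching inside a longer digit run.
TFN_RE = re.compile(r"(?<!\d)\d{8,9}(?!\d)")
ABN_RE = re.compile(r"(?<!\d)\d{11}(?!\d)")
# Assumption: the year appears as a standalone four-digit 20xx value.
YEAR_RE = re.compile(r"(?<!\d)20\d{2}(?!\d)")

def extract_ids(text):
    """Return every TFN, ABN and year candidate found in the parsed text."""
    return {
        "tfns": TFN_RE.findall(text),
        "abns": ABN_RE.findall(text),
        "years": YEAR_RE.findall(text),
    }

def classify(ids, clients):
    """Return (status, candidate client names, basis of the match).

    clients is a list of dicts with the hypothetical keys name, tfn and abn.
    """
    tfn_hits = [c["name"] for c in clients if c.get("tfn") in ids["tfns"]]
    abn_hits = [c["name"] for c in clients if c.get("abn") in ids["abns"]]
    if len(tfn_hits) == 1 and len(set(ids["years"])) == 1:
        return ("Complete", tfn_hits, "TFN + year")   # step 8: allocate automatically
    if len(tfn_hits) == 1:
        return ("Confirm date", tfn_hits, "TFN")      # step 8: user confirms the date
    if abn_hits:
        return ("Select client", abn_hits, "ABN")     # step 9: user picks the client
    return ("Unmatched", [], "none")
```

For example, `classify(extract_ids(parsed_text), clients)` would be called once per scanned .pdf after “Automatch” is pushed, and the returned basis is what the Match entry in the GUI would display.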