Data Entry in Database
This project was awarded to jayeshsuratsl for $46.75 USD.
Project Budget: $30-$5000 USD
A.) View information in a flat database of scanned documents and enter the viewed information into an ACT! or CSV compatible database. The information cannot be imported or exported from these documents; the only option is to view it and manually type it into a database.
The information comes to me on a CD, and could be converted to an ISO image, and then either made available for download, or FTP'd to the data entry entity. There would be approx. 800 images every 2 weeks, each reflecting one record. The information to be entered in the database in this part (part A) is mostly handwritten, and is in American English. The information to be read and transferred includes the following data elements:
1.) First and Last Name; 2.) Address; 3.) A dollar amount, easily discerned from the document; 4.) A date, easily discerned from the document; 5.) A unique number, reflected on the document.
B.) Access an online database (at [[url removed, login to view]]) and based on a certain set of sort criteria, obtain and create a database containing the same dataset as in "A."
I am interested in a **per-record** quote on each of these activities, an estimated turn-around time, and at least a 95% accuracy rate, with monetary penalties for <95% accuracy.
C.) If the entity or individual is only interested in the application portion of this bid request, I wish to have either a compiled application, or Access Templates and Macros, allowing creation of these databases and affording the user (me) the opportunity to purge duplicate records from later databases.
Almost 100% of this functionality is contained in the infacta Group Mail application at [[url removed, login to view]].
I am interested in separate bid(s) on the data entry portion(s) of this bid request, the application portion of this bid request, OR BOTH.
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).
4.) Data deliverables must be in .CSV or .dbf format (according to advance agreement), and a data sample must be provided and tested before approval of the project, or acceptance of a given dataset. Corrupted or unusable datasets will be rejected.
5.) All deliverables must run on Windows XP Pro / XP Pro 64 bit / Windows SBS 2003.
6.) The deliverable would allow extensive searching and duplicate checking, and allow deletion of records that duplicate entries in prior datasets. So, if John Doe at 1221 E. 23rd Street is in a database from 2 months ago, John Doe would be displayed, with an option given to purge him from the most recent database. An elegant application would find possible duplicates, display each possible duplicate record, and allow the user to either delete that record or let it remain in the current database. It would also have an automated method to select prior databases for the duplicate comparison and run the comparison in one session, instead of requiring the user to select a single prior database, compare it, select another prior file, ... the next, etc. For example, a selection box to either select or deselect all files within a specific folder (the startup folder) would be an ideal methodology. A standard naming method for the prior databases would include the date of creation of the prior database and the source (not disclosed here). An additional desired feature would be a search so that, for example, records could be searched for a specific name or address. Another feature would be a search for any out-of-state addresses. A Boolean NOT (Oklahoma) search would probably be the best way to accomplish this.
7.) The data would consist only of private individuals, be free of company names, be free of duplicates from within this dataset (this could be done by software, prior to transmission back to me), and be free of records of persons residing in a different state (from mine). (There are few records for companies and individuals residing out of state).
8.) Records would be free of incomplete or unreadable addresses. However, an address with no zip code is not an incomplete address, and should be included, with or without the zip code.
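As a rough illustration of the duplicate comparison and out-of-state search described in items 6 and 7, here is a minimal Python sketch. It is not the requested application, only one possible approach. It assumes the prior databases are CSV files in a single folder; D_Full_N is a column name used later in this posting, while D_Addr1 and D_State are assumed column names, and every function name is hypothetical.

```python
import csv
from pathlib import Path


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace for comparison."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def record_key(record):
    # D_Full_N appears in the posting; D_Addr1 is an assumed column name.
    return (normalize(record["D_Full_N"]), normalize(record["D_Addr1"]))


def load_records(csv_path):
    """Read one CSV database into a list of dicts keyed by header name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_prior_keys(folder):
    """Collect name/address keys from every CSV in the folder in one pass."""
    keys = set()
    for path in sorted(Path(folder).glob("*.csv")):
        keys.update(record_key(rec) for rec in load_records(path))
    return keys


def possible_duplicates(current, prior_keys):
    """Return current records whose name and address match a prior record."""
    return [rec for rec in current if record_key(rec) in prior_keys]


def out_of_state(records, home_state="OK"):
    """NOT (Oklahoma) search: records whose state differs from home_state.

    D_State is an assumed column name.
    """
    return [rec for rec in records if rec["D_State"].strip().upper() != home_state]
```

The matching here is exact after normalization, so "JOHN DOE" at "1221 E 23rd Street" matches "John Doe" at "1221 E. 23rd Street", but a misspelled name would not. The user would still review each flagged record and decide whether to delete it or let it remain.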
* * *This broadcast message was sent to all bidders on Monday Dec 5, 2005 11:12:38 AM:
I have added a sample of the Affidavits to the bid, per request from a number of you. Please note there are TWO DIFFERENT and DISTINCT data sources. One is flat .tif files, such as the one attached, and the other is very similar data obtained at www.realtytrac.com. The data elements to be captured/typed from the sample affidavits uploaded to the bid are: the Defendant's Name(s), the Defendant's Address, the case number (SC-XX-XXXX), the amount being sought in the judgment ($[url removed, login to view]), and the date the Affidavit was filed. One additional note to those of you bidding on the data entry portion: the requirement is a bid for PER RECORD entry. I have received a number of creative bids that do not conform to this requirement (e.g., per keystroke), and they will not be considered. Also, I have received a fair amount of response to my posting. If you have a specific question about the proposal, please ask it. A number of you have posted that you have questions, but have not stated the question. In fairness, I do not have time to contact each of you and discuss the matter. Please articulate any questions by clearly posting them. Failure to do this will result in no response.
* * *This broadcast message was sent to all bidders on Monday Dec 5, 2005 5:21:51 PM:
The information in the online database is viewed online, is text, and may be re-typed, or cut-and-pasted.
I want unreadable records ignored/skipped.
* * *This broadcast message was sent to all bidders on Tuesday Dec 6, 2005 11:38:04 AM:
Again, due to inquiries, I am increasing specificity. Following is a table of sample values and data elements to be extracted from [url removed, login to view] (one of the Affidavits in the sample bid file).
"Case Number: SC-2005-9548
Amount being Sought: $[url removed, login to view]
Date of Filing: 05/12/2005
Defendant's Full Name(s): Arnetta L. Greasham
Defendant's Last Name: Greasham
Defendant's Address1: 1900 N. Bryant
Defendant's City: Oklahoma City
Defendant's State: OK
Defendant's Zip: 73121"
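To make the table above concrete, the following Python sketch writes the sample values as one CSV row. It is only an illustration: the real column names come from the header file supplied with the project, so every name in FIELDS except D_Full_N (which appears later in this posting) is an assumption, and to_csv is a hypothetical helper. The amount is left blank because the value is removed from the posting.

```python
import csv
import io

# Assumed column names; the buyer's header file is the authority.
FIELDS = [
    "Case_No", "Amount", "File_Date", "D_Full_N", "D_Last_N",
    "D_Addr1", "D_Addr2", "D_City", "D_State", "D_Zip",
]

SAMPLE = {
    "Case_No": "SC-2005-9548",
    "Amount": "",  # elided in the posting, so left blank here
    "File_Date": "05/12/2005",
    "D_Full_N": "Arnetta L. Greasham",
    "D_Last_N": "Greasham",
    "D_Addr1": "1900 N. Bryant",
    # D_Addr2 is omitted; DictWriter fills missing fields with "".
    "D_City": "Oklahoma City",
    "D_State": "OK",
    "D_Zip": "73121",
}


def to_csv(records):
    """Serialize records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv([SAMPLE]))
```

Using the csv module rather than joining strings by hand means any value that contains a comma or a quotation mark is quoted correctly in the output.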
If there are two or more Defendants on the same record, do not create a separate record for each; place both names in the "Defendant's Full Name(s):" field, e.g., "Jane and John Doe".
"Defendant's Address2:" field is a placeholder, in case there is more than one line to the street address.
* * *This broadcast message was sent to all bidders on Tuesday Dec 6, 2005 3:02:23 PM:
OK. I am still getting inquiries regarding what a bid on a per record basis means. The 800 records every 2 weeks in the bid proposal is an ESTIMATE (based on my prior experience). You may post a main bid for data entry, based on an estimate of 800 records, but here is how I need you to break it down: If I submit 800 records, and your bid is $100 for 800 records, then that is .08 cents, (US) per record. I need to know that you will do the data entry, for example, for 8 cents per record, if that is your bid. The number of records is not going to be exactly 800 each time. It could be 500 records one time, and 1000 records the next. It depends fully on the number of records given to me by the Court Clerk, which depends on the number of filings for that time period. I am not implying your bid should be 8 cents, I am using 8 cents as an example.
Also, I have asked for a monetary penalty for datasets at less than 95% accuracy, and very few of you have addressed this requirement.
Finally, I asked for a turnaround time, and some have responded to this portion of the bid request, and some have not.
So, for clarity and consistency, a properly formed bid response would be something like,
"My bid, responsive to your bid proposal, is XX cents per record, with an XX% penalty for every percent less than 95% accuracy, with an XX day turnaround time, based on your estimate of 800 records. A variation of 10% or more in the number of records will likely result in shorter or longer turnaround of data. Nevertheless, the turnaround will be commensurate with the estimate, with consideration to the variation in the actual number of records."
If you then have additional information about your company or capabilities, feel free to share that information, and it will be considered as a part of the total bid package.
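The penalty clause in the sample bid wording could be computed along these lines. This is a sketch of one possible reading, not an agreed formula: it assumes the penalty is a percentage of the gross amount for each percentage point of accuracy below 95%, capped at the full amount. The function and parameter names are hypothetical.

```python
def invoice_total(num_records, cents_per_record, accuracy_pct,
                  penalty_pct_per_point, threshold_pct=95.0):
    """Return the amount due in dollars after any accuracy penalty.

    Assumed reading: each percentage point below the threshold removes
    penalty_pct_per_point percent of the gross amount.
    """
    gross_cents = num_records * cents_per_record
    shortfall = max(0.0, threshold_pct - accuracy_pct)
    penalty_fraction = min(1.0, shortfall * penalty_pct_per_point / 100.0)
    return round(gross_cents * (1.0 - penalty_fraction)) / 100.0


# 800 records at 8 cents with 96% accuracy: no penalty applies.
print(invoice_total(800, 8, 96.0, 5))
# Same batch at 93% accuracy with a 5% penalty per point: 10% is deducted.
print(invoice_total(800, 8, 93.0, 5))
```

A bidder who intends a different penalty structure (for example, a flat deduction) would need to state it explicitly in the bid.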
* * *This broadcast message was sent to all bidders on Wednesday Dec 7, 2005 8:51:59 AM:
It has been brought to my attention that I made a mathematical error in my prior broadcast posting. The error does not affect the meaning of the message or the requirements, but to correct my example: $0.08 (8 cents) per record for 800 records would correspond to a gross bid of $64, and a $100 gross bid would correspond to $0.125 (12 and 1/2 cents) per record.
Thanks to each of you who have bid on this project.
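The corrected arithmetic can be checked with a few lines of Python. This is only a worked example of the figures in the message above; the function names are hypothetical, and Decimal is used so the dollar amounts stay exact.

```python
from decimal import Decimal


def rate_per_record(gross_dollars, num_records):
    """Per-record rate in dollars implied by a gross bid."""
    return Decimal(gross_dollars) / Decimal(num_records)


def batch_cost(rate_dollars, num_records):
    """Cost of a batch at a fixed per-record rate."""
    return rate_dollars * num_records


print(rate_per_record("64", 800))    # a $64 gross bid on 800 records
print(rate_per_record("100", 800))   # a $100 gross bid on 800 records
print(batch_cost(Decimal("0.08"), 500))
print(batch_cost(Decimal("0.08"), 1000))
```

Because the record count varies from batch to batch, the per-record rate is the figure that stays fixed; the last two lines show the same 8-cent example rate applied to batches of 500 and 1000 records.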
* * *This broadcast message was sent to all bidders on Friday Mar 24, 2006 10:37:31 AM:
1.) No company names are to be included, and no out of state (non-Oklahoma) addresses are to be included. For these reasons, a couple of the records should have been skipped (Example, Alabama).
2.) Most often, the city is going to be Oklahoma City, Midwest City, Del City, Moore, Edmond, and Guthrie.
A complete list of all Oklahoma cities can be found at [url removed, login to view]. If the city name includes the word "City," that word should be included. For example, "Oklahoma City" and "Midwest City" should be typed in as "Oklahoma City" and "Midwest City," not "Oklahoma" and not "Midwest."
3.) The state must be only Oklahoma. (Skip/delete records from other states, (example Alabama, Texas, Arkansas)).
4.) Names should be in initial capitals only, and not all capital letters.
5.) Please do not enter records from any other documents that may appear, except the ones that say "Affidavit."
6.) Duplicates are to be skipped or deleted. A record is considered a duplicate if it is the same or similar name at the same address (even if the small claims dollar amount and filename are different).
7.) Records must contain a valid mailing address (Address, City, State), or they are to be skipped. Records are NOT required to have the zip, if the zip is missing in the original data file.
8.) Filename is the name of the data file, for example, XXXXX.tif. (This is to enable auditing of the records).
9.) Please do not enter records from any other documents that may appear, except the ones that say "Affidavit."
10.) Use the header file uploaded with this message as your CSV data elements.
11.) If you have an FTP server, enter the details for uploading data with your bid.
12.) If you do not have an FTP server, enter your email address in the form email (at)[url removed, login to view], with your bid, so I may send you a link to a downloadable .iso file containing the data.
13.) The turnaround time you state in your bid will be considered firm, once the bid is entered and accepted. State the number of records that will be entered per day, because the total number of records varies.
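Several of the rules above (Oklahoma only, initial capitals for names, a required Address, City, and State with an optional zip, and skipping duplicates) could be applied by software before a dataset is returned. The Python sketch below is one possible way to do that, under stated assumptions: D_Addr1, D_City, and D_State are assumed column names, the function names are hypothetical, and the duplicate check only catches identical names at the same address, so "similar" names would still need a manual review or fuzzy matching.

```python
REQUIRED = ("D_Addr1", "D_City", "D_State")  # assumed column names; zip is optional


def initial_caps(name):
    """Convert a name to initial capitals, keeping hyphenated parts."""
    def cap(word):
        return "-".join(p[:1].upper() + p[1:].lower() for p in word.split("-"))
    return " ".join(cap(word) for word in name.split())


def clean_batch(records):
    """Apply the skip rules and return the records that should be kept."""
    seen = set()
    kept = []
    for rec in records:
        # Rule 7: Address, City, and State are required; zip is not.
        if any(not (rec.get(field) or "").strip() for field in REQUIRED):
            continue
        # Rules 1 and 3: Oklahoma addresses only.
        if rec["D_State"].strip().upper() not in ("OK", "OKLAHOMA"):
            continue
        # Rule 6 (exact matches only): same name at the same address.
        key = (rec["D_Full_N"].strip().lower(), rec["D_Addr1"].strip().lower())
        if key in seen:
            continue
        seen.add(key)
        # Rule 4: names in initial capitals, not all capitals.
        kept.append({**rec, "D_Full_N": initial_caps(rec["D_Full_N"])})
    return kept
```

Company names (rule 1) are not handled here, because telling a company from a person needs a judgment call on the name itself.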
1. What is the format for the date?
A: MM/DD/YYYY (example, 02/16/1959)
2. There are two addresses in the image - residence address and mailing address. Which one is actually to be entered?
A: The residence address is to be entered.
3. In some of the names there is one letter as third/last name. Shall we consider that as the last name?
A: If there is truly only a single letter, skip that record.
However, it could be a middle initial, such as my name:
Timothy L. Wallace
Data would be input as follows:
D_Full_N: Timothy L. Wallace
Some females have hyphenated names, such as:
D_Full_N: Betty Boop-Monroe
What you are probably seeing is something like Wallace, Timothy L. In this case, the last name is listed first, then the first name, and finally, the middle name.
Again, if the last name appears as a single initial, please just skip that record, because it can't be correct.
4. In the corrections that you sent, one Oklahoma entry is stricken. Can you tell me why?
A: The entry was a business. Businesses are to be skipped.
5. Is there any way to differentiate whether the affidavit is in the name of a person or a company?
A: Not easily. Almost all of the company names contain a descriptive business word rather than just a person's name, such as "trucking," "hauling," "concrete," "construction," "auto," "automotive," "store," or a company designator, such as "Co.," "Company," "Corp.," "LLC," "LLP," "PC," or "Partnership." I do not expect you to pick out every one of the companies, because the language barriers make it difficult, but I do expect you to make an effort and to improve as I point out companies that slip through.
If you have more questions, please ask them.
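Two of the answers above lend themselves to a small helper: reordering names that appear as "Wallace, Timothy L." and flagging likely company names. The Python sketch below is a rough aid under stated assumptions, not a replacement for a human check. The keyword list is taken from answer 5; the function names are hypothetical, and a keyword match will miss companies without those words and may flag a person whose name happens to contain one.

```python
import re

# Keywords and designators listed in answer 5 (periods are ignored).
COMPANY_WORDS = {
    "trucking", "hauling", "concrete", "construction", "auto",
    "automotive", "store", "co", "company", "corp", "llc", "llp",
    "pc", "partnership",
}


def to_full_name(raw):
    """Return 'First Middle Last', or None if the record should be skipped.

    'Last, First Middle' input is reordered. A name whose last name is a
    single letter is rejected, per answer 3.
    """
    raw = " ".join(raw.split())
    if "," in raw:
        last, _, rest = raw.partition(",")
        full = f"{rest.strip()} {last.strip()}"
    else:
        full = raw
    parts = full.split()
    if len(parts) < 2 or len(parts[-1].rstrip(".")) <= 1:
        return None
    return full


def looks_like_company(name):
    """Heuristic: True if any word of the name is a known company keyword."""
    words = re.findall(r"[a-z]+", name.lower())
    return any(word in COMPANY_WORDS for word in words)


print(to_full_name("Wallace, Timothy L."))
print(looks_like_company("Doe Trucking LLC"))
```

Records flagged by looks_like_company are best set aside for a quick manual look rather than dropped automatically, since the heuristic cannot tell a surname from a business word.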
Windows XP Pro
Windows XP Pro 64
Windows SBS 2003
Availability of SQL Server
Availability of Server 2003
Availability of Server 2003, Web edition