I have 16 source reference word-lists (from old, public-domain dictionaries) with about 1,000 to 3,000 words per source book. There are about 20,000 unique words or terms in total. Each word list is in a separate Excel spreadsheet, arranged alphabetically by sheet (all the “A’s” on the first sheet, the “B’s” on the second, etc.).
In the cell beside each word is a path to a JPEG or GIF of the original page containing that word in that source. There are about 20,000 pages in total, and I have all of the original page scans as 300 dpi colour TIFFs from which to create new JPEGs, GIFs, or whatever format is most efficient size-wise.
What I need is a kind of auto-complete function where I start typing the word “cathedral”, for example, and the program tells me that, say, ten of the sixteen original source-books contain the word “cathedral” and displays the resulting links. I then click on the one that I want and the program calls up and displays the image file of the original page that contained that word (so that I can read it on the screen).
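As a rough illustration of the lookup described above, a minimal Python sketch might look like the following. The sample words, book names, image paths, and the function names `complete` and `books_for` are all hypothetical; a real version would load the index from the Excel word-lists rather than hard-code it.

```python
from bisect import bisect_left

# Hypothetical sample index: word -> {book name: path to page image}.
INDEX = {
    "castle": {"Book 02": "book02/0019.jpg"},
    "cathedral": {"Book 01": "book01/0024.jpg", "Book 07": "book07/0112.jpg"},
    "catholic": {"Book 01": "book01/0029.jpg"},
    "chapel": {"Book 03": "book03/0040.jpg"},
}
WORDS = sorted(INDEX)  # sorted once so prefix matches sit next to each other


def complete(prefix, limit=10):
    """Return up to `limit` indexed words that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(WORDS, prefix)  # first word >= prefix
    matches = []
    for word in WORDS[start:]:
        if not word.startswith(prefix) or len(matches) == limit:
            break
        matches.append(word)
    return matches


def books_for(word):
    """Return the books containing `word`, with the page image for each."""
    return INDEX.get(word.lower(), {})
```

Typing "cat" would call `complete("cat")`, and choosing "cathedral" from the suggestions would call `books_for("cathedral")` to get the list of books and page images to offer as links.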
I have wide TIFF photos of all the original books together on a shelf, and I want, for example, each of the ten books (of the sixteen) that contain the word to “light up” or slide forward to indicate that those particular books are the ones that contain “cathedral”. I would like to be able to select the one that I want by clicking on that book in the photo.
Once the page-image that I want is displayed the program also needs a “next page” and “previous page” function that will call up the next page (when a particular reference entry covers two or more pages).
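A minimal sketch of the paging behaviour, in Python, could look like this. It assumes page images are named by four-digit page number inside one folder per book (for example `book01/0024.jpg`), which is only an illustrative convention, and the class name `PageViewer` is hypothetical.

```python
class PageViewer:
    """Tracks the currently displayed page within one book."""

    def __init__(self, first_page, last_page, current):
        self.first_page = first_page  # lowest page number scanned for the book
        self.last_page = last_page    # highest page number scanned for the book
        self.current = current        # page now on screen

    def next_page(self):
        """Advance one page, stopping at the end of the book."""
        if self.current < self.last_page:
            self.current += 1
        return self.current

    def previous_page(self):
        """Go back one page, stopping at the start of the book."""
        if self.current > self.first_page:
            self.current -= 1
        return self.current

    def image_path(self, book_dir):
        """Build the (assumed) image path for the current page."""
        return f"{book_dir}/{self.current:04d}.jpg"
```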
Each of the spreadsheet word-list entries also has a “print range” that defines the pages that need to print to cover a given selected word. For example, the word “cathedral” might correspond to a five-page article in one of the books, so the program would need to know to print page 0024 to page 0028 when it gets a print request for that book/word. I think about 75% of the words are completely contained on a single page in a given book and the remaining 25% involve multiple pages.
The above is phase one of what I need.
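To make the print range concrete, here is a short Python sketch that expands a range such as 0024 to 0028 into the individual pages to send to the printer. It assumes the range is stored as zero-padded first and last page numbers; the function name `pages_to_print` is hypothetical.

```python
def pages_to_print(first_page, last_page=None):
    """Expand a print range into a list of zero-padded page numbers.

    A single-page entry can omit `last_page`.
    """
    if last_page is None:
        last_page = first_page
    width = len(first_page)  # keep the same zero padding as the input
    first, last = int(first_page), int(last_page)
    if last < first:
        raise ValueError("print range ends before it starts")
    return [str(n).zfill(width) for n in range(first, last + 1)]
```

With this, `pages_to_print("0024", "0028")` gives the five pages of the example article, and `pages_to_print("0031")` gives a single page.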
For phase two, depending on the degree of difficulty, I also need a program that will allow me to expand the virtual library as I complete the scanning or photographing of new reference books in my collection. I have, for example, a seventeenth book that I have not yet scanned and which has about 1,000 reference words in it. I would manually create the new word-list in Excel, which will probably involve about 950 words that are already covered by the first sixteen and 50 new words that do not appear in any of the first sixteen. The second program needs to integrate the new material (word-lists and linked page images) with the existing material. I anticipate adding one new reference work about every two months, so it would be good to automate the expansion process as much as is reasonably possible.
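The merge step could be sketched in Python roughly as follows. It assumes the combined index is a mapping from each word to the books that contain it, and that the new spreadsheet has already been read into (word, image path) rows; the function name `add_book` and the sample data in the usage note are hypothetical.

```python
def add_book(index, book, word_list):
    """Merge one book's word list into `index`; return the words new to the library.

    index:     word -> {book name: page image path}
    word_list: iterable of (word, page image path) rows from the new spreadsheet
    """
    new_words = []
    for word, image_path in word_list:
        key = word.strip().lower()  # normalise so "Cathedral" matches "cathedral"
        if key not in index:
            index[key] = {}
            new_words.append(key)
        index[key][book] = image_path
    return new_words
```

For example, merging rows for "Cathedral" and "Zither" from a seventeenth book into an index that already has "cathedral" would add the new book under the existing word and return `["zither"]` as the only new entry.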
I have a semi-working model of the first five books (about 1/3 of the total) that was done as a virtual website on DVD and which ran under a Mozilla browser. It was done about ten years ago and was about six gigabytes in total size using JPEGs. I don't have the source code, but it needs to be redone from scratch anyway, and I understand that image file compression has improved substantially in the meantime. I would like all sixteen to fit on a 16 GB flash drive if possible.
I think that the project is quite simple in that it only involves a few simple elements, but the total size of the image files is large. It is not necessarily intended as a website but rather for distribution on a flash drive. I am indicating a "fixed price" under the "Project type" but am flexible depending on the complexity of it.
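For a sense of scale: if the old six gigabytes covered about a third of the material, the full set at the old compression would be roughly 18 gigabytes, so the images do need to come out somewhat smaller than before. A small Python sketch of the per-page size budget follows; the 10% reserve for the program, index, and file-system overhead is an assumption, as is counting a gigabyte as 10^9 bytes.

```python
GB = 10**9  # decimal gigabytes, as drive capacities are usually quoted (assumption)


def per_page_budget(drive_gb, pages, reserve_percent=10):
    """Average number of bytes available per page image.

    `reserve_percent` of the drive is held back for everything that is
    not a page image (an assumed figure, not a measured one).
    """
    usable = drive_gb * GB * (100 - reserve_percent) // 100
    return usable // pages
```

Calling `per_page_budget(16, 20_000)` gives the average size each of the 20,000 page images can be while still fitting on the 16 GB drive under those assumptions.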