The work is very simple.
The input is a set of PDF files, usually technical books.
(1). The goal is to split each PDF into small, logically self-contained chunks that take 5 minutes or less to read. The splitting follows the table of contents (TOC) of each book. As a rule of thumb, 1 page takes about 1 minute to read, so "5 minutes or less" translates to chunks of 5 pages or less.
(2). (1) is usually achieved by splitting at the second level of the TOC (the first level being the chapter level). If a resulting chunk is too big, it may need to be split at the third level of the TOC instead. If third-level entries are individually too small (1 page or less), a few adjacent ones may need to be combined into a single chunk, still respecting the 5-pages-or-less rule. In any case, no chunk may be bigger than 10 pages.
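The rules in (1) and (2) can be sketched in code roughly as follows. This is only an illustration under stated assumptions, not the required implementation: the `TocEntry` type and the function names are hypothetical, page ranges are inclusive absolute PDF pages, the third-level entries are assumed to cover their whole parent section, and the fallback of cutting an oversized range with no finer TOC level into 5-page pieces is an assumption that the brief does not specify.

```python
from dataclasses import dataclass, field
from typing import List

TARGET_PAGES = 5   # preferred chunk size (about 5 minutes of reading)
MAX_PAGES = 10     # hard upper limit on any chunk
SMALL_PAGES = 1    # entries this short may be combined with neighbours


@dataclass
class TocEntry:  # hypothetical structure for one TOC entry
    title: str
    start: int   # first absolute PDF page (inclusive)
    end: int     # last absolute PDF page (inclusive)
    children: List["TocEntry"] = field(default_factory=list)

    @property
    def pages(self) -> int:
        return self.end - self.start + 1


def merge_small(entries):
    """Combine runs of adjacent small entries while the span stays within TARGET_PAGES."""
    groups = []  # each group: [titles, start, end, all_small]
    for e in entries:
        small = e.pages <= SMALL_PAGES
        if (groups and small and groups[-1][3]
                and e.end - groups[-1][1] + 1 <= TARGET_PAGES):
            groups[-1][0].append(e.title)
            groups[-1][2] = e.end
        else:
            groups.append([[e.title], e.start, e.end, small])
    return [(titles, start, end) for titles, start, end, _ in groups]


def split_by_pages(title, start, end):
    """Assumed fallback: cut a range with no finer TOC level into TARGET_PAGES pieces."""
    return [([title], p, min(p + TARGET_PAGES - 1, end))
            for p in range(start, end + 1, TARGET_PAGES)]


def chunk_section(section):
    """Return (titles, start, end) chunks for one second-level TOC entry."""
    if section.pages <= TARGET_PAGES or not section.children:
        candidates = [([section.title], section.start, section.end)]
    else:
        # Section is too big: fall back to its third-level entries.
        candidates = merge_small(section.children)
    result = []
    for titles, start, end in candidates:
        if end - start + 1 > MAX_PAGES:
            result.extend(split_by_pages(titles[0], start, end))
        else:
            result.append((titles, start, end))
    return result
```

For example, a 12-page section whose third-level entries span pages 10, 11, 12, 13-18 and 19-21 would come out as three chunks: the three one-page entries combined (pages 10-12), then pages 13-18, then pages 19-21. A chunk of 6 to 10 pages with no finer TOC level is kept as-is, since it is over the 5-page target but within the 10-page limit.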
(3). The chunk name should have the form: Book Title: Chapter Number (Title): Paragraph Title: Sub-Paragraph Title
(4). The chunk content should have the form: Chapter Number (Title): Paragraph Title: Sub-Paragraph Title ... pages 21-26
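The two formats in (3) and (4) could be produced by small helpers like the ones below. This is a sketch only: the function names are hypothetical, rendering the chapter as "Chapter 3 (Title)" is one possible reading of "Chapter Number (Title)", and joining several combined sub-paragraph titles with ", " is an assumption, since the brief does not say how merged entries should be named.

```python
def _heading(chapter_no, chapter_title, paragraph, sub_paragraphs=()):
    """Build 'Chapter N (Title): Paragraph: Sub-Paragraph' (sub-paragraph part optional)."""
    parts = [f"Chapter {chapter_no} ({chapter_title})", paragraph]
    if sub_paragraphs:
        # Assumption: merged sub-paragraphs are listed comma-separated.
        parts.append(", ".join(sub_paragraphs))
    return ": ".join(parts)


def chunk_name(book_title, chapter_no, chapter_title, paragraph, sub_paragraphs=()):
    """Chunk name: the heading prefixed with the book title."""
    return f"{book_title}: {_heading(chapter_no, chapter_title, paragraph, sub_paragraphs)}"


def chunk_content(chapter_no, chapter_title, paragraph, sub_paragraphs,
                  first_page, last_page):
    """Chunk content: the heading followed by the absolute page range."""
    if first_page == last_page:
        pages = f"page {first_page}"
    else:
        pages = f"pages {first_page}-{last_page}"
    return f"{_heading(chapter_no, chapter_title, paragraph, sub_paragraphs)} ... {pages}"


# Placeholder titles, for illustration only.
print(chunk_name("Book Title", 3, "Chapter Title", "Paragraph Title", ["Sub-Paragraph Title"]))
print(chunk_content(3, "Chapter Title", "Paragraph Title", ["Sub-Paragraph Title"], 21, 26))
```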
(5). Page numbers must be the absolute page numbers of the PDF file, that is, the position where the material is actually found in the file, and not the page numbers printed in the table of contents. This information will be used to 'Go To Page' directly to the reading material.
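One simple way to get from printed TOC page numbers to absolute PDF page numbers is a fixed offset, sketched below. The function names are hypothetical, and the approach assumes the book's main body is numbered continuously, so that a single offset measured on one sample page holds for the whole range. Front matter with roman numerals or unnumbered inserted pages would need separate handling.

```python
def page_offset(printed_page: int, absolute_page: int) -> int:
    """Offset measured from one sample page whose printed and absolute numbers are both known."""
    return absolute_page - printed_page


def to_absolute(printed_page: int, offset: int) -> int:
    """Convert a page number printed in the TOC to an absolute PDF page number."""
    return printed_page + offset


# Example: the page printed as "1" is the 17th page of the PDF file.
offset = page_offset(printed_page=1, absolute_page=17)

# A TOC entry listed as pages 5-10 is then found at absolute pages 21-26.
first, last = to_absolute(5, offset), to_absolute(10, offset)
print(f"pages {first}-{last}")
```

It is worth spot-checking the offset against a few pages per book before relying on it.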
The chunks will be created in a wiki file that uses some predefined tags. Currently the splitting work is done manually.
Good work will bring other work.