Indexing Text
1 Abstract
A means of indexing text files is presented.
The inputs are OrgMode and Markdown source files, though any plain text file will do.
Results are produced in a meta-file in RDB list format, suitable for subsequent formatting in one of the authoring tools.
2 Current
At the moment, a shell function library and an awk script search the text files to produce the index. The objectives are to:
- find whole words regardless of case
- ignore partial matches
- find short phrases which may wrap a text line
- create the index entry for the lowest level section header
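The four objectives above can be sketched in a few lines of awk wrapped in a shell function. This is a minimal illustration, not the author's rina function: the name index_sketch is hypothetical, it assumes OrgMode headers (lines starting with one or more `*` followed by a space), it assumes a tab between field name and value, and it handles a phrase that wraps across at most two lines by joining each line to the one before it.

```shell
# index_sketch TOKEN FILE ...   (hypothetical name, for illustration only)
# Writes one list-format record per section in which TOKEN occurs.
index_sketch () {
    token=$1; shift
    awk -v token="$token" '
    # Return 1 when needle occurs in text bounded by non-alphanumerics.
    function whole_word(text, needle,    pos, off, before, after) {
        text = " " tolower(text) " "
        off = 0
        while ((pos = index(substr(text, off + 1), needle)) > 0) {
            pos += off
            before = substr(text, pos - 1, 1)
            after = substr(text, pos + length(needle), 1)
            if (before !~ /[a-z0-9]/ && after !~ /[a-z0-9]/) return 1
            off = pos
        }
        return 0
    }
    BEGIN { want = tolower(token) }            # compare in lower case
    FNR == 1 { section = ""; prev = ""; reported = 0 }
    /^\*+ / {                                  # most recent header seen
        section = $0; sub(/^\*+ +/, "", section)
        prev = ""; reported = 0
        next
    }
    section != "" && !reported {
        joined = prev " " $0                   # phrase may wrap one line
        gsub(/[ \t]+/, " ", joined)
        if (whole_word(joined, want)) {
            printf "\nheader\t%s\nhtlink\tfile::%s::*%s\ntext\t%s\n", token, FILENAME, section, section
            reported = 1                       # one entry per section
        }
    }
    { prev = $0 }
    ' "$@"
}
```

A call such as `index_sketch 'the treasurer' ./testcase.org` would then report each section whose body contains the phrase. The boundary test is done by hand rather than with a word-boundary regular expression so that the sketch does not depend on any one awk implementation.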
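The rina_doc and rin_testd functions come first, followed by the results they produce on the test data. Here, rin_testd exercises both sides of the include/exclude requirement: tokens that should match, such as member and 'the treasurer', and one that should not, embership.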
    rina_doc ()
    {
        : stdout: list format -- /rdb -- of index references
        : format: header, htlink, text
        : args: TOKEN FILE ...
        :
    }
    rin_testd ()
    {
        set ./testcase.org
        rina member $1
        rina members $1
        rina embership $1
        rina Membership $1
        set $(fullpath ./testcase.org)
        rina Member $1
        rina membership $1
        rina 'the treasurer' $1
    }
Here is the resulting RDB list format. In contrast to the RDB table format, the list format begins with a blank line, the first word on each line is the field name, and blank lines separate records.

    header member
    htlink file::./testcase.org::*Qualifications
    text Qualifications

    header member
    htlink file::./testcase.org::*Recommendation
    text Recommendation

    header members
    htlink file::./testcase.org::*Contemporary
    text Contemporary

    header Membership
    htlink file::./testcase.org::*Recommendation
    text Recommendation

    header Member
    htlink file::/Users/applemcg/Dropbox/commonplace/lit/testcase.org::*Qualifications
    text Qualifications

    header Member
    htlink file::/Users/applemcg/Dropbox/commonplace/lit/testcase.org::*Recommendation
    text Recommendation

    header membership
    htlink file::/Users/applemcg/Dropbox/commonplace/lit/testcase.org::*Recommendation
    text Recommendation

    header the treasurer
    htlink file::/Users/applemcg/Dropbox/commonplace/lit/testcase.org::*Recommendation
    text Recommendation
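Because records are separated by blank lines, the list format is easy to read back with awk's paragraph mode. The sketch below is an illustration under stated assumptions, not part of the author's library: list_to_rows is a hypothetical name, and it flattens each record to one tab-separated row of values. It does not attempt to reproduce the RDB table format itself.

```shell
# list_to_rows: read list-format records on stdin and write one
# tab-separated row of field values per record (hypothetical helper).
list_to_rows () {
    awk '
    BEGIN { RS = ""; FS = "\n"; OFS = "\t" }   # paragraph mode: one record per blank-line block
    {
        row = ""
        for (i = 1; i <= NF; i++) {
            name = $i
            sub(/[ \t].*$/, "", name)          # first word is the field name
            value = substr($i, length(name) + 1)
            sub(/^[ \t]+/, "", value)          # the rest of the line is the value
            row = (i == 1 ? value : row OFS value)
        }
        print row
    }'
}
```

Piping the index output through `list_to_rows` gives one line per index entry, which may be convenient for sorting or for a quick check with other line-oriented tools.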
3 Commands
set testcase; ln -f $1.org $1.txt
4 References
- the Literature Tools chapter
- this paper: http://mcgowans.org/pubs/marty3/commonplace/lit/indexing.html, online
- Markdown
- OrgMode
- RDB – Unix Relational Database Management