Load the script file (usually index.idx) and process it one line at a time,
producing one or more index terms per (non-comment) line.
Reading all lines builds a list of terms to index.
Some of those may be terms defined (by you) directly in the script file,
others may be terms found by scanning C++ header and source files that
were specified by the !scan-path directive.
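
The line-by-line pass described above can be sketched roughly as follows. This is a simplified illustration, not AutoIndex's actual parser: it assumes that comment lines start with `#` and directive lines start with `!`, it treats every other non-blank line as a single term, and it does not scan any C++ files. The function name `read_script_terms`, the sample script content, and the path `include/mylib` are hypothetical.

```python
def read_script_terms(lines):
    """Split script lines into plain terms and directives (simplified)."""
    terms, directives = [], []
    for raw in lines:
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' carry no terms.
        if not line or line.startswith("#"):
            continue
        # Assumption: directives such as !scan-path start with '!'.
        if line.startswith("!"):
            directives.append(line)
        else:
            terms.append(line)
    return terms, directives


# Hypothetical script content, for illustration only.
script = [
    "# hypothetical index.idx content",
    "!scan-path include/mylib",
    "polynomial",
    "",
    "bessel",
]
terms, directives = read_script_terms(script)
print(terms)       # ['polynomial', 'bessel']
print(directives)  # ['!scan-path include/mylib']
```

In the real tool, a !scan-path directive would contribute further terms found in the scanned headers and sources; here it is only collected.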
Once the list of terms to index is complete,
AutoIndex loads the Docbook XML file. (If the source is Quickbook, Doxygen, Boostbook or Docbook,
then this is the complete documentation after conversion to Docbook format.)
AutoIndex builds an internal Document Object Model (DOM) of the Docbook XML.
This internal representation is then scanned for occurrences of the terms to index.
This scanning works at the XML paragraph level (or an equivalent sibling such
as a table or code block), so all the XML markup within a paragraph
is flattened to plain text. This flattening means the regular expressions
used to search for terms to index can find anything
that is completely contained within a paragraph (or code block etc).
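
To illustrate why flattening matters, here is a minimal sketch using Python's standard `xml.etree.ElementTree` rather than AutoIndex's own code. The sample Docbook fragment, the helper `flatten`, and the search pattern are made up for the example. The phrase "gamma distribution" is split across two inline elements in the markup, but it becomes a contiguous match once the paragraph is reduced to plain text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical Docbook fragment: the phrase spans <code> and <emphasis>.
docbook = """<section>
  <para>The <code>gamma</code> <emphasis>distribution</emphasis> is continuous.</para>
  <para>See also <link linkend="beta">beta</link>.</para>
</section>"""

root = ET.fromstring(docbook)
term = re.compile(r"gamma distribution")


def flatten(elem):
    """Concatenate all text inside elem, dropping the inline markup."""
    return "".join(elem.itertext())


# Search each paragraph's flattened text, not the raw markup.
hits = [flatten(p) for p in root.iter("para") if term.search(flatten(p))]
print(hits)  # ['The gamma distribution is continuous.']
```

A term that straddles two paragraphs would not be found by this approach, which matches the limitation described above.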
For each term found, an indexterm Docbook element
is inserted into the DOM (provided internal index generation is off).
AutoIndex's internal index representation is also updated.
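
The insertion step might look something like the sketch below, again using `xml.etree.ElementTree` purely for illustration. The helper `add_indexterm` is hypothetical, and where exactly AutoIndex places the element within the document is not stated here, so appending it to the matching paragraph is an assumption. The `indexterm` and `primary` element names are standard Docbook.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<para>The gamma distribution is continuous.</para>")


def add_indexterm(parent, term):
    """Append <indexterm><primary>term</primary></indexterm> to parent."""
    indexterm = ET.Element("indexterm")
    primary = ET.SubElement(indexterm, "primary")
    primary.text = term
    # Assumption: the element is appended to the paragraph that matched.
    parent.append(indexterm)
    return indexterm


add_indexterm(para, "gamma distribution")
print(ET.tostring(para, encoding="unicode"))
```

The Docbook XSL stylesheets later collect such indexterm elements when they build the index.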
Once the whole XML document has been indexed, if AutoIndex has been
instructed to generate the index itself, it creates the necessary XML and
inserts it into the DOM.
Finally, the whole DOM is written out as a new Docbook XML file, and
normal processing continues via the XSL stylesheets (with xsltproc)
to build the final human-readable docs.