Start / Lightning Talks / Jochen Stenzel

Automatic cross references

In this demo, pages are crosslinked all over the site. This is done by explicit links, and it works fine for all the pages provided by the CD team. But for the talk pages, this approach does not work, especially if there are archives: too much manual work, too much maintenance overhead in case of changes (which tend to happen all the time), and too much effort to scan and categorize all the abstracts and bios.

But ... it would be nice to see which talks are related. Can this be done automatically?

It can! That's what PerlPoint index references were invented for. The basic idea: pages that share index entries are probably related.

So PerlPointCD produces index-based cross references. They are added to a page whenever related pages are found, sorted by match count.
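
The matching idea can be sketched in a few lines of Python. This is only an illustration under assumptions, not PerlPointCD's actual implementation: pages are modelled as plain sets of index entries, the names `related_pages` and `pages` are made up for this example, and "sorted by match count" is assumed to mean highest count first.

```python
def related_pages(start, pages):
    """Return (page, match_count) pairs for pages sharing index entries with `start`.

    `pages` maps a page name to the set of its index entries (a hypothetical
    data model). Results are sorted by descending match count.
    """
    base = pages[start]
    matches = []
    for name, entries in pages.items():
        if name == start:
            continue  # a page is not related to itself
        count = len(base & entries)  # number of shared index entries
        if count > 0:
            matches.append((name, count))
    # Highest match count first; ties are ordered by page name for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return matches


pages = {
    "talk-a": {"perl", "xml", "index"},
    "talk-b": {"perl", "index", "cd"},
    "talk-c": {"xml"},
    "talk-d": {"python"},
}
print(related_pages("talk-a", pages))  # [('talk-b', 2), ('talk-c', 1)]
```

Here `talk-d` shares no entry with `talk-a`, so it is never listed.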

There are three configuration settings controlling the search. They are accessible as makefile macros and default to values that produced good results.

Base search depth

The page we are searching related pages for may have subchapters. The start page has index entries, and the subchapters probably add more. How many subchapter levels should be taken into account when building our base of index entries?

Provide one of the keywords startpage or full to the makefile macro INDEXREL_READDEPTH.


  # include the start page only
  INDEXREL_READDEPTH=startpage

  # include all subchapters
  INDEXREL_READDEPTH=full

As it seems best to start from a base of as many key phrases as possible, this setting defaults to full.
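
To make the difference between the two keywords concrete, here is a minimal Python sketch. It assumes a hypothetical page structure (a dict with an `entries` set and an optional `subchapters` list); the function name `collect_entries` is invented for illustration and says nothing about how PerlPointCD stores pages internally.

```python
def collect_entries(page, depth="full"):
    """Gather the index entries of a page.

    `page` is a hypothetical dict: {"entries": set, "subchapters": [page, ...]}.
    depth="startpage" reads the page itself only; depth="full" also
    descends into every subchapter level.
    """
    if depth not in ("startpage", "full"):
        raise ValueError(f"unknown depth keyword: {depth!r}")
    entries = set(page["entries"])
    if depth == "full":
        for sub in page.get("subchapters", []):
            entries |= collect_entries(sub, "full")
    return entries


talk = {
    "entries": {"perl"},
    "subchapters": [
        {"entries": {"xml"}, "subchapters": [{"entries": {"sax"}}]},
    ],
}
print(sorted(collect_entries(talk, "startpage")))  # ['perl']
print(sorted(collect_entries(talk, "full")))       # ['perl', 'sax', 'xml']
```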

Related page search depth

Related pages might have subchapters as well. How many of their subchapter levels should be taken into account when searching for matches to the start page?

Again, provide one of the keywords startpage or full to the makefile macro INDEXREL_RELDEPTH.


  # consider the start page only
  INDEXREL_RELDEPTH=startpage

  # scan all subchapters
  INDEXREL_RELDEPTH=full

This setting defaults to full as well.

Threshold

Configured by the makefile macro INDEXREL_THRESHOLD, this setting controls how many of the start page's index entries must be found on another page for that page to count as related. Matches below this threshold are discarded.

The setting can be given as an absolute number or as a percentage, with 20% as the default.


  # skip everything below a threshold of 30%
  INDEXREL_THRESHOLD=30%

  # require three matches at least
  INDEXREL_THRESHOLD=3
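
How the two threshold forms might be evaluated can be sketched as follows. The helper names `parse_threshold` and `is_related` are hypothetical, and rounding a percentage up to the next whole match is an assumption made for this example; the real rounding rule is not documented here.

```python
import math


def parse_threshold(value, base_size):
    """Turn a threshold setting into a minimum match count.

    "30%" means 30 percent of the start page's `base_size` index entries
    (rounded up here, which is an assumption); "3" means three matches.
    """
    value = value.strip()
    if value.endswith("%"):
        percent = float(value[:-1])
        return math.ceil(base_size * percent / 100)
    return int(value)


def is_related(match_count, threshold, base_size):
    """True if `match_count` reaches the configured threshold."""
    return match_count >= parse_threshold(threshold, base_size)


# A start page with 10 index entries:
print(parse_threshold("30%", 10))  # 3
print(parse_threshold("3", 10))    # 3
print(is_related(2, "30%", 10))    # False
```

With a base of 10 index entries, both example settings above end up requiring three matches.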

To get really good results, encourage your speakers to make (lots of ;-)) index entries!
