Record-Boundary Discovery Algorithm
Input: A Web document D.
Output: The record separator of D.
Step 1: Create the tag tree T of D.
Step 2: Locate the highest-fan-out subtree HF in T.
Step 3: Extract the set of candidate tags CT from HF.
Step 4: Apply the five individual heuristics OM, SD, IT, HT, and RP to CT.
Step 5: For each candidate tag in CT, combine the results of all five heuristics (ORSIH) using Stanford certainty theory to obtain a compound certainty factor.
Step 6: Choose the candidate tag with the highest compound certainty factor as the record separator for D.
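
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: D is assumed to be reasonably well-nested HTML, fan-out is taken to be the number of direct children of a node, and CT is taken to be the distinct tags among the direct children of HF. The five heuristics are not implemented here; their per-tag certainty values are supplied as hypothetical inputs in [0, 1] and combined with the standard rule for two pieces of positive evidence, CF = CF1 + CF2 - CF1 * CF2. All names (Node, TagTreeBuilder, highest_fan_out, candidate_tags, combine, choose_separator) and all numeric values are illustrative.

```python
from html.parser import HTMLParser

# Tags that never have children (a partial list, assumed for this sketch).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TagTreeBuilder(HTMLParser):
    """Step 1: build a tag tree T from the document D."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def iter_nodes(node):
    yield node
    for child in node.children:
        yield from iter_nodes(child)


def highest_fan_out(root):
    """Step 2: the node with the most direct children roots HF."""
    return max(iter_nodes(root), key=lambda n: len(n.children))


def candidate_tags(hf):
    """Step 3 (assumption): CT is the set of tags of HF's direct children."""
    return {child.tag for child in hf.children}


def combine(cfs):
    """Step 5: fold certainty factors with CF = CF1 + CF2 - CF1 * CF2."""
    total = 0.0
    for cf in cfs:
        total = total + cf - total * cf
    return total


def choose_separator(scores):
    """Step 6: pick the tag with the highest compound certainty factor."""
    return max(scores, key=lambda tag: combine(scores[tag]))


# Usage on a toy document.
SAMPLE = ("<html><body><h1>Listing</h1>"
          "<hr><b>a</b><br><hr><b>b</b><br><hr><b>c</b><br>"
          "</body></html>")
builder = TagTreeBuilder()
builder.feed(SAMPLE)
hf = highest_fan_out(builder.root)
ct = candidate_tags(hf)

# Step 4 is stubbed: hypothetical certainties in the order OM, SD, IT, HT, RP.
scores = {
    "hr": [0.6, 0.5, 0.8, 0.4, 0.5],
    "b":  [0.3, 0.4, 0.2, 0.4, 0.5],
    "br": [0.2, 0.4, 0.3, 0.4, 0.3],
    "h1": [0.1, 0.1, 0.1, 0.1, 0.1],
}
separator = choose_separator(scores)
```

Because the combination rule is commutative and associative for values in [0, 1], the order in which the five heuristic results are folded does not affect the compound certainty factor.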