Automatic identification of cross-document structural relationships