Look up language and title in the domain and title indexes. Side effect: stores the title in the index if it is not already set.
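The lookup-with-side-effect behaviour described above could look roughly like the following. This is only a minimal sketch: the class LinkIndex, its methods, and the use of hash maps and an int pair as the result are assumptions made for illustration, not the original data structures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; class and method names are not taken from the original code.
final class LinkIndex {
    private final Map<String, Integer> domainKeys = new HashMap<>(); // language -> domain key
    private final Map<String, Integer> titleKeys = new HashMap<>();  // title -> title key
    private final List<String> titles = new ArrayList<>();           // title key -> title

    LinkIndex(String[] languages) {
        for (int i = 0; i < languages.length; i++) {
            domainKeys.put(languages[i], i);
        }
    }

    /**
     * Returns {domainKey, titleKey}, or null if the language is unknown.
     * Side effect: a title seen for the first time is stored in the index.
     */
    int[] lookup(String language, String title) {
        Integer domain = domainKeys.get(language);
        if (domain == null) {
            return null; // unknown language: nothing is stored
        }
        Integer key = titleKeys.get(title);
        if (key == null) {
            key = titles.size();       // next free slot
            titleKeys.put(title, key);
            titles.add(title);         // store the title only if not already set
        }
        return new int[] { domain, key };
    }

    int titleCount() {
        return titles.size();
    }

    String titleAt(int key) {
        return titles.get(key);
    }
}
```

Looking up the same title a second time returns the same key and leaves titleCount() unchanged.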
Read dump file. Must be called before writeTriples() if readTriples() is not called. Initializes the following fields: languages, dates, domains, titles, titleCount, links.
Read links from interlanguage-links triple files. Accesses titleKeys and domainKeys and sets both to null when done, so it must be called after setLanguages() and must not be called twice. Must be called before writeTriples() if readDump() is not called. Initializes the following fields: titles, titleKeys, titleCount, links. Clears the following fields: titleKeys, domainKeys.
Build domain index. Must be called before readTriples(). Initializes the following fields: languages, dates, domains, domainKeys.
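The call-order constraints stated so far (setLanguages() before readTriples(), readTriples() at most once, and either readDump() or readTriples() before writeTriples()) can be made explicit with simple state checks. The sketch below is illustrative only: CallOrderGuard is a hypothetical class, the method bodies are stubs, and the original code may enforce none of this at runtime.

```java
// Hypothetical sketch; only the method names mirror the descriptions above.
final class CallOrderGuard {
    private boolean languagesSet; // domain index built
    private boolean triplesRead;  // readTriples() already ran, key maps are gone
    private boolean linksLoaded;  // links filled by readDump() or readTriples()

    void setLanguages() {
        languagesSet = true;
    }

    void readTriples() {
        if (!languagesSet) {
            throw new IllegalStateException("call setLanguages() before readTriples()");
        }
        if (triplesRead) {
            throw new IllegalStateException("readTriples() must not be called twice");
        }
        triplesRead = true;
        linksLoaded = true;
    }

    void readDump() {
        linksLoaded = true;
    }

    void writeTriples() {
        if (!linksLoaded) {
            throw new IllegalStateException("call readDump() or readTriples() before writeTriples()");
        }
    }
}
```

With such a guard, calling writeTriples() on a fresh instance fails immediately instead of working on uninitialized fields.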
Write links to interlanguage-links-same-as and interlanguage-links-see-also triple files. Must be called after either readDump() or readTriples() and sortLinks(). Accesses the following fields: titles, domains, links.
TODO: it would be nice if we could also produce quad files, not just triple files. But we don't want to store all context URIs in memory, so this method should process the input files again rather than just serialize the link array. We would have to refactor readTriples() so that its main loop can be reused by writeTriples(). Only the innermost part of the main loop would differ, plus some setup and cleanup code before and after it.
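One possible shape for the refactoring this TODO describes is a shared loop that takes the differing parts as a callback. The following is a rough sketch under several assumptions: LinkHandler and TripleLoop are hypothetical names, input lines are assumed to be whitespace-separated "<s> <p> <o> ." triples or "<s> <p> <o> <context> ." quads with URI objects, and the real parsing in readTriples() is likely more involved.

```java
// Hypothetical callback: the parts that would differ between readTriples() and writeTriples().
interface LinkHandler {
    void begin();                                              // work before the main loop
    void link(String subject, String object, String context);  // innermost part of the loop
    void end();                                                // work after the main loop
}

// Hypothetical shared main loop over the lines of an input file.
final class TripleLoop {
    static void run(Iterable<String> lines, LinkHandler handler) {
        handler.begin();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 4) {
                continue; // not a complete triple or quad
            }
            // A quad carries the context URI in the fourth position; a triple has none.
            String context = parts.length >= 5 ? parts[3] : null;
            handler.link(parts[0], parts[2], context);
        }
        handler.end();
    }
}
```

A reading pass would use a handler that fills the link array, while a writing pass could re-run the same loop with a handler that emits each line together with its context, so context URIs never need to be held in memory.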