org.dbpedia.extraction.scripts
Split multistream Wikipedia dumps (e.g. [1]) into size-configurable chunks.
[1] http://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles-multistream.xml.bz2
Note: This script only works with multistream dumps!
Usage: ../run WikipediaDumpSplitter /path/to/multistream/dump/enwiki-latest-pages-articles-multistream.xml.bz2 /path/to/multistream/dump/index/enwiki-latest-pages-articles-multistream-index.txt.bz2 /output/directory 64
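The reason a multistream dump can be split at all is that it is a concatenation of independent bzip2 streams, and the index file lists one `offset:page_id:title` line per page, where `offset` is the byte position of the stream containing that page. The following Python sketch illustrates the idea only; it is not the code of WikipediaDumpSplitter. The function names `parse_index_offsets`, `plan_chunks` and `read_chunk` are hypothetical, and the size limit is given in bytes here because the unit of the script's final argument is not stated above.

```python
import bz2
from typing import BinaryIO, Iterable, List, Tuple


def parse_index_offsets(lines: Iterable[str]) -> List[int]:
    """Return the sorted, distinct stream offsets found in index lines."""
    offsets = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Only the first field is needed; titles may themselves contain ':'.
        offsets.add(int(line.split(":", 1)[0]))
    return sorted(offsets)


def plan_chunks(offsets: List[int], end_offset: int,
                max_bytes: int) -> List[Tuple[int, int]]:
    """Group consecutive streams into (start, end) byte ranges.

    Each range holds as many whole streams as fit into max_bytes.
    A single stream larger than max_bytes still gets its own range,
    because a bzip2 stream cannot be cut in the middle.
    """
    chunks: List[Tuple[int, int]] = []
    if not offsets:
        return chunks
    boundaries = offsets + [end_offset]
    start = boundaries[0]
    for prev, nxt in zip(boundaries, boundaries[1:]):
        # prev..nxt is one stream; close the current chunk before it
        # if adding it would exceed the limit and the chunk is not empty.
        if nxt - start > max_bytes and prev > start:
            chunks.append((start, prev))
            start = prev
    chunks.append((start, end_offset))
    return chunks


def read_chunk(dump: BinaryIO, start: int, end: int) -> bytes:
    """Decompress the whole streams stored in dump[start:end]."""
    dump.seek(start)
    # bz2.decompress accepts several concatenated streams (Python 3.3+).
    return bz2.decompress(dump.read(end - start))
```

In this sketch `end_offset` is assumed to be the byte position where the last page stream ends. The bytes before the first indexed offset and after the last page stream hold the XML header and footer, so a real splitter would have to add them to every chunk to produce well-formed XML; that step is left out here.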