org.dspace.testing
Class PubMedToImport
java.lang.Object
org.dspace.testing.PubMedToImport
public class PubMedToImport
extends Object
A simple class that transforms a medline.xml file from PubMed into one or more DSpace import packages.
This is a distinctly incomplete implementation: it doesn't even attempt to map a number of fields,
and it has no means of customizing the mapping. More importantly, it makes assumptions when parsing the XML
that would be problematic for a production instance.
However, it does use SAX parsing, so it has no problem handling an input file of 1GB or more.
That makes it a good way to generate a large number of realistic import packages very quickly:
simply go to http://www.ncbi.nlm.nih.gov/pubmed and search for something that returns a lot of records
('nature' returns over 300,000, for example). Download the results as a medline.xml file (and yes, it will attempt
to download all 300,000+ records into a single file), then run this class over that file to produce import packages,
which can then be loaded into DSpace using ItemImport.
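The reason SAX copes with very large input is that the parser pushes events to a handler as it reads, so only the record currently being processed needs to be held in memory. The following is a minimal sketch of that pattern, not the code of PubMedToImport itself. The class name MedlineSaxSketch, its methods, and the hand-written sample document are hypothetical, and the sketch assumes that each record's title appears in an ArticleTitle element.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; it is not part of DSpace.
class MedlineSaxSketch {

    /** Hands each ArticleTitle to a sink as soon as its closing tag is seen. */
    static class TitleHandler extends DefaultHandler {
        private final Consumer<String> sink;
        private StringBuilder current; // non-null only while inside an ArticleTitle

        TitleHandler(Consumer<String> sink) {
            this.sink = sink;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            // Assumption: titles live in an element named ArticleTitle.
            if ("ArticleTitle".equals(qName)) {
                current = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so accumulate them.
            if (current != null) {
                current.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("ArticleTitle".equals(qName) && current != null) {
                sink.accept(current.toString().trim());
                current = null; // release the buffer before the next record
            }
        }
    }

    /** Streams the input once; memory use does not grow with the number of records. */
    static void streamTitles(InputStream in, Consumer<String> sink) throws Exception {
        SAXParserFactory.newInstance().newSAXParser().parse(in, new TitleHandler(sink));
    }

    /** Convenience wrapper for small in-memory samples. */
    static List<String> titlesOf(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            streamTitles(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), titles::add);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        // Hand-written sample, not real PubMed output.
        String sample = "<PubmedArticleSet>"
                + "<PubmedArticle><Article><ArticleTitle>First sample title</ArticleTitle></Article></PubmedArticle>"
                + "<PubmedArticle><Article><ArticleTitle>Second sample title</ArticleTitle></Article></PubmedArticle>"
                + "</PubmedArticleSet>";
        titlesOf(sample).forEach(System.out::println);
    }
}
```

For a real multi-gigabyte download, the sink passed to streamTitles would write each record out to its own import package rather than collect titles in a list, so that nothing accumulates across records.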
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PubMedToImport
public PubMedToImport()
main
public static void main(String[] args)
Copyright © 2012 DuraSpace. All Rights Reserved.