org.dspace.testing
Class PubMedToImport

java.lang.Object
  extended by org.dspace.testing.PubMedToImport

public class PubMedToImport
extends Object

Simple class to transform a medline.xml file from PubMed into DSpace import package(s). This is a distinctly incomplete implementation: it doesn't even attempt to map a number of fields, and it has no means of customizing the mapping. More importantly, it makes assumptions in parsing the XML that would be problematic for a production instance. However, it does use SAX parsing, so it has no problem handling a 1GB+ input file. That makes it a good way to generate a large number of realistic import packages very quickly: go to http://www.ncbi.nlm.nih.gov/pubmed and search for something that returns a lot of records ('nature' returns over 300,000, for example). Download the results as a medline.xml file (and yes, it will attempt to download all 300,000+ records into a single file), then run this class over that file to produce import packages, which can then be loaded into DSpace using ItemImport.
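The following is a minimal sketch of the streaming approach described above, not the actual code of this class. It assumes the input nests each title in an ArticleTitle element inside a PubmedArticle record; the class name MedlineTitleSketch and the method forEachTitle are hypothetical and exist only for illustration. Each title is handed to a callback as soon as its element closes, so memory use does not grow with the size of the file.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of DSpace.
class MedlineTitleSketch {

    /** Streams the input and passes each ArticleTitle text to the sink. */
    static void forEachTitle(InputStream in, Consumer<String> sink) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(in, new DefaultHandler() {
            // Non-null only while the parser is inside an ArticleTitle element.
            private StringBuilder text;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if ("ArticleTitle".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so append.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("ArticleTitle".equals(qName) && text != null) {
                    // Emit immediately; a converter would write a package here.
                    sink.accept(text.toString().trim());
                    text = null;
                }
            }
        });
    }

    public static void main(String[] args) throws Exception {
        // Tiny inline sample standing in for a downloaded medline.xml file.
        String xml = "<PubmedArticleSet>"
                + "<PubmedArticle><MedlineCitation><Article>"
                + "<ArticleTitle>First title</ArticleTitle>"
                + "</Article></MedlineCitation></PubmedArticle>"
                + "<PubmedArticle><MedlineCitation><Article>"
                + "<ArticleTitle>Second title</ArticleTitle>"
                + "</Article></MedlineCitation></PubmedArticle>"
                + "</PubmedArticleSet>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        forEachTitle(in, System.out::println);
    }
}
```

Using a callback rather than returning a list keeps the sketch faithful to the reason SAX is chosen here: nothing is accumulated across records. A real download may also carry a DOCTYPE declaration, which this sample omits.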


Constructor Summary
PubMedToImport()
           
 
Method Summary
static void main(String[] args)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PubMedToImport

public PubMedToImport()

Method Detail

main

public static void main(String[] args)


Copyright © 2012 DuraSpace. All Rights Reserved.