Class TextMDMetadata


  • public class TextMDMetadata
    extends Object
    Encapsulation of the textMD metadata for text files. See http://www.loc.gov/standards/textMd for more information.
    Author:
    Thomas Ledoux
    • Constructor Detail

      • TextMDMetadata

        public TextMDMetadata()
    • Method Detail

      • getCharset

        public String getCharset()
        Returns:
        the charset
      • setCharset

        public void setCharset(String charset)
        Parameters:
        charset - the charset to set
      • getByte_order

        public int getByte_order()
        Returns:
        the byte_order
      • getByte_orderString

        public String getByte_orderString()
        Returns:
        the byte_order in String form
      • setByte_order

        public void setByte_order(int byte_order)
        Parameters:
        byte_order - the byte_order to set
      • getByte_size

        public String getByte_size()
        Returns:
        the byte_size
      • setByte_size

        public void setByte_size(String byte_size)
        Parameters:
        byte_size - the byte_size to set
      • getCharacter_size

        public String getCharacter_size()
        Returns:
        the character_size
      • setCharacter_size

        public void setCharacter_size(String character_size)
        Parameters:
        character_size - the character_size to set
      • getLinebreak

        public int getLinebreak()
        Returns:
        the linebreak
      • getLinebreakString

        public String getLinebreakString()
        Returns:
        the linebreak in String form
      • setLinebreak

        public void setLinebreak(int linebreak)
        Parameters:
        linebreak - the linebreak to set
      • getLanguage

        public String getLanguage()
        Returns:
        the language
      • setLanguage

        public void setLanguage(String language)
        Parameters:
        language - the language to set
      • getMarkup_basis

        public String getMarkup_basis()
        Returns:
        the markup_basis
      • setMarkup_basis

        public void setMarkup_basis(String markup_basis)
        Parameters:
        markup_basis - the markup_basis to set
      • getMarkup_basis_version

        public String getMarkup_basis_version()
        Returns:
        the markup_basis_version
      • setMarkup_basis_version

        public void setMarkup_basis_version(String markup_basis_version)
        Parameters:
        markup_basis_version - the markup_basis_version to set
      • getMarkup_language

        public String getMarkup_language()
        Returns:
        the markup_language
      • setMarkup_language

        public void setMarkup_language(String markup_language)
        Parameters:
        markup_language - the markup_language to set
      • getMarkup_language_version

        public String getMarkup_language_version()
        Returns:
        the markup_language_version
      • setMarkup_language_version

        public void setMarkup_language_version(String markup_language_version)
        Parameters:
        markup_language_version - the markup_language_version to set
      • toTextMDCharset

        public static String toTextMDCharset(String srcCharset)
        Transforms a given charset name into one from the "authorized" list defined by the textMD schema enumeration. From the schema documentation on charset (http://www.loc.gov/standards/textMD/elementSet/index.html#element_charset): "The character set employed by the text. Controlled vocab using IANA names for character sets: http://www.iana.org/assignments/character-sets." The problem arises because the Java Charset uses the preferred MIME name, whereas textMD uses the Name ...
        Parameters:
        srcCharset - charset from the file
        Returns:
        normalized charset
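
        The naming mismatch described above can be illustrated with a minimal sketch. This is not the implementation behind toTextMDCharset: the class CharsetNameSketch, the method toIanaName, and the two-entry lookup table are hypothetical and exist only for illustration. The sketch assumes that the JVM can resolve the input to a canonical Java name and that only ISO-8859-1 and ISO-8859-2 need remapping to their IANA registry names.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the documented class. */
class CharsetNameSketch {

    // Illustrative subset: Java canonical name -> IANA registry "Name".
    // A real table would have to cover the whole textMD enumeration.
    private static final Map<String, String> JAVA_TO_IANA = Map.of(
            "ISO-8859-1", "ISO_8859-1:1987",
            "ISO-8859-2", "ISO_8859-2:1987");

    static String toIanaName(String srcCharset) {
        if (srcCharset == null) {
            return null;
        }
        String canonical;
        try {
            // Resolve aliases such as "latin1" to the JVM's canonical name.
            canonical = Charset.forName(srcCharset.trim()).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Assumption: names the JVM does not know are passed through unchanged.
            return srcCharset;
        }
        // Fall back to the canonical name when no remapping is listed.
        return JAVA_TO_IANA.getOrDefault(canonical, canonical);
    }
}
```

        Under these assumptions, CharsetNameSketch.toIanaName("latin1") resolves the alias to ISO-8859-1 and then returns ISO_8859-1:1987, while a name without a table entry, such as UTF-8, is returned in its canonical Java form.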
      • toISO_639_2

        public static String toISO_639_2(String srcLang)
        Transforms a language into its ISO 639-2 code (the only enumeration allowed in the textMD schema).
        Parameters:
        srcLang - language in the file
        Returns:
        normalized three-letter language code (except qaa-qtz)
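
        A comparable normalization can be sketched with the standard java.util.Locale class. This is not the implementation behind toISO_639_2: the class Iso639Sketch and the method toThreeLetter are hypothetical, and falling back to "und" (the ISO 639-2 code for an undetermined language) is an assumption made for this example. Note that Locale.getISO3Language() returns the terminology form of the code (for example "fra" and "deu"), which may differ from the form the textMD enumeration expects.

```java
import java.util.Locale;
import java.util.MissingResourceException;

/** Hypothetical helper for illustration only; not part of the documented class. */
class Iso639Sketch {

    // Assumption: unknown or missing languages are reported as "und".
    private static final String UNDETERMINED = "und";

    static String toThreeLetter(String srcLang) {
        if (srcLang == null || srcLang.trim().isEmpty()) {
            return UNDETERMINED;
        }
        // Accept both "en-US" and "en_US" style inputs.
        String tag = srcLang.trim().replace('_', '-');
        try {
            // forLanguageTag never throws; ill-formed tags yield an empty language.
            String iso3 = Locale.forLanguageTag(tag).getISO3Language();
            return iso3.isEmpty() ? UNDETERMINED : iso3;
        } catch (MissingResourceException e) {
            // Thrown when the JVM has no three-letter code for the language.
            return UNDETERMINED;
        }
    }
}
```

        Under these assumptions, Iso639Sketch.toThreeLetter("fr") returns "fra" and Iso639Sketch.toThreeLetter("en_US") returns "eng"; region subtags are ignored because only the language part of the tag is consulted.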