Class Path

  • Direct Known Subclasses:
    UXAddress

    public class Path
    extends Object
    Represents the content of an RFC822 (and successors) eMail address header, either From or To, or subsets thereof.
    In domain literals (square brackets), the General-address-literal syntax is not recognised (downstream MTAs cannot support it, as no use is specified yet), and an IPv6 Zone Identifier is not supported because it is for special local use only.
    Handling of line endings is lenient: CRLF := ([CR] LF) / CR
    Create a new instance via the of(String) factory method by passing it the address list string to analyse. Then call one of the parse methods on the instance: asAddressList() to validate recipients, asMailboxList() or forSender(boolean) for message senders (but read their JavaDoc). Validating unlabelled addr-specs is possible with asAddrSpec().
    Author:
    mirabilos (t.glaser@tarent.de)
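The lenient line-ending rule from the class description, CRLF := ([CR] LF) / CR, accepts CR LF, a bare LF, or a bare CR. The following is a minimal sketch of such a matcher; the class name LenientCrlf and the method lineEndingLength are hypothetical and not part of Path, whose actual implementation is not shown on this page.

```java
final class LenientCrlf {
    /**
     * Returns the length (0, 1 or 2 chars) of the line ending that starts
     * at {@code pos}, following CRLF := ([CR] LF) / CR.
     * Hypothetical helper, for illustration only.
     */
    static int lineEndingLength(String s, int pos) {
        if (pos < 0 || pos >= s.length()) {
            return 0;                       // nothing left to match
        }
        char c = s.charAt(pos);
        if (c == '\n') {
            return 1;                       // bare LF
        }
        if (c != '\r') {
            return 0;                       // not a line ending at all
        }
        // CR: consume a directly following LF as part of the same ending
        boolean lfFollows = pos + 1 < s.length() && s.charAt(pos + 1) == '\n';
        return lfFollows ? 2 : 1;
    }

    public static void main(String[] args) {
        System.out.println(lineEndingLength("a\r\nb", 1));
        System.out.println(lineEndingLength("a\nb", 1));
        System.out.println(lineEndingLength("a\rb", 1));
    }
}
```

Note that LF CR is treated as two separate line endings under this rule, since the optional CR may only precede the LF.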
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      class  Path.Address
      Representation for an address (either mailbox or group)
      class  Path.AddressList
      Representation for an address-list or a mailbox-list
      class  Path.AddrSpec
      Representation for an addr-spec (eMail address) comprised of localPart and domain
      protected class  Path.AddrSpecSIDE
      Representation for a local-part (FWS unfolded) or a domain (dot-atom only)
      static interface  Path.ParserResult
      Methods all Path parser results implement.
      protected class  Path.UnfoldedSubstring
      Representation for a substring of the input string, FWS unfolded
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected Path​(String input)
      Private constructor, use the factory method of(String) instead
    • Method Summary

      Modifier and Type Method Description
      protected int accept()
      Advances the current position to the next character
      Path.AddressList asAddressList()
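      Parses the address as address-list, e.g. for the Reply-To, To, Cc, (optionally) Bcc, Resent-To, Resent-Cc and (optionally) Resent-Bcc headers.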
      Path.AddrSpec asAddrSpec()
      Parses the address as addr-spec (unlabelled address). This method is mostly used in input validation.
      Path.AddressList asMailboxList()
      Parses the address as mailbox-list, e.g. for the From and Resent-From headers.
      protected int bra​(int deltapos)
      Jumps to a specified input character position, relative jump
      protected int cur()
      Returns the wide character at the current position
      Path.Address forSender​(boolean allowRFC6854forLimitedUse)
      Parses the address for the Sender and Resent-Sender headers. These headers normally use the mailbox production, but RFC6854 allows for the address production, with the RFC2026 §3.3(d) Limited Use caveat that permits it only in specific circumstances.
      protected static boolean is​(int c, byte what)  
      protected static boolean isAtext​(int c)  
      protected static boolean isCtext​(int c)  
      protected static boolean isDtext​(int c)  
      protected static boolean isQtext​(int c)  
      protected int jmp​(int pos)
      Jumps to a specified input character position, absolute jump
      protected static <T extends org.evolvis.tartools.rfc822.Parser> T of(Class<T> cls, String input)
      Constructs a parser.
      static Path of​(String addresses)
      Creates and initialises a new parser for eMail addresses.
      protected Path.Address pAddress()  
      protected Path.AddressList pAddressList()  
      protected Path.AddrSpec pAddrSpec()  
      protected Path.AddrSpec pAngleAddr()  
      protected org.evolvis.tartools.rfc822.Path.Word pAtom()
      Returns the parse result of the atom production: result.body is a raw Substring of the atom, with surrounding CFWS stripped (no unfolding necessary), no extra data; result.cfws is null or the trailing CFWS as raw Substring, not unfolded
      protected boolean pCcontent()  
      protected org.evolvis.tartools.rfc822.Parser.Substring pCFWS()
      Parses CFWS
      protected org.evolvis.tartools.rfc822.Parser.Substring pComment()
      Parses comment
      protected org.evolvis.tartools.rfc822.Parser.Substring pDisplayName()  
      protected org.evolvis.tartools.rfc822.Parser.Substring pDomain()  
      protected org.evolvis.tartools.rfc822.Parser.Substring pDomainLiteral()  
      protected org.evolvis.tartools.rfc822.Parser.Substring pDotAtom()  
      protected int peek()
      Returns the wide character after the one at the current position
      protected org.evolvis.tartools.rfc822.Parser.Substring pFWS()
      Parses FWS
      protected Path.Address pGroup()  
      protected Path.AddrSpecSIDE pLocalPart()  
      protected Path.Address pMailbox()  
      protected Path.AddressList pMailboxList()  
      protected Path.Address pNameAddr()  
      protected int pos()
      Returns the current input character position, for saving and restoring (with jmp(int)) and for error messages
      protected org.evolvis.tartools.rfc822.Parser.Substring pPhrase()  
      protected int pQcontent()  
      protected int pQuotedPair()  
      protected org.evolvis.tartools.rfc822.Path.Word pQuotedString()
      Returns the parse result of the quoted-string production: result.body is a Path.UnfoldedSubstring of the entire quoted string, with surrounding double quotes; its String data is dequoted and backslash-removed; result.cfws is null or the trailing CFWS as raw Substring, not unfolded
      protected org.evolvis.tartools.rfc822.Path.Word pWord()  
      protected String s()
      Returns the input string, for use with substring comparisons (this is safe because Java™ strings are immutable)
      protected int skip​(BiFunction<Integer,​Integer,​Boolean> matcher)
      Advances the current position as long as the matcher returns true and end of input is not yet reached
      protected int skip​(Function<Integer,​Boolean> matcher)
      Advances the current position as long as the matcher returns true and end of input is not yet reached; cf. skip(BiFunction)
      static String unfold​(String s)
      Removes all occurrences of CR and/or LF from a string.
      protected org.evolvis.tartools.rfc822.Parser.Substring unfold​(org.evolvis.tartools.rfc822.Parser.Substring ss)
      Unfolds FWS in the passed Substring if necessary
    • Constructor Detail

      • Path

        protected Path​(String input)
        Private constructor, use the factory method of(String) instead
        Parameters:
        input - string to analyse
    • Method Detail

      • is

        protected static boolean is​(int c,
                                    byte what)
      • isAtext

        protected static boolean isAtext​(int c)
      • isCtext

        protected static boolean isCtext​(int c)
      • isDtext

        protected static boolean isDtext​(int c)
      • isQtext

        protected static boolean isQtext​(int c)
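The character-class predicates above carry no description on this page. As a sketch of what such tests could look like, the following uses a flag table over the printable US-ASCII ranges that RFC 5322 defines for atext, ctext, dtext and qtext. The class name CharClassSketch, the flag constants and the table layout are assumptions for illustration; the real is(int, byte) may use different flags and may treat non-ASCII input differently.

```java
final class CharClassSketch {
    // Hypothetical flag bits; the library's actual values are not documented here.
    static final byte ATEXT = 1, CTEXT = 2, DTEXT = 4, QTEXT = 8;

    private static final byte[] TABLE = new byte[128];

    static {
        final String atextSpecials = "!#$%&'*+-/=?^_`{|}~";
        for (int c = 33; c <= 126; c++) {           // printable US-ASCII only
            byte flags = 0;
            if (Character.isLetterOrDigit(c) || atextSpecials.indexOf(c) >= 0) {
                flags |= ATEXT;                     // ALPHA / DIGIT / specials
            }
            if (c != '(' && c != ')' && c != '\\') {
                flags |= CTEXT;                     // %d33-39 / %d42-91 / %d93-126
            }
            if (c != '[' && c != ']' && c != '\\') {
                flags |= DTEXT;                     // %d33-90 / %d94-126
            }
            if (c != '"' && c != '\\') {
                flags |= QTEXT;                     // %d33 / %d35-91 / %d93-126
            }
            TABLE[c] = flags;
        }
    }

    /** Tests whether codepoint c belongs to the class selected by the flag. */
    static boolean is(int c, byte what) {
        return c >= 0 && c < TABLE.length && (TABLE[c] & what) != 0;
    }

    static boolean isAtext(int c) { return is(c, ATEXT); }
    static boolean isCtext(int c) { return is(c, CTEXT); }
    static boolean isDtext(int c) { return is(c, DTEXT); }
    static boolean isQtext(int c) { return is(c, QTEXT); }

    public static void main(String[] args) {
        System.out.println(isAtext('a') + " " + isAtext('@') + " " + isDtext('['));
    }
}
```

Passing -1 (the end-of-input value returned by cur() and peek()) yields false for every class in this sketch, which lets the predicates be used directly as skip matchers.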
      • unfold

        public static String unfold​(String s)
        Removes all occurrences of CR and/or LF from a string.
        Parameters:
        s - input string
        Returns:
        null if there was nothing to remove, a new shorter String otherwise
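The return contract above (null when nothing had to be removed, a new shorter String otherwise) can be sketched as follows. This is a stand-alone illustration, not the library's code; the class name UnfoldSketch is hypothetical, and the sketch assumes a non-null argument because the behaviour for null input is not documented here.

```java
final class UnfoldSketch {
    /**
     * Removes every CR and LF from s.
     * Returns null if s contained neither, mirroring the documented contract.
     */
    static String unfold(String s) {
        StringBuilder sb = null;                    // allocated lazily on first CR/LF
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '\r' || c == '\n') {
                if (sb == null) {
                    // copy the untouched prefix once, then continue filtering
                    sb = new StringBuilder(s.length()).append(s, 0, i);
                }
            } else if (sb != null) {
                sb.append(c);
            }
        }
        return sb == null ? null : sb.toString();
    }

    public static void main(String[] args) {
        String folded = "Alice\r\n <alice@example.com>";
        String unfolded = unfold(folded);
        // fall back to the original when there was nothing to remove
        System.out.println(unfolded == null ? folded : unfolded);
    }
}
```

Callers need the null check shown in main, since a null result means "use the original string unchanged".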
      • unfold

        protected org.evolvis.tartools.rfc822.Parser.Substring unfold​(org.evolvis.tartools.rfc822.Parser.Substring ss)
        Unfolds FWS in the passed Substring if necessary
        Parameters:
        ss - Parser.Substring to unfold
        Returns:
        instance of an unfolded equivalent of the original substring
      • of

        public static Path of​(String addresses)
        Creates and initialises a new parser for eMail addresses.
        Parameters:
        addresses - to parse
        Returns:
        null if addresses was null or very large, the new instance otherwise
      • asMailboxList

        public Path.AddressList asMailboxList()
        Parses the address as mailbox-list, e.g. for the From and Resent-From headers (but see asAddressList() for RFC6854’s RFC2026 §3.3(d) Limited Use)
        Returns:
        parser result; remember to call isValid() on it first!
      • forSender

        public Path.Address forSender​(boolean allowRFC6854forLimitedUse)
        Parses the address for the Sender and Resent-Sender headers. These headers normally use the mailbox production, but RFC6854 allows for the address production, with the RFC2026 §3.3(d) Limited Use caveat that permits it only in specific circumstances.
        Parameters:
        allowRFC6854forLimitedUse - use address instead of mailbox parsing
        Returns:
        parser result; remember to call isValid() on it first!
      • asAddrSpec

        public Path.AddrSpec asAddrSpec()
        Parses the address as addr-spec (unlabelled address). This method is mostly used in input validation. In most cases, use forSender(false) instead, which allows “user <lcl@example.com>”, then extract the addr-spec “lcl@example.com” from the return value.
        Returns:
        parser result; remember to call isValid() on it first!
      • asAddressList

        public Path.AddressList asAddressList()
        Parses the address as address-list, e.g. for the Reply-To, To, Cc, (optionally) Bcc, Resent-To, Resent-Cc and (optionally) Resent-Bcc headers. RFC6854 (under RFC2026 §3.3(d) Limited Use circumstances) allows using this for the From and Resent-From headers, normally covered by the asMailboxList() method.
        Returns:
        parser result; remember to call isValid() on it first!
      • pDisplayName

        protected org.evolvis.tartools.rfc822.Parser.Substring pDisplayName()
      • pPhrase

        protected org.evolvis.tartools.rfc822.Parser.Substring pPhrase()
      • pWord

        protected org.evolvis.tartools.rfc822.Path.Word pWord()
      • pAtom

        protected org.evolvis.tartools.rfc822.Path.Word pAtom()
        Returns the parse result of the atom production:
          • result.body is a raw Substring of the atom, with surrounding CFWS stripped (no unfolding necessary), no extra data
          • result.cfws is null or the trailing CFWS as raw Substring, not unfolded
        Returns:
        result (see above)
      • pQuotedPair

        protected int pQuotedPair()
      • pQcontent

        protected int pQcontent()
      • pQuotedString

        protected org.evolvis.tartools.rfc822.Path.Word pQuotedString()
        Returns the parse result of the quoted-string production:
          • result.body is a Path.UnfoldedSubstring of the entire quoted string, with surrounding double quotes; its String data is dequoted and backslash-removed
          • result.cfws is null or the trailing CFWS as raw Substring, not unfolded
        Returns:
        result (see above)
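To illustrate what "dequoted and backslash-removed" means for the String data of a quoted-string, here is a minimal stand-alone sketch. The class DequoteSketch and its method are hypothetical; the sketch assumes the input is an already validated quoted-string including its surrounding double quotes, and it also drops CR and LF to mimic unfolding.

```java
final class DequoteSketch {
    /**
     * Turns a raw quoted-string (with its surrounding double quotes) into
     * its data: quotes stripped, quoted-pairs reduced to the escaped
     * character, CR/LF removed. Illustration only.
     */
    static String dequote(String raw) {
        int end = raw.length() - 1;
        if (end < 1 || raw.charAt(0) != '"' || raw.charAt(end) != '"') {
            throw new IllegalArgumentException("not a quoted-string: " + raw);
        }
        StringBuilder sb = new StringBuilder(end);
        for (int i = 1; i < end; i++) {
            char c = raw.charAt(i);
            if (c == '\\' && i + 1 < end) {
                sb.append(raw.charAt(++i));         // quoted-pair: keep escaped char only
            } else if (c != '\r' && c != '\n') {
                sb.append(c);                       // ordinary qtext or whitespace
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dequote("\"John \\\"Q\\\" Public\""));
    }
}
```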
      • pFWS

        protected org.evolvis.tartools.rfc822.Parser.Substring pFWS()
        Parses FWS
        Returns:
        raw Substring, not unfolded
      • pCcontent

        protected boolean pCcontent()
      • pComment

        protected org.evolvis.tartools.rfc822.Parser.Substring pComment()
        Parses comment
        Returns:
        raw Substring, not unfolded (unfolded is human-visible form for now; may wish to simplify quoted-pairs)
      • pCFWS

        protected org.evolvis.tartools.rfc822.Parser.Substring pCFWS()
        Parses CFWS
        Returns:
        raw Substring, not unfolded
      • pDotAtom

        protected org.evolvis.tartools.rfc822.Parser.Substring pDotAtom()
      • pDomainLiteral

        protected org.evolvis.tartools.rfc822.Parser.Substring pDomainLiteral()
      • pDomain

        protected org.evolvis.tartools.rfc822.Parser.Substring pDomain()
      • of

        protected static <T extends org.evolvis.tartools.rfc822.Parser> T of​(Class<T> cls,
                                                                             String input)
        Constructs a parser. Intended to be used by subclasses from static factory methods *only*; see of(String) for an example.
        Type Parameters:
        T - subclass of Parser to construct
        Parameters:
        cls - subclass of Parser to construct
        input - user-provided String to parse
        Returns:
        null if input was null or too large, the new instance otherwise
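The pattern described above, a protected generic factory that subclasses call from their own static of(String) method, might look roughly like this. Everything here is a stand-in written for illustration: BaseParser and DemoParser are hypothetical classes, MAX_INPUT is an invented limit (the page only says "too large"), and the use of reflection is an assumption about how a Class<T> argument could be turned into an instance.

```java
import java.lang.reflect.Constructor;

class BaseParser {
    /** Invented size limit; the real threshold is not documented on this page. */
    static final int MAX_INPUT = 131072;

    private final String input;

    protected BaseParser(String input) {
        this.input = input;
    }

    protected final String s() {
        return input;
    }

    /** Returns null for null or oversized input, a new instance of cls otherwise. */
    protected static <T extends BaseParser> T of(Class<T> cls, String input) {
        if (input == null || input.length() > MAX_INPUT) {
            return null;
        }
        try {
            // assumes every subclass declares a constructor taking one String
            Constructor<T> ctor = cls.getDeclaredConstructor(String.class);
            ctor.setAccessible(true);
            return ctor.newInstance(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot construct " + cls.getName(), e);
        }
    }
}

final class DemoParser extends BaseParser {
    protected DemoParser(String input) {
        super(input);
    }

    /** Public factory that delegates to the protected generic one. */
    public static DemoParser of(String input) {
        return of(DemoParser.class, input);
    }

    String input() {
        return s();
    }

    public static void main(String[] args) {
        DemoParser p = DemoParser.of("user@example.com");
        System.out.println(p == null ? "rejected" : p.input());
    }
}
```

Returning null instead of throwing keeps the size and null checks in one place, at the cost of requiring every caller to test the factory result before use.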
      • jmp

        protected final int jmp​(int pos)
        Jumps to a specified input character position, absolute jump
        Parameters:
        pos - to jump to
        Returns:
        the codepoint at that position
        Throws:
        IndexOutOfBoundsException - if pos is not in or just past the input
      • bra

        protected final int bra​(int deltapos)
        Jumps to a specified input character position, relative jump
        Parameters:
        deltapos - to add to the current position
        Returns:
        the codepoint at that position
        Throws:
        IndexOutOfBoundsException - if the resulting position is not in or just past the input
      • pos

        protected final int pos()
        Returns the current input character position, for saving and restoring (with jmp(int)) and for error messages
        Returns:
        position
      • s

        protected final String s()
        Returns the input string, for use with substring comparisons (this is safe because Java™ strings are immutable)
        Returns:
        String input
      • cur

        protected final int cur()
        Returns the wide character at the current position
        Returns:
        UCS-4 codepoint, or -1 if end of input is reached
      • peek

        protected final int peek()
        Returns the wide character after the one at the current position
        Returns:
        UCS-4 codepoint, or -1 if end of input is reached
      • accept

        protected final int accept()
        Advances the current position to the next character
        Returns:
        codepoint of the next character, or -1 if end of input is reached
        Throws:
        IndexOutOfBoundsException - if end of input was already reached
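The cursor primitives documented above (jmp, bra, pos, cur, peek, accept) together support a save-and-restore style of backtracking. The following stand-alone sketch models their documented return values and exceptions; CursorSketch is a hypothetical class, and computing codepoints on demand with String.codePointAt is an assumption rather than the library's approach.

```java
final class CursorSketch {
    private final String src;
    private int ofs;                                // current offset in chars

    CursorSketch(String src) {
        this.src = src;
    }

    private int at(int i) {
        return i < src.length() ? src.codePointAt(i) : -1;
    }

    /** Current position, usable later as argument to jmp. */
    int pos() { return ofs; }

    /** Codepoint at the current position, or -1 at end of input. */
    int cur() { return at(ofs); }

    /** Codepoint after the current one, or -1 if there is none. */
    int peek() {
        int c = cur();
        return c == -1 ? -1 : at(ofs + Character.charCount(c));
    }

    /** Advances by one codepoint and returns the new current codepoint. */
    int accept() {
        int c = cur();
        if (c == -1) {
            throw new IndexOutOfBoundsException("already at end of input");
        }
        ofs += Character.charCount(c);
        return cur();
    }

    /** Absolute jump; pos may equal the input length ("just past"). */
    int jmp(int pos) {
        if (pos < 0 || pos > src.length()) {
            throw new IndexOutOfBoundsException("bad position " + pos);
        }
        ofs = pos;
        return cur();
    }

    /** Relative jump, expressed through jmp. */
    int bra(int deltapos) { return jmp(ofs + deltapos); }

    public static void main(String[] args) {
        CursorSketch c = new CursorSketch("a@b");
        int saved = c.pos();                        // remember where we started
        c.accept();
        c.accept();
        c.jmp(saved);                               // backtrack
        System.out.println((char) c.cur());
    }
}
```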
      • skip

        protected final int skip​(BiFunction<Integer,​Integer,​Boolean> matcher)
        Advances the current position as long as the matcher returns true and end of input is not yet reached
        Parameters:
        matcher - gets called with cur() and peek() as arguments
        Returns:
        codepoint of the first character where the matcher returned false, or -1
      • skip

        protected final int skip​(Function<Integer,​Boolean> matcher)
        Advances the current position as long as the matcher returns true and end of input is not yet reached; cf. skip(BiFunction)
        Parameters:
        matcher - gets called with just cur() as argument
        Returns:
        codepoint of the first character where the matcher returned false, or -1
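The two skip variants differ only in what the matcher sees: cur() alone, or cur() together with peek() as lookahead. A minimal sketch of how both could sit on top of a cursor is shown below; SkipSketch is a hypothetical stand-alone class, and routing the one-argument overload through the two-argument one is a design assumption.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

final class SkipSketch {
    private final String src;
    private int ofs;

    SkipSketch(String src) {
        this.src = src;
    }

    private int at(int i) {
        return i < src.length() ? src.codePointAt(i) : -1;
    }

    int pos() { return ofs; }

    int cur() { return at(ofs); }

    int peek() {
        int c = cur();
        return c == -1 ? -1 : at(ofs + Character.charCount(c));
    }

    /** Matcher receives cur() and peek(); stops at first false or at end of input. */
    int skip(BiFunction<Integer, Integer, Boolean> matcher) {
        while (cur() != -1 && matcher.apply(cur(), peek())) {
            ofs += Character.charCount(cur());
        }
        return cur();
    }

    /** Matcher receives just cur(); implemented via the two-argument variant. */
    int skip(Function<Integer, Boolean> matcher) {
        return skip((c, next) -> matcher.apply(c));
    }

    public static void main(String[] args) {
        SkipSketch s = new SkipSketch("  \tfoo");
        int stop = s.skip(c -> c == ' ' || c == '\t');   // skip leading whitespace
        System.out.println((char) stop + " at " + s.pos());
    }
}
```

The lookahead form is useful for productions such as quoted-pair, where whether to continue depends on the character following a backslash.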