Class Path
- java.lang.Object
-
- org.evolvis.tartools.rfc822.Path
-
- Direct Known Subclasses:
UXAddress
public class Path extends Object
Represents an eMail address header content (parser). That is, RFC822 (and successors) From, To, and subsets, for use on the public internet. Handling of line endings is lenient: CRLF := ([CR] LF) / CR

In domain literals (square brackets), the General-address-literal syntax is not recognised because downstream MUAs cannot support it, as no use is specified at the moment. Similarly, an IPv6 scope (Zone Identifier) is not supported because this parser targets use on the general internet. This class is concerned with on-wire formats; separate classes will implement MIME support and the like later.

To use, create a new instance via the of(String) factory method, passing the string to analyse for eMail address(es). Then call one of the parse methods on the instance, depending on what to expect:

- asAddrSpec() checks for an unlabelled addr-spec, such as foo@example.com, which is useful for MSA invocations.
- forSender(boolean) with a false argument validates one mailbox, that is Foo <foo@example.com>, such as used for the Sender header. Labels must be ASCII and conform to the RFC.
- forSender(boolean) with a true argument validates one address, that is either a mailbox as above or a group (Test:a@example.com,b@example.com;); RFC6854 adds them to Sender headers under the RFC2026 §3.3(d) Limited Use caveat.
- asMailboxList() validates a comma-separated list of mailboxen as above, normally for the From header.
- asAddressList() validates a comma-separated list that can include a mix of mailbox and group addresses. It is normally used for recipient headers (To, …) but, under the same Limited Use caveat, can be used per RFC6854 for a From (and like) header.

All of these return an instance of Path.ParserResult, or null if the parsing failed. Path.ParserResult.isValid() will return true only if, in addition, extra syntax and semantic checks passed; only then can the address list be used safely on the public internet. Path.ParserResult.toString() pretty-prints the on-wire representation. Some result objects may have extra methods that can be useful.
- Author:
- mirabilos (t.glaser@tarent.de)
-
-
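The lenient line-ending rule from the class description, CRLF := ([CR] LF) / CR, can be illustrated with a small stand-alone sketch. This is not the library's code; the class name LenientCrlfSketch and the method lineEndingLength are hypothetical and only show which byte sequences the rule accepts (CR LF, a bare LF, or a lone CR).

```java
// Hypothetical illustration of the rule CRLF := ([CR] LF) / CR.
// Not part of org.evolvis.tartools.rfc822; names are made up for this sketch.
final class LenientCrlfSketch {
    /**
     * Returns the length (0, 1 or 2) of the lenient line ending
     * that starts at index i of s, or 0 if there is none.
     */
    static int lineEndingLength(String s, int i) {
        if (i < 0 || i >= s.length()) {
            return 0;
        }
        char c = s.charAt(i);
        if (c == '\n') {
            return 1; // bare LF: the optional CR is absent
        }
        if (c == '\r') {
            // CR LF counts as one ending; a lone CR is accepted as well
            boolean lfFollows = i + 1 < s.length() && s.charAt(i + 1) == '\n';
            return lfFollows ? 2 : 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(lineEndingLength("a\r\nb", 1));
        System.out.println(lineEndingLength("a\nb", 1));
        System.out.println(lineEndingLength("a\rb", 1));
    }
}
```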
Nested Class Summary
Modifier and Type | Class | Description
static class | Path.Address | Representation for an address (either mailbox or group).
static class | Path.AddressList | Representation for an address-list or a mailbox-list.
static class | Path.AddrSpec | Representation for an addr-spec (eMail address).
protected class | Path.AddrSpecSIDE | Representation for a local-part (FWS unfolded) or a domain (dot-atom only).
static interface | Path.ParserResult | Methods all Path parser results implement.
protected class | Path.UnfoldedSubstring | Representation for a substring of the input string, FWS unfolded.
-
Method Summary
Modifier and Type | Method | Description
protected int | accept() | Advances the current position to the next character.
Path.AddressList | asAddressList() | Parses the address as address-list, such as for the Reply-To, To, Cc, (optionally) Bcc, Resent-To, … headers.
Path.AddrSpec | asAddrSpec() | Parses the address as addr-spec (unlabelled address).
Path.AddressList | asMailboxList() | Parses the address as mailbox-list, such as for the From and Resent-From headers.
protected int | bra(int deltapos) | Jumps to a specified input character position, relative jump.
protected int | cur() | Returns the wide character at the current position.
Path.Address | forSender(boolean allowRFC6854forLimitedUse) | Parses the address for the Sender and Resent-Sender headers.
protected static boolean | is(int c, byte what) |
protected static boolean | isAtext(int c) |
protected static boolean | isCtext(int c) |
protected static boolean | isDtext(int c) |
protected static boolean | isQtext(int c) |
protected int | jmp(int pos) | Jumps to a specified input character position, absolute jump.
protected static <T extends org.evolvis.tartools.rfc822.Parser> T | of(Class<T> cls, String input) | Constructs a parser.
static Path | of(String addresses) | Creates and initialises a new (strict) parser for eMail addresses.
protected Path.Address | pAddress() |
protected Path.AddressList | pAddressList() |
protected Path.AddrSpec | pAddrSpec() |
protected Path.AddrSpec | pAngleAddr() |
protected org.evolvis.tartools.rfc822.Path.Word | pAtom() | Returns the parse result of the atom production.
protected boolean | pCcontent() |
protected org.evolvis.tartools.rfc822.Parser.Substring | pCFWS() | Parses CFWS.
protected org.evolvis.tartools.rfc822.Parser.Substring | pComment() | Parses comment.
protected org.evolvis.tartools.rfc822.Parser.Substring | pDisplayName() |
protected org.evolvis.tartools.rfc822.Parser.Substring | pDomain() |
protected org.evolvis.tartools.rfc822.Parser.Substring | pDomainLiteral() |
protected org.evolvis.tartools.rfc822.Parser.Substring | pDotAtom() |
protected int | peek() | Returns the wide character after the one at the current position.
protected org.evolvis.tartools.rfc822.Parser.Substring | pFWS() | Parses FWS.
protected Path.Address | pGroup() |
protected Path.AddrSpecSIDE | pLocalPart() |
protected Path.Address | pMailbox() |
protected Path.AddressList | pMailboxList() |
protected Path.Address | pNameAddr() |
protected int | pos() | Returns the current input character position.
protected org.evolvis.tartools.rfc822.Parser.Substring | pPhrase() |
protected int | pQcontent() |
protected int | pQuotedPair() |
protected org.evolvis.tartools.rfc822.Path.Word | pQuotedString() | Returns the parse result of the quoted-string production.
protected org.evolvis.tartools.rfc822.Path.Word | pWord() |
protected String | s() | Returns the input string, for use with substring comparisons.
protected int | skip(Function<Integer,Boolean> matcher) | Advances the current position using a regular matcher.
protected int | skipPeek(BiFunction<Integer,Integer,Boolean> matcher) | Advances the current position using a peeking matcher.
static String | unfold(String s) | Removes all occurrences of CR and/or LF from a string.
protected org.evolvis.tartools.rfc822.Parser.Substring | unfold(org.evolvis.tartools.rfc822.Parser.Substring ss) | Unfolds FWS in the passed Substring if necessary.
-
-
-
Constructor Detail
-
Path
protected Path(String input)
Private constructor. Use the factory method of(String) instead.
- Parameters:
input - string to analyse
-
-
Method Detail
-
is
protected static boolean is(int c, byte what)
-
isAtext
protected static boolean isAtext(int c)
-
isCtext
protected static boolean isCtext(int c)
-
isDtext
protected static boolean isDtext(int c)
-
isQtext
protected static boolean isQtext(int c)
-
unfold
public static String unfold(String s)
Removes all occurrences of CR and/or LF from a string.
- Parameters:
s - input string
- Returns:
- null if there was nothing to remove, a new shorter String otherwise
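The return contract described above (null when nothing had to be removed, a new shorter string otherwise) can be sketched as follows. This is a stand-alone approximation written from the description only, not the library's implementation; the class name UnfoldSketch is hypothetical, and a non-null argument is assumed.

```java
// Hypothetical re-implementation of the documented contract of unfold(String).
// Assumes s is non-null; the real method's null handling is not documented here.
final class UnfoldSketch {
    static String unfold(String s) {
        // Nothing to remove: signal that with null, so callers can keep the original.
        if (s.indexOf('\r') < 0 && s.indexOf('\n') < 0) {
            return null;
        }
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\r' && c != '\n') {
                sb.append(c); // keep everything except CR and LF
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String folded = "Foo\r\n <foo@example.com>";
        String unfolded = unfold(folded);
        // Fall back to the input when unfold reports "nothing removed".
        System.out.println(unfolded == null ? folded : unfolded);
    }
}
```

Returning null instead of an equal copy lets a caller detect cheaply whether the input contained any line breaks at all.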
-
unfold
protected org.evolvis.tartools.rfc822.Parser.Substring unfold(org.evolvis.tartools.rfc822.Parser.Substring ss)
Unfolds FWS in the passed Substring if necessary.
- Parameters:
ss - Parser.Substring to unfold
- Returns:
- instance of an unfolded equivalent of the original substring
-
of
public static Path of(String addresses)
Creates and initialises a new (strict) parser for eMail addresses.
- Parameters:
addresses - to parse
- Returns:
- null if addresses was null or very large, the new parser instance otherwise
-
asMailboxList
public Path.AddressList asMailboxList()
Parses the address as mailbox-list, such as for the From and Resent-From headers. However, see asAddressList() for the RFC6854 exception under RFC2026 §3.3(d) Limited Use.
- Returns:
- parser result; remember to call isValid() on it first!
-
forSender
public Path.Address forSender(boolean allowRFC6854forLimitedUse)
Parses the address for the Sender and Resent-Sender headers.

These headers normally use the mailbox production, but RFC6854 allows the address production under the RFC2026 §3.3(d) Limited Use caveat, which permits it only in specific circumstances.
- Parameters:
allowRFC6854forLimitedUse - use address instead of mailbox parsing
- Returns:
- parser result; remember to call isValid() on it first!
-
asAddrSpec
public Path.AddrSpec asAddrSpec()
Parses the address as addr-spec (unlabelled address).

This method is mostly used in input validation or for constructing arguments for invoking an MSA. In most cases it may be better to use forSender(false) instead, which permits mailboxen like “user <lcl@example.com>”, and then extract the addr-spec “lcl@example.com” from the return value via getMailbox().
- Returns:
- parser result; remember to call isValid() on it first!
-
asAddressList
public Path.AddressList asAddressList()
Parses the address as address-list, such as for the Reply-To, To, Cc, (optionally) Bcc, Resent-To, … headers. RFC6854 (under RFC2026 §3.3(d) Limited Use circumstances) permits using this production for the From and Resent-From headers, normally covered by the asMailboxList() method.
- Returns:
- parser result; remember to call isValid() on it first!
-
pAddressList
protected Path.AddressList pAddressList()
-
pMailboxList
protected Path.AddressList pMailboxList()
-
pAddress
protected Path.Address pAddress()
-
pGroup
protected Path.Address pGroup()
-
pMailbox
protected Path.Address pMailbox()
-
pNameAddr
protected Path.Address pNameAddr()
-
pAngleAddr
protected Path.AddrSpec pAngleAddr()
-
pDisplayName
protected org.evolvis.tartools.rfc822.Parser.Substring pDisplayName()
-
pPhrase
protected org.evolvis.tartools.rfc822.Parser.Substring pPhrase()
-
pWord
protected org.evolvis.tartools.rfc822.Path.Word pWord()
-
pAtom
protected org.evolvis.tartools.rfc822.Path.Word pAtom()
Returns the parse result of the atom production:
- result.body is a raw Parser.Substring of the atom, with surrounding CFWS stripped (no unfolding necessary), no extra data
- result.cfws is null or the trailing CFWS as raw Parser.Substring, not unfolded
- Returns:
- result (see above) as Path.Word
-
pQuotedPair
protected int pQuotedPair()
-
pQcontent
protected int pQcontent()
-
pQuotedString
protected org.evolvis.tartools.rfc822.Path.Word pQuotedString()
Returns the parse result of the quoted-string production:
- result.body is a Path.UnfoldedSubstring of the entire quoted string, with surrounding double quotes; its String data is dequoted and backslash-removed
- result.cfws is null or the trailing CFWS as raw Parser.Substring, not unfolded
- Returns:
- result (see above) as Path.Word
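What "dequoted and backslash-removed" means for a quoted-string can be shown with a small stand-alone sketch. It is not the library's code: the class DequoteSketch and its dequote method are hypothetical, the input is assumed to be an already validated quoted-string including its surrounding double quotes, and FWS unfolding (which the real result also applies) is left out.

```java
// Hypothetical illustration of dequoting an RFC 5322 quoted-string.
// Assumes a syntactically valid quoted-string; does not unfold FWS.
final class DequoteSketch {
    static String dequote(String quoted) {
        int last = quoted.length() - 1;
        if (last < 1 || quoted.charAt(0) != '"' || quoted.charAt(last) != '"') {
            throw new IllegalArgumentException("not a quoted-string: " + quoted);
        }
        StringBuilder sb = new StringBuilder(quoted.length());
        for (int i = 1; i < last; i++) {
            char c = quoted.charAt(i);
            if (c == '\\' && i + 1 < last) {
                // quoted-pair: drop the backslash, keep the escaped character
                c = quoted.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dequote("\"John \\\"Jack\\\" Doe\""));
    }
}
```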
-
pFWS
protected org.evolvis.tartools.rfc822.Parser.Substring pFWS()
Parses FWS.
- Returns:
- raw Parser.Substring, not unfolded
-
pCcontent
protected boolean pCcontent()
-
pComment
protected org.evolvis.tartools.rfc822.Parser.Substring pComment()
Parses comment.
- Returns:
- raw Parser.Substring, not unfolded (unfolded is human-visible form for now; may wish to simplify quoted-pairs)
-
pCFWS
protected org.evolvis.tartools.rfc822.Parser.Substring pCFWS()
Parses CFWS.
- Returns:
- raw Parser.Substring, not unfolded
-
pDotAtom
protected org.evolvis.tartools.rfc822.Parser.Substring pDotAtom()
-
pLocalPart
protected Path.AddrSpecSIDE pLocalPart()
-
pDomainLiteral
protected org.evolvis.tartools.rfc822.Parser.Substring pDomainLiteral()
-
pDomain
protected org.evolvis.tartools.rfc822.Parser.Substring pDomain()
-
pAddrSpec
protected Path.AddrSpec pAddrSpec()
-
of
protected static <T extends org.evolvis.tartools.rfc822.Parser> T of(Class<T> cls, String input)
Constructs a parser. Intended to be used by subclasses from static factory methods *only*; see of(String) for an example.
- Type Parameters:
T - subclass of Parser to construct
- Parameters:
cls - subclass of Parser to construct
input - user-provided String to parse
- Returns:
- null if input was null or too large, the new parser subclass instance otherwise
-
jmp
protected final int jmp(int pos)
Jumps to a specified input character position, absolute jump.
- Parameters:
pos - to jump to
- Returns:
- the codepoint at that position
- Throws:
IndexOutOfBoundsException - if pos is not in or just past the input
-
bra
protected final int bra(int deltapos)
Jumps to a specified input character position, relative jump.
- Parameters:
deltapos - to add to the current position
- Returns:
- the codepoint at that position
- Throws:
IndexOutOfBoundsException - if the resulting position is not in or just past the input
-
pos
protected final int pos()
Returns the current input character position. Useful for saving and restoring (with jmp(int)) and for error messages.
- Returns:
- position
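The save-and-restore use of pos() and jmp(int) mentioned above is the usual backtracking pattern in a hand-written parser. The following stand-alone sketch mimics the documented behaviour with a hypothetical class BacktrackSketch; the helper tryLiteral is made up for illustration and is not part of the API.

```java
// Hypothetical cursor that mirrors the documented pos()/jmp()/cur()/accept()
// contracts, to show saving a position and restoring it on a failed match.
final class BacktrackSketch {
    private final String input;
    private int ofs = 0;

    BacktrackSketch(String input) {
        this.input = input;
    }

    int pos() {
        return ofs;
    }

    int cur() {
        return ofs < input.length() ? input.codePointAt(ofs) : -1;
    }

    int jmp(int pos) {
        if (pos < 0 || pos > input.length()) {
            throw new IndexOutOfBoundsException("pos " + pos);
        }
        ofs = pos;
        return cur();
    }

    int accept() {
        if (ofs >= input.length()) {
            throw new IndexOutOfBoundsException("end of input");
        }
        ofs += Character.charCount(input.codePointAt(ofs));
        return cur();
    }

    /** Consumes literal if it matches here; otherwise restores the saved position. */
    boolean tryLiteral(String literal) {
        final int saved = pos();
        for (int i = 0; i < literal.length(); ) {
            int cp = literal.codePointAt(i);
            if (cur() != cp) {
                jmp(saved); // backtrack: undo everything consumed so far
                return false;
            }
            accept();
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        BacktrackSketch p = new BacktrackSketch("foo@example.com");
        System.out.println(p.tryLiteral("fox") + " at " + p.pos());
        System.out.println(p.tryLiteral("foo") + " at " + p.pos());
    }
}
```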
-
s
protected final String s()
Returns the input string, for use with substring comparisons. (This is safe because Java™ strings are immutable.)
- Returns:
- String input
-
cur
protected final int cur()
Returns the wide character at the current position.
- Returns:
- UCS-4 codepoint, or -1 if end of input is reached
-
peek
protected final int peek()
Returns the wide character after the one at the current position.
- Returns:
- UCS-4 codepoint, or -1 if end of input is reached
-
accept
protected final int accept()
Advances the current position to the next character.
- Returns:
- codepoint of the next character, or -1 if end of input is reached
- Throws:
IndexOutOfBoundsException - if end of input was already reached
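How cur(), peek() and accept() relate can be sketched with a stand-alone cursor over codepoints. The class CursorSketch below is hypothetical and written only from the descriptions above; in particular, stepping by Character.charCount so that a surrogate pair counts as one wide character is an assumption about how the real parser handles positions.

```java
// Hypothetical cursor illustrating the documented cur()/peek()/accept() contracts.
// Assumption: one "character" is one Unicode codepoint (surrogate pairs count once).
final class CursorSketch {
    private final String input;
    private int ofs = 0;

    CursorSketch(String input) {
        this.input = input;
    }

    /** Codepoint at the current position, or -1 at end of input. */
    int cur() {
        return ofs < input.length() ? input.codePointAt(ofs) : -1;
    }

    /** Codepoint after the current one, or -1 if there is none. */
    int peek() {
        if (ofs >= input.length()) {
            return -1;
        }
        int next = ofs + Character.charCount(input.codePointAt(ofs));
        return next < input.length() ? input.codePointAt(next) : -1;
    }

    /** Moves one codepoint forward and returns the new current codepoint. */
    int accept() {
        if (ofs >= input.length()) {
            throw new IndexOutOfBoundsException("end of input already reached");
        }
        ofs += Character.charCount(input.codePointAt(ofs));
        return cur();
    }

    public static void main(String[] args) {
        CursorSketch c = new CursorSketch("a\uD83D\uDE00b");
        while (c.cur() != -1) {
            System.out.println(Integer.toHexString(c.cur()));
            c.accept();
        }
    }
}
```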
-
skipPeek
protected final int skipPeek(BiFunction<Integer,Integer,Boolean> matcher)
Advances the current position using a peeking matcher. Continues as long as the matcher returns true and end of input is not yet reached.
- Parameters:
matcher - gets called with cur() and peek() as arguments
- Returns:
- codepoint of the first character where the matcher returned false, or -1 if end of input is reached
- See Also:
skip(Function)
-
skip
protected final int skip(Function<Integer,Boolean> matcher)
Advances the current position using a regular matcher. Continues as long as the matcher returns true and end of input is not yet reached.
- Parameters:
matcher - gets called with just cur() as argument
- Returns:
- codepoint of the first character where the matcher returned false, or -1 if end of input is reached
- See Also:
skipPeek(BiFunction)
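The difference between the regular and the peeking matcher can be sketched with a stand-alone cursor. The class SkipSketch is hypothetical and reproduces only the documented loop condition (continue while the matcher returns true and input remains); it is not the library's implementation.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical cursor illustrating the documented skip()/skipPeek() behaviour.
final class SkipSketch {
    private final String input;
    private int ofs = 0;

    SkipSketch(String input) {
        this.input = input;
    }

    int pos() {
        return ofs;
    }

    int cur() {
        return ofs < input.length() ? input.codePointAt(ofs) : -1;
    }

    int peek() {
        if (ofs >= input.length()) {
            return -1;
        }
        int next = ofs + Character.charCount(input.codePointAt(ofs));
        return next < input.length() ? input.codePointAt(next) : -1;
    }

    private void advance() {
        ofs += Character.charCount(input.codePointAt(ofs));
    }

    /** Advances while matcher(cur()) is true; returns the codepoint it stopped at, or -1. */
    int skip(Function<Integer, Boolean> matcher) {
        while (cur() != -1 && matcher.apply(cur())) {
            advance();
        }
        return cur();
    }

    /** Like skip, but the matcher also sees the following codepoint. */
    int skipPeek(BiFunction<Integer, Integer, Boolean> matcher) {
        while (cur() != -1 && matcher.apply(cur(), peek())) {
            advance();
        }
        return cur();
    }

    public static void main(String[] args) {
        SkipSketch s = new SkipSketch("abc123");
        // Skip the leading letters; the cursor stops at the first digit.
        int stop = s.skip(c -> Character.isLetter(c));
        System.out.println((char) stop + " at " + s.pos());
    }
}
```

A peeking matcher is useful when the stop condition spans two characters, for example stopping in front of a two-character sequence without consuming either half of it.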
-
-