m() measures the number of consonant sequences between 0 and j. If c is a consonant sequence and v a vowel sequence, and <..> indicates arbitrary presence,
   <c><v>       gives 0
   <c>vc<v>     gives 1
   <c>vcvc<v>   gives 2
   <c>vcvcvc<v> gives 3
   ....
I think this can be recoded far more neatly.
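One possible neater recoding, offered only as a sketch and not as the code in this file: count the vowel-to-consonant transitions in positions 0..j, which equals the number of vc pairs in the patterns above. The names MeasureSketch, isConsonant and measure are hypothetical, and the word is passed in explicitly as a lowercase String rather than read from the stemmer's internal buffer.

```scala
object MeasureSketch {
  // Hypothetical helper: true if word(i) is a consonant in Porter's sense.
  // 'y' counts as a consonant at position 0 or when it follows a vowel.
  def isConsonant(word: String, i: Int): Boolean = word.charAt(i) match {
    case 'a' | 'e' | 'i' | 'o' | 'u' => false
    case 'y' => i == 0 || !isConsonant(word, i - 1)
    case _ => true
  }

  // Number of vowel-to-consonant transitions in word(0..j), inclusive.
  def measure(word: String, j: Int): Int = {
    val flags = (0 to j).map(i => isConsonant(word, i))
    flags.zip(flags.drop(1)).count { case (prev, cur) => !prev && cur }
  }
}
```

For example, MeasureSketch.measure("tree", 3) gives 0, measure("trouble", 6) gives 1 and measure("troubles", 7) gives 2.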
cvc(i) is true <=> i-2,i-1,i has the form consonant - vowel - consonant and the second c is not w, x or y. This is used when trying to restore an e at the end of a short word, e.g.
cav(e), lov(e), hop(e), crim(e), but snow, box, tray.
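The test described above can be sketched as follows. This is an illustration under assumptions, not the code in this file: CvcSketch and isConsonant are hypothetical names, and the word is passed in as a lowercase String instead of being read from the stemmer's buffer.

```scala
object CvcSketch {
  // Hypothetical helper: true if word(i) is a consonant in Porter's sense.
  // 'y' counts as a consonant at position 0 or when it follows a vowel.
  def isConsonant(word: String, i: Int): Boolean = word.charAt(i) match {
    case 'a' | 'e' | 'i' | 'o' | 'u' => false
    case 'y' => i == 0 || !isConsonant(word, i - 1)
    case _ => true
  }

  // True if positions i-2, i-1, i are consonant, vowel, consonant
  // and the consonant at i is not w, x or y.
  def cvc(word: String, i: Int): Boolean = {
    val lastIsAllowed = i >= 0 && (word.charAt(i) match {
      case 'w' | 'x' | 'y' => false
      case _ => true
    })
    i >= 2 && lastIsAllowed &&
      isConsonant(word, i) && !isConsonant(word, i - 1) && isConsonant(word, i - 2)
  }
}
```

With this sketch, CvcSketch.cvc("hop", 2) is true, so an e may be restored, while CvcSketch.cvc("snow", 3) is false because the final consonant is w.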
Porter stemmer in Scala. The original paper is
Porter, 1980, An algorithm for suffix stripping, Program, Vol. 14, no. 3, pp 130-137.
See also http://www.tartarus.org/~martin/PorterStemmer
A few methods were borrowed from the existing Java port from the above page.
This version is adapted from the original by Ken Faulkner.