public final class Simplifiers extends Object
All methods return immutable objects provided the arguments are also immutable.
Modifier and Type | Method and Description |
---|---|
static Simplifier | `chain(List<Simplifier> simplifiers)` Constructs a new chain of simplifiers. |
static Simplifier | `chain(Simplifier simplifier, Simplifier... simplifiers)` Constructs a new chain of simplifiers. |
static Simplifier | `removeAll(Pattern pattern)` Returns a simplifier that removes every subsequence of the input that matches the pattern. |
static Simplifier | `removeAll(String regex)` Returns a simplifier that removes every subsequence of the input that matches the regex. |
static Simplifier | `removeDiacritics()` Returns a simplifier that removes diacritics. |
static Simplifier | `removeNonWord()` Returns a simplifier that removes all non-word `[^0-9a-zA-Z]` characters. |
static Simplifier | `removeNonWord(String replacement)` Returns a simplifier that removes all consecutive non-word characters `[^0-9a-zA-Z]+` and replaces them with the replacement. |
static Simplifier | `replaceAll(Pattern pattern, String replacement)` Returns a simplifier that replaces every subsequence of the input that matches the pattern with the given replacement string. |
static Simplifier | `replaceAll(String regex, String replacement)` Returns a simplifier that replaces every subsequence of the input that matches the regex with the given replacement string. |
static Simplifier | `replaceNonWord()` Returns a simplifier that replaces all individual non-word characters `[^0-9a-zA-Z]` with a space. |
static Simplifier | `replaceNonWord(String replacement)` Returns a simplifier that replaces all individual non-word characters `[^0-9a-zA-Z]` with the replacement. |
static Simplifier | `toLowerCase()` Returns a simplifier that transforms all upper case characters into their lower case equivalent. |
static Simplifier | `toLowerCase(Locale l)` Returns a simplifier that transforms all upper case characters into their lower case equivalent. |
static Simplifier | `toUpperCase()` Returns a simplifier that transforms all lower case characters into their upper case equivalent. |
static Simplifier | `toUpperCase(Locale l)` Returns a simplifier that transforms all lower case characters into their upper case equivalent. |
public static Simplifier chain(List<Simplifier> simplifiers)
Constructs a new chain of simplifiers.
Parameters:
simplifiers - a non-empty list of simplifiers
See Also:
StringMetricBuilder
public static Simplifier chain(Simplifier simplifier, Simplifier... simplifiers)
Constructs a new chain of simplifiers.
Parameters:
simplifier - the first simplifier
simplifiers - the others
See Also:
StringMetricBuilder
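As a rough illustration of what chaining means, the sketch below composes simplifiers so that each one receives the output of the previous one. The `Simplifier` interface shown here is a local stand-in with a single `simplify(String)` method, and `ChainSketch` and `demo` are hypothetical names; none of this is the library's own code, and the real `chain` implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Local stand-in for the library's Simplifier type (an assumption for this sketch).
interface Simplifier {
    String simplify(String input);
}

final class ChainSketch {
    // Applies each simplifier in order, feeding the output of one into the next.
    static Simplifier chain(List<Simplifier> simplifiers) {
        if (simplifiers.isEmpty()) {
            throw new IllegalArgumentException("simplifiers must not be empty");
        }
        // Defensive copy so later changes to the caller's list do not affect the chain.
        final List<Simplifier> copy = new ArrayList<>(simplifiers);
        return input -> {
            String result = input;
            for (Simplifier s : copy) {
                result = s.simplify(result);
            }
            return result;
        };
    }

    // Varargs variant: at least one simplifier is required by the signature itself.
    static Simplifier chain(Simplifier first, Simplifier... others) {
        List<Simplifier> all = new ArrayList<>();
        all.add(first);
        all.addAll(Arrays.asList(others));
        return chain(all);
    }

    // Hypothetical demo: lower-case first, then strip non-word characters.
    static String demo(String input) {
        Simplifier lower = s -> s.toLowerCase(Locale.ROOT);
        Simplifier stripNonWord = s -> s.replaceAll("[^0-9a-zA-Z]", "");
        return chain(lower, stripNonWord).simplify(input);
    }
}
```

Order matters in a chain: a simplifier that removes characters can change what a later simplifier sees, so the sequence should be chosen deliberately.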
public static Simplifier removeAll(String regex)
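Returns a simplifier that removes every subsequence of the input that matches the regex.
Parameters:
regex - the regex to remove
See Also:
Matcher.replaceAll(String)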
public static Simplifier removeAll(Pattern pattern)
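Returns a simplifier that removes every subsequence of the input that matches the pattern.
Parameters:
pattern - the pattern to remove
See Also:
Matcher.replaceAll(String)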
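The reference to `Matcher.replaceAll(String)` suggests that removal amounts to replacing every match with the empty string. A minimal JDK-only sketch of that idea follows; `RemoveAllSketch` is a hypothetical name, and the library's actual implementation is not shown in this page.

```java
import java.util.regex.Pattern;

final class RemoveAllSketch {
    // Removes every subsequence matching the pattern by replacing it with "".
    static String removeAll(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll("");
    }

    // Convenience overload that compiles the regex before delegating.
    static String removeAll(String regex, String input) {
        return removeAll(Pattern.compile(regex), input);
    }
}
```

When the same expression is applied to many inputs, compiling the `Pattern` once and reusing it avoids repeated compilation; `Pattern` instances are immutable and safe to share between threads.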
public static Simplifier removeDiacritics()
Returns a simplifier that removes diacritics.
The input string is transformed to the canonical decomposition form, after which any characters matching the regex
[\p{InCombiningDiacriticalMarks}\p{IsLm}\p{IsSk}]+
are removed. The resulting string will be in canonical decomposition form.
The returned simplifier is thread-safe and immutable.
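The two steps described above (canonical decomposition, then removal of the matching characters) can be sketched with the JDK's `java.text.Normalizer`. `RemoveDiacriticsSketch` is a hypothetical name, and the sketch assumes the regex quoted above is applied as-is; the library's own code may be organized differently.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

final class RemoveDiacriticsSketch {
    // Combining marks, modifier letters (Lm) and modifier symbols (Sk).
    private static final Pattern DIACRITICS =
            Pattern.compile("[\\p{InCombiningDiacriticalMarks}\\p{IsLm}\\p{IsSk}]+");

    static String removeDiacritics(String input) {
        // NFD splits a precomposed character such as U+00E9 into "e" + U+0301.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Dropping the combining marks leaves only the base characters.
        return DIACRITICS.matcher(decomposed).replaceAll("");
    }
}
```

Letters that have no canonical decomposition, such as "ø" (U+00F8), pass through this sketch unchanged because there is no separate combining mark to remove.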
public static Simplifier removeNonWord()
Returns a simplifier that removes all non-word [^0-9a-zA-Z] characters.
The returned simplifier is thread-safe and immutable.
See Also:
removeAll(Pattern)
public static Simplifier removeNonWord(String replacement)
Returns a simplifier that removes all consecutive non-word characters [^0-9a-zA-Z]+ and replaces them with the replacement.
The returned simplifier is thread-safe and immutable.
Parameters:
replacement - replaces the consecutive non-word characters
See Also:
removeAll(Pattern)
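The only difference between the `+` form used here and the single-character class used by `replaceNonWord` is whether a run of non-word characters yields one replacement or one replacement per character. The sketch below shows both with plain `String.replaceAll`; `NonWordSketch` and its method names are hypothetical and stand in for the behavior described, not for the library's implementation.

```java
final class NonWordSketch {
    // One replacement per run of consecutive non-word characters.
    static String collapseRuns(String input, String replacement) {
        return input.replaceAll("[^0-9a-zA-Z]+", replacement);
    }

    // One replacement per individual non-word character.
    static String replaceEach(String input, String replacement) {
        return input.replaceAll("[^0-9a-zA-Z]", replacement);
    }
}
```

For the input `a, b -- c` and the replacement `_`, the first method gives `a_b_c` and the second gives `a__b____c`. Note that the character class is ASCII-only, so an underscore or an accented letter also counts as a non-word character.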
public static Simplifier replaceAll(String regex, String replacement)
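Returns a simplifier that replaces every subsequence of the input that matches the regex with the given replacement string.
Parameters:
regex - the regex to replace
replacement - the replacement string
See Also:
Matcher.replaceAll(String)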
public static Simplifier replaceAll(Pattern pattern, String replacement)
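Returns a simplifier that replaces every subsequence of the input that matches the pattern with the given replacement string.
Parameters:
pattern - the pattern to replace
replacement - the replacement string
See Also:
Matcher.replaceAll(String)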
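If the replacement string is handed to `Matcher.replaceAll(String)` unchanged, as the reference above suggests but this page does not confirm, then `$` and `\` in the replacement carry special meaning. The sketch below illustrates both a group reference and a literal dollar sign; `ReplaceAllSketch` and its helper methods are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplaceAllSketch {
    // Replaces every match of the pattern with the replacement string.
    static String replaceAll(Pattern pattern, String replacement, String input) {
        return pattern.matcher(input).replaceAll(replacement);
    }

    // "$2=$1" refers to the second and first capturing groups of each match.
    static String swapPairs(String input) {
        return replaceAll(Pattern.compile("(\\w+)=(\\w+)"), "$2=$1", input);
    }

    // quoteReplacement escapes "$" so it is inserted literally.
    static String maskDigits(String input) {
        return replaceAll(Pattern.compile("[0-9]+"), Matcher.quoteReplacement("$"), input);
    }
}
```

With these helpers, `swapPairs("a=1 b=2")` yields `1=a 2=b`, and `maskDigits("cost 42")` yields `cost $`.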
public static Simplifier replaceNonWord()
Returns a simplifier that replaces all individual non-word characters [^0-9a-zA-Z] with a space.
The returned simplifier is thread-safe and immutable.
public static Simplifier replaceNonWord(String replacement)
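Returns a simplifier that replaces all individual non-word characters [^0-9a-zA-Z] with the replacement.
The returned simplifier is thread-safe and immutable.
Parameters:
replacement - replaces the non-word characters
public static Simplifier toLowerCase()
Returns a simplifier that transforms all upper case characters into their lower case equivalent.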
Uses the default locale to apply the transform.
The returned simplifier is thread-safe and immutable.
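Because this variant relies on the default locale, the same input can produce different output on differently configured machines. The sketch below uses `String.toLowerCase(Locale)` directly to show the effect; it assumes, without confirmation from this page, that the simplifier delegates to that method. `CaseLocaleSketch` is a hypothetical name.

```java
import java.util.Locale;

final class CaseLocaleSketch {
    // Lower-cases the input using the case-mapping rules of the given locale.
    static String lower(String input, Locale locale) {
        return input.toLowerCase(locale);
    }
}
```

Under `Locale.ROOT`, `lower("TITLE", ...)` gives `title`, whereas under the Turkish locale (`Locale.forLanguageTag("tr")`) the capital `I` maps to the dotless `ı` (U+0131). Passing an explicit locale through `toLowerCase(Locale l)` avoids this kind of surprise when results must be reproducible.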
public static Simplifier toLowerCase(Locale l)
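Returns a simplifier that transforms all upper case characters into their lower case equivalent.
The returned simplifier is thread-safe and immutable.
Parameters:
l - locale in which the transform is applied
public static Simplifier toUpperCase()
Returns a simplifier that transforms all lower case characters into their upper case equivalent.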
Uses the default locale to apply the transform.
The returned simplifier is thread-safe and immutable.
public static Simplifier toUpperCase(Locale l)
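Returns a simplifier that transforms all lower case characters into their upper case equivalent.
The returned simplifier is thread-safe and immutable.
Parameters:
l - locale in which the transform is applied
Copyright © 2014–2018. All rights reserved.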