public final class StringMetricBuilderExample extends Object
StringMetricBuilder
can be used to compose metrics.
A metric is used to measure the similarity between strings. Metrics can work
on strings, lists or sets tokens. To compare strings with a metric that works
on a collection of tokens a tokenizer is required.Constructor and Description |
---|
StringMetricBuilderExample() |
Modifier and Type | Method and Description |
---|---|
static float |
example00()
Simply comparing strings through a metric may not be very effective.
|
static float |
example01()
Simplification
Simplification increases the effectiveness of a metric by removing noise
and reducing the dimensionality of the problem.
|
static float |
example02()
Simplifiers can also be chained.
|
static float |
example03()
Tokenization
Tokenization cuts up a string into tokens.
|
static float |
example04()
Tokenizers can also be chained.
|
static float |
example05()
Tokens can be filtered to avoid comparing strings on common but otherwise
low information words.
|
static float |
example06()
Tokens can be transformed to a simpler form.
|
static float |
example07()
Tokenization and simplification can be expensive operations.
|
public static float example00()
public static float example01()
Simplifier
interface.public static float example02()
public static float example03()
Tokenizer
interface and is required for all metrics that work on collections of
tokens rather then whole strings; ListMetric
s and
SetMetric
spublic static float example04()
public static float example05()
Predicate
interface. By chaining predicates more complicated filters can be build.public static float example06()
Function
interface.public static float example07()
Copyright © 2014–2018. All rights reserved.