T - type of the token
public class MatchingCoefficient<T> extends Object implements ListMetric<T>
The matching coefficient between two lists is defined as the ratio of the number of elements that occur in both lists to the number of elements that occur in either list. This metric is similar to Jaccard similarity, except that repeated elements are counted as distinct occurrences.
similarity(a,b) = |a ∩ b| / |a ∪ b|
The ∩ operation takes the list intersection of a and b. This is a list c such that each element in c has a 1-to-1 relation to an element in both a and b. E.g. the list intersection of [ab,ab,ab,ac] and [ab,ab,ad] is [ab,ab]. The ∪ operation takes the list union of a and b.
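The list intersection and the resulting ratio can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the library's actual source: the class name `MatchingCoefficientSketch` and the helper `listIntersection` are hypothetical, the union size is assumed to be the sum of both list sizes minus the intersection size, and two empty lists are assumed to score 1.0.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the definition above; not the library's implementation.
class MatchingCoefficientSketch {

    // List intersection: each element of the result is paired 1-to-1
    // with one occurrence in a and one occurrence in b.
    static <T> List<T> listIntersection(List<T> a, List<T> b) {
        List<T> remaining = new ArrayList<>(b); // copy, consumed as matches are found
        List<T> result = new ArrayList<>();
        for (T token : a) {
            if (remaining.remove(token)) { // removes a single occurrence only
                result.add(token);
            }
        }
        return result;
    }

    // similarity(a,b) = |a intersect b| / |a union b|
    // Assumption: |a union b| = |a| + |b| - |a intersect b|.
    static <T> float compare(List<T> a, List<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0f; // assumption: two empty lists are treated as identical
        }
        int intersection = listIntersection(a, b).size();
        return intersection / (float) (a.size() + b.size() - intersection);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("ab", "ab", "ab", "ac");
        List<String> b = Arrays.asList("ab", "ab", "ad");
        System.out.println(listIntersection(a, b)); // [ab, ab]
        System.out.println(compare(a, b));          // 2 / (4 + 3 - 2) = 0.4
    }
}
```

Copying `b` before removing matches keeps the inputs untouched, which fits a metric that is meant to be free of side effects.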
Because repeated tokens count as distinct occurrences, the list ["a","a","b"] is not identical to ["a","b","b"], whereas set-based Jaccard treats the two lists as equal.
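To check how repeated tokens affect the score, the sketch below computes a set-based Jaccard score and a list-based matching coefficient for the two lists mentioned here. Both helpers (`jaccard`, `matchingCoefficient`) and the class name `RepeatedTokensSketch` are hypothetical and written only from the definitions in this description; they are not taken from the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical comparison of set-based and list-based similarity.
class RepeatedTokensSketch {

    // Set-based Jaccard: duplicates collapse before the comparison.
    static <T> float jaccard(List<T> a, List<T> b) {
        Set<T> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0f; // assumption: two empty lists are treated as identical
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(new HashSet<>(b));
        return intersection.size() / (float) union.size();
    }

    // List-based matching coefficient: every occurrence counts once.
    static <T> float matchingCoefficient(List<T> a, List<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0f; // assumption: two empty lists are treated as identical
        }
        List<T> remaining = new ArrayList<>(b);
        int intersection = 0;
        for (T token : a) {
            if (remaining.remove(token)) { // consume one matching occurrence
                intersection++;
            }
        }
        return intersection / (float) (a.size() + b.size() - intersection);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "a", "b");
        List<String> b = Arrays.asList("a", "b", "b");
        System.out.println(jaccard(a, b));             // 1.0: both reduce to {a, b}
        System.out.println(matchingCoefficient(a, b)); // 0.5: 2 / (3 + 3 - 2)
    }
}
```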
This class is immutable and thread-safe.
Constructor and Description |
---|
MatchingCoefficient() |
Modifier and Type | Method and Description |
---|---|
float | compare(List<T> a, List<T> b) Measures the similarity between lists a and b. |
String | toString() |
public float compare(List<T> a, List<T> b)
Specified by: compare in interface ListMetric<T>
Copyright © 2014–2018. All rights reserved.