![]() |
Leptonica
1.82.0
Image processing and image analysis suite
|
#include <string.h>
#include "allheaders.h"
Go to the source code of this file.
Functions | |
SARRAY * | sarraySort (SARRAY *saout, SARRAY *sain, l_int32 sortorder) |
SARRAY * | sarraySortByIndex (SARRAY *sain, NUMA *naindex) |
l_int32 | stringCompareLexical (const char *str1, const char *str2) |
L_ASET * | l_asetCreateFromSarray (SARRAY *sa) |
l_ok | sarrayRemoveDupsByAset (SARRAY *sas, SARRAY **psad) |
l_ok | sarrayUnionByAset (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
l_ok | sarrayIntersectionByAset (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
L_HASHMAP * | l_hmapCreateFromSarray (SARRAY *sa) |
l_ok | sarrayRemoveDupsByHmap (SARRAY *sas, SARRAY **psad, L_HASHMAP **phmap) |
l_ok | sarrayUnionByHmap (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
l_ok | sarrayIntersectionByHmap (SARRAY *sa1, SARRAY *sa2, SARRAY **psad) |
SARRAY * | sarrayGenerateIntegers (l_int32 n) |
l_ok | sarrayLookupCSKV (SARRAY *sa, const char *keystring, char **pvalstring) |
Sort SARRAY *sarraySort() SARRAY *sarraySortByIndex() l_int32 stringCompareLexical() Set operations using aset (rbtree) L_ASET *l_asetCreateFromSarray() l_int32 sarrayRemoveDupsByAset() l_int32 sarrayUnionByAset() l_int32 sarrayIntersectionByAset() Hashmap operations L_HASHMAP *l_hmapCreateFromSarray() l_int32 sarrayRemoveDupsByHmap() l_int32 sarrayUnionByHmap() l_int32 sarrayIntersectionByHmap() Miscellaneous operations SARRAY *sarrayGenerateIntegers() l_int32 sarrayLookupCSKV() We have two implementations of set operations on an array of strings: (1) Using an underlying tree (rbtree). This uses a good 64 bit hashing function for the key, that is not expected to have hash collisions (and we do not test for them). The tree is built up of the hash values, and if the hash is found in the tree, it is assumed that the string has already been found. (2) Building a hashmap from the keys (hashmap). This uses a fast 64 bit hashing function for the key, which is then hashed into a hashtable. Collisions of hashkeys are very rare, but the hashtable is designed to allow more than one hashitem in a table entry. The hashitems are put in a list at each hashtable entry, which is traversed looking for the key.
Definition in file sarray2.c.
[in] | sa |
Definition at line 229 of file sarray2.c.
Referenced by sarrayIntersectionByAset().
[in] | sa | input sarray |
Definition at line 439 of file sarray2.c.
References l_hashStringToUint64Fast(), L_NOCOPY, sarrayGetCount(), and sarrayGetString().
Referenced by sarrayIntersectionByHmap(), and sarrayRemoveDupsByHmap().
SARRAY* sarrayGenerateIntegers | ( | l_int32 | n | ) |
[in] | n |
Definition at line 625 of file sarray2.c.
References L_COPY, sarrayAddString(), and sarrayCreate().
Referenced by pixaCompareInPdf().
[in] | sa1 | |
[in] | sa2 | |
[out] | psad | intersection of the two string arrays |
Notes: (1) Algorithm: put the larger sarray into a set, using the string hashes as the key values. Then run through the smaller sarray, building an output sarray and a second set from the strings in the larger array: if a string is in the first set but not in the second, add the string to the output sarray and hash it into the second set. The second set is required to make sure only one instance of each string is put into the output sarray. This is O(mlogn), {m,n} = sizes of {smaller,larger} input arrays.
Definition at line 380 of file sarray2.c.
References l_asetCreateFromSarray(), sarrayCreate(), and sarrayGetCount().
[in] | sa1 | |
[in] | sa2 | |
[out] | *psad | intersection of the array values |
Definition at line 561 of file sarray2.c.
References L_COPY, l_hashStringToUint64Fast(), l_hmapCreateFromSarray(), L_NOCOPY, sarrayAddString(), sarrayCreate(), sarrayDestroy(), sarrayGetCount(), sarrayGetString(), and sarrayRemoveDupsByHmap().
l_ok sarrayLookupCSKV | ( | SARRAY * | sa, |
const char * | keystring, | ||
char ** | pvalstring | ||
) |
[in] | sa | of strings, each being a comma-separated pair of strings, the first being a key and the second a value |
[in] | keystring | an input string to match with each key in sa |
[out] | pvalstring | the returned value string corresponding to the input key string, if found; otherwise NULL |
Notes: (1) The input sa can have other strings that are not in comma-separated key-value format. These will be ignored. (2) This returns a copy of the first value string in sa whose key string matches the input keystring. (3) White space is not ignored; all white space before the ',' is used for the keystring in matching. This allows the key and val strings to have white space (e.g., multiple words).
Definition at line 666 of file sarray2.c.
References L_NOCOPY, sarrayCreate(), sarrayGetCount(), and sarrayGetString().
[in] | sas | |
[out] | psad | with duplicates removed |
Notes: (1) This is O(nlogn), considerably slower than sarrayRemoveDupsByHmap() for large string arrays. (2) The key for each string is a 64-bit hash. (3) Build a set, using hashed strings as keys. As the set is built, first do a find; if not found, add the key to the set and add the string to the output sarray.
Definition at line 273 of file sarray2.c.
Referenced by sarrayUnionByAset().
[in] | sas | |
[out] | psad | hash set of unique values |
[out] | phmap | [optional] hashmap used for lookup |
Definition at line 473 of file sarray2.c.
References L_Hashmap::hashtab, L_COPY, l_hmapCreateFromSarray(), L_INSERT, L_Hashitem::next, sarrayAddString(), sarrayCreate(), sarrayGetString(), L_Hashmap::tabsize, and L_Hashitem::val.
Referenced by sarrayIntersectionByHmap(), and sarrayUnionByHmap().
[in] | saout | output sarray; can be NULL or equal to sain |
[in] | sain | input sarray |
[in] | sortorder | L_SORT_INCREASING or L_SORT_DECREASING |
Notes: (1) Set saout = sain for in-place; otherwise, set naout = NULL. (2) Shell sort, modified from K&R, 2nd edition, p.62. Slow but simple O(n logn) sort.
Definition at line 97 of file sarray2.c.
References Sarray::array, L_SORT_DECREASING, L_SORT_INCREASING, sarrayCopy(), sarrayGetCount(), and stringCompareLexical().
Referenced by getSortedPathnamesInDirectory().
[in] | sain | |
[in] | naindex | na that maps from the new sarray to the input sarray |
Definition at line 147 of file sarray2.c.
References L_COPY, L_INSERT, numaGetIValue(), sarrayAddString(), sarrayCreate(), sarrayGetCount(), and sarrayGetString().
[in] | sa1 | |
[in] | sa2 | |
[out] | psad | union of the two string arrays |
Notes: (1) Duplicates are removed from the concatenation of the two arrays. (2) The key for each string is a 64-bit hash. (2) Algorithm: Concatenate the two sarrays. Then build a set, using hashed strings as keys. As the set is built, first do a find; if not found, add the key to the set and add the string to the output sarray. This is O(nlogn).
Definition at line 329 of file sarray2.c.
References sarrayCopy(), sarrayDestroy(), sarrayJoin(), and sarrayRemoveDupsByAset().
[in] | sa1 | |
[in] | sa2 | |
[out] | *psad | union of the array values |
Definition at line 525 of file sarray2.c.
References sarrayCopy(), sarrayDestroy(), sarrayJoin(), and sarrayRemoveDupsByHmap().
l_int32 stringCompareLexical | ( | const char * | str1, |
const char * | str2 | ||
) |
[in] | str1 | |
[in] | str2 |
Notes: (1) If the lexical values are identical, return a 0, to indicate that no swapping is required to sort the strings.
Definition at line 187 of file sarray2.c.
Referenced by sarraySort().