public class Simhash extends Object
| Constructor and Description |
|---|
| Simhash()<br>Constructor. |
| Simhash(int fracCount, int hammingThresh)<br>Constructor. |
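The two-argument constructor takes a number of storage segments (fracCount) and a Hamming-distance criterion (hammingThresh). As a rough illustration of what these two quantities mean for a 64-bit fingerprint, the sketch below counts differing bits and splits a fingerprint into equal-width segments. The class `SimhashParamsSketch` and its methods are hypothetical and are not part of this API. The sketch also assumes that fracCount divides 64 evenly, which this page does not state.

```java
// Hypothetical helper, not part of the documented Simhash class.
class SimhashParamsSketch {

    /** Number of bit positions in which two 64-bit fingerprints differ. */
    static int hammingDistance(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    /**
     * Splits a 64-bit fingerprint into fracCount equal-width segments,
     * lowest bits first. Assumes fracCount divides 64 evenly.
     */
    static long[] split(long simhash, int fracCount) {
        int width = 64 / fracCount;
        // A shift by 64 wraps around in Java, so the full-width mask is special-cased.
        long mask = (width == 64) ? -1L : (1L << width) - 1;
        long[] parts = new long[fracCount];
        for (int i = 0; i < fracCount; i++) {
            parts[i] = (simhash >>> (i * width)) & mask;
        }
        return parts;
    }
}
```

Splitting into segments is a common way to index fingerprints: if two fingerprints differ in fewer bits than there are segments, at least one segment must be identical, so candidates can be looked up by segment value before the full distance is computed.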
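| Modifier and Type | Method and Description |
|---|---|
| boolean | equals(Collection<? extends CharSequence> segList)<br>Checks whether the text duplicates data that has already been stored. |
| long | hash(Collection<? extends CharSequence> segList)<br>Computes the simhash value of the given text. |
| void | store(Long simhash)<br>Stores the value according to the index. |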
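The three methods above fit a hash, check, store workflow. The following is a minimal, self-contained sketch of the general simhash technique and is not this class's actual implementation. The class `SimhashSketch`, the method `isDuplicate`, the FNV-1a token hash, the linear scan over stored values, and the choice to treat a distance no greater than the threshold as a duplicate are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the simhash technique, not the documented class.
class SimhashSketch {
    private static final int BITS = 64;
    private final int hammingThresh;
    private final List<Long> stored = new ArrayList<>();

    SimhashSketch(int hammingThresh) {
        this.hammingThresh = hammingThresh;
    }

    /** 64-bit FNV-1a hash of one token (an assumed stand-in for the token hash). */
    static long tokenHash(CharSequence token) {
        long h = 0xcbf29ce484222325L;
        for (byte b : token.toString().getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        return h;
    }

    /** Each token votes +1 or -1 per bit; bits with a positive total are set. */
    long hash(Collection<? extends CharSequence> segList) {
        int[] weight = new int[BITS];
        for (CharSequence seg : segList) {
            long h = tokenHash(seg);
            for (int i = 0; i < BITS; i++) {
                weight[i] += ((h >>> i) & 1L) == 1L ? 1 : -1;
            }
        }
        long fingerprint = 0L;
        for (int i = 0; i < BITS; i++) {
            if (weight[i] > 0) {
                fingerprint |= (1L << i);
            }
        }
        return fingerprint;
    }

    /** True if any stored fingerprint is within hammingThresh bits (assumed inclusive). */
    boolean isDuplicate(Collection<? extends CharSequence> segList) {
        long candidate = hash(segList);
        for (long known : stored) {
            if (Long.bitCount(candidate ^ known) <= hammingThresh) {
                return true;
            }
        }
        return false;
    }

    void store(long simhash) {
        stored.add(simhash);
    }

    public static void main(String[] args) {
        SimhashSketch index = new SimhashSketch(3);
        List<String> doc = Arrays.asList("the", "quick", "brown", "fox");
        index.store(index.hash(doc));
        // The same token list hashes to the same fingerprint, so distance is 0.
        System.out.println(index.isDuplicate(doc));
    }
}
```

The input is a list of already segmented words, matching the segList parameter: simhash operates on tokens, so text must be split into words before hashing.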
public Simhash()
public Simhash(int fracCount,
int hammingThresh)
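fracCount - number of storage segments
hammingThresh - Hamming distance threshold used as the similarity criterion
public long hash(Collection<? extends CharSequence> segList)
segList - list of words produced by segmentation
public boolean equals(Collection<? extends CharSequence> segList)
segList - result of segmenting the text into words
public void store(Long simhash)
simhash - the simhash value
Copyright © 2020. All rights reserved.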