@DeclareRoles(value={"org.imixs.ACCESSLEVEL.NOACCESS","org.imixs.ACCESSLEVEL.READERACCESS","org.imixs.ACCESSLEVEL.AUTHORACCESS","org.imixs.ACCESSLEVEL.EDITORACCESS","org.imixs.ACCESSLEVEL.MANAGERACCESS"}) @RolesAllowed(value={"org.imixs.ACCESSLEVEL.NOACCESS","org.imixs.ACCESSLEVEL.READERACCESS","org.imixs.ACCESSLEVEL.AUTHORACCESS","org.imixs.ACCESSLEVEL.EDITORACCESS","org.imixs.ACCESSLEVEL.MANAGERACCESS"}) public class SolrIndexService extends Object
The service validates the Solr index schema and updates the schema if it has changed.
The SolrIndexService is used by the SolrUpdateService and the SolrSearchService, which extend and implement the Imixs-Index concept.
The SolrIndexService can be configured by the following properties:
| Modifier and Type | Field and Description |
|---|---|
| static String | ANONYMOUS |
| static int | DEFAULT_MAX_SEARCH_RESULT |
| static int | DEFAULT_PAGE_SIZE |
| static String | DEFAULT_SEARCH_FIELD |
| static int | EVENTLOG_ENTRY_FLUSH_COUNT |
| protected javax.enterprise.event.Event<IndexEvent> | indexEvents |

| Constructor and Description |
|---|
| SolrIndexService() |

| Modifier and Type | Method and Description |
|---|---|
| String | adaptImixsItemName(String itemName): This method adapts an Imixs item name to the corresponding Solr field name. |
| String | adaptSolrFieldName(String itemName): This method adapts a Solr field name to the corresponding Imixs item name. |
| protected String | buildAddDoc(List<ItemCollection> documents): This method returns an XML structure to add new documents into the Solr index. |
| protected String | buildUpdateSchema(String oldSchema): This method builds a JSON structure to be used to update an existing Solr schema. |
| boolean | flushEventLog(int junkSize): Flush the EventLog cache. |
| boolean | flushEventLogByCount(int count): This method flushes a given count of eventLogEntries. |
| void | indexDocument(ItemCollection document): This method adds a single document to the Lucene Solr index. |
| void | indexDocuments(List<ItemCollection> documents): This method adds a collection of documents to the Lucene Solr index. |
| void | init(): Create a rest client instance. |
| String | query(String searchTerm, int pageSize, int pageIndex, SortOrder sortOrder, DefaultOperator defaultOperator, boolean loadStubs): This method posts a search query and returns the result. |
| void | rebuildIndex(): This method forces an update of the full text index. |
| void | removeDocument(String id): This method removes a single document from the Lucene Solr index. |
| void | removeDocuments(List<String> documentIDs): This method removes a collection of documents from the Lucene Solr index. |
| void | setup(SetupEvent setupEvent): This method verifies the schema of the Solr core. |
| protected String | stripCDATA(String s): This helper method strips CDATA blocks from a string. |
| protected String | stripControlCodes(String s): This helper method strips control codes and extended characters from a string. |
| void | updateSchema(String schema): Updates the schema definition of an existing Solr core. |

public static final String ANONYMOUS
public static final int EVENTLOG_ENTRY_FLUSH_COUNT
public static final String DEFAULT_SEARCH_FIELD
public static final int DEFAULT_MAX_SEARCH_RESULT
public static final int DEFAULT_PAGE_SIZE
@Inject protected javax.enterprise.event.Event<IndexEvent> indexEvents
@PostConstruct public void init()
public void setup(@Observes SetupEvent setupEvent) throws RestAPIException
The method assumes that a core is already created with a manageable schema.
Parameters: setupEvent
Throws: RestAPIException
public void updateSchema(String schema) throws RestAPIException
The schema definition is built by the method buildUpdateSchema(). The updateSchema method adds or replaces field definitions depending on the fieldList definitions provided by the Imixs SchemaService. See the method buildUpdateSchema() for details.
The method assumes that a core already exists. Otherwise an exception is thrown.
Parameters: schema - existing schema definition
Throws: RestAPIException
public void indexDocuments(List<ItemCollection> documents) throws RestAPIException
This method is used by the JobHandlerRebuildIndex only.
Parameters: documents - list of ItemCollections to be indexed
Throws: RestAPIException
public void indexDocument(ItemCollection document) throws RestAPIException
Parameters: document - ItemCollection to be indexed
Throws: RestAPIException
public void removeDocuments(List<String> documentIDs) throws RestAPIException
Parameters: documentIDs - collection of UniqueIDs to be removed from the index
Throws: RestAPIException
public void removeDocument(String id) throws RestAPIException
Parameters: id - UniqueID of the document to be removed from the index
Throws: RestAPIException
public void rebuildIndex()
public String query(String searchTerm, int pageSize, int pageIndex, SortOrder sortOrder, DefaultOperator defaultOperator, boolean loadStubs) throws QueryException
The method returns the documents containing all stored or DocValues fields. If the parameter 'loadStubs' is false, only the field '$uniqueid' is returned by the method. The caller is responsible for loading the full document from the DocumentService.
Because field names must not contain $ symbols, field names containing a $ are replaced when they are used in a query.
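The name replacement described above could look roughly like the following sketch. The concrete mapping (a leading `$` swapped for `_`) and the names `FieldNameSketch`, `toSolrFieldName` and `toImixsItemName` are assumptions made for illustration only; they are not taken from the SolrIndexService source, whose adaptImixsItemName and adaptSolrFieldName methods may apply different rules.

```java
class FieldNameSketch {

    // Assumption: only a leading '$' is replaced, and '_' is used as the substitute.
    static String toSolrFieldName(String itemName) {
        if (itemName != null && itemName.startsWith("$")) {
            return "_" + itemName.substring(1);
        }
        return itemName;
    }

    // Reverse direction of the hypothetical mapping above.
    static String toImixsItemName(String fieldName) {
        if (fieldName != null && fieldName.startsWith("_")) {
            return "$" + fieldName.substring(1);
        }
        return fieldName;
    }

    public static void main(String[] args) {
        System.out.println(toSolrFieldName("$uniqueid"));
        System.out.println(toImixsItemName("_uniqueid"));
    }
}
```

Note that such a simple mapping is not reversible for item names that genuinely start with `_`, so a real implementation would need an additional rule for those names.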
Parameters: searchTerm
Throws: QueryException
public String adaptSolrFieldName(String itemName)
Parameters: itemName
public String adaptImixsItemName(String itemName)
Parameters: itemName
protected String buildUpdateSchema(String oldSchema)
The param oldSchema contains the current schema definition of the core.
In Solr there are two field attributes that define whether the value of a field is stored and returned by a query: stored and docValues.
{"add-field":{"name":"field1","type":"text_general","stored":true,"docValues":true}}
In both cases the values are stored in the Lucene index and returned by a query.
Stored fields (stored=true) are row-oriented. That means that, as in an SQL table, the values are stored based on the ID of the document.
In contrast, docValues are stored column-oriented (forward index). The values are ordered based on the search term. For features like sorting, grouping or faceting, docValues generally improve performance, so docValues may look like the better choice. One important difference, however, is how the values are stored. For a stored field with multiple values, the values are stored in exactly the same order in which they were indexed. DocValues, in contrast, are sorted and reordered, so a multi-value field of a document returned by a query may no longer reflect the original value order.
In Imixs-Workflow we use the stored attribute to return parts of a document at query time. We call this a document stub, which contains only a subset of fields. Later we load the full document from the SQL database. As stored fields in our workflow application are also often used for sorting, we combine both attributes. For a non-stored field we also set docValues=false to avoid storing fields unnecessarily.
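The rule of combining both attributes can be illustrated with a small sketch that builds one add-field command. The class `AddFieldSketch` and its `addField` helper are hypothetical and not part of SolrIndexService; the sketch also assumes that field names and types are simple identifiers that need no JSON escaping.

```java
class AddFieldSketch {

    // Hypothetical helper: docValues always follows the stored flag,
    // so stored fields can be sorted and non-stored fields are not kept twice.
    static String addField(String name, String type, boolean stored) {
        return "{\"add-field\":{"
                + "\"name\":\"" + name + "\","
                + "\"type\":\"" + type + "\","
                + "\"stored\":" + stored + ","
                + "\"docValues\":" + stored
                + "}}";
    }

    public static void main(String[] args) {
        System.out.println(addField("field1", "text_general", true));
        System.out.println(addField("field2", "text_general", false));
    }
}
```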
See also: https://lucene.apache.org/solr/guide/8_0/docvalues.html
protected String buildAddDoc(List<ItemCollection> documents)
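As a rough illustration of the kind of XML structure Solr accepts for adding documents, the following sketch builds an `<add>` message from plain maps. It is not the actual implementation: the real method takes a list of ItemCollection objects, while this sketch uses `Map<String, String>` as a stand-in, and the class `AddDocSketch` and its `escape` helper are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AddDocSketch {

    // Minimal XML escaping for element content.
    static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Each map stands in for one document: key = field name, value = field value.
    static String buildAddDoc(List<Map<String, String>> documents) {
        StringBuilder xml = new StringBuilder("<add>");
        for (Map<String, String> doc : documents) {
            xml.append("<doc>");
            for (Map.Entry<String, String> field : doc.entrySet()) {
                xml.append("<field name=\"").append(field.getKey()).append("\">")
                   .append(escape(field.getValue()))
                   .append("</field>");
            }
            xml.append("</doc>");
        }
        return xml.append("</add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "a<b");
        System.out.println(buildAddDoc(Collections.singletonList(doc)));
    }
}
```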
protected String stripControlCodes(String s)
Background:
In ASCII, the control codes have decimal codes 0 through 31 and 127. On an ASCII-based system, if the control codes are stripped, all characters of the resulting string lie in the range 32 to 126.
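A minimal sketch of this filtering is shown below. It keeps only characters in the range 32 to 126, which removes both the control codes and the extended characters; the class name `StripSketch` is hypothetical, and the actual method may treat extended characters differently.

```java
class StripSketch {

    // Keeps printable ASCII only: drops 0..31, 127 and everything above 126.
    static String stripControlCodes(String s) {
        StringBuilder result = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 32 && c <= 126) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // bell (7), delete (127) and e-acute (233) are removed
        System.out.println(stripControlCodes("a\u0007b\u007Fc\u00E9"));
    }
}
```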
Parameters: s, include
See also: https://rosettacode.org/wiki/Strip_control_codes_and_extended_characters_from_a_string
protected String stripCDATA(String s)
Parameters: s
public boolean flushEventLogByCount(int count)
Parameters: count - the maximum number of eventLog entries to remove.
public boolean flushEventLog(int junkSize)
The method flushes the cache in smaller blocks of the given junkSize to avoid a heap size problem. The default flush size is 16. The eventLog cache is tracked by the flag 'dirtyIndex'.
Issue #439: The method returns false if the event log contains more entries than defined by the given junkSize. In this case the caller should call the method again; each call always runs in a new transaction. The goal of this mechanism is to reduce the event log even if the outer transaction fails.
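The calling pattern described above can be sketched as a simple loop. The example below simulates the event log with an in-memory queue; `FlushLoopSketch`, its `eventLog` field and `flushAll` are hypothetical stand-ins, and transaction handling is not modelled at all.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FlushLoopSketch {

    // In-memory stand-in for the event log.
    static final Deque<String> eventLog = new ArrayDeque<>();

    // Removes at most junkSize entries and returns false if the log
    // held more entries than junkSize when the call started.
    static boolean flushEventLog(int junkSize) {
        int total = eventLog.size();
        for (int i = 0; i < junkSize && !eventLog.isEmpty(); i++) {
            eventLog.poll();
        }
        return total <= junkSize;
    }

    // Caller side: repeat until the method reports that nothing is left over.
    static int flushAll(int junkSize) {
        int calls = 1;
        while (!flushEventLog(junkSize)) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 40; i++) {
            eventLog.add("entry-" + i);
        }
        System.out.println("calls needed: " + flushAll(16));
    }
}
```

Because each call removes only a bounded block, the work already done by earlier calls is kept even if a later call fails.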
See also: LuceneSearchService
Copyright © 2006–2021 Imixs Software Solutions GmbH. All rights reserved.