IMiS ARChive Elasticsearch plugin configuration
Product: IMiS/ARChive
Release: Since 10.3.2210
Date: 01/19/2024
Case:
IMiS ARChive supports the Elasticsearch search engine for indexing and searching documents stored on the server.
This article describes how to configure the Elasticsearch plugin for searching and indexing.
Description:
IMiS ARChive supports searching and indexing using Elasticsearch as an external search engine.
Elasticsearch is accessed through an IMiS ARChive plugin. The plugin is configured with XML configuration tags,
which are listed in the following sections.
Network-related XML tags:
"<ConnectionTimeout>": connection timeout in milliseconds (default 10000 milliseconds).
"<ReadTimeout>": read response timeout in milliseconds (default 10000 milliseconds).
"<SSLKSType>": SSL keystore type (all Java keystore types are supported).
"<SSLKSFile>": SSL keystore full path.
"<SSLKSPassword>": SSL keystore password.
"<SSLTSType>": SSL truststore type (all Java truststore types are supported). Use the "IAKS" type to use the internal certificate store of IMiS ARChive Server.
"<SSLTSFile>": SSL truststore full path.
"<SSLTSPassword>": SSL truststore password. If the "IAKS" type is used, this tag is omitted.
"<SSLProtocols>": supported SSL protocols. Multiple values must be separated with ",". If multiple values are used, they are processed in the order in which they are defined in the tag.
"<Proxy>": root tag for proxy support. It must contain an "<Address>" tag, which holds the proxy URL with port. Optionally, it may contain "<Username>" and "<Password>" tags if proxy authentication is required.
Elasticsearch-related XML tags:
"<IndexUrl>": Full path (URL) to the Elasticsearch index.
"<AuthenticationUsername>" and "<AuthenticationPassword>": User name and password, if HTTP authentication is used for Elasticsearch (see Example 1).
"<ApiKey>": API key value, if an API key is used for Elasticsearch authentication.
"<QueryPageSize>": Number of hits when searching (default 5000 hits).
"<NestedHitsSize>": Number of nested hits when searching (default 100 hits).
"<Debug>": When true, the plugin writes debug information (input parameters, Elasticsearch query, ...) to the server log. The default value is false.
"<FTSDisabledPropertyName>": Contains the name of an attribute that will be excluded from full text search. The corresponding attribute type is passed in the required "type" XML attribute.
"<FTSEnabledPropertyName>": Contains the name of an attribute that will be included in full text search. The corresponding attribute type is passed in the required "type" XML attribute.
"<QueryAsConstantScoreFilter>": True executes the query with scoring disabled; false (the default) executes the query with scoring enabled.
"<RefreshOnIndex>": Specifies what kind of refresh is executed when a document is indexed. The values supported by Elasticsearch are "true", "false" and "wait_for". By default, the value is not set, so Elasticsearch works with its predefined configuration.
"<MaxContentLength>": Number of bytes allowed for content indexing. The cumulative JSON size (content and metadata) must not exceed the Elasticsearch setting "http.max_content_length", which is 100MB by default. The default value of "MaxContentLength" is 80MB. If the content size exceeds the "MaxContentLength" limit, the content to be indexed is truncated to that limit.
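The network and Elasticsearch tags listed above might be combined as in the following sketch. It assumes, as in the examples later in this article, that all tags are placed directly inside "<Arguments>". The proxy address and credentials, the timeout and page-size values, and the protocol names (assumed to follow Java naming) are hypothetical placeholders rather than recommended settings. The "MaxContentLength" value is the default 80MB written in bytes, assuming 1MB = 1024 × 1024 bytes, the convention Example 4 uses for 40MB.

```xml
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <ApiKey>base64 encoded elasticsearch api key</ApiKey>

  <!-- Network: hypothetical timeouts (5 s to connect, 30 s to read a response) -->
  <ConnectionTimeout>5000</ConnectionTimeout>
  <ReadTimeout>30000</ReadTimeout>

  <!-- Truststore settings as used in the article's own examples -->
  <SSLTSType>JKS</SSLTSType>
  <SSLTSFile>/iarc/work/fti/es.jks</SSLTSFile>
  <SSLTSPassword>password</SSLTSPassword>
  <!-- Comma-separated, processed in the listed order; names are assumed -->
  <SSLProtocols>TLSv1.3,TLSv1.2</SSLProtocols>

  <!-- Proxy with optional authentication; address and credentials are placeholders -->
  <Proxy>
    <Address>http://proxy.example.com:3128</Address>
    <Username>proxyuser</Username>
    <Password>proxypassword</Password>
  </Proxy>

  <!-- Elasticsearch: illustrative values below the defaults (5000 and 100) -->
  <QueryPageSize>1000</QueryPageSize>
  <NestedHitsSize>50</NestedHitsSize>
  <!-- Write input parameters and queries to the server log -->
  <Debug>true</Debug>
  <!-- Execute queries with scoring disabled -->
  <QueryAsConstantScoreFilter>true</QueryAsConstantScoreFilter>
  <!-- One of the values supported by Elasticsearch: true, false, wait_for -->
  <RefreshOnIndex>wait_for</RefreshOnIndex>
  <!-- 80 MB = 80 * 1024 * 1024 = 83886080 bytes -->
  <MaxContentLength>83886080</MaxContentLength>
</Arguments>
```

Tags that are left out keep their documented defaults, so a real configuration would normally contain only the settings that need to differ.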
The following table shows the mapping between the supported IMiS ARChive attribute types and the Elasticsearch plugin attribute types.
These types are used when configuring "FTSDisabledPropertyName" and "FTSEnabledPropertyName":
IMiS ARChive attribute type    Elasticsearch plugin attribute type
STRING10                       18
STRING20                       19
STRING30                       20
STRING40                       21
STRING50                       22
STRING100                      23
STRING200                      24
FILE                           42
By default, full text search uses all searchable attributes of type "string" (STRING10 to STRING200) and
all searchable attributes of type FILE. Defining a "FTSDisabledPropertyName" tag excludes the attribute from full text search.
If the tag is removed, the attribute is used in full text search again. This is the preferred way of enabling or disabling
full text search attributes, rather than using the "FTSEnabledPropertyName" configuration tag.
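For illustration, the fragment below excludes a string attribute from full text search. The attribute name 'custom:Notes' is hypothetical and is assumed to be of type STRING200, which the table above maps to plugin type 24. The authentication and SSL tags are omitted for brevity.

```xml
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <!-- 'custom:Notes' is a hypothetical STRING200 attribute (plugin type 24) -->
  <FTSDisabledPropertyName type="24">custom:Notes</FTSDisabledPropertyName>
</Arguments>
```

Removing the "FTSDisabledPropertyName" line again would return the attribute to full text search, as described above.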
Processing order of disabled and enabled full text search properties:
1. Disabled properties.
2. Enabled properties.
The plugin uses the Elasticsearch Query DSL (domain-specific language) to perform different search operations:
Simple query string query for full text search.
Combination of match and range queries for attribute searching.
Nested query for content searching.
There are a few limitations when searching with the Elasticsearch engine compared to the IMiS ARChive embedded database:
INT128 and UINT128 attribute types are not supported for indexing and searching with Elasticsearch.
The exclusive or (xor) operation is supported only on attribute nodes in Elasticsearch.
The 'sys:Accessed' attribute is not indexed in Elasticsearch.
Full text search is always executed on the Elasticsearch engine. Manual search with attributes may be executed either on Elasticsearch or on the IMiS ARChive embedded database.
When the Elasticsearch plugin configuration is changed, the IMiS ARChive server must be restarted for the changes to take effect.
The following examples demonstrate Elasticsearch plugin configuration.
Example 1: Default configuration with HTTP authentication:
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <AuthenticationUsername>username</AuthenticationUsername>
  <AuthenticationPassword>password</AuthenticationPassword>
  <SSLTSType>JKS</SSLTSType>
  <SSLTSFile>/iarc/work/fti/es.jks</SSLTSFile>
  <SSLTSPassword>password</SSLTSPassword>
  <SSLProtocols>TLS</SSLProtocols>
</Arguments>
Example 2: Default configuration with API key authentication:
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <ApiKey>base64 encoded elasticsearch api key</ApiKey>
  <SSLTSType>JKS</SSLTSType>
  <SSLTSFile>/iarc/work/fti/es.jks</SSLTSFile>
  <SSLTSPassword>password</SSLTSPassword>
  <SSLProtocols>TLS</SSLProtocols>
</Arguments>
Example 3: Attribute 'sys:Content' is excluded from full text search:
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <AuthenticationUsername>username</AuthenticationUsername>
  <AuthenticationPassword>password</AuthenticationPassword>
  <SSLTSType>JKS</SSLTSType>
  <SSLTSFile>/iarc/work/fti/es.jks</SSLTSFile>
  <SSLTSPassword>password</SSLTSPassword>
  <SSLProtocols>TLS</SSLProtocols>
  <FTSDisabledPropertyName type="42">sys:Content</FTSDisabledPropertyName>
</Arguments>
Searching on 'sys:Content' is still possible through an explicit attribute search.
Example 4: Content indexing limited to 40MB:
<Arguments>
  <IndexUrl>https://elastic-server:9200/imis-archive-fti-index/</IndexUrl>
  <AuthenticationUsername>username</AuthenticationUsername>
  <AuthenticationPassword>password</AuthenticationPassword>
  <SSLTSType>JKS</SSLTSType>
  <SSLTSFile>/iarc/work/fti/es.jks</SSLTSFile>
  <SSLTSPassword>password</SSLTSPassword>
  <SSLProtocols>TLS</SSLProtocols>
  <MaxContentLength>41943040</MaxContentLength>
</Arguments>
Related Documents:
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-refresh.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
https://www.infoq.com/articles/similarity-scoring-elasticsearch/
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html
Copyright © Imaging Systems Ltd, 2024