5.0.0
Release date: 2022-05-18 22:19:56
Latest release of manticoresoftware/manticoresearch: 6.3.8 (2024-11-22 19:09:45)
Manticore Search 5.0.0, May 18th 2022
Release blogpost https://manticoresearch.com/blog/manticore-search-5-0-0/
Major new features
- 🔬 Support for Manticore Columnar Library 1.15.2, which enables the beta version of secondary indexes. Building secondary indexes is on by default for plain and real-time columnar and row-wise indexes (if Manticore Columnar Library is in use), but to enable them for searching you need to set `secondary_indexes = 1` either in your configuration file or using SET GLOBAL. The new functionality is supported on all operating systems except the old Debian Stretch and Ubuntu Xenial.
- Read-only mode: you can now specify listeners that process only read queries and discard any writes.
- New `/cli` endpoint that makes running SQL queries over HTTP even easier.
- Faster bulk INSERT/REPLACE/DELETE via JSON over HTTP: previously you could provide multiple write commands via the HTTP JSON protocol, but they were processed one by one; now they are handled as a single transaction.
- #720 Nested filters support in JSON protocol. Previously you couldn't express things like `a=1 and (b=2 or c=3)` in JSON: `must` (AND), `should` (OR) and `must_not` (NOT) worked only on the highest level. Now they can be nested.
- Support for Chunked transfer encoding in HTTP protocol. You can now use chunked transfer in your application to transfer large batches with lower resource consumption (since you don't need to calculate `Content-Length`). On the server's side Manticore now always processes incoming HTTP data in a streaming fashion without waiting for the whole batch to be transferred, as it did previously, which:
  - decreases peak RAM consumption, which lowers the chance of OOM
  - decreases response time (our tests showed an 11% decrease for processing a 100MB batch)
  - lets you overcome `max_packet_size` and transfer batches much larger than its largest allowed value (128MB), e.g. 1GB at once.
- #719 HTTP interface support of `100 Continue`: now you can transfer large batches from `curl` (including curl libraries used by various programming languages), which by default sends `Expect: 100-continue` and waits some time before actually sending the batch. Previously you had to add an `Expect:` header, now it's not needed.
- Pseudo sharding is enabled by default.
- Having at least one full-text field in a real-time/plain index is not mandatory anymore. You can now use Manticore even in cases not having anything to do with full-text search.
- Fast fetching for attributes backed by Manticore Columnar Library: queries like `select * from <columnar table>` are now much faster than previously, especially if there are many fields in the schema.
- ⚠️ BREAKING CHANGE: Implicit cutoff. Manticore now doesn't spend time and resources processing data you don't need in the result set which will be returned. The downside is that it affects `total_found` in SHOW META and hits.total in JSON output. It is now only accurate when you see `total_relation: eq`, while `total_relation: gte` means the actual number of matching documents is greater than the `total_found` value you've got. To retain the previous behaviour you can use the search option `cutoff=0`, which makes `total_relation` always `eq`.
- ⚠️ BREAKING CHANGE: All full-text fields are now stored by default in plain indexes. You need to use `stored_fields =` (empty value) to make all fields non-stored (i.e. revert to the previous behaviour).
- #715 HTTP JSON supports search options.
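To make the nested filters item (#720) concrete, here is a minimal sketch of how `a=1 and (b=2 or c=3)` might be written as a JSON search body. The index name `test`, the attribute names and the use of `equals` filters are assumptions for illustration. The `matches` helper is a hypothetical local evaluator that only demonstrates how the nesting reads; it does not reproduce the server's behaviour.

```python
import json


def nested_filter_query(index: str) -> dict:
    """Build a JSON search body for: a=1 AND (b=2 OR c=3)."""
    return {
        "index": index,  # assumed index name, supplied by the caller
        "query": {
            "bool": {
                "must": [
                    {"equals": {"a": 1}},
                    # A nested bool clause: this is the part that
                    # previously worked only on the highest level.
                    {"bool": {"should": [
                        {"equals": {"b": 2}},
                        {"equals": {"c": 3}},
                    ]}},
                ]
            }
        },
    }


def matches(node: dict, doc: dict) -> bool:
    """Hypothetical local evaluator, only to illustrate the nesting."""
    if "equals" in node:
        return all(doc.get(k) == v for k, v in node["equals"].items())
    clause = node["bool"]
    must_ok = all(matches(n, doc) for n in clause.get("must", []))
    should = clause.get("should", [])
    should_ok = any(matches(n, doc) for n in should) if should else True
    not_ok = not any(matches(n, doc) for n in clause.get("must_not", []))
    return must_ok and should_ok and not_ok


body = nested_filter_query("test")
print(json.dumps(body, indent=2))
```

The printed body is what you would send to the JSON search endpoint. The evaluator is there only so the AND/OR structure can be checked without a running server.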
Minor changes
- ⚠️ BREAKING CHANGE: Index meta file format change. Previously meta files (`.meta`, `.sph`) were in binary format, now they are plain JSON. The new Manticore version will convert older indexes automatically, but:
  - you can get a warning like `WARNING: ... syntax error, unexpected TOK_IDENT`
  - you won't be able to run the index with previous Manticore versions, so make sure you have a backup
- ⚠️ BREAKING CHANGE: Session state support with the help of HTTP keep-alive. This makes HTTP stateful when the client supports it too. For example, using the new /cli endpoint and HTTP keep-alive (which is on by default in all browsers) you can call `SHOW META` after `SELECT` and it will work the same way it works via mysql. Note that previously the `Connection: keep-alive` HTTP header was supported too, but it only caused the same connection to be reused. Since this version it also makes the session stateful.
- You can now specify `columnar_attrs = *` to define all your attributes as columnar in the plain mode, which is useful when the list is long.
- Faster replication SST
- ⚠️ BREAKING CHANGE: Replication protocol has been changed. If you are running a replication cluster, then when upgrading to Manticore 5 you need to:
  - stop all your nodes cleanly first
  - and then start the node which was stopped last with `--new-cluster` (run the tool `manticore_new_cluster` in Linux).
  - read about restarting a cluster for more details.
- Replication improvements:
  - Faster SST
  - Noise resistance, which can help in case of an unstable network between replication nodes
  - Improved logging
- Security improvement: Manticore now listens on `127.0.0.1` instead of `0.0.0.0` in case no `listen` at all is specified in the config. Even though the default configuration shipped with Manticore Search specifies the `listen` setting and it's not typical to have a configuration with no `listen` at all, it's still possible. Previously Manticore would listen on `0.0.0.0`, which is not secure; now it listens on `127.0.0.1`, which is usually not exposed to the Internet.
- Faster aggregation over columnar attributes.
- Increased `AVG()` accuracy: previously Manticore used `float` internally for aggregations, now it uses `double`, which increases the accuracy significantly.
- Improved support for JDBC MySQL driver.
- `DEBUG malloc_stats` support for jemalloc.
- optimize_cutoff is now available as a per-table setting which can be set when you CREATE or ALTER a table.
- ⚠️ BREAKING CHANGE: query_log_format is now `sphinxql` by default. If you are used to the `plain` format you need to add `query_log_format = plain` to your configuration file.
- Significant memory consumption improvements: Manticore now consumes significantly less RAM under long and intensive insert/replace/optimize workloads when stored fields are used.
- shutdown_timeout default value was increased from 3 seconds to 60 seconds.
- Commit ffd0499d Support for Java mysql connector >= 6.0.3: in Java mysql connector 6.0.3 they changed the way they connect to mysql, which broke compatibility with Manticore. The new behaviour is now supported.
- Commit 1da6dbec disabled saving a new disk chunk on loading an index (e.g. on searchd startup).
- Issue #746 Support for glibc >= 2.34.
- Issue #784 Count 'VIP' connections separately from usual (non-VIP) ones. Previously VIP connections were counted towards the `max_connections` limit, which could cause a "maxed out" error for non-VIP connections. Now VIP connections are not counted towards the limit. The current number of VIP connections can also be seen in `SHOW STATUS` and `status`.
- ID can now be specified explicitly.
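As a rough illustration of why moving `AVG()` from `float` to `double` matters, the sketch below averages the same values twice: once with every intermediate sum rounded to 32-bit precision, once in 64-bit doubles. This is a generic demonstration of accumulated rounding error, not Manticore's actual aggregation code; the helper names and the sample data are assumptions.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def average_float32(values):
    """Average with every intermediate sum rounded to 32-bit precision."""
    acc = 0.0
    for v in values:
        acc = to_float32(acc + to_float32(v))
    return to_float32(acc / len(values))


def average_double(values):
    """Average accumulated in 64-bit doubles (Python's native float)."""
    acc = 0.0
    for v in values:
        acc += v
    return acc / len(values)


values = [0.1] * 100_000  # assumed sample data; the true average is 0.1
err32 = abs(average_float32(values) - 0.1)
err64 = abs(average_double(values) - 0.1)
print(f"float error:  {err32:.3e}")
print(f"double error: {err64:.3e}")
```

The 32-bit accumulator drifts because once the running sum is large, each added `0.1` is rounded to the nearest representable step. The 64-bit accumulator has enough spare precision that the same drift is orders of magnitude smaller.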
⚠️ Other minor breaking changes
- ⚠️ BM25F formula has been slightly updated to improve search relevance. This only affects search results in case you use function BM25F(), it doesn't change behaviour of the default ranking formula.
- ⚠️ Changed behaviour of REST /sql endpoint: `/sql?mode=raw` now requires escaping and returns an array.
- ⚠️ Format change of the response of `/bulk` INSERT/REPLACE/DELETE requests:
  - previously each sub-query constituted a separate transaction and resulted in a separate response
  - now the whole batch is considered a single transaction, which returns a single response
- ⚠️ Search options `low_priority` and `boolean_simplify` now require a value (`0/1`): previously you could do `SELECT ... OPTION low_priority, boolean_simplify`, now you need to do `SELECT ... OPTION low_priority=1, boolean_simplify=1`.
- ⚠️ If you are using old php, python or java clients please follow the corresponding link and find an updated version. The old versions are not fully compatible with Manticore 5.
- ⚠️ HTTP JSON requests are now logged in a different format in mode `query_log_format=sphinxql`. Previously only the full-text part was logged, now the request is logged as is.
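If your application still builds queries with bare `low_priority` or `boolean_simplify` options, a small client-side rewrite can help during migration. The helper below is a hypothetical sketch, not part of any Manticore client. It assumes the last `OPTION` keyword in the string starts the option clause, so it will misbehave if `OPTION` appears inside a string literal after that point.

```python
import re

# Options that accepted a bare name before 5.0.0 and now need "=0" or "=1".
_FLAG_RE = re.compile(
    r"\b(low_priority|boolean_simplify)\b(?!\s*=)", re.IGNORECASE
)
_OPTION_RE = re.compile(r"\bOPTION\b", re.IGNORECASE)


def add_explicit_option_values(query: str) -> str:
    """Append "=1" to bare low_priority / boolean_simplify options."""
    found = list(_OPTION_RE.finditer(query))
    if not found:
        return query  # no OPTION clause, nothing to rewrite
    split_at = found[-1].end()
    head, tail = query[:split_at], query[split_at:]
    # Only touch the text after OPTION, and only flags without a value.
    return head + _FLAG_RE.sub(lambda m: m.group(1) + "=1", tail)


old = "SELECT * FROM t WHERE MATCH('a') OPTION low_priority, boolean_simplify"
print(add_explicit_option_values(old))
```

Options that already carry a value, such as `low_priority=0`, are left untouched, so the helper is safe to apply to queries that have already been migrated.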
New packages
- ⚠️ BREAKING CHANGE: because of the new structure, when you upgrade to Manticore 5 it's recommended to remove old packages before you install the new ones:
  - RPM-based: `yum remove manticore*`
  - Debian and Ubuntu: `apt remove manticore*`
- New deb/rpm packages structure. Previous versions provided:
  - `manticore-server` with `searchd` (main search daemon) and all needed for it
  - `manticore-tools` with `indexer` and `indextool`
  - `manticore` including everything
  - `manticore-all` RPM as a meta package referring to `manticore-server` and `manticore-tools`

  The new structure is:
  - `manticore` - deb/rpm meta package which installs all the above as dependencies
  - `manticore-server-core` - `searchd` and everything to run it alone
  - `manticore-server` - systemd files and other supplementary scripts
  - `manticore-tools` - `indexer`, `indextool` and other tools
  - `manticore-common` - default configuration file, default data directory, default stopwords
  - `manticore-icudata`, `manticore-dev`, `manticore-converter` didn't change much
  - `.tgz` bundle which includes all the packages
- Support for Ubuntu Jammy
- Support for Amazon Linux 2 via YUM repo
Bugfixes
- Issue #287 out of memory while indexing RT index
- Issue #604 Breaking change 3.6.0, 4.2.0 sphinxql-parser
- Issue #667 FATAL: out of memory (unable to allocate 9007199254740992 bytes)
- Issue #676 Strings not passed correctly to UDFs
- ❗Issue #698 Searchd crashes after trying to add a text column to a rt index
- Issue #705 Indexer couldn't find all columns
- ❗Issue #709 Grouping by json.boolean works wrong
- Issue #716 indextool commands related to index (eg. --dumpdict) failure
- ❗Issue #724 Fields disappear from the selection
- Issue #727 .NET HttpClient Content-Type incompatibility when using `application/x-ndjson`
- Issue #729 Field length calculation
- ❗Issue #730 create/insert into/drop columnar table has a memleak
- Issue #731 Empty column in results under certain conditions
- ❗Issue #749 Crash of daemon on start
- ❗Issue #750 Daemon hangs on start
- ❗Issue #751 Crash at SST
- Issue #752 Json attribute marked as columnar when engine='columnar'
- Issue #753 Replication listens on 0
- Issue #754 columnar_attrs = * is not working with csvpipe
- ❗Issue #755 Crash on select float in columnar in rt
- ❗Issue #756 Indextool changes rt index during check
- Issue #757 Need a check for listeners port range intersections
- Issue #758 Log original error in case RT index failed to save disk chunk
- Issue #759 Only one error reported for RE2 config
- ❗Issue #760 RAM consumption changes in commit 5463778558586d2508697fa82e71d657ac36510f
- Issue #761 3rd node doesn't make a non-primary cluster after dirty restart
- Issue #762 Update counter gets increased by 2
- Issue #763 New version 4.2.1 corrupt index created with 4.2.0 with morphology using
- Issue #764 No escaping in json keys /sql?mode=raw
- ❗Issue #765 Using function hides other values
- ❗Issue #766 Memleak triggered by a line in FixupAttrForNetwork
- ❗Issue #767 Memleak in 4.2.0 and 4.2.1 related with docstore cache
- Issue #768 Strange ping-pong with stored fields over network
- Issue #769 lemmatizer_base reset to empty if not mentioned in 'common' section
- Issue #770 pseudo_sharding makes SELECT by id slower
- Issue #771 DEBUG malloc_stats output zeros when using jemalloc
- Issue #772 Drop/add column makes value invisible
- Issue #773 Can't add column bit(N) to columnar table
- Issue #774 "cluster" gets empty on start in manticore.json
- ❗Commit 1da4ce89 HTTP actions are not tracked in SHOW STATUS
- Commit 381000ab disable pseudo_sharding for low frequency single keyword queries
- Commit 800325cc fixed stored attributes vs index merge
- Commit cddfeed6 generalized distinct value fetchers; added specialized distinct fetchers for columnar strings
- Commit fba4bb4f fixed fetching null integer attributes from docstore
- Commit f3009a92 `ranker` could be specified twice in query log