v1.10.0-rc.0
版本发布时间: 2024-07-29 17:19:05
meilisearch/meilisearch最新发布版本:v1.10.1(2024-09-02 17:59:14)
[!WARNING] Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.
With Meilisearch v1.10 we keep innovating by introducing the really demanded federated search! You can now apply multi-search requests and get one single list of results 🎉. This version also includes a setting to define your index languages (even if multiple languages are in your documents!), and new experimental features like the CONTAINS
operator and the ability to update a subset of your dataset by using a simple function.
New features and updates 🔥
Federated search
By using the POST /multi-search
endpoint, you can now return a single search result object, whose list of hits
is built by merging the hits coming from all the queries in descending ranking score order.
curl \
-X POST 'http://localhost:7700/multi-search' \
-H 'Content-Type: application/json' \
--data-binary '{
"federation": {
"offset": 5,
"limit": 10,
}
"queries": [
{
"q": "Batman",
"indexUid": "movies"
},
{
"q": "Batman",
"indexUid": "comics"
}
]
}'
Response:
{
"hits": [
{
"id": 42,
"title": "Batman returns",
"overview": "..",
"_federation": {
"indexUid": "movies",
"queriesPosition": 0
}
},
{
"comicsId": "batman-killing-joke",
"description": "..",
"title": "Batman: the killing joke",
"_federation": {
"indexUid": "comics",
"queriesPosition": 1
}
},
...
],
processingTimeMs: 0,
limit: 20,
offset: 0,
estimatedTotalHits: 2,
semanticHitCount: 0,
}
If federation
is empty ({}
) default values of offset
and limit
are used, so respectively 0 and 20.
If federation
is null or missing, a classic multi-search will be applied, so a list of search result objects for each index will be returned.
To customize the relevancy and the weight applied to each index in the search result, use the federationOptions
parameter in your request:
curl \
-X POST 'http://localhost:7700/multi-search' \
-H 'Content-Type: application/json' \
--data-binary '{
"federation": {},
"queries": [
{
"q": "apple red",
"indexUid": "fruits",
"filter": "BOOSTED = true",
"_showRankingScore": true,
"federationOptions": {
"weight": 3.0
}
},
{
"q": "apple red",
"indexUid": "fruits",
"_showRankingScore": true,
}
]
}'
weight
must be positive (>=0)
- if < 1.0, the hits from this query are less likely to appear in the results.
- if > 1.0, the hits from this query are more likely to appear in the results.
- if missing, the default value is applied (1.0)
📖 More information about the merge algorithm on the here.
Done by @dureuill in #4769.
Experimental: CONTAINS
filter operator
Enabling the experimental feature will make a new CONTAINS
operator available while filtering on strings.
This is similar to the SQL LIKE
operator used with %
.
Activate the experimental feature:
curl \
-X PATCH 'http://localhost:7700/experimental-features/' \
-H 'Content-Type: application/json' \
--data-binary '{
"containsFilter": true
}'
Use the newly introduced CONTAINS
operator:
curl \
-X POST http://localhost:7700/indexes/movies/search \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "super hero",
"filter": "synopsis NOT CONTAINS spider"
}'
🗣️ This is an experimental feature, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.
Done by @irevoire in #4804.
Languages settings
You can now set up the language of your index in your settings and during the search. This will prevent users from using alternative Meilisearch images we were separately created until now (like for Swedish and Japanese)
Done by @ManyTheFish in #4819.
Index settings
Use the newly introduced localizedAttributes
setting (here is an example of handling multi-language documents):
curl \
-X PATCH 'http://localhost:7700/indexes/movies/settings' \
-H 'Content-Type: application/json' \
--data-binary '{
"localizedAttributes": [
{"locales": ["jpn"], "attributePatterns": ["*_ja"]},
{"locales": ["eng"], "attributePatterns": ["*_en"]},
{"locales": ["cmn"], "attributePatterns": ["*_zh"]},
{"locales": ["fra", "ita"], "attributePatterns": ["latin.*"]},
{"locales": [], "attributePatterns": ["*"]}
]
}'
locales
is a list of language codes to assign to a pattern, the supported codes are: epo
, eng
, rus
, cmn
, spa
, por
, ita
, ben
, fra
, deu
, ukr
, kat
, ara
, hin
, jpn
, heb
, yid
, pol
, amh
, jav
, kor
, nob
, dan
, swe
, fin
, tur
, nld
, hun
, ces
, ell
, bul
, bel
, mar
, kan
, ron
, slv
, hrv
, srp
, mkd
, lit
, lav
, est
, tam
, vie
, urd
, tha
, guj
, uzb
, pan
, aze
, ind
, tel
, pes
, mal
, ori
, mya
, nep
, sin
, khm
, tuk
, aka
, zul
, sna
, afr
, lat
, slk
, cat
, tgl
, hye
.
attributePattern
is a pattern that can start or end with a *
to match one or several attributes.
"locales": [], "attributePatterns": ["*"]
means the is the default rule.
Notes:
- if an attribute matches several rules, only the first rule in the list will be applied
- if the locales list is empty, then Meilisearch is allowed to auto-detect any language in the matching attributes
- These rules are applied to the
searchableAttributes
, thefilterableAttributes
, and thesortableAttributes
.
At search time
The search route accepts a new parameter, locales
allowing the end-user to define the language used in the current query:
curl \
-X POST http://localhost:7700/indexes/movies/search \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "進撃の巨人",
"locales": ["jpn"]
}'
The locales
parameter overrides eventual locales
in the index settings.
Experimental: edit documents by using a function
You can edit documents by executing a Rhai function on all the documents of your database or a subset of them that you can select by a Meilisearch filter.
Activate the experimental feature:
curl \
-X PATCH 'http://localhost:7700/experimental-features/' \
-H 'Content-Type: application/json' \
--data-binary '{
"editDocumentsByFunction": true
}'
By indexing this movies dataset, you can run the following Rhai function on all of them. This function will uppercase the titles of the movies with an id
> 3000 and add sparkles around it. Rhai templating syntax is applied here:
curl http://localhost:7700/indexes/movies/documents/edit \
-H 'content-type: application/json' \
-d '{
"filter": "id > 3000",
"function": "doc.title = `✨ ${doc.title.to_upper()} ✨`"
}'
📖 More information here.
🗣️ This is an experimental feature and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.
Done by @Kerollmops in #4626.
Experimental AI-powered search: quality of life improvements
For the purpose of future stabilization of the feature, we are applying changes and quality-of-life improvements.
Done by @dureuill in #4801, #4815, #4818, #4822.
⚠️ Breaking changes - changing the parameters of the REST API
The old parameters of the REST API are too numerous and confusing.
Removed parameters: query
, inputField
, inputType
, pathToEmbeddings
and embeddingObject
.
Replaced by
-
request
: A JSON value that represents the request made by Meilisearch to the remote embedder. The text to embed must be replaced by the placeholder value“{{text}}”
. -
response
: A JSON value that represents a fragment of the response made by the remote embedder to Meilisearch. The embedding must be replaced by the placeholder value"{{embedding}}"
.
Before:
// v1.9 (old) version
{
"source": "rest",
"url": "https://localhost:10006",
"query": {
"model": "minillm",
},
"inputField": ["prompt"],
"inputType": "text",
"embeddingObject": ["embedding"]
}
// v1.10 (new) version
{
"source": "rest",
"url": "https://localhost:10006",
"request": {
"model": "minillm",
"prompt": "{{text}}"
},
"response": {
"embedding": "{{embedding}}"
}
}
[!CAUTION] This is a breaking change to the configuration of REST embedders. Importing a dump containing a REST embedder configuration will fail in v1.10 with an error: "Error: unknown field
query
, expected one ofsource
,model
,revision
,apiKey
,dimensions
,documentTemplate
,url
,request
,response
,distribution
at line 1 column 752".
Upgrade procedure (Cloud):
- Remove any embedder with source "rest"
- Follow the usual steps described here in the documentation
Upgrade procedure (Open-source users):
- Remove any embedder with source "rest"
- Follow the usual steps described here in the documentation
Add custom headers to REST embedders
When the source
of an embedder is set to rest
, headers
is now available as an optional parameter.
Previously, the parameter did not exist. Trying to use the headers
parameter for any other source
is an invalid_settings_embedder
error.
Embedding requests sent from Meilisearch to a remote REST embedder always contain two headers:
-
Authorization: Bearer <apiKey>
(only ifapiKey
was provided) -
Content-Type: application/json
When provided, headers
should be a JSON object whose keys represent the name of additional headers to send in requests, and the values represent the value of these additional headers.
If headers
is missing or null
for a rest
embedder, only Authorization
and Content-Type
are sent, as described above.
If headers
contains Authorization
and Content-Type
, the declared values will override the ones that are sent by default.
Other quality-of-life improvements
📖 More details here
- Add
url
parameter to the OpenAI embedder.url
should be an URL to the embedding endpoint (including the v1/embeddingspart) from OpenAI. Ifurl
is missing ornull
for anopenAi
embedder, the default OpenAI embedding route will be used (https://api.openai.com/v1/embeddings). -
dimensions
is now available as an optional parameter forollama
embedders. Previously it was only available for rest,openAi
anduserProvided
embedders. - Previously
_vectors.embedder
was omitted for documents without at least one embedding forembedder
. This was inconsistent and prevented the user from checking the value ofregenerate
. - When a request to a REST embedder fails, the duration of the exponential backoff is now randomized up to twice its base duration
- Truncate rather than embed by chunk when OpenAI embeddings are bigger than the max number of tokens
- Improve error message when indexing documents and embeddings are missing for a user-provided embedder
- Improve error message when a model configuration cannot be loaded and its "architectures" field does not contain "BertModel"
⚠️ Important change regarding the minimal Ubuntu version compatible with Meilisearch
Because the GitHub Actions runner now enforces the usage of a Node version that is not compatible with Ubuntu 18.04 anymore, we had to upgrade the minimal Ubuntu version compatible with Meilisearch. Indeed, we use these GitHub actions to build and provide our binaries.
Now, Meilisearch is only compatible with Ubuntu 20.04 and later and not with Ubuntu 18.4 anymore.
Done by @curquiza in #4783.
Other improvements
- Search speed optimization: implement intersection at the end of the search pipeline by @Kerollmops in #4717
- Indexing speed optimization: stop opening indexes to only check if they exist by @Karribalu in #4787
- Improve tenant token error messages by @irevoire in #4724
- Add null byte as hard context separator by @LukasKalbertodt in meilisearch/charabia#295
- Adds all math symbols to the default separator list by @phillitrOSU in meilisearch/charabia#301
Fixes 🐞
- Fix invalid primary key for big numbers @JWSong in #4725
- Fix wrong HTTP status and confusing error message on wrong payload by @Karribalu in #4716
- Fix the missing geo distance when one or both of the lat/lng are string by @irevoire in #4731
Misc
- Dependencies updates
- Update most of the dependencies by @irevoire in #4786
- Update yaup by @irevoire in #4703
- Bump docker/build-push-action from 5 to 6 by @dependabot in #4758
- Bump zerovec from 0.10.1 to 0.10.4 by @dependabot in #4785
- Update rustls as much as possible by @irevoire in #4806
- CIs and tests
- Fix CI with Rust v1.79 by @dureuill in #4723
- Fix flaky test by @irevoire in #4730
- Specify the rust toolchain by @irevoire in #4706
- Add
vX
Docker tag when publishing Docker image by @curquiza in #4761 - Add search benchmarks by @dureuill in #4762
- Add tests on the rest embedder by @irevoire and @dureuill in #4755
- Documentation
- Add june 11th webinar banner by @Strift in #4691
- Revert "Add june 11th webinar banner" by @curquiza in #4705
- Update the README to link more demos by @Kerollmops in #4711
- Update README.md by @Strift in #4721
- Change the Meilisearch logo to the kawaii version by @Kerollmops in #4778
- Misc
- New workload to ignore the initial compression phase by @Kerollmops in #4773
- Rename the sortable into the filterable movies workload by @Kerollmops in #4774
- Correct apk usages in Dockerfile by @PeterDaveHello in #4781
- Make milli use edition 2021 by @hanbings in #4770
- Allow
MEILI_NO_VERGEN
env var to skip vergen by @dureuill in #4812
❤️ Thanks again to our external contributors:
- Meilisearch: @Karribalu, @hanbings, @junhochoi, @JWSong, @PeterDaveHello.
- Charabia: @LukasKalbertodt, @phillitrOSU.
1、 meilisearch-linux-aarch64 113.61MB
2、 meilisearch-linux-amd64 115.1MB
3、 meilisearch-macos-amd64 110.91MB
4、 meilisearch-macos-apple-silicon 109.04MB
5、 meilisearch-windows-amd64.exe 110.82MB