v1.9.0
版本发布时间: 2024-07-01 13:51:03
meilisearch/meilisearch最新发布版本:v1.10.1(2024-09-02 17:59:14)
Meilisearch v1.9 includes performance improvements for hybrid search and the addition/updating of settings. This version benefits from multiple requested features, such as the new frequency
matching strategy and the ability to retrieve similar documents.
🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 to 48 hours after a new version becomes available.
Some SDKs might not include all new features. Consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).
New features and updates 🔥
Hybrid search updates
This release introduces multiple hybrid search updates.
Done by @dureuill and @irevoire in #4633 and #4649
⚠️ Breaking change: Empty _vectors.embedder
arrays
Empty _vectors.embedder
arrays are now interpreted as having no vector embedding.
Before v1.9, Meilisearch interpreted these as a single embedding of dimension 0. This change follows user feedback that the previous behavior was unexpected and unhelpful.
⚠️ Breaking change: _vectors
field no longer present in search results
When the experimental vectorStore
feature is enabled, Meilisearch no longer includes _vectors
in returned search results by default. This will considerably improve performance.
Use the new retrieveVectors
search parameter to display the _vectors
field:
curl \
-X POST 'http://localhost:7700/indexes/INDEX_NAME/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "SEARCH QUERY",
"retrieveVectors": true
}'
⚠️ Breaking change: Meilisearch no longer preserves the exact representation of embeddings appearing in _vectors
In order to save storage and run faster, Meilisearch is no longer storing your vector "as-is". Meilisearch now returns the float in a canonicalized representation rather than the user-provided representation.
For example, 3
may be represented as 3.0
Document _vectors
accepts object values
The document _vectors
field now accepts objects in addition to embedding arrays:
{
"id": 42,
"_vectors": {
"default": [0.1, 0.2 ],
"text": {
"embeddings": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
"regenerate": false
},
"translation": {
"embeddings": [0.1, 0.2, 0.3, 0.4],
"regenerate": true
}
}
}
The _vectors
object may contain two fields: embeddings
and regenerate
.
If present, embeddings
will replace this document's embeddings.
regenerate
must be either true
or false
. If regenerate: true
, Meilisearch will overwrite the document embeddings each time the document is updated in the future. If regenerate: false
, Meilisearch will keep the last provided or generated embeddings even if the document is updated in the future.
This change allows importing embeddings to autoembedders as a one-shot process, by setting them as regenerate: true
. This change also ensures embeddings are not regenerated when importing a dump created with Meilisearch v1.9.
Meilisearch v1.9.0 also improves performance when indexing and using hybrid search, avoiding useless operations and optimizing the important ones.
New feature: Ranking score threshold
Use rankingScoreThreshold
to exclude search results with low ranking scores:
curl \
-X POST 'http://localhost:7700/indexes/movies/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "Badman dark returns 1",
"showRankingScore": true,
"limit": 5,
"rankingScoreThreshold": 0.2
}'
Meilisearch does not return any documents below the configured threshold. Excluded results do not count towards estimatedTotalHits
, totalHits
, and facet distribution.
⚠️ For performance reasons, if the number of documents above rankingScoreThreshold
is higher than limit
, Meilisearch does not evaluate the ranking score of the remaining documents. Results ranking below the threshold are not immediately removed from the set of candidates. In this case, Meilisearch may overestimate the count of estimatedTotalHits
, totalHits
and facet distribution.
Done by @dureuill in #4666
New feature: Get similar documents endpoint
This release introduces a new AI-powered search feature allowing you to send a document to Meilisearch and receive a list of similar documents in return.
Use the /indexes/{indexUid}/similar
endpoint to query Meilisearch for related documents:
curl \
-X POST /indexes/:indexUid/similar
-H 'Content-Type: application/json' \
--data-binary '{
"id": "23",
"offset": 0,
"limit": 2,
"filter": "release_date > 1521763199",
"embedder": "default",
"attributesToRetrieve": [],
"showRankingScore": false,
"showRankingScoreDetails": false
}'
-
id
: string indicating the document needing similar results, required -
offset
: number of results to skip when paginating, optional, defaults to0
-
limit
: number of results to display, optional, defaults to20
-
filter
: string with a filter expression Meilisearch should apply to the results, optional, defaults tonull
-
embedder
: string indicating the embedder Meilisearch should use to retrieve similar documents, optional, defaults to"default"
-
attributesToRetrieve
: array of strings indicating which fields Meilisearch will include in the response, optional, defaults to["*"]
-
showRankingScore
: boolean indicating if results should include ranking score information, optional, defaults tofalse
-
showRankingScoreDetails
: boolean indicating if results should include detailed ranking score information, optional, defaults tofalse
-
rankingScoreThreshold
: Excludes search results with a ranking score lower than the defined number, optional, defaults tonull
.
/indexes/{indexUid}/similar
supports GET
and POST
routes. Use URL query parameters to configure your GET
request, or include your parameters in the request body if using the POST
route. Both offer identical functionality.
Done by @dureuill in #4647
New feature: frequency
matching strategy
This release adds a new matching strategy, frequency
. Use it to prioritize results containing the least frequent query terms:
curl \
-X POST 'http://localhost:7700/indexes/{index_uid}/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "cheval blanc",
"matchingStrategy": "frequency"
}'
Done by @ManyTheFish in #4667
Set distinctAttribute
at search time
This release introduces a new search parameter: distinct
which you can use to specify the distinct attribute at search time:
curl \
-X POST 'http://localhost:7700/indexes/{index_uid}/search' \
-H 'Content-Type: application/json' \
--data-binary '{
"q": "kefir le double poney",
"distinct": "book.isbn"
}'
If a distinct attribute is already defined in the settings it'll be ignored in favor of the one defined at search time.
Done by @Kerollmops in #4693
Improve indexing speed when updating/adding settings
Meilisearch now limits operations when importing settings by avoiding unnecessary writing operations in its internal database and reducing disk usage.
Additionally, when changing embedding settings, Meilisearch will now only regenerate the embeddings for the embedders whose settings have been modified, instead of for all embedders. When only the documentTemplate
is modified, embeddings will only be regenerated for documents where this modification leads to a different text to embed.
Done by @irevoire, @Kerollmops, @ManyTheFish and @dureuill in #4646, #4680, #4631 and #4649
Other improvements
- Speed up filter ANDs operations during the search (#4682) @Kerollmops
- Speed up facet distribution during the search (#4713) @Kerollmops
- Improve language support (#4684) @ManyTheFish @Soham1803 @mosuka @tkhshtsh0917
- Add new normalizer to normalize œ to oe and æ to ae
- Fix
chinese-normalization-pinyin
feature flag compilation
- Prometheus experimental feature: Use HTTP path pattern instead of full path in metrics (#4619) @gh2k
- ⚠️ Remove
exportPuffinReport
experimental feature. Use logs routes and logs modes instead (#4655) @Kerollmops
Fixes 🐞
- All fields now have the same impact on relevancy when
searchableAttributes: ["*"]
. Consult the GitHub issue for a detailed breakdown of these changes (#4631) @irevoire - Fix
searchableAttributes
behavior when handling nested fields. Consult the GitHub issue for more information (#4631) @irevoire - Fix security issue in dependency: bump Rustls to non-vulnerable versions (#4622) @Kerollmops
- Reset other embedding settings when changing the
source
of an embedder. This prevents misleading error messages when configuring the embedders (#4649) @dureuill - Fix panic in hybrid search when removing all embedders from the DB (#4715) @irevoire
- Hybrid search now respects the
offset
andlimit
parameters when returning keyword results early (#4746) @dureuill
Misc
- Dependencies updates
- Update actix-web 4.5.1 -> 4.6.0 (#4675) @dureuill
- Update mini-dashboard to 2.13 -> 2.14 (#4712) @curquiza
- CIs and tests
- Add "precommands" to benchmark (#4624) @dureuill
- Allow to comment with the results of benchmark invocation (#4651) @dureuill
- Fix ci tests (#4685) @ManyTheFish
- Documentation
- Update README.md (#4664) @tpayet
- Misc
- Fix comment typos (#4568) @yudrywet
- Fix comment typos (#4582) @writegr
❤️ Thanks again to our external contributors:
- Meilisearch: @gh2k, @writegr, @yudrywet.
- Charabia: @mosuka, @Soham1803, @tkhshtsh0917.
1、 meilisearch-linux-aarch64 110.75MB
2、 meilisearch-linux-amd64 112.09MB
3、 meilisearch-macos-amd64 108.24MB
4、 meilisearch-macos-apple-silicon 106.49MB
5、 meilisearch-windows-amd64.exe 107.74MB
6、 meilisearch.deb 74.16MB