v1.7.0-rc.0
Release date: 2024-02-12 19:38:10
⚠️ Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.
Meilisearch v1.7.0 mostly focuses on improving v1.6.0 features, indexing speed and hybrid search. GPU computing is now supported.
New features and improvements 🔥
Improve AI with Meilisearch (experimental feature)
🗣️ AI work is still experimental, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.
To use it, you need to enable vectorStore through the /experimental-features route.
💡 More documentation about AI search with Meilisearch here.
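For reference, a minimal sketch of enabling it on a local instance (default port and no master key assumed; this is the same call shown in the GPU steps further down):
curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "vectorStore": true }'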
Add new OpenAI embedding models & ability to override their dimensions
When using openAi as source in your embedders index settings (an example here), you can now specify two new models:
- text-embedding-3-small with a default dimension of 1536
- text-embedding-3-large with a default dimension of 3072
The new models:
- are cheaper
- produce more relevant results in standardized tests
- allow setting the dimensions of the embeddings to control the trade-off between accuracy and performance (including storage)
This means it is now possible to pass the dimensions field when using the openAi source, which was previously only available for the userProvided source.
There are some rules, though, which we detail with these examples:
"embedders": {
"large": {
"source": "openAi",
"model": "text-embedding-3-large",
"dimensions": 512 // must be >0, must be <= 3072 for "text-embedding-3-large"
},
"small": {
"source": "openAi",
"model": "text-embedding-3-small",
"dimensions": 1024 // must be >0, must be <= 1536 for "text-embedding-3-small"
},
"legacy": {
"source": "openAi",
"model": "text-embedding-ada-002",
"dimensions": 1536 // must =1536 for "text-embedding-ada-002"
},
"omitted_dimensions": { // uses the default dimension
"source": "openAi",
"model": "text-embedding-ada-002",
}
}
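As an illustration, here is a minimal sketch of pushing one of the embedders above to an index through the settings route (your_index, the apiKey placeholder, and the local default port are assumptions to adapt to your setup):
curl \
  -X PATCH 'http://localhost:7700/indexes/your_index/settings/embedders' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "small": { "source": "openAi", "apiKey": "<your-openai-api-key>", "model": "text-embedding-3-small", "dimensions": 1024 } }'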
Done in #4375 by @Gosti.
Add GPU support to compute embeddings
Enabling the cuda feature allows using an available GPU to compute embeddings with a huggingFace embedder.
On an AWS Graviton 2, this yields a 3x to 5x improvement in indexing time.
👇 How to enable GPU support through CUDA for HuggingFace embedding generation:
Prerequisites
- Linux distribution with a compatible CUDA version
- NVIDIA GPU with CUDA support
- A recent Rust compiler to compile Meilisearch from source
Steps
- Follow the guide to install the CUDA dependencies
- Clone Meilisearch:
git clone https://github.com/meilisearch/meilisearch.git
- Compile Meilisearch with the cuda feature:
cargo build --release --package meilisearch --features cuda
- In the freshly compiled Meilisearch, enable the vector store experimental feature:
curl \
-X PATCH 'http://localhost:7700/experimental-features/' \
-H 'Content-Type: application/json' \
--data-binary '{ "vectorStore": true }'
- Add a huggingFace embedder to the settings:
curl \
-X PATCH 'http://localhost:7700/indexes/your_index/settings/embedders' \
-H 'Content-Type: application/json' --data-binary \
'{ "default": { "source": "huggingFace" } }'
Done by @dureuill in #4304.
Improve indexing speed & reduce memory crashes
- Auto-batch the task deletions to reduce indexing time (#4316) @irevoire
- Improve indexing speed for the vector store (indexing for the hybrid search experimental feature is now more than 10 times faster) (#4332) @Kerollmops @irevoire
- Reduce memory usage, and therefore memory-related crashes, by capping the maximum memory used by the grenad sorters (#4388) @Kerollmops
Stabilize the scoreDetails feature
In v1.3.0, we introduced the experimental scoreDetails feature. We received enough positive feedback on it, and we are now stabilizing it: the feature is enabled by default.
View detailed scores per ranking rule for each document with the showRankingScoreDetails search parameter:
curl \
-X POST 'http://localhost:7700/indexes/movies/search' \
-H 'Content-Type: application/json' \
--data-binary '{ "q": "Batman Returns", "showRankingScoreDetails": true }'
When showRankingScoreDetails is set to true, returned documents include a _rankingScoreDetails field. This field contains score values for each ranking rule.
"_rankingScoreDetails": {
"words": {
"order": 0,
"matchingWords": 1,
"maxMatchingWords": 1,
"score": 1.0
},
"typo": {
"order": 1,
"typoCount": 0,
"maxTypoCount": 1,
"score": 1.0
},
"proximity": {
"order": 2,
"score": 1.0
},
"attribute": {
"order": 3,
"attributes_ranking_order": 0.8,
"attributes_query_word_order": 0.6363636363636364,
"score": 0.7272727272727273
},
"exactness": {
"order": 4,
"matchType": "noExactMatch",
"matchingWords": 0,
"maxMatchingWords": 1,
"score": 0.3333333333333333
}
}
Done by @dureuill in #4389.
Logs improvements
We made some changes regarding our logs to help with debugging and bug reporting.
Done by @irevoire in #4391
Log format change
⚠️ If you did any automation based on Meilisearch logs, be aware of the changes. More information here.
The default log format evolved slightly from this:
[2024-02-06T14:54:11Z INFO actix_server::builder] starting 10 workers
To this:
2024-02-06T13:58:14.710803Z INFO actix_server::builder: 200: starting 10 workers
Experimental: new routes to manage logs
This new version of Meilisearch introduces new experimental routes:
- POST /logs/stream: streams the logs happening in real-time. Requires two parameters:
  - target: selects which logs you're interested in. It takes the form of code_part=log_level. For example, index_scheduler=info
  - mode: selects the log format you want. Two options are available: human (basic logs) or profile (a much more detailed trace)
- DELETE /logs/stream: stops the listener from the Meilisearch perspective. Does not require any parameters.
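For example, a hedged sketch of opening a stream and then closing it (assumes the experimental route has been enabled as described in the section referenced below, and a local instance on the default port; the target value is only an example):
curl \
  -X POST 'http://localhost:7700/logs/stream' \
  -H 'Content-Type: application/json' \
  --data-binary '{ "target": "index_scheduler=info", "mode": "human" }'
curl -X DELETE 'http://localhost:7700/logs/stream'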
💡 More information in the New experimental routes section of this file.
⚠️ Some remarks on this POST /logs/stream route:
- You can have only one listener at a time
- Listening to the route doesn't seem to work with xh or httpie for the moment
- When killing the listener, it may stay installed on Meilisearch for some time, and you will need to call the DELETE /logs/stream route to get rid of it
🗣️ This feature is experimental, and we need your help to improve it! Share your thoughts and feedback on this GitHub discussion.
⚠️ Experimental features may be incompatible between Meilisearch versions.
Other improvements
- Related to the Prometheus experimental feature: add job variable to Grafana dashboard (#4330) @capJavert
Misc
- Dependencies upgrade
- Bump rustls-webpki from 0.101.3 to 0.101.7 (#4263)
- Bump h2 from 0.3.20 to 0.3.24 (#4345)
- Update the dependencies (#4332) @Kerollmops
- CIs and tests
- Update SDK test dependencies (#4293) @curquiza
- Documentation
- Add Setting API reminder in issue template (#4325) @ManyTheFish
- Update README (#4319) @codesmith-emmy
- Misc
- Fix compilation warnings (#4295) @irevoire
❤️ Thanks again to our external contributors:
- Meilisearch: @capJavert, @codesmith-emmy and @Gosti
1. meilisearch-linux-aarch64 121.03MB
2. meilisearch-linux-amd64 122.39MB
3. meilisearch-macos-amd64 114.02MB
4. meilisearch-macos-apple-silicon 112.48MB
5. meilisearch-windows-amd64.exe 113.59MB