v1.8.0
Release date: 2023-05-31 22:53:21
Latest argilla-io/argilla release: v2.1.0 (2024-09-05 23:11:08)
🔆 Highlights
New Feedback Task 🎉
Big welcome to our new `FeedbackDataset`! This new type of dataset is designed to cover the specific needs of working with LLMs. Use this task to gather demonstration examples, collect human feedback, or curate other datasets. Questions of different types can be combined so you can adapt your dataset to the specific needs of your project. Currently, it supports `RatingQuestion` and `TextQuestion`, and more question types will be added in the coming releases. In addition, these datasets support multiple annotations: all users with access to the dataset can give their responses.
The `FeedbackDataset` has an enhanced integration with the Hugging Face Hub, so that saving a dataset to the Hub or pushing a `FeedbackDataset` from the Hub directly to Argilla is seamless.
Check all the things you can do with Feedback Tasks in our docs
New LLM section in our docs
We've added a new section in our docs that covers:
- Useful concepts around working with LLMs
- How-to guides that cover all the functionalities of the new Feedback Task
- End-to-end examples
More training integrations
We've added new frameworks for the `ArgillaTrainer`: `ArgillaPeftTrainer` for Text and Token Classification and `ArgillaAutoTrainTrainer` for Text Classification.
Changelog 1.8.0
Added
- `/api/v1/datasets` new endpoint to list and create datasets ([#2615])
- `/api/v1/datasets/{dataset_id}` new endpoint to get and delete datasets ([#2615])
- `/api/v1/datasets/{dataset_id}/publish` new endpoint to publish a dataset ([#2615])
- `/api/v1/datasets/{dataset_id}/questions` new endpoint to list and create dataset questions ([#2615])
- `/api/v1/datasets/{dataset_id}/fields` new endpoint to list and create dataset fields ([#2615])
- `/api/v1/datasets/{dataset_id}/questions/{question_id}` new endpoint to delete a dataset question ([#2615])
- `/api/v1/datasets/{dataset_id}/fields/{field_id}` new endpoint to delete a dataset field ([#2615])
- `/api/v1/workspaces/{workspace_id}` new endpoint to get workspaces by id ([#2615])
- `/api/v1/responses/{response_id}` new endpoint to update and delete a response ([#2615])
- `/api/v1/datasets/{dataset_id}/records` new endpoint to create and list dataset records ([#2615])
- `/api/v1/me/datasets` new endpoint to list user visible datasets ([#2615])
- `/api/v1/me/dataset/{dataset_id}/records` new endpoint to list dataset records with user responses ([#2615])
- `/api/v1/me/datasets/{dataset_id}/metrics` new endpoint to get the dataset user metrics ([#2615])
- `/api/v1/me/records/{record_id}/responses` new endpoint to create record user responses ([#2615])
- showing new feedback task datasets in datasets list ([#2719])
- new page for feedback task ([#2680])
- show feedback task metrics ([#2822])
- user can delete dataset in dataset settings page ([#2792])
- Support for `FeedbackDataset` in Python client (parent PR [#2615], and nested PRs: [#2949], [#2827], [#2943], [#2945], [#2962], and [#3003])
- Integration with the HuggingFace Hub ([#2949])
- Added `ArgillaPeftTrainer` for text and token classification #2854
- Added `predict_proba()` method to `ArgillaSetFitTrainer`
- Added `ArgillaAutoTrainTrainer` for Text Classification #2664
- New `database revisions` command showing database revisions info

[#2615]: https://github.com/argilla-io/argilla/issues/2615
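As a rough illustration of how the new `/api/v1` routes listed above fit together, the sketch below builds full URLs from a few of the path templates in this changelog. The base URL `http://localhost:6900`, the `ENDPOINTS` table, and the `build_url` helper are assumptions made for this example; they are not part of the Argilla client, and the snippet sends no requests.

```python
from urllib.parse import quote

# Assumption: a locally running Argilla server; adjust to your deployment.
BASE_URL = "http://localhost:6900"

# Path templates copied from the changelog entries above.
ENDPOINTS = {
    "datasets": "/api/v1/datasets",
    "dataset": "/api/v1/datasets/{dataset_id}",
    "publish": "/api/v1/datasets/{dataset_id}/publish",
    "questions": "/api/v1/datasets/{dataset_id}/questions",
    "fields": "/api/v1/datasets/{dataset_id}/fields",
    "records": "/api/v1/datasets/{dataset_id}/records",
    "response": "/api/v1/responses/{response_id}",
    "my_datasets": "/api/v1/me/datasets",
}


def build_url(name: str, **ids: str) -> str:
    """Hypothetical helper: fill a path template and prefix the base URL."""
    # Percent-encode every id so it is safe to embed in a URL path segment.
    encoded = {key: quote(str(value), safe="") for key, value in ids.items()}
    return BASE_URL.rstrip("/") + ENDPOINTS[name].format(**encoded)


publish_url = build_url("publish", dataset_id="1234")
records_url = build_url("records", dataset_id="my dataset")
list_url = build_url("datasets")
```

Keeping the templates in one table makes it easy to see which resources hang off a dataset id and which are scoped to the current user under `/api/v1/me`.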
Fixes
- Avoid rendering html for invalid html strings in Text2text ([#2911](https://github.com/argilla-io/argilla/issues/2911))
Changed
- The `database migrate` command accepts a `--revision` param to provide a specific revision id
- `tokens_length` metrics function returns empty data (#3045)
- `token_length` metrics function returns empty data (#3045)
- `mention_length` metrics function returns empty data (#3045)
- `entity_density` metrics function returns empty data (#3045)
Deprecated
- Using argilla with python 3.7 runtime is deprecated and support will be removed from version 1.9.0 (#2902)
- `tokens_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `token_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `mention_length` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
- `entity_density` metrics function has been deprecated and will be removed in 1.10.0 (#3045)
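For readers unfamiliar with how a deprecation like the ones above usually surfaces in Python code, here is a minimal sketch using the standard `warnings` module. The `token_length` function below is a hypothetical stand-in written for this example, not Argilla's implementation; it only mirrors the behaviour described in this changelog (a deprecation notice plus empty data).

```python
import warnings


def token_length(dataset_name: str) -> dict:
    """Hypothetical stand-in for a deprecated metrics function."""
    warnings.warn(
        "token_length is deprecated and will be removed in 1.10.0",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    # Per the changelog, the deprecated metrics now return empty data.
    return {}


# Capture warnings explicitly so the deprecation is visible to the caller.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = token_length("my-dataset")

deprecation_messages = [str(w.message) for w in caught]
```

Running existing code with `python -W error::DeprecationWarning` is one way to find such calls before the functions are removed.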
Removed
- Removed mention `density`, `tokens_length` and `chars_length` metrics from token classification metrics storage (#3045)
- Removed token `char_start`, `char_end`, `tag`, and `score` metrics from token classification metrics storage (#3045)
- Removed tags-related metrics from token classification metrics storage (#3045)
As always, thanks to our amazing contributors!
- Fix image alignment on token classification by @cceyda in https://github.com/argilla-io/argilla/pull/2779
- Update cloud_providers.md by @chainyo in https://github.com/argilla-io/argilla/pull/2866