v1.15.0
Release date: 2023-08-31 19:50:24
Latest release of argilla-io/argilla: v2.1.0 (2024-09-05 23:11:08)
🔆 Highlights
Argilla 1.15.0 comes with an enhanced FeedbackDataset settings page that lets you update the dataset settings, an integration of the TRL package with the ArgillaTrainer, and further improvements to the Python client for managing FeedbackDatasets.
⚙️ Update FeedbackDataset settings from the UI
The FeedbackDataset settings page has been updated and now lets you edit the guidelines and some attributes of the fields and questions of the dataset. Did you misspell the title or description of a field or question? You no longer have to delete your dataset and create it again. Just go to the settings page and fix it.
🤖 TRL integration with the ArgillaTrainer
The famous TRL package for training Transformers with Reinforcement Learning techniques has been integrated with the ArgillaTrainer, which now comes with four new TrainingTask options: SFT, Reward Modeling, PPO and DPO. Each training task expects a formatting function that returns the data in the format required for training the model.
Check this 🆕 tutorial for training a Reward Model using the Argilla Trainer.
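As a rough illustration of the formatting function mentioned above, the sketch below turns one annotated sample into a (chosen, rejected) pair, a shape that reward-modeling setups commonly consume. The field names (response-1, response-2, preference) and the layout of the sample dictionary are hypothetical, and wiring the function into a TrainingTask is not shown; the tutorial is the place to check the exact format your dataset produces.

```python
from typing import Any, Dict, Optional, Tuple


def reward_formatting_func(sample: Dict[str, Any]) -> Optional[Tuple[str, str]]:
    """Map one sample to a (chosen, rejected) text pair, or None to skip it.

    Assumed (hypothetical) sample layout:
      {"response-1": str, "response-2": str,
       "preference": [{"value": "response-1" | "response-2", "status": str}, ...]}
    """
    # Only consider annotations that were actually submitted.
    submitted = [
        r for r in sample.get("preference", []) if r.get("status") == "submitted"
    ]
    if not submitted:
        return None

    # Use the first submitted annotation as the preference signal.
    preferred = submitted[0]["value"]
    if preferred == "response-1":
        return sample["response-1"], sample["response-2"]
    if preferred == "response-2":
        return sample["response-2"], sample["response-1"]
    return None  # unknown label: skip the sample
```

Returning None for samples without a usable annotation keeps the skipping logic inside the formatting function, so the caller only has to drop empty results.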
🐍 Filter FeedbackDataset and remove suggestions
In the 1.14.0 release we added many improvements for working with remote FeedbackDatasets. In this release, a new filter_by method has been added that lets you filter the records of a dataset from the Python client. For now, records can only be filtered by response_status, but we plan to add more complex filters in upcoming releases. In addition, new methods have been added for removing the suggestions created for a record.
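The two features above could be combined roughly as follows. This is a sketch, not the client's reference usage: it assumes that filter_by accepts a response_status keyword and returns an object exposing records, and that delete_suggestions on a record accepts a list of its suggestions. The helpers are duck-typed, so they take whatever remote dataset object you already loaded from Argilla.

```python
from typing import Any, Iterable, List


def submitted_records(dataset: Any) -> List[Any]:
    """Return the records that already have a submitted response.

    `dataset` is expected to behave like a remote FeedbackDataset
    (assumption: `filter_by(response_status=...)` returns an object
    with a `records` attribute).
    """
    filtered = dataset.filter_by(response_status="submitted")
    return list(filtered.records)


def clear_suggestions(records: Iterable[Any]) -> int:
    """Remove every suggestion from the given records; return how many were removed.

    Assumption: each record exposes `suggestions` and a
    `delete_suggestions(...)` method that accepts a list of suggestions.
    """
    removed = 0
    for record in records:
        suggestions = list(record.suggestions)
        if suggestions:
            record.delete_suggestions(suggestions)
            removed += len(suggestions)
    return removed
```

A typical call would be clear_suggestions(submitted_records(dataset)), which drops suggestions only from records that annotators have already answered.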
1.15.0
Added
- Added the ability to update guidelines and dataset settings for Feedback Datasets directly in the UI (#3489).
- Added ArgillaTrainer integration with TRL, allowing for easy supervised finetuning, reward modeling, direct preference optimization and proximal policy optimization (#3467).
- Added formatting_func to ArgillaTrainer for FeedbackDataset datasets to add a custom formatting for the data (#3599).
- Added login function in argilla.client.login to log in to an Argilla server and store the credentials locally (#3582).
- Added login command to log in to an Argilla server (#3600).
- Added logout command to log out from an Argilla server (#3605).
- Added DELETE /api/v1/suggestions/{suggestion_id} endpoint to delete a suggestion given its ID (#3617).
- Added DELETE /api/v1/records/{record_id}/suggestions endpoint to delete several suggestions linked to the same record given their IDs (#3617).
- Added response_status param to GET /api/v1/datasets/{dataset_id}/records to be able to filter by response_status, as previously included for GET /api/v1/me/datasets/{dataset_id}/records (#3613).
- Added list classmethod to ArgillaMixin to be used as FeedbackDataset.list(), also including the workspace to list from as arg (#3619).
- Added filter_by method in RemoteFeedbackDataset to filter based on response_status (#3610).
- Added list_workspaces function (to be used as rg.list_workspaces, but Workspace.list is preferred) to list all the workspaces from a user in Argilla (#3641).
- Added list_datasets function (to be used as rg.list_datasets) to list the TextClassification, TokenClassification, and Text2Text datasets in Argilla (#3638).
- Added RemoteSuggestionSchema to manage suggestions in Argilla, including the delete method to delete suggestions from Argilla via DELETE /api/v1/suggestions/{suggestion_id} (#3651).
- Added delete_suggestions to RemoteFeedbackRecord to remove suggestions from Argilla via DELETE /api/v1/records/{record_id}/suggestions (#3651).
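For readers calling the REST API directly, the new and extended endpoints listed above can be addressed as in the sketch below, which only builds URLs and sends nothing. The paths and the response_status parameter come from the changelog; the helper names, the example base URL and the ids query parameter (a comma-separated list for the bulk delete) are assumptions, since the changelog does not say how the IDs are passed.

```python
from typing import Iterable
from urllib.parse import urlencode

API_PREFIX = "/api/v1"


def list_records_url(base_url: str, dataset_id: str, response_status: str) -> str:
    """URL for GET /api/v1/datasets/{dataset_id}/records filtered by response_status."""
    query = urlencode({"response_status": response_status})
    return f"{base_url.rstrip('/')}{API_PREFIX}/datasets/{dataset_id}/records?{query}"


def delete_suggestion_url(base_url: str, suggestion_id: str) -> str:
    """URL for DELETE /api/v1/suggestions/{suggestion_id}."""
    return f"{base_url.rstrip('/')}{API_PREFIX}/suggestions/{suggestion_id}"


def delete_record_suggestions_url(
    base_url: str, record_id: str, suggestion_ids: Iterable[str]
) -> str:
    """URL for DELETE /api/v1/records/{record_id}/suggestions.

    Assumption: the IDs travel in an `ids` query parameter, comma-separated.
    """
    query = urlencode({"ids": ",".join(suggestion_ids)}, safe=",")
    return f"{base_url.rstrip('/')}{API_PREFIX}/records/{record_id}/suggestions?{query}"


if __name__ == "__main__":
    base = "http://localhost:6900"  # hypothetical server address
    print(list_records_url(base, "my-dataset-id", "submitted"))
    print(delete_suggestion_url(base, "my-suggestion-id"))
    print(delete_record_suggestions_url(base, "my-record-id", ["s1", "s2"]))
```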
Changed
- Changed the Optional label for a * mark on required questions (#3608).
- Updated RemoteFeedbackDataset.delete_records to use batch delete records endpoint (#3580).
- Included allowed_for_roles for some RemoteFeedbackDataset, RemoteFeedbackRecords, and RemoteFeedbackRecord methods that are only allowed for users with roles owner and admin (#3601).
- Renamed ArgillaToFromMixin to ArgillaMixin (#3619).
- Moved users CLI app under database CLI app (#3593).
- Moved server Enum classes to argilla.server.enums module (#3620).
Fixed
- Fixed filter by workspace in breadcrumbs (#3577).
- Fixed filter by workspace in datasets table (#3604).
- Fixed query search highlight for Text2Text and TextClassification (#3621).
- Fixed RatingQuestion.values validation to raise a ValidationError when values are out of range, i.e. outside [1, 10] (#3626).
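To make the range rule in the last fix concrete, here is a minimal stand-alone sketch of such a check. It is not Argilla's implementation: the function name is hypothetical and it raises a plain ValueError rather than the ValidationError the changelog mentions.

```python
from typing import List


def validate_rating_values(values: List[int], low: int = 1, high: int = 10) -> List[int]:
    """Return `values` unchanged if every value lies in [low, high], else raise."""
    out_of_range = [v for v in values if not low <= v <= high]
    if out_of_range:
        raise ValueError(
            f"rating values must be within [{low}, {high}], got {out_of_range}"
        )
    return values
```

With the defaults, validate_rating_values([1, 2, 3, 4, 5]) passes, while a list containing 0 or 11 is rejected.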
Removed
- Removed multi_task_text_token_classification from TaskType as it was not used (#3640).
- Removed argilla_id in favor of id from RemoteFeedbackDataset (#3663).
- Removed fetch_records from RemoteFeedbackDataset, as the records are now lazily fetched from Argilla (#3663).
- Removed push_to_argilla from RemoteFeedbackDataset, as it only works when called through a local FeedbackDataset, since updates to remote datasets are now automatically pushed to Argilla (#3663).
- Removed set_suggestions in favor of update(suggestions=...) for both FeedbackRecord and RemoteFeedbackRecord, as all updates to any "updateable" attribute of a record now go through update instead (#3663).
- Removed unused owner attribute from the client Dataset data model (#3665).
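Code that previously called set_suggestions can be migrated along the lines of the sketch below. The helper name is hypothetical, and the dictionary layout of a suggestion (question_name and value keys) is an assumption about what update(suggestions=...) accepts; adjust it to the suggestion format your client version expects.

```python
from typing import Any


def apply_suggestion(record: Any, question_name: str, value: Any) -> None:
    """Attach a single suggestion to a record via `update`.

    Before 1.15.0 this would have gone through `record.set_suggestions(...)`;
    from 1.15.0 on, every updateable attribute is changed through `update`.
    """
    record.update(
        suggestions=[{"question_name": question_name, "value": value}]
    )
```

The helper works with any object exposing update, so it applies to both local and remote records.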
As always, thanks to our amazing contributors
- @peppinob-ol made their first contribution in https://github.com/argilla-io/argilla/pull/3472
- @eshwarhs made their first contribution in https://github.com/argilla-io/argilla/pull/3605
Full Changelog: https://github.com/argilla-io/argilla/compare/v1.14.1...v1.15.0