0.2.2
Release date: 2021-12-16 01:13:46
Latest release of tensorflow/decision-forests: v1.6.0 (2023-09-28 22:21:12)
Features
- Surface the `validation_interval_in_trees`, `keep_non_leaf_label_distribution` and `random_seed` hyper-parameters.
- Add the `batch_size` argument to the `pd_dataframe_to_tf_dataset` utility.
- Automatically determine the number of threads if `num_threads=None`.
- Add the constructor argument `try_resume_training` to facilitate resuming training.
- Check that the training dataset is well configured for TF-DF, e.g. no repeat operation, a large enough batch size, etc. The check can be disabled with `check_dataset=False`.
- When a model is created manually with the model builder and the dataspec is not provided, the builder tries to adapt the dataspec so that the model looks as if it was trained with the global imputation strategy for missing values (i.e. `missing_value_policy: GLOBAL_IMPUTATION`). This makes manually created models more likely to be compatible with the fast inference engines.
- The `fit` method of TF-DF models now passes the `validation_data` to the Yggdrasil learners. This is used, for example, for early stopping in GBT models.
- Add the `loss` parameter of the GBT model directly to the model constructor.
- Control the amount of training logs displayed in the notebook (if using a notebook) or in the console with the `verbose` constructor argument and `fit` parameter of the model.
Fixes
- `num_candidate_attributes` is no longer ignored when `num_candidate_attributes_ratio=-1`.
- Use the median bucket split value strategy in the discretized numerical splitters (local and distributed).
- Surface the `max_num_scanned_rows_to_accumulate_statistics` parameter to control how many examples are scanned to determine the feature statistics when training from a file dataset with `fit_on_dataset_path`.
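The interaction fixed in the first item of this list can be pictured with a small pure-Python sketch. This is not the library's code: the function name `effective_num_candidate_attributes` is hypothetical, and the ceiling rounding and the clamping to the number of features are assumptions. It only models the rule stated above, namely that a ratio of -1 means the absolute count is used.

```python
import math


def effective_num_candidate_attributes(
    num_features: int,
    num_candidate_attributes: int,
    num_candidate_attributes_ratio: float = -1.0,
) -> int:
    """Hypothetical helper: how many attributes are tested at each node."""
    if num_candidate_attributes_ratio == -1:
        # Ratio disabled: the absolute count is honoured (the fixed behaviour).
        if num_candidate_attributes <= 0:
            # The library gives some non-positive counts a special meaning;
            # those cases are deliberately not modelled in this sketch.
            raise ValueError("this sketch only handles positive counts")
        count = num_candidate_attributes
    elif 0 < num_candidate_attributes_ratio <= 1:
        # Ratio enabled: it takes precedence over the absolute count.
        # Rounding up is an assumption made for this sketch.
        count = math.ceil(num_candidate_attributes_ratio * num_features)
    else:
        raise ValueError("ratio must be -1 or in (0, 1]")
    # Never test more attributes than exist (assumption).
    return min(count, num_features)


count_only = effective_num_candidate_attributes(10, 3)
with_ratio = effective_num_candidate_attributes(10, 3, 0.5)
clamped = effective_num_candidate_attributes(10, 20)
```

Under these assumptions, `count_only` is 3, `with_ratio` is 5 and `clamped` is 10.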