v0.6.0
Release date: 2018-06-29 22:22:24
Latest release of intel-analytics/ipex-llm: v2.1.0 (2024-08-22 17:06:57)
Highlights
- We integrated MKL-DNN as an alternative execution engine for CNN models. MKL-DNN provides better training/inference performance and lower memory consumption. In our experiments, some CNN models showed a 2x throughput improvement.
- Support using different optimization methods to optimize different parts of the model. This is necessary when training some models.
- Spark 2.3 support. We have tested our code and examples on Spark 2.3. We now release a binary for Spark 2.3, and Spark 1.5 is no longer supported.
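To make the second highlight concrete, here is a minimal pure-Python sketch of applying a different update rule to each named part of a model. It is not the BigDL API: the names `sgd_step`, `adagrad_step`, `update`, and the sub-module names "wide" and "deep" are hypothetical and chosen only for illustration.

```python
# Illustrative only: NOT the BigDL API. All names below are hypothetical.

def sgd_step(lr):
    """Return a plain SGD update function with learning rate lr."""
    def step(params, grads, state):
        return [p - lr * g for p, g in zip(params, grads)]
    return step

def adagrad_step(lr, eps=1e-10):
    """Return an Adagrad update function; squared gradients accumulate in state."""
    def step(params, grads, state):
        acc = state.setdefault("acc", [0.0] * len(params))
        updated = []
        for i, (p, g) in enumerate(zip(params, grads)):
            acc[i] += g * g
            updated.append(p - lr * g / (acc[i] ** 0.5 + eps))
        return updated
    return step

def update(model, grads, methods, states):
    """Apply methods[name] to model[name] for every named sub-module."""
    for name, step in methods.items():
        model[name] = step(model[name], grads[name], states.setdefault(name, {}))
    return model

# Two named parts of one model, each with its own optimization method.
model = {"wide": [1.0, 2.0], "deep": [0.5, -0.5]}
grads = {"wide": [0.1, 0.2], "deep": [1.0, -1.0]}
methods = {"wide": sgd_step(lr=0.1), "deep": adagrad_step(lr=0.1)}
states = {}
update(model, grads, methods, states)
```

Keeping a separate state dictionary per sub-module matters because stateful methods such as Adagrad must not mix their accumulators across parts of the model.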
Details
- [New Feature] MKL-DNN integration. We integrated MKL-DNN as an alternative execution engine for CNN models. It accelerates layers such as AvgPooling, MaxPooling, CAddTable, LRN, JoinTable, Linear, ReLU, SpatialConvolution, SpatialBatchNormalization, and Softmax. MKL-DNN provides better training/inference performance and lower memory consumption.
- [New Feature] Layer fusion. Support layer fusion for conv + relu, batchnorm + relu, conv + batchnorm, and conv + sum (some of these fusions can only be applied during inference). Layer fusion improves performance, especially for inference. Currently, layer fusion is only available for MKL-DNN related layers.
- [New Feature] Multiple optimization method support in optimizer. Support using different optimization methods to optimize different parts of the model.
- [New Feature] Add a new optimization method, Ftrl, which is often used in recommendation model training.
- [New Feature] Add a new example: training ResNet-50 on the ImageNet dataset.
- [New Feature] Add a new OpenCV-based image preprocessing transformer, ChannelScaledNormalizer.
- [New Feature] Add a new OpenCV-based image preprocessing transformer, RandomAlterAspect.
- [New Feature] Add a new OpenCV-based image preprocessing transformer, RandomCropper.
- [New Feature] Add a new OpenCV-based image preprocessing transformer, RandomResize.
- [New Feature] Support loading the TensorFlow Max operation.
- [New Feature] Allow the user to specify the input port when loading a TensorFlow model. If the input operation accepts multiple tensors as input, the user can specify which one to feed data to instead of feeding all of them.
- [New Feature] Support loading the TensorFlow Gather operation.
- [New Feature] Add random split for ImageFrame.
- [New Feature] Add the setLabel and getURI APIs to ImageFrame.
- [API Change] Add batch size to the Python model.predict API.
- [API Change] Add generateBackward to the TensorFlow model loading API, which lets the user choose whether to generate the backward path when loading a TensorFlow model.
- [API Change] Add feature() and label() to Sample.
- [API Change] Deprecate the DLClassifier/DLEstimator in org.apache.spark.ml. Prefer using DLClassifier/DLEstimator under com.intel.analytics.bigdl.dlframes.
- [Enhancement] Refine StridedSlice. Support the begin/end/shrinkAxis masks, as in TensorFlow.
- [Enhancement] Add layer sync to SpatialBatchNormalization, so that it can calculate mean/std over a larger batch. A model with SpatialBatchNormalization layers can then converge to better accuracy even when the local batch size is small.
- [Enhancement] Refactor the DistriOptimizer code to support advanced parameter operations, e.g. global gradient clipping.
- [Enhancement] Add more models into the LoadModel example.
- [Enhancement] Share Const values when broadcasting the model. Const values never change, so they can be shared when multiple model instances run inference on the same node, which reduces memory usage.
- [Enhancement] Refine the getTime and time counting implementation.
- [Enhancement] Support group serializer so that layers of the same hierarchy could share the same serializer.
- [Enhancement] The Dockerfile now uses Python 2.7.
- [Bug Fix] Fix a memory leak when using a quantized model in the predictor.
- [Bug Fix] Fix the Py4J Java gateway incompatibility in Spark local mode on Spark 2.3.
- [Bug Fix] Fix a bug in the Python Inception example.
- [Bug Fix] Fix a bug when running a TensorFlow model that uses a loop.
- [Bug Fix] Fix a bug in the Squeeze layer.
- [Bug Fix] Fix the Python API for random split.
- [Bug Fix] Use parameters() instead of getParameterTable() to get weight and bias in serialization.
- [Document] Fix errors in the quantized model document.
- [Document] Fix incorrect instructions for generating Sequence files for the ImageNet 2012 dataset in the document.
- [Document] Move the bigdl-core build document into a separate page and refine the format.
- [Document] Fix an incorrect command in the TensorFlow load and transfer learning examples.
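
As background for the conv + batchnorm entry under Layer fusion above, here is a minimal pure-Python sketch of the general technique of folding batch normalization into the preceding convolution. It assumes fixed running statistics, which is one common reason this kind of fusion is restricted to inference. It does not describe how BigDL or MKL-DNN implement fusion internally; the function names and numbers are hypothetical.

```python
import math

# Illustrative only: NOT BigDL or MKL-DNN code. All names are hypothetical.

def conv(patch, weight, bias):
    """One output channel of a convolution at one position: dot product plus bias."""
    return sum(w * x for w, x in zip(weight, patch)) + bias

def batchnorm(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization using fixed running statistics."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta

def fold_batchnorm(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold the batchnorm scale and shift into the convolution weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return [w * scale for w in weight], (bias - mean) * scale + beta

patch = [0.5, -1.0, 2.0, 0.25]       # flattened input window
weight = [0.1, 0.2, -0.3, 0.4]       # flattened kernel for one output channel
bias = 0.05
mean, var, gamma, beta = 0.2, 1.5, 0.9, -0.1

# Unfused: convolution followed by batch normalization.
reference = batchnorm(conv(patch, weight, bias), mean, var, gamma, beta)

# Fused: a single convolution with rescaled weight and bias.
fused_weight, fused_bias = fold_batchnorm(weight, bias, mean, var, gamma, beta)
fused = conv(patch, fused_weight, fused_bias)
```

The two results agree up to floating-point rounding, so the fused layer saves one pass over the activations. During training the batch statistics change at every step, so the weights cannot be pre-scaled in this way.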