v0.10.15
Release time: 2024-08-31 03:09:23
Latest release of google-ai-edge/mediapipe: v0.10.15 (2024-08-31 03:09:23)
Build changes
- Fix unwanted dependency on GPU libraries.
- Adds TwoTapFirFilterCalculator.
- Add public visibility to graph_service headers.
- Disable ASAN, TSAN and MSAN tests which take more than 10 minutes.
Framework and core calculator improvements
- Update PointToForeign with an optional cleanup object.
- Enable BeginLoopCalculator for move-only types (e.g. Tensor) without Packet::Consume usage and copyable types without copying unless it's a fundamental type.
- Ensure proper release of resources in case of multiple AHWB reads.
- Enable configuration of GpuBufferPool options via GpuResources::Create().
- Bugfix to correctly handle landmark projection in the non-square case.
- Add a utility to wait for a sync (represented by an FD).
- Change a RET_CHECK to RET_CHECK_EQ
- KinematicPathSolver: Avoid overshooting target
- Introduce GetDefaultGpuExecutor(GpuResources) to allow executing all calculators on MP GPU thread.
- No destruction for static ahwb_usage_track_.
- Unbind framebuffer in Affine Transformation Runner GL
- Move/isolate ahwb_usage_track_ into tensor_ahwb
- Guard ahwb_tensor_track_ with mutex.
- Add SidePacketConnectionTest
- Update C++ Graph Builder to support executors and support input/output stream handlers.
- Node::Input/OutputStreamHandler -> Node::SetInput/OutputStreamHandler
- Add Packet::Share() method in replacement of SharedPtrWithPacket() function.
- Default to high-performance power preference hint for WebGL contexts. For some computers with dual GPUs (like MBP2019), this will more frequently give us the higher performance GPU, which is generally preferable for most of our use cases (realtime rendering and ML), since speed is more critical than power consumption. If necessary, the user can override this setting by requesting their canvas' WebGL context manually before initializing the graph.
- Introduce input_scale parameter to SpectrogramCalculator.
- Improve documentation of graph options
- Add an option to PackMediaSequenceCalculator to add empty clip labels instead of ignoring them. This is useful when we want to distinguish processing errors from no-detections.
- Updates language detection headers
- Fix dangling error reporter pointer in memory mapped models
- Fix for possible infinite stall using setOptions immediately before a loadLoraModel call.
- Add relu1p5 op, abs op, Log op, mdspan and Lhs Broadcast Sub with test
- Fix missing member move in Tensor class
- Add support for single Tensor output streams for ImageToTensorCalculator.
- Fix some compilation errors in WebGPU code. These changes are all minor.
- Add single tensor output support to tensor_converter_calculator.
- Replace QCHECK with ABSL_QCHECK and CHECK with ABSL_CHECK.
- Fix a bug in TensorAHWB that triggers a crash with multiple delayed AHWB readers followed by a CPU reader.
- Fixes an unnecessary allocation of GraphServiceManager in case it is adopted from the calculator context.
- Fix triggering of DFATAL message.
- Remove xnn_enable_avx512fp16=false from .bazelrc
- Replace uses of TfLiteOperatorCreate with TfLiteOperatorCreateWithData
- Compile with '--keep_going' in setup.py
- Update ndk version so that our open source users get the best possible performance out of mediapipe.
- Correct address of android ndk
- Replace absl::make_unique with std::make_unique in tensor.cc and tensor_ahwb.cc.
- LLM decode benchmarks fill the cache with a predefined number of tokens before starting decoding.
- Add logic to drop the offending non-monotonically increasing timestamp in the MicrophoneHelper.
- Make packet payload const.
- Pass flag to indicate that consuming op may support prepacked GEMM.
- Get timestamp from OpenCV VideoCapture after first frame is read.
- Update XNNPack and cpuinfo
- Update TensorFlow to 2024-07-18.
- Remove deprecated TfLiteOperatorCreateWithData function
- Add option to use shifted window in SpectrogramCalculator.
- Move AhwbUsage struct and helper methods into a separate library.
- Make fields in PacketGetter.Pair public.
- The GraphProfiler may be destroyed before the task is executed in the executor.
- Introduce flag in MicrophoneHelper to drop non-increasing timestamps.
- llm_test - add batch size of 8 for BM_Llm_QCINT8/512/128
- Add method to create MP Tensor from TfLite tensor specs
- Refactors AHardwareBufferView class to be instantiated with a TensorAhwbUsage pointer.
- Refactor LlmBuilder to have one graph
- Add expected_seq_len param to ComputeLogits().
- Fix mediapipe::file::Exists() for >2GB files on Windows.
- Bump XNNPACK and KleidiAI versions.
- Update MP demo app to acquire wake lock
- Replace mediapipe::StatusOr with absl::StatusOr
- Sync on ssbo_writte_ before mapping an AHWB to a CpuReadView.
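
Two of the entries above concern the MicrophoneHelper dropping packets whose timestamps are not monotonically increasing. The idea can be illustrated with a minimal Python sketch. This is not MediaPipe's implementation: the function name drop_non_increasing and the (timestamp, payload) packet shape are hypothetical, and the sketch assumes that "non-increasing" means a timestamp less than or equal to the last one kept.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical packet shape for illustration: (timestamp in microseconds, payload).
Packet = Tuple[int, bytes]


def drop_non_increasing(packets: Iterable[Packet]) -> List[Packet]:
    """Keep only packets whose timestamp is strictly greater than the last kept one."""
    kept: List[Packet] = []
    last_ts: Optional[int] = None
    for ts, payload in packets:
        if last_ts is not None and ts <= last_ts:
            # Offending packet: the timestamp did not advance, so it is skipped
            # instead of being forwarded downstream.
            continue
        kept.append((ts, payload))
        last_ts = ts
    return kept
```

Under these assumptions, a repeated or backwards timestamp is discarded and the stream continues with the next packet that advances the clock.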
MediaPipe Tasks update
This section should highlight the changes that are done specifically for any platform and don't propagate to other platforms.
Android
- Bump targetSdkVersion to 34 throughout MediaPipe.
iOS
- Updated documentation in iOS audio classifier
- Added iOS holistic landmarker to vision framework build
- Changed method name in MPPAudioClassifierResult
- Added audio classifier options helpers
- Added audio classifier result helpers
- Added method to create audio record MPPAudioTaskRunner
- Removed unused imports in MPPAudioTaskRunner
- Added iOS audio embedder result, classifier result, classifier options, embedder options, embedder options helpers, classifier header and embedder result helpers
- Add missing argument for num_draft_tokens.
Javascript
- Set quantization bits for LoRA weight conversion to match those specified
- Warn on adding packets to a closed input stream instead of silently dropping packets.
- Enable experimental support for Chromium WGSL subgroups in LLM API, when available.
- Support multi-response generation.
Python
- Add prompt template to llm bundler.
Bug fixes
- The class_weights flag causes a crash in the multiclass case
Model Maker changes
- Rename old BinaryAUC metric to BinarySparseAUC (used by text_classifier) and create a new BinaryAUC metric which does not expect sparse inputs.
- Allow configuration of num_parallel_calls and cycle_length in hparams
- Improve python code format.
- Use tf.io.gfile.GFile for writing metadata file in image classifier.
- Change SparsePrecision metric to BinarySparsePrecision metric, and likewise SparseRecall to BinarySparseRecall, in the core library. We only care about these metrics in the binary case, so this change makes the metric class names more accurate for their intended usage.
- Support multilabel model training in text classifier
- Create and add metrics for multi-class case
- Support a customized best model monitor for multiclass cases
MediaPipe Dependencies
- Update WASM files