v0.10.5
Release date: 2023-09-15 12:22:29
Latest google-ai-edge/mediapipe release: v0.10.15 (2024-08-31 03:09:23)
Framework and core calculator improvements
- Fix crash in SavePngTestOutput
- Log stack traces for combined CalculatorGraph statuses
- Add a GpuOrigin parameter to TensorConverterCalculator
- Replace some size EXPECTs by ASSERTs
- Add support for label annotations (image/label/string and image/label/confidence). Also fix some clang-tidy issues.
- Set confidence score of the bounding box label.
- Add setGpuBufferVerticalFlip to GraphRunner TS API
- Remove unsafe cast.
- Apply affine transform before drawing to keep line width constant regardless of face cropping.
- Migrate packet messages auto registration to rely on MEDIAPIPE_STATIC_REGISTRATOR_TEMPLATE
- Add end loop calculator for image size
- Provide a way to disable static registration using MEDIAPIPE_DISABLE_STATIC_REGISTRATION
- Header for callback_packet_calculator to allow dynamic registration for superusers
- Support more GPU formats in tensor converter calculator.
- Expose stream handlers in headers to allow dynamic registration for superusers
- Expose tool calculators in headers to enable dynamic registration by superusers.
- Dry-Run mode for static registration to make it easier to find all required static registrations
- Fix MediaPipe build in Chromium.
- Swap left and right hand labels.
- Don't access "document" in WebWorker
- Update PackMediaSequenceCalculator to support adding clip/media/id to the MediaSequence.
- Update pose rendering
- Update the header information for EnsureMinimumDefaultExecutorStackSize.
- Move stream API loopback to third_party.
- Add pose landmarks constants
- Add an API in model_task_graph to create or use cached model resources.
- Move stream API image_size to third_party.
- Add C++ converters for C Text Classifier API
- Move stream API rect_transformation to third_party.
- Change the image label input from Classification to Detection.
- Update port includes with IWYU to fix clang warnings in code where corresponding ports are used.
- New image test utilities and memory management fixes.
- Add a custom op resolver for fused batch norm.
- Improve throttling logs by providing node info corresponding to the throttling stream.
- Use ABSL_LOG in MediaPipe.
- Remove reference pointer to prevent using a constant reference in the looped iteration variable
- Remove unnecessary includes in threadpool_std_thread_impl.cc.
- Make cache writes optional in InferenceCalculatorAdvancedGL
- Update PackMediaSequenceCalculator to support setting clip/media/string, clip/media/confidence and clip/label/index.
- Some spelling and grammar fixes in the comments.
- Add notes/warnings for calculators which use dedicated GL contexts.
- Remove video and stream mode in face stylizer.
- Move stream API landmarks_projection to third_party.
- Remove video and streaming mode for face stylizer.
- landmarks_to_detection stream utility function.
- Ensure that C headers don't import C++ types
- Splitting GraphRunner into public API declared interfaces and private TS impls
- Add option for nearest neighbor interpolation.
- Fix two issues with file handling on Windows:
- Remove unconditional texture params reset so that float textures are handled correctly.
- Fix the non-Unicode path of file_helpers on Windows
- Modifying tensor_to_vector_float_calculator to take in D_BFLOAT16 values
- Don't define field in ExternalFileHandler that's not used on Windows.
- Clean up TensorConverterCalculator flipping behavior
- Fix win32 build break in mediapipe.
MediaPipe Tasks update
Platform-specific changes that do not propagate to other platforms.
Android
- Adds option to use tensor_ahwb in Android vendor processes
- Add output size as parameters in Java ImageSegmenter
- Change SegmentationOptions.builder() to be public
- ImageGenerator Java API
- Provide API/options to show intermediate results and generation progress for Java Image Generator.
- Set enableFlowLimiting to false since only Image model is supported for face stylizer.
- Move loading tasks-vision-jni to individual vision task class
iOS
- Added refactored iOS vision task runner sources
- Removed convenience initializer from refactored MPPVisionTaskRunner
- Updated iOS docs to use Swift names in place of Objective-C names
- Added gesture recognizer and hand landmarker to iOS vision framework
- Fixed directory creation issues in build_ios_framework.sh
- Changed delegate method to optional
- Added iOS image segmenter implementation file
- Updated image segmenter bazel target to add MPPImageSegmenter.mm
- Renamed option in MPPImageSegmenterOptions
- Updated iOS face detector to use refactored vision task runner
- Updated iOS image classifier to use refactored vision task runner
- Changed order of methods in MPPImageSegmenter.mm
- Fixed method call in MPPImageSegmenter.mm
- Updated face landmarker, gesture recognizer, hand landmarker, and object detector to use refactored vision task runner
- Replaced the old iOS vision task runner with the refactored task runner
- Updated iOS gesture recognizer documentation to use Swift names
- Updated iOS hand landmarker documentation to use Swift names
- Moved iOS MPPHandLandmark enum to MPPHandLandmarker.h
- Fixed iOS hand landmarker connections
JavaScript
- vlog default executor and its config usage
- Updates the runners to support wasm-style binary assets files, and allows their URLs to be explicitly specified as part of the WasmFileset.
- Add 'types' to package.json
- Add externs to js_library targets
- Add API exports for MPMask and MPImage
- Add Handedness to JS, C++ and Android API
- Fix missing exports for FilesetResolver and static constants
- Add exports to ImageSegmenterResult and InteractiveSegmenterResult
Python
- Set the default running mode to Image for face stylizer.
Bug fixes
- Internal fixes
Model Maker changes
- Add tensorflow-addons to model_maker requirements.txt
- Add the w_avg latent code to the style encoding before layer swapping, fixing a bug in the previous code. Also set training=True for the encoder, since this affects encoding performance.
- Add metadata writer to face stylizer.
- Refactor text_classifier preprocessor to move away from using classifier_data_lib
- Import image_util for use in open-sourcing the MediaPipe face stylizer.
- Fix image_util shortcut import line
- Change supported_ops to a Tuple instead of List to match the API definition.
- Add a new from_image API to create a face stylizer dataset from a single image. Also deprecate the from_folder API, since only the one-shot use case is now supported.
- Add an API to run inference with face stylizer TF model.
- Check whether the image contains a valid face that can be aligned for stylization. If not, throw an exception for the invalid input image. This applies to both the input stylized face and the raw face.
- Add allow_custom_ops to model_util.convert_to_tflite and enable custom ops for face stylizer.
MediaPipe Dependencies
- Update WASM files for 10.5 release