roboflow/maestro
Forks: 102 Stars: 1411 (updated 2024-12-08 19:55:37)
license: Apache-2.0
Language: Python
streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
Latest release: 0.1.0 (2023-11-29 20:54:56)
maestro
coming: when it's ready...
👋 hello
maestro is a tool designed to streamline and accelerate the fine-tuning process for multimodal models. It provides ready-to-use recipes for fine-tuning popular vision-language models (VLMs) such as Florence-2, PaliGemma, and Qwen2-VL on downstream vision-language tasks.
💻 install
Pip install the maestro package in a Python>=3.8 environment.
pip install maestro
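To confirm that the environment meets the stated requirement, a small check like the one below can help. It is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are hypothetical and not part of maestro.

```python
import sys
from importlib import metadata

# Minimum interpreter version stated in the install note above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(package="maestro"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("maestro version:", installed_version())
```

If `installed_version()` returns `None`, the package is not visible to the interpreter that ran the check, which usually means `pip` installed it into a different environment.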
🔥 quickstart
CLI
VLMs can be fine-tuned on downstream tasks directly from the command line with the `maestro` command:
maestro florence2 train --dataset='<DATASET_PATH>' --epochs=10 --batch-size=8
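When training is launched from a script (for example, in a scheduled job), the same command can be assembled and run from Python. The sketch below uses only the subcommand and flags shown above; the helpers `build_train_command` and `run_training` are hypothetical wrappers, not part of maestro, and the example assumes the `maestro` executable is on `PATH`.

```python
import shlex
import subprocess


def build_train_command(dataset, epochs=10, batch_size=8):
    """Build the argument list for the Florence-2 training command shown above."""
    return [
        "maestro",
        "florence2",
        "train",
        f"--dataset={dataset}",
        f"--epochs={epochs}",
        f"--batch-size={batch_size}",
    ]


def run_training(dataset, epochs=10, batch_size=8):
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    command = build_train_command(dataset, epochs=epochs, batch_size=batch_size)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the dataset path contains spaces.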
SDK
Alternatively, you can fine-tune VLMs using the Python SDK, which accepts the same arguments as the CLI example above:
from maestro.trainer.common import MeanAveragePrecisionMetric
from maestro.trainer.models.florence_2 import train, Configuration
config = Configuration(
    dataset='<DATASET_PATH>',
    epochs=10,
    batch_size=8,
    metrics=[MeanAveragePrecisionMetric()]
)
train(config)
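Because importing the training stack and loading a model can be slow, it may be worth validating the settings before calling `train`. The following is a sketch under stated assumptions: `validate_settings` and `fine_tune` are hypothetical helpers, the check that the dataset path exists on disk is an assumption about how `<DATASET_PATH>` is used, and the maestro import paths are copied from the SDK example above and may differ between versions.

```python
from pathlib import Path


def validate_settings(dataset, epochs, batch_size):
    """Check the basic settings and return them as keyword arguments."""
    path = Path(dataset)
    if not path.exists():
        # Assumption: the dataset argument refers to a local path.
        raise FileNotFoundError(f"dataset path not found: {path}")
    if epochs < 1 or batch_size < 1:
        raise ValueError("epochs and batch_size must be positive integers")
    return {"dataset": str(path), "epochs": epochs, "batch_size": batch_size}


def fine_tune(dataset, epochs=10, batch_size=8):
    """Validate the settings, then start Florence-2 fine-tuning."""
    settings = validate_settings(dataset, epochs, batch_size)

    # Imported lazily so that validation errors surface before the heavy import.
    # Import paths are taken from the SDK example above.
    from maestro.trainer.common import MeanAveragePrecisionMetric
    from maestro.trainer.models.florence_2 import Configuration, train

    config = Configuration(metrics=[MeanAveragePrecisionMetric()], **settings)
    train(config)
```

With this wrapper, a mistyped dataset path fails immediately with a `FileNotFoundError` instead of after the model has been loaded.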
📚 notebooks
Explore our collection of notebooks that demonstrate how to fine-tune various vision-language models using maestro. Each notebook provides step-by-step instructions and code examples to help you get started quickly.
model and task | colab | video |
---|---|---|
Fine-tune Florence-2 for object detection | | |
Fine-tune Florence-2 for visual question answering (VQA) | | |
🦸 contribution
We would love your help in making this repository even better! We are especially looking for contributors with experience in fine-tuning vision-language models (VLMs). If you notice any bugs or have suggestions for improvement, feel free to open an issue or submit a pull request.
Recent releases (data updated 2024-09-18 07:04:51):
2023-11-29 20:54:56 0.1.0
Topics:
captioning, fine-tuning, florence-2, multimodal, objectdetection, paligemma, phi-3-vision, transformers, vision-and-language, vqa