roboflow/maestro

Forks: 102 Stars: 1385 (updated 2024-11-15 20:24:18)

license: Apache-2.0

Language: Python

streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL

Latest release: 0.1.0 (2023-11-29 20:54:56)

maestro

coming: when it's ready...

👋 hello

maestro is a tool designed to streamline and accelerate the fine-tuning process for multimodal models. It provides ready-to-use recipes for fine-tuning popular vision-language models (VLMs) such as Florence-2, PaliGemma, and Qwen2-VL on downstream vision-language tasks.

💻 install

Pip install the maestro package in a Python>=3.8 environment.

pip install maestro
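
Before installing, it can help to confirm that the interpreter meets the Python>=3.8 requirement, and afterwards that the package can be found. The sketch below uses only the standard library; the helper names `python_is_supported` and `package_is_installed` are illustrative and are not part of maestro.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the install note above


def python_is_supported(version_info=None):
    """Return True if the given (or current) interpreter is at least Python 3.8."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def package_is_installed(name="maestro"):
    """Return True if a top-level package with this name can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("maestro installed:", package_is_installed())
```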

🔥 quickstart

CLI

VLMs can be fine-tuned on downstream tasks directly from the command line with the maestro command:

maestro florence2 train --dataset='<DATASET_PATH>' --epochs=10 --batch-size=8
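
When the CLI is launched from a script or a job scheduler, the same command can be assembled programmatically. The following is a minimal sketch, assuming only the flags shown above (`--dataset`, `--epochs`, `--batch-size`); `build_train_command` and `run_training` are hypothetical helpers, not part of maestro, and the dataset path is a placeholder.

```python
import shlex
import subprocess


def build_train_command(dataset, epochs=10, batch_size=8, model="florence2"):
    """Build the argument list for the `maestro <model> train` command shown above."""
    return [
        "maestro",
        model,
        "train",
        f"--dataset={dataset}",
        f"--epochs={epochs}",
        f"--batch-size={batch_size}",
    ]


def run_training(dataset, **options):
    """Launch training as a subprocess; requires maestro to be installed."""
    return subprocess.run(build_train_command(dataset, **options), check=True)


if __name__ == "__main__":
    # Print a shell-safe rendering of the command without running it.
    print(shlex.join(build_train_command("<DATASET_PATH>")))
```

Passing the command as a list rather than a single string avoids shell quoting problems when the dataset path contains spaces.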

SDK

Alternatively, you can fine-tune VLMs using the Python SDK, which accepts the same arguments as the CLI example above:

from maestro.trainer.common import MeanAveragePrecisionMetric
from maestro.trainer.models.florence_2 import train, Configuration

config = Configuration(
    dataset='<DATASET_PATH>',
    epochs=10,
    batch_size=8,
    metrics=[MeanAveragePrecisionMetric()]
)

train(config)
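
Because the SDK accepts the same arguments as the CLI, a small wrapper script can map command-line flags onto `Configuration` keyword arguments. This is a sketch, assuming only the `dataset`, `epochs`, and `batch_size` fields used above; `parse_config_kwargs` and `main` are hypothetical names, and the maestro import is deferred so the parsing logic can be exercised where the package is not installed.

```python
import argparse


def parse_config_kwargs(argv=None):
    """Parse CLI-style flags into keyword arguments for Configuration."""
    parser = argparse.ArgumentParser(description="Fine-tune Florence-2 with maestro")
    parser.add_argument("--dataset", required=True, help="path to the dataset")
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--batch-size", type=int, default=8)  # stored as batch_size
    args = parser.parse_args(argv)
    return {
        "dataset": args.dataset,
        "epochs": args.epochs,
        "batch_size": args.batch_size,
    }


def main(argv=None):
    kwargs = parse_config_kwargs(argv)
    # Imported here so that parsing works even where maestro is not installed.
    from maestro.trainer.models.florence_2 import train, Configuration

    train(Configuration(**kwargs))
```

Calling `main()` at the bottom of a script would then start training with the flags passed on the command line, mirroring the CLI example.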

📚 notebooks

Explore our collection of notebooks that demonstrate how to fine-tune various vision-language models using maestro. Each notebook provides step-by-step instructions and code examples to help you get started quickly.

| model and task | colab | video |
|---|---|---|
| Fine-tune Florence-2 for object detection | Open In Colab | YouTube |
| Fine-tune Florence-2 for visual question answering (VQA) | Open In Colab | YouTube |

🦸 contribution

We would love your help in making this repository even better! We are especially looking for contributors with experience in fine-tuning vision-language models (VLMs). If you notice any bugs or have suggestions for improvement, feel free to open an issue or submit a pull request.

Recent releases (data updated 2024-09-18 07:04:51):

2023-11-29 20:54:56 0.1.0

Topics:

captioning, fine-tuning, florence-2, multimodal, objectdetection, paligemma, phi-3-vision, transformers, vision-and-language, vqa
