autodistill / autodistill Goto Github PK

Images to inference with no labeling (use foundation models to train supervised models).

License: Apache License 2.0

Makefile 1.65% Python 89.94% JavaScript 8.40%

computer-vision model-distillation auto-labeling deep-learning foundation-models grounding-dino image-annotation image-classification instance-segmentation labeling-tool

autodistill's Introduction

notebooks | inference | autodistill | collect

Autodistill uses big, slower foundation models to train small, faster supervised models. Using autodistill, you can go from unlabeled images to inference on a custom model running at the edge with no human intervention in between.

Tip

You can use Autodistill on your own hardware, or use the Roboflow hosted version of Autodistill to label images in the cloud.

Currently, autodistill supports vision tasks like object detection and instance segmentation, but in the future it can be expanded to support language (and other) models.

🔗 Quicklinks

Tutorial	Docs	Supported Models	Contribute

👀 Example Output

Here are example predictions of a Target Model detecting milk bottles and bottlecaps after being trained on an auto-labeled dataset using Autodistill (see the Autodistill YouTube video for a full walkthrough):

🚀 Features

🔌 Pluggable interface to connect models together
🤖 Automatically label datasets
🐰 Train fast supervised models
🔒 Own your model
🚀 Deploy distilled models to the cloud or the edge

📚 Basic Concepts

To use autodistill, you input unlabeled data into a Base Model which uses an Ontology to label a Dataset that is used to train a Target Model which outputs a Distilled Model fine-tuned to perform a specific Task.

Autodistill defines several basic primitives:

Task - A Task defines what a Target Model will predict. The Task for each component (Base Model, Ontology, and Target Model) of an autodistill pipeline must match for them to be compatible with each other. Object Detection and Instance Segmentation are currently supported through the detection task. classification support will be added soon.
Base Model - A Base Model is a large foundation model that knows a lot about a lot. Base models are often multimodal and can perform many tasks. They're large, slow, and expensive. Examples of Base Models are GroundedSAM and GPT-4's upcoming multimodal variant. We use a Base Model (along with unlabeled input data and an Ontology) to create a Dataset.
Ontology - an Ontology defines how your Base Model is prompted, what your Dataset will describe, and what your Target Model will predict. A simple Ontology is the CaptionOntology which prompts a Base Model with text captions and maps them to class names. Other Ontologies may, for instance, use a CLIP vector or example images instead of a text caption.
Dataset - a Dataset is a set of auto-labeled data that can be used to train a Target Model. It is the output generated by a Base Model.
Target Model - a Target Model is a supervised model that consumes a Dataset and outputs a distilled model that is ready for deployment. Target Models are usually small, fast, and fine-tuned to perform a specific task very well (but they don't generalize well beyond the information described in their Dataset). Examples of Target Models are YOLOv8 and DETR.
Distilled Model - a Distilled Model is the final output of the autodistill process; it's a set of weights fine-tuned for your task that can be deployed to get predictions.

💡 Theory and Limitations

Human labeling is one of the biggest barriers to broad adoption of computer vision. It can take thousands of hours to craft a dataset suitable for training a production model. The process of distillation for training supervised models is not new, in fact, traditional human labeling is just another form of distillation from an extremely capable Base Model (the human brain 🧠).

Foundation models know a lot about a lot, but for production we need models that know a lot about a little.

As foundation models get better and better they will increasingly be able to augment or replace humans in the labeling process. We need tools for steering, utilizing, and comparing these models. Additionally, these foundation models are big, expensive, and often gated behind private APIs. For many production use-cases, we need models that can run cheaply and in realtime at the edge.

Autodistill's Base Models can already create datasets for many common use-cases (and through creative prompting and few-shotting we can expand their utility to many more), but they're not perfect yet. There's still a lot of work to do; this is just the beginning and we'd love your help testing and expanding the capabilities of the system!

💿 Installation

Autodistill is modular. You'll need to install the autodistill package (which defines the interfaces for the above concepts) along with Base Model and Target Model plugins (which implement specific models).

By packaging these separately as plugins, dependency and licensing incompatibilities are minimized and new models can be implemented and maintained by anyone.

Example:

pip install autodistill autodistill-grounded-sam autodistill-yolov8

Install from source

You can also clone the project from GitHub for local development:

git clone https://github.com/roboflow/autodistill
cd autodistill
pip install -e .

Additional Base and Target models are enumerated below.

🚀 Quickstart

See the demo Notebook for a quick introduction to autodistill. This notebook walks through building a milk container detection model with no labeling.

Below, we have condensed key parts of the notebook for a quick introduction to autodistill.

You can also run Autodistill in one command. First, install autodistill:

pip install autodistill

Then, run:

autodistill images --base="grounding_dino" --target="yolov8" --ontology '{"prompt": "label"}' --output="./dataset"

This command will label all images in a directory called images with Grounding DINO and use the labeled images to train a YOLOv8 model. Grounding DINO will label all images with the "prompt" and save the label as the "label". You can specify as many prompts and labels as you want. The resulting dataset will be saved in a folder called dataset.

Install Packages

For this example, we'll show how to distill GroundedSAM into a small YOLOv8 model using autodistill-grounded-sam and autodistill-yolov8.

pip install autodistill autodistill-grounded-sam autodistill-yolov8

Distill a Model

from autodistill_grounded_sam import GroundedSAM
from autodistill.detection import CaptionOntology
from autodistill_yolov8 import YOLOv8

# define an ontology to map class names to our GroundingDINO prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
base_model = GroundedSAM(ontology=CaptionOntology({"shipping container": "container"}))

# label all images in a folder called `context_images`
base_model.label(
  input_folder="./images",
  output_folder="./dataset"
)

target_model = YOLOv8("yolov8n.pt")
target_model.train("./dataset/data.yaml", epochs=200)

# run inference on the new model
pred = target_model.predict("./dataset/valid/your-image.jpg", confidence=0.5)
print(pred)

# optional: upload your model to Roboflow for deployment
from roboflow import Roboflow

rf = Roboflow(api_key="API_KEY")
project = rf.workspace().project("PROJECT_ID")
project.version(DATASET_VERSION).deploy(model_type="yolov8", model_path=f"./runs/detect/train/")

Visualize Predictions

To plot the annotations for a single image using autodistill, you can use the code below. This code is helpful to visualize the annotations generated by your base model (i.e. GroundedSAM) and the results from your target model (i.e. YOLOv8).

import supervision as sv
import cv2

img_path = "./images/your-image.jpeg"

image = cv2.imread(img_path)

detections = base_model.predict(img_path)
# annotate image with detections
box_annotator = sv.BoxAnnotator()

labels = [
    f"{base_model.ontology.classes()[class_id]} {confidence:0.2f}"
    for _, _, confidence, class_id, _ in detections
]

annotated_frame = box_annotator.annotate(
    scene=image.copy(), detections=detections, labels=labels
)

sv.plot_image(annotated_frame, (16, 16))

📍 Available Models

Our goal is for autodistill to support using all foundation models as Base Models and most SOTA supervised models as Target Models. We focused on object detection and segmentation tasks first but plan to launch classification support soon! In the future, we hope autodistill will also be used for models beyond computer vision.

✅ - complete (click row/column header to go to repo)
🚧 - work in progress

object detection

base / target	YOLOv8	YOLO-NAS	YOLOv5	DETR	YOLOv6
DETIC	✅	✅	✅	✅	🚧
GroundedSAM	✅	✅	✅	✅	🚧
GroundingDINO	✅	✅	✅	✅	🚧
OWL-ViT	✅	✅	✅	✅	🚧
SAM-CLIP	✅	✅	✅	✅	🚧
LLaVA-1.5	✅	✅	✅	✅	🚧
Kosmos-2	✅	✅	✅	✅	🚧
OWLv2	✅	✅	✅	✅	🚧
Roboflow Universe Models (50k+ pre-trained models)	✅	✅	✅	✅	🚧
CoDet	✅	✅	✅	✅	🚧
Azure Custom Vision	✅	✅	✅	✅	🚧
AWS Rekognition	✅	✅	✅	✅	🚧
Google Vision	✅	✅	✅	✅	🚧

instance segmentation

base / target	YOLOv8	YOLO-NAS	YOLOv5
GroundedSAM	✅	🚧	🚧
SAM-CLIP	✅	🚧	🚧
SegGPT	✅	🚧	🚧
FastSAM	🚧	🚧	🚧

classification

base / target	ViT	YOLOv8	YOLOv5
CLIP	✅	✅	🚧
MetaCLIP	✅	✅	🚧
DINOv2	✅	✅	🚧
BLIP	✅	✅	🚧
ALBEF	✅	✅	🚧
FastViT	✅	✅	🚧
AltCLIP	✅	✅	🚧
EvaCLIP (contributed by a community member)	✅	✅	🚧
Fuyu	🚧	🚧	🚧
Open Flamingo	🚧	🚧	🚧
GPT-4
PaLM-2

Roboflow Model Deployment Support

You can optionally deploy some Target Models trained using Autodistill on Roboflow. Deploying on Roboflow allows you to use a range of concise SDKs for using your model on the edge, from roboflow.js for web deployment to NVIDIA Jetson devices.

The following Autodistill Target Models are supported by Roboflow for deployment:

model name	Supported?
YOLOv8 Object Detection	✅
YOLOv8 Instance Segmentation	✅
YOLOv5 Object Detection	✅
YOLOv5 Instance Segmentation	✅
YOLOv8 Classification

🎬 Video Guides

Autodistill: Train YOLOv8 with ZERO Annotations

Published: 8 June 2023

In this video, we will show you how to use a new library to train a YOLOv8 model to detect bottles moving on a conveyor line. Yes, that's right - zero annotation hours are required! We dive deep into Autodistill's functionality, covering topics from setting up your Python environment and preparing your images, to the thrilling automatic annotation of images.

💡 Community Resources

Distill Large Vision Models into Smaller, Efficient Models with Autodistill: Announcement post with written guide on how to use Autodistill
Comparing AI-Labeled Data to Human-Labeled Data: A qualitative evaluation of Grounding DINO used with Autodistill across various tasks and domains.
How to Evaluate Autodistill Prompts with CVevals: Evaluate Autodistill prompts.
Autodistill: Label and Train a Computer Vision Model in Under 20 Minutes: Building a model to detect planes in under 20 minutes.
Comparing AI-Labeled Data to Human-Labeled Data: Explore the strengths and limitations of a base model used with Autoditsill.
Train an Image Classification Model with No Labeling: Use Grounded SAM to automatically label images for training an Ultralytics YOLOv8 classification model.
Train a Segmentation Model with No Labeling: Use CLIP to automatically label images for training an Ultralytics YOLOv8 segmentation model.
File a PR to add your own resources here!

🗺️ Roadmap

Apart from adding new models, there are several areas we plan to explore with autodistill including:

💡 Ontology creation & prompt engineering
👩‍💻 Human in the loop support
🤔 Model evaluation
🔄 Active learning
💬 Language tasks

🏆 Contributing

We love your input! Please see our contributing guide to get started. Thank you 🙏 to all our contributors!

👩‍⚖️ License

The autodistill package is licensed under an Apache 2.0. Each Base or Target model plugin may use its own license corresponding with the license of its underlying model. Please refer to the license in each plugin repo for more information.

Frequently Asked Questions ❓

What causes the `PytorchStreamReader failed reading zip archive: failed finding central directory` error?

This error is caused when PyTorch cannot load the model weights for a model. Go into the ~/.cache/autodistill directory and delete the folder associated with the model you are trying to load. Then, run your code again. The model weights will be downloaded from scratch. Leave the installation process uninterrupted.

💻 explore more Roboflow open source projects

Project	Description
supervision	General-purpose utilities for use in computer vision projects, from predictions filtering and display to object tracking to model evaluation.
Autodistill (this project)	Automatically label images for use in training computer vision models.
Inference	An easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
Notebooks	Tutorials for computer vision tasks, from training state-of-the-art models to tracking objects to counting objects in a zone.
Collect	Automated, intelligent data collection powered by CLIP.

autodistill's People

Contributors

Stargazers

Watchers

Forkers

debjyoti003 artyaltanzaya anoop-qasolve hariag alamehor skalskip scinet rfjohnso hsoleimanii louderthanthunderx1 unitycreatorkings kippapollo computerscienceiscool maximedebarbat arqam81 ve7ltx rayningtime michaelscheinfeild aixia121 ai-jie01 nisepulvedaa rajaramkuberan joeqian benjamesbabala ml-lab noiyu thanhmcisai lih2022 sunmingyang1987 transybao1393 zylkills yaoertech pengcheng001 ganeshkharde1 sobruce ivi-42 vital121 zhuewizz szad670401 kellyzxiaowei thanhpham1987 andrew-healey duckheada bodiman anupgoenka mohitburkule teenaxta merwanski mayankagarwals abdoujaouhar abdoujaouhargst pupu2014 sibozhang peat-ai onuralpszr lt00001 khaidoan25 tianhaofu kindow duke194 baudneo imcwjhere mvandermeulen paul007008 bulkpanda codehornets maldandan etiosa achangxd iuliaalda lsqm01 frontierkodiak crazypenguincode vilaksh01 gongwk tahhnik nitin-mane godofcong-1 jacksparal gds101054108 rafaelvanbelle lianzhanbiao navdeepgarg1 lunatik00 muhammad-usama-aleem pprp songjiahao-wq tamago55 tomgause ctl2016 craftbrainllc hunkyu supervisionai skylargivens qybing zhdai gebawe n0kovo stephansturges nguyenhoan1988

autodistill's Issues

Altering Box Threshold and Text Threshold Values

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Question

Hi @yeldarby @hariag @SkalskiP @andrew-healey
Is there any way to play with Box_threshold and text_threhsold while generating datasets for our custom datasets? When I directly used groundingDINO ,I have flexibility to use and change those values as per our requirements on our custom datasets.

Additional

No response

milk bottles datasets and Ontology

I want to reproduce the whole autodistill process with your demo video about milk bottle. Can you tell that where to get milk bottle dataset and what the 'Ontology' you set?

SAM-Clip model producing many boxes

Hello,

When I run SAM Clip model, it printouts several box and I am sure that it should detect only 1-2 boxes in the image. Why it is dumping so many boxes?

Labeling People/a-good-person.jpg: 2%|▌ | 2/115 [00:12<10:39, 5.66s/it]tensor([[0.5508, 0.4495]], device='cuda:0', dtype=torch.float16)
tensor([[0.0610, 0.9390]], device='cuda:0', dtype=torch.float16)
tensor([[0.1097, 0.8901]], device='cuda:0', dtype=torch.float16)
tensor([[0.6001, 0.3999]], device='cuda:0', dtype=torch.float16)
tensor([[0.5659, 0.4341]], device='cuda:0', dtype=torch.float16)
tensor([[0.1968, 0.8032]], device='cuda:0', dtype=torch.float16)
tensor([[0.3960, 0.6040]], device='cuda:0', dtype=torch.float16)
tensor([[0.1442, 0.8560]], device='cuda:0', dtype=torch.float16)
tensor([[0.1442, 0.8560]], device='cuda:0', dtype=torch.float16)
tensor([[0.0656, 0.9346]], device='cuda:0', dtype=torch.float16)
tensor([[0.0550, 0.9448]], device='cuda:0', dtype=torch.float16)
tensor([[0.0676, 0.9326]], device='cuda:0', dtype=torch.float16)
tensor([[0.2338, 0.7661]], device='cuda:0', dtype=torch.float16)
tensor([[0.0415, 0.9585]], device='cuda:0', dtype=torch.float16)
tensor([[0.1294, 0.8706]], device='cuda:0', dtype=torch.float16)
tensor([[0.1009, 0.8989]], device='cuda:0', dtype=torch.float16)
tensor([[0.2095, 0.7905]], device='cuda:0', dtype=torch.float16)
tensor([[0.1009, 0.8989]], device='cuda:0', dtype=torch.float16)
tensor([[0.2120, 0.7881]], device='cuda:0', dtype=torch.float16)
tensor([[0.0839, 0.9160]], device='cuda:0', dtype=torch.float16)
tensor([[0.1067, 0.8931]], device='cuda:0', dtype=torch.float16)
tensor([[0.0726, 0.9272]], device='cuda:0', dtype=torch.float16)
tensor([[0.4341, 0.5659]], device='cuda:0', dtype=torch.float16)
tensor([[0.4570, 0.5430]], device='cuda:0', dtype=torch.float16)
tensor([[0.1113, 0.8887]], device='cuda:0', dtype=torch.float16)
tensor([[0.0440, 0.9561]], device='cuda:0', dtype=torch.float16)
tensor([[0.0637, 0.9365]], device='cuda:0', dtype=torch.float16)
tensor([[0.0284, 0.9717]], device='cuda:0', dtype=torch.float16)
tensor([[0.0706, 0.9292]], device='cuda:0', dtype=torch.float16)
tensor([[0.1208, 0.8789]], device='cuda:0', dtype=torch.float16)
tensor([[0.0815, 0.9185]], device='cuda:0', dtype=torch.float16)
tensor([[0.4111, 0.5889]], device='cuda:0', dtype=torch.float16)
tensor([[0.1348, 0.8652]], device='cuda:0', dtype=torch.float16)
tensor([[0.1225, 0.8774]], device='cuda:0', dtype=torch.float16)
tensor([[0.1500, 0.8501]], device='cuda:0', dtype=torch.float16)
tensor([[0.1294, 0.8706]], device='cuda:0', dtype=torch.float16)
tensor([[0.1097, 0.8901]], device='cuda:0', dtype=torch.float16)
tensor([[0.1919, 0.8081]], device='cuda:0', dtype=torch.float16)
tensor([[0.1176, 0.8823]], device='cuda:0', dtype=torch.float16)
tensor([[0.1329, 0.8672]], device='cuda:0', dtype=torch.float16)
tensor([[0.0610, 0.9390]], device='cuda:0', dtype=torch.float16)
tensor([[0.2173, 0.7827]], device='cuda:0', dtype=torch.float16)
tensor([[0.1329, 0.8672]], device='cuda:0', dtype=torch.float16)
tensor([[0.0685, 0.9316]], device='cuda:0', dtype=torch.float16)
tensor([[0.0666, 0.9336]], device='cuda:0', dtype=torch.float16)
tensor([[0.1646, 0.8354]], device='cuda:0', dtype=torch.float16)
tensor([[0.0851, 0.9150]], device='cuda:0', dtype=torch.float16)
tensor([[0.3811, 0.6187]], device='cuda:0', dtype=torch.float16)
tensor([[0.1113, 0.8887]], device='cuda:0', dtype=torch.float16)
tensor([[0.3008, 0.6992]], device='cuda:0', dtype=torch.float16)
tensor([[0.0474, 0.9526]], device='cuda:0', dtype=torch.float16)
tensor([[0.1144, 0.8857]], device='cuda:0', dtype=torch.float16)

[Question] Exporting annotations / images in specific format

Is there an option to export the auto-annotated image annotations in a specific output format? say coco / yolo / voc etc.

ImportError: cannot import name 'DetectionOntology' from 'autodistill.detection' (/mnt/scratch/sibo/autodistill/autodistill/detection/init.py)

python code:
from autodistill_grounding_dino import GroundingDINO
from autodistill.detection import CaptionOntology

Traceback (most recent call last):
File "label_visualize.py", line 8, in
from autodistill_grounding_dino import GroundingDINO
File "/mnt/home/sibo/.conda/envs/autodistill/lib/python3.7/site-packages/autodistill_grounding_dino/init.py", line 1, in
from autodistill_grounding_dino.grounding_dino_model import GroundingDINO
File "/mnt/home/sibo/.conda/envs/autodistill/lib/python3.7/site-packages/autodistill_grounding_dino/grounding_dino_model.py", line 17, in
from autodistill.detection import CaptionOntology, DetectionBaseModel
File "/mnt/scratch/sibo/autodistill/autodistill/detection/init.py", line 1, in
from autodistill.detection.caption_ontology import CaptionOntology
File "/mnt/scratch/sibo/autodistill/autodistill/detection/caption_ontology.py", line 4, in
from autodistill.detection import DetectionOntology
ImportError: cannot import name 'DetectionOntology' from 'autodistill.detection' (/mnt/scratch/sibo/autodistill/autodistill/detection/init.py)

[Question] GPU Utilization

Is it normal for GPU utililization rate to be close to 0%? I'm trying to label a folder of images and it seems like the computation is almost 100% done on the CPU. Is there a setting I can change or a param I have to specify to allow GPU acceleration?

Dude

If only some of my frames have the object, is there any way to indicate this?
I trained some models, and it seems like it detects always something even if the object specify or not

Help, please

None class for some objects

Hi,
I am using your demo notebook for getting annotation but, for some objects label is None. How to resolve this ? Do i need to provide certain class names only which are predefined ?

how to convert created dataset to coco detection yolo format?

I am using GroundedSAM as base model to label all images , i have only 1 class.
Resulting dataset contains many points of identified labels (probably for segmentation usage as well).

However I need to convert this dataset to coco detection format (class and 4 points) to use it outside of auto distill.
Is it possible?

How to accelerate training on MPS (Apple M2 chip)

Unless I'm missing something, The docs do not show how accelerate training on MPS GPU

Autodistill quits after 2457 images. Memory leak?

I run autodistill like this:

autodistill  --base "grounded_sam" --model_type detection --epochs 50 --upload-to-roboflow False --target "yolov8" --ontology '{"caption": "label"}' --output ./dataset -y True ./images/

The images folder contains about 10.000 images.
It stops with this message, after processing 2457 images:
fish: Job 1, 'autodistill --base "grounded_s…' terminated by signal SIGTERM (Polite quit request)

This is repeatable. I observe it consumed a lot of memory, possibly about 32 GB. My machine has 64 GB and two RTX 2080, so I suspect it may be a memory leak that stopped it, but I wasn't at the machine to see this.

Ideally it wouldn't do this and it would have a resume capability that allows you to restart it where it left off.

Edit:
I think it must be a memory leak based on this code in classification_base_model.py:

        for f_path in progress_bar:
            progress_bar.set_description(desc=f"Labeling {f_path}", refresh=True)
            image = cv2.imread(f_path)

            f_path_short = os.path.basename(f_path)
            images_map[f_path_short] = image.copy()
            detections = self.predict(f_path)
            detections_map[f_path_short] = detections

So it keeps ALL the images in a dict, even after they were processed. This is unneccesary and will cause lots of memory consumption.

GroundedSAM Loadding Error

I get this error when I wnat to laod GroundedSAM model.
base_model = GroundedSAM(ontology=ontolgy)

ERROR:
trying to load grounding dino directly /home/radino/detection/autolabel/venv/lib/python3.8/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.) return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined] final text_encoder_type: bert-base-uncased Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight'] - This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). downloading dino model weights final text_encoder_type: bert-base-uncased Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight'] - This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model). - This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model). Traceback (most recent call last): File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/autodistill_grounded_sam/helpers.py", line 86, in load_grounding_dino grounding_dino_model = Model( File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/groundingdino/util/inference.py", line 118, in __init__ self.model = load_model( File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/groundingdino/util/inference.py", line 32, in load_model checkpoint = torch.load(model_checkpoint_path, map_location="cpu") File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/torch/serialization.py", line 797, in load with _open_zipfile_reader(opened_file) as opened_zipfile: File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/torch/serialization.py", line 283, in __init__ super().__init__(torch._C.PyTorchFileReader(name_or_buffer)) RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory During handling of the above exception, another exception occurred: Traceback (most recent call last): File "autolabel.py", line 70, in <module> base_model = GroundedSAM(ontology=ontology) File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/autodistill_grounded_sam/grounded_sam.py", line 38, in __init__ self.grounding_dino_model = load_grounding_dino() File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/autodistill_grounded_sam/helpers.py", line 105, in load_grounding_dino grounding_dino_model = Model( File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/groundingdino/util/inference.py", line 118, in __init__ self.model = load_model( File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/groundingdino/util/inference.py", line 32, in load_model checkpoint = torch.load(model_checkpoint_path, map_location="cpu") File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/torch/serialization.py", line 797, in load with _open_zipfile_reader(opened_file) as opened_zipfile: File "/home/radino/detection/autolabel/venv/lib/python3.8/site-packages/torch/serialization.py", line 283, in __init__ super().__init__(torch._C.PyTorchFileReader(name_or_buffer)) RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Could any body help me with this issue?

Corrupt JPEG data

Search before asking

I have searched the Autodistill issues and found no similar bug report.

Bug

I am following this tutorial: https://medium.com/@corpy.ai.lab/autodistill-automating-dataset-labeling-for-efficient-model-training-6aed5f63bea

and I am facing an error with the converted jpg images. This is from helpers.py's split_data function.
When I am done running this snippet:

        self.base_model.label(
            input_folder=config.image_dir,
            output_folder=config.dataset_dir,
            extension=".png",
        )

I run the following snippet:

        sv.DetectionDataset.from_yolo(
            images_directory_path=config.image_path,
            annotations_directory_path=config.annotation_path,
            data_yaml_path=config.data_yaml_path,
        )

I get this error:

Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: 1 extraneous bytes before marker 0xd9
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: premature end of data segment
Corrupt JPEG data: 3 extraneous bytes before marker 0xd9

If I use the original png images it works fine. I assume the conversion happens to compress the images? as far as I know yolo supports png images.

Environment

OS: macos 14.1
autodistil: 0.1.15
supervision: 0.16.0
Python: 3.8.13

Minimal Reproducible Example

import supervision as sv
from tqdm import tqdm
from pathlib import Path
from typing import Union
from autodistill_yolov8 import YOLOv8
from omegaconf import OmegaConf, DictConfig
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
from autodistill_grounding_dino import GroundingDINO

class AUTODISTILLATION:
    def __init__(self, config: Union[str, DictConfig]) -> None:
        if isinstance(config, DictConfig):
            self.config = config
        else:
            self.config = OmegaConf.load(config)
        self.onotology = CaptionOntology(self.config.labels)
        self.base_model = GroundingDINO(ontology=self.onotology)
        self.target_model = YOLOv8(self.config.target_model)

    @staticmethod
    def extract_frame(config: DictConfig):
        train_video_path = [config.train_video_path]
        image_dir = config.image_dir
        frame_stride = config.frame_stride
        for video_path in tqdm(train_video_path):
            if isinstance(video_path, str):
                video_path = Path(video_path)
            video_name = video_path.stem
            image_name_pattern = video_name + "-{:05d}.png"
            with sv.ImageSink(
                target_dir_path=image_dir, image_name_pattern=image_name_pattern
            ) as sink:
                for image in sv.get_video_frames_generator(
                    source_path=str(video_path), stride=frame_stride
                ):
                    sink.save_image(image=image)

    def label_dataset(self, config: DictConfig):
        print("[INFO] Start labelling")
        self.base_model.label(
            input_folder=config.image_dir,
            output_folder=config.dataset_dir,
            extension=".png",
        )

    def dataset_to_yolo(self, config: DictConfig):
        print("[INFO] Converting dataset to yolo format")
        sv.DetectionDataset.from_yolo(
            images_directory_path=config.image_path,
            annotations_directory_path=config.annotation_path,
            data_yaml_path=config.data_yaml_path,
        )

    def make_dataset(self, config: DictConfig):
        print("[INFO] Making dataset start ...")
        image_dir = Path(config.image_dir)
        if not any(image_dir.iterdir()):
            self.extract_frame(config)
        self.label_dataset(config)
        self.dataset_to_yolo(config)

    def train(self, config: DictConfig):
        print("[INFO] Training ....")
        self.target_model.train(
            config.data_yaml_path, epochs=config.epochs
        )

    def inference(self, source, conf):
        self.target_model.predict(source, conf)

if __name__ == "__main__":
    config_path = "config.yaml"
    config = OmegaConf.load(config_path)
    distillation = AUTODISTILLATION(config=config)
    distillation.make_dataset(config=distillation.config)
    # distillation.train(config=distillation.config)
    # distillation.inference(source=distillation.config.test_video_path, conf=distillation.config.conf)

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

Implement `autodistill-slime`

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Description

SLiMe is a one-shot segmentation model that uses a method based on Stable Diffusion. The SLiMe research materials show promising performance over SegGPT, another one-shot model available in the Autodistill ecosystem.

We would love to make SLiMe a base model so that people can easily use the model!

To learn about contributing a new base model to Autodistill, check out our base model implementation guide and template.

Use case

One-shot segmentation.

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

[Question] Extracting prompt for feature / actions

it is possible to extract features or actions from an image? like

instead of a person -> a person that is sitting or walking
instead of the player/person -> player who is kicking the ball
instead of bottle -> bottle that has fallen / sideway

Autodistill did'nt generate single annotated image

from autodistill.detection import CaptionOntology

ontology=CaptionOntology({
"licence": "number-plate ",
})

from autodistill_grounded_sam import GroundedSAM

base_model = GroundedSAM(ontology=ontology)
dataset = base_model.label(
input_folder=IMAGE_DIR_PATH,
extension=".png",
output_folder=DATASET_DIR_PATH)

Hi Team, I hope you are all doing well. I want to annotate my dataset called pak-number-plate. I was following the giving code pipeline. But the model didn't annotate a single image. Could you please verify my pipeline and also guide me a lil?

FileNotFoundError: file "/root/.cache/autodistill/groundingdino/GroundingDINO_SwinT_OGC.py" does not exist

When I try to use GroundedSAM to generate annotation, It went something wrong.

trying to load grounding dino directly
downloading dino model weights
Traceback (most recent call last):
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/autodistill_grounded_sam/helpers.py", line 89, in load_grounding_dino
device=DEVICE,
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/groundingdino/util/inference.py", line 121, in init
device=device
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/groundingdino/util/inference.py", line 29, in load_model
args = SLConfig.fromfile(model_config_path)
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/groundingdino/util/slconfig.py", line 185, in fromfile
cfg_dict, cfg_text = SLConfig._file2dict(filename)
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/groundingdino/util/slconfig.py", line 79, in _file2dict
check_file_exist(filename)
File "/workspace/hpt/anaconda3/envs/paddle_env/lib/python3.7/site-packages/groundingdino/util/slconfig.py", line 23, in check_file_exist
raise FileNotFoundError(msg_tmpl.format(filename))
FileNotFoundError: file "/root/.cache/autodistill/groundingdino/GroundingDINO_SwinT_OGC.py" does not exist

feature request: add a search bar to the docs

Use labelled data to improve labeling

Say I have labeled some images in a format of one of the target models (e.g. YOLOv8).

Is it possible to use them to assist the automatic labeling by the base model?

In my use case, the classes to label (for instance segmentation) are not easy to describe in the ontology, which results in poor out-of-box performance.

Implement ```autodistill-yolov6```

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Description

With the recent release of Yolov6v3, I'd like to submit a PR for hacktoberfest with my team implementing this model.

Yolov6v3 is implemented in ultralytics, so the change shouldn't be too different than YoloNAS.

If anyone has any considerations before implementation would love to start a conversation, or perhaps an intro meeting to get acquainted.

Use case

Autodistill can be used alongside Yolov6 to go dataset->model like other model implementations.

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

Implement `autodistill-kosmos2`

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Description

Kosmos-2 is a multimodal language model that you can use to detect objects in images.

We would love to make Kosmos-2 a base model so that people can easily use the model!

To learn about contributing a new base model to Autodistill, check out our base model implementation guide and template.

Use case

Zero-shot object detection.

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

how to use with mps (m2 chip)

Windows install reverts ultralytics to old version. autodistill not found

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Question

How can I install ultralytics AND autodistill with latest version?

pip install --upgrade
Using cached ultralytics-8.0.196-py3-none-any.whl (631 kB)
Installing collected packages: ultralytics
  Attempting uninstall: ultralytics
    Found existing installation: ultralytics 8.0.81
    Uninstalling ultralytics-8.0.81:
      Successfully uninstalled ultralytics-8.0.81
**autodistill-yolov8 0.1.1 requires ultralytics==8.0.81, but you have ultralytics 8.0.196 which is incompatible.**
Successfully installed ultralytics-8.0.196

Additional

No response

Implement `autodistill-owlv2`

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Description

Google has open sourced a Colab showing how to use OWLv2 which shows promising results for zero-shot object detection.

We would love to make OWLv2 a base model so that people can easily use the model!

To learn about contributing a new base model to Autodistill, check out our base model implementation guide and template.

Use case

Zero-shot object detection.

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

Circular Import Error when Importing 'GroundedSAM' from 'autodistill_grounded_sam'

Description:
I encountered a circular import error when attempting to import the 'GroundedSAM' module from the 'autodistill_grounded_sam' package.

My system is:

Distributor ID: Ubuntu
Description:    Ubuntu 23.04
Release:        23.04
Codename:       lunar

nvidia: Driver Version: 525.125.06   CUDA Version: 12.0  
python version  3.10 in conda

Anyone had the same problem and was able to solve it?

could you provide a demo of segGPT autdistill

Consider to create a dataset folder structure like this...

I have this this Python script, for example:

from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM

annote = GroundedSAM(ontology = CaptionOntology({"object to detect": "object"}))
annote.label("./object images")

It creates a YOLO suitable dataset folder, but this folder has always an "Instance Segmentation structure".
How can I create a different YOLO dataset folder structure, suitable for Object Detection too? I suppose I cannot...

I think you should consider to create a folder structure like this (suitable both for Instance Segmentation and Object Detection):

This image came from Ultralytics Hub documentation (https://docs.ultralytics.com/hub/datasets)
Another problem for dataset folder created by Autodistill is that it is not compliant with Ultralytics Hub...
... So I cannot upload it there :(

I really like this library, it's one of my favorites, so I felt compelled to tell you my opinion, I hope it could help💪

GroundingDINO_SwinT_OGC.py does not exist

Hi, I have gone through the instructions and seem to run into the following error:

FileNotFoundError: file "/home/simon/.cache/autodistill/groundingdino/GroundingDINO_SwinT_OGC.py" does not exist

I've attached my console output and the file that I am running as well as a copy of my libraries installed. The Autodistil.txt is a .py but github won't let me upload the original .py file.

Any help here would be greatly appreciated.

All the best,

Simon

ConsoleOutput.txt
CondaEnv.txt
Autodistil (copy).txt

How to get visual confirmation of all annotations

I ran base_model.label and now have a nice dataset folder with all my images and annotations. Is there an easy way/app that I can use to visually confirm that the annotations are correct?

grounded DINO-V2

Hello,
are you planning on integrating DINO-2 for faster and more accurate results for bbox+labels, rather than grounded-dino?

fine-tune models to specific classes

looks promising! so if I understand correctly - there are two phases: (1) auto-label an image dataset, something that you already showed in the G-SAM previous video (2) split the dataset to test/train/val and build a CV model with YOLO-8
q1: since auto-labeling is not 100% fullproof, what about the human in the loop?
q2: what if the base model doesn't recognize the labels (name of specific boats I have in a marine dataset), how can I fine-tune it? so the model will find a boat, but will also "know" the brand of the boat

FastSAM instead of SAM ?

Will that increase the efficiency of the overall annotation rate of a huge dataset?

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

My issues image

My code

How do I solve this problem?

UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 2875: illegal multibyte sequence

!pip install -q
autodistill
autodistill-grounded-sam
autodistill-yolov8
supervision==0.9.0

error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [7 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
File "C:\Users\15852\AppData\Local\Temp\pip-install-ev32519r\rf-groundingdino_d3d18c18e7804384a8ed46bce84eecd1\setup.py", line 41, in
readme = readme_file.read()
^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 2875: illegal multibyte sequence
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

The attempts I have made are:
Add PYTHONIOENCODING="utf-8" to the system environment variable.

Install the necessary packages:
!pip install -q
opencv-python>=4.6.0
supervision
tqdm
Pillow>=7.1.2 \
PyYAML>=5.3.1.

Still can't fix the problem.

improve label speed for GroundedDINO

Thanks for the awesome work!
Inference speed is like 0.5 - 1 fps on V100 for GroundedDINO. Could I ask if there is anyway we can improve it? Using multiple GPU to inference images in the same time?

PytorchStreamReader failed reading zip archive: failed finding central directory

Documentation update may be needed

Search before asking

I have searched the Autodistill issues and found no similar bug report.

Bug

Current:

https://docs.autodistill.com/
autodistill images --base-model="grounding_dino" --target-model="yolov8" --ontology '{"prompt": "label"}' --output-folder ./dataset

Suggested update
autodistill images --base="grounding_dino" --target="yolov8" --ontology '{"prompt": "label"}' --output ./dataset

Environment

No response

Minimal Reproducible Example

No response

Additional

No response

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

SAM Clip Error

Hello,

On Several images in the dataset. I get the following error. Is it the issue of error handling by the code?

Labeling People/beautiful-serene-black-woman-reflection-1296x728-header.jpg: 6%|▏ | 7/114 [00:39<09:57, 5.58s/it]
Traceback (most recent call last):
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/SAMClip_labels.py", line 10, in
base_model.label(input_folder=folder_name, output_folder="mldata")
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/autodistillenv/lib/python3.11/site-packages/autodistill/detection/detection_base_model.py", line 44, in label
detections = self.predict(f_path)
^^^^^^^^^^^^^^^^^^^^
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/autodistillenv/lib/python3.11/site-packages/autodistill_sam_clip/sam_clip.py", line 162, in predict
nms = sv.non_max_suppression(np.array(nms_data), 0.5)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/autodistillenv/lib/python3.11/site-packages/supervision/detection/utils.py", line 84, in non_max_suppression
rows, columns = predictions.shape
^^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)
(autodistillenv) %$ open People/beautiful-serene-black-woman-reflection-1296x728-header.jpg

Partially Initialized Model

from autodistill_grounded_sam import GroundedSAM
from autodistill_yolov8 import YOLOv8
from autodistill.detection.caption_ontology import CaptionOntology

# define an ontology to map class names to our GroundingDINO prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
base_model = GroundedSAM(ontology=CaptionOntology({"car": "car"}))

# label all images in a folder called `context_images`
base_model.label(
  input_folder="D:/work/dev/python/segmentation/test/03_split/",
  output_folder="D:/work/dev/python/segmentation/test/03_split/"
)

#target_model = YOLOv8("yolov8n.pt")
#target_model.train("./dataset/data.yaml", epochs=200)

# run inference on the new model
#pred = target_model.predict("./dataset/valid/your-image.jpg", confidence=0.5)
#print(pred)

Error

D:\work\dev>C:/Python/Python311/python.exe d:/work/dev/github/autodistill/autodistill.py
Traceback (most recent call last):
  File "d:\work\dev\github\autodistill\autodistill.py", line 1, in <module>
    from autodistill_grounded_sam import GroundedSAM
  File "C:\Python\Python311\Lib\site-packages\autodistill_grounded_sam\__init__.py", line 1, in <module>
    from autodistill_grounded_sam.grounded_sam import GroundedSAM
  File "C:\Python\Python311\Lib\site-packages\autodistill_grounded_sam\grounded_sam.py", line 17, in <module>
    from autodistill.detection import CaptionOntology, DetectionBaseModel
  File "d:\work\dev\github\autodistill\autodistill\detection\__init__.py", line 1, in <module>
    from autodistill.detection.caption_ontology import CaptionOntology
  File "d:\work\dev\github\autodistill\autodistill\detection\caption_ontology.py", line 4, in <module>
    from autodistill.detection import DetectionOntology
ImportError: cannot import name 'DetectionOntology' from partially initialized module 'autodistill.detection' (most likely due to a circular import) (d:\work\dev\github\autodistill\autodistill\detection\__init__.py)

Error with OWLVit

Hello,

While using OWLVit as a base model, I get the following error:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

Request: Accept numpy arrays in .label() method

When calling model.label() with autodistill, it'd be helpful if images can be provided as numpy arrays. Many use cases have images in this format already, and converting between image encoding can be compute intensive and cumbersome.

[Question] How to export model weights (e.g distilled yolov5)

Hello, is there a way to download the model weights for the distilled model?

TypeError: expected str, bytes or os.PathLike object, not ClassificationDataset

When I use autodistill-clip for image classification, the runtime prompts this error:TypeError: expected str, bytes or os.PathLike object, not ClassificationDataset

Error in import supervision as sv

Exception has occurred: AttributeError
partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)
File "/workspace/code/images_generation.py", line 2, in
import supervision as sv
AttributeError: partially initialized module 'cv2' has no attribute 'gapi_wip_gst_GStreamerPipeline' (most likely due to a circular import)

AttributeError: 'DETR' object has no attribute 'train'

Search before asking

I have searched the Autodistill issues and found no similar bug report.

Bug

I tried to distill Grounding SAM with a DETR object detector and got a weird output thought to report.

Basically, I edited the sample code you have in your doc but I added caption ontology and used the YAML folder to run the train method, first part went well and it downloaded the pretrained ResNet but ran into error on the second line, it seems that somehow it cannot find the train function (see below), I use autodistill_detr-0.1.0:


from autodistill_detr import DETR

# load the model
target_model = DETR(ontology)

# train for 10 epochs
target_model.train(DATA_YAML_PATH, epochs=10)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[36], line 8
      4 target_model = DETR(ontology)
      6 # train for 10 epochs
      7 #target_model.train("./roads", epochs=10)
----> 8 target_model.train(DATA_YAML_PATH, epochs=10)
     10 # run inference on an image
     11 #target_model.predict("./roads/valid/-3-_jpg.rf.bee113a09b22282980c289842aedfc4a.jpg")

AttributeError: 'DETR' object has no attribute 'train'

Environment

Kaggle jupyter notebook (also saw same behavior on AWS sagemaker)
GPU: T4 x2
autodistill 0.1.0
autodistill_detr-0.1.0

Minimal Reproducible Example

from autodistill.detection import CaptionOntology
from autodistill_detr import DETR

ontology=CaptionOntology({
    "circle": "19",
    "road sign" : "16", 
})

# load the model
target_model = DETR(ontology)

# train for 10 epochs
#target_model.train(DATA_YAML_PATH, epochs=10)

Additional

I opened this issue under autodistill-detr and got no attention, if requested I will reopen the original issue:
autodistill/autodistill-detr#2

Are you willing to submit a PR?

Yes I'd like to help by submitting a PR!

optimize version? python, cuda, cudnn

What should be the python version and the cuda and cudnn versions to utilize autodistill?

Call to .split_data() in .label() causing confusion

The call to split_data is causing some confusion IMHO.

It should either:

be a flag we can toggle on/off.
copy the files to the YOLO train/val folders instead of moving them.
have some output in the command line to tell the user about the file being moved.

Right now, someone calling label() would see the YOLO folder structure appears with an empty annotations folder and waste a few minutes trying to figure out why.

To make it even more confusing, calling label() a second time would crash on the image copy (because it does not want to overwrite existing file) and keep all the annotations in the annotations folder.

Filtering the detection

I am using GroundedSAM to label e-scooters using
base_model.label(input_folder="./images", output_folder="./dataset")

however with
ontology = CaptionOntology({ "electric scooter": "e-scooter", }) it does not seem to work on all images.

But modifying this to ontology = CaptionOntology({ "e-scooter": "e-scooter", "electric scooter": "e-scooter", "kick scooter": "e-scooter", }) increases the detection rate.

However this results in the same object having multiple labels. I was wondering if there is any way I can avoid getting multiple detections for the same object?

An ugly way I went about it is to extract the detections and compare the overlap between the masks, then keep the one with the highest confidence. But this results in the datatype to change and thereby subsequent code all requiring modification.

Error in GroudingDINO label

Hello,

It seems there is a minor issue in using GroundingDINO. I get the following message: Perhaps, a simple check may solve the
issue.

Labeling People/thenewyorker_its-what-each-person-needs.jpg: 100%|█████████████████| 114/114 [00:31<00:00, 3.67it/s]
Traceback (most recent call last):
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/GroundingDINO_labels.py", line 10, in
base_model.label(input_folder=folder_name, output_folder='mldata')
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/autodistillenv/lib/python3.11/site-packages/autodistill/detection/detection_base_model.py", line 58, in label
split_data(output_folder)
File "/media/csverma/M2Disk/Projects/CompVis/ObjectDetection/AutoDistill/autodistillenv/lib/python3.11/site-packages/autodistill/helpers.py", line 66, in split_data
shutil.move(os.path.join(images_dir, file + ".jpg"), train_images_dir)
File "/home/linuxbrew/.linuxbrew/opt/[email protected]/lib/python3.11/shutil.py", line 823, in move
raise Error("Destination path '%s' already exists" % real_dst)
shutil.Error: Destination path 'mldata/train/images/images2.jpg' already exists

SAM Clip confidence level?

Search before asking

I have searched the Autodistill issues and found no similar feature requests.

Question

Hello,

I was wondering if the following line is correct in the code:

def predict(self, input: str, confidence: int = 0.5) -> sv.Detections:

Shouldn't this value be float instead of int?

Thanks

Additional

No response

autodistill / autodistill Goto Github PK

autodistill's Introduction

🔗 Quicklinks

👀 Example Output

🚀 Features

📚 Basic Concepts

💡 Theory and Limitations

💿 Installation

🚀 Quickstart

Install Packages

Distill a Model

📍 Available Models

object detection

instance segmentation

classification

Roboflow Model Deployment Support

🎬 Video Guides

💡 Community Resources

🗺️ Roadmap

🏆 Contributing

👩‍⚖️ License

Frequently Asked Questions ❓

What causes the PytorchStreamReader failed reading zip archive: failed finding central directory error?

💻 explore more Roboflow open source projects

autodistill's People

Contributors

Stargazers

Watchers

Forkers

autodistill's Issues

Search before asking

Question

Additional

Search before asking

Bug

Environment

Minimal Reproducible Example

Additional

Are you willing to submit a PR?

Search before asking

Description

Use case

Additional

Are you willing to submit a PR?

Search before asking

Description

Use case

Additional

Are you willing to submit a PR?

Search before asking

Description

Use case

Additional

Are you willing to submit a PR?

Search before asking

Question

Additional

Search before asking

Description

Use case

Additional

Are you willing to submit a PR?

Search before asking

Bug

Environment

Minimal Reproducible Example

Additional

Are you willing to submit a PR?

Search before asking

Bug

Environment

Minimal Reproducible Example

Additional

Are you willing to submit a PR?

Search before asking

Question

Additional

Recommend Projects

Recommend Topics

Recommend Org

What causes the `PytorchStreamReader failed reading zip archive: failed finding central directory` error?