
Comments (5)

glenn-jocher commented on July 17, 2024

@iokarkan hello,

Thank you for providing detailed information about your issue. It appears that the problem lies in the integration between Ultralytics YOLOv8 and Weights & Biases (W&B), specifically with the display of unused COCO class labels in the validation table images.

To help us investigate further, could you please provide a minimal reproducible example that includes the specific YAML configuration file you are using? This will allow us to replicate the issue more accurately. You can find guidance on creating a minimal reproducible example here.

Additionally, please ensure that you are using the latest versions of both Ultralytics YOLOv8 and Weights & Biases packages. Sometimes, issues like these are resolved in newer releases.

Here's a quick checklist to verify:

  1. Ensure your ultralytics package is up-to-date:
    pip install --upgrade ultralytics
  2. Ensure your wandb package is up-to-date:
    pip install --upgrade wandb

If the issue persists after updating, please share the YAML configuration and any additional relevant code snippets. This will help us diagnose and address the problem more effectively.

Thank you for your cooperation! 😊

from ultralytics.

iokarkan commented on July 17, 2024

The strategy to reproduce the bug is to use an OpenImagesv7 checkpoint and train on a modified COCO-8 dataset with 2 classes.

The venv uses ultralytics==8.1.27 and wandb==0.17.0, as in the original bug description. I noted I do get a warning:

wandb: WARNING This integration is tested and supported for ultralytics v8.0.238 and below.
wandb: WARNING             Please report any issues to https://github.com/wandb/wandb/issues with the tag `yolov8`.

I will therefore also investigate with the version reported as supported (changing the requirements line to 8.0.238 and re-running seems to give the same wandb result).

After training for 2 epochs, the following shows up in wandb:
[image: wandb screenshot]

From what I understand, the extra classes come from the OpenImagesv7 dataset and should not be predicted in my validation at all, as I am not using them in my transfer learning.
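One rough way to confirm where such names come from is to diff the class names stored in the checkpoint against those in the dataset YAML. The sketch below is only an illustration: `extra_class_names` is a hypothetical helper, `checkpoint_names` stands in for what `model.names` would return for the loaded checkpoint, and the entries shown are placeholder values rather than the real OIv7 list.

```python
def extra_class_names(checkpoint_names, dataset_names):
    """Return checkpoint class ids/names that the dataset YAML does not define."""
    return {
        idx: name
        for idx, name in checkpoint_names.items()
        if idx not in dataset_names
    }


# Placeholder stand-in for model.names of the pre-trained checkpoint (illustrative only)
checkpoint_names = {0: "Accordion", 1: "Adhesive tape", 2: "Aircraft", 3: "Airplane"}
# The names block from coco8-reduced.yaml
dataset_names = {0: "cake", 1: "potted plant"}

extras = extra_class_names(checkpoint_names, dataset_names)
print(extras)
```

Any id that shows up in `extras` is one the dataset never defines, so seeing it in a logged table would point at the checkpoint's label map rather than the training data.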


Below are the files used in the process:

  • coco8-reduced.yaml
    • This points to a modified coco8 dataset, with only 1 picture kept and the corresponding txt label file keeping only 2 classes, renumbered to 0 and 1 (to match the yaml). I downloaded it and modified it to be able to train, as I could not find a faster way to prepare a dataset for transfer learning:

e.g. 000000000009.txt as

0 0.479492 0.688771 0.955609 0.5955
0 0.736516 0.247188 0.498875 0.476417
0 0.339438 0.418896 0.678875 0.7815
1 0.646836 0.132552 0.118047 0.0969375
1 0.773148 0.129802 0.0907344 0.0972292
1 0.668297 0.226906 0.131281 0.146896
1 0.642859 0.0792187 0.148063 0.148062

coco8-reduced.yaml contents:

# Ultralytics YOLO 🚀, AGPL-3.0 license
# COCO8 dataset (first 8 images from COCO train2017) by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/detect/coco8/
# Example usage: yolo train data=coco8.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco8  ← downloads here (1 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8 # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)

# Classes
names:
  0: cake
  1: potted plant

# Download script/URL (optional)
download: https://ultralytics.com/assets/coco8.zip
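
For anyone preparing a similar reduced dataset, the label renumbering described above can be scripted. This is a minimal sketch, not the procedure actually used here: `remap_label_lines` is a hypothetical helper, and the source ids in `id_map` (55 and 58) are assumptions that should be replaced with whatever ids the original label files use.

```python
def remap_label_lines(lines, id_map):
    """Keep only boxes whose class id is in id_map and rewrite the id.

    Each line is 'class x_center y_center width height' (normalized YOLO format).
    """
    remapped = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        old_id = int(parts[0])
        if old_id in id_map:
            remapped.append(" ".join([str(id_map[old_id]), *parts[1:]]))
    return remapped


# Assumed original ids -> new ids matching the reduced YAML (0: cake, 1: potted plant)
id_map = {55: 0, 58: 1}
original = [
    "55 0.479492 0.688771 0.955609 0.5955",
    "58 0.646836 0.132552 0.118047 0.0969375",
    "45 0.5 0.5 0.1 0.1",  # class not kept, so this box is dropped
]
new_lines = remap_label_lines(original, id_map)
print("\n".join(new_lines))
```

Boxes of classes outside `id_map` are dropped rather than renumbered, so the resulting file only ever contains ids 0 and 1.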

The pinned requirements:

absl-py==2.1.0
asttokens==2.4.1
astunparse==1.6.3
beautifulsoup4==4.12.3
cachetools==5.3.3
certifi==2022.12.7
charset-normalizer==2.1.1
click==8.1.7
comm==0.2.2
contourpy==1.2.1
cycler==0.12.1
debugpy==1.8.1
decorator==5.1.1
decord==0.6.0
docker-pycreds==0.4.0
exceptiongroup==1.2.0
executing==2.0.1
filelock==3.9.0
fire==0.6.0
flatbuffers==24.3.25
fonttools==4.51.0
fsspec==2023.4.0
gast==0.5.4
gdown==5.1.0
gitdb==4.0.11
GitPython==3.1.43
google-auth==2.29.0
google-auth-oauthlib==1.0.0
google-pasta==0.2.0
grpcio==1.62.1
h5py==3.11.0
idna==3.4
ipykernel==6.29.4
ipython==8.23.0
jedi==0.19.1
Jinja2==3.1.2
jupyter_client==8.6.1
jupyter_core==5.7.2
keras==2.14.0
kiwisolver==1.4.5
libclang==18.1.1
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.3
matplotlib==3.8.4
matplotlib-inline==0.1.7
mdurl==0.1.2
ml-dtypes==0.2.0
mpmath==1.3.0
nest-asyncio==1.6.0
networkx==3.2.1
numpy==1.26.3
nvidia-cublas-cu11==11.11.3.6
nvidia-cuda-cupti-cu11==11.8.87
nvidia-cuda-nvcc-cu11==11.8.89
nvidia-cuda-runtime-cu11==11.8.89
nvidia-cudnn-cu11==8.7.0.84
nvidia-cufft-cu11==10.9.0.58
nvidia-curand-cu11==10.3.0.86
nvidia-cusolver-cu11==11.4.1.48
nvidia-cusparse-cu11==11.7.5.86
nvidia-nccl-cu11==2.16.5
oauthlib==3.2.2
# NOTE: there are conflicts when both libraries are installed
# https://stackoverflow.com/questions/55313610/importerror-libgl-so-1-cannot-open-shared-object-file-no-such-file-or-directo
opencv-python==4.9.0.80
opencv-python-headless==4.9.0.80
opencv-contrib-python-headless==4.9.0.80
opt-einsum==3.3.0
packaging==24.0
pandas==2.2.2
parso==0.8.4
pexpect==4.9.0
pillow==10.2.0
platformdirs==4.2.0
prompt-toolkit==3.0.43
protobuf==4.25.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
py-cpuinfo==9.0.0
pyasn1==0.6.0
pyasn1_modules==0.4.0
pybboxes==0.1.6
Pygments==2.17.2
pyparsing==3.1.2
PySocks==1.7.1
python-dateutil==2.9.0.post0
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.0
requests==2.28.1
requests-oauthlib==2.0.0
retina-face==0.0.16
rich==13.7.1
rsa==4.9
# TODO: change this when 0.11.17 is accepted in PyPI
# sahi==0.11.16
git+https://github.com/obss/[email protected]#egg=sahi
scipy==1.13.0
seaborn==0.13.2
sentry-sdk==2.2.1
setproctitle==1.3.3
shapely==2.0.4
six==1.16.0
smmap==5.0.1
soupsieve==2.5
stack-data==0.6.3
sympy==1.12
tensorboard==2.14.1
tensorboard-data-server==0.7.2
tensorflow==2.14.0
tensorflow-estimator==2.14.0
tensorflow-io-gcs-filesystem==0.36.0
tensorrt==8.5.3.1
termcolor==2.4.0
terminaltables==3.1.10
thop==0.1.1.post2209072238
--extra-index-url https://download.pytorch.org/whl/cu118
torch==2.1.1+cu118
torchvision==0.16.1+cu118
tornado==6.4
tqdm==4.66.2
traitlets==5.14.2
triton==2.1.0
typing_extensions==4.8.0
tzdata==2024.1
ultralytics==8.1.27
urllib3==1.26.13
wandb==0.17.0
wcwidth==0.2.13
Werkzeug==3.0.2
wrapt==1.14.1

The training script:

import wandb
wandb.login()

from ultralytics import YOLO, settings
from wandb.integration.ultralytics.callback import add_wandb_callback
settings.update({
    'datasets_dir': '../datasets/',
    'runs_dir': '../runs/',
    })

# View all settings
print(settings)

# Download/load the YOLOv8 model in the _weights folder
model_size = 'n'
model = YOLO(f'../_weights/yolov8{model_size}-oiv7.pt')

# initialize a wandb project
wandb.init(project="ultralytics-issue", name="test", job_type="training")

# track training with wandb
add_wandb_callback(model, enable_model_checkpointing=True)

results = model.train(
    project="ultralytics-issue",
    name="test",
    data="./coco8-reduced.yaml",
    epochs=2,
    patience=50,
    optimizer="Adam",
    seed=7,
    imgsz=640,
    batch=8,
    dropout=0.0,
    resume=False,
    device=0
    )

# finalize the W&B Run
wandb.finish()

# reset all settings to default values
settings.reset()


glenn-jocher commented on July 17, 2024

Hello @iokarkan,

Thank you for providing the detailed information and the reproducible example. It’s very helpful for diagnosing the issue.

From your description and the provided code, it seems that the problem lies in the integration between Ultralytics YOLOv8 and Weights & Biases (W&B), where unused COCO class labels are still appearing in the validation table images.

Steps to Address the Issue:

  1. Verify Package Versions: Ensure you are using the latest versions of both ultralytics and wandb. The warning you received indicates that the integration is tested for ultralytics v8.0.238 and below. However, you mentioned using ultralytics v8.1.27. Please try updating to the latest versions to see if the issue persists:

    pip install --upgrade ultralytics wandb
  2. Check Class Mappings: Ensure that the class mappings in your coco8-reduced.yaml file are correctly set and that the model is properly initialized with the new class labels. It appears you have done this, but double-checking might help.

  3. W&B Callback: The W&B callback should correctly log the new class labels. Ensure that the callback is correctly added and that the enable_model_checkpointing parameter is set to True:

    from wandb.integration.ultralytics import add_wandb_callback
    add_wandb_callback(model, enable_model_checkpointing=True)
  4. Debugging: To further debug, you might want to print out the class labels and predictions during the validation phase to ensure that the model is not predicting the excluded classes:

    results = model.val()
    print(results)
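
To make that check concrete, the predicted class ids can be compared against the ids defined in the dataset YAML. The following is a hedged sketch: `count_out_of_range` is a hypothetical helper, and the id list is made up for illustration. With Ultralytics, the ids could be taken from something like `result.boxes.cls.tolist()` on a prediction result.

```python
from collections import Counter


def count_out_of_range(pred_class_ids, names):
    """Count predicted class ids that are not keys of the dataset's names dict."""
    return Counter(int(c) for c in pred_class_ids if int(c) not in names)


names = {0: "cake", 1: "potted plant"}
pred_class_ids = [0.0, 1.0, 0.0, 322.0]  # made-up values for illustration

unexpected = count_out_of_range(pred_class_ids, names)
print(unexpected)
```

An empty counter would suggest the model itself only predicts the dataset's classes, which narrows the problem down to how the labels are logged.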

Example Code Snippet:

Here’s a concise example to ensure everything is set up correctly:

import wandb
from ultralytics import YOLO, settings
from wandb.integration.ultralytics.callback import add_wandb_callback

# Initialize W&B
wandb.login()
wandb.init(project="ultralytics-issue", name="test", job_type="training")

# Load the model
model = YOLO('../_weights/yolov8n-oiv7.pt')

# Add W&B callback
add_wandb_callback(model, enable_model_checkpointing=True)

# Train the model
results = model.train(
    project="ultralytics-issue",
    name="test",
    data="./coco8-reduced.yaml",
    epochs=2,
    imgsz=640,
    batch=8,
    device=0
)

# Validate the model
val_results = model.val()
print(val_results)

# Finalize W&B run
wandb.finish()

Additional Resources:

For more detailed guidance on integrating Ultralytics YOLOv8 with Weights & Biases, you can refer to the Ultralytics documentation on W&B integration.

If the issue persists after these steps, please let us know, and we can further investigate. Your cooperation and detailed reporting are greatly appreciated! 😊


iokarkan commented on July 17, 2024

Thank you for taking the time @glenn-jocher.

A couple of remarks:

  • The "extra" / "unused" labels in wandb come from the pre-trained checkpoint in both cases detailed in my posts, so in my latest MRE post the labels come from OIv7, not COCO.
  • I edited my post to say I did change to ultralytics==8.0.238 but the observed wandb behaviour persists.
  • The suggested wandb setup in the training script looks identical to what I posted, please let me know if there's something I missed.

Based on your other suggestion, I validated the model with model.val():

YOLOv8n summary (fused): 168 layers, 3006038 parameters, 0 gradients, 8.1 GFLOPs
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1/1 [00:00<00:00, 39.25it/s]
                   all          4          1    0.00182          1     0.0212     0.0129
                  cake          4          1    0.00182          1     0.0212     0.0129


glenn-jocher commented on July 17, 2024

Hello @iokarkan,

Thank you for the detailed follow-up and for providing the additional context. Your observations are very helpful in diagnosing the issue.

Key Points:

  1. Source of Extra Labels: It’s clear that the extra labels in W&B are originating from the pre-trained OpenImagesv7 (OIv7) checkpoint. This indicates that the issue is related to how W&B logs the class labels from the pre-trained model, even after they have been overridden.

  2. Version Verification: You've confirmed that the issue persists with ultralytics==8.0.238, which is within the supported range for W&B integration. This helps narrow down the potential causes.

  3. Validation Results: Your validation results show that the model is indeed predicting only the intended classes (cake), which aligns with your training setup. This further suggests that the issue is specific to the W&B logging rather than the model's predictions.

Next Steps:

To address the issue with W&B logging extra labels, consider the following steps:

  1. Explicit Class Mapping: Ensure that the class mapping is explicitly set in the W&B configuration. This can sometimes help in overriding the default labels from the pre-trained checkpoint.

  2. Custom Callback: You might want to register a custom callback so that the intended classes are recorded alongside the run. Here’s a quick sketch using the Ultralytics callback hook; it does not change what the W&B integration itself logs:

    import wandb

    def log_intended_classes(validator):
        # Runs after each validation pass; validator.names holds the class
        # names in use, e.g. {0: "cake", 1: "potted plant"}
        names = validator.names
        classes = list(names.values()) if isinstance(names, dict) else list(names)
        if wandb.run is not None:
            wandb.run.summary["custom_classes"] = classes

    # Register the function for the Ultralytics "on_val_end" event
    model.add_callback("on_val_end", log_intended_classes)
  3. W&B Support: Since this issue seems to be specific to the W&B integration, it might be beneficial to reach out to W&B support or check their GitHub issues for similar reports. They might have additional insights or fixes for this behavior.
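
As an illustration of the explicit class mapping idea, the sketch below builds a bounding-box payload whose `class_labels` contains only the dataset's classes. `build_wandb_boxes` is a hypothetical helper and the detections are made-up values; the dict layout follows the `boxes` argument of `wandb.Image`, but whether the integration's validation table can be fed this way is an assumption I have not verified.

```python
def build_wandb_boxes(detections, names):
    """Build a boxes dict in the layout wandb.Image accepts, restricted to `names`.

    detections: iterable of (class_id, confidence, (min_x, min_y, max_x, max_y)),
    with coordinates given as fractions of the image size.
    """
    box_data = []
    for class_id, conf, (min_x, min_y, max_x, max_y) in detections:
        if class_id not in names:
            continue  # drop classes the dataset does not define
        box_data.append({
            "position": {"minX": min_x, "minY": min_y, "maxX": max_x, "maxY": max_y},
            "class_id": class_id,
            "box_caption": f"{names[class_id]} {conf:.2f}",
            "scores": {"confidence": conf},
        })
    return {"predictions": {"box_data": box_data, "class_labels": dict(names)}}


names = {0: "cake", 1: "potted plant"}
detections = [
    (0, 0.91, (0.10, 0.20, 0.50, 0.60)),
    (322, 0.30, (0.00, 0.00, 1.00, 1.00)),  # made-up id outside the dataset, dropped
]
boxes = build_wandb_boxes(detections, names)
print(len(boxes["predictions"]["box_data"]))
```

The result could then be passed as `wandb.Image(image, boxes=boxes)` when logging images manually, so the overlay legend lists only the two dataset classes.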

Conclusion:

The issue appears to be with how W&B logs class labels from the pre-trained checkpoint. By ensuring explicit class mapping and possibly using a custom callback, you can mitigate this issue. If the problem persists, reaching out to W&B support would be a prudent next step.

Thank you for your patience and detailed reporting. If you have any further questions or need additional assistance, feel free to ask. 😊
