
Comments (8)

glenn-jocher commented on June 25, 2024

Hello!

We're glad to hear the information was helpful! 🚀 To get started with modifying the codebase for your needs, I recommend first familiarizing yourself with the structure of the YOLOv8 model, particularly focusing on the files where the loss functions are defined and handled.

A good starting point would be to look into the models directory, where you'll find the model definitions and forward pass logic. Pay special attention to the loss computation sections within these files. This will give you insight into where and how to implement the conditional logic for handling images with and without bounding boxes.

If you encounter any specific issues or have questions as you go through the code, feel free to reach out. We're here to help!

Happy coding!

from ultralytics.

glenn-jocher commented on June 25, 2024

@sidharthanup hello!

Absolutely, I'd be happy to help you navigate the codebase! 😊

You're on the right track by looking into the trainer.py file. The backward call indeed indicates where the gradients are computed, but for modifying the loss calculations, you'll want to dive a bit deeper into the specifics of how the loss is constructed.

Here are the key areas to focus on:

  1. Loss Calculation:

    • The loss functions are defined separately from the model definitions. For YOLOv8, look at the loss.py file in the ultralytics/utils directory. This file contains the definitions for the different components of the detection loss (box/IoU loss, classification loss, and distribution focal loss).
  2. Trainer Logic:

    • The trainer.py file is where the training loop is managed. This includes data loading, forward passes, loss computation, and backpropagation. You'll want to modify the part where the loss is computed to include your conditional logic for handling images with and without bounding boxes.

Here's a general outline of what you might need to do:

  1. Modify the Loss Function:

    • In loss.py, add logic to handle cases where only class labels are available. This might involve adding a new function or modifying an existing one to compute a scaled-down IoU loss when bounding boxes are not provided.
  2. Update the Training Loop:

    • In trainer.py, update the training loop to check if an image has bounding boxes. Based on this check, call the appropriate loss function.

Here's a small snippet to give you an idea:

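# In loss.py
# full_loss, classification_loss and iou_loss are placeholders for your own
# implementations; they are not existing Ultralytics functions.
def compute_loss(predictions, targets, has_bboxes):
    """Return the loss for a batch that is either fully boxed or label-only."""
    if has_bboxes:
        # Compute full loss (classification + localization)
        return full_loss(predictions, targets)
    # Label-only: classification loss plus a scaled-down IoU loss
    return classification_loss(predictions, targets) + 0.3 * iou_loss(predictions, targets)

# In trainer.py
# model, dataloader and optimizer are assumed to be set up earlier.
for batch in dataloader:
    images, targets, has_bboxes = batch
    optimizer.zero_grad()  # clear gradients left over from the previous step
    predictions = model(images)
    loss = compute_loss(predictions, targets, has_bboxes)
    loss.backward()
    optimizer.step()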

This is a simplified example, but it should give you a starting point. Make sure to test thoroughly to ensure the new logic integrates well with the existing training process.
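One thing the outline above glosses over is that a mixed dataset will usually produce batches containing both kinds of image, so the check probably needs to be per image rather than per batch. Below is a minimal, framework-free sketch of that weighting, assuming the per-image classification and box losses have already been computed. The function name combine_mixed_batch_loss and the weak_box_weight parameter are hypothetical and not part of Ultralytics; in real training code the same arithmetic would be done on tensors.

```python
def combine_mixed_batch_loss(cls_losses, box_losses, has_bboxes, weak_box_weight=0.3):
    """Average per-image losses, down-weighting the box term for label-only images.

    cls_losses, box_losses: per-image loss values (floats), same length.
    has_bboxes: per-image flags, True when the image has box annotations.
    weak_box_weight: hypothetical scale for the box term on label-only images.
    """
    if not (len(cls_losses) == len(box_losses) == len(has_bboxes)):
        raise ValueError("all inputs must have one entry per image")
    if not cls_losses:
        return 0.0
    total = 0.0
    for cls_l, box_l, boxed in zip(cls_losses, box_losses, has_bboxes):
        # Fully annotated images keep the full box term; label-only images get a reduced one.
        box_weight = 1.0 if boxed else weak_box_weight
        total += cls_l + box_weight * box_l
    return total / len(cls_losses)


# Two images: the first has boxes, the second has only a class label.
loss = combine_mixed_batch_loss([1.0, 2.0], [0.5, 1.0], [True, False])
print(round(loss, 4))
```

Keeping the weight as a parameter makes it easy to experiment with values other than 0.3.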

If you encounter any specific issues or need further guidance, feel free to ask. We're here to help! 🚀

Happy coding!


github-actions commented on June 25, 2024

👋 Hello @sidharthanup, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.


glenn-jocher commented on June 25, 2024

Hello!

Great question! To train a model on a mixed dataset with both bounding boxes and class labels only (YOLO9000 style), you'll need to modify the training process to handle each type of data appropriately.

For YOLOv8, you can implement a custom training loop that:

  1. Checks if an image has associated bounding boxes:
    • If yes, compute the full loss (localization + classification).
    • If no, select the bounding box with the highest prediction probability, compute the classification loss for this box, and apply a scaled-down version of the localization loss (e.g., 0.3 times the IoU loss as you mentioned).

This approach requires modifying the loss computation part of your model's training script. You might need to dive into the model's codebase to implement these conditional checks and loss adjustments.
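The "no bounding boxes" branch can be sketched roughly as follows. This is a plain-Python illustration under stated assumptions, not Ultralytics code: class_probs is assumed to be a list of per-box class probability lists for one image, and the helper names select_best_box and label_only_cls_loss are hypothetical. Only the classification part is shown, because the scaled localization term needs a reference box, and how to choose one is a separate design decision.

```python
import math


def select_best_box(class_probs, label):
    """Return the index of the predicted box with the highest probability for `label`."""
    if not class_probs:
        raise ValueError("need at least one predicted box")
    return max(range(len(class_probs)), key=lambda i: class_probs[i][label])


def label_only_cls_loss(class_probs, label, eps=1e-7):
    """Negative log-likelihood of `label` on the best-matching predicted box.

    Returns (box_index, loss). Probabilities are clamped to avoid log(0).
    """
    best = select_best_box(class_probs, label)
    p = min(max(class_probs[best][label], eps), 1.0 - eps)
    return best, -math.log(p)


# Three predicted boxes, two classes; the image is labelled with class 0.
probs = [[0.1, 0.9], [0.7, 0.3], [0.4, 0.6]]
best, loss = label_only_cls_loss(probs, label=0)
print(best, round(loss, 4))
```

Only the selected box contributes to the loss here, so the other predictions for a label-only image receive no classification gradient from it.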

If you're comfortable editing the model's training code, this could be a feasible approach. Otherwise, consulting with a developer familiar with the YOLO architecture and its implementation might be necessary.

Let us know if you need further assistance or specific guidance on the code changes!


sidharthanup commented on June 25, 2024

Thank you @glenn-jocher! That makes sense. And yes I'm new to the codebase and I'll really appreciate it if you guys can help me get started on the code and a general sense of where I should be changing stuff.


sidharthanup commented on June 25, 2024

Thank you! Can you point out where the loss calculations (DFL, VFL, etc.) are? I see backward being called in the trainer code (https://github.com/ultralytics/ultralytics/blob/main/ultralytics/engine/trainer.py). Is that where I should be focusing?


sidharthanup commented on June 25, 2024

Thank you so much! I'll let you know how it goes!


glenn-jocher commented on June 25, 2024

@sidharthanup you're very welcome! 😊 We're excited to see how your implementation progresses.

Before you dive in, here are a couple of quick checks to ensure everything runs smoothly:

  1. Reproducible Example: If you encounter any issues, please provide a minimum reproducible code example. This helps us understand the context and reproduce the issue on our end. You can find more details on how to create one here: Minimum Reproducible Example.

  2. Latest Versions: Make sure you're using the latest versions of torch and ultralytics. Sometimes, bugs are fixed in newer releases, so it's always a good idea to update your packages and try again if you run into any problems.
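A quick way to see which versions are installed before reporting an issue is shown below. It is a small sketch that uses only the standard library, so it also runs when one of the packages is missing; the helper name installed_version is illustrative.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("torch", "ultralytics"):
    print(name, installed_version(name) or "not installed")
```

If either package turns out to be old, `pip install -U ultralytics` upgrades the ultralytics package.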

If you need further assistance or run into any issues, don't hesitate to reach out. We're here to help!

Happy coding, and best of luck with your project! 🚀
