
Comments (14)

glenn-jocher commented on July 4, 2024

@Alexsrp hello,

Thank you for the detailed follow-up and for providing the minimum reproducible example (MRE) along with the dataset sample. This information is incredibly helpful for us to diagnose the issue effectively.

Review and Next Steps:

  1. Data Loader Verification:
    Your code for visualizing the polygons on the images looks great, and it's clear that the labels are being read correctly. This confirms that the issue is likely not with the data loader.

  2. Software Versions:
    From your pip list, it appears you are using the correct versions of torch and ultralytics. This is good to see, as it rules out version incompatibility.

  3. Training Configuration:
    Your training script and configuration file also seem to be in order. However, let's ensure that the training process is correctly interpreting the labels.

Additional Checks:

  1. Label Format:
    Ensure that the labels are in the correct YOLO format. Each line in the label file should follow the format:

    <class_id> <x_center> <y_center> <width> <height>
    

    For segmentation, the format differs: each line lists the polygon vertices as <class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>. In both cases the coordinates must be normalized (values between 0 and 1).

  2. Training with Minimal Augmentation:
    Since you've already tried disabling augmentations, let's try running a few epochs with minimal settings to see if the issue persists:

    from ultralytics import YOLO
    
    model = YOLO("yolov8s-seg.pt")
    
    # Train the model with minimal settings
    results = model.train(data="data_custom.yaml", epochs=10, imgsz=640, augment=False)
  3. Inspecting Training Batches:
    During training, Ultralytics saves annotated batch mosaics (train_batch*.jpg) to the run directory, so you can inspect them to verify that the labels are being applied correctly:

    from pathlib import Path

    from ultralytics import YOLO
    import matplotlib.pyplot as plt

    model = YOLO("yolov8s-seg.pt")

    # Train briefly with augmentations disabled; annotated label mosaics
    # (train_batch*.jpg) are written to the run directory during training
    model.train(data="data_custom.yaml", epochs=10, imgsz=640, augment=False)

    # Display the saved mosaics (model.trainer.save_dir is set after train()
    # in recent ultralytics versions)
    for batch_path in sorted(Path(model.trainer.save_dir).glob("train_batch*.jpg")):
        plt.imshow(plt.imread(batch_path))
        plt.axis("off")
        plt.show()
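For the segmentation case, a small validator can automate the label-format check. This is a sketch; it assumes your label files live in a labels/ directory (a hypothetical path, so adjust it) and that each line is a class index followed by normalized (x, y) pairs:

```python
from pathlib import Path

def validate_seg_labels(label_dir):
    """Collect problems found in YOLO segmentation label files."""
    problems = []
    for lbl in sorted(Path(label_dir).glob("*.txt")):
        for n, line in enumerate(lbl.read_text().splitlines(), start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            coords = [float(v) for v in parts[1:]]
            # A polygon needs at least 3 (x, y) vertex pairs
            if len(coords) < 6 or len(coords) % 2 != 0:
                problems.append(f"{lbl.name}:{n} needs at least 3 (x, y) pairs")
            # All coordinates must be normalized to the image size
            if any(not 0.0 <= v <= 1.0 for v in coords):
                problems.append(f"{lbl.name}:{n} has a coordinate outside [0, 1]")
    return problems

for issue in validate_seg_labels("labels"):  # hypothetical directory name
    print(issue)
```

If this prints nothing, the files are at least structurally sound; it says nothing about whether the polygons actually match the objects.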

Conclusion:

If the issue persists after these steps, it might be beneficial to further inspect the dataset sample you provided. We will take a closer look at the dataset and try to replicate the issue on our end.

Thank you for your patience and cooperation. We're committed to resolving this issue and appreciate your detailed feedback. If you need further assistance or have additional questions, please let us know!

from ultralytics.

github-actions commented on July 4, 2024

πŸ‘‹ Hello @Alexsrp, thank you for your interest in Ultralytics YOLOv8 πŸš€! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a πŸ› Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.


glenn-jocher commented on July 4, 2024

Hello! Thanks for providing detailed information and visuals regarding your issue with label completeness during training.

From your description and the images provided, it seems like the issue might be related to the label conversion process or possibly the augmentation settings in your training configuration. Here are a few steps you can take to diagnose and potentially resolve the issue:

  1. Verify Label Files: Ensure that your label files (.txt) correctly map to the corresponding images and that the bounding box coordinates accurately reflect the objects' positions and dimensions in the images.

  2. Check Augmentation Settings: Review the augmentation settings in your training configuration. Augmentations like random cropping or scaling might lead to partial labels if parts of objects are cut off from the training images.

  3. Disable Augmentations Temporarily: Try training your model with minimal or no augmentations to see if the issue persists. This can help determine if the problem is due to aggressive augmentation settings.

  4. Visual Inspection: Use a script or tool to visually inspect the bounding boxes drawn from your label files over the images to ensure they are correct.

If after these checks the problem still exists, it might be helpful to share your training configuration file and a few examples of your label files and images for further diagnosis.

Hope this helps! Let us know how it goes or if you need further assistance.


Alexsrp commented on July 4, 2024

Hello, thank you for your recommendations, but it did not work:

  1. I already did this; all seems to be correct.

  2. and 3. I tried to deactivate every augmentation setting, and the problem persists, as you can see in the following train batch:

train_batch1

  4. I did this before posting the question, and that is the reason why I am confused. I attach here an example of a normal label and a reconstructed label:

uc3m_00162_gtFine_labelColor_ILU1 Original

uc3m_00162_ILU1 Reconstructed

So seeing this, I assume the problem is not there.

As you requested, I am going to attach a few labels and images, and the train and configuration files.
Images:
uc3m_00001
uc3m_00164_FH
uc3m_00385_ILU2
uc3m_00454_ILU2

Labels:
uc3m_00001.txt
uc3m_00164_FH.txt
uc3m_00385_ILU2.txt
uc3m_00454_ILU2.txt

Train script: I upload it as .txt, because .py files are not supported

Train.txt

Config: Same in .txt

data_custom.txt

Thank you again for your help.


glenn-jocher commented on July 4, 2024

Hello,

Thank you for providing the additional details and for your thorough testing. It's clear you've made a comprehensive effort to diagnose the issue.

Given the persistence of the problem even after disabling augmentations and verifying label correctness, it might be beneficial to look into the following:

  1. Model Configuration: Sometimes, subtle configuration nuances can affect how the model interprets the data. I'll review the attached training and configuration files to see if anything stands out.

  2. Data Loader: There might be an issue in how the data is being loaded or preprocessed before entering the model. This can sometimes lead to discrepancies in what you expect versus what the model sees.

  3. Software Version: Ensure that you are using the latest version of our software, as updates often fix bugs that could be related to your issue.

I'll analyze the files you've attached and get back to you with some insights or further questions. Your cooperation is greatly appreciated, and we're committed to resolving this issue together.


Alexsrp commented on July 4, 2024

Hello, I didn't have much time last week, but I have tried what you suggested and the problem is still there. I don't know what else I can do; at this point I am awaiting your response.

Thanks in advance!


glenn-jocher commented on July 4, 2024

Hello,

Thank you for your patience and for following up with your testing results. Let's work together to get to the bottom of this issue.

Next Steps:

  1. Minimum Reproducible Example:
    To help us investigate further, could you please provide a minimum reproducible code example? This will allow us to replicate the issue on our end. You can find guidelines on how to create one here. This step is crucial for us to understand and address the problem effectively.

  2. Software Versions:
    Ensure that you are using the latest versions of torch and ultralytics. You can upgrade your packages using the following commands:

    pip install --upgrade torch
    pip install --upgrade ultralytics

    Once upgraded, please try running your training again to see if the issue persists.

Additional Checks:

  • Data Loader: Sometimes, issues can arise from how the data is loaded or preprocessed. Ensure that your data loader is correctly configured and that the images and labels are being read as expected.

  • Configuration Review: I have reviewed the training and configuration files you provided, and they seem to be in order. However, subtle nuances can sometimes cause unexpected behavior. Double-checking these settings can sometimes reveal hidden issues.

Example Code:

Here's a small snippet to help you verify that your data loader is functioning correctly:

from ultralytics import YOLO
import matplotlib.pyplot as plt

# Load your model
model = YOLO('yolov8n.pt')

# Load a sample image and its corresponding label
image_path = 'path/to/your/image.jpg'
label_path = 'path/to/your/label.txt'

# Visualize the image and label
image = plt.imread(image_path)
plt.imshow(image)
plt.show()

# Load and visualize the label
with open(label_path, 'r') as file:
    labels = file.readlines()
    for label in labels:
        print(label)

This should help you ensure that the images and labels are correctly aligned and being read properly.

Thank you for your cooperation and understanding. We're here to help, and with your assistance, we can resolve this issue efficiently. Looking forward to your response!


Alexsrp commented on July 4, 2024

Hello, thanks again for the help provided; now I will reply to the things you asked for:

Check of data loader: I changed the code a little in order to display the polygons on the image, and all seems to be correct; it reads both label and image perfectly.

New code:

from ultralytics import YOLO
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import matplotlib.lines as mlines

# Load your model
model = YOLO('yolov8n.pt')

# Load a sample image and its corresponding label
image_path = "path"
label_path = "path"

# Load and visualize the image
image = plt.imread(image_path)
plt.imshow(image)
plt.axis('off')  # Hide axes for better visualization
plt.show()

# Load and visualize the labels
with open(label_path, 'r') as file:
    labels = file.readlines()

# Visualize the labels on the image
fig, ax = plt.subplots()
ax.imshow(image)

for label in labels:
    label_data = label.strip().split()
    class_id = label_data[0]
    points = list(map(float, label_data[1:]))
    
    # Separate x and y coordinates
    x_points = points[0::2]
    y_points = points[1::2]
    
    # Convert from relative coordinates to absolute coordinates
    img_height, img_width, _ = image.shape
    x_points = [x * img_width for x in x_points]
    y_points = [y * img_height for y in y_points]
    
    # Create a Polygon patch
    polygon = patches.Polygon(list(zip(x_points, y_points)), linewidth=1, edgecolor='r', facecolor='none')
    
    # Add the patch to the Axes
    ax.add_patch(polygon)
    
    # Optional: Add class_id text at the first vertex of the polygon
    ax.text(x_points[0], y_points[0], class_id, color='white', fontsize=8, verticalalignment='top')

plt.axis('off')
plt.show()

Image of the test:
[attached image]

As you can see, all edges are perfectly labeled, and this photo corresponds to one of the ones that appears in the train_batch I uploaded in one of my first posts; you can see it in the next image:

[attached image]

MRE:

Train file:

from ultralytics import YOLO

model = YOLO("yolov8s-seg.pt")  # load a pretrained model (recommended for training)


# Train the model
results = model.train(data="data_custom.yaml", epochs=100, imgsz=640)

Config file:

train: /home/alejandro/TFM/RN/Yolo/train
val: /home/alejandro/TFM/RN/Yolo/val
test: /home/alejandro/TFM/RN/Yolo/test

nc: 19

names: 
  0: road
  1: sidewalk
  2: building
  3: wall
  4: fence
  5: pole
  6: traffic light
  7: traffic sign
  8: vegetation
  9: terrain
  10: sky
  11: person
  12: rider
  13: car
  14: dog
  15: door
  16: static
  17: unlabeled
  18: bicycle

YOLO check:
[attached screenshot]

Pip list:
[attached screenshots]

Dataset sample: Since my issue has to do with my own dataset, I provide a small sample of it.
Dataset_sample.zip

About the error: As has been explained throughout this thread, my error is that YOLO is not reading my labels properly.

If you need anything else please tell me; I await your response.


Alexsrp commented on July 4, 2024

Hello, I just checked everything you told me in the previous comment, and the error persists. At this point I feel this error is going to drive me crazy, because I don't know what else to do.

Just in case, I want to double-check the label format again:

You said that it is:
<class_id> <x_center> <y_center> <width> <height>

But that is the format for bounding boxes. For segmentation, the format I am using, which I saw on your official page, is:
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>

I am almost 100% sure it is correct, but I wanted to confirm, just in case...

In the last part of your comment you say that you can take a closer look at the sample; that would be very helpful if it is possible. I would be very grateful.

I will await your response, thank you in advance!


glenn-jocher commented on July 4, 2024

@Alexsrp hello,

Thank you for your patience and for providing detailed feedback. I understand how frustrating this issue can be, and I'm here to help you resolve it.

Label Format Confirmation

You are correct that the format for segmentation labels is different from bounding boxes. For segmentation, the format should indeed be:

<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>

It sounds like you have this correctly set up, which is great.

Next Steps

Given that you've already verified the label format and tried the previous suggestions, let's proceed with a closer inspection of the dataset sample you provided. This will help us identify any potential issues that might not be immediately apparent.

Dataset Sample Review

I'll take a look at the dataset sample you provided to see if I can replicate the issue on my end. This will involve running a few training epochs and inspecting the training batches to ensure that the labels are being applied correctly.

Additional Checks

While I review the dataset, here are a few additional checks you can perform:

  1. Ensure Consistent Image and Label Naming: Verify that each image file has a corresponding label file with the same base name. For example, image1.jpg should have a corresponding image1.txt.

  2. Check for Empty or Corrupted Files: Ensure that none of the label files are empty or corrupted. Even a single corrupted file can cause issues during training.

  3. Inspect Data Directory Structure: Ensure that the directory structure specified in your configuration file (data_custom.yaml) matches the actual structure of your dataset directories.
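Checks 1 and 2 above can be automated with a short script. This is a sketch; the train/images and train/labels paths are hypothetical, so adjust them to your dataset layout:

```python
from pathlib import Path

def check_pairs(img_dir, lbl_dir):
    """Report images whose label file is missing or empty."""
    issues = []
    for img in sorted(Path(img_dir).glob("*")):
        if img.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            continue  # ignore non-image files
        lbl = Path(lbl_dir) / (img.stem + ".txt")
        if not lbl.exists():
            issues.append(f"missing label for {img.name}")
        elif lbl.stat().st_size == 0:
            issues.append(f"empty label for {img.name}")
    return issues

# Hypothetical paths; point these at your own dataset directories
for issue in check_pairs("train/images", "train/labels"):
    print(issue)
```

An empty report means every image has a non-empty label file with the same base name; it does not check the label contents themselves.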

Example Code for Batch Inspection

Here's a small snippet to display the annotated batch mosaics (train_batch*.jpg) that Ultralytics saves to the run directory during training, so you can check that the labels are being applied correctly:

from pathlib import Path

from ultralytics import YOLO
import matplotlib.pyplot as plt

model = YOLO("yolov8s-seg.pt")

# Train briefly with augmentations disabled; annotated label mosaics
# (train_batch*.jpg) are written to the run directory during training
model.train(data="data_custom.yaml", epochs=10, imgsz=640, augment=False)

# Display the saved mosaics (model.trainer.save_dir is set after train()
# in recent ultralytics versions)
for batch_path in sorted(Path(model.trainer.save_dir).glob("train_batch*.jpg")):
    plt.imshow(plt.imread(batch_path))
    plt.axis("off")
    plt.show()

Conclusion

I'll proceed with reviewing the dataset sample you provided and will get back to you with my findings. In the meantime, please perform the additional checks mentioned above to ensure everything is in order.

Thank you for your cooperation and understanding. We're committed to resolving this issue and appreciate your detailed feedback. If you need further assistance or have additional questions, please let us know!


Alexsrp commented on July 4, 2024

I checked everything, and everything is normal. So now I will wait for your conclusion of the Dataset Sample Review.

Please let me know when you come to a conclusion.

Thank you again.


glenn-jocher commented on July 4, 2024

Hello @Alexsrp,

Thank you for your patience and for thoroughly checking everything on your end. I appreciate your cooperation.

I'll proceed with a detailed review of the dataset sample you provided to identify any potential issues. This process will involve running a few training epochs and inspecting the training batches to ensure that the labels are being applied correctly.

In the meantime, please ensure that your environment is up-to-date with the latest versions of torch and ultralytics. You can upgrade your packages using the following commands:

pip install --upgrade torch
pip install --upgrade ultralytics

I'll get back to you as soon as I have completed the review. Thank you for your understanding and patience. We're committed to resolving this issue together.


Alexsrp commented on July 4, 2024

Hello, I finally found out where the "error" was.

Yesterday I wrote another script, where I plotted every point of one of the labels, but with one difference from the one I did before: this time I did it point by point, not the whole polygon, because the polygon looked perfectly good and I wanted to check the points, just in case. I obtained this:

Captura de pantalla 2024-06-16 115544

As you can see, everything seems to be OK; I thought the same. But later I realized that in the bottom part there are only 2 points, the ones in the corners, and I decided to check whether it was possible to change that.

So I revised the script I wrote to obtain the contours of the labels, and I found this:

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

I looked in the cv2 docs, and cv2.CHAIN_APPROX_SIMPLE simplifies the polygon so that it keeps the same shape but with fewer points, so I deactivated it:

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

With this I reran my new script, and the points look like this:

Captura de pantalla 2024-06-16 115441
Captura de pantalla 2024-06-16 115456

Now there are lots of them in the bottom.

So I reran a training, and my train batch looks like this:

train_batch0

So the conclusion is that somehow YOLO is not capable of interpreting the bottom part of the labels if it has just 2 or 3 points, but if you keep a lot of points it works perfectly.

I don't know if this can be considered a bug or if it was my error all along, but now it is fixed!

Thank you for your help and advice in trying to find it.


glenn-jocher commented on July 4, 2024

Hello @Alexsrp,

Thank you for the detailed follow-up and for sharing your findings! It's fantastic to hear that you were able to identify and resolve the issue with the label points. Your thorough investigation and the steps you took to debug the problem are commendable. πŸ‘

Summary of Findings:

It appears that the issue was related to the simplification of the polygon points using cv2.CHAIN_APPROX_SIMPLE in your script. By switching to cv2.CHAIN_APPROX_NONE, you retained all the points, which resolved the problem with the bottom part of the labels not being interpreted correctly by YOLO.

Conclusion:

While this might not be a bug in the YOLO model itself, your discovery highlights an important consideration when preparing segmentation labels. Ensuring that the contours have sufficient points can significantly impact the model's ability to interpret the labels accurately.

Next Steps:

  • Documentation Update: We'll consider adding a note in the documentation to highlight the importance of contour point density for segmentation tasks.
  • Community Benefit: Your experience and solution will undoubtedly benefit others facing similar issues. Thank you for sharing this with the community!

If you have any further questions or need additional assistance, feel free to reach out. We're here to help!
