Comments (5)
@Hogushake hello! It looks like you're on the right track, but you need to adjust your approach to display masks for all detected objects in the Person class. The key change is to iterate over all predictions (not just the first one) and combine their masks. Here's a modified snippet of your code:
import cv2
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")
cap = cv2.VideoCapture("people.mp4")

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model.predict(frame, conf=0.3, classes=0)
    # Initialize a mask to accumulate all person masks
    combined_mask = np.zeros_like(frame[:, :, 0])
    # Loop through all detected objects and combine their masks
    for mask in results[0].masks.data:
        combined_mask += (mask.numpy() * 255).astype("uint8")
    # Ensure combined mask is binary
    combined_mask = np.clip(combined_mask, 0, 255)
    cv2.imshow("Combined Masks", combined_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
This modification initializes combined_mask to accumulate the masks of all detected persons. Each mask from the predictions is added to this accumulation. Finally, make sure to apply np.clip to ensure the final mask remains in a valid range.
This should display masks for all detected Person objects with confidence above 0.3, as intended. Let me know if this helps or if you have further questions!
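One thing worth checking in the snippet above: combined_mask is a uint8 array, and NumPy uint8 addition wraps around instead of saturating, so two overlapping 0/255 masks sum to 254 before np.clip ever sees the value. The sketch below uses two tiny hand-made masks (a and b are illustrative stand-ins, not YOLO output) and a hypothetical helper, combine_masks, that takes the element-wise maximum instead of a sum.

```python
import numpy as np

def combine_masks(masks):
    """Union of same-shaped 0/255 uint8 masks via element-wise maximum."""
    combined = np.zeros_like(masks[0], dtype=np.uint8)
    for mask in masks:
        # The maximum keeps overlapping pixels at 255 instead of wrapping around.
        combined = np.maximum(combined, mask)
    return combined

# Two small overlapping masks standing in for two detected persons.
a = np.array([[0, 255], [255, 255]], dtype=np.uint8)
b = np.array([[0, 0], [255, 255]], dtype=np.uint8)

summed = a.copy()
summed += b  # uint8 arithmetic wraps: 255 + 255 becomes 254 where the masks overlap

union = combine_masks([a, b])  # overlapping pixels stay at 255
```

With a display window the difference between 254 and 255 is barely visible, but it matters as soon as the mask is thresholded or compared against 255.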
from ultralytics.
@glenn-jocher
Thank you so much for your quick reply!
When I run the code you sent me, I get the following error:
combined_mask += (mask.numpy() * 255).astype("uint8")
ValueError: operands could not be broadcast together with shapes (360,640) (384,640) (360,640)
I think this error occurs because the size of YOLO's output differs from the size of the input frame.
So I added one line of code to adjust the size, and that solved it!
for mask in results[0].masks.data:
    resized_mask = cv2.resize(mask.numpy(), (frame.shape[1], frame.shape[0]), interpolation=cv2.INTER_NEAREST)
    combined_mask += (resized_mask * 255).astype("uint8")
The mismatch between YOLO's output size and the input size was mentioned in other questions, so I was aware of it in advance.
So, is there an adjustable internal parameter within YOLO that makes the result the same size as the input, without using an external function like cv2.resize?
Hey @Hogushake,
Great observation on the size discrepancy! The model does output a mask that matches its own input size, which can differ from your original video frame size if the frame was resized during inference.
Adding cv2.resize, as you've done, is currently the recommended practice to match the output mask dimensions with those of the input frame. YOLOv8 does not include a built-in parameter to auto-adjust the output size to match the original unaltered input size directly.
Your modification using cv2.resize seems apt for ensuring dimension consistency across different processing stages. If any further adjustments are needed or you encounter more issues, feel free to reach out again!
Happy coding!
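To make the (360,640) versus (384,640) mismatch from the error message concrete, here is a minimal sketch of the arithmetic, assuming the frame is scaled to fit a 640-pixel square and then padded only up to the next multiple of a stride of 32. The function name letterbox_shape and the defaults imgsz=640 and stride=32 are illustrative assumptions, a simplified model of letterbox preprocessing and not the library's own code.

```python
def letterbox_shape(height, width, imgsz=640, stride=32):
    """Return (padded_height, padded_width, pad_top, pad_left) for a frame.

    Simplified model: scale to fit inside imgsz x imgsz, then pad each side
    only as far as needed to reach a multiple of `stride`.
    """
    scale = min(imgsz / height, imgsz / width)
    new_h, new_w = round(height * scale), round(width * scale)
    pad_h = (imgsz - new_h) % stride  # total vertical padding
    pad_w = (imgsz - new_w) % stride  # total horizontal padding
    return new_h + pad_h, new_w + pad_w, pad_h // 2, pad_w // 2

# A 360x640 frame, as in the error message above.
shape = letterbox_shape(360, 640)
```

Under these assumptions a 360x640 frame becomes 384x640 with 12 rows of padding at the top and 12 at the bottom, which matches the shapes in the ValueError.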
Thank you for your reply.
As in the code above, I resize mask.numpy(), but the mask still does not line up with the frame, as shown in the picture.
The bottom of the binary mask is not recognized.
Is there a problem with the code?
Hey there!
It looks like the issue might be due to how the resizing is handled. For binary masks, nearest-neighbor interpolation preserves the binary values. Here's a small tweak to your approach that converts the mask to uint8 before resizing, so the mask stays binary:
for mask in results[0].masks.data:
    resized_mask = cv2.resize((mask.numpy() * 255).astype('uint8'), (frame.shape[1], frame.shape[0]), interpolation=cv2.INTER_NEAREST)
    combined_mask += resized_mask
This should ensure that the resizing does not introduce any unintended changes in the mask values. If you're still facing challenges, print the frame and mask shapes before and after resizing to help debug the issue.
Happy coding!
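A possible cause of the misalignment, offered as an assumption since the picture is not available: if the 384x640 mask contains letterbox padding rows at the top and bottom, resizing the whole mask to 360x640 squashes the padding into the image instead of discarding it. The sketch below crops symmetric padding first and only then resizes. unletterbox_mask is a hypothetical helper, and the NumPy index-based nearest-neighbor resize stands in for cv2.resize with INTER_NEAREST so the example has no OpenCV dependency.

```python
import numpy as np

def unletterbox_mask(mask, frame_h, frame_w):
    """Crop symmetric letterbox padding from a 2D mask, then resize to the frame.

    Assumes the padding was split evenly between the two sides of each axis.
    """
    mask_h, mask_w = mask.shape
    scale = min(mask_h / frame_h, mask_w / frame_w)
    pad_y = (mask_h - frame_h * scale) / 2  # padding rows on each side
    pad_x = (mask_w - frame_w * scale) / 2  # padding columns on each side
    top, left = int(pad_y), int(pad_x)
    bottom, right = int(mask_h - pad_y), int(mask_w - pad_x)
    cropped = mask[top:bottom, left:right]

    # Nearest-neighbor resize by index lookup.
    crop_h, crop_w = cropped.shape
    rows = (np.arange(frame_h) * crop_h) // frame_h
    cols = (np.arange(frame_w) * crop_w) // frame_w
    return cropped[np.ix_(rows, cols)]

# Synthetic 384x640 mask: content rows are 1, the 12 padding rows on each side are 0.
demo = np.zeros((384, 640), dtype=np.uint8)
demo[12:372, :] = 1
restored = unletterbox_mask(demo, 360, 640)
```

In the thread's loop this would replace the direct cv2.resize call on each mask. It may also be worth checking the retina_masks predict argument in the Ultralytics documentation, which is described as returning masks at the original image size.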