
Comments (11)

glenn-jocher avatar glenn-jocher commented on September 26, 2024 1

Hi @ssatz,

That's fantastic news! 🎉 Congratulations on achieving such high accuracy with your custom dataset. It's great to hear that the system is performing well overall.

For those edge cases, consider further fine-tuning and possibly augmenting your dataset to cover more scenarios. If you encounter any more issues or need further assistance, feel free to open a new ticket.

Thank you for the update, and best of luck with your continued work! 🚀

from ultralytics.

github-actions avatar github-actions commented on September 26, 2024

👋 Hello @ssatz, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.


glenn-jocher avatar glenn-jocher commented on September 26, 2024

Hi Sathish,

Great to hear about your success with table detection! For the issue with short rows in table structure recognition, it might be related to how the rows are labeled or the model's sensitivity to smaller objects. Here are a couple of suggestions:

  1. Labeling Consistency: Ensure that the labeling is consistent across the dataset, especially for shorter rows. Sometimes, inconsistencies can lead to poor model performance for specific cases.
  2. Increase Dataset Size: A dataset size of 40 might be quite small for the model to generalize well, especially for complex structures like tables. If possible, try increasing the dataset size.
  3. Model Configuration: Use a model configuration that is better suited to detecting smaller or thinner objects, for example a larger input size (imgsz). Note that YOLOv8 is anchor-free, so anchor sizes only apply to anchor-based models such as YOLOv5.

For an anchor-based model, adjusting anchor sizes in the model configuration file looks like this:

anchors:
  - [10,13, 16,30, 33,23]  # smaller anchors
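
To check whether short rows are labeled consistently, it can help to list the boxes whose height is only a few pixels at the training resolution. The following is a minimal sketch, assuming YOLO-format label lines (`class x_center y_center width height`, all normalized); the function name `find_thin_boxes` and the 8-pixel threshold are illustrative assumptions, not part of Ultralytics.

```python
def find_thin_boxes(label_text, img_height, min_pixels=8):
    """Return (line_number, class_id, height_px) for boxes shorter than min_pixels.

    min_pixels=8 is an arbitrary illustrative threshold, not a recommended value.
    """
    thin = []
    for line_number, line in enumerate(label_text.splitlines(), start=1):
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        class_id = int(parts[0])
        height_px = float(parts[4]) * img_height  # normalized height -> pixels
        if height_px < min_pixels:
            thin.append((line_number, class_id, round(height_px, 1)))
    return thin


# Two hypothetical row labels: one of normal height, one very short.
labels = "0 0.5 0.10 0.9 0.050\n0 0.5 0.16 0.9 0.008\n"
print(find_thin_boxes(labels, img_height=640))
```

Running this over every label file gives a rough picture of how many rows fall below the size the model can reasonably resolve at the chosen image size.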

Hope this helps! Keep us posted on your progress.


ssatz avatar ssatz commented on September 26, 2024

@glenn-jocher thanks for the suggestion. Anchor boxes are calculated automatically by YOLO based on the dataset, right? To increase the dataset size I will use the pub1m dataset, which I have to convert from Pascal VOC to YOLO format. I will post my progress here.


glenn-jocher avatar glenn-jocher commented on September 26, 2024

Hi there!

Not in YOLOv8: its detection head is anchor-free, so there are no anchor boxes to calculate. Automatic anchor calculation (AutoAnchor) belongs to anchor-based models such as YOLOv5. Using the pub1m dataset sounds like a great plan to enhance your model's performance. Converting from Pascal VOC to YOLO format is straightforward, and tools like Roboflow can help with the conversion.
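
For reference, the Pascal VOC to YOLO conversion comes down to turning corner coordinates in pixels into a normalized center, width, and height. This is a minimal sketch of that arithmetic only; the helper name `voc_to_yolo` and the sample numbers are made up for illustration, and reading the XML files is left out.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a Pascal VOC box (pixel corners) to YOLO format (normalized center/size)."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


# Hypothetical box in a 1000x800 image, written as one YOLO label line for class 0.
box = voc_to_yolo(100, 200, 300, 400, img_w=1000, img_h=800)
print("0 " + " ".join(f"{v:.6f}" for v in box))
```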

Looking forward to seeing your updates on this! Keep us posted. 😊


ssatz avatar ssatz commented on September 26, 2024

@glenn-jocher I have converted all the annotations to YOLOv8 format, and I don't see any difference between the YOLOv5 and YOLOv8 annotation formats. I'm wondering how to train on the Pub1m and Fintab datasets. Should I combine them, or first train on Pub1m and then use that model to train on Fintab? What's the best approach?


glenn-jocher avatar glenn-jocher commented on September 26, 2024

Hi there!

Great job on converting your annotations! Regarding training on the Pub1m and Fintab datasets, combining them into a single dataset for training can be beneficial if the table structures in both datasets are similar, as it would provide a more diverse set of examples for the model to learn from. This approach generally helps improve the model's robustness and generalization capabilities.

If the datasets are quite different in terms of table structure or content, you might consider training on Pub1m first to establish a solid baseline model, and then fine-tune on Fintab to adapt to its specific characteristics.

Both strategies have their merits, so the best approach might depend on the specifics of the datasets and your project goals. Keep experimenting and let us know how it goes! 😊
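
For the combined option, one way to avoid copying files is a single dataset YAML whose `train` and `val` entries list the image folders of both datasets. The sketch below only builds the YAML text; the directory layout, the class names, and the helper `combined_data_yaml` are assumptions for illustration and need to be adapted to the actual folders and labels.

```python
def combined_data_yaml(root, train_dirs, val_dirs, names):
    """Build dataset YAML text that lists several image folders per split."""
    lines = [f"path: {root}", "train:"]
    lines += [f"  - {d}" for d in train_dirs]
    lines.append("val:")
    lines += [f"  - {d}" for d in val_dirs]
    lines.append("names:")
    lines += [f"  {i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Hypothetical folder layout and class names; replace with the real ones.
yaml_text = combined_data_yaml(
    root="datasets/tables",
    train_dirs=["pub1m/images/train", "fintab/images/train"],
    val_dirs=["pub1m/images/val", "fintab/images/val"],
    names=["row", "column"],
)
print(yaml_text)
```

This only works if both datasets share the same class indices. For the two-stage option, the same helper can be called once per dataset to produce two separate files.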


ssatz avatar ssatz commented on September 26, 2024

@glenn-jocher Thanks 🙏! We did it! After nearly 30 hours, we've successfully trained the model.

Metrics: [two images, not preserved]

Here is the YOLO-format data: https://huggingface.co/datasets/Codeplug/pub-fintab-yolov

Next, we will experiment with our own dataset.
For training, we used Runpod.


glenn-jocher avatar glenn-jocher commented on September 26, 2024

Hi @ssatz,

That's fantastic news! 🎉 Congratulations on successfully training your model! Your metrics look impressive, and it's great to see the progress you've made.

For your next steps, experimenting with your own dataset sounds like a solid plan. Here are a few tips to help you get the most out of your training:

  1. Fine-Tuning: If you haven't already, consider fine-tuning your model on your specific dataset. This can help the model adapt better to the nuances of your data.

  2. Data Augmentation: Utilize data augmentation techniques to increase the variability in your training data. This can help improve the model's robustness and generalization.

  3. Hyperparameter Tuning: Experiment with different hyperparameters such as learning rate, batch size, and epochs to find the optimal settings for your dataset.

  4. Validation: Ensure you have a robust validation set to monitor the model's performance and avoid overfitting.
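
As a small illustration of the validation tip, a fixed-seed split keeps the validation set identical across runs so that metrics stay comparable. This is a minimal sketch; the file names, the 20% fraction, and the helper `split_train_val` are illustrative assumptions.

```python
import random


def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle deterministically and return (train, val) lists."""
    shuffled = sorted(items)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical file names for a 40-image dataset.
images = [f"img_{i:03d}.jpg" for i in range(40)]
train, val = split_train_val(images)
print(len(train), len(val))
```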

Here's a quick example of how you might set up your training script in Python:

from ultralytics import YOLO

# Load pretrained detection weights (the "-seg" checkpoints are for segmentation)
model = YOLO("yolov8n.pt")

# Train the model on your dataset
results = model.train(data="path/to/your/dataset.yaml", epochs=100, imgsz=640, batch=16, lr0=0.01)

# Fine-tune on a specific dataset if needed
fine_tune_results = model.train(data="path/to/your/fine-tune-dataset.yaml", epochs=50, imgsz=640, batch=16, lr0=0.001)

And for CLI:

# Train on your dataset
yolo detect train data=path/to/your/dataset.yaml model=yolov8n.pt epochs=100 imgsz=640 batch=16 lr0=0.01

# Fine-tune on a specific dataset
yolo detect train data=path/to/your/fine-tune-dataset.yaml model=path/to/your/trained-model.pt epochs=50 imgsz=640 batch=16 lr0=0.001

Keep us posted on your progress, and feel free to reach out if you have any more questions or need further assistance. Happy training! 🚀


github-actions avatar github-actions commented on September 26, 2024

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐


ssatz avatar ssatz commented on September 26, 2024

Hi @glenn-jocher ,
I wanted to update you on our progress. We've successfully trained our custom dataset and are consistently achieving accuracy levels above 90%. While there are still some edge cases that require fine-tuning, overall the system is performing well.
Thank you for your guidance throughout this process. Your input has been invaluable in helping us reach this point.
Given our current status, I'll be closing this ticket. If any further issues arise, I'll be sure to open a new one.
