mrdbourke / airbnb-amenity-detection

Repo for 42 days project to replicate/improve Airbnb's amenity (object) detection pipeline.

Home Page: https://dbourke.link/airbnb42days

Languages: Jupyter Notebook 99.91%, Python 0.09%, Dockerfile 0.01%
Topics: deep-learning, machine-learning, computer-vision, detectron2, airbnb-amenity-detection, object-detection, machine-learning-project

airbnb-amenity-detection's Introduction

I write and make videos about machine learning, health and life.

My writing is like the voice in your head found a typewriter.

My videos are like a spartan warrior learned to code.

I'm currently working on Nutrify, an app where you can take a photo of food and learn about it.

I teach machine learning and deep learning on Zero to Mastery.

I've authored three books:

  • Charlie Walks - a sci-fi/romance/philosophical novel about a machine learning engineer who wants to be a writer.
  • learntensorflow.io - a 50,000+ word online (and free) book that teaches you TensorFlow and Deep Learning in a beginner-friendly, code-first way.
  • learnpytorch.io - the internet's most beginner-friendly way to learn PyTorch for deep learning.

airbnb-amenity-detection's Issues

Labels not visible.

Following the Google Colab notebook, I went through all the data gathering and pre-processing steps and ended up with the JSON file.

My questions are:

  1. Why am I not getting labels in my output?
  2. How can I remove the dependency on the number 'n' of objects that are detected?

To register the datasets, I ran the following commands:

from detectron2.data import DatasetCatalog, MetadataCatalog

# Loop through different datasets
for dataset in ["custom_train", "custom_valid"]:
  
    # Create dataset name strings
    dataset_name = dataset
    print(f"Registering {dataset_name}")

    # Register the datasets with Detectron2's DatasetCatalog, which has space for a lambda function to preprocess it
    DatasetCatalog.register(dataset_name, lambda dataset_name=dataset_name: load_json_labels(dataset_name))

    # Create the metadata for our dataset (the main thing being the classnames we're using)
    MetadataCatalog.get(dataset_name).set(thing_classes=subset_classes)

# Setup metadata variable
cmaker_fireplace_metadata = MetadataCatalog.get("custom_train")

Output:

Registering custom_train
Registering custom_valid

I also checked the registration with:

 DatasetCatalog._REGISTERED

 'custom_train': <function __main__.<lambda>>,
 'custom_valid': <function __main__.<lambda>>,
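
A note on the `lambda dataset_name=dataset_name:` pattern in the registration loop: the default argument is what pins each lambda to its own dataset name. The sketch below shows the difference in plain Python, with no Detectron2 involved. `load_labels` is a hypothetical stand-in for `load_json_labels`, and the `late` and `early` dictionaries exist only for this illustration.

```python
# Hypothetical stand-in for load_json_labels (assumption, not the notebook's function).
def load_labels(name):
    return f"labels for {name}"

names = ["custom_train", "custom_valid"]

# Late binding: each lambda looks up `name` only when it is called,
# which happens after the loop has finished, so every lambda sees the last value.
late = {}
for name in names:
    late[name] = lambda: load_labels(name)

# Early binding: the default argument captures the value of `name`
# at the moment the lambda is created.
early = {}
for name in names:
    early[name] = lambda name=name: load_labels(name)

print(late["custom_train"]())   # labels for custom_valid
print(early["custom_train"]())  # labels for custom_train
```

Without the default argument, both registered loaders would end up reading the validation split.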

For training I used this:

import os

from detectron2 import model_zoo
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg

# Setup a model config (recipe for training a Detectron2 model)
cfg=get_cfg()

# Add some basic instructions for the Detectron2 model from the model_zoo: https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/retinanet_R_50_FPN_3x.yaml"))

# Add some pretrained model weights from an object detection model
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/retinanet_R_50_FPN_3x.yaml")

# Setup datasets to train/validate on (this will only work if the datasets are registered with DatasetCatalog)
cfg.DATASETS.TRAIN = ("custom_train",)
cfg.DATASETS.TEST = ("custom_valid",)

# How many dataloaders to use? This is the number of CPUs to load the data into Detectron2, Colab has 2, so we'll use 2
cfg.DATALOADER.NUM_WORKERS = 2

# How many images per batch? The original models were trained on 8 GPUs with 16 images per batch, since we have 1 GPU: 16/8 = 2.
cfg.SOLVER.IMS_PER_BATCH = 2

# We do the same calculation with the learning rate as the GPUs, the original model used 0.01, so we'll divide by 8: 0.01/8 = 0.00125.
cfg.SOLVER.BASE_LR = 0.00125

# How many iterations are we going for? (300 is okay for our small model, increase for larger datasets)
cfg.SOLVER.MAX_ITER = 300

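# ROI = region of interest: how many candidate regions per image are sampled when training the ROI heads.
# Note: this only applies to two-stage models (e.g. Faster R-CNN); RetinaNet has no ROI heads, so it has no effect here.
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128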

# We're only dealing with 2 classes (coffeemaker and fireplace) 
cfg.MODEL.RETINANET.NUM_CLASSES = 2

# Setup output directory, all the model artefacts will get stored here in a folder called "output"
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)

# Setup the default Detectron2 trainer, see: https://detectron2.readthedocs.io/modules/engine.html#detectron2.engine.defaults.DefaultTrainer
trainer = DefaultTrainer(cfg)

# Resume training from model checkpoint or not, we're going to just load the model in the config: https://detectron2.readthedocs.io/modules/engine.html#detectron2.engine.defaults.DefaultTrainer.resume_or_load
trainer.resume_or_load(resume=False) 

# Start training
trainer.train()
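
The batch size and learning rate comments in the config above follow a linear scaling rule: shrink the learning rate by the same factor as the batch. A minimal sketch of that arithmetic, using the reference values quoted in those comments (16 images per batch across 8 GPUs, learning rate 0.01); the helper name `scale_learning_rate` is made up for this example.

```python
def scale_learning_rate(reference_lr, reference_batch, new_batch):
    """Scale the learning rate in proportion to the batch size (linear scaling rule)."""
    return reference_lr * new_batch / reference_batch

# Reference setup quoted in the config comments: 8 GPUs, 16 images per batch, LR 0.01.
reference_gpus = 8
reference_batch = 16
reference_lr = 0.01

# One GPU gets the per-GPU share of the reference batch: 16 / 8 = 2 images.
ims_per_batch = reference_batch // reference_gpus

# The learning rate shrinks by the same factor as the batch: 0.01 * 2 / 16.
base_lr = scale_learning_rate(reference_lr, reference_batch, ims_per_batch)

print(ims_per_batch, base_lr)
```

The two results correspond to the `cfg.SOLVER.IMS_PER_BATCH` and `cfg.SOLVER.BASE_LR` values set above.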

To save the config file and copy the zipped output folder to Google Drive:

# Save config to file
with open("output/config.yaml", "w") as f:
    f.write(cfg.dump())

import shutil
shutil.make_archive('custom_model', 'zip', '/content/output')
%cp "/content/custom_model.zip" "/content/drive/My Drive/Facebook AI/" 

I then opened a new notebook and wrote:

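import cv2
from google.colab.patches import cv2_imshow
from detectron2.config import get_cfg
from detectron2.data import MetadataCatalog
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer

# Load the saved config and trained weights
model_config = "/content/output/config.yaml"
model_weights = "/content/output/model_final.pth"
cfg = get_cfg()
cfg.merge_from_file(model_config)
cfg.MODEL.WEIGHTS = model_weights
cfg.MODEL.SCORE_THRESH_TEST = 0.7
predictor = DefaultPredictor(cfg)

# Read the image (OpenCV loads BGR, the Visualizer expects RGB)
img = cv2.imread("/content/Capture2.JPG")
visualizer = Visualizer(img_rgb=img[:, :, ::-1],
                        metadata=MetadataCatalog.get(cfg.DATASETS.TEST[0]),
                        scale=0.7)
outputs = predictor(img) # Outputs: https://detectron2.readthedocs.io/modules/structures.html#detectron2.structures.Instances
visualizer = visualizer.draw_instance_predictions(outputs["instances"][:1].to("cpu"))
cv2_imshow(visualizer.get_image()[:, :, ::-1])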

The input image was:
[image: Capture2]

The output I got was:
[image: output]

Thank you in advance.
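
One possible explanation, offered as a sketch rather than a confirmed diagnosis: the second notebook never sets `thing_classes` on the `custom_valid` metadata, so the visualizer may have no class names to draw and falls back to scores alone. The plain-Python function below is a simplified illustration of that behaviour, not Detectron2's actual code. `make_labels` and the class names `"Coffeemaker"` and `"Fireplace"` are assumptions based on the training comment about two classes.

```python
def make_labels(class_ids, scores, class_names=None):
    """Simplified illustration of building box captions from predictions.

    With class names available, each caption is "<name> <score>%".
    Without them, only the score can be shown.
    """
    labels = []
    for class_id, score in zip(class_ids, scores):
        percent = f"{score * 100:.0f}%"
        if class_names:
            labels.append(f"{class_names[class_id]} {percent}")
        else:
            labels.append(percent)
    return labels

# Made-up predictions for illustration only.
class_ids = [1, 0]
scores = [0.93, 0.81]

# Metadata without thing_classes: captions carry no class name.
print(make_labels(class_ids, scores))  # ['93%', '81%']

# Metadata with thing_classes (assumed names): captions include the class.
print(make_labels(class_ids, scores, ["Coffeemaker", "Fireplace"]))  # ['Fireplace 93%', 'Coffeemaker 81%']

# Slicing with [:1] keeps only the first prediction, however many were detected.
print(make_labels(class_ids[:1], scores[:1], ["Coffeemaker", "Fireplace"]))  # ['Fireplace 93%']
```

If this is the cause, repeating the `MetadataCatalog.get("custom_valid").set(thing_classes=subset_classes)` call from the registration step in the new notebook should restore the names. On the second question, the `[:1]` slice on `outputs["instances"]` limits drawing to a single detection, so dropping it should draw every detection the predictor returns. For RetinaNet, the test-time score threshold is `cfg.MODEL.RETINANET.SCORE_THRESH_TEST`.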
