
litter-detection

Abstract

This study combines photo metadata and computer vision to quantify where uncollected litter is present. Images from the Trash Annotations in Context (TACO) dataset were used to train a model to detect 10 categories of garbage. Although the model worked well on smartphone photos, it struggled to process images from vehicle-mounted cameras. Increasing the variety of perspectives and backgrounds in the training dataset should help it generalize to unfamiliar situations. The resulting detections are plotted onto a map which, as accuracy improves, could be used to measure waste management strategies and quantify trends.


The Problem

The world increasingly relies on plastics for food packaging, electronics, single-use containers, and more. In the 1960s, plastics made up less than 1% of municipal solid waste; by 2005 that figure had risen to 10%. Post-consumer waste alone accounts for about 5% of total greenhouse gas emissions and costs over $200 billion each year. The World Bank estimates that global waste production will continue to double roughly every decade and won't plateau until the end of the century.

[Image: plastic pollution]

Our Research

Based on recent advances in computer vision, an automated trash detection system could help answer two questions: Where is waste located? How much is there? This is done by leveraging images from smartphones and dashcams to analyze specific geographic areas. With only a few dedicated vehicles, daily or weekly snapshots could show how litter levels change over time. This information is used to produce a human-readable map that policy makers and non-profit organizations could use to evaluate and improve litter reduction programs.

[Image: litter map]

Algorithm: Mask R-CNN

This research uses Mask Region-based Convolutional Neural Network (Mask R-CNN) [19], a computer vision algorithm open-sourced by Facebook in 2018. It performs both object detection (assigning each object a category) and instance segmentation (outlining exactly where the object is). Although training is a long process, detection takes only a few seconds per image and can be run on a small mobile device.
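
As an illustration, here is a minimal inference sketch using torchvision's off-the-shelf Mask R-CNN. This is an assumption for demonstration only: the project itself may use a different implementation and custom TACO-trained weights, and the image path is hypothetical.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Load a Mask R-CNN pretrained on COCO (a stand-in for TACO-trained weights).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

img = to_tensor(Image.open("street_photo.jpg").convert("RGB"))  # hypothetical file
with torch.no_grad():
    pred = model([img])[0]  # dict with "boxes", "labels", "scores", "masks"

for label, score, box in zip(pred["labels"], pred["scores"], pred["boxes"]):
    if score > 0.7:  # keep only confident detections
        print(int(label), float(score), [round(v, 1) for v in box.tolist()])
```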

[Image: detections]

Datasets

TACO

The Trash Annotations in Context (TACO) dataset contains publicly uploaded images of litter that have been outlined and labeled [Fig 1]. It contains 1,500 images with 4,784 individually outlined objects and was used in combination with data augmentation to achieve an average accuracy score of 86%.
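
An augmentation pipeline of that kind might look like the sketch below. Albumentations is an assumption here, since the text does not name the library used; the specific transforms are illustrative.

```python
import albumentations as A

# Hypothetical augmentation pipeline: random flips, lighting shifts, and
# rotations multiply the effective variety of the 1,500 TACO images.
augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),
        A.RandomBrightnessContrast(p=0.3),
        A.Rotate(limit=25, p=0.5),
    ],
    # Keep the COCO-style bounding boxes aligned with the transformed image.
    bbox_params=A.BboxParams(format="coco", label_fields=["labels"]),
)
# usage: out = augment(image=img, bboxes=boxes, labels=labels)
```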

[Image: detections]

Mapillary

Mapillary converts community-submitted dashcam footage into a collection of images covering the road network, and scrubs any personally identifiable information such as faces and license plates. If detection accuracy is high enough, the data in these images could be quickly processed to locate which streets have the most trash and are most in need of cleanup.

[Image: Mapillary]

CSUS Clean Up

The second dataset consists of hand-collected images of trash that was cleaned up at California State University, Sacramento. These were taken with an iPhone 11 before the trash was disposed of. This dataset is representative of what a volunteer organization could collect to estimate categorized totals and gain other insights into its work.


Location Estimation

Each detection was placed on a map, making these data accessible to waste management organizations and policy makers in an easily understandable format. The coordinates and camera orientation were extracted from each image's EXIF metadata and used to estimate the camera's viewing angle. Next, the outline generated by Mask R-CNN was used to place the detection horizontally within the camera's perspective cone.
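
The metadata step can be sketched as follows, assuming Pillow and the standard EXIF GPS tags. The helper functions and the 65° field of view are illustrative assumptions, not the project's actual code.

```python
from PIL import Image
from PIL.ExifTags import GPSTAGS

def dms_to_deg(dms, ref):
    """Convert EXIF (degrees, minutes, seconds) to signed decimal degrees."""
    deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
    return -deg if ref in ("S", "W") else deg

def camera_pose(path):
    """Return (lat, lon, compass heading) from a geotagged JPEG."""
    exif = Image.open(path)._getexif()
    gps = {GPSTAGS[k]: v for k, v in exif[34853].items()}  # 34853 = GPSInfo
    lat = dms_to_deg(gps["GPSLatitude"], gps["GPSLatitudeRef"])
    lon = dms_to_deg(gps["GPSLongitude"], gps["GPSLongitudeRef"])
    return lat, lon, float(gps["GPSImgDirection"])  # heading in degrees

def detection_bearing(heading, cx_norm, hfov=65.0):
    """Offset the camera heading by the detection's horizontal position.
    cx_norm is the mask centroid's x in [0, 1]; hfov is an assumed
    horizontal field of view in degrees."""
    return (heading + (cx_norm - 0.5) * hfov) % 360
```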

[Image: Triangulation]

Results


When evaluating 150 smartphone images, the model achieved a precision of 80% and a recall of 73% [Table 2], which is close to results reported by researchers using different algorithms on the same dataset [12]. However, the same model scored 1% and 3% respectively [Table 3] when run on 200 Mapillary images from the neighborhood surrounding California State University, Sacramento.
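
For reference, the two metrics reduce to simple ratios over detection counts. The numbers below are illustrative, not the study's actual totals.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported detections that were actually trash."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of real trash items that the model found."""
    return tp / (tp + fn)

# e.g. 80 true positives, 20 false positives, 30 missed items
print(precision(80, 20), recall(80, 30))  # -> 0.8 0.727...
```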



Combining the detections from the pretrained model with the images' metadata, two maps were created, one per dataset. Fig. 5 shows the waste cleaned up for the smartphone dataset and where it was found. It clearly outlines the loop that was walked around the main campus buildings, as well as a few scattered detections from early testing data. Mapillary results from the community surrounding the university are shown in Fig. 6. As accuracy improves, computer vision generated maps like these could help organizations plan which areas should be cleaned up.
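
A minimal mapping sketch, assuming geopandas and contextily as the plotting stack (the coordinates and categories below are hypothetical, not detections from the study):

```python
import geopandas as gpd
import contextily as cx
import matplotlib.pyplot as plt

lats = [38.5616, 38.5621]    # hypothetical detection coordinates
lons = [-121.4244, -121.4250]
gdf = gpd.GeoDataFrame(
    {"category": ["cigarette", "bottle"]},
    geometry=gpd.points_from_xy(lons, lats),
    crs="EPSG:4326",
).to_crs(epsg=3857)  # project to Web Mercator for tile basemaps

ax = gdf.plot(column="category", legend=True, markersize=40)
cx.add_basemap(ax)  # street-map tiles behind the detection points
plt.savefig("litter_map.png", dpi=200)
```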

Issues

Although accuracy on the Mapillary images was low, results on the smartphone images were similar to previous research. This accuracy discrepancy between datasets is likely due to three issues: camera pixel density, the training dataset's perspective, and misclassified unfamiliar objects.

Modern smartphones have very high-definition cameras which provide a high level of detail and clarity. The dashcam images, however, had a lower pixel count, and because they were originally videos, some detail may have been lost to video compression. The model previously struggled to detect objects smaller than 20 × 20 pixels. Cigarettes, for example, have the second-most labeled instances and are generally uniform in color and appearance, so the model should recognize them easily. Because of their small size, however, it had difficulty extracting features it could use to distinguish them from leaves and other background objects.

Because the images in the training dataset are mostly smartphone pictures, they are usually taken very close to the objects and from above. This can cause the model to overfit to this perspective and fail to generalize its understanding of trash to a horizontal, road-facing camera. Additionally, the trash in the Mapillary images tended to be 10 to 30 feet away, making it appear much smaller and decreasing the model's confidence.


Lastly, there were specific objects the model hadn't seen during training that it routinely misclassified as trash. The largest source of false positives was cars being labeled as cans, with their wheels detected as pop tabs; these made up 50% of all false positives [Fig. 7]. The model could more reliably treat cars as true negatives if the training dataset contained more images with labeled waste in the foreground and cars in the background.

Future Work

Several issues were identified in the analysis above; these will need to be addressed to raise accuracy to an acceptable level. To fix the pixel density issue, a higher-resolution dataset could be used, or better cameras could scan and update Mapillary to improve image detail. Additionally, adding Mapillary images to the TACO dataset would let the model learn a more generalized concept of what waste looks like from new perspectives. To improve location estimation, multiple images could be combined to triangulate each position (see the sketch below). As these improvements are made, more accurate maps of communities could be generated quickly, showing where litter is located in high detail.
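
A minimal sketch of that triangulation idea, intersecting the bearing rays from two camera observations. It uses a flat-earth approximation that is reasonable over tens of meters; all names here are hypothetical, not the project's code.

```python
import math

def triangulate(p1, b1, p2, b2):
    """p = (lat, lon) in degrees, b = compass bearing in degrees.
    Returns the intersection of the two bearing rays, or None if parallel."""
    # Convert p2 to local meters with p1 as origin (equirectangular approx.)
    m_per_deg = 111_320.0
    cos_lat = math.cos(math.radians(p1[0]))
    x2 = (p2[1] - p1[1]) * m_per_deg * cos_lat
    y2 = (p2[0] - p1[0]) * m_per_deg
    # Compass bearing -> unit direction (0 deg = north, clockwise)
    d1 = (math.sin(math.radians(b1)), math.cos(math.radians(b1)))
    d2 = (math.sin(math.radians(b2)), math.cos(math.radians(b2)))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        return None  # rays are parallel; no unique intersection
    t = (x2 * d2[1] - y2 * d2[0]) / denom
    x, y = t * d1[0], t * d1[1]
    # Convert the local-meter intersection back to lat/lon degrees.
    return p1[0] + y / m_per_deg, p1[1] + x / (m_per_deg * cos_lat)
```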

Conclusion

[Image: H Street cleanup]

This research demonstrates how computer vision can aid volunteers in finding which areas have the most litter [Fig. 6] and in measuring how much of that waste gets removed during a cleanup [Fig. 5]. These techniques could help reduce the amount of litter in the environment, which has been rapidly increasing due to a combination of high waste production and low collection rates. Litter can harm animals, pose a health risk to surrounding communities, and cause environmental hazards. While trash cleanup activities help collect and dispose of litter, they are expensive, and thus usually focus only on especially dirty areas. Using computer vision to quantify trash levels in different communities would help these organizations focus their efforts where they're most needed. Because global waste production isn't predicted to plateau until the end of the century, new technologies such as this are needed to help clean up trash faster than it's littered.

References

[1] J. R. Jambeck et al., “Plastic waste inputs from land into the ocean,” Science, vol. 347, no. 6223, pp. 768–771, 2015.

[2] M. Sigler, “The effects of plastic pollution on aquatic wildlife: current situations and future solutions,” Water, Air, & Soil Pollution, vol. 225, no. 11, pp. 1–9, 2014.

[3] D. Hoornweg and P. Bhada-Tata, “What a waste: a global review of solid waste management,” World Bank, 2012.

[4] H. Yang, M. Ma, J. R. Thompson, and R. J. Flower, “Waste management, informal recycling, environmental pollution and public health,” J Epidemiol Community Health, vol. 72, no. 3, pp. 237–243, Mar. 2018, doi: 10.1136/jech-2016-208597.

[5] Z. Liu, M. Adams, and T. R. Walker, “Are exports of recyclables from developed to developing countries waste pollution transfer or part of the global circular economy?,” Resources, Conservation and Recycling, vol. 136, pp. 22–23, 2018.

[6] S. J. Barnes, “Out of sight, out of mind: Plastic waste exports, psychological distance and consumer plastic purchasing,” Global Environmental Change, vol. 58, p. 101943, 2019.

[7] J. Redmon, S. Davila, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 779–788. doi: 10.1109/CVPR.2016.91.

[8] B. D. Carolis, F. Ladogana, and N. Macchiarulo, “YOLO TrashNet: Garbage Detection in Video Streams,” in 2020 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), Bari, Italy, May 2020, pp. 1–7. doi: 10.1109/EAIS48028.2020.9122693.

[9] J. Redmon and A. Farhadi, “YOLO9000: Better, Faster, Stronger,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, Jul. 2017, pp. 6517–6525. doi: 10.1109/CVPR.2017.690.

[10] L. Donati, T. Fontanini, F. Tagliaferri, and A. Prati, “An Energy Saving Road Sweeper Using Deep Vision for Garbage Detection,” Applied Sciences, vol. 10, no. 22, p. 8146, Nov. 2020, doi: 10.3390/app10228146.

[11] M. Yang and G. Thung, “Classification of trash for recyclability status,” Stanford.edu. [Online]. Available: http://cs229.stanford.edu/proj2016/report/ThungYang-ClassificationOfTrashForRecyclabilityStatus-report.pdf.

[12] H. Gupta, “Trash Image Classification system using machine learning and deep learning algorithms,” Dublin, National College of Ireland, 2020. http://norma.ncirl.ie/4396/.

[13] P. Proença and P. Simões, Tacodataset.org. [Online]. Available: http://tacodataset.org/. [Accessed: 31-Jan-2022].

[14] H. Panwar et al., “AquaVision: Automating the detection of waste in water bodies using deep transfer learning,” Case Studies in Chemical and Environmental Engineering, vol. 2, p. 100026, Sep. 2020, doi: 10.1016/j.cscee.2020.100026.

[15] P. F. Proença and P. Simões, “TACO: Trash Annotations in Context for Litter Detection,” arXiv:2003.06975 [cs], Mar. 2020, Accessed: Oct. 15, 2021. [Online]. Available: http://arxiv.org/abs/2003.06975

[16] U. S. EPA and OLEM, “National overview: Facts and figures on materials, wastes and recycling,” 2017. https://www.epa.gov/facts-and-figures-about-materials-waste-and-recycling/national-overview-facts-and-figures-materials.

[17] J. Bai, S. Lian, Z. Liu, K. Wang, and D. Liu, “Deep Learning Based Robot for Automatically Picking Up Garbage on the Grass,” IEEE Trans. Consumer Electron., vol. 64, no. 3, pp. 382–389, Aug. 2018, doi: 10.1109/TCE.2018.2859629.

[18] P. Kandel, “Computer Vision For Recycling,” University of Mary Washington, 2020. https://scholar.umw.edu/student_research/379/.

[19] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” arXiv:1703.06870 [cs], Jan. 2018, Accessed: Jan. 01, 2022. [Online]. Available: http://arxiv.org/abs/1703.06870

[20] Mapillary.com. [Online]. Available: https://www.mapillary.com/platform. [Accessed: 11-May-2022].

[21] K. Grines, “Video encoding & compression: Why storage & quality matter,” Harmonicinc.com, 17-Aug-2020.

[22] R. Hebbalaguppe, G. Garg, E. Hassan, H. Ghosh, and A. Verma, “Telecom Inventory Management via Object Recognition and Localisation on Google Street View Images,” in 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, Mar. 2017, pp. 725–733. doi: 10.1109/WACV.2017.86.

[23] K. Willis, C. Maureaud, C. Wilcox, and B. D. Hardesty, “How successful are waste abatement campaigns and government policies at reducing plastic waste into the marine environment?,” Marine Policy, vol. 96, pp. 243–249, Oct. 2018, doi: 10.1016/j.marpol.2017.11.037.


