Classify trash with Machine Learning

3 min read · Jun 19, 2019

If you don’t know anything about image classification, I suggest you check out my previous article about a classifier for handwritten digits. I expect readers to have some understanding of neural networks. All the code is available on GitHub.

Introduction

AI is one of the hottest topics right now. Whether we know it or not, it’s already all around us. Large companies like Amazon, Google and Apple track their users’ data to categorize and classify their customers for better ad targeting, product services and many other purposes. In this article I’ll build a classifier for trash. It sorts the trash in a given picture into 5 categories: plastic, metal, paper, glass and cardboard. I was not able to push the accuracy beyond 78%, but I believe that is a consequence of the small dataset; with more data the model should reach a higher accuracy.

Keras Data Augmentation

To cope with the small dataset I used data augmentation. Basically, it modifies the images slightly to generate more images. If you zoom in a bit on a picture of a plastic bottle, the meaning of the picture does not change. In the same way, flipping the image horizontally (or vertically in some cases) does not change its meaning. Keras data augmentation applies all of these modifications for you.
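To make the idea concrete, here is a small NumPy-only sketch of the two transformations mentioned above. It is a simplification for illustration, not what Keras does internally: the helper names horizontal_flip and center_zoom, the 64x64 random stand-in image and the zoom factor are all assumptions. The point is only that the picture changes while the label stays the same.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an H x W x C image left to right."""
    return image[:, ::-1, :]

def center_zoom(image, factor):
    """Zoom in by cropping the centre and resizing back (nearest neighbour)."""
    height, width = image.shape[:2]
    crop_h, crop_w = int(height / factor), int(width / factor)
    top, left = (height - crop_h) // 2, (width - crop_w) // 2
    crop = image[top:top + crop_h, left:left + crop_w]
    # Map every output pixel back to its nearest source pixel in the crop.
    rows = np.arange(height) * crop_h // height
    cols = np.arange(width) * crop_w // width
    return crop[rows][:, cols]

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # stand-in for one photo of a plastic bottle
label = "plastic"

# Every augmented copy keeps the label of the image it came from.
augmented = [
    (horizontal_flip(image), label),
    (center_zoom(image, 1.25), label),
]
```

Each entry in augmented is a new training example with the original shape and the original label.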

All the images are generated from a single image

flow_from_directory generates the class indices for the labels automatically from the directory names, so train_generator holds both the image data and the label data.
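A minimal sketch of such a setup might look like the following. The image size, batch size, augmentation ranges, validation split and the make_generators helper are assumptions for illustration, not necessarily the values used for this model. It expects one sub-folder per category inside data_dir.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMAGE_SIZE = (128, 128)  # assumed input size
BATCH_SIZE = 32          # assumed batch size

def make_generators(data_dir):
    """Build training and validation generators from one folder per class."""
    datagen = ImageDataGenerator(
        rescale=1.0 / 255,      # scale pixel values to [0, 1]
        zoom_range=0.2,         # random zoom (assumed range)
        horizontal_flip=True,   # random left-right mirroring
        validation_split=0.1,   # assumed share held out for validation
    )
    common = dict(
        target_size=IMAGE_SIZE,
        batch_size=BATCH_SIZE,
        class_mode="categorical",  # one-hot labels for the 5 categories
    )
    train_generator = datagen.flow_from_directory(
        data_dir, subset="training", **common
    )
    validation_generator = datagen.flow_from_directory(
        data_dir, subset="validation", **common
    )
    return train_generator, validation_generator
```

With folders named cardboard, glass, metal, paper and plastic, train_generator.class_indices maps the folder names to the integers 0 to 4 in alphanumeric order.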

There is another way to handle a small dataset: transfer learning. Transfer learning is beyond the scope of this article (I may cover it in a future one).

The Model

I used five convolutional blocks, each containing two convolutional layers with a max-pooling layer at the end. If you don’t know about convolutional neural networks, now is the right time to read about them. Here is a good article.
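As a rough sketch of that structure, the model could be built like this. The filter counts, kernel size, input shape, dense layer size and optimizer are assumptions chosen for illustration; only the five blocks of two convolutions plus one max-pooling layer and the 5 output categories come from the description above.

```python
from tensorflow.keras import layers, models

def build_model(input_shape=(128, 128, 3), num_classes=5,
                filters=(32, 64, 128, 256, 256)):
    """Five conv blocks (Conv, Conv, MaxPool) followed by a small classifier."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for n_filters in filters:
        model.add(layers.Conv2D(n_filters, 3, padding="same", activation="relu"))
        model.add(layers.Conv2D(n_filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2))  # halves height and width
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",  # matches one-hot labels
        metrics=["accuracy"],
    )
    return model

model = build_model()
```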

The numbers are pretty arbitrary. You can play with them and may even get better results than I did.

Training

With data augmentation we can’t use fit as with conventional Keras models. Instead we use fit_generator and pass it the train_generator (remember, it contains the image data along with the labels).
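Here is a small runnable sketch of training from a generator. Note that in more recent TensorFlow releases fit accepts generators directly and fit_generator is deprecated, so the sketch calls fit; with an older Keras you would call fit_generator with the same arguments. The fake_batches generator is a hypothetical stand-in for train_generator that yields random data, and the tiny model is only there so the example runs quickly on a CPU.

```python
import numpy as np
from tensorflow.keras import layers, models

def fake_batches(batch_size=8, image_shape=(32, 32, 3), num_classes=5, seed=0):
    """Stand-in for train_generator: yields (images, one-hot labels) forever."""
    rng = np.random.default_rng(seed)
    while True:
        images = rng.random((batch_size, *image_shape)).astype("float32")
        class_ids = rng.integers(0, num_classes, batch_size)
        labels = np.eye(num_classes, dtype="float32")[class_ids]
        yield images, labels

# Deliberately tiny model, not the five-block network from the article.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# The generator never ends, so steps_per_epoch defines the length of an epoch.
history = model.fit(fake_batches(), steps_per_epoch=4, epochs=2, verbose=0)
```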

I trained the model on Google Colab and suggest you do the same. CNNs need a lot of processing power, so if you don’t have a good GPU, don’t even try to train it on your CPU (unless you have a lot of time). I trained the model for 500 epochs (5 runs of 100 epochs each).

It’s good practice to always save your model. I used a checkpoint that saves the model whenever the validation accuracy improves.
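A checkpoint like that can be sketched with the Keras ModelCheckpoint callback. The file name and the train_with_checkpoint helper are hypothetical, and the monitored metric is called val_accuracy in recent releases (older Keras versions name it val_acc).

```python
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(
    filepath="best_model.keras",  # hypothetical file name
    monitor="val_accuracy",       # "val_acc" in older Keras releases
    mode="max",                   # higher validation accuracy is better
    save_best_only=True,          # overwrite only when the metric improves
    verbose=1,
)

def train_with_checkpoint(model, train_generator, validation_generator,
                          epochs=100):
    """Train and keep only the best model seen so far on disk."""
    return model.fit(
        train_generator,
        validation_data=validation_generator,
        epochs=epochs,
        callbacks=[checkpoint],
    )
```

Because only improvements are written, the saved file always holds the best epoch even when training is split into several runs.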

Test

Let’s test our model (spoiler: it’s not great). I evaluated the model and got an accuracy of 78%.
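For reference, accuracy here simply means the share of test images whose most probable class matches the true label. The NumPy sketch below shows that computation on made-up probabilities; the numbers are invented for illustration and are not results from this model. In Keras, the evaluate method of a model compiled with the accuracy metric reports the same quantity.

```python
import numpy as np

def accuracy(probabilities, true_labels):
    """Fraction of rows whose highest-probability class equals the label."""
    predicted = np.argmax(probabilities, axis=1)
    return float(np.mean(predicted == true_labels))

# Made-up softmax outputs for 4 images over the 5 categories.
probabilities = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.60, 0.10, 0.10, 0.10],
    [0.20, 0.20, 0.50, 0.05, 0.05],
    [0.30, 0.10, 0.10, 0.40, 0.10],
])
true_labels = np.array([0, 1, 2, 0])  # the last image is misclassified

score = accuracy(probabilities, true_labels)  # 3 of 4 correct
```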

Now let’s look at some predictions from the test set.
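To turn the model’s output into readable labels, the predicted probabilities need to be mapped back to category names. A minimal sketch, assuming the class indices follow the alphanumeric folder order described earlier; the to_class_names helper and the example probabilities are hypothetical.

```python
import numpy as np

# Assumed mapping, as flow_from_directory would derive it from folder names.
class_indices = {"cardboard": 0, "glass": 1, "metal": 2, "paper": 3, "plastic": 4}
index_to_class = {index: name for name, index in class_indices.items()}

def to_class_names(probabilities):
    """Pick the most probable class per row and return its name."""
    return [index_to_class[int(i)] for i in np.argmax(probabilities, axis=1)]

# Made-up model outputs for two images.
example = np.array([
    [0.05, 0.05, 0.10, 0.10, 0.70],
    [0.10, 0.60, 0.10, 0.10, 0.10],
])
names = to_class_names(example)
```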

Conclusion

I am not an expert in AI; I am just an engineering student who is interested in it. This article is meant to help other people who are also interested in AI, and to help me improve my own understanding of the topic. All the code is available on my GitHub.

DarvinX/trash_classifier

A keras model for categorizing trashes into 5 categories. - DarvinX/trash_classifier

github.com

References

Image Preprocessing - Keras Documentation

data_format: Image data format, either "channels_first" or "channels_last". "channels_last" mode means that the images…

keras.io

Understanding of Convolutional Neural Network (CNN) — Deep Learning

In neural networks, Convolutional neural network (ConvNets or CNNs) is one of the main categories to do images…

medium.com

Data Augmentation in Python: Everything You Need to Know - neptune.ai

In machine learning ( ML), if the situation when the model does not generalize well from the training data to unseen…

neptune.ai

Building powerful image classification models using very little data

But what's more, deep learning models are by nature highly repurposable: you can take, say, an image classification or…

blog.keras.io

garythung/trashnet

Dataset of images of trash; Torch-based CNN for garbage image classification - garythung/trashnet