NEO Share


Create Custom Image Dataset for AI/ML projects using Python

January 9, 2021 by systems

Custom Image Dataset using a Python Library

Satyam Kumar
Photo by Franki Chamaki on Unsplash

Data is the basic requirement for any data science project. Depending on the project, a dataset may consist of audio, video, text, images, and so on. A sufficiently large dataset is required to train a robust machine learning or deep learning model.

Often we cannot find an appropriate image dataset for a particular project. Searching for and downloading images from the web and annotating them manually takes a lot of time and manpower. This article shows how to prepare a custom image dataset, suitable for training deep learning models, with a few lines of Python code using the bing_image_downloader library.

Bing is a web search engine developed by Microsoft. Bing Image Downloader is a Python library for downloading images in bulk from Bing.com. According to the project, it uses asynchronous URL requests, which makes downloading fast.

Installation:

The library can be installed using pip:

pip install bing-image-downloader

or by cloning the GitHub repository:

git clone https://github.com/gurugaurav/bing_image_downloader

Usage:

A dogs-and-cats classifier is one of the best beginner projects for convolutional neural networks (CNNs), and the dataset for it is freely available in just a few clicks. But suppose someone wants to work on a CNN project that predicts a dog’s breed from its image.

Developing such a model first requires a dataset of annotated images of the different dog breeds.

  • List the names of all dog breeds for which to download images:
breed_names_list = [
    "Affenhuahua", "Afghan Hound", "Akita", "Alaskan Malamute",
    "American Bulldog", "Auggie", "Beagle", "Belgian Tervuren",
    "Bichon Frise", "Bocker", "Borzoi", "Boxer", "Bugg", "Bulldog",
]
  • After listing the breed names, use the bing_image_downloader API to download images in bulk from bing.com, calling the downloader once per breed.
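The download code itself is not preserved in this copy of the article, so the following is only a minimal sketch of what such a loop might look like. It assumes the library's `downloader.download` function with the parameters described in this article. The helper name `download_breed_images`, the `download_fn` hook, and the default values for `limit` and `output_dir` are illustrative choices, not taken from the original.

```python
from pathlib import Path


def download_breed_images(breed_names, limit=20, output_dir="dog_breeds",
                          download_fn=None):
    """Download `limit` images per breed; return the expected per-breed folders.

    `download_fn` is a hypothetical hook that makes the loop easy to test;
    when it is omitted, bing_image_downloader's downloader.download is used.
    """
    if download_fn is None:
        # Imported lazily so the function can be defined without the library.
        from bing_image_downloader import downloader
        download_fn = downloader.download

    for breed in breed_names:
        # One call per search keyword; the library saves the results
        # under output_dir/<keyword>/.
        download_fn(
            breed,
            limit=limit,             # illustrative value
            output_dir=output_dir,   # illustrative folder name
            adult_filter_off=True,
            force_replace=False,
            timeout=60,
        )

    return [Path(output_dir) / breed for breed in breed_names]
```

Calling `download_breed_images(breed_names_list)` with the list above would request up to 20 images per breed and place them under `dog_breeds/`, one subfolder per breed.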

That is all: looping over the list and calling the downloader once per breed downloads the images, and each breed's images are stored in a separate folder named after that breed, which serves as its label.
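Because each folder name doubles as the annotation, the downloaded files can be turned into (image path, label) pairs by walking the output directory. The sketch below is one possible way to do that with the standard library. The function name `list_labeled_images` and the set of accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the files actually downloaded.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp"}


def list_labeled_images(dataset_dir):
    """Return a sorted list of (image_path, label) pairs.

    The label of each image is the name of the subfolder that contains it.
    """
    samples = []
    for class_dir in sorted(Path(dataset_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files at the top level
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, class_dir.name))
    return samples
```

The resulting list can then be shuffled, split into training and validation sets, and fed to whichever image-loading pipeline the project uses.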

Parameters:

query_string: Keyword string to search for.
limit: Number of images to download for each search keyword.
output_dir: Directory in which the downloaded images are saved.
adult_filter_off: Boolean; True turns the adult-content filter off, False leaves it on.
force_replace: If True and the folder already exists, delete it and save fresh; otherwise save into the existing folder.
timeout: Connection timeout in seconds.

Filed Under: Machine Learning
