Using WhatToLabel for Data Selection

Watch our short overview video below for a first impression of how data selection works with WhatToLabel.

How it works

Submit your Dataset

You can use our web app, our command-line interface, or the Docker container to filter your dataset. The command-line interface comes in handy if you already use a cloud server to train your deep learning models.

Select Parameters

We let you optimize your dataset for various tasks. Both the command-line interface and the web application allow coarse optimization for classification, object detection, segmentation, and GANs. The Docker container offers more fine-grained control.

Behind the Scenes

After you submit your dataset with your preferred parameters, our AI data filtering software, which we named Boris, analyzes it. Boris automatically removes corrupt files and rebalances the dataset on a feature level. Depending on your filter preference, either near-duplicates are removed or a new dataset is created from the most important samples. We share more details about how exactly we filter datasets in this blog post.
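To give a rough idea of what removing near-duplicates on a feature level can look like, here is a minimal sketch. It is not how Boris works internally; it assumes each image has already been turned into a feature vector by some model, and it uses a simple greedy cosine-similarity filter. The function names, the threshold of 0.98, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def drop_near_duplicates(features, threshold=0.98):
    """Greedily keep a sample only if it is not too similar to any kept sample.

    `features` maps a filename to its feature vector (hypothetical input
    format). Returns the filenames that survive, in their original order.
    """
    kept = []
    for name, vec in features.items():
        if all(cosine_similarity(vec, features[k]) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy example: two almost identical "cat" vectors and one distinct "dog" vector.
features = {
    "cat_001.jpg": [1.0, 0.0, 0.2],
    "cat_002.jpg": [0.99, 0.01, 0.2],
    "dog_001.jpg": [0.0, 1.0, 0.1],
}
kept = drop_near_duplicates(features)
print(kept)
```

In this toy example the second cat image is dropped because its vector is nearly identical to the first one. A greedy pass like this compares every sample against all kept samples, so it is only practical for small datasets; larger datasets would need an approximate nearest-neighbor index.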

After Filtering

You can download either a list of the final filenames or the clean dataset. We also provide a report with more details about how Boris processed your data.
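If you download the list of filenames rather than the clean dataset, you can build the filtered dataset yourself. The sketch below assumes the list is a plain text file with one relative filename per line; the actual export format may differ. The function name and paths are hypothetical.

```python
import shutil
from pathlib import Path


def build_clean_dataset(filename_list, source_dir, target_dir):
    """Copy every file named in `filename_list` from source_dir to target_dir.

    Assumes one relative filename per line (an assumption about the export
    format). Returns the names that were copied; missing files are skipped.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for line in Path(filename_list).read_text().splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines
        src = source_dir / name
        if src.is_file():
            dst = target_dir / name
            dst.parent.mkdir(parents=True, exist_ok=True)  # keep subfolders
            shutil.copy2(src, dst)
            copied.append(name)
    return copied
```

A call such as build_clean_dataset("selected_files.txt", "raw_images", "clean_images") leaves the original folder untouched and writes the selected files to a new one, so you can always go back to the full dataset.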

Download an example analytics report for MS COCO
Improve your data
Today is the day to get the most out of your data. Share our mission with the world and unleash your data's true potential.