How to Use Roboflow Universe
You're likely on Roboflow Universe for one of three reasons: to search or explore datasets for computer vision, machine vision, or Digital Transformation (DX); to find new data to add to your own project; or to test and deploy models from Universe in your own applications.
Follow along for our complete guide to utilizing Roboflow Universe:
Selecting a Model to Test on Your Own Data
Before we select a model to test on our own data, there are a few considerations to keep in mind:
- What is our Problem Statement? It needs to be specific, actionable, and measurable. That gives us benchmarks for when the model is performant enough for initial deployment and for when we consider it highly performant, along with metrics to track Return on Investment (we don't want to spend all of this time making a model for nothing).
- Selecting a dataset that contains the subjects or objects of interest
- Ensuring the dataset is properly labeled (e.g., tight bounding boxes, no missing annotations on subjects of interest).
- Checking the class names to decide whether we need to use Modify Classes (to remap or omit classes) and Filter Null, so that our class names match when the datasets are combined and extraneous classes or images are filtered out.
- Using search filters to query datasets by the total number of images in the project and by whether the project has a trained model, so we have an idea of how these new images may affect our current dataset.
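To make the class-name check concrete, here is a minimal Python sketch of what remapping classes and filtering null images does to a set of annotations before two datasets are combined. It is not Roboflow's implementation: the `CLASS_MAP` table, the function names, and the annotation layout (a list of dictionaries with a `class` key per image) are all assumptions made for illustration, and this sketch drops every unannotated image, which is a simplification.

```python
from typing import Optional

# Hypothetical remap table: source class name -> target name, or None to omit.
CLASS_MAP: dict[str, Optional[str]] = {
    "Face": "face",
    "human-face": "face",
    "hat": None,  # omit: not a subject of interest for face detection
}


def remap_image(annotations: list[dict], class_map: dict) -> list[dict]:
    """Rename or drop the annotations of one image.

    Classes missing from class_map are kept unchanged.
    """
    remapped = []
    for ann in annotations:
        target = class_map.get(ann["class"], ann["class"])
        if target is None:
            continue  # class was marked for omission
        remapped.append({**ann, "class": target})
    return remapped


def remap_dataset(images: dict, class_map: dict, filter_null: bool = True) -> dict:
    """Apply the remap to every image; optionally drop images left with no annotations."""
    result = {}
    for name, annotations in images.items():
        kept = remap_image(annotations, class_map)
        if filter_null and not kept:
            continue  # null image: nothing of interest is labeled in it
        result[name] = kept
    return result
```

Running `remap_dataset` over both datasets with the same table before merging them is one way to confirm that only the intended class names remain.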
Problem Statement: Create an accurate face detection model to automatically count the number of visitors to a retail store throughout the day. This will help us with proper staffing at all hours the store is open as we collect more data over time. The identified faces will be blurred with some added "post-processing" code within our inference script to retain the privacy of our visitors.
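As a rough illustration of the "post-processing" this problem statement calls for, the Python sketch below tallies face detections per hour and converts each detection into the pixel rectangle that would be blurred. It assumes each prediction is a dictionary with `class`, `confidence` (0 to 1), and a center-based box (`x`, `y`, `width`, `height`); the function names and the 0.5 default threshold are hypothetical. Note that it counts detections per frame, not unique visitors, so a real count would still need de-duplication across frames.

```python
from collections import Counter
from datetime import datetime


def faces_per_hour(frames, min_confidence=0.5):
    """Count face detections per hour of the day.

    frames: iterable of (timestamp, predictions) pairs, one per processed frame.
    """
    counts = Counter()
    for timestamp, predictions in frames:
        counts[timestamp.hour] += sum(
            1
            for p in predictions
            if p["class"] == "face" and p["confidence"] >= min_confidence
        )
    return counts


def blur_region(prediction, image_width, image_height):
    """Convert a center-based box to (left, top, right, bottom), clamped to the image."""
    half_w = prediction["width"] / 2
    half_h = prediction["height"] / 2
    left = max(0, int(prediction["x"] - half_w))
    top = max(0, int(prediction["y"] - half_h))
    right = min(image_width, int(prediction["x"] + half_w))
    bottom = min(image_height, int(prediction["y"] + half_h))
    return left, top, right, bottom
```

The rectangle returned by `blur_region` can then be handed to whichever image library the inference script already uses to apply the blur.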
Searching Roboflow Universe for Datasets
Universe Datasets for Transfer Learning and Assisted Labeling
Transfer learning is a machine learning (ML) technique in which the knowledge a model gains from training on one set of problems is used to solve other, related problems. On Roboflow, this is the equivalent of training from a checkpoint (Start from a Checkpoint).
If you'd like to know more about Transfer Learning, or how it works, here's a quick guide: A Primer on Transfer Learning
Considerations when selecting a dataset for Transfer Learning:
The same considerations apply as the ones listed above for "Selecting a Model to Test on Your Own Data." However, one metric that should be more heavily considered in dataset selection for transfer learning is the result of the trained Roboflow model for the dataset.
To decipher the training results, you'll need to understand mean average precision (mAP), precision, and recall. As a benchmark, object detection models with mAP, precision, and recall at or above 70% are good places to start your search. If you're working on classification, a model with a Top Validation % greater than 70% is a good place to begin. For segmentation models, it's best to look for a mAP score of at least 60%.
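These benchmarks can be written down as a simple screening check. The Python sketch below is only an illustration: the task labels and metric key names (`map`, `precision`, `recall`, `top_validation`) are hypothetical, and the values are percentages copied by hand from a dataset's training results.

```python
def meets_benchmark(task: str, metrics: dict) -> bool:
    """Return True if a trained model clears the starting benchmarks described above.

    metrics holds percentages (0-100); key names are assumptions of this sketch.
    """
    if task == "object-detection":
        # mAP, precision, and recall all at or above 70%
        return all(metrics[key] >= 70 for key in ("map", "precision", "recall"))
    if task == "classification":
        # Top Validation % strictly greater than 70%
        return metrics["top_validation"] > 70
    if task == "segmentation":
        # mAP of at least 60%
        return metrics["map"] >= 60
    raise ValueError(f"unknown task: {task!r}")
```

For example, `meets_benchmark("object-detection", {"map": 75, "precision": 70, "recall": 82})` passes, while the same model with a recall of 69.9 would not.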
Model-Assisted Labeling (Label Assist)
Considerations when selecting a dataset for Assisted Labeling:
The same considerations for selecting a good dataset for Transfer Learning apply when selecting a dataset for Assisted Labeling.
One added caveat is that you'll want to test the model first: use the Hosted Inference API, drop an image or video file into the rfWidget, or use the new Model tab on Roboflow Universe. This shows whether the dataset you are using for Assisted Labeling is well-suited to finding objects in your images at confidence levels between 10% and 90%.
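One way to read the results of such a test is to count how many detections survive as the confidence threshold rises. The Python sketch below assumes you already have a list of predictions for a test image, each a dictionary with a `confidence` value between 0 and 1; the function name and the choice of thresholds are illustrative only.

```python
def detections_by_threshold(predictions, thresholds=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Count how many detections remain at each confidence threshold."""
    return {
        threshold: sum(1 for p in predictions if p["confidence"] >= threshold)
        for threshold in thresholds
    }
```

If most of the objects in your image disappear well before the higher thresholds, the model is probably a weak candidate for Assisted Labeling on your data.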
Testing a Universe Model
As we found in our tests on the Roboflow Universe Model page for the Face Detection model, our model is not perfect. Our model also had trouble with Label Assist at higher confidence levels. This tells us that Active Learning should be implemented to quickly improve our dataset.
Active Learning can be used to improve the confidence of our model's detections on the kinds of images it has performed poorly on. It can also be used to root out false or inaccurate detections that we notice in tests or once the model is in production.
We can set specific rules within our inference or deployment code to do things such as:
- Sample images for upload to Roboflow when detections are under a confidence level of 70%.
- Sample images for upload to Roboflow when detections are not present, even though we know they should be.
  - In our retail store occupancy use case, we would "know" when to turn on this sampling based on our history of the store's busiest hours.
- Sample images for upload to Roboflow at random time intervals.
  - This would help us combat data drift.
- Sample images for upload to Roboflow at times of the day when we have seen our model perform poorly.
  - This would let us sample more images at night if we find that our model does poorly at night, or if our dataset has very few nighttime images.
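The sampling rules above could be sketched in Python along the following lines. This is only an illustration and not Roboflow's Active Learning implementation: the function name, parameter names, and reason labels are hypothetical, predictions are assumed to carry a `confidence` between 0 and 1, and random time intervals are approximated here by a per-frame sampling probability. Uploading the sampled image is left to your own inference code.

```python
import random
from datetime import datetime


def should_sample(
    predictions,
    now,
    *,
    low_confidence=0.70,
    busy_hours=frozenset(),
    weak_hours=frozenset(),
    random_rate=0.0,
    rng=random,
):
    """Return the list of reasons to upload this frame (empty list means skip it)."""
    reasons = []
    # Rule 1: at least one detection is under the confidence level (70% by default).
    if any(p["confidence"] < low_confidence for p in predictions):
        reasons.append("low_confidence")
    # Rule 2: nothing detected during hours when we know the store is busy.
    if not predictions and now.hour in busy_hours:
        reasons.append("missing_detections")
    # Rule 3: hours of the day when the model is known to perform poorly.
    if now.hour in weak_hours:
        reasons.append("weak_hours")
    # Rule 4: random sampling to help combat data drift.
    if rng.random() < random_rate:
        reasons.append("random")
    return reasons
```

For example, `should_sample([], datetime(2024, 1, 1, 13), busy_hours={12, 13})` returns `["missing_detections"]`, flagging an empty frame captured during a known lunchtime rush.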
If you're having trouble improving your model, search our Knowledge Base, Documentation, or Community Forum for more resources, or post to our Community Forum for help.
Finished a cool project, or looking for people to collaborate with? Post on the Show & Tell page on our Community Forum!
- We like to feature cool and interesting projects on our blog and newsletter, and we're happy to add links to your portfolio or LinkedIn in the writeup. We can also provide more storage and training credits without charge for those who cite us in their research paper, post in Show & Tell, write a blog post, or tag us in a social media post about their project experience.
Dwyer, B., Nelson, J. (2022). Roboflow (Version 1.0) [Software]. Available from https://roboflow.com. computer vision.