Roboflow Train: Understanding Training Graphs

Analyzing training graphs from models trained with Roboflow Train

Written by Mohamed Traore

Last published at: October 8th, 2022

Accessing Your Training Graph

To access your training graph, you'll first need to train a model with a Roboflow Train credit. All new workspaces come with 3 free Roboflow Train credits.

Quick Start Guide

 

Models trained with Roboflow Train give you access to a custom Model-Assisted Labeling checkpoint to automate labeling on your projects, the ability to test your model in the Deploy Tab, and our simplified deployment solutions.

After your model finishes training, select "Details" in the UI while viewing the trained dataset version.

Visualized Training Results for v7 of the Face Detection Dataset

 

A Guide for Model Production

 

The training graph for version 7 (v7, the Roboflow-FAST model) of the Face Detection dataset, found in the Featured section of Roboflow Universe, looks like this:

Visualized Roboflow Train Graph for v7 of the Face Detection Dataset

How to Read the Training Graph:

Y-Axis

  • A percentage (%) value, expressed in decimal form (e.g., 0.85 corresponds to 85%).

X-Axis

  • The epoch number for the data point.
  • An epoch is one complete pass of all of your training data through your model architecture's network.
    • e.g., training for 100 epochs means your training data runs through your full model architecture's network 100 times.
    • To train for a custom number of epochs, use the Roboflow Model Library after generating a version of your dataset and exporting it to the notebook of your choice, as in the sketch below.
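As a hedged example, notebooks linked from the Roboflow Model Library (such as the YOLOv5 notebook) typically expose the epoch count as a training flag; the paths and values below are placeholders, not a verbatim Roboflow cell:

    # Hypothetical YOLOv5 notebook cell; dataset path, weights, and values are placeholders.
    # --epochs sets how many full passes the training data makes through the network.
    !python train.py --img 640 --batch 16 --epochs 150 \
        --data /content/dataset/data.yaml --weights yolov5s.pt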

box_loss

  • Better known as “box loss.”
  • A loss metric, based on a specific loss function, that measures how "tight" the predicted bounding boxes are to the ground truth objects (the labels on your dataset's images).
  • A lower value indicates your model is generalizing better and drawing tighter bounding boxes around the objects your dataset has been labeled to identify.

cls_loss

  • Better known as “classification loss.”
  • A loss metric, based on a specific loss function, that measures the correctness of the classification of each predicted bounding box. Each individual bounding box may contain an object class or a “background” (no object) label.

mAP

  • To calculate mAP for object detection, you calculate the average precision (AP) for each class in your data based on the model predictions. Average precision is the area under the precision-recall curve, or AUC, for a given class in your dataset. The mean of these per-class average precision values gives you the mean Average Precision, or mAP.
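As a rough sketch (not Roboflow's internal evaluator), per-class AP can be computed as the area under the precision-recall curve using standard all-point interpolation, and mAP as the mean across classes:

    import numpy as np

    def average_precision(recalls, precisions):
        # Area under the precision-recall curve for one class,
        # using all-point interpolation (recalls sorted ascending).
        r = np.concatenate(([0.0], recalls, [1.0]))
        p = np.concatenate(([0.0], precisions, [0.0]))
        # Make the precision envelope monotonically decreasing.
        for i in range(len(p) - 2, -1, -1):
            p[i] = max(p[i], p[i + 1])
        # Sum rectangle areas where recall changes.
        idx = np.where(r[1:] != r[:-1])[0]
        return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

    # mAP = mean of per-class APs, e.g. (per_class_curves is a hypothetical list):
    # map_score = np.mean([average_precision(r_c, p_c) for (r_c, p_c) in per_class_curves])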

Note: mAP is influenced by Intersection over Union, or IoU, of ground truth labels and predicted bounding boxes.

Intersection over Union (IoU) is measured as the area where a predicted bounding box overlaps with the ground truth bounding box, divided by the area of their union (the total area covered by both boxes, counting the overlap only once).

Visual representation of Intersection over Union (IoU) - Source
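For concreteness, here is a minimal IoU sketch for axis-aligned boxes; the (x1, y1, x2, y2) corner format is an assumption for illustration:

    def iou(box_a, box_b):
        # Boxes given as (x1, y1, x2, y2) corner coordinates (assumed format).
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0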

mAP_0.5

  • Better known as "mean Average Precision with an IoU of 0.50, or 50%".
  • The mean Average Precision (mAP) with predictions counted as a “detected object” at an Intersection over Union (IoU) of at least 0.5, or 50%.

mAP_0.5:0.95

  • Better known as "mean Average Precision over an IoU interval of 0.50 to 0.95, or 50% to 95%".
  • The mean Average Precision (mAP) averaged over Intersection over Union (IoU) thresholds from 0.50 to 0.95 in steps of 0.05 (the COCO-style metric).
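A sketch of that averaging, assuming a hypothetical map_at_iou(threshold) helper that returns the mAP at a single IoU threshold:

    import numpy as np

    # Ten thresholds: 0.50, 0.55, ..., 0.95
    thresholds = np.arange(0.50, 1.00, 0.05)
    # map_at_iou is a hypothetical helper, not a real Roboflow function.
    map_50_95 = float(np.mean([map_at_iou(t) for t in thresholds]))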

precision

  • A measure of how precise the model is at prediction time: true positives divided by all positive predictions (true positives plus false positives). See the sketch after the recall definition below.

recall

  • A measure of how completely a prediction system finds what it should: true positives divided by all possible true positives (true positives plus false negatives).
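A minimal sketch of both formulas from raw counts; tp, fp, and fn stand for true positive, false positive, and false negative counts you would tally from your predictions:

    def precision(tp, fp):
        # Of everything predicted as positive, how much was correct?
        return tp / (tp + fp) if (tp + fp) else 0.0

    def recall(tp, fn):
        # Of everything that should have been found, how much was found?
        return tp / (tp + fn) if (tp + fn) else 0.0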

train

  • This refers to the metrics for your training set (split).

val

  • This refers to the metrics for your validation set (split).

 

The Train, Validation, and Testing Split

 

If you aren't satisfied with your results, you can take advantage of Roboflow's dataset management system and code integrations, such as the Python package and the Upload API, to add more images to your project and improve your results the next time the model is trained, tested, and deployed.
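As a hedged sketch of uploading with the roboflow Python package (the API key, project ID, and image path are placeholders; check the package documentation for your exact workspace setup):

    from roboflow import Roboflow

    # Placeholders: substitute your own API key, project ID, and image path.
    rf = Roboflow(api_key="YOUR_API_KEY")
    project = rf.workspace().project("your-project-id")

    # Upload a new image to the project for labeling and a future retrain.
    project.upload("path/to/new_image.jpg")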

 

Implementing Active Learning

Dataset Health Check for Model Improvement

 

Still having trouble? Send an email to help@roboflow.com, or post in our Community Forum for assistance.

For those working on business proofs of concept, or on paid Roboflow plans, submit a Contact Form or contact your dedicated Roboflow Support representative for assistance.