Deployment: Using the Roboflow Inference API

How to navigate the Roboflow Hosted Inference API and Python package code snippets for model deployment.

Written by Mohamed Traore

Last published at: June 29th, 2022

Model deployment can be a tedious process. Roboflow offers several deployment options, with model inference code snippets available for the command line (cURL), Python, JavaScript, Swift, Java, .NET, and more.

Roboflow's Hosted Inference API

Roboflow's deployment options available after training

Inference API: Response Object Format

Inference with the Hosted Inference API is performed through an HTTP POST request and returns a JSON object containing an array of predictions. Each prediction has the following properties:

  • x = the horizontal center point of the detected object
  • y = the vertical center point of the detected object
  • width = the width of the bounding box
  • height = the height of the bounding box
  • class = the class label of the detected object
  • confidence = the model's confidence that the detected object has the correct label and position coordinates

You'll notice the "image" object that follows "predictions." It contains the width and height of the image or video frame that was sent to the API for inference.

  • width = the width of the image or video frame
  • height = the height of the image or video frame

// an example JSON object
{
    "predictions": [
        {
            "x": 1113.0,
            "y": 880.0,
            "width": 138,
            "height": 330,
            "class": "white-bishop",
            "confidence": 0.642
        }
    ],
    "image": {
        "width": 2048,
        "height": 1371
    }
}

Example Response Image: a white pawn with a red object detection bounding box around it, drawn using the Hosted Inference API's Response Object.

Note: position (0,0) refers to the top-left corner of the image or video feed frame the inference was performed on.
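
As a minimal sketch of reading this response in Python, the snippet below parses a JSON string with a "predictions" array and an "image" object, and keeps only the predictions at or above a confidence threshold. The `Detection` class, the `parse_detections` helper, and the 0.5 default threshold are illustrative assumptions, not part of the Roboflow API.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    """One entry of the "predictions" array (hypothetical helper type)."""
    x: float          # horizontal center of the box
    y: float          # vertical center of the box
    width: float
    height: float
    label: str        # the "class" property of the prediction
    confidence: float


def parse_detections(response_text, min_confidence=0.5):
    """Return (detections, image_size) from a JSON response string.

    Assumes a "predictions" array and an "image" object holding the
    "width" and "height" of the inferred image.
    """
    data = json.loads(response_text)
    detections = [
        Detection(p["x"], p["y"], p["width"], p["height"],
                  p["class"], p["confidence"])
        for p in data["predictions"]
        if p["confidence"] >= min_confidence
    ]
    image_size = (data["image"]["width"], data["image"]["height"])
    return detections, image_size


sample = """
{
  "predictions": [
    {"x": 1113.0, "y": 880.0, "width": 138, "height": 330,
     "class": "white-bishop", "confidence": 0.642}
  ],
  "image": {"width": 2048, "height": 1371}
}
"""
detections, image_size = parse_detections(sample)
print(detections[0].label, image_size)
```

Raising `min_confidence` is a simple way to hide uncertain detections on the client side without changing the request.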

Examples & Guides:

The Roboflow Hosted Inference API supports inference on both images and video.

Here's a Google Colab notebook pre-filled with an example Roboflow Hosted Inference API and Python package code snippet to help you get started after creating your first project:

Displaying the Response Image with format=image

If you pass format=image in the query string, the inference API will return a base64 encoded string of your image with the inference detections drawn on top. You can decode this with your favorite image processing library. Here is an example using cv2 and numpy:

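import cv2
import numpy as np
import requests

# Get prediction from Roboflow Infer API
# upload_url (placeholder name): your full inference URL, including format=image
# img_str: the base64-encoded image to send
resp = requests.post(upload_url, data=img_str, headers={
    "Content-Type": "application/x-www-form-urlencoded"
}, stream=True).raw

# Parse result image
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR)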

Drawing a Box from the Inference API JSON Output

Frameworks and packages for rendering bounding boxes can differ in positional formats. Given the response JSON object's properties, a bounding box can always be drawn using some combination of the following rules:

  • the center point will always be (x,y)
  • the corner points (x1, y1) and (x2, y2) can be found using:
    • x1 = x - (width/2)
    • y1 = y - (height/2)
    • x2 = x + (width/2)
    • y2 = y + (height/2)

The corner points approach is a common pattern and is seen in libraries such as Pillow when building the box object to render bounding boxes within an Image.

Don't forget to iterate through all detections found when working with predictions!

# example box object from the Pillow library
for bounding_box in detections:
    x1 = bounding_box['x'] - bounding_box['width'] / 2
    x2 = bounding_box['x'] + bounding_box['width'] / 2
    y1 = bounding_box['y'] - bounding_box['height'] / 2
    y2 = bounding_box['y'] + bounding_box['height'] / 2
    box = (x1, y1, x2, y2)  # Pillow expects (left, top, right, bottom)
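
The conversion rules above can also be wrapped in a small helper. The sketch below is an illustration rather than Roboflow code: `to_corners` is a hypothetical function name, and clamping the corners to the image bounds is an added assumption for boxes that extend past the edge of the frame. It returns the corners in the (x1, y1, x2, y2) order that Pillow's `ImageDraw.rectangle` accepts.

```python
def to_corners(prediction, image_width, image_height):
    """Convert a center-based prediction to (x1, y1, x2, y2) corners.

    The corners are clamped to the image bounds (an assumption of this
    sketch, not something the API does for you).
    """
    half_w = prediction["width"] / 2
    half_h = prediction["height"] / 2
    x1 = max(0, prediction["x"] - half_w)
    y1 = max(0, prediction["y"] - half_h)
    x2 = min(image_width, prediction["x"] + half_w)
    y2 = min(image_height, prediction["y"] + half_h)
    return (x1, y1, x2, y2)


# values taken from the example JSON object: a 2048 x 1371 image
prediction = {"x": 1113.0, "y": 880.0, "width": 138, "height": 330}
print(to_corners(prediction, 2048, 1371))  # (1044.0, 715.0, 1182.0, 1045.0)
```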

Full working example:

Note: to receive an image with the predicted boxes already drawn, just pass "format=image" as a query parameter. The example below is for when you want to double-check and parse the JSON output yourself.

import base64
import io
import time

import requests
from PIL import Image, ImageDraw, ImageFont

# set url_base to the base URL of the Hosted Inference API
url_base = ''
endpoint = '[YOUR-MODEL]'
access_token = '?access_token=[YOUR_TOKEN]'
format = '&format=json'
confidence = '&confidence=10'
# assemble the request URL from its parts
parts = [url_base, endpoint, access_token, format, confidence]
url = ''.join(parts)

# load the image and encode it as a base64 JPEG string
f = '[YOUR-IMAGE].jpg'
image = Image.open(f)
buffered = io.BytesIO()
image = image.convert("RGB")
image.save(buffered, quality=90, format="JPEG")
img_str = base64.b64encode(buffered.getvalue())
img_str = img_str.decode("ascii")

headers = {'accept': 'application/json'}
start = time.time()
r = requests.post(url, data=img_str, headers=headers)
print('post took ' + str(time.time() - start))

# print(r) - the Response Object
preds = r.json()
detections = preds['predictions']

# drawing the box on the inferenced image, using the data from the Response Object
image = Image.open(f).convert("RGB")
draw = ImageDraw.Draw(image)
font = ImageFont.load_default()

for box in detections:
    # update the Hex code to adjust the bounding box color
    color = "#4892EA"
    x1 = box['x'] - box['width'] / 2
    x2 = box['x'] + box['width'] / 2
    y1 = box['y'] - box['height'] / 2
    y2 = box['y'] + box['height'] / 2
    draw.rectangle([
        x1, y1, x2, y2
    ], outline=color, width=5)

    text = box['class']
    # measure the label text (font.getsize was removed in Pillow 10)
    left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
    text_size = (int(right - left), int(bottom - top))
    # set button size + 10px margins
    button_size = (text_size[0]+20, text_size[1]+20)
    button_img = Image.new('RGBA', button_size, color)
    # put text on button with 10px margins
    button_draw = ImageDraw.Draw(button_img)
    # to customize the text color for object's label: update "fill" value
    button_draw.text((10, 10), text, font=font, fill=(255,255,255,255))
    # put button on source image at the top-left corner of the box
    image.paste(button_img, (int(x1), int(y1)))

Next Steps:

  1. Collect data from bad or imperfect model predictions.
    1. e.g., low-confidence detections and/or a lack of detections when at least one should be present
  2. Improve your model by implementing Active Learning.
  3. After retraining your model, re-deploy it to test its performance, or release it for use in production.
    1. More Resources: Deploying to Production with the Roboflow Python package
  4. Repeat Steps 1-3 to continue improving your model until you receive the desired results.
    1. e.g., achieving desired results based on your optimal model training metrics (mAP/precision/recall) or the model's performance while inferring on images or video.
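
Step 1 could be sketched as a simple filter over the response JSON. The function name `needs_review`, the 0.4 threshold, and the `expect_objects` flag are hypothetical choices for illustration, and the file names and responses are made-up sample data. A suitable threshold depends on your model and your data.

```python
def needs_review(response, low_confidence=0.4, expect_objects=True):
    """Return True if an inference response looks worth collecting.

    Flags a response when any detection falls below `low_confidence`,
    or when there are no detections but objects were expected.
    """
    predictions = response.get("predictions", [])
    if not predictions:
        return expect_objects
    return any(p["confidence"] < low_confidence for p in predictions)


# made-up responses keyed by the image they came from
responses = {
    "frame_001.jpg": {"predictions": [{"class": "white-bishop", "confidence": 0.92}]},
    "frame_002.jpg": {"predictions": [{"class": "white-pawn", "confidence": 0.31}]},
    "frame_003.jpg": {"predictions": []},
}
to_collect = [name for name, resp in responses.items() if needs_review(resp)]
print(to_collect)  # ['frame_002.jpg', 'frame_003.jpg']
```

The collected images can then be labeled and added to the dataset before retraining in steps 2 and 3.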