3rd Workshop on Maritime Computer Vision (MaCVi)


Approximate Supervised Object Distance Estimation

Quick links: Dataset download · Submit · Leaderboards · Ask for help

Quick Start

  1. Download the dataset from the dataset page.
  2. Train your model on the provided dataset. Find starter code here.
  3. Export your model to ONNX and upload it via the upload page; we will evaluate it on the test split. Please note that the evaluation may take a few hours (approximately 4).

🏅 Prizes

The top team(s) of this challenge will win prizes sponsored by Shield AI. Details of the prizes will be announced soon.

Overview

This challenge focuses on developing algorithms for approximate supervised object distance estimation using monocular images captured from unmanned surface vehicles (USVs). Participants are tasked with creating models that estimate the distance of navigational aids, such as buoys, using only visual cues. The goal is to maximize the accuracy of both object detection and distance estimation while ensuring real-time operation.

Task

The task is to develop a deep learning model capable of detecting objects and predicting their distance from a USV using monocular images. Models will be evaluated on both detection accuracy and distance estimation error under varying environmental conditions.

Dataset

The dataset consists of approximately 3,000 images of maritime navigational aids, primarily red and green buoy markers. Only the training set is provided; the test set is withheld to establish a benchmark for all submitted models in the competition. Accurate ground-truth distances are included; they were computed as the haversine distance between the camera's GPS coordinates for each frame and the mapped buoy locations. Some images contain objects at considerable distances that appear as only a few pixels in the frame. These challenging samples may hurt the object detector's performance and lower the mAP metric. Since the challenge evaluation includes object detection criteria (such as mAP), you may choose to exclude such samples from the training set.
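For reference, the haversine formula used for the ground-truth distances can be sketched as follows; the coordinates in the example are made up, and the exact implementation used to build the dataset may differ:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in meters between two GPS coordinates (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# Example: distance between a camera pose and a mapped buoy (illustrative coordinates).
print(haversine_m(54.3233, 10.1228, 54.3301, 10.1456))
```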

Evaluation metrics

The submitted models are evaluated on a test set that is not publicly available.
Two different metrics are employed to assess the models' performance on object detection and distance estimation:

The weighted Distance Error Metric is defined as follows: $$\varepsilon_{Dist} = \sum_{i=1}^n \frac{c_i}{\sum_{j=1}^n c_j} \frac{|d_i - \hat{d}_i|}{d_i}$$ That is, for each prediction we compute the absolute error, divide it by the ground-truth distance to obtain the relative error, and weight the result by the normalized confidence. Since predictions for distant objects typically deviate more from the ground truth in absolute terms, the relative formulation ensures that smaller absolute errors on closer objects are penalized comparably.
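For clarity, a direct NumPy transcription of this metric might look as follows (assuming predictions have already been matched to ground-truth objects; variable names are ours):

```python
import numpy as np

def weighted_distance_error(d_true, d_pred, conf):
    """Confidence-weighted relative distance error (epsilon_Dist above)."""
    d_true = np.asarray(d_true, dtype=float)
    d_pred = np.asarray(d_pred, dtype=float)
    conf = np.asarray(conf, dtype=float)
    weights = conf / conf.sum()                  # c_i / sum_j c_j
    rel_err = np.abs(d_true - d_pred) / d_true   # relative error per detection
    return float((weights * rel_err).sum())

# Example with three matched detections (made-up numbers):
print(weighted_distance_error([50.0, 120.0, 300.0], [48.0, 130.0, 260.0], [0.9, 0.8, 0.6]))
```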

To evaluate the object detection performance together with the distance error, the final combined metric is specified as follows: $$\text{Combined Metric} = \text{mAP@[0.5:0.95]} \cdot (1 - \min(\varepsilon_{Dist}, 1))$$ The final challenge placements are determined by this score.
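The combined score then reduces to a one-liner; `map_50_95` and `eps_dist` are assumed to be computed as described above:

```python
def combined_metric(map_50_95, eps_dist):
    """Final challenge score: mAP@[0.5:0.95] scaled by the capped distance error."""
    return map_50_95 * (1.0 - min(eps_dist, 1.0))

print(combined_metric(0.45, 0.20))  # -> 0.36
```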

Participate

To participate in the challenge, follow these steps:

  1. Download the dataset from the dataset page.
  2. Train a deep learning model on the dataset.
  3. Export your trained model to ONNX and submit it via the upload page. We will subsequently evaluate it on the test split.

Get Started

To help you get started and provide a brief introduction to the topic, we have developed an adapted YOLOv7 object detector that can also make distance predictions. You can find the code here.
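As a rough illustration of the idea (not the actual starter-code architecture), distance prediction can be realized as a small regression head on top of per-box features; everything below is a hypothetical sketch:

```python
import torch
import torch.nn as nn

class DistanceHead(nn.Module):
    """Toy stand-in: regresses one distance value (meters) per detected box."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # one distance value per box
        )

    def forward(self, box_features):
        return self.mlp(box_features).squeeze(-1)

head = DistanceHead()
fake_features = torch.randn(8, 256)  # features for 8 detected boxes
print(head(fake_features).shape)     # torch.Size([8])
```

The starter code integrates the distance output directly into the YOLOv7 detection head rather than using a separate module like this.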

Model Submission

In order to submit your model you must first export it to ONNX format. A Python script is provided for this in the starter code repository for the adapted YOLOv7 model including distance estimation. Please be aware that modifications to this script may be necessary, depending on the structure of your model.
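A generic PyTorch-to-ONNX export looks roughly as follows; the placeholder `model` stands in for your trained network, and the provided export script remains the reference:

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 8, 3)  # placeholder for your trained detector
model.eval()
dummy = torch.zeros(1, 3, 1024, 1024)  # evaluation uses 1024x1024 inputs
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["images"],
    output_names=["predictions"],
    opset_version=12,
)
```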

The submitted ONNX files must meet the following requirements:

  1. Input: the models are tested on images rescaled to 1024x1024.
  2. Output: the output tensor adheres to the YOLO format, where x, y are the absolute center coordinates, w, h are the absolute bounding-box dimensions, and objectness and class are used to compute the confidence score.
  3. Distance: this tensor is extended by the distance value, which must already be rescaled to meters (not normalized).
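Assuming a single class, one prediction row would then look like the following hypothetical sketch; the exact number of rows and class scores depends on your model:

```python
import numpy as np

# Hypothetical layout per prediction row, following the list above:
# [x, y, w, h, objectness, class_score, distance]
preds = np.random.rand(100, 7).astype(np.float32)  # fake output, 100 predictions

xywh = preds[:, 0:4]              # absolute box center and size in pixels
conf = preds[:, 4] * preds[:, 5]  # objectness * class score -> confidence
distance_m = preds[:, 6]          # distance, already in meters
print(xywh.shape, conf.shape, distance_m.shape)
```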

To verify the validity of your ONNX export, you can use the testscript_onnx.py script available in the repository. It closely resembles the script used for evaluation on the server.
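A minimal stand-in for such a sanity check with onnxruntime (testscript_onnx.py remains the authoritative reference):

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")
name = sess.get_inputs()[0].name
dummy = np.zeros((1, 3, 1024, 1024), dtype=np.float32)  # expected input size
outputs = sess.run(None, {name: dummy})
print([o.shape for o in outputs])  # inspect output tensor shapes
```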


If you have any questions, please join the MaCVi Support forum.