2nd Workshop on Maritime Computer Vision (MaCVi)

USV-based Obstacle Detection

Quick Start

  1. Download the LaRS dataset on the dataset page.
  2. Train your model on the LaRS training set.
  3. Upload a .json file with predictions on the detection upload page.

Overview

This year's obstacle detection and segmentation challenges feature the newly released LaRS dataset. Unlike existing benchmarks, LaRS focuses on scene diversity and covers a wide range of environments, including inland waters such as lakes, canals and rivers. To perform well on LaRS, you need to build a robust model that generalizes to a wide variety of situations. However, this challenge also recognizes that for early practical applications of USV collision avoidance and path planning, robust dynamic obstacle detection is beneficial even if its pixel-wise accuracy is somewhat lacking.

For the Obstacle Detection track, your task is to develop an obstacle detection method that detects the obstacles in the input image and represents the location of each with a rectangular bounding box. Following the LaRS dataset nomenclature, there are 8 classes of dynamic obstacles (boat, buoy, other, row boat, swimmer, animal, paddle board, float) and three classes of "stuff", that is, pixels in the image that do not correspond to any of the dynamic obstacle classes.

For the purpose of this challenge, all eight LaRS dynamic obstacle classes are treated as a single "obstacle" class and weighted equally. You may use any kind of processing backend to infer the bounding boxes, as long as your result consists of axis-aligned bounding boxes. You may train your method on the LaRS training set, which has been designed specifically for this use case. You may also use additional publicly available data for training; in that case, please disclose it during the submission process.
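
For illustration, here is a minimal sketch of collapsing the eight dynamic obstacle classes into a single "obstacle" label when preparing training targets. The COCO-style field names (category_id, bbox) are assumptions and may differ from the actual LaRS annotation files:

```python
# The eight LaRS dynamic obstacle classes, all mapped to one training label.
DYNAMIC_CLASSES = {
    "boat/ship", "row boat", "buoy", "float",
    "paddle board", "swimmer", "animal", "other",
}

def to_single_class(annotations, id_to_name):
    """Collapse LaRS dynamic obstacle annotations into one 'obstacle' class.

    annotations: list of dicts with 'category_id' and 'bbox' ([x, y, w, h]);
    id_to_name: mapping from category_id to category name. Both follow
    assumed COCO-style conventions, not a confirmed LaRS schema.
    """
    out = []
    for ann in annotations:
        if id_to_name[ann["category_id"]] in DYNAMIC_CLASSES:
            out.append({"bbox": ann["bbox"], "category_id": 1})  # 1 = obstacle
    return out
```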

Task

Create a single-class obstacle detection method that provides the axis-aligned bounding boxes of all the dynamic obstacles in an image. A dynamic obstacle in the USV world is anything that the USV needs to avoid and that is not plotted on the charts (e.g. boats, swimmers, buoys, platforms). Think of this challenge as follows: you are building an autonomous boat that has state-of-the-art electronic charts available, so your concern is not static "obstacles" (land and piers) but only the 8 classes of dynamic obstacles, as defined by the LaRS dataset nomenclature.

Dataset

LaRS consists of 4000+ USV-centric scenes captured in various aquatic domains. It includes per-pixel panoptic masks for water, sky and different types of obstacles. On a high level, obstacles are divided into i) dynamic obstacles, which are objects floating in the water (e.g. boats, buoys, swimmers) and ii) static obstacles, which are all remaining obstacle regions (shoreline, piers). Additionally, dynamic obstacles are categorized into 8 different obstacle classes: boat/ship, row boat, buoy, float, paddle board, swimmer, animal and other.
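
Since LaRS provides per-pixel panoptic masks while this track requires boxes, you will likely derive axis-aligned bounding boxes from the mask segments. Below is a minimal sketch, assuming segment ids are encoded in the PNG color channels following the COCO panoptic convention; verify this against the LaRS documentation:

```python
import numpy as np
from PIL import Image

def segment_bboxes(panoptic_png_path, dynamic_segment_ids):
    """Derive [x, y, w, h] boxes from a COCO-style panoptic PNG.

    Assumes each pixel's segment id is encoded as R + 256*G + 256**2*B
    (the COCO panoptic convention); check the LaRS docs for the exact format.
    dynamic_segment_ids: ids of segments labeled with a dynamic obstacle class.
    """
    rgb = np.asarray(Image.open(panoptic_png_path), dtype=np.uint32)
    seg_map = rgb[..., 0] + 256 * rgb[..., 1] + 256**2 * rgb[..., 2]
    boxes = {}
    for seg_id in dynamic_segment_ids:
        ys, xs = np.nonzero(seg_map == seg_id)
        if ys.size == 0:
            continue  # segment not present in this image
        x, y = int(xs.min()), int(ys.min())
        boxes[seg_id] = [x, y, int(xs.max()) - x + 1, int(ys.max()) - y + 1]
    return boxes
```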

This challenge is based on the dynamic obstacle classes in the LaRS dataset and treats all dynamic obstacle classes as a single class, "obstacle".

Evaluation metric

Your algorithm's final output should be detections of dynamic obstacles, represented as rectangular axis-aligned bounding boxes. For this challenge, we do not use the LaRS evaluation methodology. Since this is an object detection challenge, the MODS object detection evaluation metrics apply, as follows:

Detection performance is evaluated with the standard IoU metric for bounding box detections. A detection counts as a true positive (TP) if it has an IoU of at least 0.3 with the ground truth; otherwise it is counted as a false positive (FP), unless 75% or more of the pixels in the submitted bounding box overlap with image area denoted as "static obstacle" in the LaRS dataset nomenclature. This simply means that we do not care if you make false positive detections on land. The final score is an average F1 score, derived from the TP, FP, and FN counts. That's all! In the unlikely event of a tie, the tied submissions will be re-evaluated with a more restrictive IoU threshold to determine the final winner.
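
To make the scoring concrete, below is one plausible reading of this protocol for a single image. It is only a sketch: the greedy matching strategy and per-image aggregation are assumptions, and the official evaluation tool remains authoritative.

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes in [x, y, width, height] format."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def image_f1(preds, gts, static_mask, iou_thr=0.3, ignore_frac=0.75):
    """F1 for one image under the rules above, using greedy matching.

    preds, gts: lists of [x, y, w, h] boxes; static_mask: boolean HxW array
    that is True on pixels labeled "static obstacle".
    """
    matched, tp, fp = set(), 0, 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
        else:
            # Ignore false positives lying (>= 75%) on static obstacles.
            x, y, w, h = (max(0, int(round(v))) for v in p)
            region = static_mask[y:y + h, x:x + w]
            if region.size == 0 or region.mean() < ignore_frac:
                fp += 1
    fn = len(gts) - len(matched)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```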

Furthermore, we require every participant to submit the speed of their method, measured in frames per second (FPS). Please also indicate the hardware you used for benchmarking the speed. Lastly, you should indicate which datasets you used during training (including for pretraining).
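
A rough way to measure FPS is sketched below; the model and frames names are hypothetical placeholders, and the exact benchmarking protocol is up to you:

```python
import time

def measure_fps(model, frames, warmup=10):
    """Rough end-to-end FPS over preloaded input frames.

    model: a callable running single-image inference (hypothetical);
    frames: a list of preloaded input images.
    """
    for frame in frames[:warmup]:  # warm up caches / JIT / GPU clocks
        model(frame)
    # If inference runs on a GPU, synchronize the device before reading
    # the clock (e.g. torch.cuda.synchronize() when using PyTorch).
    start = time.perf_counter()
    for frame in frames:
        model(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```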

Participate

To participate in the challenge, follow these steps:

  1. Download the LaRS dataset (see the LaRS webpage).
  2. Train an object detection method on the LaRS training set, considering dynamic objects only! You may also use additional publicly available training data, but you must disclose it during submission.
  3. Use the evaluation tool we provide to analyze your results and upload them to the server.
  4. Also note the performance of your method (in FPS) and the hardware used.
  5. Create a submission .json file with your predictions.
  6. Upload your .json file on the upload page. You need to register in order to submit your results.
    • After submission, your results will be evaluated on the server. This may take several minutes. Please refresh the dashboard page to see results. The dashboard will also display potential errors in case of failed submissions (hover over the error icon). You may evaluate at most one submission per day (per challenge track). Failed attempts do not count towards this limit.

Submission format

The format of the predictions JSON file is very similar to the panoptic_annotations.json files from the LaRS dataset, except that each frame object provides a detections array describing detected obstacles:

The JSON file must contain the list of LaRS test images, followed by a list of "annotations", one annotation per image. Each annotation element must contain an id field that matches the corresponding image ID, and a detections array containing the detections produced by your algorithm. If there are no detections in the frame, the detections array should be empty. Otherwise, it should contain one object per detection, each consisting of an id that is unique within the image and a bbox (bounding box; [x, y, width, height]).
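
For illustration, here is a minimal sketch of assembling such a file in Python; verify the exact top-level field names against the panoptic_annotations.json files shipped with the dataset:

```python
import json

def write_predictions(image_ids, detections_per_image, out_path):
    """Write a predictions file in the structure described above.

    detections_per_image: mapping image_id -> list of [x, y, w, h] boxes.
    The "images" entry layout is an assumption based on the description.
    """
    annotations = []
    for image_id in image_ids:
        boxes = detections_per_image.get(image_id, [])
        annotations.append({
            "id": image_id,  # must match the corresponding image ID
            "detections": [
                {"id": k, "bbox": [float(v) for v in box]}
                for k, box in enumerate(boxes)  # ids unique within the image
            ],
        })
    payload = {
        "images": [{"id": image_id} for image_id in image_ids],
        "annotations": annotations,
    }
    with open(out_path, "w") as f:
        json.dump(payload, f)
```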

Terms and Conditions

In case of any questions regarding the challenge datasets or submission, please join the MaCVi Support forum.