Training data and baseline code are now available! The open development phase has started!

Tasks 🚀👩‍💻👨‍💻

The PANTHER Challenge focuses on pancreatic tumor segmentation across two key tasks, addressing both diagnostic imaging and treatment planning. Participants will develop AI models capable of automatically segmenting pancreatic tumors.

▶️ Task 1: Pancreatic Tumor Segmentation on Diagnostic MRIs

Accurate tumor segmentation on T1-weighted contrast-enhanced arterial phase MRIs is essential for staging and treatment planning, as this sequence typically provides the best tumor visibility. AI models developed for this task should segment the tumor, supporting consistent and efficient radiological assessments.

Dataset: 92 annotated cases, plus 367 unannotated MRIs from various sequences (e.g. venous phase, DWI).
Objective: Develop models that segment tumors effectively. While using the unlabeled data is optional, it may be beneficial for self-supervised learning, pretraining, or semi-supervised approaches to improve segmentation performance.

Figure 1: Examples from Task 1 MRI scans, with tumor segmentations highlighted in red

▶️ Task 2: Pancreatic Tumor Segmentation on MR-Linac MRIs

Precise segmentation on MR-Linac images is critical for adaptive radiotherapy planning, where real-time adjustments are needed during treatment. This task poses a few-shot learning challenge, requiring models to generalize from a limited dataset.

Dataset: 50 annotated MR-Linac cases.
Objective: Develop robust models capable of transfer learning and domain adaptation to achieve accurate segmentation despite the small dataset size. Participants may use the diagnostic MRI dataset from Task 1 for pretraining or domain adaptation, but this is not mandatory. Additionally, given the clinical treatment setting, inference time is a critical factor. Models must deliver real-time predictions during treatment, and a maximum inference time (T_max) will be considered for evaluation.
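The text above does not state the value of T_max or how inference time will be measured (for example, whether model loading or file I/O counts). As a rough local check only, the following Python sketch times a single prediction against a budget. The names `timed_inference`, `predict`, and `t_max_seconds` are hypothetical and not part of the challenge tooling.

```python
import time


def timed_inference(predict, image, t_max_seconds):
    """Run predict(image) and report whether it finished within the budget.

    predict        -- hypothetical callable wrapping your model
    image          -- whatever input object predict expects
    t_max_seconds  -- assumed time budget; the real T_max is set by the organizers
    """
    start = time.perf_counter()
    mask = predict(image)
    elapsed = time.perf_counter() - start
    # Returns the prediction, the wall-clock time, and a pass/fail flag.
    return mask, elapsed, elapsed <= t_max_seconds
```

Calling it as `mask, seconds, ok = timed_inference(model_fn, scan, t_max_seconds=...)` on a few training cases gives an early indication of whether a model is likely to fit a time limit, measured on your own hardware rather than the evaluation platform.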

Figure 2: Examples from Task 2 MRI scans, with tumor segmentations highlighted in green

📌 Note: Both pancreas and tumor annotations will be provided for the training dataset to help participants familiarize themselves with the organ’s anatomy and tumor locations. Participants may use these labels at their discretion. However, the expected output for evaluation is a segmentation mask with the same dimensions as the input MRI, where 0 = background and 1 = tumor.
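As an illustration of the expected output format, the sketch below collapses a multi-label training annotation into a binary tumor mask and checks that a prediction has the right shape and values. The integer used for the tumor in the training labels is not stated here, so `tumor_label` is an assumption to verify against the dataset; both function names are hypothetical.

```python
import numpy as np


def to_binary_tumor_mask(label_map, tumor_label):
    """Return a mask with 1 where label_map equals tumor_label, else 0.

    tumor_label is an assumed value; check the dataset documentation.
    """
    label_map = np.asarray(label_map)
    return (label_map == tumor_label).astype(np.uint8)


def check_prediction(mask, image_shape):
    """True if mask matches the image shape and contains only 0 and 1."""
    mask = np.asarray(mask)
    same_shape = mask.shape == tuple(image_shape)
    only_binary = bool(np.isin(mask, (0, 1)).all())
    return same_shape and only_binary
```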

Evaluation 📊

📈 Performance Metrics

The evaluation of algorithms will be based on a combination of overlap, boundary, and volumetric metrics to ensure a comprehensive assessment of segmentation performance. Emphasis will be placed on tumor segmentation metrics, reflecting the clinical importance of accurately delineating pancreatic tumors.

Dice Similarity Coefficient (DSC): Measures the spatial overlap between predicted and ground truth segmentations for both the pancreas and tumor.
5 mm Surface Dice: Evaluates boundary accuracy by measuring the agreement between the predicted and ground truth surfaces, counting surface points as matching when they lie within a 5 mm tolerance of each other.
Mean Average Surface Distance (MASD): Measures the mean of averaged minimum distances between boundaries of predicted and ground truth segmentations.
Hausdorff Distance 95% (HD95): Reports the 95th percentile of the distances between the predicted and ground truth boundaries, which makes it less sensitive to outliers than the maximum Hausdorff distance.
Root Mean Square Error (RMSE) on Tumor Burden: Quantifies the error in tumor volume estimation.
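To make two of these metrics concrete, here is a minimal NumPy sketch of the Dice coefficient and of an RMSE over tumor volumes. It is not the official evaluation script. It assumes that tumor burden means tumor volume computed from the voxel count and voxel spacing, reports volume in millilitres, and scores two empty masks as a Dice of 1.0; the official implementation may differ on each of these points.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)


def tumor_volume_ml(mask, spacing_mm):
    """Tumor volume in millilitres from a binary mask and voxel spacing in mm."""
    voxel_volume_mm3 = float(np.prod(spacing_mm))
    return float(np.asarray(mask, dtype=bool).sum() * voxel_volume_mm3 / 1000.0)


def tumor_burden_rmse(pred_volumes, truth_volumes):
    """Root mean square error between predicted and reference volumes."""
    diff = np.asarray(pred_volumes, dtype=float) - np.asarray(truth_volumes, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

The boundary metrics (Surface Dice, MASD, HD95) need surface extraction and distance computations and are left out of this sketch.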

To assess your predictions locally using PANTHER's performance metrics, please run the evaluation script available here.

🏅 Ranking

To rank team performance for each task, each submitted algorithm is evaluated on every test case in the evaluation set by computing individual metric values. These values are then averaged across all test cases to yield a single score per metric. Teams receive a ranking for each metric, and their overall ranking is determined by averaging these positions. The winner for each task is the team with the best average ranking.
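The ranking procedure described above can be sketched in plain Python as follows. This is an illustration under assumptions, not the organizers' implementation: the function name, the metric keys, and the tie rule (tied teams share the best position) are all hypothetical, and the caller must say for each metric whether higher or lower is better.

```python
from statistics import mean


def rank_teams(case_scores, higher_is_better):
    """Return {team: mean rank across metrics}; lower is better.

    case_scores      -- {team: {metric: [value per test case]}}
    higher_is_better -- {metric: True if larger values are better}
    """
    # Step 1: average each metric over all test cases.
    averaged = {
        team: {metric: mean(values) for metric, values in metrics.items()}
        for team, metrics in case_scores.items()
    }
    # Step 2: rank teams per metric (1 = best); ties share the best position.
    positions = {team: [] for team in averaged}
    for metric, higher in higher_is_better.items():
        scores = {team: averaged[team][metric] for team in averaged}
        for team, score in scores.items():
            better = sum(
                1 for other in scores.values()
                if (other > score if higher else other < score)
            )
            positions[team].append(better + 1)
    # Step 3: the overall score is the mean of the per-metric positions.
    return {team: mean(ranks) for team, ranks in positions.items()}
```

For example, a team that places first on DSC and second on HD95 receives an overall score of 1.5, and the team with the lowest overall score wins the task.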