Lucaswaung/list-of-surgical-tool-datasets

 
 

Description

List of surgical tool datasets organised by task. A list of data repositories is also displayed at the bottom. Please open an issue if you see a relevant open dataset that is missing or if you find inaccurate information.

Tool classification

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| Cholec80 | 80 videos of cholecystectomy surgeries performed by 13 surgeons. The videos are captured at 25 fps. The dataset is labelled with phase annotations (at 25 fps) and tool presence annotations (at 1 fps). A tool is defined as present in an image if at least half of the tool tip is visible. | 86K | 80 | Twinanda et al. 2016 |
| CATARACTS | This dataset consists of 50 cataract surgeries. It was annotated for two main tasks: surgical tool presence detection and surgical activity recognition. It was divided into two sets (train, test) for the surgical tool presence detection task and three sets (train, dev, test) for the activity recognition task. | 900K | 50 | Al Hajj et al. 2019 |
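
The Cholec80 entry above mixes two annotation rates: phases at 25 fps and tool presence at 1 fps, so only one frame in 25 carries a tool label. The following is a minimal sketch of that alignment. It assumes frames are indexed from 0 and that the labelled frames fall on whole-second boundaries, which should be checked against the dataset's annotation files. The function name `tool_labelled_frames` is hypothetical.

```python
def tool_labelled_frames(num_frames: int, video_fps: int = 25, label_fps: int = 1) -> list[int]:
    """Return the indices of frames that carry a tool-presence label.

    Assumption: labels sit on every (video_fps // label_fps)-th frame,
    starting at frame 0.
    """
    if video_fps % label_fps != 0:
        raise ValueError("video_fps must be a multiple of label_fps")
    step = video_fps // label_fps
    return list(range(0, num_frames, step))


# A 4-second clip at 25 fps has 100 frames, 4 of which are labelled.
print(tool_labelled_frames(100))  # [0, 25, 50, 75]
```

Phase labels, being available at the full 25 fps, need no such subsampling.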

Tool segmentation

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| RMIT | This dataset consists of three image sequences recorded during retinal microsurgery. For each image sequence, the instrument position and size have been hand-annotated. | 1.5K | 4 | Sznitman et al. 2012 |
| InstrumentCrowd | The training data was generated from a total of 6 surgical procedures, three laparoscopic adrenalectomies and three laparoscopic pancreatic resections. From each surgery, 20 images containing one or several medical instruments were extracted, yielding 120 images in total. | 120 | 6 | Maier-Hein et al. 2014 |
| NeuroSurgicalTools | Consists of 2476 monocular images (1221 for training and 1255 for testing) coming from in vivo neurosurgeries. The resolution of the images varies from 612×460 to 1920×1080. | 2.5K | 14 | Bouget et al. 2015 |
| EndoVis2015 | Note: provides segmentation and keypoint data; no 3D data is provided. 40 2D in-vivo images from 4 laparoscopic colorectal surgeries. Each pixel is labelled as either background, shaft or manipulator (~160 2D images and annotations in total). 4x 45-second 2D image sequences of at least one Large Needle Driver instrument in an ex-vivo setup. Each pixel is labelled as either background, shaft, head or clasper. | 9K | 8 | N/A |
| EndoVis2017 | 8x 225-frame robotic surgical videos, captured at 2 Hz, with manually labelled tool parts and types. The testing set contains 8x 75-frame videos and 2x 300-frame videos. | 1.8K | 8 | Allan et al. 2019 |
| EndoVis2018 | Note: provides stereo image pairs and camera parameters. The training dataset is made up of 16 robotic nephrectomy procedures recorded using da Vinci Xi systems in porcine labs (subsampled to 2 fps). Sequences with little or no motion are manually removed to leave 149 frames per procedure. Video frames are 1280x1024; the left and right eye camera images are provided, as well as the stereo camera calibration parameters. Labels are only provided for the left image. | 2.4K | 16 | Allan et al. 2020 |
| ROBUST-MIS2019 | Procedures in rectal resection and proctocolectomy. A training case encompasses a 10-second video snippet in the form of 250 endoscopic image frames and a reference annotation for the last frame. In the annotated frame, a "0" indicates the absence of a medical instrument and the numbers "1", "2", ... represent different instances of medical instruments. | 10K | 30 | Ross et al. 2020 |
| Kvasir-Instrument | The Kvasir-Instrument dataset consists of 590 annotated frames comprising GI procedure tools such as snares, balloons, biopsy forceps, etc. The resolution of the images in the dataset varies from 720x576 to 1280x1024. | 590 | N/A | Jha et al. 2020 |
| CholecSeg8k | This dataset contains 8080 laparoscopic cholecystectomy image frames extracted and annotated from 17 video clips in Cholec80. | 8K | 17 | Hong et al. 2020 |
| RoboTool | 514 images extracted from the videos of 20 freely available robotic surgical procedures and annotated for binary tool-background segmentation. | 514 | 20 | Garcia-Peraza-Herrera et al. 2021 |
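
The ROBUST-MIS2019 entry above describes an instance-ID encoding: 0 marks pixels without an instrument, and 1, 2, ... mark separate instrument instances. As a rough sketch of how such a mask could be split into one binary mask per instrument, the example below uses plain nested lists so that it runs without dependencies. A real pipeline would more likely load the annotation image into an array library. The function name `split_instances` and the toy mask are illustrative assumptions, not part of the dataset tooling.

```python
def split_instances(mask: list[list[int]]) -> dict[int, list[list[int]]]:
    """Split an instance-ID mask into one binary mask per instrument.

    0 is treated as "no instrument"; every other value is an instance ID.
    """
    instance_ids = sorted({value for row in mask for value in row if value != 0})
    return {
        instance_id: [[1 if value == instance_id else 0 for value in row] for row in mask]
        for instance_id in instance_ids
    }


# Toy 3x3 annotation with two instrument instances.
toy_mask = [
    [0, 1, 1],
    [0, 0, 2],
    [2, 2, 0],
]
binary_masks = split_instances(toy_mask)
print(sorted(binary_masks))  # [1, 2]
print(binary_masks[2])       # [[0, 0, 0], [0, 0, 1], [1, 1, 0]]
```

Taking the union of all per-instance masks gives the binary tool-background mask used by datasets such as RoboTool.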

Tool-tissue action detection

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| CholecT50 | Every frame is annotated with labels from the triplet: instrument, verb and target. | N/A | N/A | N/A |
| SARAS-MESAD2021 | Note: provides action annotations and bounding boxes. The dataset contains monocular digital recordings from the da Vinci Xi robotic system. It has two sub-datasets: MESAD-Real and MESAD-Phantom. MESAD-Real covers prostatectomy procedures recorded on human patients. It contains four sessions of complete prostatectomy procedures performed by expert surgeons on real patients. MESAD-Phantom is also designed for surgeon action detection during prostatectomy, but is composed of videos captured during procedures on phantoms used for the training of surgeons. MESAD-Real comprises 21 action classes and MESAD-Phantom has a smaller list of 14 action classes. The two datasets share 11 action classes. | N/A | 4 | N/A |
| PSI-AVA | PSI-AVA is a dataset designed for holistic surgical scene understanding. It contains approximately 20.45 hours of the surgical procedure performed by three expert surgeons and annotations for both long-term reasoning (phase and step recognition) and short-term reasoning (instrument detection and novel atomic action recognition) in robot-assisted radical prostatectomy videos. | N/A | 8 | Valderrama et al. 2022 |

Skill assessment and workflow recognition

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| JIGSAWS | Note: provides 3D coordinate information, but not the transformation from the 3D coordinates to the camera. The JIGSAWS dataset consists of three components: kinematic data (Cartesian positions, orientations, velocities, angular velocities and gripper angle describing the motion of the manipulators), video data (stereo video captured from the endoscopic camera), and manual annotations of gestures (atomic surgical activity segment labels) and skill (global rating score using modified objective structured assessments of technical skills). | N/A | N/A | Gao et al. 2014 |
| Cataract-101 | This dataset contains 101 videos of cataract surgeries annotated with two kinds of information: the anonymous ID and experience level of the operating surgeon, and the starting points of quasi-standardized operation phases in the videos. | 1.3M | 101 | Schoeffmann et al. 2018 |
| HeiCo | The dataset contains data from the ROBUST-MIS 2019 challenge and the Surgical Workflow Challenges from EndoVis 2017 and 2018. | 10K | 30 | Maier-Hein et al. 2020 |
| MISAW | The dataset contains 27 micro-anastomosis training sequences and is composed of the following information: stereoscopic video, kinematic data, and workflow annotation at 3 levels of granularity (phases, steps, and activities). | N/A | 27 | Huaulmé et al. 2021 |
| PETRAW | Note: download in progress. Dataset for online automatic recognition of surgical workflow using both kinematic and stereoscopic video information on a micro-anastomosis training task. | N/A | 100 | N/A |

Image-to-image translation

| Dataset | Brief description | Paper |
| --- | --- | --- |
| Laparoscopic Image to Image Translation | Note: simulated data; provides depth, segmentation, etc. Synthetic images in a 3D environment roughly resembling laparoscopic liver surgery scenes. A group of Generative Adversarial Networks (GANs) is trained to translate these images to look like real laparoscopic images. After the training process, the translated images along with their labels can be used as training data for a target task. | Pfeiffer et al. 2019 |

Multi-task datasets

| Dataset | Brief description | Images/Videos | Procedures | Paper |
| --- | --- | --- | --- | --- |
| ART-Net | This dataset consists of non-robotic tools with annotated tool presence, tool segmentation, and instrument geometric primitives (mid-line, edge-line, tooltip). The images come from laparoscopic hysterectomy videos. This dataset also contains tool presence annotations for another set of 3000 images, namely 1500 positive and 1500 negative images; some positive images contain multiple tools. 4270 images are labelled for tool detection. If the tool shaft is not visible at all, the image is marked as negative. When a small part of the tool shaft is visible, the image is marked as positive. For segmentation and geometric primitive extraction, 635 images are annotated. | Different for each task | 29 | Hasan et al. 2021 |
| HeiSurF | Surgical Workflow Analysis and Full Scene Segmentation. All surgeries were annotated framewise for surgical phases by surgical experts. Surgical actions, instrument usage and surgical skill levels were annotated. The surgeries recorded are laparoscopic gallbladder removals (cholecystectomy). The data for segmentation consists of two parts. In the first part of the training dataset, frames at 2-minute intervals from 24 operations (the same operations as for the workflow challenge) are provided. The second part of the training dataset will consist of brief sequences taken from each video, where frames will be segmented at 1 fps. To ensure anonymity, frames corresponding to extra-abdominal views are censored by entirely white (RGB 255 255 255) frames. The testing dataset of 9 videos will not be released. | 24 videos | 30 | HeiSurf Presentation |
| AutoLaparo | AutoLaparo contains videos of laparoscopic hysterectomy. Three sub-datasets are designed for the following three tasks: surgical workflow recognition, laparoscope motion prediction, and instrument and key anatomy segmentation. The videos are recorded at 25 fps with a standard resolution of 1920×1080 pixels. The duration of the videos ranges from 27 to 112 minutes due to the varying difficulty of the surgeries. After pre-processing, the average duration is 66 minutes and the total duration is 1388 minutes.<br>Annotations:<br>• Surgical workflow recognition: the hysterectomy procedure is divided into 7 phases and each frame is annotated with a phase label.<br>• Laparoscope motion prediction: 300 clips are carefully selected from Phases 2-4 of the 21 videos and each clip lasts for 10 seconds. Seven types of motion modes are defined, including one static mode and six non-static modes: Up, Down, Left, Right, Zoom-in, and Zoom-out.<br>• Instrument and key anatomy segmentation: for each clip in the motion prediction task, six frames are sampled at 1 fps and annotated with pixel-wise segmentation. Four types of instruments and one key anatomy are annotated in the dataset: grasping forceps, LigaSure, dissecting and grasping forceps, electric hook, and uterus. | Different for each task | 21 | Wang et al. 2022 |
| SurgToolLoc | This dataset contains clips of surgical training exercises using the da Vinci robotic system. In them, trainees perform standard activities such as dissecting tissue and suturing. There are 24,695 video clips, each 30 seconds long and captured at 60 fps with a resolution of 1280x720 pixels.<br>• Training data: for each 30-second clip in the training set, only tool presence labels indicating which robotic tools are installed are provided. For the extent of each clip, the same three tools (out of 14 possible) are installed. However, some may be obscured or temporarily invisible, i.e. there is noise in the tool presence labels of the training set.<br>• Testing data: the test set has tool presence labels and also bounding boxes around the robotic tools. The videos are sampled at 1 Hz. | 44M | N/A | N/A |
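
The HeiSurF entry above states that extra-abdominal views are replaced by entirely white (RGB 255 255 255) frames, so it can be useful to drop them before training. The following is a minimal sketch under two assumptions: each frame is a nested list of rows of (R, G, B) values, and the censored frames decode to exactly 255 in every channel. Compressed video may need a small tolerance instead. The names `is_censored` and `drop_censored` are hypothetical.

```python
WHITE = (255, 255, 255)


def is_censored(frame) -> bool:
    """Return True if every pixel of the frame is pure white.

    `frame` is a sequence of rows, each row a sequence of (R, G, B) values.
    An empty frame is not treated as censored.
    """
    pixels = [tuple(pixel) for row in frame for pixel in row]
    return bool(pixels) and all(pixel == WHITE for pixel in pixels)


def drop_censored(frames):
    """Keep only the frames that are not fully white."""
    return [frame for frame in frames if not is_censored(frame)]


# Two toy 1x2 frames: one censored, one showing tissue-like colours.
white_frame = [[(255, 255, 255), (255, 255, 255)]]
tissue_frame = [[(180, 60, 50), (255, 255, 255)]]
print(len(drop_censored([white_frame, tissue_frame])))  # 1
```

A frame with a single non-white pixel is kept, since specular highlights can produce isolated white pixels in ordinary laparoscopic views.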

Organ segmentation datasets

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| Dresden Surgical Anatomy Dataset | The Dresden Surgical Anatomy Dataset provides semantic segmentations of eight abdominal organs (colon, liver, pancreas, small intestine, spleen, stomach, ureter, vesicular glands), the abdominal wall and two vessel structures (inferior mesenteric artery, intestinal veins) in laparoscopic view. The majority of patients (26/32) were male, the overall average age was 63 years and the mean body mass index (BMI) was 26.75 kg/m² (Table 1). All included patients had a clinical indication for the surgical procedure. Surgeries were performed using a standard Da Vinci® Xi/X Endoscope with Camera (8 mm diameter, 30° angle, Intuitive Surgical, Item code 470057) and recorded using the CAST-System (Orpheus Medical GmbH, Frankfurt a.M., Germany). Each recording was saved at a resolution of 1920x1080 pixels in MPEG-4 format and lasts between about two and ten hours. | 13K | 32 | Carstens et al. 2023 |
| SurgAI3.8K | The dataset contains the following annotations: uterus segmentation, uterus contours and the regions of the left and right fallopian tube junctions. | 3.8K | 79 | Zadeh et al. 2023 |

Bleeding segmentation datasets

| Dataset | Brief description | Images | Procedures | Paper |
| --- | --- | --- | --- | --- |
| Rabbani et al. 2022 | From the 60-hour video footage, 750 frames are extracted for training and 199 for testing. The authors downsample all the images to 854×480 pixels for training. | 949 labelled images and over 60 hours of unlabelled video | 96 | Rabbani et al. 2022 |

Repositories holding multiple datasets
