Medical Image Segmentation AI built from Abidjan 🇨🇮 - SAM 2 + Gradio
I asked an LLM to explore possible improvements. Since this is a medical-domain project, I’d also recommend checking out the Hugging Science Discord.
This is a promising first prototype. Making a SAM 2 + Gradio demo public on Hugging Face is a good way to invite feedback, especially for a student-built project from Abidjan with an interest in African medical imaging datasets.
My main suggestion is to sharpen the positioning. I would avoid presenting the current version as a clinical AI system or a fully automatic medical segmentation model. A more accurate and stronger framing would be:
A SAM 2-assisted medical image annotation demo for research and education.
That wording is safer and more technically precise. It also gives you a clearer roadmap: not “replace the clinician,” but “help researchers create, inspect, refine, and export candidate annotations.”
High-level priorities
If I were improving this project, I would focus on these five areas:
- Positioning and safety — make clear this is research/education only, not clinical software.
- Interactive prompting — replace hidden or fixed prompts with visible click / box / negative-point workflows.
- Annotation export — move from mask metadata to standard dataset formats.
- Evaluation — add Dice/IoU and failure cases, even on a small public sample set.
- Medical / HF ecosystem alignment — reference MedSAM2, Sam2Rad, MedSegDB, HF4H, Hugging Science, Model Cards, Dataset Cards, and privacy-aware data governance.
1. Reposition the project
I would rename or subtitle it as something like:
SAM 2-assisted medical image annotation demo
or:
Research demo for prompt-based medical image mask generation
I would avoid phrases like:
- “clinical AI”
- “diagnostic segmentation”
- “upload any MRI / X-ray / scan”
- “automatically detects medical regions of interest”
- “ready for clinical annotation”
- “confidence score” without qualification
Suggested wording:
This Space generates candidate segmentation masks for 2D public or de-identified medical image slices. It is intended for research, education, and annotation workflow exploration only. It is not for diagnosis, treatment planning, triage, measurement, or clinical decision-making.
This matters because SAM 2 is a general promptable segmentation model for images and videos. It is powerful, but it is not automatically a validated medical segmentation system. See also the SAM 2 GitHub repository and SAM 2 paper.
2. Add a visible medical safety and privacy notice
For a public HF Space that accepts medical-looking images, I would put a short warning directly in the interface, not only in the README.
Suggested UI notice:
Research / education only. Do not use for diagnosis, treatment planning, triage, or clinical decisions. Do not upload patient-identifiable images. Use only public, synthetic, or properly de-identified images.
I would also add a storage note:
Uploaded images are processed for inference only. Do not upload patient data. If this Space stores, logs, caches, or persists files, that behavior must be explicitly documented.
Medical image privacy is not only about filenames. Patient-identifying information can appear in:
- DICOM metadata
- private DICOM tags
- free-text fields
- burned-in pixel annotations
- ultrasound overlays
- screenshots or exported PNG/JPEG images
- filenames and logs
Useful references:
- NCI Medical Imaging De-Identification Project
- DICOM Attribute Confidentiality Profiles
- TCIA De-identification Knowledge Base
- HF Spaces storage documentation
Even if your Space currently uses only temporary files, it is still better to say so explicitly.
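Since filenames and logs are on the list above, here is a minimal sketch of one way to keep the original upload name out of storage paths and log lines. The function name `opaque_upload_name` and the allowed-extension set are my own assumptions for illustration, not something taken from the current Space.

```python
import os
import uuid

# Assumption: the demo only accepts these 2D formats.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def opaque_upload_name(original_filename: str) -> str:
    """Return a random name that keeps only a whitelisted file extension.

    The original name (which may contain a patient name or ID) is never
    copied into the result, so it cannot leak through paths or logs.
    """
    _, ext = os.path.splitext(original_filename)
    ext = ext.lower()
    if ext not in ALLOWED_EXTENSIONS:
        ext = ""  # drop unknown extensions instead of echoing user input
    return uuid.uuid4().hex + ext


if __name__ == "__main__":
    # Log the opaque name only, never the uploaded filename.
    print("received upload:", opaque_upload_name("DOE_JOHN_chest.PNG"))
```

This only addresses filenames. Burned-in text and metadata still need a separate de-identification step.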
3. Replace fixed internal prompts with interactive prompts
The most important technical upgrade is the prompt interface.
If the app currently uses fixed or heuristic points internally, then it is not really segmenting the user’s intended medical target. It is asking SAM 2 to segment whatever those internal points happen to indicate. That can work for a demo, but it is fragile and hard to evaluate.
SAM-style models are strongest when the target is specified interactively. I would expose the prompt controls:
- positive point: “include this”
- negative point: “exclude this”
- bounding box: “segment the object inside this box”
- optional rough mask or scribble
- multiple candidate masks
- mask score per candidate
- choose best mask
- refine with more prompts
- export selected mask
A better workflow would be:
- Upload a 2D public/de-identified image.
- Select target with a click or box.
- Generate candidate masks.
- Add negative points if needed.
- Choose or refine the best mask.
- Assign a label.
- Export standard annotation files.
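To make the workflow above concrete, here is a rough sketch of how the visible prompts could be stored on the app side before they are handed to the model. The class `PromptSession` and its methods are hypothetical. The output keys mirror the `point_coords`, `point_labels`, and `box` arguments used by SAM-style image predictors, with label 1 for "include" and 0 for "exclude", but please check them against the predictor version you actually use.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PromptSession:
    """User-visible prompts collected for one target on one image (hypothetical helper)."""

    points: List[List[int]] = field(default_factory=list)  # [x, y] pixel coordinates
    labels: List[int] = field(default_factory=list)        # 1 = include, 0 = exclude
    box: Optional[List[int]] = None                        # [x0, y0, x1, y1]

    def add_positive(self, x: int, y: int) -> None:
        self.points.append([x, y])
        self.labels.append(1)

    def add_negative(self, x: int, y: int) -> None:
        self.points.append([x, y])
        self.labels.append(0)

    def set_box(self, x0: int, y0: int, x1: int, y1: int) -> None:
        # Normalize so the box is valid even if the user drags right-to-left.
        self.box = [min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1)]

    def undo_last_point(self) -> None:
        if self.points:
            self.points.pop()
            self.labels.pop()

    def as_predictor_inputs(self) -> Dict[str, object]:
        """Return plain lists; convert to arrays right before calling the model."""
        return {
            "point_coords": list(self.points) if self.points else None,
            "point_labels": list(self.labels) if self.labels else None,
            "box": list(self.box) if self.box else None,
        }
```

Keeping the prompts in a small object like this also gives you the prompt history and prompt coordinates needed later for export.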
Useful UI references:
- SAM2 Image Predictor Space
- Gradio ImageEditor
- Gradio AnnotatedImage
- Label Studio SAM integration
- Using SAM2 with Label Studio for image annotation
For the current project, the key change is:
hidden fixed prompts → visible human-in-the-loop prompts
That one change would make the app much more credible.
4. Be precise about supported input formats
If the app accepts PIL / RGB images through Gradio, then it is a 2D image-slice demo, not a DICOM/MRI/CT-volume application.
I would state:
- Supported now: PNG/JPEG 2D images or exported slices.
- Not yet supported: DICOM, NIfTI, 3D CT/MRI volumes, DICOM SEG, PACS, clinical viewers.
- Not validated for: diagnosis, measurement, treatment planning, or automated reporting.
Suggested wording:
This demo currently works on 2D image files. It does not yet preserve DICOM metadata, voxel spacing, orientation, slice thickness, series context, or 3D anatomy.
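One way to enforce that boundary in code is to look at the first bytes of the upload rather than trusting the extension. The sketch below is an assumption about how the check could look, with hypothetical function names. It relies on the standard PNG and JPEG signatures and on the "DICM" marker that DICOM Part 10 files carry after a 128-byte preamble (some DICOM files omit the preamble, so this is not a complete detector).

```python
from typing import Tuple

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"
GZIP_MAGIC = b"\x1f\x8b"  # e.g. a compressed .nii.gz volume


def sniff_format(header: bytes) -> str:
    """Guess the file type from its first bytes (pass at least 132 bytes)."""
    if header.startswith(PNG_MAGIC):
        return "png"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    if len(header) >= 132 and header[128:132] == b"DICM":
        return "dicom"
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


def check_supported(header: bytes) -> Tuple[bool, str]:
    """Return (accepted, message) for the 2D-only demo."""
    kind = sniff_format(header)
    if kind in ("png", "jpeg"):
        return True, "OK: 2D image file."
    if kind == "dicom":
        return False, "DICOM is not supported yet. Export a de-identified 2D slice as PNG/JPEG."
    if kind == "gzip":
        return False, "Compressed volumes (for example NIfTI) are not supported yet."
    return False, "Unsupported file type. This demo accepts PNG/JPEG only."
```

Rejecting DICOM with a clear message is better than silently converting it, because conversion would drop spacing and orientation without telling the user.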
This makes the project look more serious, not less. Medical imaging users will trust a tool more when its boundaries are clear.
Future 3D directions:
- DICOM / NIfTI loader
- voxel spacing handling
- axial / coronal / sagittal viewing
- slice propagation
- 3D labelmap export
- NIfTI mask export
- DICOM SEG as an advanced target
- integration with MONAI Label, 3D Slicer, or OHIF
5. Make the export genuinely annotation-friendly
The current JSON sounds useful, but I would not call it “training-dataset ready” unless it supports standard formats. A JSON with bounding box, area, coverage, and score is better described as mask metadata.
A stronger export bundle would include:
- mask.png: binary mask
- overlay.png: visual preview
- annotation.json: internal metadata
- coco.json: COCO instance segmentation
- RLE mask
- label name / class name
- model name and checkpoint
- prompt type: point / box / mask / scribble
- prompt coordinates
- candidate mask score
- human-reviewed flag
- correction history
- image dimensions and preprocessing notes
For annotation-tool compatibility:
- CVAT COCO format
- CVAT segmentation mask format
- Label Studio SAM integration
For future 3D medical imaging:
- NIfTI labelmap
- voxel spacing
- orientation
- source volume ID after de-identification
- DICOM SEG, if you move toward research imaging interoperability
I would phrase it like this:
The current JSON is useful mask metadata. To make it training-dataset friendly, add PNG masks, COCO JSON, RLE, CVAT-compatible export, and Label Studio prediction JSON.
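As a starting point for the COCO and RLE part, here is a small pure-Python sketch. It assumes the mask is a list of rows of 0/1 values. The RLE follows the uncompressed COCO convention as I understand it: runs are counted in column-major order and the first count is always a run of zeros. The `extra` block and the function names are my own assumptions and not part of the COCO format.

```python
import json


def mask_to_rle(mask):
    """Uncompressed COCO-style RLE: column-major runs, first run counts zeros."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    counts, previous, run = [], 0, 0
    for x in range(width):          # column-major: walk down each column
        for y in range(height):
            value = 1 if mask[y][x] else 0
            if value != previous:
                counts.append(run)
                run, previous = 0, value
            run += 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def mask_bbox_and_area(mask):
    """Return ([x, y, width, height], area) in pixels; zeros for an empty mask."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) for v in row if v]
    if not xs:
        return [0, 0, 0, 0], 0
    return [min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1], len(xs)


def build_annotation(mask, image_id, annotation_id, category_id,
                     prompt, mask_score, human_reviewed=False):
    bbox, area = mask_bbox_and_area(mask)
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "segmentation": mask_to_rle(mask),
        "bbox": bbox,
        "area": area,
        # Some importers expect polygons when iscrowd is 0; check your target tool.
        "iscrowd": 0,
        # Non-standard keys (assumption): provenance for later review.
        "extra": {
            "prompt": prompt,
            "mask_score": mask_score,
            "human_reviewed": human_reviewed,
        },
    }


if __name__ == "__main__":
    demo_mask = [[0, 1, 1],
                 [0, 1, 0]]
    ann = build_annotation(demo_mask, image_id=1, annotation_id=1, category_id=2,
                           prompt={"type": "point", "points": [[1, 0]], "labels": [1]},
                           mask_score=0.9)
    print(json.dumps(ann, indent=2))
```

I would validate the output by importing one file into the annotation tool you target before advertising compatibility.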
6. Add semantic labels, not only masks
A segmentation mask alone is not enough for most medical datasets. You also need to know what the mask represents.
For example:
- lung
- liver
- kidney
- tumor
- lesion
- bone
- vessel
- polyp
- optic disc
- cell nucleus
- instrument
- background / artifact
This matters because “mask 1” is not a reusable medical annotation. A training dataset needs a label schema.
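A label schema does not have to be complicated. Below is a minimal sketch using a few of the example labels above. The IDs, the version string, and the helper names are placeholders I made up for illustration, and a real schema should be agreed with the people doing the annotation.

```python
# Hypothetical schema: IDs and version are placeholders, not a standard.
LABEL_SCHEMA = {
    "version": "0.1",
    "labels": [
        {"id": 0, "name": "background / artifact"},
        {"id": 1, "name": "lung"},
        {"id": 2, "name": "liver"},
        {"id": 3, "name": "kidney"},
        {"id": 4, "name": "lesion"},
    ],
}


def label_id(name, schema=LABEL_SCHEMA):
    """Look up a label ID by name; reject anything outside the schema."""
    wanted = name.strip().lower()
    for entry in schema["labels"]:
        if entry["name"] == wanted:
            return entry["id"]
    raise ValueError("unknown label: %r (add it to the schema first)" % name)


def to_coco_categories(schema=LABEL_SCHEMA):
    """Convert the schema into a COCO-style 'categories' list, skipping background."""
    return [
        {"id": entry["id"], "name": entry["name"]}
        for entry in schema["labels"]
        if entry["id"] != 0
    ]
```

Rejecting unknown names at export time is what stops "mask 1" style labels from ending up in the dataset.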
Relevant resources:
- BiomedParseData
- BiomedParse model
- BiomedParse GitHub
BiomedParse is interesting because it frames biomedical image understanding as segmentation + detection + recognition across multiple modalities. That is a useful direction for this project: not only “where is the mask?” but also “what is the mask?”
7. Add a small evaluation page
A public demo becomes much stronger if it includes even a small benchmark.
Minimum useful metrics:
- Dice
- IoU / Jaccard
- 95% Hausdorff distance, where appropriate
- Normalized Surface Dice, for 3D later
- click count vs Dice
- point prompt vs box prompt
- SAM 2 vs MedSAM / SAM-Med2D / MedSAM2
- modality-wise results
- failure examples
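Dice and IoU are simple enough to compute without extra dependencies. Here is a small sketch, assuming both masks are same-sized lists of rows of 0/1 values. Returning 1.0 when both masks are empty is a convention I chose for this example, so state whichever convention you use next to the reported numbers.

```python
def dice_and_iou(pred, truth):
    """Return (dice, iou) for two binary masks given as lists of rows."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    intersection = pred_sum = truth_sum = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            p, t = bool(p), bool(t)
            pred_sum += p
            truth_sum += t
            intersection += p and t

    union = pred_sum + truth_sum - intersection
    if union == 0:
        return 1.0, 1.0  # both masks empty: treated as perfect agreement here

    dice = 2.0 * intersection / (pred_sum + truth_sum)
    iou = intersection / union
    return dice, iou


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    dice, iou = dice_and_iou(predicted, reference)
    print("Dice = %.3f, IoU = %.3f" % (dice, iou))
```

Running the same function after each added click gives the "click count vs Dice" curve almost for free.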
Important failure cases to show:
- low-contrast boundaries
- small lesions
- noisy ultrasound
- overlapping anatomy
- multiple similar structures
- cropped anatomy
- burned-in text
- non-medical images
- poor-quality screenshots
Useful datasets / benchmark references:
- MedSegDB
- CVPR-BiomedSegFM
- Project Imaging-X
- Medical Segmentation Decathlon
- MedSAM dataset list
Success cases are good for a demo. Failure cases are what make the project useful for research.
8. Consider a model selector, but do not start there
It may be tempting to immediately replace SAM 2 with a more medical model. I would not make that the first step. First fix the prompt UI, safety language, and export.
After that, a model selector would be useful:
| Model / resource | Why it matters |
|---|---|
| SAM 2 | general promptable segmentation baseline |
| MedSAM | medical adaptation of SAM, strong reference point |
| SAM-Med2D | 2D medical segmentation adaptation |
| MedSAM2 | SAM 2.1 adapted/fine-tuned for 3D medical images and videos |
| Sam2Rad | useful example of SAM/SAM2 prompt-learning for ultrasound |
| MONAI VISTA3D-HF | 3D medical segmentation foundation model on HF |
| TotalSegmentator | strong CT/MR anatomy segmentation baseline |
| nnU-Net | essential task-specific medical segmentation baseline |
For the current Space, I would start with:
- SAM 2 baseline
- MedSAM or SAM-Med2D for 2D medical images
- MedSAM2 as a future 3D/video direction
Sam2Rad is especially useful conceptually because it separates autonomous, semi-autonomous human-in-the-loop, and manual prompting modes. That distinction would also be valuable in your UI.
9. Use Hugging Face documentation patterns
If you publish a model, dataset, or improved Space, I would use proper HF documentation patterns.
References:
- HF Model Cards
- HF Dataset Cards
- HF Model Release Checklist
- Hugging Face for Health
- Hugging Science
For a medical model card, include:
- intended use
- out-of-scope use
- supported inputs
- unsupported inputs
- training data
- evaluation data
- known limitations
- bias / representativeness
- privacy statement
- clinical-use disclaimer
- license
- citation
- hardware requirements
- failure cases
For a medical dataset card, include:
- source
- modality
- anatomy
- annotation type
- label schema
- annotator expertise
- de-identification process
- consent / ethics review, if applicable
- license and redistribution limits
- demographic/geographic coverage, if ethically shareable
- scanner/site metadata, if allowed
- known biases and gaps
HF4H and Hugging Science are useful communities to look at because this is not only a computer vision project. It touches health, open science, documentation, evaluation, and data governance.
10. Treat the Africa / Abidjan angle as a strength, but add governance
The Abidjan / Côte d’Ivoire / Africa angle is valuable. Medical AI needs more geographic diversity, more local participation, and more datasets that are not only from a few wealthy institutions.
But if the long-term goal is African medical imaging datasets, the next step is not just model engineering. It is also governance.
Important items:
- local clinical collaborators
- ethics approval, when needed
- consent pathway, when needed
- de-identification workflow
- annotation protocol
- annotator expertise
- label definitions
- modality and scanner metadata
- license and redistribution policy
- dataset card
- access control, if needed
- bias and representativeness statement
Useful references:
- Hugging Face for Health
- Project Imaging-X
- General-Medical-AI Project Imaging-X dataset
- HF gated datasets documentation
A good long-term vision might be:
Build an open, well-documented, privacy-aware African medical imaging annotation workflow, starting with public/de-identified samples and human-in-the-loop segmentation.
That is much more compelling than simply “SAM 2 for medical images.”
11. Suggested roadmap
Phase 1 — Make the current demo safe and precise
- Rename as a research annotation demo.
- Add “not for clinical use.”
- Add “do not upload patient-identifiable data.”
- Clarify whether images are stored.
- Clarify supported input: 2D PNG/JPEG only.
- Rename “confidence” to “SAM mask score” or “model mask score.”
Phase 2 — Add human-in-the-loop prompts
- positive clicks
- negative clicks
- bounding boxes
- candidate masks
- mask selection
- prompt history
- simple correction workflow
Phase 3 — Add useful exports
- binary mask PNG
- overlay PNG
- COCO JSON
- RLE
- CVAT-compatible mask
- Label Studio prediction JSON
- label schema
- model/prompt metadata
Phase 4 — Add evaluation
- small public sample set
- Dice / IoU
- click-count curves
- modality-wise results
- success and failure gallery
- SAM 2 vs MedSAM / SAM-Med2D / MedSAM2 comparison
Phase 5 — Explore medical foundation models
- MedSAM for 2D medical prompting
- SAM-Med2D for 2D medical images
- MedSAM2 for 3D/video medical segmentation
- VISTA3D for MONAI-style 3D workflows
- TotalSegmentator / nnU-Net as practical baselines
Phase 6 — Build toward a real dataset collaboration
- dataset card
- de-identification protocol
- governance plan
- local clinical collaboration
- annotation guidelines
- review workflow
- license / access policy
12. Concrete wording changes
I would replace:
Medical Image Segmentation AI
with:
SAM 2-assisted medical image annotation demo
I would replace:
Upload any X-ray, MRI or scan
with:
Upload a 2D public or de-identified medical image slice, such as PNG/JPEG exported from a public dataset
I would replace:
automatically segments regions of interest
with:
generates candidate segmentation masks from user prompts
I would replace:
confidence score
with:
SAM mask score, not a clinical confidence score
I would replace:
JSON annotations ready for AI training datasets
with:
exports mask metadata now; PNG mask, COCO JSON, RLE, CVAT, and Label Studio export would make it more dataset-friendly
Overall
This is a good starting point. The strongest next step is not necessarily “use a bigger model.” The strongest next step is to make the project safer, more precise, and more useful as an annotation workflow.
The core transformation would be:
- from hidden fixed prompts to visible interactive prompts
- from demo metadata to standard annotation exports
- from medical-sounding claims to research/education positioning
- from isolated examples to evaluated public samples and failure cases
- from a segmentation demo to a documented dataset-collaboration workflow
That would make the project much more useful for medical imaging researchers and more credible for open-science collaboration.