
Image Segmentation

Semantic Segmentation, SAM, Instance Segmentation
Image segmentation classifies every pixel of an image into a category. Semantic segmentation labels each pixel with a class (road, sidewalk, building, sky). Instance segmentation distinguishes individual objects (person 1, person 2). Panoptic segmentation does both. Meta's SAM (Segment Anything Model) can segment an object from a point click or a bounding box without task-specific training; text prompts were explored in the original paper but are not part of the released model.
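To make the three output types concrete, here is a minimal sketch on a toy 4x6 grid. Plain nested lists stand in for image arrays. The class IDs, the grid contents, and the `class_id * 1000 + instance_id` panoptic encoding are illustrative assumptions; real datasets define their own conventions.

```python
from collections import Counter

# Hypothetical class IDs for this toy example.
SKY, ROAD, PERSON = 0, 1, 2

# Semantic map: one class ID per pixel. Both people share the PERSON label.
semantic = [
    [SKY,  SKY,    SKY,  SKY,  SKY,    SKY],
    [SKY,  PERSON, SKY,  SKY,  PERSON, SKY],
    [ROAD, PERSON, ROAD, ROAD, PERSON, ROAD],
    [ROAD, ROAD,   ROAD, ROAD, ROAD,   ROAD],
]

# Instance map: 0 = no countable object, 1..N = individual objects.
instance = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 2, 0],
    [0, 1, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0],
]


def to_panoptic(semantic, instance):
    """Combine class and instance IDs into a single segment ID per pixel."""
    return [
        [cls * 1000 + inst for cls, inst in zip(sem_row, inst_row)]
        for sem_row, inst_row in zip(semantic, instance)
    ]


panoptic = to_panoptic(semantic, instance)

# Semantic view: how many pixels belong to each class.
pixels_per_class = Counter(cls for row in semantic for cls in row)

# Panoptic view: every distinct segment, with the two people kept apart.
segment_ids = sorted({seg for row in panoptic for seg in row})
```

In the semantic map the two people are indistinguishable, the instance map separates them but says nothing about sky or road, and the panoptic map carries both pieces of information in one ID.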

Why It Matters

Segmentation gives the most spatially precise description of image content. Self-driving cars need pixel-level road boundaries, not just bounding boxes. Medical imaging needs exact tumor boundaries. Photo editing needs precise object masks for background removal. SAM's ability to segment objects with no task-specific training made this previously specialized capability broadly accessible.

Deep Dive

Traditional segmentation models (U-Net for medical images, DeepLab for general scenes) are trained on specific categories and produce fixed-class outputs. They work well within their training domain but can't segment novel objects. SAM (Kirillov et al., 2023, Meta) changed this by training on more than 1 billion masks across 11 million images (the SA-1B dataset), learning a general notion of "objectness" that transfers to many domains without fine-tuning.
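The fixed-class limitation is visible in the output format itself. A minimal sketch, assuming a model that emits one score map per class (the class list and the scores below are made up for illustration): each pixel is assigned the highest-scoring class, so a category outside the list can never be predicted.

```python
# Hypothetical label set that a fixed-class model was trained on.
CLASSES = ["background", "road", "building"]

# Made-up per-class score maps, indexed as [class][row][col].
scores = [
    [[0.90, 0.2], [0.1, 0.3]],  # background
    [[0.05, 0.7], [0.8, 0.3]],  # road
    [[0.05, 0.1], [0.1, 0.4]],  # building
]


def argmax_labels(scores):
    """Pick the index of the highest-scoring class for every pixel."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [
            max(range(num_classes), key=lambda c: scores[c][y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]


label_map = argmax_labels(scores)
named = [[CLASSES[c] for c in row] for row in label_map]
```

Whatever is in the image, every pixel in `named` ends up as one of the three trained classes; an unseen object type is forced into whichever of them scores highest.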

SAM and Its Impact

SAM takes a prompt (one or more point clicks, a bounding box, or a rough mask) and produces a segmentation mask for the indicated object. Text prompts were explored in the original paper as a proof of concept but were not included in the released model. SAM works on images it has never seen, for object types it was never specifically trained on, such as microscopy images, satellite photos, and artwork. SAM 2 (2024) extended this to video, maintaining consistent object segmentation across frames. The impact: many tasks that previously required domain-specific training and expensive annotation now work out of the box, although specialized domains may still benefit from fine-tuning.
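The prompt-in, mask-out interaction can be illustrated with a toy stand-in. The sketch below is not SAM and uses no learned model: a hypothetical `segment_from_point` function grows a region of similar intensity outward from the clicked pixel. It only shows the shape of the interface, where the point prompt decides which object gets masked.

```python
from collections import deque


def segment_from_point(image, point, tolerance=10):
    """Toy point-prompted segmentation via 4-connected region growing.

    image: 2D list of grayscale values; point: (row, col) of the click.
    Returns a binary mask (1 = selected region) with the image's shape.
    """
    height, width = len(image), len(image[0])
    seed_row, seed_col = point
    seed_value = image[seed_row][seed_col]

    mask = [[0] * width for _ in range(height)]
    mask[seed_row][seed_col] = 1
    queue = deque([(seed_row, seed_col)])

    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not mask[nr][nc]:
                # Accept neighbours whose intensity is close to the seed's.
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    mask[nr][nc] = 1
                    queue.append((nr, nc))
    return mask


# Toy grayscale image: a bright block (200) and a mid-gray strip (90)
# on a dark background (20).
image = [
    [200, 200, 20, 20],
    [200, 200, 20, 90],
    [20,  20,  20, 90],
]

mask_a = segment_from_point(image, (0, 0))  # click on the bright block
mask_b = segment_from_point(image, (1, 3))  # click on the gray strip
```

The same image yields different masks depending on where the click lands. A model like SAM keeps this interface but replaces the intensity heuristic with learned image and prompt encoders.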

Applications

Medical imaging: segmenting tumors, organs, and cells for diagnosis and treatment planning. Autonomous driving: understanding the drivable surface, lane markings, and obstacles at pixel level. Photo/video editing: precise background removal, object selection, and compositing. Agriculture: analyzing crop health from aerial imagery. Robotics: understanding object boundaries for grasping and manipulation.
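For the editing use case, applying a mask is the simple part once a segmentation model has produced it. A minimal sketch, assuming a binary mask and RGB pixels stored as tuples in nested lists (the function name and data are hypothetical; real pipelines typically use array libraries and soft alpha mattes rather than hard 0/1 masks):

```python
def replace_background(image, mask, background=(255, 255, 255)):
    """Keep pixels where mask is 1; paint everything else with `background`."""
    return [
        [pixel if keep else background for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# Toy 2x2 RGB image; the right column is the "object".
image = [
    [(10, 20, 30), (200, 50, 50)],
    [(0, 0, 0),    (220, 60, 60)],
]
mask = [
    [0, 1],
    [0, 1],
]

result = replace_background(image, mask)
```

The quality of the result depends almost entirely on the mask: a boundary that is off by a few pixels shows up directly as a halo or a clipped edge in the composite.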

