Image Segmentation

A computer vision task that partitions an image into meaningful regions by assigning a class label to every pixel, enabling pixel-level understanding of visual scenes.

6 min read | Last updated May 2026 | Applications

Image segmentation is a computer vision task in which an image is divided into regions and a label is assigned to every pixel rather than to the image as a whole. Unlike image classification, which produces a single class for the entire image, and object detection, which produces bounding boxes, segmentation yields a dense, pixel-level map of the scene. This finer resolution makes segmentation the foundation for any application that requires precise understanding of object shape, boundary, or area — including medical diagnosis, autonomous driving, agricultural monitoring, and augmented reality.
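
One way to see why a pixel-level map is the richest of the three outputs: both an image-level label and a bounding box can be derived from a segmentation map, but not the reverse. The sketch below assumes NumPy and a hypothetical label map in which 0 means background and 1 means "car"; the class IDs and helper names are illustrative only.

```python
import numpy as np

# Hypothetical 6x8 label map: 0 = background, 1 = "car" (IDs are illustrative).
label_map = np.zeros((6, 8), dtype=np.int64)
label_map[2:5, 3:7] = 1  # a rectangular "car" region


def present_classes(labels):
    """Classification-style output: which non-background classes appear at all."""
    return sorted(int(c) for c in np.unique(labels) if c != 0)


def bounding_box(labels, class_id):
    """Detection-style output: tightest inclusive box (row_min, col_min, row_max, col_max)."""
    rows, cols = np.nonzero(labels == class_id)
    if rows.size == 0:
        return None  # class not present in this image
    return int(rows.min()), int(cols.min()), int(rows.max()), int(cols.max())


classes = present_classes(label_map)     # image-level labels
box = bounding_box(label_map, 1)         # box around all "car" pixels
area = int((label_map == 1).sum())       # exact pixel area, only available from the mask
```

The pixel area in the last line is the kind of quantity that neither a class label nor a box can supply exactly.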

Variants

Segmentation tasks are usually grouped into four main variants.

Semantic segmentation assigns each pixel to a category from a fixed set of classes. All instances of the same class are treated identically — every car pixel receives the "car" label without distinguishing one vehicle from another.

Instance segmentation identifies and separates individual objects of the same class. Each car receives its own mask. Mask R-CNN, introduced by He and colleagues in 2017, is the canonical architecture and is still widely deployed.

Panoptic segmentation, proposed by Kirillov and colleagues in 2019, unifies the two by labelling every pixel with both a semantic class and an instance ID. Stuff classes (sky, road, grass) are handled semantically while thing classes (people, vehicles) are handled per instance.
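
The difference between the semantic, instance, and panoptic outputs is easiest to see in the data structures themselves. The following is a minimal sketch, assuming NumPy, a toy scene with one stuff class (road = 0) and one thing class (car = 1), and an illustrative encoding that packs class and instance into a single integer; the multiplier 1000 is an assumption made for this example.

```python
import numpy as np

# Semantic map: one class ID per pixel. 0 = road ("stuff"), 1 = car ("thing").
semantic = np.array([
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])

# Instance output: one boolean mask per individual car.
instance_masks = np.zeros((2,) + semantic.shape, dtype=bool)
instance_masks[0, 1:3, 0:2] = True  # left car
instance_masks[1, 1:3, 4:6] = True  # right car

# Panoptic output: every pixel carries a class and an instance ID.
# Stuff pixels keep instance ID 0; thing pixels get 1, 2, ...
instance_ids = np.zeros_like(semantic)
for idx, mask in enumerate(instance_masks, start=1):
    instance_ids[mask] = idx

# Illustrative packing of (class, instance) into one integer per pixel.
panoptic = semantic * 1000 + instance_ids
```

Here both cars share the value 1 in `semantic`, but appear as 1001 and 1002 in `panoptic`, while the road stays 0.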

Interactive and promptable segmentation allows a user — or another model — to specify what to segment through clicks, boxes, or text prompts. The Segment Anything Model (SAM) and its successor SAM 2 popularised this style, producing masks for arbitrary objects with little or no per-domain training.
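
The interaction pattern (a click goes in, a mask comes out) can be illustrated without a foundation model. The sketch below is a toy stand-in, not SAM: it assumes NumPy and a greyscale image, and grows a region outward from the clicked pixel over neighbours of similar intensity. The function name and the `tolerance` parameter are hypothetical.

```python
from collections import deque

import numpy as np


def segment_from_click(image, click, tolerance=10):
    """Grow a mask from the clicked pixel over 4-connected pixels of similar intensity."""
    height, width = image.shape
    seed_value = int(image[click])
    mask = np.zeros((height, width), dtype=bool)
    mask[click] = True
    queue = deque([click])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not mask[nr, nc]:
                if abs(int(image[nr, nc]) - seed_value) <= tolerance:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask


# Toy image: a bright 3x3 object on a dark background.
image = np.full((5, 5), 20, dtype=np.uint8)
image[1:4, 1:4] = 200

object_mask = segment_from_click(image, (2, 2))      # click on the object
background_mask = segment_from_click(image, (0, 0))  # click on the background
```

A promptable model replaces the intensity rule with learned features, which is what lets it handle objects whose appearance is not uniform.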

Key architectures

Early deep-learning segmentation relied on fully convolutional networks (FCNs), which replaced the fully connected classification layers with convolutions to produce a dense, per-pixel output. U-Net, introduced in 2015 for biomedical imaging, popularised the encoder-decoder structure with skip connections, which combines coarse semantic features with fine spatial detail. It remains a workhorse architecture in medical imaging and satellite analysis.
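
The data flow of an encoder-decoder with a skip connection can be sketched in a few lines. The example below assumes NumPy and is not a trained network: the `mix_channels` helper is a hypothetical stand-in for a learned convolution and uses random weights, so only the shapes and the wiring are meaningful.

```python
import numpy as np


def downsample(x):
    """2x2 max pooling over a (channels, height, width) array with even height and width."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def upsample(x):
    """Nearest-neighbour 2x upsampling of a (channels, height, width) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def mix_channels(x, out_channels, rng):
    """Stand-in for a learned convolution: random 1x1 channel mixing followed by ReLU."""
    weights = rng.standard_normal((out_channels, x.shape[0]))
    return np.maximum(np.tensordot(weights, x, axes=1), 0.0)


rng = np.random.default_rng(0)
image = rng.standard_normal((3, 8, 8))             # toy RGB input

enc1 = mix_channels(image, 4, rng)                 # (4, 8, 8): fine spatial detail
enc2 = mix_channels(downsample(enc1), 8, rng)      # (8, 4, 4): coarse context
up = upsample(enc2)                                # (8, 8, 8): back to input resolution
merged = np.concatenate([up, enc1], axis=0)        # (12, 8, 8): the skip connection
logits = np.tensordot(rng.standard_normal((2, 12)), merged, axes=1)  # (2, 8, 8)
prediction = logits.argmax(axis=0)                 # (8, 8): one class per pixel
```

The concatenation step is the skip connection: the decoder sees both the upsampled coarse features and the full-resolution encoder features before predicting a class for each pixel.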

DeepLab, developed at Google, introduced atrous (dilated) convolutions and atrous spatial pyramid pooling (ASPP) to capture context at multiple scales without reducing feature-map resolution. DeepLabV3+ remains a strong baseline for semantic segmentation in 2025.
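
The effect of dilation is easiest to see in one dimension. The sketch below assumes NumPy and a hypothetical helper `atrous_conv1d`; it applies the same three-tap kernel at two dilation rates and shows that the receptive field grows while the number of weights stays fixed.

```python
import numpy as np


def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D correlation whose kernel taps are spaced `rate` samples apart."""
    span = (len(kernel) - 1) * rate + 1   # receptive field of one output value
    out_len = len(signal) - span + 1
    out = np.zeros(out_len)
    for i in range(out_len):
        taps = signal[i : i + span : rate]  # every `rate`-th sample in the window
        out[i] = np.dot(taps, kernel)
    return out


signal = np.arange(10, dtype=float)
kernel = np.ones(3)

dense = atrous_conv1d(signal, kernel, rate=1)    # each output sees 3 consecutive samples
dilated = atrous_conv1d(signal, kernel, rate=2)  # each output spans 5 samples with 3 weights
```

This toy version uses no padding, so the output shrinks slightly; segmentation networks typically pad the input so that the feature map keeps its size. Running several rates in parallel and combining the results is the idea behind pyramid-style multi-scale pooling.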

Mask R-CNN extends the Faster R-CNN object detector with a mask head, producing instance segmentations with high accuracy. HRNet maintains high-resolution feature maps throughout the network and excels at fine boundary delineation.

Transformer-based methods including SegFormer, Mask2Former, and the SAM family have become dominant in recent years, often outperforming CNN baselines on standard benchmarks such as ADE20K, Cityscapes, and COCO. OMG-Seg, released in 2024, handles more than ten different segmentation tasks in a single unified model.

Metrics

Segmentation quality is most often measured by intersection over union (IoU), the ratio of the area where prediction and ground truth overlap to the area of their union. Mean IoU (mIoU) averages this score across classes. The Dice coefficient, twice the overlap divided by the sum of the two areas, is preferred in medical imaging, where class imbalance is severe. Panoptic quality (PQ), introduced with panoptic segmentation, multiplies a recognition term (an F1-style score for detecting segments) by a segmentation term (the mean IoU of matched segments) to measure both detection and mask quality.
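
These metrics are short enough to write out directly. The sketch below assumes NumPy, boolean masks for IoU and Dice, and integer label maps for mIoU. For PQ it assumes that matching has already been done (segments are conventionally matched when their IoU exceeds 0.5) and takes only the matched IoUs and the counts of unmatched segments; all function names are illustrative.

```python
import numpy as np


def iou(pred, truth):
    """Intersection over union of two boolean masks (NaN if both are empty)."""
    union = np.logical_or(pred, truth).sum()
    return float(np.logical_and(pred, truth).sum() / union) if union else float("nan")


def dice(pred, truth):
    """Dice coefficient: twice the overlap divided by the sum of the two areas."""
    total = pred.sum() + truth.sum()
    return float(2 * np.logical_and(pred, truth).sum() / total) if total else float("nan")


def mean_iou(pred_labels, truth_labels, num_classes):
    """Average of per-class IoU, skipping classes absent from both maps."""
    scores = [iou(pred_labels == c, truth_labels == c) for c in range(num_classes)]
    return float(np.nanmean(scores))


def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """PQ = (sum of matched IoUs) / (TP + FP / 2 + FN / 2)."""
    true_positives = len(matched_ious)
    denom = true_positives + 0.5 * num_false_positives + 0.5 * num_false_negatives
    return sum(matched_ious) / denom if denom else 0.0


# Toy example: the prediction covers the 4-pixel target plus 2 extra pixels.
truth = np.zeros((4, 4), dtype=bool)
truth[0:2, 0:2] = True
pred = np.zeros((4, 4), dtype=bool)
pred[0:2, 0:3] = True

iou_score = iou(pred, truth)                                  # 4 / 6
dice_score = dice(pred, truth)                                # 8 / 10
miou_score = mean_iou(pred.astype(int), truth.astype(int), 2)
pq_score = panoptic_quality([0.8, 0.6], 1, 1)                 # 1.4 / 3
```

For the same pair of masks, Dice is never lower than IoU, since Dice = 2·IoU / (1 + IoU); scores reported under the two metrics are therefore not directly comparable.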

Applications

Medical imaging is one of the largest application areas. Segmentation of tumours in MRI and CT scans, of cell nuclei in histopathology slides, and of retinal structures in fundus photographs has become routine in research settings and increasingly common in clinical workflows.

In autonomous driving, semantic and instance segmentation of road, vehicle, pedestrian, and lane-marking pixels feeds into planning and control systems. Satellite and aerial segmentation supports land-use classification, deforestation monitoring, and disaster response. Augmented reality applications use segmentation to separate users from backgrounds for video calls and to insert virtual objects behind real ones.
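
The video-call use case reduces to compositing once a person mask is available. The following is a minimal sketch, assuming NumPy, 8-bit RGB frames, and a mask that some segmentation model has already produced; `replace_background` is a hypothetical helper.

```python
import numpy as np


def replace_background(frame, person_mask, background):
    """Keep frame pixels where the mask is set and take all other pixels from `background`."""
    alpha = person_mask.astype(np.float32)[..., None]   # (H, W, 1), broadcasts over RGB
    blended = alpha * frame + (1.0 - alpha) * background
    return blended.astype(frame.dtype)


# Toy data: a grey frame, a black virtual background, and a 2x2 "person".
frame = np.full((4, 4, 3), 200, dtype=np.uint8)
background = np.zeros((4, 4, 3), dtype=np.uint8)
person_mask = np.zeros((4, 4), dtype=bool)
person_mask[1:3, 1:3] = True

composite = replace_background(frame, person_mask, background)
```

Using a float `alpha` rather than boolean indexing means the same function would also accept a soft mask with values between 0 and 1, which gives smoother edges around hair and shoulders.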

Malaysian Context — segmentation in healthcare, agriculture, and infrastructure

Image segmentation has been adopted across multiple Malaysian sectors. In healthcare, hospitals affiliated with Universiti Malaya Medical Centre, Universiti Sains Malaysia Hospital, and Institut Jantung Negara use segmentation models for radiology workflows. Research groups at Universiti Malaya and Universiti Putra Malaysia have published peer-reviewed work on diabetic retinopathy segmentation, breast tumour delineation, and tuberculosis chest X-ray analysis — all conditions with disproportionate impact on the Malaysian population.

In the palm oil sector, segmentation is central to crown counting from drone imagery, ganoderma disease detection, and fresh fruit bunch ripeness analysis. The Malaysian Palm Oil Board (MPOB) has funded several university projects deploying U-Net and Mask R-CNN variants for plantation monitoring.
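
Once a model has produced a binary "crown" mask, a simple post-processing step can turn it into a count. The sketch below assumes NumPy and a mask in which separate crowns do not touch; it counts 4-connected regions and ignores regions smaller than a hypothetical `min_pixels` threshold. It illustrates the counting step only and does not describe any specific MPOB-funded system.

```python
from collections import deque

import numpy as np


def count_regions(mask, min_pixels=1):
    """Count 4-connected foreground regions containing at least `min_pixels` pixels."""
    height, width = mask.shape
    visited = np.zeros((height, width), dtype=bool)
    count = 0
    for r in range(height):
        for c in range(width):
            if not mask[r, c] or visited[r, c]:
                continue
            # Breadth-first flood fill over one region, measuring its size.
            visited[r, c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                cr, cc = queue.popleft()
                size += 1
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and mask[nr, nc] and not visited[nr, nc]:
                        visited[nr, nc] = True
                        queue.append((nr, nc))
            if size >= min_pixels:
                count += 1
    return count


# Toy mask: two 2x2 crowns and one isolated noise pixel.
crown_mask = np.zeros((6, 6), dtype=bool)
crown_mask[0:2, 0:2] = True
crown_mask[3:5, 3:5] = True
crown_mask[5, 0] = True

raw_count = count_regions(crown_mask)                      # counts the noise pixel too
filtered_count = count_regions(crown_mask, min_pixels=2)   # ignores single-pixel noise
```

Overlapping crowns would merge into one region under this approach, which is one reason instance segmentation models such as Mask R-CNN are used when canopies touch.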

Infrastructure operators including PLUS Malaysia and the Public Works Department (JKR) use segmentation on pavement imagery for pothole and crack detection. Telekom Malaysia uses segmentation in aerial imagery to plan fibre rollout. The Malaysian Remote Sensing Agency (ARSM) under the Ministry of Science, Technology and Innovation (MOSTI) operates the Pulau Pinang and Banting ground stations and uses segmentation extensively for land-cover monitoring, flood mapping, and haze impact assessment.
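
For land-cover monitoring and flood mapping, the segmentation map is usually an intermediate product; the figure of interest is area per class. The sketch below assumes NumPy, a label map with hypothetical class IDs, and a known ground resolution in metres per pixel. The class names and the 10 m resolution are assumptions for illustration.

```python
import numpy as np

CLASS_NAMES = ["other", "water", "vegetation"]  # hypothetical class list


def class_areas_hectares(label_map, pixel_size_m, class_names):
    """Convert per-class pixel counts into hectares (1 ha = 10,000 square metres)."""
    counts = np.bincount(label_map.ravel(), minlength=len(class_names))
    pixel_area_ha = pixel_size_m ** 2 / 10_000
    return {name: float(counts[i] * pixel_area_ha) for i, name in enumerate(class_names)}


# Toy 100x100 scene at an assumed 10 m ground resolution.
label_map = np.zeros((100, 100), dtype=np.int64)
label_map[:20, :] = 1     # 2,000 water pixels
label_map[20:70, :] = 2   # 5,000 vegetation pixels

areas = class_areas_hectares(label_map, pixel_size_m=10.0, class_names=CLASS_NAMES)
```

Applying the same function to maps from two dates and subtracting the "water" entries gives a rough estimate of newly inundated area, provided both maps share the same resolution and extent.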

Compliance with the Personal Data Protection Act 2010 (PDPA) is a constant consideration when segmentation models touch identifiable health or personal imagery. The National AI Office Malaysia and MDEC have encouraged the use of on-premises or federated approaches for sensitive use cases. Training programmes funded by HRD Corp and conducted at MDEC's MyDigital Workforce centres regularly cover segmentation as part of applied computer vision curricula.

Challenges

Segmentation requires dense, pixel-level annotations that are expensive and slow to produce. Class imbalance — where target structures occupy a small fraction of pixels — complicates training and evaluation. Domain shift between training and deployment imagery remains a persistent issue, particularly for tropical agricultural, medical, and satellite applications where most public datasets reflect temperate or Western contexts. The rise of promptable foundation models such as SAM 2 has partially mitigated annotation cost but introduced new questions around prompt design and downstream evaluation.
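
The evaluation problem caused by class imbalance can be shown with a few lines of arithmetic. The sketch below assumes NumPy and a hypothetical image in which the target structure covers 1% of pixels; it compares pixel accuracy with IoU for a model that predicts background everywhere, then computes inverse-frequency class weights, one common (but not the only) way to rebalance a training loss.

```python
import numpy as np

# Toy ground truth: the target structure covers 100 of 10,000 pixels (1%).
truth = np.zeros((100, 100), dtype=bool)
truth[:10, :10] = True

# A useless model that predicts "background" for every pixel.
all_background = np.zeros_like(truth)

pixel_accuracy = float((all_background == truth).mean())
overlap = np.logical_and(all_background, truth).sum()
union = np.logical_or(all_background, truth).sum()
target_iou = float(overlap / union)

# Inverse-frequency weights: rare classes get proportionally larger weights.
frequencies = np.array([(~truth).mean(), truth.mean()])   # [background, target]
class_weights = 1.0 / (len(frequencies) * frequencies)
```

The useless model scores 99% pixel accuracy but an IoU of zero on the target class, which is why per-class metrics are preferred when the structure of interest is small.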

References

Ronneberger, O., Fischer, P., & Brox, T. (2015). U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI.
He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. ICCV.
Kirillov, A. et al. (2023). Segment Anything. ICCV.
Ravi, N. et al. (2024). SAM 2: Segment Anything in Images and Videos. Meta AI Research.
Minaee, S. et al. (2022). Image Segmentation Using Deep Learning: A Survey. IEEE TPAMI.

Tags: computer-vision segmentation semantic-segmentation instance-segmentation deep-learning

Type	Computer vision task
Main variants	Semantic, instance, panoptic, interactive
Output	Per-pixel class or instance labels
Key architectures	U-Net, DeepLab, Mask R-CNN, SAM 2, OMG-Seg
Common metrics	IoU, mIoU, Dice coefficient, PQ
Typical use cases	Medical imaging, autonomous driving, satellite analysis, AR