Image segmentation: resources

Segment the screen / disambiguate areas of interest

Intros

Segmantic segementation
- Difference between object detection (bounding box) and semantic segmentation (each pixel is classified)
- One general thing most of the architectures have in common is an encoder network followed by a decoder network
Deeplearning intro on TowardsDataScience
Deeplearning intro on Medium
Mobile Real-time Video Segmentation
State of the art for semantic segmentation 2019

Specific algorithms

Deeplab

Particularly usable for high performance and real-time applications. Suitable for feasibility checks - easy to implement and works well for real-time applications.

Convolutional networks

Fully convolutional networks for semantic segmentation
- FCN is not so powerful as other discussed models and serves as basic information
CNNs with skip connections
Fully Convolutional Networks for Semantic Segmentation
U-Net for satellite images

Mask R-CNN takes a different approach as the encoder-decoder structure. It is an extension of Faster R-CNN, which is used for object detection. Besides the class label and bounding box coordinates, it returns the mask for each object.

Mask R-CNN, extends Faster R-CNN
Facebookresearch maskrcnn benchmark, superseded by Detectron2 by Facebook
Mask R-CNN implementation
Transfer learning with Mask R-CNN
EfficientPS: New State-of-the-art Model in Panoptic Segmentation. No code available.
Seamless Scene Segmentation

Implementation:

Mask R-CNN Implementation
Running Mask R-CNN code in Google Colab. Based on https://emadehsan.com/p/object-detection
https://github.com/matterport/Mask_RCNN
based on matterport but on Google Colab Google colab

GAN

YoloV3

Document structure. Figure out which input elements belong to which text elements
Semantics of words - Word2Vec

Useful links for implementation:

Semantic segmentation suite
Dataset Pascal VOC2012

useful image

Image segmentation: resources

Segment the screen / disambiguate areas of interest

Subscribe