Segment the screen / disambiguate areas of interest
Intros
- Segmantic segementation
- Difference between object detection (bounding box) and semantic segmentation (each pixel is classified)
- One general thing most of the architectures have in common is an encoder network followed by a decoder network
- Deeplearning intro on TowardsDataScience
- Deeplearning intro on Medium
- Mobile Real-time Video Segmentation
- State of the art for semantic segmentation 2019
Specific algorithms
Deeplab
Particularly usable for high performance and real-time applications. Suitable for feasibility checks - easy to implement and works well for real-time applications.
- Deeplab-V3 paper
- Code for Deeplab and Papers with Code
- Tensorflow-deeplab-v3-plus
- Google TPU tutorial on deeplab
- Tensorflow deeplab readme
- Towardsdatascience.com The evolution of deeplab for semantic segmentation
- How to use deeplab in tensorflow for object segmentation
- Semantic image segmentation with DeepLab in Tensorflow
Convolutional networks
- Fully convolutional networks for semantic segmentation
- FCN is not so powerful as other discussed models and serves as basic information
- CNNs with skip connections
- Fully Convolutional Networks for Semantic Segmentation
- U-Net for satellite images
Mask R-CNN takes a different approach as the encoder-decoder structure. It is an extension of Faster R-CNN, which is used for object detection. Besides the class label and bounding box coordinates, it returns the mask for each object.
- Mask R-CNN, extends Faster R-CNN
- Facebookresearch maskrcnn benchmark, superseded by Detectron2 by Facebook
- Mask R-CNN implementation
- Transfer learning with Mask R-CNN
- EfficientPS: New State-of-the-art Model in Panoptic Segmentation. No code available.
- Seamless Scene Segmentation
Implementation:
- Mask R-CNN Implementation
- Running Mask R-CNN code in Google Colab. Based on https://emadehsan.com/p/object-detection
- https://github.com/matterport/Mask_RCNN
- based on matterport but on Google Colab Google colab
GAN
YoloV3
Document structure. Figure out which input elements belong to which text elements
Semantics of words - Word2Vec
Useful links for implementation:
- Semantic segmentation suite
- Dataset Pascal VOC2012