Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers

Sixiao Zheng, Jiachen Lu, Hengshuang Zhao, Xiatian Zhu, Zekun Luo, Yabiao Wang, Yanwei Fu, Jianfeng Feng, Tao Xiang, Philip HS Torr, et al. · 2020 · arXiv 2012.15840

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

cs.CV · 2021-03-25 · accept · novelty 8.0

Swin Transformer reaches 87.3% ImageNet accuracy and sets new records on COCO detection and ADE20K segmentation by replacing global self-attention with shifted-window local attention inside a hierarchical pyramid.

BEiT: BERT Pre-Training of Image Transformers

cs.CV · 2021-06-15 · conditional · novelty 7.0

BEiT pre-trains vision transformers via masked image modeling on visual tokens and reaches 83.2% ImageNet top-1 accuracy for the base model and 86.3% for the large model using only ImageNet-1K data.

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

cs.CV · 2021-02-08 · unverdicted · novelty 6.0

TransUNet is a hybrid CNN-Transformer architecture that outperforms prior U-Net and Transformer baselines on multi-organ and cardiac medical image segmentation tasks.

citing papers explorer

Showing 3 of 3 citing papers.

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows cs.CV · 2021-03-25 · accept · none · ref 81
BEiT: BERT Pre-Training of Image Transformers cs.CV · 2021-06-15 · conditional · none · ref 22
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation cs.CV · 2021-02-08 · unverdicted · none · ref 18
