VideoBERT: A Joint Model for Video and Language Representation Learning

Chen Sun, Austin Myers, Carl Vondrick, Kevin Murphy, Cordelia Schmid · 2019 · arXiv 1904.01766

2 Pith papers cite this work. Polarity classification is still pending.

representative citing papers

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

cs.CL · 2020-02-19 · unverdicted · novelty 6.0

CodeBERT pre-trains a bimodal model on code and text pairs plus unimodal data to achieve state-of-the-art results on natural language code search and code documentation generation.

VisualBERT: A Simple and Performant Baseline for Vision and Language

cs.CV · 2019-08-09 · conditional · novelty 6.0

VisualBERT is a Transformer model that implicitly aligns text and image regions through self-attention and achieves competitive or superior results on VQA, VCR, NLVR2, and Flickr30K after pre-training on captions.

citing papers explorer

CodeBERT: A Pre-Trained Model for Programming and Natural Languages
cs.CL · 2020-02-19 · unverdicted · none · ref 56
VisualBERT: A Simple and Performant Baseline for Vision and Language
cs.CV · 2019-08-09 · conditional · none · ref 34
