CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Pith reviewed 2026-05-13 20:59 UTC · model grok-4.3
The pith
CodeBERT is a pre-trained bimodal model for natural language and code that uses replaced token detection to learn transferable representations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search and code documentation generation. The model is built on a Transformer architecture and trained with a hybrid objective that incorporates replaced token detection on plausible alternatives sampled from generators. This setup lets the training process use both bimodal NL-PL pairs, which supply input tokens, and unimodal data, which improves the generators themselves. Fine-tuning on the target tasks produces state-of-the-art performance, and zero-shot evaluation on an NL-PL probing dataset shows gains over earlier pre-trained models.
What carries the argument
A hybrid objective function that incorporates replaced token detection and draws on both bimodal NL-PL pairs and unimodal data to learn general-purpose representations.
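To make the replaced token detection idea concrete, the following is a minimal sketch of how a training example could be built: some positions are selected, a generator proposes a plausible alternative for each, and the model is then asked to label every position as original or replaced. The 15% selection rate, the toy vocabularies, and the names `make_rtd_example` and `toy_generator` are illustrative assumptions, not details taken from the paper. A real generator would be a learned model rather than a uniform sampler.

```python
import random


def make_rtd_example(tokens, select_prob, sample_replacement, rng):
    """Build one replaced-token-detection example.

    Returns (corrupted_tokens, labels), where labels[i] is 1 if the token at
    position i differs from the original and 0 otherwise. A sampled token that
    happens to equal the original is labeled 0 (treated as original).
    """
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < select_prob:
            new_tok = sample_replacement(tok, rng)  # generator proposes an alternative
        else:
            new_tok = tok
        corrupted.append(new_tok)
        labels.append(int(new_tok != tok))
    return corrupted, labels


def toy_generator(vocab):
    """Hypothetical stand-in for a learned generator: uniform draw from vocab."""
    def sample(_original, rng):
        return rng.choice(vocab)
    return sample


# Assumed toy vocabularies, one per modality (NL and PL).
NL_VOCAB = ["return", "the", "maximum", "minimum", "value", "list"]
PL_VOCAB = ["def", "max", "min", "(", ")", "xs", "return", ":"]

rng = random.Random(0)
nl_tokens = ["return", "the", "maximum", "value"]
pl_tokens = ["def", "f", "(", "xs", ")", ":", "return", "max", "(", "xs", ")"]

# select_prob=0.15 is an assumed rate chosen for illustration only.
nl_corrupted, nl_labels = make_rtd_example(nl_tokens, 0.15, toy_generator(NL_VOCAB), rng)
pl_corrupted, pl_labels = make_rtd_example(pl_tokens, 0.15, toy_generator(PL_VOCAB), rng)
```

Using a separate generator per modality mirrors the summary's point that unimodal data serves to improve the generators, while the bimodal pair supplies the tokens being corrupted.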
If this is right
- Fine-tuning CodeBERT raises accuracy on natural-language queries that retrieve relevant code snippets.
- The same fine-tuned model improves the quality of generated natural-language documentation for given code.
- With its parameters frozen, CodeBERT already yields stronger zero-shot performance than earlier pre-trained models on tasks that probe alignment between code and text.
- The learned representations are intended to support a range of additional NL-PL applications beyond the two evaluated tasks.
Where Pith is reading between the lines
- The same pre-training recipe could be applied to other cross-modal pairs such as code and visual diagrams.
- Probing experiments hint that the model encodes finer semantic correspondences than earlier unimodal or separately trained encoders.
- Extending the unimodal data to additional programming languages would likely broaden the model's utility for polyglot codebases.
Load-bearing premise
The hybrid pre-training objective produces representations general enough to transfer effectively when the model is later fine-tuned on downstream tasks.
What would settle it
A controlled experiment in which CodeBERT, after identical fine-tuning, fails to exceed the best prior models on the standard code-search and documentation-generation benchmarks would falsify the central claim.
Original abstract
We present CodeBERT, a bimodal pre-trained model for programming language (PL) and natural language (NL). CodeBERT learns general-purpose representations that support downstream NL-PL applications such as natural language code search, code documentation generation, etc. We develop CodeBERT with Transformer-based neural architecture, and train it with a hybrid objective function that incorporates the pre-training task of replaced token detection, which is to detect plausible alternatives sampled from generators. This enables us to utilize both bimodal data of NL-PL pairs and unimodal data, where the former provides input tokens for model training while the latter helps to learn better generators. We evaluate CodeBERT on two NL-PL applications by fine-tuning model parameters. Results show that CodeBERT achieves state-of-the-art performance on both natural language code search and code documentation generation tasks. Furthermore, to investigate what type of knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing, and evaluate in a zero-shot setting where parameters of pre-trained models are fixed. Results show that CodeBERT performs better than previous pre-trained models on NL-PL probing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CodeBERT, a Transformer-based bimodal pre-trained model for natural language (NL) and programming language (PL). It is trained with a hybrid objective that incorporates replaced token detection on NL-PL pairs, with generators trained on unimodal data, to learn general-purpose representations. After fine-tuning, the paper claims state-of-the-art results on natural language code search and code documentation generation; a zero-shot probing evaluation on a constructed NL-PL dataset also shows gains over prior pre-trained models.
Significance. If the empirical claims hold after addressing ablations, the work would be significant for demonstrating effective transfer from a hybrid pre-training regime that mixes bimodal pairs with unimodal data to downstream NL-PL tasks. The probing setup provides a useful lens on what knowledge is captured, and the overall approach could serve as a strong baseline for code intelligence applications.
major comments (3)
- [§4.1] §4.1 and Table 2: The SOTA claim on natural language code search reports improved MRR but supplies no ablation that isolates the replaced token detection component from standard MLM on identical bimodal data; without this, it is unclear whether the hybrid objective (rather than data scale or architecture) drives the reported gains over baselines such as RoBERTa.
- [§4.2] §4.2 and Table 3: The code documentation generation results claim SOTA BLEU scores after fine-tuning, yet no statistical significance tests or variance across random seeds are provided, and the contribution of the unimodal generator training step is not quantified via controlled removal.
- [§5] §5: The NL-PL probing dataset and zero-shot protocol are introduced to show superior performance, but the section does not report the exact number of probe examples per category or the precise metric (accuracy vs. F1) used for the comparison against prior models, weakening the interpretability of the knowledge-acquisition claim.
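For readers unfamiliar with the MRR metric referenced in the first major comment, here is a minimal sketch of how mean reciprocal rank is typically computed for code search. The function names and the toy scores are illustrative assumptions; they are not taken from the paper or its evaluation scripts.

```python
def rank_of_correct(scores, correct_index):
    """1-based rank of the correct candidate; higher score means better.

    Ties are counted against the correct candidate (pessimistic ranking).
    """
    target = scores[correct_index]
    better_or_equal = sum(
        1 for i, s in enumerate(scores) if i != correct_index and s >= target
    )
    return 1 + better_or_equal


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries; a rank of None contributes 0."""
    if not ranks:
        raise ValueError("need at least one query")
    return sum(0.0 if r is None else 1.0 / r for r in ranks) / len(ranks)


# Toy similarity scores for three queries (made-up values); in each row the
# correct code snippet is candidate 0.
query_scores = [
    [0.9, 0.2, 0.1],        # correct snippet ranked 1st
    [0.3, 0.8, 0.1],        # correct snippet ranked 2nd
    [0.1, 0.4, 0.2, 0.3],   # correct snippet ranked 4th
]
ranks = [rank_of_correct(scores, 0) for scores in query_scores]
mrr = mean_reciprocal_rank(ranks)  # (1 + 1/2 + 1/4) / 3
```

Because MRR depends on the size of the candidate pool for each query, an ablation of the kind the referee requests would need to hold that pool fixed across the compared models.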
minor comments (3)
- [Abstract] Abstract: No quantitative metrics, dataset names, or baseline references are supplied, making the SOTA assertion difficult to assess at a glance.
- [§3.1] §3.1: The notation for bimodal NL-PL pairs and the generator sampling process could be illustrated with a short concrete example to improve reproducibility.
- [Figure 1] Figure 1: The architecture diagram lacks labels for the replaced-token-detection head and the flow of unimodal data, reducing clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and indicate the revisions we will make.
-
Referee: [§4.1] §4.1 and Table 2: The SOTA claim on natural language code search reports improved MRR but supplies no ablation that isolates the replaced token detection component from standard MLM on identical bimodal data; without this, it is unclear whether the hybrid objective (rather than data scale or architecture) drives the reported gains over baselines such as RoBERTa.
Authors: We agree that an explicit ablation is needed to isolate the contribution of replaced token detection. In the revised version we will train a controlled baseline using standard MLM on exactly the same bimodal NL-PL pairs (with identical data scale and architecture) and report its MRR on the code search task alongside CodeBERT in an updated Table 2. This will allow direct attribution of gains to the hybrid objective.
Revision: yes
-
Referee: [§4.2] §4.2 and Table 3: The code documentation generation results claim SOTA BLEU scores after fine-tuning, yet no statistical significance tests or variance across random seeds are provided, and the contribution of the unimodal generator training step is not quantified via controlled removal.
Authors: We will add statistical rigor by reporting BLEU scores averaged over five random seeds with standard deviations and include paired significance tests against baselines in §4.2 and Table 3. We will also add a controlled ablation that removes the unimodal data from generator training while keeping all other settings fixed, quantifying its effect on downstream documentation generation performance.
Revision: yes
-
Referee: [§5] §5: The NL-PL probing dataset and zero-shot protocol are introduced to show superior performance, but the section does not report the exact number of probe examples per category or the precise metric (accuracy vs. F1) used for the comparison against prior models, weakening the interpretability of the knowledge-acquisition claim.
Authors: We will revise §5 to state the exact number of examples per probe category and explicitly note that accuracy is the evaluation metric. A supplementary table listing the category sizes will be added for full transparency.
Revision: yes
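The seed-averaged reporting and paired significance testing promised in the response on §4.2 could look roughly like the sketch below. The rebuttal does not say which test would be used, so the exact sign-flip permutation test here is an assumed choice. The score lists are placeholder numbers for illustration only, not results from the paper.

```python
import itertools
import statistics


def summarize(scores):
    """Mean and sample standard deviation across random seeds."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_sign_flip_p_value(a, b):
    """Exact two-sided paired permutation (sign-flip) test on the mean difference.

    a[i] and b[i] must come from the same seed or data split.
    """
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    patterns = list(itertools.product((1, -1), repeat=len(diffs)))
    at_least_as_extreme = sum(
        1
        for signs in patterns
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12
    )
    return at_least_as_extreme / len(patterns)


# Placeholder scores for two hypothetical systems over five seeds.
system_a = [10.0, 11.0, 12.0, 13.0, 14.0]
system_b = [9.0, 10.5, 11.0, 12.5, 13.0]

mean_a, std_a = summarize(system_a)
mean_b, std_b = summarize(system_b)
p_value = paired_sign_flip_p_value(system_a, system_b)
```

One design consequence is worth noting: with five paired runs there are only 32 sign patterns, so the smallest two-sided p-value this exact test can produce is 2/32 = 0.0625. Reaching a conventional 0.05 threshold would require more seeds or a different test, such as a bootstrap over test examples.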
Circularity Check
No circularity: empirical pre-training and evaluation
The paper describes CodeBERT as a Transformer model trained via a hybrid objective (replaced token detection on bimodal NL-PL pairs plus unimodal data for generators) and then fine-tuned for downstream tasks. All central claims rest on reported experimental metrics for code search and documentation generation, plus zero-shot probing, rather than any derivation, equation, or prediction that reduces to its own inputs by construction. No self-citation chains, ansatzes smuggled via prior work, or fitted parameters renamed as predictions appear in the load-bearing steps. The work is self-contained against external benchmarks and follows standard empirical ML practice.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Transformer-based models pre-trained with replaced token detection on bimodal data learn general-purpose NL-PL representations.
Forward citations
Cited by 31 Pith papers
-
Deep Graph-Language Fusion for Structure-Aware Code Generation
CGFuse enables deep token-level fusion of graph-derived structural features into language models, yielding 10-16% BLEU and 6-11% CodeBLEU gains on code generation tasks.
-
Identifying and Characterizing Semantic Clones of Solidity Functions
A code-and-comment analysis method detects semantic clones in Solidity functions with 59% overall precision (84% for same-name functions) and 97% recall on 300k contracts, plus LLM summaries for uncommented code.
-
RepoDoc: A Knowledge Graph-Based Framework to Automatic Documentation Generation and Incremental Updates
RepoDoc uses a repository knowledge graph with module clustering and semantic impact propagation to generate more complete documentation 3x faster with 85% fewer tokens and handle incremental updates 73% faster than p...
-
R2Code: A Self-Reflective LLM Framework for Requirements-to-Code Traceability
R2Code improves requirement-to-code traceability with a bidirectional alignment network, self-reflective consistency verification, and dynamic context-adaptive retrieval, yielding 7.4% average F1 gain and up to 41.7% ...
-
SynthFix: Adaptive Neuro-Symbolic Code Vulnerability Repair
SynthFix adaptively routes LLM code repairs to supervised fine-tuning or symbolic-reward fine-tuning, yielding up to 32% higher exact match on JavaScript and C vulnerability benchmarks.
-
AgentSZZ: Teaching the LLM Agent to Play Detective with Bug-Inducing Commits
AgentSZZ is an LLM-agent framework that identifies bug-inducing commits with up to 27.2% higher F1 scores than prior methods by enabling adaptive exploration and causal tracing, especially for cross-file and ghost commits.
-
GAIA: a benchmark for General AI Assistants
GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.
-
CodeBLEU: a Method for Automatic Evaluation of Code Synthesis
CodeBLEU improves correlation with human programmer scores on code synthesis tasks by adding syntactic AST matching and semantic data-flow matching to the standard BLEU n-gram approach.
-
GraphCodeBERT: Pre-training Code Representations with Data Flow
GraphCodeBERT uses data flow graphs in pre-training to capture semantic code structure and reaches state-of-the-art results on code search, clone detection, translation, and refinement.
-
NeuroFlake: A Neuro-Symbolic LLM Framework for Flaky Test Classification
NeuroFlake integrates discriminative token mining into LLMs to classify flaky tests, raising F1-score to 69.34% on FlakeBench while showing greater robustness to semantic-preserving perturbations than prior methods.
-
MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System
MAS-Algorithm is a multi-agent workflow that improves AI acceptance rates on algorithmic problems by 6.48% on average, outperforming parameter-efficient fine-tuning.
-
MAS-Algorithm: A Workflow for Solving Algorithmic Programming Problems with a Multi-Agent System
MAS-Algorithm is a multi-agent workflow that raises acceptance rates on algorithmic problems by 6.48% on average over baseline models.
-
Mitigating False Positives in Static Memory Safety Analysis of Rust Programs via Reinforcement Learning
Reinforcement learning on MIR features with fuzz testing feedback reduces false positives in Rust static memory safety analysis, raising precision from 25.6% to 59% and accuracy to 65.2% while keeping 74.6% recall.
-
Mitigating False Positives in Static Memory Safety Analysis of Rust Programs via Reinforcement Learning
Reinforcement learning on MIR features combined with cargo-fuzz validation reduces false positives in Rust static memory safety analysis, raising precision from 25.6% to 59.0% and accuracy to 65.2%.
-
VulStyle: A Multi-Modal Pre-Training for Code Stylometry-Augmented Vulnerability Detection
VulStyle pre-trains on 4.9M functions using code, non-terminal ASTs, and stylometry features, then fine-tunes to achieve SOTA F1 gains of 4-48% on BigVul and VulDeePecker.
-
A Metamorphic Testing Approach to Diagnosing Memorization in LLM-Based Program Repair
Metamorphic testing on Defects4J and GitBug-Java reveals substantial performance drops in seven LLMs that correlate with NLL, indicating data leakage in LLM-based program repair.
-
On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation
Continuous latent-vector compression improves BLEU scores on repository-level code tasks by up to 28.3% at 4x compression while cutting inference latency.
-
DiffHLS: Differential Learning for High-Level Synthesis QoR Prediction with GNNs and LLM Code Embeddings
DiffHLS predicts HLS QoR via differential learning: separate GNN+LLM models for kernel baseline and design delta are composed to yield the final estimate, showing lower MAPE than GNN baselines on PolyBench.
-
ContractShield: Bridging Semantic-Structural Gaps via Hierarchical Cross-Modal Fusion for Multi-Label Vulnerability Detection in Obfuscated Smart Contracts
ContractShield achieves 89% Hamming score and 91% F1-score for five vulnerability types in obfuscated smart contracts via hierarchical cross-modal fusion of semantic, temporal, and structural features with only 1-3% p...
-
GoCoMA: Hyperbolic Multimodal Representation Fusion for Large Language Model-Generated Code Attribution
GoCoMA fuses code stylometry and binary artifact images via hyperbolic Poincaré ball projection and geodesic-cosine attention to attribute LLM-generated code, outperforming baselines on CoDET-M4 and LLMAuthorBench.
-
RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation
RoboTwin 2.0 automates diverse synthetic data creation for dual-arm robots via MLLMs and five-axis domain randomization, leading to 228-367% gains in manipulation success.
-
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
DeBERTaV3 improves DeBERTa by switching to replaced token detection pre-training and using gradient-disentangled embedding sharing, reaching 91.37% on GLUE and new SOTA on XNLI zero-shot.
-
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
-
PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection
Controlled experiments show PLM-GNN hybrids improve code tasks over GNN-only baselines, with PLM source having larger impact than GNN backbone.
-
From Theory to Practice: Code Generation Using LLMs for CAPEC and CWE Frameworks
LLMs generated 615 vulnerable code snippets aligned with CAPEC and CWE frameworks across three languages, with 0.98 cosine similarity between model outputs.
-
Improving MPI Error Detection and Repair with Large Language Models and Bug References
Augmenting LLMs with bug references, few-shot learning, chain-of-thought, and RAG improves MPI error detection accuracy from 44% to 77% and generalizes across models.
-
What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review
Systematic review of 80 papers shows TTP extraction shifting to transformer and LLM methods but limited by narrow datasets, single-label focus, and low reproducibility.
-
StarCoder: may the source be with you!
StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.
-
Prompt-Driven Code Summarization: A Systematic Literature Review
A systematic review that categorizes prompting strategies for LLM-based code summarization, assesses their effectiveness, and identifies gaps in research and evaluation practices.
-
A systematic literature Review for Transformer-based Software Vulnerability detection
A review of 80 studies from 2021-2025 on transformer-based software vulnerability detection identifies trends in architectures, datasets, and challenges such as data imbalance and interpretability.
-
A Survey on Large Language Models for Code Generation
A systematic literature review that organizes recent work on LLMs for code generation into a taxonomy covering data curation, model advances, evaluations, ethics, environmental impact, and applications, with benchmark...
Reference graph
Works this paper leans on
-
[2]
Advances in neural information processing systems , pages=
Sequence to sequence learning with neural networks , author=. Advances in neural information processing systems , pages=
-
[3]
Proceedings of the 20th international conference on Computational Linguistics , pages=
Orange: a method for evaluating automatic evaluation metrics for machine translation , author=. Proceedings of the 20th international conference on Computational Linguistics , pages=. 2004 , organization=
work page 2004
-
[4]
Proceedings of the 40th annual meeting on association for computational linguistics , pages=
BLEU: a method for automatic evaluation of machine translation , author=. Proceedings of the 40th annual meeting on association for computational linguistics , pages=. 2002 , organization=
work page 2002
-
[7]
Improving language understanding by generative pre-training , author=. URL https://s3-us-west-2. amazonaws. com/openai-assets/researchcovers/languageunsupervised/language understanding paper. pdf , year=
-
[8]
BioBERT: a pre-trained biomedical language repre- sentationmodelforbiomedicaltextmining
Biobert: pre-trained biomedical language representation model for biomedical text mining , author=. arXiv preprint arXiv:1901.08746 , year=
-
[9]
Visualbert: A simple and perfor- 13 mant baseline for vision and language
Visualbert: A simple and performant baseline for vision and language , author=. arXiv preprint arXiv:1908.03557 , year=
-
[12]
Kevin Clark and Minh-Thang Luong and Quoc V. Le and Christopher D. Manning , booktitle=
-
[15]
Advances in neural information processing systems , pages=
Attention is all you need , author=. Advances in neural information processing systems , pages=
-
[19]
Advances in Neural Information Processing Systems , pages=
Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks , author=. Advances in Neural Information Processing Systems , pages=
-
[21]
International Conferenceon Learning Representations , year=
code2seq: Generating sequences from structured representations of code , author=. International Conferenceon Learning Representations , year=
- [22]
-
[23]
2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE) , pages=
Deep code search , author=. 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE) , pages=. 2018 , organization=
work page 2018
-
[24]
An introduction to neural information retrieval , author=. Foundations and Trends. 2018 , publisher=
work page 2018
-
[29]
Moses: Open source toolkit for statistical machine translation , author=. Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of the demo and poster sessions , pages=
-
[32]
Summarizing source code using a neural attention model , author=. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=
-
[33]
Adam: A Method for Stochastic Optimization
Adam: A method for stochastic optimization , author=. arXiv preprint arXiv:1412.6980 , year=
work page internal anchor Pith review Pith/arXiv arXiv
-
[34]
Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. 2019. code2seq: Generating sequences from structured representations of code. International Conferenceon Learning Representations
work page 2019
-
[35]
Kyunghyun Cho, Bart Van Merri \"e nboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078
work page internal anchor Pith review Pith/arXiv arXiv 2014
-
[36]
Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. \ ELECTRA \ : Pre-training text encoders as discriminators rather than generators. In International Conference on Learning Representations
work page 2020
-
[37]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[38]
Xiaodong Gu, Hongyu Zhang, and Sunghun Kim. 2018. Deep code search. In 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pages 933--944. IEEE
work page 2018
-
[39]
Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2019. Codesearchnet challenge: Evaluating the state of semantic code search. arXiv preprint arXiv:1909.09436
work page internal anchor Pith review Pith/arXiv arXiv 2019
-
[40]
Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, and Luke Zettlemoyer. 2016. Summarizing source code using a neural attention model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2073--2083
work page 2016
-
[41]
Dan Jurafsky. 2000. Speech & language processing. Pearson Education India
work page 2000
- [42]
-
[43]
Yoon Kim. 2014. Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882
work page Pith review arXiv 2014
-
[44]
Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, et al. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the 45th annual meeting of the association for computational linguistics companion volume proceedings of...
work page 2007
-
[45]
Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov, and Luke Zettlemoyer. 2019. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461
work page internal anchor Pith review Pith/arXiv arXiv 2019
-
[46]
Chin-Yew Lin and Franz Josef Och. 2004. Orange: a method for evaluating automatic evaluation metrics for machine translation. In Proceedings of the 20th international conference on Computational Linguistics, page 501. Association for Computational Linguistics
work page 2004
-
[47]
Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692
work page internal anchor Pith review Pith/arXiv arXiv 2019
-
[48]
Jiasen Lu, Dhruv Batra, Devi Parikh, and Stefan Lee. 2019. Vilbert: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks. In Advances in Neural Information Processing Systems, pages 13--23
work page 2019
-
[49]
Bhaskar Mitra, Nick Craswell, et al. 2018. An introduction to neural information retrieval. Foundations and Trends in Information Retrieval , 13(1):1--126
work page 2018
-
[50]
Matthew E Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. arXiv preprint arXiv:1802.05365
work page Pith review arXiv 2018
- [51]
-
[52]
Telmo Pires, Eva Schlinger, and Dan Garrette. 2019. How multilingual is multilingual bert? arXiv preprint arXiv:1906.01502
work page Pith review arXiv 2019
-
[53]
Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training. URL https://s3-us-west-2. amazonaws. com/openai-assets/researchcovers/languageunsupervised/language understanding paper. pdf
work page 2018
-
[54]
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2019. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683
work page internal anchor Pith review Pith/arXiv arXiv 2019
- [55]
- [56]
-
[57]
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Advances in neural information processing systems, pages 3104--3112
work page 2014
-
[58]
Kai Sheng Tai, Richard Socher, and Christopher D Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075
work page Pith review arXiv 2015
- [59]
-
[60]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, ukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in neural information processing systems, pages 5998--6008
work page 2017
-
[61]
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. 2016. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144
work page internal anchor Pith review Pith/arXiv arXiv 2016
- [62]
-
[63]
International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 1, F ebruary 2002: Special Issue on H ow N et and Its Applications. 2002
work page 2002
-
[64]
Tseng, Hui-Hsin and Liu, Chao-Lin and Gao, Zhao-Ming and Chen, Keh-Jiann. 以構詞律與相似法為本的中文動詞自動分類研究 (A Hybrid Approach for Automatic Classification of C hinese Unknown Verbs) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 1, F ebruary 2002: Special Issue on H ow N et and Its Applications. 2002
work page 2002
-
[65]
Word Sense Disambiguation and Sense-Based NV Event Frame Identifier
Tsai, Jia-Lin and Hsu, Wen-Lian and Su, Jeng-Woei. Word Sense Disambiguation and Sense-Based NV Event Frame Identifier. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 1, F ebruary 2002: Special Issue on H ow N et and Its Applications. 2002
work page 2002
-
[66]
一種基於知網的語義排歧模型研究 (A Study of Semantic Disambiguation Based on H ow N et) [In C hinese]
Yang, Xiaofeng and Li, Tangqiu. 一種基於知網的語義排歧模型研究 (A Study of Semantic Disambiguation Based on H ow N et) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 1, F ebruary 2002: Special Issue on H ow N et and Its Applications. 2002
work page 2002
-
[67]
基於文本概念和k NN 的跨語種文本過濾 (Cross-Language Text Filtering Based on Text Concepts and k NN ) [In C hinese
Su, Weifeng and Li, Shaozi and Li, Tanqiu and You, Wenjian. 基於文本概念和k NN 的跨語種文本過濾 (Cross-Language Text Filtering Based on Text Concepts and k NN ) [In C hinese. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 1, F ebruary 2002: Special Issue on H ow N et and Its Applications. 2002
work page 2002
-
[68]
International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[69]
Chen, Zusun and Zhou, Qiang and Zhao, Qiang. 情境 --- --- 組織/存放辭彙語義知識的恰當框架 (Situation -- A Suitable Framework for Organizing and Positioning Lexical Semantic Knowledge) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[70]
A Study on Word Similarity using Context Vector Models
Chen, Keh-Jiann and You, Jia-Ming. A Study on Word Similarity using Context Vector Models. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[71]
基於《知網》的辭彙語義相似度計算 (Word Similarity Computing Based on How-net) [In C hinese]
Liu, Qun and Li, Sujian. 基於《知網》的辭彙語義相似度計算 (Word Similarity Computing Based on How-net) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[72]
基於組合特徵的漢語名詞詞義消歧 (A Study on Noun Sense Disambiguation Based on Syntagmatic Features) [In C hinese]
Wang, Hui. 基於組合特徵的漢語名詞詞義消歧 (A Study on Noun Sense Disambiguation Based on Syntagmatic Features) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[73]
Kang, Shiyong. 《現代漢語新詞語資訊電子詞典》的研究與實現 (Development and Study of the `` Modern C hinese New Words Information Electronic Dictionary '' ) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[74]
Song, Rou and Xu, Yong. 基於詞彙語義的百科辭典知識提取實驗 (An Experiment on Knowledge Extraction from an Encyclopedia Based on Lexicon Semantics) [In C hinese]. International Journal of Computational Linguistics & C hinese Language Processing, Volume 7, Number 2, August 2002: Special Issue on Computational C hinese Lexical Semantics. 2002
work page 2002
-
[75]
Wu, Chung-Hsien and Tseng, Yuen-Hsien and Kao, Hung-Yu. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing ( ROCLING 2016). 2016
work page 2016
-
[76]
Hsu, Yao-Chi and Yang, Ming-Han and Hung, Hsiao-Tsung and Lin, Yi-Ju and Chen, Berlin. 評估尺度相關最佳化方法於華語錯誤發音檢測之研究(Evaluation Metric-related Optimization Methods for M andarin Mispronunciation Detection) [In C hinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing ( ROCLING 2016). 2016
work page 2016
-
[77]
Yang, Ming-Han and Hsu, Yao-Chi and Hung, Hsiao-Tsung and Chen, Ying-Wen and Chen, Berlin and Chen, Kuan-Yu. 融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究 (Leveraging Multi-task Learning with Neural Network Based Acoustic Modeling for Improved Meeting Speech Recognition) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing ( ...
[78]
Yan, Bi-Cheng and Shih, Chin-Hong and Liu, Shih-Hung and Chen, Berlin. 使用字典學習法於強健性語音辨識 (The Use of Dictionary Learning Approach for Robustness Speech Recognition) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[79]
Chan, Chia-Hsien and Chen, Chia-Ping. 以多層感知器辨識情緒於國台客語料庫 (Use Multilayer Perceptron To Recognize Emotion in Mandarin, Taiwanese and Hakka Database) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[80]
Huang, Shu-Ling and Li, Shi-Min and Bai, Ming-Hong and Wu, Jian-Cheng and Wang, Ying-Ni and Lin, Qing-Long. 「V到」結構的合分詞及語意區分 (Word segmentation and sense representation for V-dao structure in Chinese) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[81]
Kung, Shiang-Shiun and Ma, Cin-Hao and Shen, Sin-Fu and Hsiao, Po-Yuan and Tsai, Wei-Ho. 歌詞演唱錯誤偵測 (Automatic Sung Lyrics Verification) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[82]
Bai, Ming-Hong and Wu, Jian-Cheng and Chien, Ying-Ni and Huang, Shu-Ling and Lin, Ching-Lung. 基於詞語分布均勻度的核心詞彙選擇之研究 (A Study on Dispersion Measures for Core Vocabulary Compilation) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[83]
Chen, Pei-Yi and Chung, Siaw-Fong. 什麼時候「認真就輸了」? - 語料庫中「認真」一詞的語意變化 (Do We Lose When Being Serious? - Change in Meaning of the Word "Renzen (認真)" in Corpora). Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[84]
Huang, Tzu-Yun and Wu, Hsiao-Han and Lee, Chia-Chen and Lee, Shao-Man and Li, Guan-Wei and Hsieh, Shu-Kai. Crowdsourcing Experiment Designs for Chinese Word Sense Annotation. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[85]
Hsieh, Yu-Ming and Ma, Wei-Yun. 基於相依詞向量的剖析結果重估與排序 (N-best Parse Rescoring Based on Dependency-Based Word Embeddings). Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[86]
Pu, Guan-Ying and Chen, Po-Lin and Wu, Shih-Hung. 以語言模型評估學習者文句修改前後之流暢度 (Using language model to assess the fluency of learners sentences edited by teachers) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[87]
Hsieh, Yu-Lun and Liu, Shih-Hung and Chen, Kuan-Yu and Wang, Hsin-Min and Hsu, Wen-Lian and Chen, Berlin. 運用序列到序列生成架構於重寫式自動摘要 (Exploiting Sequence-to-Sequence Generation Framework for Automatic Abstractive Summarization) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[88]
Chen, Kuan-Hung and Liao, Shu-Han and Liao, Yuan-Fu and Wang, Yih-Ru. 基於字元階層之語音合成用文脈訊息擷取 (Character-Level Linguistic Features Extraction for Text-to-Speech System) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[89]
Kuan, Chih Yi and Su, Li and Chin, Yu Hao and Wang, Jia-Ching. 多通道之多重音頻串流方法之研究 (Multi-channel Source Clustering of Polyphonic Music) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[90]
Chen, Chia-Ying and Chen, Chia-Ping. Support Super-Vector Machines in Automatic Speech Emotion Recognition. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[91]
Liu, Chin-Ting and Chen, Li-mei and Lin, Yu-Ching and Cheng, Chia-Fang and Chang, Hui-chen. Speech Intelligibility and the Production of Fricative and Affricate among Mandarin-speaking Children with Cerebral Palsy. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[92]
Hu, Hsueh-ying and Chung, Siaw-Fong. 網路新興語言 '耍' 之語意辨析:以批踢踢語料庫為本 (On the semantic analysis of the verb shua3 in Taiwan Mandarin: The PTT corpus-based study) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[93]
Wang, Xu-Xiang and Zheng, Zhi-Hao and Tsao, Yu and Hong, Jhih-Wei. 非負矩陣分解法於語音調變頻譜強化之研究 (A study of enhancing the modulation spectrum of speech signals via nonnegative matrix factorization) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[94]
Chen, Yao-Hui and Wang, Jhih-Wei. 以多重表示選擇文章分類的樣本 (Using Multiple Representations to Select Instances for Text Classification) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[95]
Tran, Thien Khai and Phan, Tuoi Thi. Computing Sentiment Scores of Verb Phrases for Vietnamese. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[96]
Rysová, Kateřina and Rysová, Magdaléna and Mírovský, Jiří. Automatic evaluation of surface coherence in L2 texts in Czech. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016
[97]
Lin, Yuan-Hao and Chang, Chia-Hui. Facebook 活動事件擷取系統 (Facebook Activity Event Extraction System) [In Chinese]. Proceedings of the 28th Conference on Computational Linguistics and Speech Processing (ROCLING 2016). 2016