{"paper":{"title":"Exploiting Pre-trained Encoder-Decoder Transformers for Sequence-to-Sequence Constituent Parsing","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Pre-trained encoder-decoder models like BART and T5, when fine-tuned to output linearized parse trees, outperform earlier sequence-to-sequence parsers and compete with specialized constituent parsers on continuous data.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Cristina Outeiri\\~no Cid, Daniel Fern\\'andez-Gonz\\'alez","submitted_at":"2026-05-13T11:28:56Z","abstract_excerpt":"To achieve deep natural language understanding, syntactic constituent parsing plays a crucial role and is widely required by many artificial intelligence systems for processing both text and speech. A recent approach involves using standard sequence-to-sequence models to handle constituent parsing as a machine translation problem, moving away from traditional task-specific parsers. These models are typically initialized with pre-trained encoder-only language models like BERT or RoBERTa. 
However, the use of pre-trained encoder-decoder language models for constituency parsing has not been thorou"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Our results demonstrate that our approach outperforms all prior sequence-to-sequence models and performs competitively with leading task-specific constituent parsers on continuous constituent parsing.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That standard fine-tuning of encoder-decoder models on linearized trees is sufficient to capture the full syntactic structure without additional task-specific mechanisms or architectural changes.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Pre-trained encoder-decoder transformers fine-tuned for sequence-to-sequence constituent parsing outperform prior seq2seq models and compete with specialized parsers on continuous treebanks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Pre-trained encoder-decoder models like BART and T5, when fine-tuned to output linearized parse trees, outperform earlier sequence-to-sequence parsers and compete with specialized constituent parsers on continuous data.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7f91b8c5c0cc883d676bb24b0b3b6ae3cbe93e4adefac0b6a4cbcd6cd748e604"},"source":{"id":"2605.13373","kind":"arxiv","version":1},"verdict":{"id":"f934572b-21b2-4149-b8a0-cd11a7de4353","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T19:53:57.933858Z","strongest_claim":"Our results demonstrate that our approach outperforms all prior sequence-to-sequence models and performs competitively with leading task-specific constituent parsers on continuous 
constituent parsing.","one_line_summary":"Pre-trained encoder-decoder transformers fine-tuned for sequence-to-sequence constituent parsing outperform prior seq2seq models and compete with specialized parsers on continuous treebanks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That standard fine-tuning of encoder-decoder models on linearized trees is sufficient to capture the full syntactic structure without additional task-specific mechanisms or architectural changes.","pith_extraction_headline":"Pre-trained encoder-decoder models like BART and T5, when fine-tuned to output linearized parse trees, outperform earlier sequence-to-sequence parsers and compete with specialized constituent parsers on continuous data."},"references":{"count":62,"sample":[{"doi":"10.18653/v1/d18-1037","year":2018,"title":"J. Gū, H. S. Shavarani, A. Sarkar, Top-down tree structured decoding with syntactic connections for neural machine translation and parsing, in: Proceedings of the 2018 Conference on Empirical Method","work_id":"d9e01bc1-378c-426e-9306-87d3e715ef23","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2018,"title":"X. Wang, H. Pham, P. Yin, G. Neubig, A tree-based decoder for neural machine translation, in: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for ","work_id":"c22def47-2ae9-4c55-94b0-fa9f6fdbfcdd","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"A. Currey, K. Heafield, Incorporating source syntax into transformer-based neural machine translation, in: Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers), A","work_id":"1d6e7d82-2803-40e2-bc29-3535685caeac","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.2174/2666255813999200922142212","year":2022,"title":"D. Bouras, M. Amroune, H. Bendjenna, I. 
Bendib, Improving fine-grained opinion mining approach with a deep constituency tree-long short term memory network and word embedding, Recent Advances in Comp","work_id":"b1390ea3-5358-49d3-b7a5-0d395d3eb6b9","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.18653/v1/2020.acl-main.341","year":2020,"title":"Sentibert: A transferable transformer-based architecture for compositional sentiment semantics","work_id":"d5f43344-7028-4c72-821d-11336fcce8e1","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":62,"snapshot_sha256":"8bdea998f3343ad16bb7c02726f675fbea3dd4f37c7629aa6641c6adf32b1e07","internal_anchors":2},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}