Automatic Generation of Titles for Research Papers Using Language Models

Debarshi Kumar Sanyal; Samiran Chattopadhyay; Tohida Rehman

arxiv: 2606.05085 · v1 · pith:5SBTPGD3new · submitted 2026-06-03 · 💻 cs.CL · cs.AI

Pith reviewed 2026-06-28 06:42 UTC · model grok-4.3

keywords: title generation, language models, PEGASUS, abstracts, research papers, fine-tuning, automatic evaluation metrics, SpringerSSAT dataset

The pith

Fine-tuned PEGASUS-large generates research paper titles from abstracts that score higher than those of fine-tuned LLaMA-3-8B or zero-shot GPT-3.5-turbo on most standard automatic metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that language models can be applied to create titles directly from paper abstracts, with a new dataset added to existing ones for training and testing. It compares several models and finds that fine-tuning PEGASUS-large yields the strongest results on automatic scores while zero-shot GPT-3.5-turbo lags behind. The work shows that such generated titles are generally appropriate and that ChatGPT can also produce creative variants. A sympathetic reader would care because selecting a clear title is a recurring difficulty for authors, and reliable automation could reduce that effort.

Core claim

The central claim is that fine-tuned PEGASUS-large outperforms fine-tuned LLaMA-3-8B and zero-shot GPT-3.5-turbo on most of the reported metrics (ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore) when generating titles from abstracts on the CSPubSum, LREC-COLING-2024, and new SpringerSSAT datasets, and that the resulting titles are generally appropriate and reliable.

What carries the argument

Fine-tuned PEGASUS-large applied to abstract-to-title generation, evaluated by overlap and embedding-based metrics.
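To make the overlap side of that evaluation concrete, here is a minimal sketch of a ROUGE-1-style unigram F1 score. It is a simplified illustration, not the scoring code used in the paper: the function name, the whitespace tokenization, and the example titles are all assumptions made for this sketch, and an official ROUGE implementation (with stemming and other options) may give different numbers.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate title and a reference title.

    Simplified: lowercases and splits on whitespace, with no stemming,
    so scores will not match an official ROUGE package exactly.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: each reference token can be matched at most once.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical titles, invented for illustration only.
reference = "automatic generation of titles for research papers"
near_copy = "automatic generation of titles for papers"
paraphrase = "how language models can name scientific articles"

print(rouge1_f1(near_copy, reference))   # high: almost every word is shared
print(rouge1_f1(paraphrase, reference))  # zero: no words are shared
```

The second candidate shares no words with the reference and so scores zero, regardless of whether a reader would find it a reasonable title. Embedding-based metrics such as BERTScore are designed to soften this, but they still compare against a single reference title.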

If this is right

Authors in computer science and social sciences can use the fine-tuned PEGASUS model as a practical assistant when drafting titles.
The SpringerSSAT dataset provides additional training material for future title-generation work in the social sciences.
Zero-shot prompting of GPT-3.5-turbo is shown to be less competitive than fine-tuned smaller models on the chosen metrics.
ChatGPT can be prompted to produce creative title alternatives that differ from the more literal outputs of the fine-tuned models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same fine-tuning approach could be tested on generating titles from other sections such as conclusions or introductions.
If automatic metrics correlate poorly with human preference, future work would need to collect direct human ratings to guide model selection.
The finding that fine-tuned open models beat zero-shot large models suggests similar patterns may hold for other short-text generation tasks in academic writing.

Load-bearing premise

That automatic metrics such as ROUGE and BERTScore serve as reliable stand-ins for whether human readers or domain experts would judge the generated titles as appropriate, clear, and appealing.
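One way this premise could be probed, once human ratings exist, is a rank correlation between per-title metric scores and per-title human ratings. The sketch below computes Spearman's rho in plain Python. The helper names and the numbers are hypothetical and are not taken from the paper; it only illustrates the kind of check the premise calls for.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one automatic score and one human rating per title.
metric_scores = [0.62, 0.48, 0.55, 0.71, 0.40]
human_ratings = [4, 3, 3, 5, 2]

print(spearman(metric_scores, human_ratings))
```

A rho near 1 would support using the metric as a stand-in for human judgment on this task; a rho near 0 would undercut it.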

What would settle it

A human evaluation study in which domain experts rate the titles produced by each model and the model ranked highest by automatic metrics receives lower average scores for appropriateness or appeal than a lower-ranked model.

Original abstract

The title of a research paper conveys its primary idea and, occasionally, its conclusions in a clear and concise manner. Choosing an appropriate title is often challenging, and automated title generation can assist authors in this task. In this work, we propose a technique to generate paper titles from abstracts using open-weight pre-trained and large language models. We use the CSPubSum and LREC-COLING-2024 datasets and introduce a new dataset, SpringerSSAT, curated from four Springer journals in the social sciences. Additionally, we use GPT-3.5-turbo in a zero-shot setting to generate titles. Model performance is evaluated with ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore metrics. Our experiments show that fine-tuned PEGASUS-large outperforms other models, including fine-tuned LLaMA-3-8B and zero-shot GPT-3.5-turbo, across most metrics. We further demonstrate that ChatGPT can generate creative paper titles. Overall, AI-generated titles are generally appropriate and reliable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds one new dataset but its performance claims rest on automatic metrics with no human validation or training details shown.

The concrete contribution is the SpringerSSAT dataset pulled from four Springer social-science journals. They combine it with CSPubSum and LREC-COLING-2024, fine-tune PEGASUS-large and LLaMA-3-8B, run zero-shot GPT-3.5-turbo, and note that ChatGPT can produce creative titles. That dataset step is useful if someone needs more social-science examples.

The rest is standard. Title generation from abstracts is already in the literature, and the paper does not introduce a new method or first-principles result. The headline finding is that fine-tuned PEGASUS-large beats the others on most of the reported metrics: ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore. No training hyperparameters, split sizes, or significance tests appear in the abstract, and the full text does not change that picture.

The bigger issue is the evaluation. These metrics measure overlap with reference titles; they do not directly test whether a title is clear, specific, or appealing to readers. The paper offers no human study or correlation check to show the scores track expert judgment. Without that link, the outperformance claim stays weak. The note that ChatGPT titles are creative is also unsupported by any systematic comparison.

This work is for people who want another off-the-shelf dataset or a quick model bake-off for a title tool. It does not advance the core NLP question. I would not bring it to a reading group, would not cite it, and would not send it to serious peer review. The dataset is the only part worth keeping; the claims need human data and fuller experimental reporting before they are worth referee time.

Referee Report

2 major / 2 minor

Summary. The paper proposes generating research paper titles from abstracts using open-weight pre-trained models (fine-tuned PEGASUS-large and LLaMA-3-8B) and zero-shot GPT-3.5-turbo. It introduces the SpringerSSAT dataset curated from Springer social science journals, alongside CSPubSum and LREC-COLING-2024. Performance is measured with ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore; the central claim is that fine-tuned PEGASUS-large outperforms the other models across most metrics. The work also notes that ChatGPT produces creative titles and concludes that AI-generated titles are generally appropriate and reliable.

Significance. If the automatic-metric results are shown to track human judgments of title quality, the work would provide a practical demonstration that fine-tuned sequence-to-sequence models can assist title selection, together with a new public dataset (SpringerSSAT) that expands coverage to the social sciences. The use of open-weight models and explicit comparison to a strong zero-shot baseline are positive features that support reproducibility.

major comments (2)

[Abstract / Experiments] Abstract and Experiments section: the headline claim that fine-tuned PEGASUS-large 'outperforms other models across most metrics' is presented without any description of training hyperparameters, data splits, random seeds, or statistical significance tests. This absence makes it impossible to determine whether the reported gains are robust or could be artifacts of a single run.
[Evaluation] Evaluation section: the paper relies exclusively on ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore as evidence of title quality. No human evaluation or correlation study is reported to show that higher scores on these metrics correspond to titles judged clearer, more specific, or more appealing by domain experts or readers, which are the very properties the abstract states titles must convey.

minor comments (2)

[Abstract] The abstract states that 'AI-generated titles are generally appropriate and reliable' without quantifying what fraction of outputs were inspected or by what criteria appropriateness was judged.
[Datasets] Dataset construction details for SpringerSSAT (e.g., filtering criteria, number of papers per journal, train/dev/test splits) are referenced but not fully specified in the provided text.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address the major points below and will revise the manuscript to enhance reproducibility and transparency.

Referee: [Abstract / Experiments] Abstract and Experiments section: the headline claim that fine-tuned PEGASUS-large 'outperforms other models across most metrics' is presented without any description of training hyperparameters, data splits, random seeds, or statistical significance tests. This absence makes it impossible to determine whether the reported gains are robust or could be artifacts of a single run.

Authors: We agree that these experimental details are necessary for assessing robustness. In the revised manuscript we will expand the Experiments section with a full description of training hyperparameters, data splits (including how the train/validation/test partitions were created for each dataset), random seeds, and statistical significance testing (e.g., bootstrap or paired tests) between model outputs.
Revision: yes

Referee: [Evaluation] Evaluation section: the paper relies exclusively on ROUGE, METEOR, MoverScore, BERTScore, and SciBERTScore as evidence of title quality. No human evaluation or correlation study is reported to show that higher scores on these metrics correspond to titles judged clearer, more specific, or more appealing by domain experts or readers, which are the very properties the abstract states titles must convey.

Authors: We recognize that automatic metrics alone do not fully capture title quality. We will add an explicit limitations paragraph that discusses the reliance on automatic metrics, cites prior work on their correlation with human judgments in summarization and title-generation settings, and includes additional qualitative examples comparing titles produced by each model. A dedicated human evaluation study lies outside the scope of the present work and is noted as future research.
Revision: partial
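The rebuttal mentions bootstrap or paired tests without saying which procedure would be used. As one possible reading, the sketch below shows a paired bootstrap over per-title scores from two systems evaluated on the same test items. The function name, its parameters, and the score lists are hypothetical; this is not the authors' procedure.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Fraction of resamples in which system A's total score beats B's.

    The same test items are resampled with replacement for both systems,
    so the comparison stays paired item by item.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty, equally long score lists")
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    n = len(scores_a)
    wins = 0
    for _ in range(n_resamples):
        sample = [rng.randrange(n) for _ in range(n)]
        diff = sum(scores_a[i] - scores_b[i] for i in sample)
        if diff > 0:
            wins += 1
    return wins / n_resamples


# Hypothetical per-title metric scores for two systems on the same titles.
system_a = [0.52, 0.61, 0.47, 0.70, 0.58, 0.44, 0.66, 0.59]
system_b = [0.50, 0.55, 0.49, 0.62, 0.57, 0.46, 0.60, 0.51]

win_rate = paired_bootstrap(system_a, system_b)
print(f"A beats B in {win_rate:.1%} of resamples")
```

A win rate close to 1 suggests the gap is stable across resampled test sets; a rate near 0.5 suggests the ranking could flip with a different sample. Repeating the comparison across several fine-tuning seeds would address the single-run concern separately.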

Circularity Check

0 steps flagged

No circularity in derivation chain

The paper evaluates fine-tuned models and zero-shot GPT-3.5-turbo on title generation using standard automatic metrics (ROUGE, METEOR, MoverScore, BERTScore, SciBERTScore) against reference titles from external datasets (CSPubSum, LREC-COLING-2024, and the newly introduced SpringerSSAT). No load-bearing steps reduce any prediction or result to the paper's own inputs by construction, self-definition, or self-citation chains. The outperformance claim is an empirical comparison, not a tautology, and the derivation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that fine-tuning on abstract-title pairs transfers to good title generation and that automatic metrics correlate with human notions of title quality; no free parameters or invented entities are described in the abstract.

axioms (1)

Domain assumption: Pre-trained language models can be fine-tuned on abstract-title pairs to improve title generation performance.
This premise justifies the use of fine-tuned PEGASUS and LLaMA models in the experiments.

[1] [1]

Article title type and its relation with the number of downloads and citations.Scientometrics, 88(2):653–661, 2011

Hamid R Jamali and Mahsa Nikzad. Article title type and its relation with the number of downloads and citations.Scientometrics, 88(2):653–661, 2011

2011

[2] [2]

The advantage of short paper titles.Royal Society Open Science, 2(8):150266, 2015

Adrian Letchford, Helen Susannah Moat, and Tobias Preis. The advantage of short paper titles.Royal Society Open Science, 2(8):150266, 2015

2015

[3] [3]

Integrated construction and simulation of tool paths for milling dental crowns and bridges

Fatemeh Rostami, Asghar Mohammad- poorasl, and Mohammad Hajizadeh. The effect of characteristics of title on cita- tion rates of articles.Scientometrics, 98:2007–2010, 2014. T able 18:Model vs human, and human vs human evaluation on 10 selected examples fromLREC-COLING- 2024dataset. The models are fine-tuned onCSPubSumtraining set. All scores are reported...

work page doi:10.1007/s43545-024-00886-w 2007

[4] [4]

Active Learning Design Choices for NER with Transformers

Tohida Rehman, Debarshi Kumar Sanyal, and Samiran Chattopadhyay. Can pre- trained language models generate titles for research papers? InInternational Conference on Asian Digital Libraries, pages 154–170. Springer, 2024. 20 T able 23:Comparison of author-written titles, model-generated titles (from PEGASUS-large and LLaMA-3-8B fine-tuned on theCSPubSumtra...

2024

[5] [5]

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. Exploring the limits of transfer learning with a unified text-to-text trans- former.Journal of Machine Learning Research, 21(140):1–67, 2020

2020

[6] [6]

BART: Denoising sequence- to-sequence pre-training for natural lan- guage generation, translation, and compre- hension

Mike Lewis, Yinhan Liu, Naman Goyal, Mar- jan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, and Luke Zettlemoyer. BART: Denoising sequence- to-sequence pre-training for natural lan- guage generation, translation, and compre- hension. In Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault, editors,Proceed- ings of the 58th Ann...

2020

[7] [7]

Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter J. Liu. PEGASUS: pre- training with extracted gap-sentences for abstractive summarization. InProceed- ings of the 37th International Conference on Machine Learning, ICML’20. JMLR.org, 2020

2020

[8] [8]

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timoth´ ee Lacroix, Baptiste Rozi` ere, Naman Goyal, Eric Hambro, Faisal Azhar, et al. LLaMA: Open and efficient foundation language models.arXiv preprint arXiv:2302.13971, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023

[9] [9]

Llama 3 model card

AI@Meta. Llama 3 model card. 2024

2024

[10] [10]

ROUGE: A package for auto- matic evaluation of summaries

Chin-Yew Lin. ROUGE: A package for auto- matic evaluation of summaries. InText Summarization Branches Out, pages 74–81, 2004. 21

2004

[11] [11]

METEOR: An automatic metric for mt evaluation with improved correlation with human judgments

Satanjeev Banerjee and Alon Lavie. METEOR: An automatic metric for mt evaluation with improved correlation with human judgments. InProceedings of the ACL Workshop on Intrinsic and Extrinsic Eval- uation Measures for Machine Translation and/or Summarization, pages 65–72, 2005

2005

[12] [12]

Meyer, and Steffen Eger

Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, and Steffen Eger. MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance. In Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan, editors,Proceedings of the 2019 Conference on Empirical Meth- ods in Natural Language Processing and the 9th Internation...

2019

[13] [13]

Association for Computational Linguis- tics

[14] [14]

BERTScore: Evaluating text generation with BERT

Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q Weinberger, and Yoav Artzi. BERTScore: Evaluating text generation with BERT. InProceedings of the Interna- tional Conference on Learning Representa- tions, 2020

2020

[15] [15]

Entity-level factual consistency of abstractive text summarization

Feng Nan, Ramesh Nallapati, Zhiguo Wang, Cicero Nogueira dos Santos, Henghui Zhu, Dejiao Zhang, Kathleen McKeown, and Bing Xiang. Entity-level factual consistency of abstractive text summarization. In Paola Merlo, Jorg Tiedemann, and Reut Tsarfaty, editors,Proceedings of the 16th Conference of the European Chapter of the Association for Computational Ling...

2021

[16] [16]

The automatic creation of literature abstracts.IBM Journal of Research and Development, 2(2):159–165, 1958

Hans Peter Luhn. The automatic creation of literature abstracts.IBM Journal of Research and Development, 2(2):159–165, 1958

1958

[17] [17]

Compendium: A text summarization system for generating abstracts of research papers. Data & Knowledge Engineering, 88:164–175, 2013

Elena Lloret, María Teresa Romá-Ferri, and Manuel Palomar. Compendium: A text summarization system for generating abstracts of research papers. Data & Knowledge Engineering, 88:164–175, 2013

2013

[18] [18]

Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’14, pages 3104–3112, Cambridge, MA, USA, 2014. MIT Press

2014

[19] [19]

Neural machine translation by jointly learning to align and translate

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations, 2015

2015

[20] [20]

Abstractive text summarization using sequence-to-sequence RNNs and beyond

Ramesh Nallapati, Bowen Zhou, Cicero dos Santos, Caglar Gulcehre, and Bing Xiang. Abstractive text summarization using sequence-to-sequence RNNs and beyond. In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, pages 280–290, Berlin, Germany, 2016. Association for Computational Linguistics

2016

[21] [21]

Get to the point: Summarization with pointer-generator networks

A. See, Peter J. Liu, and Christopher D. Manning. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017

2017

[22] [22]

Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017

2017

[23] [23]

BERT: Pre-training of deep bidirectional transformers for language understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, 2019

2019

[24] [24]

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. Exploring the limits of transfer learning with a unified text-to-text transformer. arXiv preprint arXiv:1910.10683, 2019

2019

[25] [25]

PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization

Jingqing Zhang, Yao Zhao, Mohammad Saleh, and Peter Liu. PEGASUS: Pre-training with extracted gap-sentences for abstractive summarization. In Proceedings of the International Conference on Machine Learning (ICML), pages 11328–11339. PMLR, 2020

2020

[26] [26]

From neural sentence summarization to headline generation: A coarse-to-fine approach

Jiwei Tan, Xiaojun Wan, and Jianguo Xiao. From neural sentence summarization to headline generation: A coarse-to-fine approach. In IJCAI, volume 17, pages 4109–4115, 2017.

2017

[27] [27]

Automatic title generation in scientific articles for authorship assistance: a summarization approach. Journal of ICT Research and Applications, 11(3):253–267, 2017

Jan Wira Gotama Putra and Masayu Leylia Khodra. Automatic title generation in scientific articles for authorship assistance: a summarization approach. Journal of ICT Research and Applications, 11(3):253–267, 2017

2017

[28] [28]

Automatic title generation for text with pre-trained transformer language model

Prakhar Mishra, Chaitali Diwan, Srinath Srinivasa, and Gopalakrishnan Srinivasaraghavan. Automatic title generation for text with pre-trained transformer language model. In Proceedings of the 2021 IEEE 15th International Conference on Semantic Computing (ICSC), pages 17–24. IEEE, 2021

2021

[29] [29]

Paper abstract writing through editing mechanism. arXiv preprint arXiv:1805.06064, 2018

Qingyun Wang, Zhihao Zhou, Lifu Huang, Spencer Whitehead, Boliang Zhang, Heng Ji, and Kevin Knight. Paper abstract writing through editing mechanism. arXiv preprint arXiv:1805.06064, 2018

2018

[30] [30]

A dataset of attributes from papers of a machine learning conference. Data in Brief, 24:103836, 2019

Diego Vallejo-Huanga, Paulina Morillo, and Cèsar Ferri. A dataset of attributes from papers of a machine learning conference. Data in Brief, 24:103836, 2019

2019

[31] [31]

Generating accurate and engaging research paper titles using NLP techniques

Thulasi Bikku, Nirmala Rani Narimalla, Keerthi Konda, Anusha Nakkala, Avanti Yarlagadda, and B Sachuthananthan. Generating accurate and engaging research paper titles using NLP techniques. In International Conference on Innovations in Bio-Inspired Computing and Applications, pages 428–437. Springer, 2023

2023

[32] [32]

OAG-BERT: Towards a unified backbone language model for academic knowledge services

Xiao Liu, Da Yin, Jingnan Zheng, Xingjian Zhang, Peng Zhang, Hongxia Yang, Yuxiao Dong, and Jie Tang. OAG-BERT: Towards a unified backbone language model for academic knowledge services. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3418–3428, 2022

2022

[33] [33]

Automatic generation of research highlights from scientific abstracts

Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, and Partha Pratim Das. Automatic generation of research highlights from scientific abstracts. In Proceedings of the 2nd Workshop on Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE 2021), collocated with JCDL 2021, pages 69–70, 2021

2021

[34] [34]

Named entity recognition based automatic generation of research highlights

Tohida Rehman, Debarshi Kumar Sanyal, Prasenjit Majumder, and Samiran Chattopadhyay. Named entity recognition based automatic generation of research highlights. In Proceedings of the Workshop on Scholarly Data Processing (SDP 2022), collocated with COLING 2022, pages 163–169. ACL, 2022

2022

[35] [35]

Research highlight generation with ELMo contextual embeddings. Scalable Computing: Practice and Experience, 24(2):181–190, 2023

Tohida Rehman, Debarshi Kumar Sanyal, and Samiran Chattopadhyay. Research highlight generation with ELMo contextual embeddings. Scalable Computing: Practice and Experience, 24(2):181–190, 2023

2023

[36] [36]

Generation of highlights from research papers using pointer-generator networks and SciBERT embeddings. IEEE Access, 11:91358–91374, 2023

Tohida Rehman, Debarshi Kumar Sanyal, Samiran Chattopadhyay, Plaban Kumar Bhowmick, and Partha Pratim Das. Generation of highlights from research papers using pointer-generator networks and SciBERT embeddings. IEEE Access, 11:91358–91374, 2023

2023

[37] [37]

Why and how to embrace AI such as ChatGPT in your academic life

Zhicheng Lin. Why and how to embrace AI such as ChatGPT in your academic life. Royal Society Open Science, 10(8):230658, 2023

2023

[38] [38]

Edward J. Ciaccio. Use of artificial intelligence in scientific paper writing. Informatics in Medicine Unlocked, 41:101253, 2023

2023

[39] [39]

Modest: A dataset for multi domain scientific title generation. Knowledge-Based Systems, page 113557, 2025

Necva Bölücü, Yunus Can Bilge, Dilber Çetintaş, and Zehra Yücel. Modest: A dataset for multi domain scientific title generation. Knowledge-Based Systems, page 113557, 2025

2025

[40] [40]

A supervised approach to extractive summarisation of scientific papers

Ed Collins, Isabelle Augenstein, and Sebastian Riedel. A supervised approach to extractive summarisation of scientific papers. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 195–205. Association for Computational Linguistics, 2017

2017

[41] [41]

Language models are few-shot learners

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.

2020

[42] [42]

SciBERT: A pretrained language model for scientific text

Iz Beltagy, Kyle Lo, and Arman Cohan. SciBERT: A pretrained language model for scientific text. In Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan, editors, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3615–36...

2019

[43] [43]

From word embeddings to document distances

Matt Kusner, Yu Sun, Nicholas Kolkin, and Kilian Weinberger. From word embeddings to document distances. In Proceedings of the International Conference on Machine Learning, pages 957–966. PMLR, 2015

2015

[44] [44]

Evaluating the factual consistency of abstractive text summarization

Wojciech Kryscinski, Bryan McCann, Caiming Xiong, and Richard Socher. Evaluating the factual consistency of abstractive text summarization. In Bonnie Webber, Trevor Cohn, Yulan He, and Yang Liu, editors, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9332–9346, Online, November 2020. Associatio...

2020