An LLM-Powered Semantic Alignment Framework for Journal Recommendation
Pith reviewed 2026-06-29 02:38 UTC · model grok-4.3
The pith
Large language models can recommend journals by matching manuscript semantics directly to journal scope descriptions without any task-specific training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an LLM-powered semantic alignment framework can treat journal recommendation as direct semantic matching between manuscript content and journal scope descriptions, allowing accurate recommendations without task-specific training. Experiments with DeepSeek-V3 on 23,609 articles from 49 journals yield Top-3, Top-5, and Top-10 accuracies of 40.23 percent, 53.67 percent, and 70.05 percent. Adding reference information improves results, repeated runs show an average Top-5 Jaccard similarity of 84 percent, and the model supplies interpretable reasoning for its choices.
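To make the headline metric concrete, here is a minimal sketch of Top-k accuracy as it is usually defined: the share of manuscripts whose publishing journal appears among the first k recommendations. The function name and the toy journal codes are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def top_k_accuracy(rankings: Sequence[Sequence[str]],
                   truths: Sequence[str], k: int) -> float:
    """Fraction of manuscripts whose true journal is in the first k recommendations."""
    if len(rankings) != len(truths):
        raise ValueError("rankings and truths must have the same length")
    if not truths:
        raise ValueError("need at least one manuscript")
    hits = sum(truth in ranked[:k] for ranked, truth in zip(rankings, truths))
    return hits / len(truths)


# Toy data with hypothetical journal codes (not from the paper).
rankings = [
    ["JASA", "AOS", "BKA", "JRSSB"],
    ["BKA", "JASA", "AOS", "JRSSB"],
    ["AOS", "JRSSB", "JASA", "BKA"],
    ["JRSSB", "BKA", "AOS", "JASA"],
]
truths = ["JASA", "AOS", "BKA", "JRSSB"]

print(top_k_accuracy(rankings, truths, 1))  # 0.5
print(top_k_accuracy(rankings, truths, 3))  # 0.75
```

Under this definition the metric can only rise as k grows, which is consistent with the reported ordering of the Top-3, Top-5, and Top-10 figures.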
What carries the argument
The semantic alignment process in which the LLM infers suitability by comparing article titles, abstracts, keywords, and candidate journal descriptions.
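The paper's actual prompt is not published (the referee report flags this), so the following is only a hypothetical sketch of how manuscript fields and journal scope descriptions might be combined into a single zero-shot prompt. The function name, wording, and output format are all assumptions made for illustration.

```python
from typing import Mapping, Sequence


def build_prompt(title: str, abstract: str, keywords: Sequence[str],
                 journal_scopes: Mapping[str, str], top_n: int = 10) -> str:
    """Assemble a zero-shot ranking prompt (hypothetical template, not the authors')."""
    journal_lines = "\n".join(
        f"- {name}: {scope}" for name, scope in journal_scopes.items()
    )
    return (
        "You are helping an author choose a journal.\n\n"
        f"Title: {title}\n"
        f"Abstract: {abstract}\n"
        f"Keywords: {', '.join(keywords)}\n\n"
        "Candidate journals and their scope descriptions:\n"
        f"{journal_lines}\n\n"
        f"Rank the {top_n} most suitable journals, one per line, "
        "in the form '<rank>. <journal name> - <one-sentence reason>'."
    )


# Hypothetical inputs for illustration only.
example = build_prompt(
    title="A robust estimator for heavy-tailed regression",
    abstract="We study ...",
    keywords=["robust statistics", "regression"],
    journal_scopes={"Journal A": "theory of statistics",
                    "Journal B": "applied biostatistics"},
    top_n=2,
)
print(example)
```

A template like this contains no labelled examples, which is what "without task-specific training" would require; whether the authors' prompt is equally bare is exactly what the review asks them to document.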
If this is right
- Journal recommendation systems can operate without access to historical submission records or user interaction logs.
- The generated reasoning outputs supply explicit explanations that link manuscript content to journal fit.
- Including reference lists as additional input measurably raises recommendation accuracy.
- Recommendations remain consistent across independent runs of the same model.
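The consistency claim rests on Top-5 Jaccard similarity between repeated runs. A minimal sketch of one way to compute it is shown below, assuming the score for a manuscript is the mean Jaccard similarity over all pairs of runs; the paper may aggregate differently, and the run data here are made up.

```python
from itertools import combinations
from typing import Sequence


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two recommendation sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # convention: two empty sets are identical
    return len(sa & sb) / len(sa | sb)


def mean_pairwise_jaccard(runs: Sequence[Sequence[str]], k: int = 5) -> float:
    """Average Top-k Jaccard similarity over every pair of runs (assumed aggregation)."""
    tops = [run[:k] for run in runs]
    pairs = list(combinations(tops, 2))
    if not pairs:
        raise ValueError("need at least two runs")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Three hypothetical runs for a single manuscript.
runs = [
    ["A", "B", "C", "D", "E"],
    ["A", "B", "C", "D", "F"],
    ["B", "A", "C", "D", "E"],
]
print(round(mean_pairwise_jaccard(runs), 4))  # 0.7778
```

Because Jaccard similarity ignores order, runs 1 and 3 count as identical even though their rankings differ, so a high score says the same journals recur, not that they recur in the same positions.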
Where Pith is reading between the lines
- The same semantic-matching approach could be applied to conference or grant recommendation by substituting the corresponding scope descriptions.
- Because the method needs no training data, it could lower the barrier to building recommendation tools for smaller or emerging research fields.
- Pairing the LLM judgments with lightweight post-processing rules derived from citation statistics might raise precision without reintroducing supervised training.
Load-bearing premise
An off-the-shelf large language model can reliably judge whether a manuscript fits a journal from semantic content alone, without domain-specific fine-tuning, historical patterns, or explicit scope rules.
What would settle it
On a new collection of manuscripts whose true journal assignments are known, if the framework's top-10 accuracy falls below 50 percent while human editors achieve substantially higher agreement with the ground-truth journals, the claim of reliable semantic judgment would be challenged.
Original abstract
Journal recommendation is an important task in scholarly information systems. Existing approaches typically rely on supervised learning models, manually engineered features, or historical interaction data, which may limit their generalizability and interpretability. We propose an LLM-powered semantic alignment framework that formulates journal recommendation as a semantic matching problem between manuscript content and journal scope descriptions. The framework enables large language models (LLMs) to infer journal suitability directly from article titles, abstracts, keywords, and candidate journal information without task-specific training. Experiments are conducted using DeepSeek-V3 on a dataset of 23,609 articles from 49 journals in statistics and related fields. The proposed framework achieves Top-3, Top-5, and Top-10 accuracies of 40.23%, 53.67%, and 70.05%, respectively. Additional analyses show that incorporating reference information generally improves recommendation performance and that recommendations remain highly stable across repeated runs, with an average Top-5 Jaccard similarity of 84%. The framework also generates interpretable reasoning outputs that provide insights into the recommendation process. These findings demonstrate the potential of LLMs as a training-free and scalable paradigm for journal recommendation and scholarly decision support.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes an LLM-powered semantic alignment framework for journal recommendation that formulates the task as direct semantic matching between manuscript content (titles, abstracts, keywords) and journal scope descriptions. Using DeepSeek-V3 without task-specific training on a dataset of 23,609 articles from 49 statistics-related journals, it reports Top-3/Top-5/Top-10 accuracies of 40.23%/53.67%/70.05%, with additional gains from reference information, high run-to-run stability (average Top-5 Jaccard similarity 84%), and interpretable reasoning outputs.
Significance. If the empirical results can be reproduced and shown to be free of prompt artifacts or implicit leakage, the work would demonstrate a viable training-free paradigm for journal recommendation that improves interpretability and generalizability over supervised models reliant on historical interaction data.
Major comments (2)
- [Methods/Experimental Setup] The manuscript provides no prompt templates, no description of how 'candidate journal information' or journal scope descriptions are sourced and encoded, and no procedure for converting LLM outputs into ranked lists. Without these, the central claim that DeepSeek-V3 achieves the stated accuracies via semantic matching alone cannot be verified or distinguished from post-processing rules or few-shot effects.
- [Abstract and §4 (Experiments)] The reported Top-3/5/10 accuracies lack any baseline comparisons, error analysis, statistical significance tests, or details on how the 23,609 articles and 49 journals were selected and labeled. This leaves open the possibility that the numbers reflect dataset artifacts rather than the framework's contribution.
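To illustrate what a documented output-to-ranking procedure could look like, here is a hypothetical sketch that turns a numbered free-text reply into an ordered list restricted to the candidate pool. The reply format, function name, and matching rule are assumptions; the paper does not describe its own parsing step.

```python
import re
from typing import List, Sequence


def parse_ranked_journals(reply: str, candidates: Sequence[str],
                          top_n: int = 10) -> List[str]:
    """Extract candidate journals from numbered lines, in order, without duplicates."""
    by_lower = {name.lower(): name for name in candidates}
    ranked: List[str] = []
    for line in reply.splitlines():
        match = re.match(r"\s*(\d+)[.)]\s*(.+)", line)
        if not match:
            continue  # skip prose lines that are not numbered items
        text = match.group(2).strip().lower()
        # Keep only names from the candidate pool; prefer the longest match.
        hits = [name for key, name in by_lower.items() if text.startswith(key)]
        if not hits:
            continue  # journal outside the pool (or misspelled) is dropped
        best = max(hits, key=len)
        if best not in ranked:
            ranked.append(best)
    return ranked[:top_n]


# Hypothetical model reply for illustration.
reply = """Here are my picks:
1. Biometrika - strong theory fit
2) Annals of Statistics: asymptotics
3. Journal of Made-Up Results
4. Biometrika (again)
"""
pool = ["Annals of Statistics", "Biometrika", "Biometrics"]
print(parse_ranked_journals(reply, pool))
# ['Biometrika', 'Annals of Statistics']
```

Rules such as dropping out-of-pool names or duplicates change the effective ranking, which is why the referee's request to spell them out bears directly on the reported accuracies.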
Minor comments (2)
- [Abstract] The term 'candidate journal information' is used without a precise definition of its content or how it differs from the manuscript input.
- [Results] Clarify whether the stability analysis (Jaccard similarity) was performed on the same set of candidate journals or across varying candidate pools.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive feedback. We address each major comment below and will revise the manuscript to improve reproducibility, add missing details, and strengthen the experimental section.
- Referee: [Methods/Experimental Setup] The manuscript provides no prompt templates, no description of how 'candidate journal information' or journal scope descriptions are sourced and encoded, and no procedure for converting LLM outputs into ranked lists. Without these, the central claim that DeepSeek-V3 achieves the stated accuracies via semantic matching alone cannot be verified or distinguished from post-processing rules or few-shot effects.
Authors: We agree that these implementation details are necessary for verification. In the revised manuscript we will add a new subsection (likely §3.2) that includes the complete prompt templates used with DeepSeek-V3, describes the sourcing of journal scope descriptions directly from each journal's official 'Aims & Scope' page, explains the text encoding approach, and specifies the deterministic procedure for parsing the LLM's free-text output into an ordered list of recommended journals. This will make explicit that no additional post-processing rules or few-shot examples beyond the base prompt were applied.
Revision: yes
- Referee: [Abstract and Experiments] The reported Top-3/5/10 accuracies lack any baseline comparisons, error analysis, statistical significance tests, or details on how the 23,609 articles and 49 journals were selected and labeled. This leaves open the possibility that the numbers reflect dataset artifacts rather than the framework's contribution.
Authors: We accept that the experimental section is incomplete without these elements. We will add (i) baseline comparisons against TF-IDF cosine similarity and BM25 on the same article-journal text pairs, (ii) an error analysis of mis-ranked cases, (iii) statistical significance testing (McNemar's test) against the baselines, and (iv) expanded dataset description: the 49 journals were chosen as the most prominent statistics and related-field outlets according to Web of Science subject categories and impact factors; the 23,609 articles comprise a random sample of papers published in those journals from 2018-2023, with ground-truth labels taken directly from the publishing journal. Potential selection biases will be discussed as a limitation.
Revision: yes
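The simulated rebuttal names McNemar's test for comparing the LLM against baselines on the same manuscripts. A minimal sketch of the exact (binomial) form of that test follows, using only the standard library; the choice of the exact variant and the toy hit vectors are assumptions, not details from the paper.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_counts(hits_a: Sequence[bool],
                      hits_b: Sequence[bool]) -> Tuple[int, int]:
    """Count manuscripts where exactly one of the two systems scored a Top-k hit."""
    b = sum(1 for x, y in zip(hits_a, hits_b) if x and not y)  # A hit, B miss
    c = sum(1 for x, y in zip(hits_a, hits_b) if y and not x)  # B hit, A miss
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # the systems never disagree
    k = min(b, c)
    # Under the null, each discordant pair favours either system with p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical per-manuscript Top-5 hits for an LLM and a TF-IDF baseline.
llm_hits = [True] * 8 + [False] * 2
tfidf_hits = [False] * 7 + [True] * 3
b, c = discordant_counts(llm_hits, tfidf_hits)
print(b, c, mcnemar_exact(b, c))
```

Only the manuscripts on which the two systems disagree enter the test, so it is well suited to paired Top-k outcomes on a shared evaluation set.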
Circularity Check
No circularity; empirical measurement on fixed dataset with no derivations or fitted predictions
The paper formulates journal recommendation as semantic matching and reports direct empirical accuracies (Top-3/5/10) from running DeepSeek-V3 on 23,609 articles. No equations, parameters, or derivation chain exist. No self-citations are invoked to justify uniqueness or force the result. The accuracies are presented as measured outcomes, not predictions that reduce to inputs by construction. This matches the default expectation of no significant circularity (score 0-2).