arXiv: 2603.17361 · v2 · submitted 2026-03-18 · 💻 cs.IR · cs.AI · cs.CL · cs.SI


Public Profile Matters: A Scalable Integrated Approach to Recommend Citations in the Wild

Karan Goyal , Dikshant Kukreja , Vikram Goyal , Mukesh Mohania


Pith reviewed 2026-05-15 09:14 UTC · model grok-4.3

classification 💻 cs.IR · cs.AI · cs.CL · cs.SI

keywords: citation recommendation · information retrieval · human citation patterns · inductive evaluation · reranking model · vector gating · scalable systems


The pith

A lightweight non-learnable profiler captures human citation patterns to improve recommendations for new papers under realistic temporal constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Profiler, a simple non-learnable module that encodes typical human choices when citing literature without training parameters or adding systematic bias. This module supplies confidence priors to DAVINCI, a reranker that merges profile information with semantic similarity through an adaptive vector-gating step. The authors demonstrate that the combination reaches new state-of-the-art results on standard citation-recommendation benchmarks while remaining computationally light. They further replace conventional transductive testing with an inductive protocol that splits data strictly by publication date to simulate recommendations for papers that have not yet appeared.

Core claim

By integrating a lightweight Profiler module that encodes human citation patterns into a reranking model DAVINCI using adaptive vector-gating, citation recommendation systems can achieve state-of-the-art results on benchmarks while being more efficient and generalizable, especially when evaluated in an inductive setting with strict temporal constraints that simulate real-world conditions for new papers.

What carries the argument

Profiler, a lightweight non-learnable module that captures human citation patterns, combined with DAVINCI's adaptive vector-gating mechanism for integrating profile priors with semantic information.
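This summary does not reproduce DAVINCI's equations, so the following is only a minimal sketch of what an adaptive vector gate generally looks like: a per-dimension sigmoid gate decides how much of the semantic vector and how much of the profile-prior vector passes through. The function name `vector_gate`, the weight layout, and the toy values are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def vector_gate(semantic: Vector, prior: Vector,
                weights: List[Vector], bias: Vector) -> Vector:
    """Fuse two equal-length vectors with a per-dimension gate.

    gate  = sigmoid(W @ [semantic; prior] + b)
    fused = gate * semantic + (1 - gate) * prior
    Hypothetical formulation; the paper's actual gating may differ.
    """
    if len(semantic) != len(prior):
        raise ValueError("semantic and prior must have the same length")
    concat = semantic + prior  # list concatenation: [semantic; prior]
    fused = []
    for i, (s, p) in enumerate(zip(semantic, prior)):
        pre_activation = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        g = sigmoid(pre_activation)
        fused.append(g * s + (1.0 - g) * p)
    return fused


# Toy example: zero weights and zero bias give gate = 0.5 in every
# dimension, so the output is the plain average of the two inputs.
sem = [0.8, 0.2]
pri = [0.4, 0.6]
W = [[0.0] * 4 for _ in range(2)]
b = [0.0, 0.0]
fused = vector_gate(sem, pri, W, b)
```

Because the gate is computed from the inputs rather than fixed in advance, the balance between the two signals can differ per candidate and per dimension, which is the property the summary attributes to the mechanism.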

If this is right

Citation recommendations become more accurate and less biased across multiple benchmark datasets.
The system scales better for large collections because the profiler adds negligible computational cost.
Performance measured under inductive temporal constraints more closely reflects usefulness for newly authored papers.
The vector-gating integration allows semantic and profile signals to be balanced without fixed weights.
Generalisability improves because the profiler avoids dataset-specific learned biases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same lightweight profiling approach could transfer to other recommendation settings where historical user choices follow repeatable patterns.
Researchers might test whether the profiler's priors remain useful when the underlying document collection grows by orders of magnitude.
The strict temporal evaluation protocol could serve as a template for assessing other scientific recommendation tasks that must respect publication chronology.
Future systems could explore replacing the fixed profiler with a periodically refreshed non-learnable snapshot to track slow shifts in citation norms.

Load-bearing premise

That the Profiler accurately and without bias captures the patterns humans use when choosing citations, and that enforcing temporal splits in evaluation adequately simulates recommending citations for brand-new papers.

What would settle it

Observing whether the claimed performance gains hold when the inductive evaluation is replaced by a standard transductive split or when the profiler is removed from the system.

Figures

Figures reproduced from arXiv: 2603.17361 by Dikshant Kukreja, Karan Goyal, Mukesh Mohania, Vikram Goyal.

**Figure 1.** Navigating the performance landscape of the public profile enrichment on the FullTextPeerRead and ACL-200 validation sets. Each plot shows a …

**Figure 2.** The architecture of our two-stage citation recommendation system. (1) The non-learnable …

**Figure 3.** The performance variation for varied query composition when the …

**Figure 4.** Python functions for constructing the query and …

Original abstract

Proper citation of relevant literature is essential for contextualising and validating scientific contributions. While current citation recommendation systems leverage local and global textual information, they often overlook the nuances of the human citation behaviour. Recent methods that incorporate such patterns improve performance but incur high computational costs and introduce systematic biases into downstream rerankers. To address this, we propose Profiler, a lightweight, non-learnable module that captures human citation patterns efficiently and without bias, significantly enhancing candidate retrieval. Furthermore, we identify a critical limitation in current evaluation protocol: the systems are assessed in a transductive setting, which fails to reflect real-world scenarios. We introduce a rigorous Inductive evaluation setting that enforces strict temporal constraints, simulating the recommendation of citations for newly authored papers in the wild. Finally, we present DAVINCI, a novel reranking model that integrates profiler-derived confidence priors with semantic information via an adaptive vector-gating mechanism. Our system achieves new state-of-the-art results across multiple benchmark datasets, demonstrating superior efficiency and generalisability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The inductive temporal split and non-learnable Profiler are the useful pieces; the SOTA claim is still unproven without the numbers and a clear check on graph masking.


The paper's real contribution is the move to an inductive evaluation protocol that enforces strict temporal cuts for new papers, paired with a lightweight non-learnable Profiler that pulls citation patterns without training costs or obvious bias injection. That setup directly targets the transductive leakage common in prior citation recommenders and offers a practical efficiency win for systems that need to handle fresh documents in the wild. DAVINCI then folds the profiler priors into semantic reranking via adaptive vector gating, which looks like a clean way to combine the signals without heavy machinery. The abstract frames these as fixes for compute and bias problems in existing work, and the evaluation shift is a clear step toward more realistic testing. The stress-test concern about whether the citation graph itself is masked at the same temporal boundary is worth a close look in the methods section; if post-cut edges remain available during candidate retrieval, the reported gains could partly reflect an easier regime rather than the modules alone. The abstract does not spell out the masking details, so that point needs verification before the generalisability claim can be taken at face value. Overall the work is aimed at IR groups building citation tools or academic search engines. A reader focused on deployment constraints would find the protocol change and the profiler design worth examining. It has enough new pieces and a grounded motivation to merit peer review, even if the results tables will decide how much revision is needed.

Referee Report

2 major / 1 minor

Summary. The paper proposes Profiler, a lightweight non-learnable module that captures human citation patterns for efficient, bias-free candidate retrieval in citation recommendation systems. It critiques existing transductive evaluation protocols and introduces a new inductive setting enforcing strict temporal constraints to simulate recommendations for newly authored papers. It further presents DAVINCI, a reranking model that integrates profiler-derived confidence priors with semantic information through an adaptive vector-gating mechanism. The system is claimed to achieve new state-of-the-art results across multiple benchmark datasets while demonstrating superior efficiency and generalisability.

Significance. If the central claims hold under a properly masked inductive protocol, the work would advance citation recommendation by providing a scalable, parameter-free way to inject human citation behavior without the computational overhead or bias risks of learned modules, alongside a more realistic evaluation framework. The vector-gating integration in DAVINCI offers a potentially general mechanism for combining heterogeneous signals.

major comments (2)

[Inductive evaluation setting] Abstract and evaluation protocol section: the inductive setting with 'strict temporal constraints' is load-bearing for the SOTA and generalisability claims, yet it is not specified whether the citation graph itself is masked at the same temporal cut used to partition papers. If post-cut edges remain available during candidate retrieval, Profiler priors and the reranker can exploit future information unavailable in a genuine 'in the wild' scenario for new papers; this must be clarified with a precise description of graph construction and candidate generation under the inductive split.
[Profiler module] Profiler module description: the assertion that the module operates 'without bias' is central to the efficiency and downstream reranker advantages, but no analysis is provided showing that the captured patterns do not encode systematic distributional biases from the source citation data. Evidence (e.g., bias metrics or ablation on downstream fairness) is required to support that the non-learnable design truly eliminates bias rather than merely avoiding learned parameters.
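The masking question in the first major comment can be made concrete with a short sketch. It assumes a hypothetical representation in which every paper has a publication date and every citation is a `(citing, cited)` pair; the function name `inductive_split` and the toy data are illustrative and are not taken from the paper. The point is that the graph is cut at the same boundary as the papers, so no edge originating from a post-cutoff paper is visible at retrieval time.

```python
from datetime import date
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (citing_paper_id, cited_paper_id)


def inductive_split(pub_dates: Dict[str, date], edges: List[Edge],
                    cutoff: date) -> Tuple[Set[str], Set[str], List[Edge], List[Edge]]:
    """Split papers at `cutoff` and mask the graph at the same boundary.

    Returns (train_ids, test_ids, visible_edges, held_out_edges).
    Visible edges connect two pre-cutoff papers. Held-out edges go from a
    post-cutoff paper to a pre-cutoff paper and serve as evaluation targets.
    Any other edge (for example between two post-cutoff papers) is dropped.
    """
    train_ids = {p for p, d in pub_dates.items() if d < cutoff}
    test_ids = set(pub_dates) - train_ids
    visible = [(u, v) for u, v in edges if u in train_ids and v in train_ids]
    held_out = [(u, v) for u, v in edges if u in test_ids and v in train_ids]
    return train_ids, test_ids, visible, held_out


# Toy corpus: A and B predate the cutoff, C is a "new" paper.
pub_dates = {"A": date(2020, 1, 1), "B": date(2021, 6, 1), "C": date(2023, 2, 1)}
edges = [("B", "A"), ("C", "A"), ("C", "B")]
train_ids, test_ids, visible, held_out = inductive_split(
    pub_dates, edges, cutoff=date(2022, 1, 1)
)
```

Under this scheme any prior or graph feature would be computed from `visible` only, while `held_out` supplies the ground truth for the new papers. Whether the paper's protocol matches this is exactly what the referee asks the authors to document.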

minor comments (1)

The abstract lacks any quantitative results, error bars, or dataset names despite asserting SOTA performance; moving at least the headline numbers (e.g., recall@10 deltas) into the abstract would strengthen the summary.
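For readers unfamiliar with the metric named in the minor comment, recall@k is the fraction of a query's true citations that appear among the top k recommendations. The sketch below is a generic definition with made-up paper identifiers; it is not the paper's evaluation code, and the choice to return 0.0 for a query with no relevant items is an assumption.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of relevant items found in the first k ranked results."""
    if not relevant:
        return 0.0  # assumed convention for queries without ground truth
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


# Hypothetical ranking for one query with two true citations.
ranked = ["p3", "p7", "p1", "p9"]
relevant = {"p1", "p2"}
score = recall_at_k(ranked, relevant, k=3)
```

In this toy case one of the two true citations is in the top three, so `score` is 0.5. A "recall@10 delta" would be the difference in this value, averaged over queries, between two systems.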

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of the inductive evaluation protocol and the bias claims for Profiler. We address each major comment below with clarifications and commit to revisions where needed to strengthen the paper.


Referee: Abstract and evaluation protocol section: the inductive setting with 'strict temporal constraints' is load-bearing for the SOTA and generalisability claims, yet it is not specified whether the citation graph itself is masked at the same temporal cut used to partition papers. If post-cut edges remain available during candidate retrieval, Profiler priors and the reranker can exploit future information unavailable in a genuine 'in the wild' scenario for new papers; this must be clarified with a precise description of graph construction and candidate generation under the inductive split.

Authors: We appreciate this observation on the need for explicit specification. Our inductive protocol constructs the citation graph using only pre-cut edges for all candidate retrieval and prior computation, with no post-cut edges accessible during inference for new papers. This enforces the temporal constraints to simulate real-world 'in the wild' recommendations. We will expand the evaluation protocol section with a precise step-by-step description of graph construction, temporal partitioning, and candidate generation to eliminate any ambiguity.
Revision: yes
Referee: Profiler module description: the assertion that the module operates 'without bias' is central to the efficiency and downstream reranker advantages, but no analysis is provided showing that the captured patterns do not encode systematic distributional biases from the source citation data. Evidence (e.g., bias metrics or ablation on downstream fairness) is required to support that the non-learnable design truly eliminates bias rather than merely avoiding learned parameters.

Authors: The non-learnable design of Profiler avoids introducing new biases via optimization but can still propagate distributional patterns from the source data. We will add a dedicated analysis subsection, including quantitative bias metrics (e.g., disparity in citation frequency across subfields) and ablations measuring impact on downstream fairness metrics such as equalized odds in reranking. This will provide the requested evidence and refine our claims accordingly.
Revision: yes
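One simple way to quantify the "disparity in citation frequency across subfields" mentioned in the second response is an exposure ratio: each subfield's share of the recommendations divided by its share of the candidate pool. This is only one possible metric, chosen here for illustration; the function name `exposure_ratio`, the subfield labels, and the data are hypothetical, and the rebuttal does not say which metric the authors would use.

```python
from collections import Counter
from typing import Dict, List


def exposure_ratio(recommended: List[str], corpus: List[str],
                   subfield_of: Dict[str, str]) -> Dict[str, float]:
    """Share of recommendations per subfield relative to its pool share.

    A ratio of 1.0 means a subfield is recommended in proportion to its
    size in the corpus; values above or below 1.0 indicate over- or
    under-exposure.
    """
    if not recommended or not corpus:
        raise ValueError("recommended and corpus must be non-empty")
    rec_counts = Counter(subfield_of[p] for p in recommended)
    pool_counts = Counter(subfield_of[p] for p in corpus)
    ratios = {}
    for field, pool_n in pool_counts.items():
        rec_share = rec_counts.get(field, 0) / len(recommended)
        pool_share = pool_n / len(corpus)
        ratios[field] = rec_share / pool_share
    return ratios


# Toy data: a balanced pool, but recommendations skew towards "IR".
subfield_of = {"a": "IR", "b": "IR", "c": "NLP", "d": "NLP"}
corpus = ["a", "b", "c", "d"]
recommended = ["a", "b", "c", "a"]  # pooled over several queries
ratios = exposure_ratio(recommended, corpus, subfield_of)
```

Here "IR" papers receive 75% of the recommendations from 50% of the pool (ratio 1.5), while "NLP" papers receive 25% (ratio 0.5). Comparing such ratios with and without the Profiler would be one way to check whether the module amplifies an existing skew.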

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained


The paper introduces Profiler as a non-learnable module, a new inductive evaluation protocol with temporal constraints, and DAVINCI reranker with vector-gating. No load-bearing step reduces by construction to fitted inputs, self-definitions, or self-citation chains. Claims of SOTA rest on external benchmarks and the proposed modules without the specific reductions enumerated in the circularity patterns. The evaluation setting is presented as an independent methodological contribution rather than a renaming or smuggling of prior results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no details on parameters, axioms, or entities; assessment limited to high-level claims.

pith-pipeline@v0.9.0 · 5490 in / 1031 out tokens · 39312 ms · 2026-05-15T09:14:50.528407+00:00 · methodology

