Canonical reference

Title resolution pending

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. · 2020

Canonical reference. 100% of citing Pith papers cite this work as background.

15 Pith papers citing it

Background 100% of classified citations

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 8

citation-polarity summary

background 8

representative citing papers

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers

cs.SE · 2025-06-16 · conditional · novelty 8.0

First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.

From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems

cs.MA · 2025-06-05 · accept · novelty 7.0

A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.

C-Pack: Packed Resources For General Chinese Embeddings

cs.CL · 2023-09-14 · accept · novelty 7.0

C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.

RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models

cs.IR · 2025-02-02 · unverdicted · novelty 6.0

RankFlow deploys four LLM roles in sequence to rewrite queries, generate pseudo-answers, summarize passages, and rerank candidates, outperforming prior methods on TREC-DL, BEIR, and NovelEval.

Subgraph-level Universal Prompt Tuning

cs.LG · 2024-02-16 · unverdicted · novelty 6.0

SUPT assigns prompt features at the subgraph level to enable universal prompt tuning for any GNN pre-training strategy and outperforms fine-tuning in 42 of 45 full-shot and 41 of 45 few-shot graph experiments with average gains of 2.5% and 6.6%.

Crystallizing Schemas with Teleoscope: Thematic Curation of Large Text Corpora on Reddit

cs.HC · 2024-02-09 · unverdicted · novelty 6.0

Teleoscope enables thematic curation of large Reddit corpora via interactive refinement, with three deployments indicating benefits in serendipitous keyword discovery, search saturation confidence, and collaborative curation discussions.

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

cs.DC · 2023-04-21 · unverdicted · novelty 6.0

PyTorch Fully Sharded Data Parallel enables training of significantly larger models than Distributed Data Parallel with comparable speed and near-linear TFLOPS scaling.

Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings

cs.CY · 2026-04-06 · unverdicted · novelty 5.0

Expert interviews demonstrate that context in generative AI workplace use collapses or rots over time, limiting tool effectiveness and revealing pitfalls in computational context approaches.

TREASURE: The Visa Payment Foundation Model for High-Volume Transaction Understanding

cs.LG · 2025-11-24 · unverdicted · novelty 5.0

TREASURE is a transformer model for payment transactions that boosts abnormal behavior detection performance by 111% over production systems and improves recommendation models by 104% when used as an embedding provider.

Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation

cs.AI · 2025-09-08 · conditional · novelty 5.0

Introduces PAS and FAS task abstractions plus the LLM-S^3 benchmark to evaluate LLMs on generating sociodemographic survey responses across 11 real datasets and multiple models.

CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment

cs.CR · 2024-10-16 · unverdicted · novelty 5.0

CoreGuard introduces a computation- and communication-efficient protocol claimed to deliver upper-bound security against model stealing for edge-deployed LLMs with negligible overhead.

Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models

cs.SE · 2023-10-17 · unverdicted · novelty 5.0

Bigger LLMs (bLLMs) achieve state-of-the-art results on limited and imbalanced SE sentiment datasets even in zero-shot settings, but fine-tuned smaller LLMs (sLLMs) outperform them when ample balanced training data is available.

AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions

cs.AI · 2024-08-23 · unverdicted · novelty 4.0

The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · novelty 3.0

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

Data-Centric Foundation Models in Computational Healthcare: A Survey

cs.LG · 2024-01-04 · unverdicted · novelty 3.0

The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.

citing papers explorer

Showing 15 of 15 citing papers.

Model Context Protocol (MCP) at First Glance: Studying the Security and Maintainability of MCP Servers cs.SE · 2025-06-16 · conditional · none · ref 26
First study of 1,899 MCP servers finds eight distinct vulnerabilities (only three traditional), 7.2% with general issues, 5.5% with tool poisoning, and 66% with code smells, urging MCP-specific security practices.
From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems cs.MA · 2025-06-05 · accept · none · ref 14
A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.
C-Pack: Packed Resources For General Chinese Embeddings cs.CL · 2023-09-14 · accept · none · ref 14
C-Pack releases a new Chinese embedding benchmark, large training dataset, and optimized models that outperform priors by up to 10% on C-MTEB while also delivering English SOTA results.
RankFlow: A Multi-Role Collaborative Reranking Workflow Utilizing Large Language Models cs.IR · 2025-02-02 · unverdicted · none · ref 11
RankFlow deploys four LLM roles in sequence to rewrite queries, generate pseudo-answers, summarize passages, and rerank candidates, outperforming prior methods on TREC-DL, BEIR, and NovelEval.
Subgraph-level Universal Prompt Tuning cs.LG · 2024-02-16 · unverdicted · none · ref 3
SUPT assigns prompt features at the subgraph level to enable universal prompt tuning for any GNN pre-training strategy and outperforms fine-tuning in 42 of 45 full-shot and 41 of 45 few-shot graph experiments with average gains of 2.5% and 6.6%.
Crystallizing Schemas with Teleoscope: Thematic Curation of Large Text Corpora on Reddit cs.HC · 2024-02-09 · unverdicted · none · ref 12
Teleoscope enables thematic curation of large Reddit corpora via interactive refinement, with three deployments indicating benefits in serendipitous keyword discovery, search saturation confidence, and collaborative curation discussions.
PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel cs.DC · 2023-04-21 · unverdicted · none · ref 3
PyTorch Fully Sharded Data Parallel enables training of significantly larger models than Distributed Data Parallel with comparable speed and near-linear TFLOPS scaling.
Context Collapse: Barriers to Adoption for Generative AI in Workplace Settings cs.CY · 2026-04-06 · unverdicted · none · ref 27
Expert interviews demonstrate that context in generative AI workplace use collapses or rots over time, limiting tool effectiveness and revealing pitfalls in computational context approaches.
TREASURE: The Visa Payment Foundation Model for High-Volume Transaction Understanding cs.LG · 2025-11-24 · unverdicted · none · ref 4
TREASURE is a transformer model for payment transactions that boosts abnormal behavior detection performance by 111% over production systems and improves recommendation models by 104% when used as an embedding provider.
Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation cs.AI · 2025-09-08 · conditional · none · ref 4
Introduces PAS and FAS task abstractions plus the LLM-S^3 benchmark to evaluate LLMs on generating sociodemographic survey responses across 11 real datasets and multiple models.
CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment cs.CR · 2024-10-16 · unverdicted · none · ref 4
CoreGuard introduces a computation- and communication-efficient protocol claimed to deliver upper-bound security against model stealing for edge-deployed LLMs with negligible overhead.
Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models cs.SE · 2023-10-17 · unverdicted · none · ref 6
Bigger LLMs (bLLMs) achieve state-of-the-art results on limited and imbalanced SE sentiment datasets even in zero-shot settings, but fine-tuned smaller LLMs (sLLMs) outperform them when ample balanced training data is available.
AI Safety Landscape for Large Language Models: Taxonomy, State-of-the-art, and Future Directions cs.AI · 2024-08-23 · unverdicted · none · ref 81
The paper introduces a taxonomy of AI safety for LLMs organized into Trustworthy AI, Responsible AI, and Safe AI perspectives, accompanied by a review of state-of-the-art methods, challenges, and future directions.
A Survey of Scaling in Large Language Model Reasoning cs.AI · 2025-04-02 · unverdicted · none · ref 12
A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.
Data-Centric Foundation Models in Computational Healthcare: A Survey cs.LG · 2024-01-04 · unverdicted · none · ref 28
The paper surveys data-centric strategies for foundation models in computational healthcare and supplies a curated list of related models and datasets.
