Title resolution pending

Liu, Yinhan; Ott, Myle; Goyal, Naman; Du, Jingfei; Joshi, Mandar; Chen, Danqi

17 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

100,000+ Movie Reviews from Kazakhstan: Russian, Kazakh, and Code-Switched Texts

cs.CL · 2026-05-09 · accept · novelty 7.0 · 2 refs

A new corpus of 100,502 annotated movie reviews from Kazakhstan enables sentiment analysis research in Russian, Kazakh, and code-switched texts.

Rethinking the Rank Threshold for LoRA Fine-Tuning

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

For binary classification in the NTK regime, LoRA rank r=1 suffices and is often optimal under cross-entropy loss, reducing the prior sufficient condition from r>=12.

FIBER: A Differentially Private Optimizer with Filter-Aware Innovation Bias Correction

cs.LG · 2026-05-05 · unverdicted · novelty 7.0

FiBeR adds a closed-form filter-aware correction A(ω)σ_w² to the second-moment term for temporally filtered DP gradients, improving adaptive optimization performance.

Adaptive Instruction Composition for Automated LLM Red-Teaming

cs.CR · 2026-04-22 · unverdicted · novelty 7.0

Adaptive Instruction Composition uses a neural contextual bandit with RL to adaptively combine crowdsourced texts, generating more effective and diverse LLM jailbreaks than random or prior adaptive methods on HarmBench.

Learning Posterior Predictive Distributions for Node Classification from Synthetic Graph Priors

cs.LG · 2026-04-21 · unverdicted · novelty 7.0

NodePFN pre-trains on synthetic graphs with controllable homophily and causal feature-label models to achieve 71.27 average accuracy on 23 node classification benchmarks without graph-specific training.

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

cs.CL · 2024-05-07 · unverdicted · novelty 7.0

DeepSeek-V2 delivers top-tier open-source LLM performance using only 21B active parameters, compressing the KV cache by 93.3% and cutting training costs by 42.5% via MLA and DeepSeekMoE.

Is Child-Directed Language Optimized for Word Learning? A Computational Study of Verb Meaning Acquisition

cs.CL · 2026-05-12 · unverdicted · novelty 6.0

Computational experiments show verb learning benefits in child-directed language likely stem from spoken register properties rather than unique optimization for children.

Lost in State Space: Probing Frozen Mamba Representations

cs.CL · 2026-04-30 · unverdicted · novelty 6.0

Frozen Mamba patch-boundary readouts do not outperform mean pooling for sentence representations on SST-2, CoLA, MRPC, STS-B, and IMDb due to anisotropy (cosine similarity ~0.9999) and representational collapse (MCC=0 on CoLA).

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

cs.AI · 2025-07-01 · conditional · novelty 6.0

Math reasoning gains in LLMs rarely transfer to general domains; RL tuning generalizes while SFT causes forgetting and representation drift.

Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

cs.AI · 2024-08-01 · conditional · novelty 6.0

Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.

How Much Knowledge Can You Pack Into the Parameters of a Language Model?

cs.CL · 2020-02-10 · accept · novelty 6.0

Fine-tuned language models store knowledge in parameters to answer questions competitively with retrieval-based open-domain QA systems.

Exploring the Effectiveness of Using LLMs for Automated Assessment of Student Self Explanations in Programming Education

cs.HC · 2026-05-20 · unverdicted · novelty 5.0

The study compares LLMs against semantic similarity for binary classification of student self-explanations in programming education.

Modeling Human Perspectives with Socio-Demographic Representations

cs.CL · 2026-04-20 · unverdicted · novelty 5.0

Socio-Contrastive Learning jointly learns socio-demographic representations and textual features via contrastive objectives to predict annotator perspectives more accurately than concatenation baselines.

StarCoder: may the source be with you!

cs.CL · 2023-05-09 · accept · novelty 5.0

StarCoderBase matches or beats OpenAI's code-cushman-001 on multi-language code benchmarks; the Python-fine-tuned StarCoder reaches 40% pass@1 on HumanEval while retaining other-language performance.

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

cs.CL · 2024-01-05 · unverdicted · novelty 4.0

DeepSeek LLM 67B exceeds LLaMA-2 70B on code, mathematics and reasoning benchmarks after pre-training on 2 trillion tokens and alignment via SFT and DPO.

Multilingual and Multimodal LLMs in the Wild: Building for Low-Resource Languages

cs.CL · 2026-05-16 · unverdicted · novelty 2.0

A tutorial synthesizing foundations, recent models such as PALO and Maya, and low-cost methods for tri-modal multilingual AI in resource-constrained settings.

PSK@EEUCA 2026: Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat

cs.CL · 2026-05-08 · unverdicted · novelty 2.0

Llama 3.1 8B fine-tuned with calibrated 5% synthetic data augmentation reaches 0.6234 F1-macro on multi-class toxicity detection in gaming chat and places fourth among 35 teams.
