Language models are unsupervised multitask learners · 2019

18 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 4

citation-polarity summary

background 4

representative citing papers

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.

Generative Quantum-inspired Kolmogorov-Arnold Eigensolver

quant-ph · 2026-05-06 · unverdicted · novelty 7.0

GQKAE uses quantum-inspired Kolmogorov-Arnold networks to reduce parameters by 66% in generative quantum eigensolvers while achieving chemical accuracy on H4, N2, LiH, and other molecules.

Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

cs.LG · 2026-04-23 · unverdicted · novelty 7.0

ProjRes achieves near-100% accuracy in membership inference on FedLLMs by measuring projection residuals of hidden embeddings on gradient subspaces, outperforming prior methods by up to 75.75% even under differential privacy.

Structural Anchors and Reasoning Fragility: Understanding CoT Robustness in LLM4Code

cs.SE · 2026-04-14 · unverdicted · novelty 7.0

CoT prompting in LLM4Code shows mixed robustness that depends on model family, task structure, and perturbations destabilizing structural anchors, leading to trajectory deformations like lengthening, branching, and simplification.

MetaKE: Meta-Learning for Knowledge Editing Toward a Better Accuracy-Editability Trade-off

cs.CL · 2026-03-13 · unverdicted · novelty 7.0

MetaKE unifies knowledge editing stages via bi-level optimization and a structural gradient proxy to improve the accuracy-editability trade-off over prior methods.

Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers

cs.CL · 2026-05-16 · unverdicted · novelty 6.0

Diffusion LLMs can act as their own efficiency teachers by using revokable parallel decoding to identify reliable token orders and then distilling those orders into the model parameters for faster inference.

PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems

cs.CR · 2026-05-15 · unverdicted · novelty 6.0

PrivScope enforces task-scoped disclosure at the local-cloud boundary in hybrid agents, eliminating profile leakage and halving re-identification risk on medical workflows while preserving task success.

One Prompt, Many Sounds: Modeling Listener Variability in LLM-Based Equalization

cs.SD · 2026-01-14 · unverdicted · novelty 6.0

LLMs using in-context learning and fine-tuning on listener experiment data generate equalization settings that align better with population preferences than random sampling or static presets.

LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning

cs.LG · 2025-05-22 · conditional · novelty 6.0

LLaDA-V is a diffusion-based multimodal large language model that reaches competitive or state-of-the-art results on visual instruction tasks while using a non-autoregressive architecture.

A Survey on Vision-Language-Action Models for Embodied AI

cs.RO · 2024-05-23 · unverdicted · novelty 6.0

This is the first survey on vision-language-action models, providing a taxonomy across three lines, plus summaries of datasets, simulators, benchmarks, challenges, and future directions in embodied AI.

Hidden Human-Like Nature of Machine-Generated Texts: Theory and Detection Enhancement

cs.CL · 2026-05-22 · unverdicted · novelty 5.0

The paper reveals hidden human-like spans in machine-generated texts that raise detection complexity and proposes a stacked enhancement framework with hard-EM optimization to improve detectors across LLMs.

Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection

cs.CL · 2026-05-15 · unverdicted · novelty 5.0

A multi-level framework that models local and global relations among token detection scores to improve machine-generated text detection with low overhead.

Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions

cs.AI · 2026-05-04 · unverdicted · novelty 5.0

BerLU constructs a C1-differentiable activation with Lipschitz constant 1 via Bernstein polynomial approximation, showing better performance and efficiency than baselines on image classification with ViTs and CNNs.

Cross-Lingual Attention Distillation with Personality-Informed Generative Augmentation for Multilingual Personality Recognition

cs.CL · 2026-04-10 · unverdicted · novelty 5.0

ADAM uses personality-guided LLM augmentation and cross-lingual attention distillation to raise balanced accuracy on multilingual personality recognition to 0.6332 on Essays and 0.7448 on Kaggle, outperforming standard BCE loss.

On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization

cs.LG · 2025-11-14 · unverdicted · novelty 5.0

MeZO enables larger models for on-device fine-tuning by estimating gradients via forward passes only, with theoretical size estimates and numerical results showing accuracy benefits when wall-clock time is sufficient.

RadarPLM: Adapting Pre-trained Language Models for Marine Radar Target Detection by Selective Fine-tuning

eess.SP · 2025-09-15 · unverdicted · novelty 5.0

RadarPLM adapts PLMs for marine radar target detection with lightweight adaptation and selective fine-tuning based on online learning values, reporting at least 6.35% average detection gains in low SCR conditions.

Large Language Model-Brained GUI Agents: A Survey

cs.AI · 2024-11-27 · unverdicted · novelty 4.0

A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.

Low-Rank Adaptation Redux for Large Models

cs.LG · 2026-04-23 · unverdicted · novelty 3.0

An overview revisits LoRA variants by categorizing advances in architectural design, efficient optimization, and applications while linking them to classical signal processing tools for principled fine-tuning.

citing papers explorer

Showing 18 of 18 citing papers.

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling · cs.LG · 2026-05-13 · unverdicted · none · ref 19
Generative Quantum-inspired Kolmogorov-Arnold Eigensolver · quant-ph · 2026-05-06 · unverdicted · none · ref 36
Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach · cs.LG · 2026-04-23 · unverdicted · none · ref 1
Structural Anchors and Reasoning Fragility: Understanding CoT Robustness in LLM4Code · cs.SE · 2026-04-14 · unverdicted · none · ref 25
MetaKE: Meta-Learning for Knowledge Editing Toward a Better Accuracy-Editability Trade-off · cs.CL · 2026-03-13 · unverdicted · none · ref 33
Roll Out and Roll Back: Diffusion LLMs are Their Own Efficiency Teachers · cs.CL · 2026-05-16 · unverdicted · none · ref 2
PrivScope: Task-scoped Disclosure Control for Hybrid Agentic Systems · cs.CR · 2026-05-15 · unverdicted · none · ref 2
One Prompt, Many Sounds: Modeling Listener Variability in LLM-Based Equalization · cs.SD · 2026-01-14 · unverdicted · none · ref 37
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning · cs.LG · 2025-05-22 · conditional · none · ref 14
A Survey on Vision-Language-Action Models for Embodied AI · cs.RO · 2024-05-23 · unverdicted · none · ref 279
Hidden Human-Like Nature of Machine-Generated Texts: Theory and Detection Enhancement · cs.CL · 2026-05-22 · unverdicted · none · ref 2
Multi-Level Contextual Token Relation Modeling for Machine-Generated Text Detection · cs.CL · 2026-05-15 · unverdicted · none · ref 2
Universal Smoothness via Bernstein Polynomials: A Constructive Approximation Approach for Activation Functions · cs.AI · 2026-05-04 · unverdicted · none · ref 20
Cross-Lingual Attention Distillation with Personality-Informed Generative Augmentation for Multilingual Personality Recognition · cs.CL · 2026-04-10 · unverdicted · none · ref 15
On-Device Fine-Tuning via Backprop-Free Zeroth-Order Optimization · cs.LG · 2025-11-14 · unverdicted · none · ref 24
RadarPLM: Adapting Pre-trained Language Models for Marine Radar Target Detection by Selective Fine-tuning · eess.SP · 2025-09-15 · unverdicted · none · ref 31
Large Language Model-Brained GUI Agents: A Survey · cs.AI · 2024-11-27 · unverdicted · none · ref 87
Low-Rank Adaptation Redux for Large Models · cs.LG · 2026-04-23 · unverdicted · none · ref 150
