A controlled formal language task reveals fine-tuning outperforms in-context learning on in-distribution generalization but equals it on out-of-distribution, with ICL showing greater sensitivity to model size and tokenization.
Title resolution pending
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
unclear 1representative citing papers
Cascaded systems remain the most reliable for speech translation overall, but recent SpeechLLMs match or outperform them in many conditions while standalone speech models lag.
CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.
ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
FinFRE-RAG combines importance-guided feature reduction with label-aware retrieval-augmented generation to boost LLM performance on tabular fraud detection across four public datasets while providing human-readable rationales.
GS-Quant generates coarse-to-fine discrete codes for KG entities via semantic hierarchy injection and causal sequence reconstruction, enabling LLMs to perform knowledge graph completion by treating the codes as vocabulary tokens.
MSMO framework achieves claimed SOTA cross-lingual ABSA via sentence- and aspect-level alignment, code-switching, consistency training, and knowledge distillation.
Fine-tuning on annotated English and Japanese dialogues improves clustering of backchannels and fillers and makes generated utterances closer to human ones.
Frequent sentence-level text improves LLM prompting and fine-tuning performance across math, translation, commonsense, and tool-use tasks via a proposed frequency law and curriculum ordering.
citing papers explorer
-
Fine-tuning vs. In-context Learning in Large Language Models: A Formal Language Learning Perspective
A controlled formal language task reveals fine-tuning outperforms in-context learning on in-distribution generalization but equals it on out-of-distribution, with ICL showing greater sensitivity to model size and tokenization.
-
Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs
Cascaded systems remain the most reliable for speech translation overall, but recent SpeechLLMs match or outperform them in many conditions while standalone speech models lag.
-
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.
-
ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
ClusterRAG applies density-based clustering to user profiles for collaborative retrieval in personalized RAG and reports best performance on LaMP tasks by combining target and similar-user profiles.
-
Understanding Structured Financial Data with LLMs: A Case Study on Fraud Detection
FinFRE-RAG combines importance-guided feature reduction with label-aware retrieval-augmented generation to boost LLM performance on tabular fraud detection across four public datasets while providing human-readable rationales.
-
GS-Quant: Granular Semantic and Generative Structural Quantization for Knowledge Graph Completion
GS-Quant generates coarse-to-fine discrete codes for KG entities via semantic hierarchy injection and causal sequence reconstruction, enabling LLMs to perform knowledge graph completion by treating the codes as vocabulary tokens.
-
MSMO-ABSA: Multi-Scale and Multi-Objective Optimization for Cross-Lingual Aspect-Based Sentiment Analysis
MSMO framework achieves claimed SOTA cross-lingual ABSA via sentence- and aspect-level alignment, code-switching, consistency training, and knowledge distillation.
-
Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models
Fine-tuning on annotated English and Japanese dialogues improves clustering of backchannels and fillers and makes generated utterances closer to human ones.
-
Adam's Law: Textual Frequency Law on Large Language Models
Frequent sentence-level text improves LLM prompting and fine-tuning performance across math, translation, commonsense, and tool-use tasks via a proposed frequency law and curriculum ordering.