{"total":40,"items":[{"citing_arxiv_id":"2605.23168","ref_index":3,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"PoisonForge: Task-Level Targeted Poisoning Benchmark for Instruction-Tuned LLMs","primary_cat":"cs.CR","submitted_at":"2026-05-22T02:41:13+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"PoisonForge benchmark shows that 1% poisoned examples achieve over 70% attack success rate on targeted tasks across 11 of 12 tested LLMs with under 0.5% leakage to non-target tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.22435","ref_index":157,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Assisted Counterspeech Writing at the Crossroads of Hate Speech and Misinformation","primary_cat":"cs.CL","submitted_at":"2026-05-21T13:02:08+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs generate adequate counterspeech for co-occurring hate and misinformation in 40% of cases, with a mixed knowledge strategy from fact-checkers and NGOs proving most effective after expert revision.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16600","ref_index":39,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Where Pretraining writes and Alignment reads: the asymmetry of Transformer weight space","primary_cat":"cs.LG","submitted_at":"2026-05-15T20:00:59+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Pretraining and alignment induce asymmetric geometric traces in transformer weights because alignment updates concentrate in read pathways due to activation covariance while write pathways inherit less structure from alignment losses.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14521","ref_index":101,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Enjoy Your Layer Normalization with the Computational Efficiency of RMSNorm","primary_cat":"cs.LG","submitted_at":"2026-05-14T08:05:39+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A framework to identify and convert foldable layer normalizations to RMSNorm for exact equivalence and faster inference in deep neural networks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14194","ref_index":44,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"GradShield: Alignment Preserving Finetuning","primary_cat":"cs.CL","submitted_at":"2026-05-13T23:19:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"GradShield removes data points likely to cause safety misalignment during LLM finetuning by computing a Finetuning Implicit Harmfulness Score and applying adaptive thresholding, keeping attack success rates below 6% while preserving utility.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11887","ref_index":4,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models","primary_cat":"cs.CL","submitted_at":"2026-05-12T10:01:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Qwen-Scope provides open-source sparse autoencoders for Qwen models that function as practical interfaces for steering, evaluating, data workflows, and optimizing large language models.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"j∈ {1, . . . ,D}:z j(xi)>0 , (2) where zj(xi) is the j-th component of the SAE latent representation of xi, extracted at the last token position. Note that zj(xi) implicitly incorporates the Top-k ReLU activation applied within the SAE encoder; we omit this detail from the notation for brevity. The feature footprint of the entire benchmark is: F(D) = N[ i=1 F(xi). (3) 4.2 Benchmark Redundancy Performance-based redundancy.The most direct way to measure redundancy is to ask: how small can a subset be while still preserving the model ranking? To illustrate this intuitively, consider the following 7 two simple mathematical problems, drawn from GSM8K (Cobbe et al., 2021) and MATH (Hendrycks et al., 2021), respectively: • Candy has 15 light blue spools of thread, 45 dark blue spools of thread, 40 light green spools of thread, and 50 dark green spools of thread. What percent of her spools are blue? • Gina has five pairs of white socks, three pairs of black socks, and two pairs of red socks. What percent of her socks are red? Both problems share an identical mathematical structure, which involves computing a ratio and express- ing it as a percentage, and they differ only in surface context. As training corpora scale up, models become increasingly robust to surface-level context variation, rendering repeated evaluation on struc- turally identical problems redundant. For the purpose of model ranking, such samples contribute little discriminative power. To quantify the discriminative power of benchmark samples, we introduce the following framework. Fix a panel of M models. Let p∈R M denote the vector of model accuracies on the full benchmark D, and ˆp(S) the corresponding vector on a subset S. We measure ranking agreement via Kendall'sτ: τ(S,D) =τ p, ˆp(S) \u0001 , (4) Kendall's τ is preferred over Spearman's ρ here because it has a direct combinatorial interpretation: (τ+ 1)/2 equals the fraction of model pairs whose relative ordering is preserved by the subset. For a si"},{"citing_arxiv_id":"2605.08513","ref_index":28,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"A Single Neuron Is Sufficient to Bypass Safety Alignment in Large Language Models","primary_cat":"cs.CL","submitted_at":"2026-05-08T21:45:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Suppressing one refusal neuron or amplifying one concept neuron bypasses safety alignment in LLMs from 1.7B to 70B parameters without training or prompt engineering.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01167","ref_index":23,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Minimizing Collateral Damage in Activation Steering","primary_cat":"cs.LG","submitted_at":"2026-05-01T23:52:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Activation steering is cast as constrained optimization that minimizes collateral damage by weighting perturbations according to the empirical second-moment matrix of activations instead of assuming isotropy.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.01046","ref_index":44,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Learning in the Fisher Subspace: A Guided Initialization for LoRA Fine-Tuning","primary_cat":"cs.LG","submitted_at":"2026-05-01T19:20:25+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.00369","ref_index":98,"ref_count":2,"confidence":0.55,"is_internal_anchor":false,"paper_title":"InvEvolve: Evolving White-Box Inventory Policies via Large Language Models with Performance Guarantees","primary_cat":"cs.LG","submitted_at":"2026-05-01T03:12:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"InvEvolve evolves white-box inventory policies from LLMs with statistical safety guarantees and outperforms classical and deep learning methods on synthetic and real retail data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.21901","ref_index":20,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"GiVA: Gradient-Informed Bases for Vector-Based Adaptation","primary_cat":"cs.CL","submitted_at":"2026-04-23T17:48:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"GiVA uses gradients to initialize vector adapters so they match LoRA performance at eight times lower rank while keeping extreme parameter efficiency.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.19015","ref_index":60,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"FedProxy: Federated Fine-Tuning of LLMs via Proxy SLMs and Heterogeneity-Aware Fusion","primary_cat":"cs.LG","submitted_at":"2026-04-21T03:06:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"FedProxy replaces weak adapters with a proxy SLM for federated LLM fine-tuning, outperforming prior methods and approaching centralized performance via compression, heterogeneity-aware aggregation, and training-free fusion.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18562","ref_index":56,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"AnchorSeg: Language Grounded Query Banks for Reasoning Segmentation","primary_cat":"cs.CV","submitted_at":"2026-04-20T17:49:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"AnchorSeg uses ordered query banks of latent reasoning tokens plus a spatial anchor token and a Token-Mask Cycle Consistency loss to achieve 67.7% gIoU and 68.1% cIoU on the ReasonSeg benchmark.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18510","ref_index":18,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Different Paths to Harmful Compliance: Behavioral Side Effects and Mechanistic Divergence Across LLM Jailbreaks","primary_cat":"cs.CR","submitted_at":"2026-04-20T17:01:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Different LLM jailbreak techniques achieve similar harmful compliance but lead to distinct behavioral side effects and mechanistic changes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.21074","ref_index":37,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation","primary_cat":"cs.CL","submitted_at":"2025-02-28T14:07:48+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"CODI compresses explicit CoT into continuous space via self-distillation and is the first implicit method to match explicit CoT performance on GSM8k at GPT-2 scale with 3.1x compression and 28.2% higher accuracy than prior implicit approaches.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.18449","ref_index":80,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution","primary_cat":"cs.SE","submitted_at":"2025-02-25T18:45:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SWE-RL uses RL on software evolution data to train LLMs achieving 41% on SWE-bench Verified with generalization to other reasoning tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2502.10248","ref_index":176,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model","primary_cat":"cs.CV","submitted_at":"2025-02-14T15:58:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Step-Video-T2V describes a 30B-parameter text-to-video model with custom Video-VAE, 3D DiT, flow matching, and Video-DPO that claims state-of-the-art results on a new internal benchmark.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2412.05496","ref_index":17,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Flex Attention: A Programming Model for Generating Optimized Attention Kernels","primary_cat":"cs.LG","submitted_at":"2024-12-07T01:46:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"FlexAttention supplies a compiler-driven interface that expresses common attention variants in a few lines of PyTorch and emits optimized kernels whose speed matches hand-written implementations.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2408.04840","ref_index":64,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models","primary_cat":"cs.CV","submitted_at":"2024-08-09T03:25:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"mPLUG-Owl3 introduces hyper attention blocks to integrate vision and language for long image-sequence understanding and reports SOTA results on single-image, multi-image, and video benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2408.00724","ref_index":6,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models","primary_cat":"cs.AI","submitted_at":"2024-08-01T17:16:04+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Empirical analysis shows scaling inference compute via strategies like tree search can be more efficient than scaling model parameters, with 7B models plus novel search outperforming 34B models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2406.11717","ref_index":38,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Refusal in Language Models Is Mediated by a Single Direction","primary_cat":"cs.LG","submitted_at":"2024-06-17T16:36:12+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Refusal in language models is mediated by a single direction in residual stream activations that can be erased to disable safety or added to elicit refusal.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2406.08464","ref_index":35,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing","primary_cat":"cs.CL","submitted_at":"2024-06-12T17:52:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Magpie synthesizes 300K high-quality alignment instructions from Llama-3-Instruct via auto-regressive prompting on partial templates, enabling fine-tuned models to match official instruct performance on AlpacaEval, ArenaHard, and WildBench.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2405.14782","ref_index":121,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Lessons from the Trenches on Reproducible Evaluation of Language Models","primary_cat":"cs.CL","submitted_at":"2024-05-23T16:50:49+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2405.01470","ref_index":32,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"WildChat: 1M ChatGPT Interaction Logs in the Wild","primary_cat":"cs.CL","submitted_at":"2024-05-02T17:00:02+00:00","verdict":"ACCEPT","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"WildChat releases a dataset of 1 million ChatGPT conversations with timestamps, demographics, and headers, claimed to be the most diverse and multilingual such resource available.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2403.07691","ref_index":102,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"ORPO: Monolithic Preference Optimization without Reference Model","primary_cat":"cs.CL","submitted_at":"2024-03-12T14:34:08+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":8.0,"formal_verification":"none","one_line_summary":"ORPO performs preference alignment during supervised fine-tuning via a monolithic odds ratio penalty, allowing 7B models to outperform larger state-of-the-art models on alignment benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2402.13116","ref_index":52,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"A Survey on Knowledge Distillation of Large Language Models","primary_cat":"cs.CL","submitted_at":"2024-02-20T16:17:37+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A comprehensive survey of knowledge distillation for LLMs structured around algorithms, skill enhancement, and vertical applications, highlighting data augmentation as a key enabler.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2402.11411","ref_index":101,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Aligning Modalities in Vision Large Language Models via Preference Fine-tuning","primary_cat":"cs.LG","submitted_at":"2024-02-18T00:56:16+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"POVID generates AI-created preference data to fine-tune vision-language models with DPO, reducing hallucinations and improving benchmark scores.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2401.15077","ref_index":34,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty","primary_cat":"cs.LG","submitted_at":"2024-01-26T18:59:01+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"EAGLE resolves feature-level uncertainty in speculative sampling via one-step token advancement, delivering 2.7x-3.5x speedup on LLaMA2-Chat 70B and doubled throughput across multiple model families and tasks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2401.10020","ref_index":23,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Self-Rewarding Language Models","primary_cat":"cs.CL","submitted_at":"2024-01-18T14:43:47+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Iterative self-rewarding via LLM-as-Judge in DPO training on Llama 2 70B improves instruction following and self-evaluation, outperforming GPT-4 on AlpacaEval 2.0.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2401.03568","ref_index":20,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Agent AI: Surveying the Horizons of Multimodal Interaction","primary_cat":"cs.AI","submitted_at":"2024-01-07T19:11:18+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"The paper defines Agent AI as interactive multimodal systems that perceive grounded data and generate embodied actions, arguing this approach can mitigate hallucinations in foundation models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2312.17090","ref_index":69,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels","primary_cat":"cs.CV","submitted_at":"2023-12-28T16:10:25+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Q-Align trains LMMs on discrete text-defined levels for visual scoring, achieving SOTA on IQA, IAA, and VQA while unifying the tasks in OneAlign.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2312.13771","ref_index":57,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"AppAgent: Multimodal Agents as Smartphone Users","primary_cat":"cs.CV","submitted_at":"2023-12-21T11:52:45+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"AppAgent lets large language models operate diverse smartphone apps via visual interactions and learns app usage from exploration or demonstrations.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2311.16867","ref_index":211,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"The Falcon Series of Open Language Models","primary_cat":"cs.CL","submitted_at":"2023-11-28T15:12:47+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Falcon-180B is a 180B-parameter open decoder-only model trained on 3.5 trillion tokens that approaches PaLM-2-Large performance at lower cost and is released with dataset extracts.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2309.11495","ref_index":73,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Chain-of-Verification Reduces Hallucination in Large Language Models","primary_cat":"cs.CL","submitted_at":"2023-09-20T17:50:55+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Chain-of-Verification reduces hallucinations in large language models by drafting responses, planning independent verification questions, answering them separately, and generating a final verified output.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2309.08532","ref_index":127,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"EvoPrompt: Connecting LLMs with Evolutionary Algorithms Yields Powerful Prompt Optimizers","primary_cat":"cs.CL","submitted_at":"2023-09-15T16:50:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"EvoPrompt uses LLMs to run evolutionary operators on populations of prompts, outperforming human-engineered prompts by up to 25% on BIG-Bench Hard tasks across 31 datasets.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2309.05922","ref_index":86,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"A Survey of Hallucination in Large Foundation Models","primary_cat":"cs.AI","submitted_at":"2023-09-12T02:34:06+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":3.0,"formal_verification":"none","one_line_summary":"A survey classifying hallucination phenomena specific to large foundation models, establishing evaluation criteria, examining mitigation strategies, and discussing future directions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2309.03883","ref_index":73,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models","primary_cat":"cs.CL","submitted_at":"2023-09-07T17:45:31+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DoLa reduces hallucinations in LLMs by contrasting logits from later versus earlier layers during decoding, improving truthfulness on TruthfulQA by 12-17 absolute points without fine-tuning or retrieval.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2305.14233","ref_index":200,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"Enhancing Chat Language Models by Scaling High-quality Instructional Conversations","primary_cat":"cs.CL","submitted_at":"2023-05-23T16:49:14+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"UltraChat supplies 1.5 million high-quality multi-turn dialogues that, when used to fine-tune LLaMA, produce UltraLLaMA, which outperforms prior open-source chat models including Vicuna.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2305.11206","ref_index":12,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"LIMA: Less Is More for Alignment","primary_cat":"cs.CL","submitted_at":"2023-05-18T17:45:22+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Fine-tuning a 65B model on 1,000 high-quality examples produces output that humans rate as good as or better than GPT-4 in 43% of cases, indicating most capabilities come from pretraining.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2303.16199","ref_index":174,"ref_count":1,"confidence":0.55,"is_internal_anchor":false,"paper_title":"LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention","primary_cat":"cs.CV","submitted_at":"2023-03-28T17:59:12+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLaMA-Adapter turns frozen LLaMA 7B into a capable instruction follower using only 1.2M new parameters and zero-init attention, matching Alpaca while extending to image-conditioned reasoning on ScienceQA and COCO.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}