pith. sign in

arxiv: 2606.21645 · v1 · pith:EA27QIMVnew · submitted 2026-06-19 · 💻 cs.CL · cs.LG

Behavioral and Representational Evidence of Binomial Ordering Preferences in Large Language Models

Pith reviewed 2026-06-26 14:15 UTC · model grok-4.3

classification 💻 cs.CL cs.LG
keywords binomial orderinglarge language modelsdistributional alignmentsparse probingpreference strengthmultilingual evaluationmodel steering
0
0 comments X

The pith

Large language models recover the dominant order of conventional word pairs but diverge from their exact corpus frequency distributions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether LLMs capture gradient frequency preferences in binomial expressions such as men and women, where both orders are grammatical but one is more common. It builds a dataset of 600 pairs across eight languages and compares corpus-derived ordering probabilities against the probabilities induced by six open-weight models using both categorical and distributional measures. Models recover the dominant order especially on strongly conventionalized pairs yet match the precise distributions less closely. This pattern holds across languages and indicates that directional success does not equate to faithful modeling of statistical nuance. Internal probing further shows that preference strength is represented in middle-to-late layers and can be altered by steering.

Core claim

While models often behaviorally recover the dominant corpus-preferred order, particularly for strongly conventionalized pairs, they align less well with the exact corpus preference distributions. This suggests that apparent directional order overstates how faithfully LLMs capture the statistical nuances of language use. Sparse probing verifies that the concept of preference strength is partially encoded among middle-to-late layers, and steering along probe-derived directions alters model-induced ordering distributions.

What carries the argument

Binomial ordering treated as a distributional alignment problem between corpus frequencies and model output probabilities, measured with both categorical accuracy and distributional metrics.

If this is right

  • Directional preference recovery alone does not demonstrate faithful capture of corpus frequency distributions.
  • Preference strength can be read out from middle-to-late layer representations via sparse probes.
  • Steering along probe-derived directions can change the ordering distributions produced by the model.
  • Evaluation of LLM statistical modeling must go beyond binary order choice to distributional alignment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If models are primarily matching memorized conventions rather than learning underlying distributions, then scaling or additional pretraining on raw counts may not close the gap.
  • The same probing and steering approach could be applied to other gradient linguistic phenomena such as collocation strength or syntactic optionality.
  • Cross-linguistic consistency of the mismatch suggests the limitation is architectural rather than language-specific.

Load-bearing premise

That the constructed 600-pair multilingual dataset's corpus-derived preferences serve as the correct ground-truth benchmark for how faithfully models should capture statistical nuances of language use.

What would settle it

A measurement showing that model-induced ordering probabilities correlate at the level of exact corpus frequencies across the full set of 600 pairs when evaluated with the same distributional metrics.

Figures

Figures reproduced from arXiv: 2606.21645 by Hinrich Sch\"utze, Volker Tresp, Yilun Liu, Yunpu Ma, Zhiqing Yang.

Figure 1
Figure 1. Figure 1: Pipeline for analyzing behavioral and representational binomial ordering preferences of LLMs. Corpus [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Item-level alignment between Llama-3.1-8B and corpus-derived ordering preferences across languages. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Model–corpus cross-linguistic correlation for [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Sparse probing results for corpus-derived preference strength in Qwen3-4B using last-token binomial [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Effects of steering on Qwen3-4B binomial order probabilities using the last-token steering vector at layer [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Steering case study for men and women us￾ing the last-token steering vector at layer 14. For each language, stacked bars compare the model probabilities of men and women (orange) and women and men (blue) at λ = 0, −10, −20, with λ = 0 denoting the original baseline. leaves less visible room for change, since a base￾line already pinned near zero or one mechanically limits how far probabilities can move, eve… view at source ↗
Figure 7
Figure 7. Figure 7: Corpus reference and Llama-3.1-8B cross-linguistic structure. Each cell reports the Spearman correla [PITH_FULL_IMAGE:figures/full_fig_p015_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Cross-linguistic alignment diagnostics for Qwen3-4B. Left: model–corpus correlations. Right: model– [PITH_FULL_IMAGE:figures/full_fig_p015_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Cross-linguistic alignment diagnostics for Qwen3-14B. Left: model–corpus correlations. Right: model– [PITH_FULL_IMAGE:figures/full_fig_p015_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Cross-linguistic alignment diagnostics for Llama-3.2-3B. Left: model–corpus correlations. Right: [PITH_FULL_IMAGE:figures/full_fig_p016_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Cross-linguistic alignment diagnostics for Gemma-3-4B. Left: model–corpus correlations. Right: [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Cross-linguistic alignment diagnostics for Gemma-3-12B. Left: model–corpus correlations. Right: [PITH_FULL_IMAGE:figures/full_fig_p016_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Item-level model–corpus alignment for different models across languages. Points are language-specific [PITH_FULL_IMAGE:figures/full_fig_p017_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Sparse probing results for corpus-derived preference strength in Qwen3-4B using mean-pooled binomial [PITH_FULL_IMAGE:figures/full_fig_p017_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Effects of steering on Qwen3-4B binomial order probabilities using the last-token steering vector at [PITH_FULL_IMAGE:figures/full_fig_p018_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Effects of steering on Qwen3-4B binomial order probabilities using the mean-pooled steering vector at [PITH_FULL_IMAGE:figures/full_fig_p018_16.png] view at source ↗
read the original abstract

Large language models (LLMs) can readily reproduce conventional expressions, yet their ability to model gradient frequency distributions remains underexplored. We investigate this using linguistic binomials, such as men and women, where both word permutations are grammatically valid but exhibit distinct, cross-linguistic variations in conventionality. We formalize binomial ordering as a distributional alignment problem, and construct a multilingual dataset of 600 binomial pairs across 8 languages. With categorical and distributional metrics, we measure and compare the corpus-derived preferences with model-induced ordering probabilities of 6 open-weight LLMs. While models often behaviorally recover the dominant corpus-preferred order, particularly for strongly conventionalized pairs, they align less well with the exact corpus preference distributions. This suggests that apparent directional order overstates how faithfully LLMs capture the statistical nuances of language use. Sparse probing verifies that the concept of preference strength is partially encoded among middle-to-late layers, and steering along probe-derived directions alters model-induced ordering distributions, demonstrating that the statistical behavioral preference of LLMs can be mechanistically measured and manipulated via internal representations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript claims that LLMs recover the dominant ordering in binomial pairs (e.g., men and women) from corpus data, especially for strongly conventionalized items, but show weaker alignment with the precise corpus preference distributions. This is evaluated via categorical and distributional metrics on a newly constructed multilingual dataset of 600 pairs across 8 languages, using 6 open-weight LLMs. The authors conclude that apparent directional recovery overstates faithfulness to statistical nuances of language use. Supporting representational evidence comes from sparse probing, which finds preference strength partially encoded in middle-to-late layers, and from steering experiments that alter model-induced ordering distributions.

Significance. If the central empirical patterns hold after methodological clarification, the work usefully distinguishes categorical from gradient statistical modeling in LLMs and supplies both behavioral metrics and mechanistic probing results. The construction of a multilingual binomial dataset and the combination of corpus comparison with internal representation analysis constitute concrete contributions that could support future evaluation of statistical fidelity in language models.

major comments (1)
  1. The central claim—that directional order overstates faithfulness to statistical nuances—requires that the corpus-derived preference distributions constitute the correct target. The abstract provides no information on pair selection criteria, frequency estimation procedures, genre or domain coverage of the source corpora, or cross-linguistic normalization, all of which are load-bearing for interpreting poorer distributional alignment as an LLM limitation rather than a benchmark artifact.
minor comments (2)
  1. [Abstract] The abstract refers to 'sparse probing' and 'probe-derived directions' without indicating the probe architecture, number of layers tested, or how directions were extracted; these details are needed for reproducibility even at the abstract level.
  2. The manuscript should specify the exact six open-weight LLMs evaluated and the precise categorical and distributional metrics employed, as these choices directly affect the reported alignment results.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive comment on methodological transparency. We address the point below and will revise the manuscript to strengthen the presentation of our benchmark.

read point-by-point responses
  1. Referee: The central claim—that directional order overstates faithfulness to statistical nuances—requires that the corpus-derived preference distributions constitute the correct target. The abstract provides no information on pair selection criteria, frequency estimation procedures, genre or domain coverage of the source corpora, or cross-linguistic normalization, all of which are load-bearing for interpreting poorer distributional alignment as an LLM limitation rather than a benchmark artifact.

    Authors: We agree that the abstract should briefly summarize key aspects of the dataset construction to support interpretation of the results. The full manuscript (Section 3) details pair selection from large web-scale corpora using frequency and conventionality filters, frequency estimation via add-one smoothed log-odds ratios on co-occurrence counts, genre coverage spanning news, web text, and literary sources, and language-specific normalization to handle tokenization differences. We will revise the abstract to include a concise statement of these procedures, clarifying that the corpus distributions serve as an empirical proxy for usage patterns rather than an infallible target. This addresses the concern without altering the central claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in the empirical comparison chain

full rationale

The paper's central claims rely on direct empirical comparisons between corpus-derived binomial preferences from a constructed 600-pair multilingual dataset and model-induced ordering probabilities, evaluated via categorical and distributional metrics, plus sparse probing of internal activations. These measurements use external corpus data and model internals as independent benchmarks without any reduction to quantities defined by the paper's own fitted parameters or self-citations. No self-definitional loops, fitted inputs called predictions, or load-bearing self-citation chains are present, rendering the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on domain assumptions about corpus data as ground truth and dataset representativeness; no free parameters or invented entities are described.

axioms (2)
  • domain assumption Corpus frequencies accurately reflect conventionality and preference strength in language use across the selected languages.
    Used as the benchmark for all model comparisons in the distributional alignment evaluation.
  • domain assumption The constructed set of 600 binomial pairs is representative of cross-linguistic variations in binomial ordering.
    Basis for the multilingual dataset used to measure model-induced ordering probabilities.

pith-pipeline@v0.9.1-grok · 5734 in / 1365 out tokens · 35456 ms · 2026-06-26T14:15:50.121136+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

276 extracted references · 47 canonical work pages · 1 internal anchor

  1. [3]

    Token-level Identification of Multiword Expressions using Pre-trained Multilingual Language Models

    Swaminathan, Raghuraman and Cook, Paul. Token-level Identification of Multiword Expressions using Pre-trained Multilingual Language Models. Proceedings of the 19th Workshop on Multiword Expressions (MWE 2023). 2023. doi:10.18653/v1/2023.mwe-1.1

  2. [4]

    Set-Based Prompting: Provably Solving the Language Model Order Dependency Problem

    McIlroy-Young, Reid and Brown, Katrina and Olson, Conlan and Zhang, Linjun and Dwork, Cynthia. Set-Based Prompting: Provably Solving the Language Model Order Dependency Problem. Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024). 2024

  3. [5]

    Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little

    Sinha, Koustuv and Jia, Robin and Hupkes, Dieuwke and Pineau, Joelle and Williams, Adina and Kiela, Douwe. Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. doi:10.18653/v1/2021.emnlp-main.230

  4. [6]

    Neural computation , volume=

    Adaptive mixtures of local experts , author=. Neural computation , volume=. 1991 , publisher=

  5. [7]

    arXiv preprint arXiv:1312.4314 , year=

    Learning factored representations in a deep mixture of experts , author=. arXiv preprint arXiv:1312.4314 , year=

  6. [8]

    Science China technological sciences , volume=

    Pre-trained models for natural language processing: A survey , author=. Science China technological sciences , volume=. 2020 , publisher=

  7. [9]

    arXiv preprint arXiv:2212.05055 , year=

    Sparse upcycling: Training mixture-of-experts from dense checkpoints , author=. arXiv preprint arXiv:2212.05055 , year=

  8. [10]

    Advances in Neural Information Processing Systems , volume=

    Learn to explain: Multimodal reasoning via thought chains for science question answering , author=. Advances in Neural Information Processing Systems , volume=. 2022 , url=

  9. [11]

    Proceedings of the IEEE conference on computer vision and pattern recognition , pages=

    Vizwiz grand challenge: Answering visual questions from blind people , author=. Proceedings of the IEEE conference on computer vision and pattern recognition , pages=. 2018 , url=

  10. [12]

    Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=

    Towards vqa models that can read , author=. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition , pages=. 2019 , url=

  11. [13]

    arXiv preprint arXiv:2307.06281 , year=

    Mmbench: Is your multi-modal model an all-around player? , author=. arXiv preprint arXiv:2307.06281 , year=

  12. [14]

    Advances in neural information processing systems , volume=

    Visual instruction tuning , author=. Advances in neural information processing systems , volume=. 2024 , url=

  13. [15]

    Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=

    Improved baselines with visual instruction tuning , author=. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition , pages=. 2024 , url=

  14. [16]

    Neural computation , volume=

    Hierarchical mixtures of experts and the EM algorithm , author=. Neural computation , volume=. 1994 , publisher=

  15. [17]

    arXiv preprint arXiv:1701.06538 , year=

    Outrageously large neural networks: The sparsely-gated mixture-of-experts layer , author=. arXiv preprint arXiv:1701.06538 , year=

  16. [18]

    International Conference on Machine Learning , pages=

    Glam: Efficient scaling of language models with mixture-of-experts , author=. International Conference on Machine Learning , pages=. 2022 , organization=

  17. [19]

    arXiv preprint arXiv:2006.16668 , year=

    Gshard: Scaling giant models with conditional computation and automatic sharding , author=. arXiv preprint arXiv:2006.16668 , year=

  18. [21]

    arXiv preprint arXiv:2401.04088 , year=

    Mixtral of experts , author=. arXiv preprint arXiv:2401.04088 , year=

  19. [22]

    arXiv preprint arXiv:2402.01739 , year=

    Openmoe: An early effort on open mixture-of-experts language models , author=. arXiv preprint arXiv:2402.01739 , year=

  20. [23]

    arXiv preprint arXiv:2401.15947 , year=

    Moe-llava: Mixture of experts for large vision-language models , author=. arXiv preprint arXiv:2401.15947 , year=

  21. [24]

    arXiv preprint arXiv:2401.06066 , year=

    Deepseekmoe: Towards ultimate expert specialization in mixture-of-experts language models , author=. arXiv preprint arXiv:2401.06066 , year=

  22. [25]

    arXiv preprint arXiv:2304.01933 , year=

    Llm-adapters: An adapter family for parameter-efficient fine-tuning of large language models , author=. arXiv preprint arXiv:2304.01933 , year=

  23. [26]

    International Conference on Learning Representations , year=

    Adaptive budget allocation for parameter-efficient fine-tuning , author=. International Conference on Learning Representations , year=

  24. [27]

    Advances in Neural Information Processing Systems , volume=

    Qlora: Efficient finetuning of quantized llms , author=. Advances in Neural Information Processing Systems , volume=. 2024 , url=

  25. [28]

    arXiv preprint arXiv:2406.04984 , year=

    MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter , author=. arXiv preprint arXiv:2406.04984 , year=

  26. [29]

    arXiv preprint arXiv:2011.14522 , year=

    Feature learning in infinite-width neural networks , author=. arXiv preprint arXiv:2011.14522 , year=

  27. [30]

    arXiv preprint arXiv:2110.14168 , year=

    Training verifiers to solve math word problems , author=. arXiv preprint arXiv:2110.14168 , year=

  28. [31]

    Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=

    Are NLP Models really able to Solve Simple Math Word Problems? , author=. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies , pages=. 2021 , url=

  29. [32]

    Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=

    Learning to solve arithmetic word problems with verb categorization , author=. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) , pages=. 2014 , url=

  30. [33]

    Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=

    Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems , author=. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , pages=. 2017 , url=

  31. [34]

    Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing , pages=

    Solving General Arithmetic Word Problems , author=. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing , pages=. 2015 , url=

  32. [35]

    Transactions of the Association for Computational Linguistics , volume=

    Parsing algebraic word problems into equations , author=. Transactions of the Association for Computational Linguistics , volume=. 2015 , publisher=

  33. [36]

    arXiv preprint arXiv:2202.08906 , year=

    St-moe: Designing stable and transferable sparse expert models , author=. arXiv preprint arXiv:2202.08906 , year=

  34. [37]

    Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning , author=. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) , pages=. 2021 , url=

  35. [38]

    arXiv preprint arXiv:2406.18219 , year=

    A Closer Look into Mixture-of-Experts in Large Language Models , author=. arXiv preprint arXiv:2406.18219 , year=

  36. [39]

    Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing , pages=

    On the benefits of learning to route in mixture-of-experts models , author=. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing , pages=. 2023 , url=

  37. [40]

    arXiv preprint arXiv:2407.01906 , year=

    Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models , author=. arXiv preprint arXiv:2407.01906 , year=

  38. [41]

    Fast and robust early-exiting framework for autoregressive language models with synchronized parallel decoding , url =

    Bae, Sangmin and Ko, Jongwoo and Song, Hwanjun and Yun, Se-Young , journal =. Fast and robust early-exiting framework for autoregressive language models with synchronized parallel decoding , url =

  39. [42]

    doi:10.18653/v1/2020.findings-emnlp.148 , pages =

    Barbieri, Francesco and Camacho-Collados, Jose and Espinosa Anke, Luis and Neves, Leonardo , booktitle =. doi:10.18653/v1/2020.findings-emnlp.148 , pages =

  40. [43]

    Glass , bibsource =

    Anthony Bau and Yonatan Belinkov and Hassan Sajjad and Nadir Durrani and Fahim Dalvi and James R. Glass , bibsource =. Identifying and Controlling Important Neurons in Neural Machine Translation , url =. Proc. of ICLR , publisher =

  41. [44]

    Language models can explain neurons in language models , url =

    Bills, Steven and Cammarata, Nick and Mossing, Dan and Tillman, Henk and Gao, Leo and Goh, Gabriel and Sutskever, Ilya and Leike, Jan and Wu, Jeff and Saunders, William , journal =. Language models can explain neurons in language models , url =

  42. [45]

    Towards monosemanticity: Decomposing language models with dictionary learning , url =

    Bricken, Trenton and Templeton, Adly and Batson, Joshua and Chen, Brian and Jermyn, Adam and Conerly, Tom and Turner, Nick and Anil, Cem and Denison, Carson and Askell, Amanda and others , journal =. Towards monosemanticity: Decomposing language models with dictionary learning , url =

  43. [46]

    Tom B. Brown and Benjamin Mann and Nick Ryder and Melanie Subbiah and Jared Kaplan and Prafulla Dhariwal and Arvind Neelakantan and Pranav Shyam and Girish Sastry and Amanda Askell and Sandhini Agarwal and Ariel Herbert. Language Models are Few-Shot Learners , url =. Advances in Neural Information Processing Systems 33: Annual Conference on Neural Informa...

  44. [47]

    Discovering latent knowledge in language models without supervision , url =

    Burns, Collin and Ye, Haotian and Klein, Dan and Steinhardt, Jacob , journal =. Discovering latent knowledge in language models without supervision , url =

  45. [48]

    Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021) , doi =

    Caselli, Tommaso and Basile, Valerio and Mitrovi. Proceedings of the 5th Workshop on Online Abuse and Harms (WOAH 2021) , doi =

  46. [49]

    Red Teaming Deep Neural Networks with Feature Synthesis Tools , url =

    Stephen Casper and Yuxiao Li and Jiawei Li and Tong Bu and Kevin Zhang and Kaivalya Hariharan and Dylan Hadfield-Menell , journal =. Red Teaming Deep Neural Networks with Feature Synthesis Tools , url =

  47. [50]

    Deep language algorithms predict semantic comprehension from brain activity , url =

    Caucheteux, Charlotte and Gramfort, Alexandre and King, Jean-R. Deep language algorithms predict semantic comprehension from brain activity , url =. Scientific reports , number =

  48. [51]

    Manco and E

    Michelangelo Ceci and Corrado Loglisci and G. Manco and E. Masciari and Z. Ras and R. Goebel and Yuzuru Tanaka , doi =. New Frontiers in Mining Complex Patterns , title =

  49. [52]

    Universal sentence encoder , url =

    Cer, Daniel and Yang, Yinfei and Kong, Sheng-yi and Hua, Nan and Limtiaco, Nicole and John, Rhomni St and Constant, Noah and Guajardo-Cespedes, Mario and Yuan, Steve and Tar, Chris and others , journal =. Universal sentence encoder , url =

  50. [53]

    Description Based Text Classification with Reinforcement Learning , url =

    Duo Chai and Wei Wu and Qinghong Han and Fei Wu and Jiwei Li , bibsource =. Description Based Text Classification with Reinforcement Learning , url =. Proc. of ICML , pages =

  51. [54]

    A survey on evaluation of large language models , url =

    Chang, Yupeng and Wang, Xu and Wang, Jindong and Wu, Yuan and Yang, Linyi and Zhu, Kaijie and Chen, Hao and Yi, Xiaoyuan and Wang, Cunxiang and Wang, Yidong and others , journal =. A survey on evaluation of large language models , url =

  52. [55]

    doi:10.18653/v1/2020.acl-main.194 , pages =

    Chen, Jiaao and Yang, Zichao and Yang, Diyi , booktitle =. doi:10.18653/v1/2020.acl-main.194 , pages =

  53. [56]

    EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism , url =

    Chen, Yanxi and Pan, Xuchen and Li, Yaliang and Ding, Bolin and Zhou, Jingren , journal =. EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism , url =

  54. [57]

    Very Deep Convolutional Networks for Text Classification , url =

    Conneau, Alexis and Schwenk, Holger and Barrault, Lo. Very Deep Convolutional Networks for Text Classification , url =. Proceedings of the 15th Conference of the

  55. [58]

    Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , url =

    Conneau, Alexis and Kiela, Douwe and Schwenk, Holger and Barrault, Lo. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data , url =. Proc. of EMNLP , doi =

  56. [59]

    Knowledge Neurons in Pretrained Transformers , booktitle =

    Dai, Damai and Dong, Li and Hao, Yaru and Sui, Zhifang and Chang, Baobao and Wei, Furu , booktitle =. Knowledge Neurons in Pretrained Transformers , url =. doi:10.18653/v1/2022.acl-long.581 , pages =

  57. [60]

    Glass , bibsource =

    Fahim Dalvi and Nadir Durrani and Hassan Sajjad and Yonatan Belinkov and Anthony Bau and James R. Glass , bibsource =. What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep. The Thirty-Third. doi:10.1609/aaai.v33i01.33016309 , pages =

  58. [61]

    Discovering Latent Concepts Learned in

    Fahim Dalvi and Abdul Rafae Khan and Firoj Alam and Nadir Durrani and Jia Xu and Hassan Sajjad , bibsource =. Discovering Latent Concepts Learned in. Proc. of ICLR , publisher =

  59. [62]

    Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing , pages=

    Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model , author=. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing , pages=. 2023 , url=

  60. [63]

    BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions , author=. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) , pages=. 2019 , url=

  61. [64]

    Proceedings of the AAAI conference on artificial intelligence , volume=

    Piqa: Reasoning about physical commonsense in natural language , author=. Proceedings of the AAAI conference on artificial intelligence , volume=. 2020 , url=

  62. [65]

    Social IQa: Commonsense Reasoning about Social Interactions , author=. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) , pages=. 2019 , url=

  63. [66]

    Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , pages=

    HellaSwag: Can a Machine Really Finish Your Sentence? , author=. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics , pages=. 2019 , url=

  64. [67]

    Communications of the ACM , volume=

    Winogrande: An adversarial winograd schema challenge at scale , author=. Communications of the ACM , volume=. 2021 , publisher=

  65. [68]

    arXiv preprint arXiv:1803.05457 , year=

    Think you have solved question answering? try arc, the ai2 reasoning challenge , author=. arXiv preprint arXiv:1803.05457 , year=

  66. [69]

    Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing , pages=

    Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering , author=. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing , pages=. 2018 , url=

  67. [70]

    International conference on machine learning , pages=

    Language modeling with gated convolutional networks , author=. International conference on machine learning , pages=. 2017 , organization=

  68. [71]

    arXiv preprint arXiv:2002.05202 , year=

    Glu variants improve transformer , author=. arXiv preprint arXiv:2002.05202 , year=

  69. [72]

    Advances in Neural Information Processing Systems , volume=

    Multi-head adapter routing for cross-task generalization , author=. Advances in Neural Information Processing Systems , volume=. 2024 , url=

  70. [73]

    BERT: pre-training of deep bidirectional transformers for language understanding

    Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina , booktitle =. doi:10.18653/v1/N19-1423 , pages =

  71. [74]

    Fine-tuning pretrained language models: Weight initializations, data orders, and early stopping , url =

    Dodge, Jesse and Ilharco, Gabriel and Schwartz, Roy and Farhadi, Ali and Hajishirzi, Hannaneh and Smith, Noah , journal =. Fine-tuning pretrained language models: Weight initializations, data orders, and early stopping , url =

  72. [75]

    Analyzing Individual Neurons in Pre-trained Language Models , url =

    Durrani, Nadir and Sajjad, Hassan and Dalvi, Fahim and Belinkov, Yonatan , booktitle =. Analyzing Individual Neurons in Pre-trained Language Models , url =. doi:10.18653/v1/2020.emnlp-main.395 , pages =

  73. [76]

    Linguistic correlation analysis: Discovering salient neurons in deepnlp models , url =

    Durrani, Nadir and Dalvi, Fahim and Sajjad, Hassan , journal =. Linguistic correlation analysis: Discovering salient neurons in deepnlp models , url =

  74. [77]

    A mathematical framework for transformer circuits , url =

    Elhage, Nelson and Nanda, Neel and Olsson, Catherine and Henighan, Tom and Joseph, Nicholas and Mann, Ben and Askell, Amanda and Bai, Yuntao and Chen, Anna and Conerly, Tom and others , journal =. A mathematical framework for transformer circuits , url =

  75. [78]

    Softmax linear units , url =

    Elhage, Nelson and Hume, Tristan and Olsson, Catherine and Nanda, Neel and Henighan, Tom and Johnston, Scott and ElShowk, Sheer and Joseph, Nicholas and DasSarma, Nova and Mann, Ben and others , journal =. Softmax linear units , url =

  76. [79]

    Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity , url =

    Fedus, William and Zoph, Barret and Shazeer, Noam , journal =. Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity , url =

  77. [80]

    Language-agnostic BERT Sentence Embedding

    Feng, Fangxiaoyu and Yang, Yinfei and Cer, Daniel and Arivazhagan, Naveen and Wang, Wei , booktitle =. Language-agnostic. doi:10.18653/v1/2022.acl-long.62 , pages =

  78. [81]

    S im CSE : Simple Contrastive Learning of Sentence Embeddings

    Gao, Tianyu and Yao, Xingcheng and Chen, Danqi , booktitle =. doi:10.18653/v1/2021.emnlp-main.552 , pages =

  79. [82]

    arXiv preprint arXiv:2312.12379 , year=

    Mixture of cluster-conditional lora experts for vision-language instruction tuning , author=. arXiv preprint arXiv:2312.12379 , year=

  80. [83]

    arXiv preprint arXiv:2402.12851 , year=

    Moelora: Contrastive learning guided mixture of experts on parameter-efficient fine-tuning for large language models , author=. arXiv preprint arXiv:2402.12851 , year=

Showing first 80 references.