
arxiv: 2509.03234 · v2 · submitted 2025-09-03 · 💻 cs.LG

TeRA: Vector-based Random Tensor Network for High-Rank Adaptation of Large Language Models

Pith reviewed 2026-05-18 19:14 UTC · model grok-4.3

classification 💻 cs.LG
keywords Parameter-Efficient Fine-Tuning · Tensor Networks · High-Rank Adaptation · Large Language Models · LoRA · PEFT · Random Initialization · Tucker Decomposition

The pith

TeRA enables high-rank weight updates in LLMs while training only as many parameters as vector-based adapters.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TeRA to resolve the usual trade-off in fine-tuning large language models, where high-rank updates require many more trainable parameters than vector-based methods. It represents each weight update through a Tucker-like tensor network whose large factors are randomly initialized once, frozen, and shared across all layers. Only a small set of layer-specific scaling vectors is trained, keeping the trainable parameter count as low as that of vector-based adapters. The reported experiments show this construction matching or beating existing high-rank adapters on standard adaptation tasks, and the paper's theoretical analysis and ablations are offered as evidence that the random factors supply the needed expressivity.

Core claim

TeRA parametrizes the tensorized weight update matrix as a Tucker-like tensor network, whereby large randomly initialized factors are frozen and shared across layers, while only small layer-specific scaling vectors, corresponding to diagonal entries of factor matrices, are trained. This achieves high-rank weight updates while retaining the parameter efficiency of vector-based PEFT adapters, matching or even outperforming existing high-rank adapters.

What carries the argument

Tucker-like tensor network that decomposes the weight update, keeping large random factors frozen and shared while training only per-layer scaling vectors.
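
As a rough illustration of this construction, the NumPy sketch below builds a toy 16 × 16 update from one frozen random core and four frozen random factor matrices shared by two layers, with only the per-layer scaling vectors differing. The tensor shape, the ranks, the placement of the diagonal scalings on the factor columns, and every name (`core`, `factors`, `delta_w`) are assumptions made for illustration; the paper's exact contraction may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: a 16 x 16 update viewed as a 4 x 4 x 4 x 4 tensor.
dims = (4, 4, 4, 4)
ranks = (4, 4, 4, 4)

# Frozen random pieces, drawn once and shared by every layer (assumed layout).
core = rng.standard_normal(ranks)
factors = [rng.standard_normal((n, r)) for n, r in zip(dims, ranks)]


def delta_w(scales):
    """Contract the shared core with column-scaled factors, then matricize."""
    # A * d scales the columns of A, i.e. it equals A @ diag(d).
    scaled = [A * d for A, d in zip(factors, scales)]
    t = np.einsum("abcd,ia,jb,kc,ld->ijkl", core, *scaled)
    return t.reshape(dims[0] * dims[1], dims[2] * dims[3])


# Per-layer scaling vectors: the only quantities that would be trained.
layer_scales = [[rng.uniform(0.5, 1.5, r) for r in ranks] for _ in range(2)]
updates = [delta_w(s) for s in layer_scales]

trainable_per_layer = sum(ranks)
ranks_found = [int(np.linalg.matrix_rank(u)) for u in updates]
print(trainable_per_layer, ranks_found)
```

In this toy setting each layer would train 16 scalars rather than the 256 entries of a dense 16 × 16 update, and `ranks_found` reports the matrix rank of each layer's update.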

If this is right

  • High-rank updates become feasible without increasing the trainable parameter budget beyond vector-based methods.
  • Adapter performance equals or exceeds prior high-rank techniques on language-model fine-tuning benchmarks.
  • The separation of shared random structure from per-layer scalings reduces redundancy across model layers.
  • Theoretical analysis and ablation results support the claim that the random tensor factors encode sufficient high-rank directions.
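
To make the parameter-budget point concrete, the sketch below counts trainable parameters per adapted weight matrix under three schemes. The LoRA count r(d_out + d_in) is standard; the vector-based count assumes the usual VeRA layout of one length-r and one length-d_out scaling vector; the Tucker-style count assumes one scaling vector per tensor mode. The 4096 × 4096 shape and all rank choices are hypothetical and are not the paper's configurations.

```python
def lora_params(d_out, d_in, r):
    """Trainable entries of the two low-rank LoRA matrices."""
    return r * (d_out + d_in)


def vera_style_params(d_out, r):
    """Assumed VeRA-style layout: one length-r and one length-d_out vector."""
    return d_out + r


def tucker_scaling_params(ranks):
    """Assumed Tucker-style layout: one scaling vector per tensor mode."""
    return sum(ranks)


d_out = d_in = 4096  # hypothetical projection shape

counts = {
    "dense": d_out * d_in,
    "lora_r8": lora_params(d_out, d_in, r=8),
    "vera_style_r256": vera_style_params(d_out, r=256),
    "tucker_scaling_4x64": tucker_scaling_params((64, 64, 64, 64)),
}
for name, value in counts.items():
    print(f"{name:>22}: {value}")
```

The numbers only show orders of magnitude; a fair comparison would use the exact ranks and tensorization reported in the paper's tables.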

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The shared random factors may implicitly align adaptation directions across layers without explicit coordination.
  • The same random-tensor pattern could be tested on other parameter-efficient methods such as prompt tuning.
  • Scaling the approach to models with thousands of layers would test whether the fixed factors remain effective without retraining.

Load-bearing premise

Randomly initialized and frozen large factors in the tensor network, when paired with only layer-specific scaling vectors, suffice to capture the high-rank information required for effective adaptation.

What would settle it

An ablation or benchmark run in which replacing the frozen random factors with learned ones yields no gain, or where TeRA falls measurably behind a comparable high-rank adapter on a task known to need high-rank capacity.

Figures

Figures reproduced from arXiv: 2509.03234 by Danilo Mandic, Giorgos Iacovides, Wuyang Zhou, Yuxuan Gu.

Figure 1: TeRA exhibits superior performance, high-rank …
Figure 2: A comparison between LoRA (Hu et al. 2022) and our proposed TeRA method. LoRA represents the weight update …
Figure 3: Rank analysis of ∆Wq (max allowed rank of 4096) and ∆Wv (max allowed rank of 1024) across Llama-3-8B layers. TeRA consistently maintains a high (near-full) rank. In contrast, methods like LoRA and VeRA have lower-rank weight updates, limiting their expressivity.
Figure 4: Average accuracy across eight commonsense rea…
Figure 5: Rank of ∆Wq and ∆Wv (max possible rank = 4096) across different layers in Llama-2-7B under different tensorization schemes in the commonsense reasoning task.
  Initialization of Frozen Factor Matrices. We explore different initialization choices for the frozen factor matrices. Specifically, we compare TeRA with a variant, TeRAiden, where its frozen factor matrices are all identity matrices. Note that TeRAide…
Figure 6: Comparison between TeRA and TeRAiden on the commonsense reasoning dataset with Llama-2-7B.
  Conclusion. We have introduced TeRA, a high-rank PEFT adapter which utilizes a tensor network to parameterize the tensorized weight updates. In this way, TeRA offers a more effective alternative to existing vector-based adapters, achieving much better performances and high-rank updates but with a similar amount of tr…

Original abstract

Parameter-Efficient Fine-Tuning (PEFT) methods, such as Low-Rank Adaptation (LoRA), have significantly reduced the number of trainable parameters needed in fine-tuning large language models (LLMs). The developments of LoRA-style adapters have considered two main directions: (1) enhancing model expressivity with high-rank adapters, and (2) aiming for further parameter reduction, as exemplified by vector-based methods. However, these approaches come with a trade-off, as achieving the expressivity of high-rank weight updates typically comes at the cost of sacrificing the extreme parameter efficiency offered by vector-based techniques. To address this issue, we propose a vector-based random Tensor network for high-Rank Adaptation (TeRA), a novel PEFT method that achieves high-rank weight updates while retaining the parameter efficiency of vector-based PEFT adapters. This is achieved by parametrizing the tensorized weight update matrix as a Tucker-like tensor network (TN), whereby large randomly initialized factors are frozen and shared across layers, while only small layer-specific scaling vectors, corresponding to diagonal entries of factor matrices, are trained. Comprehensive experiments demonstrate that TeRA matches or even outperforms existing high-rank adapters, while requiring as few trainable parameters as vector-based methods. Theoretical analysis and ablation studies validate the effectiveness of the proposed TeRA method. The code is available at https://github.com/guyuxuan9/TeRA.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes TeRA, a PEFT method for LLMs that parametrizes weight updates via a Tucker-like tensor network. Large randomly initialized factors are frozen and shared across layers, while only small per-layer scaling vectors (corresponding to diagonal entries) are trained. This is claimed to deliver high-rank adaptation updates at the parameter cost of vector-based methods. Comprehensive experiments, theoretical analysis, and ablations are reported to show that TeRA matches or outperforms existing high-rank adapters while using as few trainable parameters as vector-based PEFT.

Significance. If the central claims are substantiated, TeRA would usefully bridge the expressivity-efficiency trade-off in PEFT by showing that a shared random tensor basis plus per-layer scalings can suffice for effective high-rank updates. The availability of code and the inclusion of both theoretical analysis and ablations are positive features that aid reproducibility and verification.

major comments (2)
  1. [§3] §3 (Method), Tucker-like TN parametrization: The central claim that frozen, shared random factors plus per-layer scaling vectors produce effective high-rank updates rests on the untested assumption that a single random subspace already contains the principal adaptation directions across layers. If the random basis is misaligned with layer-wise gradient structure, scaling alone cannot recover the missing expressivity; the paper must demonstrate stability under different random seeds for the frozen factors.
  2. [§4.3] §4.3 (Ablations) and experimental tables: The reported performance gains over high-rank baselines are load-bearing for the claim, yet the manuscript provides insufficient detail on whether ablations include replacement of the shared random factors by an independent draw or by a learned basis; without such controls the results cannot rule out that success depends on a fortunate random initialization rather than the architecture itself.
minor comments (2)
  1. [§3] Notation for the scaling vectors and the precise definition of the Tucker contraction should be clarified with an explicit equation showing which modes are contracted and which remain diagonal.
  2. [§4] Figure captions and table headers should explicitly state the number of trainable parameters for each compared method to make the efficiency claim immediately verifiable.
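
One way the contraction requested in the first minor comment could be written is sketched below. It is an assumption about the construction, not the paper's own equation: G is the frozen shared core, A^(n) are the frozen shared factor matrices, d_n^(ℓ) is the trainable scaling vector for mode n in layer ℓ, every rank index r_n is contracted, and the diagonal scaling acts on the rank index of each factor.

```latex
\[
\Delta\mathcal{W}^{(\ell)}
  = \mathcal{G}
    \times_1 \bigl(\mathbf{A}^{(1)} \operatorname{diag}(\mathbf{d}^{(\ell)}_{1})\bigr)
    \times_2 \cdots
    \times_N \bigl(\mathbf{A}^{(N)} \operatorname{diag}(\mathbf{d}^{(\ell)}_{N})\bigr),
\qquad
\Delta\mathcal{W}^{(\ell)}_{i_1 \cdots i_N}
  = \sum_{r_1,\ldots,r_N} \mathcal{G}_{r_1 \cdots r_N}
    \prod_{n=1}^{N} A^{(n)}_{i_n r_n}\, d^{(\ell)}_{n,\,r_n}.
\]
```

Under this reading, the weight update matrix is obtained by grouping the output indices i_1, …, i_N into row and column modes, and only the entries of the vectors d_n^(ℓ) depend on the layer.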

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the opportunity to address these points. We agree that additional empirical verification of stability and clearer ablation controls will strengthen the manuscript. We outline our responses below and will incorporate the suggested revisions.

  1. Referee: [§3] §3 (Method), Tucker-like TN parametrization: The central claim that frozen, shared random factors plus per-layer scaling vectors produce effective high-rank updates rests on the untested assumption that a single random subspace already contains the principal adaptation directions across layers. If the random basis is misaligned with layer-wise gradient structure, scaling alone cannot recover the missing expressivity; the paper must demonstrate stability under different random seeds for the frozen factors.

    Authors: We acknowledge that demonstrating robustness to the choice of random seed for the shared frozen factors is valuable for substantiating the central claim. Section 3.2 provides a theoretical argument that a random Tucker-like basis can span the necessary high-rank space with high probability, but we agree this should be complemented by empirical checks. In the revised version we will add a new table (or subsection in §4) reporting mean and standard deviation of performance across at least five independent random seeds for the frozen factors on the main benchmarks, thereby directly addressing the concern about potential misalignment. revision: yes
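
The seed-robustness reporting promised here could be organized along the lines of the sketch below. It is only a toy proxy: it redraws hypothetical frozen factors for five seeds and summarizes the rank of the resulting update, whereas the promised table would summarize task performance. The sizes, the column-scaling placement, and the helper name `toy_update_rank` are assumptions.

```python
import statistics

import numpy as np


def toy_update_rank(seed, dims=(4, 4, 4, 4)):
    """Rank of a toy Tucker-style update whose frozen parts use `seed`."""
    rng = np.random.default_rng(seed)
    core = rng.standard_normal(dims)
    factors = [rng.standard_normal((n, n)) for n in dims]
    # Stand-in for trained scaling vectors (kept away from zero).
    scales = [rng.uniform(0.5, 1.5, n) for n in dims]
    scaled = [A * d for A, d in zip(factors, scales)]
    t = np.einsum("abcd,ia,jb,kc,ld->ijkl", core, *scaled)
    return int(np.linalg.matrix_rank(t.reshape(dims[0] * dims[1], -1)))


seeds = [0, 1, 2, 3, 4]
ranks = [toy_update_rank(s) for s in seeds]
mean_rank = statistics.mean(ranks)
std_rank = statistics.stdev(ranks)
print(f"rank over {len(seeds)} seeds: mean={mean_rank}, std={std_rank}")
```

The same loop structure would apply to real runs by replacing `toy_update_rank` with a function that fine-tunes with a given seed for the frozen factors and returns the benchmark score.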

  2. Referee: [§4.3] §4.3 (Ablations) and experimental tables: The reported performance gains over high-rank baselines are load-bearing for the claim, yet the manuscript provides insufficient detail on whether ablations include replacement of the shared random factors by an independent draw or by a learned basis; without such controls the results cannot rule out that success depends on a fortunate random initialization rather than the architecture itself.

    Authors: We appreciate the referee highlighting the need for explicit controls that isolate the benefit of sharing a single random tensor basis. The existing ablations in §4.3 vary the scaling-vector dimension and core rank but do not yet include the requested variants. We will expand §4.3 with two new controls: (i) per-layer independent random draws of the factor matrices (instead of a shared draw), and (ii) a learned (non-frozen) basis version. These additions will be presented alongside the original results so readers can assess whether performance depends on a fortunate initialization or on the shared-random architecture itself. revision: yes
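
The cost side of the two proposed controls can be sketched with simple counting. The snippet below compares trainable parameters and frozen values to keep for the shared-random design, per-layer independent random draws, and a learned basis. It assumes the learned basis is still shared across layers, and the layer count, tensor shape, and ranks are hypothetical.

```python
from math import prod


def variant_costs(num_layers, dims, ranks):
    """Trainable and frozen parameter counts for three ablation variants."""
    core = prod(ranks)                                 # entries of the core tensor
    factors = sum(n * r for n, r in zip(dims, ranks))  # entries of all factor matrices
    scales = sum(ranks)                                # scaling entries per layer
    basis = core + factors
    return {
        # One frozen random basis shared by all layers.
        "shared_frozen": {"trainable": num_layers * scales, "stored_frozen": basis},
        # A separate frozen random basis drawn for each layer.
        "independent_frozen": {
            "trainable": num_layers * scales,
            "stored_frozen": num_layers * basis,
        },
        # Assumption: one shared basis that is trained instead of frozen.
        "learned_basis": {"trainable": basis + num_layers * scales, "stored_frozen": 0},
    }


costs = variant_costs(num_layers=32, dims=(64, 64, 64, 64), ranks=(8, 8, 8, 8))
for name, cost in costs.items():
    print(name, cost)
```

Frozen random values could in principle be regenerated from a seed rather than stored, so the `stored_frozen` column is an upper bound on what must be kept on disk.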

Circularity Check

0 steps flagged

Explicit architectural parametrization with no load-bearing self-definition or fitted-input prediction

full rationale

The paper defines TeRA directly as a Tucker-like tensor network in which large random factors are frozen and shared while only per-layer scaling vectors are trained. This is presented as an engineering choice that trades off expressivity and parameter count, not as a quantity derived from or equivalent to the fitted scaling vectors themselves. No equations reduce the claimed high-rank adaptation performance to the trainable parameters by construction, and no self-citation chain is invoked to justify uniqueness or to rename an existing result. The central claim therefore remains an independent architectural proposal whose validity is tested empirically rather than presupposed by the method's own inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on the modeling assumption that a random Tucker-like decomposition with frozen shared factors can stand in for full high-rank updates when only scaling vectors are learned.

free parameters (1)
  • layer-specific scaling vectors
    Trainable parameters fitted during fine-tuning; their values are determined by the adaptation objective.
axioms (1)
  • Domain assumption: randomly initialized frozen factors in the tensor network suffice to represent the necessary high-rank structure when scaled per layer.
    Invoked to justify freezing the large components while training only the vectors.
invented entities (1)
  • TeRA tensor network parametrization (no independent evidence)
    purpose: To achieve high-rank updates with vector-level trainable parameter count
    New architectural construction introduced by the paper.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · 5 internal anchors

  3. [3]

    GPT-4 Technical Report

    Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F. L.; Almeida, D.; Altenschmidt, J.; Altman, S.; Anadkat, S.; et al. 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774

  4. [4]

    Banerjee, S.; and Lavie, A. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, 65--72

  5. [5]

    Bershatsky, D.; Cherniuk, D.; Daulbaev, T.; Mikhalev, A.; and Oseledets, I. 2024. LoTR: Low Tensor Rank Weight Adaptation. arXiv:2402.01376

  6. [6]

    Bisk, Y.; Zellers, R.; Gao, J.; Choi, Y.; et al. 2020. PIQA: Reasoning about Physical Commonsense in Natural Language. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 7432--7439

  7. [7]

    Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J. D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. 2020. Language models are few-shot learners. Advances in neural information processing systems, 33: 1877--1901

  8. [8]

    Cichocki, A.; Mandic, D.; De Lathauwer, L.; Zhou, G.; Zhao, Q.; Caiafa, C.; and Phan, H. A. 2015. Tensor Decompositions for Signal Processing Applications: From two-way to multiway component analysis. IEEE Signal Processing Magazine, 32(2): 145--163

  9. [9]

    Clark, C.; Lee, K.; Chang, M.-W.; Kwiatkowski, T.; Collins, M.; and Toutanova, K. 2019. BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2924--2936

  10. [10]

    Clark, P.; Cowhey, I.; Etzioni, O.; Khot, T.; Sabharwal, A.; Schoenick, C.; and Tafjord, O. 2018. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge. arXiv preprint arXiv:1803.05457

  11. [11]

    Cobbe, K.; Kosaraju, V.; Bavarian, M.; Chen, M.; Jun, H.; Kaiser, L.; Plappert, M.; Tworek, J.; Hilton, J.; Nakano, R.; et al. 2021. Training Verifiers to Solve Math Word Problems. arXiv preprint arXiv:2110.14168

  12. [12]

    De Lathauwer, L.; De Moor, B.; and Vandewalle, J. 2000a. A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4): 1253--1278

  13. [13]

    De Lathauwer, L.; De Moor, B.; and Vandewalle, J. 2000b. On the best rank-1 and rank-(r1, r2,..., rn) approximation of higher-order tensors. SIAM Journal on Matrix Analysis and Applications, 21(4): 1324--1342

  14. [14]

    Dinan, E.; Logacheva, V.; Malykh, V.; Miller, A.; Shuster, K.; Urbanek, J.; Kiela, D.; Szlam, A.; Serban, I.; Lowe, R.; et al. 2019. The Second Conversational Intelligence Challenge (ConvAI2). In The NeurIPS'18 Competition: From Machine Learning to Intelligent Conversations, 187--208. Springer

  15. [15]

    Grattafiori, A.; Dubey, A.; Jauhri, A.; Pandey, A.; Kadian, A.; Al-Dahle, A.; Letman, A.; Mathur, A.; Schelten, A.; Vaughan, A.; et al. 2024. The Llama 3 Herd of Models. arXiv preprint arXiv:2407.21783

  16. [16]

    Gu, Y.; Zhou, W.; Iacovides, G.; and Mandic, D. 2025. TensorLLM: Tensorising Multi-Head Attention for Enhanced Reasoning and Compression in LLMs. arXiv preprint arXiv:2501.15674

  17. [17]

    Hu, E. J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; and Chen, W. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In International Conference on Learning Representations

  18. [18]

    Hu, Z.; Wang, L.; Lan, Y.; Xu, W.; Lim, E.-P.; Bing, L.; Xu, X.; Poria, S.; and Lee, R. 2023. LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 5254--5276

  19. [19]

    Huang, Q.; Ko, T.; Zhuang, Z.; Tang, L.; and Zhang, Y. 2025. HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models. In The Thirteenth International Conference on Learning Representations

  20. [20]

    Iacovides, G.; Zhou, W.; Li, C.; Zhao, Q.; and Mandic, D. 2025. Domain-Aware Tensor Network Structure Search. arXiv preprint arXiv:2505.23537

  21. [21]

    Iacovides, G.; Zhou, W.; and Mandic, D. 2024. Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection. arXiv preprint arXiv:2410.10728

  22. [22]

    Jiang, T.; Huang, S.; Luo, S.; Zhang, Z.; Huang, H.; Wei, F.; Deng, W.; Sun, F.; Zhang, Q.; Wang, D.; and Zhuang, F. 2024. MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning. arXiv:2405.12130

  23. [23]

    Koncel-Kedziorski, R.; Roy, S.; Amini, A.; Kushman, N.; and Hajishirzi, H. 2016. MAWPS: A Math Word Problem Repository. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1152--1157. San Diego, California: Association for Computational Linguistics

  24. [24]

    Kopiczko, D. J.; Blankevoort, T.; and Asano, Y. M. 2024. VeRA: Vector-based Random Matrix Adaptation. In The Twelfth International Conference on Learning Representations

  25. [25]

    Lester, B.; Al-Rfou, R.; and Constant, N. 2021. The Power of Scale for Parameter-Efficient Prompt Tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 3045--3059. Association for Computational Linguistics

  26. [26]

    Li, C.; Zeng, J.; Li, C.; Caiafa, C.; and Zhao, Q. 2023. Alternating local enumeration (TnALE): solving tensor network structure search with fewer evaluations. In Proceedings of the 40th International Conference on Machine Learning, ICML'23. JMLR.org

  27. [27]

    Lin, C.-Y. 2004. ROUGE: A Package for Automatic Evaluation of Summaries. In Text Summarization Branches Out, 74--81. Barcelona, Spain: Association for Computational Linguistics

  28. [28]

    Ling, W.; Yogatama, D.; Dyer, C.; and Blunsom, P. 2017. Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 158--167

  29. [29]

    Liu, X.; Ji, K.; Fu, Y.; Tam, W.; Du, Z.; Yang, Z.; and Tang, J. 2022. P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks. In Muresan, S.; Nakov, P.; and Villavicencio, A., eds., Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 61--68. Dublin, Ireland: Associat...

  30. [30]

    Loshchilov, I.; and Hutter, F. 2019. Decoupled Weight Decay Regularization. In International Conference on Learning Representations

  31. [31]

    Mihaylov, T.; Clark, P.; Khot, T.; and Sabharwal, A. 2018. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2381--2391

  32. [32]

    Oseledets, I. V. 2011. Tensor-train decomposition. SIAM Journal on Scientific Computing, 33(5): 2295--2317

  33. [33]

    Patel, A.; Bhattamishra, S.; and Goyal, N. 2021. Are NLP Models really able to Solve Simple Math Word Problems? In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2080--2094. Association for Computational Linguistics

  34. [34]

    Sakaguchi, K.; Le Bras, R.; Bhagavatula, C.; and Choi, Y. 2020. WinoGrande: An adversarial winograd schema challenge at scale. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, 8732--8740

  35. [35]

    Sap, M.; Rashkin, H.; Chen, D.; LeBras, R.; and Choi, Y. 2019. SocialIQA: Commonsense Reasoning about Social Interactions. In Conference on Empirical Methods in Natural Language Processing

  36. [36]

    Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288

  37. [37]

    Tucker, L. R.; et al. 1964. The extension of factor analysis to three-dimensional matrices. Contributions to mathematical psychology, 110119: 110--182

  38. [38]

    Wang, M.; Duc, K. D.; Fischer, J.; and Song, Y. S. 2017. Operator norm inequalities between tensor unfoldings on the partition lattice. Linear algebra and its applications, 520: 44--66

  39. [39]

    Yang, Y.; Zhou, J.; Wong, N.; and Zhang, Z. 2024. LoRETTA: Low-Rank Economic Tensor-Train Adaptation for Ultra-Low-Parameter Fine-Tuning of Large Language Models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 3161--3176

  40. [40]

    Zellers, R.; Holtzman, A.; Bisk, Y.; Farhadi, A.; and Choi, Y. 2019. HellaSwag: Can a Machine Really Finish Your Sentence? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 4791--4800

  41. [41]

    Zhang, T.; Kishore, V.; Wu, F.; Weinberger, K. Q.; and Artzi, Y. 2020. BERTScore: Evaluating Text Generation with BERT. In International Conference on Learning Representations