TouchAI: Exploring human-AI perceptual alignment in touch through language model representations
Pith reviewed 2026-05-24 00:03 UTC · model grok-4.3
The pith
LLMs show partial alignment with human touch perceptions of textiles via verbal descriptions, but the match varies sharply by material.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the Guess What Textile task, participants handle a target and a reference textile and describe the differences between them in words; the LLM then identifies the target by computing similarity between the description and stored textile representations in its embedding space. Results indicate a degree of perceptual alignment that varies significantly across samples, with strong performance on silk satin and weak performance on cotton denim. Participants also report that the LLM outputs do not closely match their own tactile experiences.
What carries the argument
The Guess What Textile interaction, which converts verbal tactile descriptions into LLM embedding similarities to guess the target textile without visual input.
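The guessing step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes cosine similarity (the metric named in the simulated rebuttal), and the three-dimensional vectors, the `TEXTILES` table, and the function names are hypothetical stand-ins for real LLM embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here: a zero vector matches nothing
    return dot / (norm_a * norm_b)

def guess_textile(description_vec, textile_vecs):
    """Return the candidate whose stored vector is most similar to the description."""
    scores = {name: cosine_similarity(description_vec, vec)
              for name, vec in textile_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Toy vectors (assumption): a real system would obtain these from an embedding model.
TEXTILES = {
    "silk satin": [0.9, 0.1, 0.2],
    "cotton denim": [0.1, 0.9, 0.3],
}
description = [0.8, 0.2, 0.1]  # hypothetical embedding of a participant's description

guess, scores = guess_textile(description, TEXTILES)
print(guess, scores)
```

Because the guess is a plain argmax over similarities, any describability difference between textiles feeds straight into the result, which is the confound the referee report raises.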
If this is right
- Perceptual alignment between LLMs and human touch exists but is material-dependent.
- Alignment is stronger for silk satin than for cotton denim.
- Participants judge LLM predictions as imperfect reflections of their tactile sensations.
- Identifying sources of alignment variance can guide improvements in touch-related AI tasks.
- Better human-AI perceptual alignment in touch would support future everyday applications involving tactile judgment.
Where Pith is reading between the lines
- The same language-only method could be applied to non-textile objects to test whether alignment variance is general or textile-specific.
- If verbalization limits alignment, models that ingest direct haptic sensor data might close the gap for materials that are hard to describe.
- Material-dependent success suggests that alignment quality tracks how readily a textile's surface properties translate into everyday language.
Load-bearing premise
Participants' verbal descriptions fully and accurately capture their tactile sensations, and distances in the LLM embedding space correspond to human perceptual similarity.
What would settle it
An experiment in which the LLM's identification accuracy stays at chance level for every textile pair or in which participants rate every model prediction as a poor match to their felt experience.
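One way to make "stays at chance level" concrete is an exact one-sided binomial test per textile. The sketch below is an assumption about how such a check could be run, not an analysis from the paper: the counts, the labels "textile A" and "textile B", the number of candidates, and the 0.05 threshold are all hypothetical.

```python
from math import comb

def binomial_tail(successes, trials, chance):
    """P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(comb(trials, k) * chance**k * (1 - chance)**(trials - k)
               for k in range(successes, trials + 1))

def above_chance(successes, trials, n_candidates, alpha=0.05):
    """Test whether identification accuracy exceeds uniform guessing."""
    chance = 1.0 / n_candidates  # assumes each candidate is equally likely a priori
    p_value = binomial_tail(successes, trials, chance)
    return p_value, p_value < alpha

# Hypothetical counts: (correct guesses, total trials) per textile.
counts = {"textile A": (9, 10), "textile B": (3, 10)}
for name, (correct, total) in counts.items():
    p_value, significant = above_chance(correct, total, n_candidates=5)
    print(f"{name}: p = {p_value:.4g}, above chance: {significant}")
```

If every textile failed this check, the alignment claim would lose its quantitative support; the participant ratings would still need to be assessed separately.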
Original abstract
Aligning large language models (LLMs) behaviour with human intent is critical for future AI. An important yet often overlooked aspect of this alignment is the perceptual alignment. Perceptual modalities like touch are more multifaceted and nuanced compared to other sensory modalities such as vision. This work investigates how well LLMs align with human touch experiences using the "textile hand" task. We created a "Guess What Textile" interaction in which participants were given two textile samples -- a target and a reference -- to handle. Without seeing them, participants described the differences between them to the LLM. Using these descriptions, the LLM attempted to identify the target textile by assessing similarity within its high-dimensional embedding space. Our results suggest that a degree of perceptual alignment exists, however varies significantly among different textile samples. For example, LLM predictions are well aligned for silk satin, but not for cotton denim. Moreover, participants didn't perceive their textile experiences closely matched by the LLM predictions. This is only the first exploration into perceptual alignment around touch, exemplified through textile hand. We discuss possible sources of this alignment variance, and how better human-AI perceptual alignment can benefit future everyday tasks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents an exploratory study on perceptual alignment between LLMs and humans in the tactile domain via a 'Guess What Textile' task. Participants handle two textile samples (target and reference) without seeing them and provide free-form verbal descriptions of their differences; the LLM then attempts to identify the target by computing similarity in its embedding space. The central claim is that a degree of alignment exists but varies significantly across textiles (e.g., well-aligned for silk satin, poorly for cotton denim), while participants themselves did not perceive their experiences as closely matched by the LLM outputs. The work positions itself as a first exploration into touch-based perceptual alignment.
Significance. If the proxy of verbal descriptions plus embedding similarity validly captures tactile perceptual alignment, the results could offer preliminary evidence that LLMs encode some cross-modal structure relevant to touch, with implications for applications like assistive technologies or sensory AI. The explicit acknowledgment of participant-LLM mismatch is a strength that highlights limitations rather than overclaiming. However, the absence of methodological details (participant N, metrics, controls) limits assessment of whether any alignment is robust or artifactual.
major comments (3)
- [Abstract] Abstract: The claim that 'LLM predictions are well aligned for silk satin, but not for cotton denim' is presented without any quantitative similarity scores, statistical tests, participant numbers, or controls for description quality/textile selection; this directly undermines evaluation of the central varying-alignment result.
- [Abstract] Abstract: The interpretation of embedding similarity as evidence of perceptual alignment rests on the assumption that free-form verbal descriptions sufficiently encode tactile percepts and that LLM embedding cosine (or other) similarity tracks human-perceived tactile similarity; this assumption is load-bearing for the claim yet is directly questioned by the paper's own report that 'participants didn't perceive their textile experiences closely matched by the LLM predictions.'
- [Abstract] Abstract/Methods (inferred from task description): No details are provided on the exact similarity metric in embedding space, how textiles were selected, blinding procedures, or inter-participant consistency, all of which are required to rule out confounds such as linguistic priors or describability differences rather than genuine tactile alignment.
minor comments (1)
- [Abstract] Abstract: Minor phrasing issues such as 'LLM behaviour' (consistent spelling) and 'textile hand' task could be clarified for readers unfamiliar with the domain.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our exploratory study. We address each major point below, agreeing where details are missing from the abstract and committing to revisions for clarity while defending the cautious framing of our partial-alignment findings.
- Referee: [Abstract] Abstract: The claim that 'LLM predictions are well aligned for silk satin, but not for cotton denim' is presented without any quantitative similarity scores, statistical tests, participant numbers, or controls for description quality/textile selection; this directly undermines evaluation of the central varying-alignment result.
  Authors: We agree the abstract is too concise and omits key details. The full manuscript describes an exploratory design without formal statistical tests owing to its small scale and qualitative focus. We will revise the abstract to state the participant number, note the absence of statistical testing, and briefly describe textile selection criteria based on distinct tactile properties.
  Revision: yes
- Referee: [Abstract] Abstract: The interpretation of embedding similarity as evidence of perceptual alignment rests on the assumption that free-form verbal descriptions sufficiently encode tactile percepts and that LLM embedding cosine (or other) similarity tracks human-perceived tactile similarity; this assumption is load-bearing for the claim yet is directly questioned by the paper's own report that 'participants didn't perceive their textile experiences closely matched by the LLM predictions.'
  Authors: This observation is correct and aligns with our intent. We report the participant-LLM mismatch precisely to signal that verbal descriptions plus embedding similarity constitute an imperfect proxy. Our claim is limited to 'a degree of alignment exists, however varies significantly' rather than strong equivalence. We will revise the abstract to label the measure explicitly as a proxy and cross-reference the mismatch finding.
  Revision: partial
- Referee: [Abstract] Abstract/Methods (inferred from task description): No details are provided on the exact similarity metric in embedding space, how textiles were selected, blinding procedures, or inter-participant consistency, all of which are required to rule out confounds such as linguistic priors or describability differences rather than genuine tactile alignment.
  Authors: The full manuscript's Methods section specifies cosine similarity on embeddings, selection of textiles for contrasting tactile qualities, and blind handling (no visual access). Inter-participant consistency was not quantified but varied with description content. We will add a concise methods summary to the abstract to make these elements explicit and allow readers to assess potential confounds.
  Revision: yes
Circularity Check
No circularity: empirical study with off-the-shelf embeddings
The paper reports an empirical human-subject experiment in which participants provide verbal descriptions of textile pairs and an off-the-shelf LLM embedding space is used to compute similarity-based guesses. No derivations, equations, fitted parameters, or predictions are defined in terms of themselves. No self-citation chains or uniqueness theorems are invoked to justify core claims. The results consist of direct experimental measurements of alignment variance across textiles; the work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: Verbal descriptions provided by participants accurately reflect their tactile sensory experiences of the textiles.
- Domain assumption: Similarity in the LLM's embedding space corresponds to human perceptual similarity for touch sensations.