TactX: Learning Shared Tactile Representations Across Diverse Sensors

Carmelo Sferrazza; Junsung Park; Sachin Bhadang; Sha Yi; Xiaolong Wang

arxiv: 2606.31236 · v1 · pith:24GBCEULnew · submitted 2026-06-30 · 💻 cs.RO


Junsung Park, Sachin Bhadang, Carmelo Sferrazza, Sha Yi, Xiaolong Wang

Pith reviewed 2026-07-01 05:21 UTC · model grok-4.3

classification 💻 cs.RO

keywords: tactile representations, shared latent space, sensor transfer, contact-rich manipulation, zero-shot transfer, multimodal tactile, robot manipulation, policy transfer


The pith

TactX learns a shared latent space across tactile sensors of different types, enabling zero-shot policy transfer between them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes TactX to learn transferable tactile representations across resistive, magnetic, and vision-based sensors. It uses paired contact data to train modality-specific encoders that map inputs to a common latent space. This allows policies trained on one sensor to be applied directly to another. Experiments on four manipulation tasks show improved success rates over vision-only approaches. This matters for making tactile sensing more practical across varied robot hardware.

Core claim

TactX maps heterogeneous tactile observations into a shared latent space through modality-specific encoders trained on paired contact data. Such paired interactions provide a natural alignment signal across modalities, and the encoders are jointly trained across all sensor pairs, inducing a consistent latent space for all sensor types. Our experiments show that TactX aligns tactile representations across sensors while preserving object-level contact information. Policies trained with one sensor transfer zero-shot to physically distinct sensors through the shared latent, improving the average success rate from 27.5% for vision-only policy to 45.9% on four contact-rich manipulation tasks.

What carries the argument

Modality-specific encoders, jointly trained on paired contact data, induce a consistent latent space across resistive, magnetic, and vision-based sensors.
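The Figure 2 caption states that paired observations are aligned with InfoNCE. As a minimal sketch of what such an alignment term could look like, the block below computes a symmetric InfoNCE loss over a batch of paired latents. The symmetric form, the temperature value, the latent size, and the function names are all assumptions made for illustration; the page does not give the paper's exact loss.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(z_a, z_b, temperature=0.1):
    """z_a[i] and z_b[i] are latents of the same contact seen by two sensors.

    The temperature and the symmetric averaging are illustrative assumptions.
    """
    z_a, z_b = l2_normalize(z_a), l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature          # (N, N) similarity matrix
    idx = np.arange(len(z_a))                   # positives sit on the diagonal
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

rng = np.random.default_rng(0)
shared = rng.normal(size=(32, 16))              # hypothetical latent batch
aligned_loss = symmetric_info_nce(shared, shared + 0.01 * rng.normal(size=shared.shape))
random_loss = symmetric_info_nce(shared, rng.normal(size=shared.shape))
print(aligned_loss, random_loss)
```

The loss is low when each latent is closest to its own pair and high when the pairing carries no information, which is the sense in which paired contacts act as an alignment signal.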

If this is right

Policies trained on data from one tactile sensor can be deployed on different sensors without retraining.
Average success across pick-and-place, plug insertion, board wiping, and object reorientation rises from 27.5% (vision-only policy) to 45.9%.
The latent space supports both alignment across sensors and preservation of object contact details.
Manipulation policies become less dependent on specific tactile hardware.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Hardware choices for robots could become more flexible if sensors can be swapped without policy changes.
Collecting paired contact data might be a scalable way to align new sensor types in the future.
The approach could extend to other sensing modalities beyond tactile if similar pairing is possible.

Load-bearing premise

Paired contact interactions supply a sufficient natural alignment signal to induce a consistent latent space across all three transduction modalities while preserving object-level contact information.

What would settle it

Zero-shot transfer experiments yielding success rates no higher than the 27.5% vision-only baseline on the manipulation tasks would indicate the shared latent does not enable effective cross-sensor policy use.
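For scale, the gap between the two reported averages can be expressed both in percentage points and relative to the baseline. The snippet below uses only the two numbers quoted in the abstract; the variable names are illustrative.

```python
vision_only = 27.5   # average success rate (%) of the vision-only policy
tactx = 45.9         # average success rate (%) reported with the shared latent

absolute_gain = tactx - vision_only            # in percentage points
relative_gain = absolute_gain / vision_only    # as a fraction of the baseline

print(f"{absolute_gain:.1f} percentage points, {relative_gain:.1%} relative")
```

A result at or below the baseline would drive both quantities to zero or negative, which is the falsifying outcome described above.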

Figures

Figures reproduced from arXiv: 2606.31236 by Carmelo Sferrazza, Junsung Park, Sachin Bhadang, Sha Yi, Xiaolong Wang.

**Figure 1.** TactX learns a shared latent representation that aligns heterogeneous tactile sensors and enables zero-shot transfer of tactile-conditioned policies.

**Figure 2.** TactX trains on paired contacts from two sensors at a time. Paired observations are encoded into a shared latent space, aligned with InfoNCE, and decoded through self- and cross-reconstruction. Other pairs are trained analogously, yielding a single latent space shared by all three sensors. … (cross-reconstruction, e.g. g_j(z_i) → x_j): the latent from one finger must reconstruct the other finger’s ground-trut…

**Figure 3.** Transitive cross-sensor alignment. Cosine similarity along the Daimon→eFlesh→FlexiTac path measures global latent alignment, with dashed lines indicating the mean for each method. We first evaluate whether TactX aligns tactile observations from different sensing modalities into a shared latent space. We compare TactX with three objective variants: a reconstruction-only model (using Eq. (2)), a contrastive…

**Figure 4.** Sensor invariance and semantic preservation in the shared latent space. Sensor-prediction accuracy measures whether sensor identity remains recoverable from frozen latents, where lower values closer to the 33.3% chance level indicate stronger sensor invariance. Object-classification accuracy evaluates whether object-level information is preserved, where “Self” denotes training and testing on the same sens…

**Figure 5.** Self- and cross-reconstruction from the shared latent. We visualize representative validation contacts from sphere, plane, and circle indentors. For each sensor, the first column is the ground-truth observation, the diagonal entries are self-reconstructions, and the off-diagonal entries are cross-reconstructions decoded from the nearest latent representations of the other sensors in the validation set.

**Figure 6.** Downstream manipulation tasks. We evaluate zero-shot tactile policy transfer on four contact-rich tasks: plug insertion, board wiping, pick-and-place, and object reorientation. … transfer is its sensitivity to the contact threshold: we use three separate sensor-specific thresholds that are held fixed across all tasks (Appendix D.3), and this threshold mismatch between tasks leads to higher variance and incon…

**Figure 7.** The three tactile sensors used in TactX, each spanning a different transduction modality. All three are visually matched (black TPU/tape/elastomer) to remove cosmetic shortcuts and have roughly commensurate active sensing areas. Mounting and pairing. Two sensors are mounted on opposing fingers of a Franka parallel-jaw gripper; the third is swapped in for separate runs. We cover all C(3, 2) × 2 = 6 configurat…

**Figure 8.** The 10 3D-printed pretraining objects used for paired data collection, spanning point, edge, and area contact geometries.
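The Figure 2 caption describes decoding through self- and cross-reconstruction, where the latent from one sensor must reconstruct the other sensor's observation. The block below is a rough sketch of those two terms for one sensor pair. Linear encoders and decoders, the input sizes, the latent size, and the mean-squared-error objective are assumptions chosen to keep the example short; they are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical flattened input sizes per sensor and a hypothetical latent size.
dims = {"resistive": 12, "magnetic": 15, "vision": 20}
latent_dim = 8

# One linear encoder and one linear decoder per sensor (stand-ins for real networks).
enc = {s: rng.normal(scale=0.1, size=(d, latent_dim)) for s, d in dims.items()}
dec = {s: rng.normal(scale=0.1, size=(latent_dim, d)) for s, d in dims.items()}

def encode(sensor, x):
    return x @ enc[sensor]

def decode(sensor, z):
    return z @ dec[sensor]

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def pair_reconstruction_loss(s_i, x_i, s_j, x_j):
    """Self and cross terms for a paired batch from sensors s_i and s_j."""
    z_i, z_j = encode(s_i, x_i), encode(s_j, x_j)
    # Self-reconstruction: each latent decodes back to its own sensor.
    self_loss = mse(decode(s_i, z_i), x_i) + mse(decode(s_j, z_j), x_j)
    # Cross-reconstruction: g_j(z_i) -> x_j and g_i(z_j) -> x_i.
    cross_loss = mse(decode(s_j, z_i), x_j) + mse(decode(s_i, z_j), x_i)
    return self_loss, cross_loss

x_res = rng.normal(size=(4, dims["resistive"]))   # paired batch, same contacts
x_mag = rng.normal(size=(4, dims["magnetic"]))
self_loss, cross_loss = pair_reconstruction_loss("resistive", x_res, "magnetic", x_mag)
print(self_loss, cross_loss)
```

Repeating the same computation over every sensor pair is one way to read the caption's statement that other pairs are trained analogously.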

Original abstract

Tactile sensors provide critical information for contact-rich manipulation, yet tactile representations and policies remain tightly coupled to each specific sensor, limiting transferability across robots and hardware platforms. We propose TactX, a framework for learning a transferable tactile representation across sensors spanning three fundamentally different transduction modalities: resistive, magnetic, and vision-based. TactX maps heterogeneous tactile observations into a shared latent space through modality-specific encoders trained on paired contact data. Such paired interactions provide a natural alignment signal across modalities, and the encoders are jointly trained across all sensor pairs, inducing a consistent latent space for all sensor types. Our experiments show that TactX aligns tactile representations across sensors while preserving object-level contact information, as evidenced by sensor-identity prediction and object classification in the learned latent space. We evaluate TactX on four contact-rich manipulation tasks: pick-and-place, plug insertion, board wiping, and object reorientation, and show that policies trained with one sensor transfer zero-shot to physically distinct sensors through the shared latent. This improves the average success rate from 27.5% for vision-only policy to 45.9%, providing a step toward sensor-agnostic tactile manipulation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TactX shows a workable joint-training route to shared tactile latents across three sensor types and reports zero-shot policy transfer, but the alignment mechanism stays thin on paper.


The main takeaway is that TactX jointly trains modality-specific encoders for resistive, magnetic, and vision-based tactile sensors on paired contact data to produce one latent space, then shows policies trained on one sensor can run zero-shot on the others, moving average success from 27.5% (vision-only) to 45.9% across pick-and-place, plug insertion, wiping, and reorientation.

What is actually new is the three-way cross-modality setup and the joint training across all sensor pairs rather than pairwise only. The paper also supplies concrete checks that the latent keeps object identity while hiding which sensor produced the input. Those classification results are a direct, falsifiable outcome and give some support that the alignment is doing useful work.
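The sensor-identity check can be pictured as a probe trained on frozen latents: if the probe cannot beat the 33.3% chance level for three sensors, the latent hides which sensor produced the input. The sketch below uses a nearest-centroid probe on synthetic latents. The probe type, the synthetic data, and every name in it are assumptions for illustration; the page does not say which classifier the paper uses.

```python
import numpy as np

def nearest_centroid_accuracy(train_z, train_y, test_z, test_y):
    # Fit one centroid per class, then label test points by the closest centroid.
    classes = np.unique(train_y)
    centroids = np.stack([train_z[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_z[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == test_y).mean())

def make_latents(rng, n_per_sensor, dim, sensor_offset):
    # Synthetic latents for three sensors; a nonzero offset leaks sensor identity.
    zs, ys = [], []
    for sensor_id in range(3):
        z = rng.normal(size=(n_per_sensor, dim))
        z[:, sensor_id] += sensor_offset
        zs.append(z)
        ys.append(np.full(n_per_sensor, sensor_id))
    z, y = np.concatenate(zs), np.concatenate(ys)
    perm = rng.permutation(len(y))
    return z[perm], y[perm]

def probe(rng, sensor_offset, n_per_sensor=300, dim=8):
    z, y = make_latents(rng, n_per_sensor, dim, sensor_offset)
    half = len(y) // 2                       # first half trains, second half tests
    return nearest_centroid_accuracy(z[:half], y[:half], z[half:], y[half:])

rng = np.random.default_rng(0)
invariant_acc = probe(rng, sensor_offset=0.0)   # sensor identity not encoded
leaky_acc = probe(rng, sensor_offset=5.0)       # sensor identity strongly encoded
print(invariant_acc, leaky_acc)
```

The same probe applied to object labels instead of sensor labels would give the complementary object-classification check.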

The soft spot is the alignment claim itself. The only signal described is the paired contacts; no cycle-consistency loss, invariance penalty, or distribution-matching term is mentioned that would force the latents to stay interchangeable outside the paired set. Resistive, magnetic, and vision sensors differ in spatial support and noise, so gaps in the paired data could leave policy inputs misaligned. The abstract gives no dataset sizes, training details, or ablations, which leaves the 45.9% number difficult to interpret.

This paper is for robotics groups already working on contact-rich manipulation and hardware transfer. A reader who needs to move policies between tactile sensors will find the task suite and the zero-shot numbers worth looking at.

I would send it to peer review. The problem is practical, the experimental claims are on real tasks, and the full methods can be checked once the details are supplied.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes TactX, a framework that learns a shared latent space for tactile representations across three transduction modalities (resistive, magnetic, vision-based) by training modality-specific encoders jointly on paired contact interactions. It claims this alignment preserves object-level contact information (verified via sensor-identity and object classification) and enables zero-shot policy transfer across sensors on four contact-rich tasks, raising average success from 27.5% (vision-only baseline) to 45.9%.

Significance. If the zero-shot transfer result holds under rigorous controls, the work would address a practical barrier in contact-rich manipulation by decoupling policies from specific sensor hardware, potentially enabling more reusable tactile skills across robot platforms.

major comments (2)

[Method description (alignment signal)] The central zero-shot transfer claim rests on the assumption that joint training on paired contact tuples alone induces functionally interchangeable latents across modalities with non-overlapping spatial support, noise spectra, and dynamic range. No cycle-consistency, invariance, or distribution-matching term is described that would enforce this equivalence outside the paired set; without such a mechanism the policy (trained only on one sensor's latents) can encounter out-of-distribution inputs from a new sensor.
[Experiments (success-rate results)] The reported improvement from 27.5% to 45.9% is presented without accompanying details on trial counts, statistical tests, variance across seeds, or ablations that isolate the contribution of the shared latent versus other factors (e.g., sensor-specific fine-tuning or task-specific data). These omissions are load-bearing for the transfer claim.
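To illustrate why trial counts matter for the second comment, the sketch below computes a Wilson score interval for a binomial success rate. The counts in the example (9 successes in 20 trials) are hypothetical and are not taken from the paper; the function name is also illustrative.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Hypothetical example: 9 successes out of 20 trials (a 45% success rate).
low, high = wilson_interval(successes=9, trials=20)
print(f"95% interval: [{low:.3f}, {high:.3f}]")
```

With only a few dozen trials per condition, an interval of this kind is wide enough that a difference between two averages is hard to judge without the per-task counts.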

minor comments (2)

[Abstract / Method] The abstract and method sections would benefit from an explicit statement of the loss function(s) used for joint encoder training and the precise definition of 'paired contact tuples' (e.g., how temporal and spatial alignment is performed across modalities).
[Experiments] Figure captions and table headers should clarify whether the reported success rates are means over multiple runs and whether the vision-only baseline uses the same policy architecture as the TactX variants.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We respond point-by-point to the major comments below.


Referee: [Method description (alignment signal)] The central zero-shot transfer claim rests on the assumption that joint training on paired contact tuples alone induces functionally interchangeable latents across modalities with non-overlapping spatial support, noise spectra, and dynamic range. No cycle-consistency, invariance, or distribution-matching term is described that would enforce this equivalence outside the paired set; without such a mechanism the policy (trained only on one sensor's latents) can encounter out-of-distribution inputs from a new sensor.

Authors: Paired contact tuples correspond to identical physical interactions observed by different sensors, supplying direct supervision that maps each modality's reading of the same event into a common latent. Joint optimization of all modality-specific encoders across every sensor pair further constrains the latent space to be consistent, as any deviation would increase the joint loss. The manuscript already demonstrates that this produces latents preserving object-level contact information via the reported sensor-identity and object-classification probes. The zero-shot policy transfer experiments on four tasks provide empirical evidence that the resulting representations are interchangeable in practice; no explicit cycle-consistency term is required because the multi-pair paired supervision itself enforces the necessary alignment. revision: no
Referee: [Experiments (success-rate results)] The reported improvement from 27.5% to 45.9% is presented without accompanying details on trial counts, statistical tests, variance across seeds, or ablations that isolate the contribution of the shared latent versus other factors (e.g., sensor-specific fine-tuning or task-specific data). These omissions are load-bearing for the transfer claim.

Authors: We agree that the current manuscript omits these experimental details. In the revision we will report the exact number of trials per task and sensor combination, include standard deviations across random seeds, add appropriate statistical tests, and provide ablations that isolate the contribution of the shared latent from other factors such as task-specific data or sensor-specific fine-tuning. revision: yes

Circularity Check

0 steps flagged

No circularity: alignment induced by external paired data, not internal definitions

full rationale

The provided abstract and context contain no equations, fitted parameters, or self-citations. The shared latent space is induced by joint training of modality-specific encoders on externally collected paired contact tuples, which constitute an independent alignment signal rather than a quantity defined in terms of the latent itself. No load-bearing step reduces to a self-definition, a renamed prediction, or an imported uniqueness theorem. The zero-shot transfer claim is evaluated against external manipulation benchmarks, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, training losses, or architectural details, so free parameters, axioms, and invented entities cannot be audited; the central alignment claim rests on the unstated premise that paired data is available and sufficient.




[1] [1]

Calandra, A

R. Calandra, A. Owens, D. Jayaraman, J. Lin, W. Yuan, J. Malik, E. H. Adelson, and S. Levine. More than a feeling: Learning to grasp and regrasp using vision and touch.IEEE Robotics and Automation Letters, 3(4):3300–3307, Oct. 2018. ISSN 2377-3774. doi:10.1109/lra.2018. 2852779. URLhttp://dx.doi.org/10.1109/LRA.2018.2852779

work page doi:10.1109/lra.2018 2018

[2] [2]

Z.-H. Yin, B. Huang, Y . Qin, Q. Chen, and X. Wang. Rotating without seeing: Towards in-hand dexterity through touch, 2023. URLhttps://arxiv.org/abs/2303.10880

work page arXiv 2023

[3] [3]

Palenicek, T

D. Palenicek, T. Gruner, T. Schneider, A. B¨ohm, J. Lenz, I. Pfenning, E. Kr¨amer, and J. Peters. Learning tactile insertion in the real world, 2024. URLhttps://arxiv.org/abs/2405. 00383

2024

[4] [4]

Oller, D

M. Oller, D. Berenson, and N. Fazeli. Tactile-driven non-prehensile object manipulation via extrinsic contact mode control, 2024. URLhttps://arxiv.org/abs/2405.18214

work page arXiv 2024

[5] [5]

F. Yang, C. Feng, Z. Chen, H. Park, D. Wang, Y . Dou, Z. Zeng, X. Chen, R. Gangopadhyay, A. Owens, and A. Wong. Binding touch to everything: Learning unified multimodal tactile representations, 2024. URLhttps://arxiv.org/abs/2401.18084

work page arXiv 2024

[6]

J. Zhao, Y. Ma, L. Wang, and E. H. Adelson. Transferable tactile transformers for representation learning across diverse sensors and tasks, 2024. URL https://arxiv.org/abs/2406.13640

[7]

R. Feng, J. Hu, W. Xia, T. Gao, A. Shen, Y. Sun, B. Fang, and D. Hu. Anytouch: Learning unified static-dynamic representation across multiple visuo-tactile sensors, 2025. URL https://arxiv.org/abs/2502.12191

[8]

C. Higuera, A. Sharma, C. K. Bodduluri, T. Fan, P. Lancaster, M. Kalakrishnan, M. Kaess, B. Boots, M. Lambeta, T. Wu, and M. Mukadam. Sparsh: Self-supervised touch representations for vision-based tactile sensing, 2024. URL https://arxiv.org/abs/2410.24090

[9]

S. Rodriguez, Y. Dou, M. Oller, A. Owens, and N. Fazeli. Cross-sensor touch generation. URL https://arxiv.org/abs/2510.09817

[11]

W. Yuan, S. Dong, and E. H. Adelson. Gelsight: High-resolution robot tactile sensors for estimating geometry and force. Sensors (Basel, Switzerland), 17, 2017. URL https://api.semanticscholar.org/CorpusID:3474913

[12]

M. Lambeta, P.-W. Chou, S. Tian, B. Yang, B. Maloon, V. R. Most, D. Stroud, R. Santos, A. Byagowi, G. Kammerer, D. Jayaraman, and R. Calandra. Digit: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation. IEEE Robotics and Automation Letters, 5(3):3838–3845, 2020. ISSN 2377-3774. doi:10.1109/lra. 20...

[13]

B. Ward-Cherrier, N. Pestell, L. Cramphorn, B. Winstone, M. Giannaccini, J. Rossiter, and N. Lepora. The tactip family: Soft optical tactile sensors with 3d-printed biomimetic morphologies. Soft Robotics, 5, 01 2018. doi:10.1089/soro.2017.0052

[14]

C. Lin, H. Zhang, J. Xu, L. Wu, and H. Xu. 9dtact: A compact vision-based tactile sensor for accurate 3d shape reconstruction and generalizable 6d force estimation, 2023. URL https://arxiv.org/abs/2308.14277

[15]

T. Tomo, A. Schmitz, W. Wong, H. Kristanto, S. Somlor, J. Hwang, L. Jamone, and S. Sugano. Covering a robot fingertip with uskin: A soft electronic skin with distributed 3-axis force sensitive elements for robot hands. IEEE Robotics and Automation Letters, PP:1–1, 08 2017. doi:10.1109/LRA.2017.2734965

[16]

R. Bhirangi, T. Hellebrekers, C. Majidi, and A. Gupta. Reskin: versatile, replaceable, lasting tactile skins, 2022. URL https://arxiv.org/abs/2111.00071

[17]

V. Pattabiraman, Z. Huang, D. Panozzo, D. Zorin, L. Pinto, and R. Bhirangi. eflesh: Highly customizable magnetic touch sensing using cut-cell microstructures, 2025. URL https://arxiv.org/abs/2506.09994

[18]

B. Huang and Y. Li. Flexitac: A low-cost, open-source, scalable tactile sensing solution for robotic systems, 2026. URL https://arxiv.org/abs/2604.28156

[19]

H. Khamis, R. Albero, M. Salerno, A. Shah Idil, and A. Loizou. Papillarray: An incipient slip sensor for dexterous robotic or prosthetic manipulation – design and prototype validation. Sensors and Actuators A: Physical, 270, 12 2017. doi:10.1016/j.sna.2017.12.058

[20]

A. van den Oord, Y. Li, and O. Vinyals. Representation learning with contrastive predictive coding, 2019. URL https://arxiv.org/abs/1807.03748

[21]

T. Z. Zhao, V. Kumar, S. Levine, and C. Finn. Learning fine-grained bimanual manipulation with low-cost hardware, 2023. URL https://arxiv.org/abs/2304.13705

[22]

H. Qi, B. Yi, S. Suresh, M. Lambeta, Y. Ma, R. Calandra, and J. Malik. General in-hand object rotation with vision and touch, 2023. URL https://arxiv.org/abs/2309.09979

[23]

T. Lin, Y. Zhang, Q. Li, H. Qi, B. Yi, S. Levine, and J. Malik. Learning visuotactile skills with two multifingered hands, 2024. URL https://arxiv.org/abs/2404.16823

[24]

Z.-H. Yin, C. Wang, L. Pineda, F. Hogan, K. Bodduluri, A. Sharma, P. Lancaster, I. Prasad, M. Kalakrishnan, J. Malik, M. Lambeta, T. Wu, P. Abbeel, and M. Mukadam. Dexteritygen: Foundation controller for unprecedented dexterity, 2025. URL https://arxiv.org/abs/2502.04307

[25]

X. Liu, H. Wang, and L. Yi. Dexndm: Closing the reality gap for dexterous in-hand rotation via joint-wise neural dynamics model, 2025. URL https://arxiv.org/abs/2510.08556

[26]

S. Jiang, S. Zhao, Y. Fan, and P. Yin. Gelfusion: Enhancing robotic manipulation under visual constraints via visuotactile fusion, 2025. URL https://arxiv.org/abs/2505.07455

[27]

Y. She, S. Wang, S. Dong, N. Sunil, A. Rodriguez, and E. Adelson. Cable manipulation with a tactile-reactive gripper, 2020. URL https://arxiv.org/abs/1910.02860

[28]

F. R. Hogan, M. Bauza, O. Canal, E. Donlon, and A. Rodriguez. Tactile regrasp: Grasp adjustments via simulated tactile transformations, 2018. URL https://arxiv.org/abs/1803.01940

[29]

E. Donlon, S. Dong, M. Liu, J. Li, E. Adelson, and A. Rodriguez. Gelslim: A high-resolution, compact, robust, and calibrated tactile-sensing finger, 2018. URL https://arxiv.org/abs/1803.00628

[30]

Daimon Robotics. DM-Tac W: High-resolution vision-based tactile sensor. https://www.dmrobot.com/en/, 2025. Accessed: 2026-05-28

[31]

A. Alspach, K. Hashimoto, N. Kuppuswamy, and R. Tedrake. Soft-bubble: A highly compliant dense geometry tactile sensor for robot manipulation, 2019. URL https://arxiv.org/abs/1904.02252

[32]

W. K. Do and M. K. III. Densetact: Optical tactile sensor for dense shape reconstruction, 2022. URL https://arxiv.org/abs/2201.01367

[33]

R. Bhirangi, V. Pattabiraman, E. Erciyes, Y. Cao, T. Hellebrekers, and L. Pinto. Anyskin: Plug-and-play skin sensing for robotic touch, 2024. URL https://arxiv.org/abs/2409.08276

[34]

N. Wettels, V. Santos, R. Johansson, and G. Loeb. Biomimetic tactile sensor array. Advanced Robotics, 22:829–849, 08 2008. doi:10.1163/156855308X314533

[35]

M. S. Li and H. S. Stuart. Acoustac: Tactile sensing with acoustic resonance for electronics-free soft skin, 2023. URL https://arxiv.org/abs/2307.09730

[36]

K. Zhang, D.-G. Kim, E. T. Chang, H.-H. Liang, Z. He, K. Lampo, P. Wu, I. Kymissis, and M. Ciocarlie. Vibecheck: Using active acoustic tactile sensing for contact-rich manipulation. URL https://arxiv.org/abs/2504.15535

[38]

M. A. Lee, Y. Zhu, K. Srinivasan, P. Shah, S. Savarese, L. Fei-Fei, A. Garg, and J. Bohg. Making sense of vision and touch: Self-supervised learning of multimodal representations for contact-rich tasks, 2019. URL https://arxiv.org/abs/1810.10191

[39]

A. Sharma, C. Higuera, C. K. Bodduluri, Z. Liu, T. Fan, T. Hellebrekers, M. Lambeta, B. Boots, M. Kaess, T. Wu, F. R. Hogan, and M. Mukadam. Self-supervised perception for tactile skin covered dexterous hands, 2025. URL https://arxiv.org/abs/2505.11420

[40]

C. Higuera, A. Sharma, T. Fan, C. K. Bodduluri, B. Boots, M. Kaess, M. Lambeta, T. Wu, Z. Liu, F. R. Hogan, and M. Mukadam. Tactile beyond pixels: Multisensory touch representations for robot manipulation, 2025. URL https://arxiv.org/abs/2506.14754

[41]

Z. Xu, R. Uppuluri, X. Zhang, C. Fitch, P. G. Crandall, W. Shou, D. Wang, and Y. She. Unit: Data efficient tactile representation with generalization to unseen objects, 2025. URL https://arxiv.org/abs/2408.06481

[42]

A. Zorin, Z. Si, M. Park, J. Park, A. Buynitsky, S. Bhadang, T. Park, S. J. Yoon, Y.-L. Park, O. Kroemer, Z. Temel, M. T. Tolley, S. Yi, and X. Wang. Taco: Benchmarking tactile sensors for object manipulation, 2026. URL https://arxiv.org/abs/2605.21976

[43]

G. Jiang, Y. Liang, J. Ye, J.-Y. Huang, C. Jing, R. Duan, P. Abbeel, X. Wang, and X. Zou. Cross-hand latent representation for vision-language-action models, 2026. URL https://arxiv.org/abs/2603.10158

[44]

E. Bauer, E. Nava, and R. K. Katzschmann. Latent action diffusion for cross-embodiment manipulation, 2026. URL https://arxiv.org/abs/2506.14608

[45]

T. Wang, D. Bhatt, X. Wang, and N. Atanasov. Cross-embodiment robot manipulation skill transfer using latent space alignment, 2024. URL https://arxiv.org/abs/2406.01968

[46]

A. Dastider, H. Fang, and M. Lin. Cross-embodiment robotic manipulation synthesis via guided demonstrations through cyclevae and human behavior transformer, 2025. URL https://arxiv.org/abs/2503.08622

[47]

Q. Bu, Y. Yang, J. Cai, S. Gao, G. Ren, M. Yao, P. Luo, and H. Li. Univla: Learning to act anywhere with task-centric latent actions, 2025. URL https://arxiv.org/abs/2505.06111

[48]

H. Gupta, Y. Mo, S. Jin, and W. Yuan. Sensor-invariant tactile representation, 2025. URL https://arxiv.org/abs/2502.19638

[49]

R. Feng, Y. Zhou, S. Mei, D. Zhou, P. Wang, S. Cui, B. Fang, G. Yao, and D. Hu. Anytouch 2: General optical tactile representation learning for dynamic tactile perception, 2026. URL https://arxiv.org/abs/2602.09617

[50]

S. Rodriguez, Y. Dou, W. van den Bogert, M. Oller, K. So, A. Owens, and N. Fazeli. Contrastive touch-to-touch pretraining, 2024. URL https://arxiv.org/abs/2410.11834

[51]

Z. Chen, F. Ni, K. Luo, Z. Wu, X. Zhang, E. Spyrakos-Papastavridis, L. Jamone, N. F. Lepora, J. Deng, and S. Luo. Uniforce: A unified latent force model for robot manipulation with diverse tactile sensors, 2026. URL https://arxiv.org/abs/2602.01153

[52]

Z. Chen, N. Ou, X. Zhang, Z. Wu, Y. Zhao, Y. Wang, E. S. Papastavridis, N. Lepora, L. Jamone, J. Deng, and S. Luo. Training tactile sensors to learn force sensing from each other, 2025. URL https://arxiv.org/abs/2503.01058

[53]

J. Hou, X. Zhou, Q. Yang, and A. J. Spiers. Unitac-nv: A unified tactile representation for non-vision-based tactile sensors, 2025. URL https://arxiv.org/abs/2506.19699

[54]

Z. Chen, N. Ou, X. Zhang, and S. Luo. Transforce: Transferable force prediction for vision-based tactile sensors with sequential image translation, 2025. URL https://arxiv.org/abs/2409.09870

[55]

Y. Wi, J. Yin, E. Xiang, A. Sharma, J. Malik, M. Mukadam, N. Fazeli, and T. Hellebrekers. Tactalign: Human-to-robot policy transfer via tactile alignment, 2026. URL https://arxiv.org/abs/2602.13579
