{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:L3RKWYWFJQ4BNXY42UHJEXOEWL","short_pith_number":"pith:L3RKWYWF","schema_version":"1.0","canonical_sha256":"5ee2ab62c54c3816df1cd50e925dc4b2f78c24392e01a1ad5b9946f95d355144","source":{"kind":"arxiv","id":"2604.07993","version":2},"attestation_state":"computed","paper":{"title":"HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation","license":"http://creativecommons.org/licenses/by/4.0/","headline":"HEX achieves state-of-the-art whole-body manipulation on humanoid robots by aligning states across embodiments and modeling coordination with a mixture-of-experts predictor.","cross_cats":[],"primary_cat":"cs.RO","authors_text":"Badong Chen, Chengkai Hou, Fei Liao, Jian Tang, Jiawei Wang, Kun Wu, Langzhe Gu, Lei Sun, Meng Li, Shanghang Zhang, Shuanghao Bai, Wanqi Zhou, Xinhua Wang, Xinyuan Lv, Zhengping Che, Zhiyuan Xu, Ziluo Ding","submitted_at":"2026-04-09T09:01:43Z","abstract_excerpt":"Humans achieve complex manipulation through coordinated whole-body control, whereas most Vision-Language-Action (VLA) models treat robot body parts largely independently, making high-DoF humanoid control challenging and often unstable. We present HEX, a state-centric framework for coordinated manipulation on full-sized bipedal humanoid robots. HEX introduces a humanoid-aligned universal state representation for scalable learning across heterogeneous embodiments, and incorporates a Mixture-of-Experts Unified Proprioceptive Predictor to model whole-body coordination and temporal motion dynamics "},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2604.07993","kind":"arxiv","version":2},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.RO","submitted_at":"2026-04-09T09:01:43Z","cross_cats_sorted":[],"title_canon_sha256":"303deb626acdf6dc44d0b5298cf9fd98db8e61898e8e21073d7204bfb6ab1381","abstract_canon_sha256":"b5712e9fd01eaf7464aaaf029a9f92d2507994a1189a21704c422048ea990fd0"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-20T01:05:12.764062Z","signature_b64":"uQnJpkC4fAg3WWcREkcqyNZfLYG0mmG6lCZsOtO1qGb7dMG3b337OyfiZU9R2iuVcFpv8cr0xlhZ62HXXQNjBw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"5ee2ab62c54c3816df1cd50e925dc4b2f78c24392e01a1ad5b9946f95d355144","last_reissued_at":"2026-05-20T01:05:12.763135Z","signature_status":"signed_v1","first_computed_at":"2026-05-20T01:05:12.763135Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"HEX: Humanoid-Aligned Experts for Cross-Embodiment Whole-Body Manipulation","license":"http://creativecommons.org/licenses/by/4.0/","headline":"HEX achieves state-of-the-art whole-body manipulation on humanoid robots by aligning states across embodiments and modeling coordination with a mixture-of-experts predictor.","cross_cats":[],"primary_cat":"cs.RO","authors_text":"Badong Chen, Chengkai Hou, Fei Liao, Jian Tang, Jiawei Wang, Kun Wu, Langzhe Gu, Lei Sun, Meng Li, Shanghang Zhang, Shuanghao Bai, Wanqi Zhou, Xinhua Wang, Xinyuan Lv, Zhengping Che, Zhiyuan Xu, Ziluo Ding","submitted_at":"2026-04-09T09:01:43Z","abstract_excerpt":"Humans achieve complex manipulation through coordinated whole-body control, whereas most Vision-Language-Action (VLA) models treat robot body parts largely independently, making high-DoF humanoid control challenging and often unstable. We present HEX, a state-centric framework for coordinated manipulation on full-sized bipedal humanoid robots. HEX introduces a humanoid-aligned universal state representation for scalable learning across heterogeneous embodiments, and incorporates a Mixture-of-Experts Unified Proprioceptive Predictor to model whole-body coordination and temporal motion dynamics "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experiments on real-world humanoid manipulation tasks show that HEX achieves state-of-the-art performance in task success rate and generalization, particularly in fast-reaction and long-horizon scenarios.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the humanoid-aligned universal state representation combined with the Mixture-of-Experts Unified Proprioceptive Predictor can reliably capture and generalize whole-body coordination and temporal dynamics across heterogeneous embodiments from large-scale multi-embodiment trajectory data.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"HEX is a new framework with humanoid-aligned state representation, mixture-of-experts proprioceptive predictor, history tokens, and residual-gated fusion that achieves state-of-the-art success and generalization on real humanoid manipulation tasks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"HEX achieves state-of-the-art whole-body manipulation on humanoid robots by aligning states across embodiments and modeling coordination with a mixture-of-experts predictor.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"4e98a8debfe02bc922d9d3921bd2e6c38cfb7e3c89a49530122bfc2fa8e33591"},"source":{"id":"2604.07993","kind":"arxiv","version":2},"verdict":{"id":"7f340550-28e9-47a3-94ae-f6d6bc6613a5","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-10T17:36:47.891396Z","strongest_claim":"Experiments on real-world humanoid manipulation tasks show that HEX achieves state-of-the-art performance in task success rate and generalization, particularly in fast-reaction and long-horizon scenarios.","one_line_summary":"HEX is a new framework with humanoid-aligned state representation, mixture-of-experts proprioceptive predictor, history tokens, and residual-gated fusion that achieves state-of-the-art success and generalization on real humanoid manipulation tasks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the humanoid-aligned universal state representation combined with the Mixture-of-Experts Unified Proprioceptive Predictor can reliably capture and generalize whole-body coordination and temporal dynamics across heterogeneous embodiments from large-scale multi-embodiment trajectory data.","pith_extraction_headline":"HEX achieves state-of-the-art whole-body manipulation on humanoid robots by aligning states across embodiments and modeling coordination with a mixture-of-experts predictor."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.07993/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2604.07993","created_at":"2026-05-20T01:05:12.763259+00:00"},{"alias_kind":"arxiv_version","alias_value":"2604.07993v2","created_at":"2026-05-20T01:05:12.763259+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2604.07993","created_at":"2026-05-20T01:05:12.763259+00:00"},{"alias_kind":"pith_short_12","alias_value":"L3RKWYWFJQ4B","created_at":"2026-05-20T01:05:12.763259+00:00"},{"alias_kind":"pith_short_16","alias_value":"L3RKWYWFJQ4BNXY4","created_at":"2026-05-20T01:05:12.763259+00:00"},{"alias_kind":"pith_short_8","alias_value":"L3RKWYWF","created_at":"2026-05-20T01:05:12.763259+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":3,"internal_anchor_count":3,"sample":[{"citing_arxiv_id":"2605.23733","citing_title":"Any2Any: Efficient Cross-Embodiment Transfer for Humanoid Whole-Body Tracking","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10921","citing_title":"RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark","ref_index":35,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10903","citing_title":"CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models","ref_index":3,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL","json":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL.json","graph_json":"https://pith.science/api/pith-number/L3RKWYWFJQ4BNXY42UHJEXOEWL/graph.json","events_json":"https://pith.science/api/pith-number/L3RKWYWFJQ4BNXY42UHJEXOEWL/events.json","paper":"https://pith.science/paper/L3RKWYWF"},"agent_actions":{"view_html":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL","download_json":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL.json","view_paper":"https://pith.science/paper/L3RKWYWF","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2604.07993&json=true","fetch_graph":"https://pith.science/api/pith-number/L3RKWYWFJQ4BNXY42UHJEXOEWL/graph.json","fetch_events":"https://pith.science/api/pith-number/L3RKWYWFJQ4BNXY42UHJEXOEWL/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL/action/timestamp_anchor","attest_storage":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL/action/storage_attestation","attest_author":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL/action/author_attestation","sign_citation":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL/action/citation_signature","submit_replication":"https://pith.science/pith/L3RKWYWFJQ4BNXY42UHJEXOEWL/action/replication_record"}},"created_at":"2026-05-20T01:05:12.763259+00:00","updated_at":"2026-05-20T01:05:12.763259+00:00"}