{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2025:GEQEYW4TBN73FHY5ZXYHDEHV5P","short_pith_number":"pith:GEQEYW4T","schema_version":"1.0","canonical_sha256":"31204c5b930b7fb29f1dcdf07190f5ebcc8387136e79c82357d6622ee9cb175b","source":{"kind":"arxiv","id":"2509.20904","version":3},"attestation_state":"computed","paper":{"title":"FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets","license":"http://creativecommons.org/licenses/by/4.0/","headline":"","cross_cats":[],"primary_cat":"cs.IR","authors_text":"Chenchi Zhang, Junjun Zheng, Kairui Fu, Kun Kuang, Shengyu Zhang, Shuwen Xiao, Tao Zhang, Xiangheng Kong, Xinming Zhang, Yuliang Yan, Yuning Jiang, Ziyang Wang","submitted_at":"2025-09-25T08:44:22Z","abstract_excerpt":"Semantic identifiers (SIDs) have gained increasing attention in generative retrieval (GR) for recommendation due to their meaningful semantic discriminability. However, current studies in this field primarily (1) offer limited investigation into the construction strategies for better SIDs, and (2) their SID assessment typically relies on costly GR training. To address these challenges, we propose FORGE, a comprehensive benchmark for FOrming semantic identifieRs for Generative rEtrieval. Specifically, FORGE provides a taxonomy of the SID construction process from several perspectives and valida"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2509.20904","kind":"arxiv","version":3},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"cs.IR","submitted_at":"2025-09-25T08:44:22Z","cross_cats_sorted":[],"title_canon_sha256":"b0f8ba49a388d57d07ea8d4bb8c4a0777bd6f3ed2930f37444e543c836b469ef","abstract_canon_sha256":"dc10dccd938d51f5537d4a87938d729c28545fe6e074c51b1f538a83ea4bcc64"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-29T01:04:35.688382Z","signature_b64":"Fz7UBItrPoFgnniF73l0MiUyxKrxJ8OzgBXYiCH9jGlYblBz+ONTR37Aka9eofYOoN+m+3eqJYhcpHb2yO4cBQ==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"31204c5b930b7fb29f1dcdf07190f5ebcc8387136e79c82357d6622ee9cb175b","last_reissued_at":"2026-05-29T01:04:35.687741Z","signature_status":"signed_v1","first_computed_at":"2026-05-29T01:04:35.687741Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"FORGE: Forming Semantic Identifiers for Generative Retrieval in Industrial Datasets","license":"http://creativecommons.org/licenses/by/4.0/","headline":"","cross_cats":[],"primary_cat":"cs.IR","authors_text":"Chenchi Zhang, Junjun Zheng, Kairui Fu, Kun Kuang, Shengyu Zhang, Shuwen Xiao, Tao Zhang, Xiangheng Kong, Xinming Zhang, Yuliang Yan, Yuning Jiang, Ziyang Wang","submitted_at":"2025-09-25T08:44:22Z","abstract_excerpt":"Semantic identifiers (SIDs) have gained increasing attention in generative retrieval (GR) for recommendation due to their meaningful semantic discriminability. However, current studies in this field primarily (1) offer limited investigation into the construction strategies for better SIDs, and (2) their SID assessment typically relies on costly GR training. To address these challenges, we propose FORGE, a comprehensive benchmark for FOrming semantic identifieRs for Generative rEtrieval. Specifically, FORGE provides a taxonomy of the SID construction process from several perspectives and valida"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"2509.20904","kind":"arxiv","version":3},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2509.20904/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2509.20904","created_at":"2026-05-29T01:04:35.687815+00:00"},{"alias_kind":"arxiv_version","alias_value":"2509.20904v3","created_at":"2026-05-29T01:04:35.687815+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2509.20904","created_at":"2026-05-29T01:04:35.687815+00:00"},{"alias_kind":"pith_short_12","alias_value":"GEQEYW4TBN73","created_at":"2026-05-29T01:04:35.687815+00:00"},{"alias_kind":"pith_short_16","alias_value":"GEQEYW4TBN73FHY5","created_at":"2026-05-29T01:04:35.687815+00:00"},{"alias_kind":"pith_short_8","alias_value":"GEQEYW4T","created_at":"2026-05-29T01:04:35.687815+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":7,"internal_anchor_count":7,"sample":[{"citing_arxiv_id":"2602.23964","citing_title":"RAD-DPO: Robust Adaptive Denoising Direct Preference Optimization for Generative Retrieval in E-commerce","ref_index":6,"is_internal_anchor":true},{"citing_arxiv_id":"2605.14434","citing_title":"Efficient Generative Retrieval for E-commerce Search with Semantic Cluster IDs and Expert-Guided RL","ref_index":6,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04976","citing_title":"Tencent Advertising Algorithm Challenge 2025: All-Modality Generative Recommendation","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2604.12234","citing_title":"UniRec: Bridging the Expressive Gap between Generative and Discriminative Recommendation via Chain-of-Attribute","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08933","citing_title":"IAT: Instance-As-Token Compression for Historical User Sequence Modeling in Industrial Recommender Systems","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2604.05329","citing_title":"Semantic Trimming and Auxiliary Multi-step Prediction for Generative Recommendation","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25291","citing_title":"From Local Indices to Global Identifiers: Generative Reranking for Recommender Systems via Global Action Space","ref_index":11,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P","json":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P.json","graph_json":"https://pith.science/api/pith-number/GEQEYW4TBN73FHY5ZXYHDEHV5P/graph.json","events_json":"https://pith.science/api/pith-number/GEQEYW4TBN73FHY5ZXYHDEHV5P/events.json","paper":"https://pith.science/paper/GEQEYW4T"},"agent_actions":{"view_html":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P","download_json":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P.json","view_paper":"https://pith.science/paper/GEQEYW4T","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2509.20904&json=true","fetch_graph":"https://pith.science/api/pith-number/GEQEYW4TBN73FHY5ZXYHDEHV5P/graph.json","fetch_events":"https://pith.science/api/pith-number/GEQEYW4TBN73FHY5ZXYHDEHV5P/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P/action/timestamp_anchor","attest_storage":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P/action/storage_attestation","attest_author":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P/action/author_attestation","sign_citation":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P/action/citation_signature","submit_replication":"https://pith.science/pith/GEQEYW4TBN73FHY5ZXYHDEHV5P/action/replication_record"}},"created_at":"2026-05-29T01:04:35.687815+00:00","updated_at":"2026-05-29T01:04:35.687815+00:00"}