{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2011:ENA5243DAZCMTGWJHEWIAF3GJV","short_pith_number":"pith:ENA5243D","schema_version":"1.0","canonical_sha256":"2341dd73630644c99ac9392c8017664d61c6d9d4ca2c8265205a6655407d534b","source":{"kind":"arxiv","id":"1106.6104","version":3},"attestation_state":"computed","paper":{"title":"Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["cs.LG","cs.SY","math.PR","math.ST","stat.TH"],"primary_cat":"math.OC","authors_text":"Keqin Liu, Qing Zhao, Sattar Vakili","submitted_at":"2011-06-30T02:12:32Z","abstract_excerpt":"In the Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward models. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. An approach based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) is developed for constructing sequential arm selection policies. It is shown that for all light-tailed reward distributions, DSEE achieves the optimal logarithmic order of the regret, where regret is defined as the total expected reward loss against the ideal case with known reward models. For"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"1106.6104","kind":"arxiv","version":3},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"math.OC","submitted_at":"2011-06-30T02:12:32Z","cross_cats_sorted":["cs.LG","cs.SY","math.PR","math.ST","stat.TH"],"title_canon_sha256":"28ffda2b382a01c80dc054537989d9377c2e60ef39c5cae15596293052d72d28","abstract_canon_sha256":"77a9655253726d305e7870980380be0be153acdca828e93c39637e0a0a41d027"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-18T03:31:31.031025Z","signature_b64":"B9hv/hY/uGDSu8mi5D2EZxI7glEDHhjPUdZuGoEK57cBtqI0oloPzYVLH4PxzGkWQKceztMIsmRZd4EAU084Dw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"2341dd73630644c99ac9392c8017664d61c6d9d4ca2c8265205a6655407d534b","last_reissued_at":"2026-05-18T03:31:31.030259Z","signature_status":"signed_v1","first_computed_at":"2026-05-18T03:31:31.030259Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"Deterministic Sequencing of Exploration and Exploitation for Multi-Armed Bandit Problems","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"","cross_cats":["cs.LG","cs.SY","math.PR","math.ST","stat.TH"],"primary_cat":"math.OC","authors_text":"Keqin Liu, Qing Zhao, Sattar Vakili","submitted_at":"2011-06-30T02:12:32Z","abstract_excerpt":"In the Multi-Armed Bandit (MAB) problem, there is a given set of arms with unknown reward models. At each time, a player selects one arm to play, aiming to maximize the total expected reward over a horizon of length T. An approach based on a Deterministic Sequencing of Exploration and Exploitation (DSEE) is developed for constructing sequential arm selection policies. It is shown that for all light-tailed reward distributions, DSEE achieves the optimal logarithmic order of the regret, where regret is defined as the total expected reward loss against the ideal case with known reward models. For"},"claims":{"count":0,"items":[],"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"source":{"id":"1106.6104","kind":"arxiv","version":3},"verdict":{"id":null,"model_set":{},"created_at":null,"strongest_claim":"","one_line_summary":"","pipeline_version":null,"weakest_assumption":"","pith_extraction_headline":""},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"1106.6104","created_at":"2026-05-18T03:31:31.030400+00:00"},{"alias_kind":"arxiv_version","alias_value":"1106.6104v3","created_at":"2026-05-18T03:31:31.030400+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.1106.6104","created_at":"2026-05-18T03:31:31.030400+00:00"},{"alias_kind":"pith_short_12","alias_value":"ENA5243DAZCM","created_at":"2026-05-18T12:26:28.662955+00:00"},{"alias_kind":"pith_short_16","alias_value":"ENA5243DAZCMTGWJ","created_at":"2026-05-18T12:26:28.662955+00:00"},{"alias_kind":"pith_short_8","alias_value":"ENA5243D","created_at":"2026-05-18T12:26:28.662955+00:00"}],"events":[],"event_summary":{},
"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV","json":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV.json","graph_json":"https://pith.science/api/pith-number/ENA5243DAZCMTGWJHEWIAF3GJV/graph.json","events_json":"https://pith.science/api/pith-number/ENA5243DAZCMTGWJHEWIAF3GJV/events.json","paper":"https://pith.science/paper/ENA5243D"},"agent_actions":{"view_html":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV","download_json":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV.json","view_paper":"https://pith.science/paper/ENA5243D","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=1106.6104&json=true","fetch_graph":"https://pith.science/api/pith-number/ENA5243DAZCMTGWJHEWIAF3GJV/graph.json","fetch_events":"https://pith.science/api/pith-number/ENA5243DAZCMTGWJHEWIAF3GJV/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV/action/timestamp_anchor","attest_storage":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV/action/storage_attestation","attest_author":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV/action/author_attestation","sign_citation":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV/action/citation_signature","submit_replication":"https://pith.science/pith/ENA5243DAZCMTGWJHEWIAF3GJV/action/replication_record"}},"created_at":"2026-05-18T03:31:31.030400+00:00","updated_at":"2026-05-18T03:31:31.030400+00:00"}