Customer-r1: Personalized simulation of human behaviors via rl-based llm agent in online shopping.arXiv preprint arXiv:2510.07230, 2025b

Ziyi Wang, Yuxuan Lu, Yimeng Zhang, Jing Huang, Dakuo Wang · 2025 · arXiv 2510.07230

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

representative citing papers

Can LLMs Think Like Consumers? Benchmarking Crowd-Level Reaction Reconstruction with ConsumerSimBench

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

ConsumerSimBench evaluates 13 LLMs on reconstructing crowd reactions from 1,553 Chinese social-media topics using 23,122 auditable yes-no criteria, finding maximum coverage of 47.8% by Gemini-3.1-Pro.

ShopGym: An Integrated Framework for Realistic Simulation and Scalable Benchmarking of E-Commerce Web Agents

cs.AI · 2026-05-15 · conditional · novelty 7.0

ShopGym introduces ShopArena to convert live storefronts into self-contained sandbox shops and ShopGuru to synthesize 224 benchmark tasks, with validation showing structural preservation and positive correlation of agent performance between synthetic and live shops.

SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents

cs.AI · 2026-05-14 · unverdicted · novelty 6.0 · 2 refs

SimPersona induces a discrete buyer-type space from clickstreams via VQ-VAE, maps types to LLM persona tokens, fine-tunes agents on traces, and samples from merchant distributions to achieve 78% conversion-rate alignment on 42 held-out storefronts.

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

cs.CV · 2026-04-06 · unverdicted · novelty 6.0

FileGram grounds AI agent personalization in file-system behavioral traces via a data simulation engine, a diagnostic benchmark, and a bottom-up memory architecture.

citing papers explorer

Showing 4 of 4 citing papers.

Can LLMs Think Like Consumers? Benchmarking Crowd-Level Reaction Reconstruction with ConsumerSimBench cs.CL · 2026-05-16 · unverdicted · none · ref 42
ConsumerSimBench evaluates 13 LLMs on reconstructing crowd reactions from 1,553 Chinese social-media topics using 23,122 auditable yes-no criteria, finding maximum coverage of 47.8% by Gemini-3.1-Pro.
ShopGym: An Integrated Framework for Realistic Simulation and Scalable Benchmarking of E-Commerce Web Agents cs.AI · 2026-05-15 · conditional · none · ref 21
ShopGym introduces ShopArena to convert live storefronts into self-contained sandbox shops and ShopGuru to synthesize 224 benchmark tasks, with validation showing structural preservation and positive correlation of agent performance between synthetic and live shops.
SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents cs.AI · 2026-05-14 · unverdicted · none · ref 27 · 2 links
SimPersona induces a discrete buyer-type space from clickstreams via VQ-VAE, maps types to LLM persona tokens, fine-tunes agents on traces, and samples from merchant distributions to achieve 78% conversion-rate alignment on 42 held-out storefronts.
FileGram: Grounding Agent Personalization in File-System Behavioral Traces cs.CV · 2026-04-06 · unverdicted · none · ref 21
FileGram grounds AI agent personalization in file-system behavioral traces via a data simulation engine, a diagnostic benchmark, and a bottom-up memory architecture.

Customer-r1: Personalized simulation of human behaviors via rl-based llm agent in online shopping.arXiv preprint arXiv:2510.07230, 2025b

fields

years

verdicts

representative citing papers

citing papers explorer