TRACE: A Conversational Framework for Sustainable Tourism Recommendation with Agentic Counterfactual Explanations
Pith reviewed 2026-05-10 14:36 UTC · model grok-4.3
The pith
TRACE uses AI agents and counterfactual explanations to nudge users toward sustainable tourism recommendations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TRACE is a multi-agent LLM-based framework for tourism recommendations that balances user relevance with environmental impact through an orchestrator-worker architecture, where agents elicit latent sustainability preferences, construct user personas, and generate agentic counterfactual explanations to promote reflection on lower-impact alternatives.
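To make the relevance-versus-impact trade-off concrete, here is a minimal Python sketch. It is not the paper's scoring method, which this summary does not specify: the `Candidate` fields, the linear `weight` blend, the `rank` and `counterfactual` helpers, and all numbers are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    relevance: float  # 0..1 match to the user's request (hypothetical)
    co2e_kg: float    # estimated trip emissions in kg CO2e (made-up)


def rank(cands: List[Candidate], weight: float) -> List[Candidate]:
    """Sort by a linear blend of relevance and normalised 'greenness'."""
    max_e = max(c.co2e_kg for c in cands) or 1.0  # avoid division by zero

    def score(c: Candidate) -> float:
        greenness = 1.0 - c.co2e_kg / max_e
        return (1.0 - weight) * c.relevance + weight * greenness

    return sorted(cands, key=score, reverse=True)


def counterfactual(chosen: Candidate, cands: List[Candidate]) -> Optional[str]:
    """Describe the most relevant lower-emission alternative, if any."""
    greener = [c for c in cands if c.co2e_kg < chosen.co2e_kg]
    if not greener:
        return None
    alt = max(greener, key=lambda c: c.relevance)
    saved = chosen.co2e_kg - alt.co2e_kg
    return (f"Choosing {alt.name} instead of {chosen.name} "
            f"would save about {saved:.0f} kg CO2e.")


options = [
    Candidate("City A", relevance=0.9, co2e_kg=400.0),
    Candidate("City B", relevance=0.8, co2e_kg=100.0),
    Candidate("City C", relevance=0.5, co2e_kg=50.0),
]
by_relevance = [c.name for c in rank(options, weight=0.0)]
balanced = [c.name for c in rank(options, weight=0.5)]
message = counterfactual(options[0], options)
print(by_relevance, balanced, message)
```

With `weight=0.0` the ranking is purely by relevance; raising the weight moves lower-emission options up, and the counterfactual sentence states what the user would gain by switching rather than forcing the switch.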
What carries the argument
The modular orchestrator-worker architecture consisting of specialized agents for sustainability preference elicitation, persona construction, recommendation balancing, and counterfactual explanation generation.
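The orchestrator-worker pattern named above can be sketched in plain Python. This is an illustrative outline only and does not use or reproduce the Agent Development Kit API: the `Session` structure, the four stub workers, and the placeholder destinations are all hypothetical, and each stub stands in for what would be an LLM-backed agent.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Session:
    """Shared state threaded through the workers (hypothetical structure)."""
    user_message: str
    preferences: Dict[str, str] = field(default_factory=dict)
    persona: str = ""
    recommendations: List[str] = field(default_factory=list)
    explanation: str = ""


def elicit_preferences(s: Session) -> Session:
    # Stub: a real agent would ask clarifying questions through an LLM.
    wants_train = "train" in s.user_message.lower()
    s.preferences = {"transport": "train" if wants_train else "unspecified"}
    return s


def build_persona(s: Session) -> Session:
    s.persona = f"traveller(transport={s.preferences['transport']})"
    return s


def recommend(s: Session) -> Session:
    s.recommendations = ["City A", "City B"]  # placeholder candidates
    return s


def explain_counterfactual(s: Session) -> Session:
    top, alt = s.recommendations[0], s.recommendations[1]
    s.explanation = f"If you had picked {alt} instead of {top}, the outcome would differ."
    return s


# The orchestrator owns the order; workers know nothing about each other.
WORKERS: List[Callable[[Session], Session]] = [
    elicit_preferences, build_persona, recommend, explain_counterfactual,
]


def orchestrate(user_message: str) -> Session:
    """Run every worker in order, passing the shared session along."""
    session = Session(user_message=user_message)
    for worker in WORKERS:
        session = worker(session)
    return session


result = orchestrate("I'd like a city trip, ideally by train")
print(result.persona)
print(result.explanation)
```

Keeping the workers as independent callables is what makes the design modular: adding a further agent amounts to appending one function to `WORKERS`.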
If this is right
- Users receive interactive nudges that surface greener travel options without direct pressure.
- Recommendation quality is maintained: the reported user studies find that relevance is preserved.
- Interactive responsiveness remains intact during the conversation.
- Semantic analyses confirm that the explanations align with sustainable decision-making goals.
Where Pith is reading between the lines
- Similar agentic approaches could be adapted to encourage sustainable choices in other recommendation areas like dining or product purchases.
- Long-term studies might show whether these nudges lead to lasting changes in travel behavior.
- The framework's design supports adding more specialized agents for additional factors such as cultural impact or local economy benefits.
Load-bearing premise
LLM-based agents can reliably draw out accurate sustainability preferences and generate unbiased, non-hallucinated counterfactual explanations.
What would settle it
A controlled experiment in which users shown the counterfactual explanations choose sustainable options no more often than users of a version without them, or in which participants rate the generated explanations as inaccurate.
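One way such a comparison could be analysed is a two-proportion z-test on the share of sustainable choices in each condition. The sketch below uses only the Python standard library; the function name and the counts are hypothetical and are not results from the paper.

```python
import math
from typing import Tuple


def two_proportion_z_test(success_a: int, n_a: int,
                          success_b: int, n_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for H0: both groups share one proportion."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    if se == 0.0:
        raise ValueError("test undefined when the pooled proportion is 0 or 1")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: sustainable choices with vs. without explanations.
z, p = two_proportion_z_test(success_a=30, n_a=50, success_b=20, n_b=50)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A large p-value in a well-powered study would count against the claim; a small one would support it only if the design also rules out demand effects.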
Original abstract
Traditional conversational travel recommender systems primarily optimize for user relevance and convenience, often reinforcing popular, overcrowded destinations and carbon-intensive travel choices. To address this, we present TRACE (Tourism Recommendation with Agentic Counterfactual Explanations), a multi-agent, LLM-based framework that promotes sustainable tourism through interactive nudging. TRACE uses a modular orchestrator-worker architecture where specialized agents elicit latent sustainability preferences, construct structured user personas, and generate recommendations that balance relevance with environmental impact. A key innovation lies in its use of agentic counterfactual explanations and LLM-driven clarifying questions, which together surface greener alternatives and refine understanding of intent, fostering user reflection without coercion. User studies and semantic alignment analyses demonstrate that TRACE effectively supports sustainable decision-making while preserving recommendation quality and interactive responsiveness. TRACE is implemented on Google's Agent Development Kit, with full code, Docker setup, prompts, and a publicly available demo video to ensure reproducibility. A project summary, including all resources, prompts, and demo access, is available at https://ashmibanerjee.github.io/trace-chatbot.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents TRACE, a multi-agent LLM-based conversational framework for tourism recommendations that promotes sustainable choices via an orchestrator-worker architecture. Specialized agents elicit latent sustainability preferences, construct user personas, and generate recommendations balanced with environmental impact, using agentic counterfactual explanations and clarifying questions to encourage reflection without coercion. The central claim is that user studies and semantic alignment analyses demonstrate effective support for sustainable decision-making while preserving recommendation quality and interactive responsiveness. The system is implemented on Google's Agent Development Kit with full code, Docker setup, prompts, and a public demo for reproducibility.
Significance. If the user-study claims hold, this represents a meaningful engineering contribution to sustainable recommender systems in information retrieval by addressing carbon-intensive travel patterns through interactive nudging. The explicit provision of code, prompts, Docker configuration, and demo video is a clear strength that supports reproducibility and extension by the community.
major comments (2)
- [Abstract and Evaluation] Abstract and the user-studies section: the central claim that 'user studies and semantic alignment analyses demonstrate that TRACE effectively supports sustainable decision-making while preserving recommendation quality' is unsupported because no details are given on study design, sample size, metrics (e.g., reflection or alignment scores), statistical tests, or quantitative results. This information is required to evaluate whether the observed effects are genuine or artifacts of the LLM agents.
- [Framework (§3)] Framework description: the multi-agent architecture for preference elicitation, persona construction, and counterfactual generation contains no validation steps (e.g., expert annotation, consistency checks across prompts, or bias audits) against hallucinations or systematic biases. This assumption is load-bearing for the claim that the system surfaces genuine sustainability preferences rather than model defaults.
minor comments (1)
- [Discussion] The paper would benefit from a dedicated limitations subsection that explicitly discusses risks of LLM-induced bias in the agentic components.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment below and describe the revisions we will make to improve the clarity and rigor of the manuscript.
- Referee: [Abstract and Evaluation] Abstract and the user-studies section: the central claim that 'user studies and semantic alignment analyses demonstrate that TRACE effectively supports sustainable decision-making while preserving recommendation quality' is unsupported because no details are given on study design, sample size, metrics (e.g., reflection or alignment scores), statistical tests, or quantitative results. This information is required to evaluate whether the observed effects are genuine or artifacts of the LLM agents.
Authors: We agree that the current version of the manuscript does not provide sufficient methodological detail on the user studies and semantic alignment analyses to fully substantiate the claims in the abstract. In the revised manuscript we will expand the evaluation section with a complete description of the study design, participant sample size and recruitment, the specific metrics (including reflection scores, semantic alignment scores, recommendation quality, and responsiveness), the statistical tests applied, and the quantitative results with supporting tables or figures. We will also update the abstract to reference these additions where appropriate.
Revision: yes
- Referee: [Framework (§3)] Framework description: the multi-agent architecture for preference elicitation, persona construction, and counterfactual generation contains no validation steps (e.g., expert annotation, consistency checks across prompts, or bias audits) against hallucinations or systematic biases. This assumption is load-bearing for the claim that the system surfaces genuine sustainability preferences rather than model defaults.
Authors: We acknowledge that the framework description would be strengthened by explicit validation procedures for the agentic components. While the public code, prompts, and demo already enable community inspection, we will revise Section 3 to add a dedicated validation subsection. This will include prompt consistency checks across multiple runs, expert annotation of sample outputs for hallucination and bias detection, and any systematic audits performed during development. These additions will directly address concerns about whether the system elicits genuine preferences.
Revision: yes
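A prompt consistency check across multiple runs could take many forms. As one hedged illustration, the sketch below scores how stable the elicited preference labels are by averaging pairwise Jaccard overlap between runs. The metric choice, the function names, and the example labels are assumptions, not the authors' procedure.

```python
from itertools import combinations
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two label sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def consistency(runs: List[Set[str]]) -> float:
    """Mean pairwise Jaccard similarity over repeated runs of one prompt."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to measure consistency")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical preference labels extracted from three runs of the same dialogue.
runs = [
    {"train", "off-season"},
    {"train", "off-season"},
    {"train", "budget"},
]
score = consistency(runs)
print(f"consistency = {score:.3f}")
```

A score near 1.0 suggests the elicited preferences are stable across runs; a low score would flag prompts whose output depends more on sampling noise than on the user's input.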
Circularity Check
No circularity; engineering framework with independent user-study validation
The paper presents TRACE as a modular multi-agent LLM architecture for eliciting sustainability preferences and generating counterfactual explanations in tourism recommendations. Its central claims rest on the described system design plus reported user studies and semantic alignment analyses, none of which involve mathematical derivations, fitted parameters renamed as predictions, or load-bearing self-citations that reduce the result to its own inputs. No equations, uniqueness theorems, or ansatzes are invoked; the contribution is self-contained as an implemented framework with reproducibility artifacts.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM agents can accurately elicit and model latent user sustainability preferences through interactive dialogue without significant bias or error.
Reference graph
Works this paper leans on
- [1] Ionut Arghire. 2026. Chainlit Vulnerabilities May Leak Sensitive Information. https://www.securityweek.com/chainlit-vulnerabilities-may-leak-sensitive-information/. Accessed 2026-02
- [2] Ashmi Banerjee. 2023. Fairness and sustainability in multistakeholder tourism recommender systems. In Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization. 274–279
- [4] Ashmi Banerjee, Tunar Mahmudov, Emil Adler, Fitri Nur Aisyah, and Wolfgang Wörndl. 2025. Modeling sustainable city trips: integrating CO2e emissions, popularity, and seasonality into tourism recommender systems. Information Technology & Tourism 27, 1 (2025), 189–226
- [5] Ashmi Banerjee, Tunar Mahmudov, and Wolfgang Wörndl. 2024. Green Destination Recommender: A Web Application to Encourage Responsible City Trip Recommendations. In Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization. 486–490
- [7] Ashmi Banerjee, Adithi Satish, Fitri Nur Aisyah, Wolfgang Wörndl, and Yashar Deldjoo. 2025. SynthTRIPs: A Knowledge-Grounded Framework for Benchmark Data Generation for Personalized Tourism Recommenders. In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval. 3743–3752
- [8] Oren Barkan, Veronika Bogina, Liya Gurevitch, Yuval Asher, and Noam Koenigstein. 2024. A counterfactual framework for learning and evaluating explanations for recommender systems. In Proceedings of the ACM Web Conference 2024. 3723–3733
- [9] Keping Bi, Qingyao Ai, and W Bruce Croft. 2021. Asking clarifying questions based on negative feedback in conversational search. In Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval. 157–166
- [10] Chainlit. 2026. Chainlit: Get Started – Overview. https://docs.chainlit.io/get-started/overview
- [13] FastAPI. 2026. FastAPI Documentation. https://fastapi.tiangolo.com/
- [14] Firebase and Google Cloud. 2026. Cloud Firestore Documentation. https://firebase.google.com/docs/firestore
- [15] Google Cloud. 2026. Cloud Run Documentation. https://cloud.google.com/run
- [16] Google Cloud. 2026. Overview of Agent Development Kit. https://docs.cloud.google.com/agent-builder/agent-development-kit/overview
- [17] Google Cloud. 2026. Vertex AI Platform. https://cloud.google.com/vertex-ai
- [18] Google DeepMind. 2024. Gemini 2.5: Technical Report. Technical Report. Google DeepMind. https://storage.googleapis.com/deepmind-media/gemini/gemini_v2_5_report.pdf
- [19] Shengyu Gu. 2024. A survey of large language models in tourism (Tourism LLMs). Preprint on Qeios (2024)
- [20] Riccardo Guidotti. 2024. Counterfactual explanations and how to find them: literature review and benchmarking. Data Mining and Knowledge Discovery 38, 5 (2024), 2770–2824
- [21] Haya Halimeh and Oliver Müller. 2025. Towards Greener Choices: Decision Information Nudging for Sustainability-Aware Recommender Explanations. In International Workshop on Recommender Systems for Sustainability and Social Good. Springer, 27–42
- [23] Ankur Joshi, Saket Kale, Satish Chandel, and D Kumar Pal. 2015. Likert scale: Explored and explained. British Journal of Applied Science & Technology 7, 4 (2015), 396–403
- [24] Sara Kemper, Justin Cui, Kai Dicarlantonio, Kathy Lin, Danjie Tang, Anton Korikov, and Scott Sanner. 2024. Retrieval-augmented conversational recommendation with prompt-based semi-structured natural language state tracking. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2786–2790
- [26] Noemi Mauro, Livio Scarpinati, Fabio Ferrero, Angelo Geninatti Cossatin, and Claudio Mattutino. 2024. Point-of-Interest Recommender Systems: Nudging towards Sustainable Tourism. In Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization. 491–495
- [28] Xuhui Ren, Hongzhi Yin, Tong Chen, Hao Wang, Zi Huang, and Kai Zheng. 2021. Learning to ask appropriate questions in conversational recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 808–817
- [31] Ivan Sekulić, Weronika Łajewska, Krisztian Balog, and Fabio Crestani. 2024. Estimating the usefulness of clarifying questions and answers for conversational search. In European Conference on Information Retrieval. Springer, 384–392
- [32] Zijian Shao, Jiancan Wu, Weijian Chen, and Xiang Wang. 2025. Personal Travel Solver: A Preference-Driven LLM-Solver System for Travel Planning. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 27622–27642
- [33] Juntao Tan, Shuyuan Xu, Yingqiang Ge, Yunqi Li, Xu Chen, and Yongfeng Zhang. 2021. Counterfactual explainable recommendation. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management. 1784–1793
- [35] Ke Wang, Shuai Yan, Haoran Yuan, Yanling Huang, Yuhang Wu, Fei Li, Shengying Yang, and Huan Deng. 2025. Toward Interpretable and Persistent Personalization: A Memory-Augmented Agent Framework for LLM-Based Travel Planning. IEEE Access 13 (2025), 193125–193141
- [36] Xiangmeng Wang, Qian Li, Dianer Yu, Qing Li, and Guandong Xu. 2024. Counterfactual explanation for fairness in recommendation. ACM Transactions on Information Systems 42, 4 (2024), 1–30
- [37] Zhefan Wang, Yuanqing Yu, Wendi Zheng, Weizhi Ma, and Min Zhang. 2024. Macrec: A multi-agent collaboration framework for recommendation. (2024), 2760–2764
- [38] Dianer Yu, Qian Li, Xiangmeng Wang, Qing Li, and Guandong Xu. 2023. Counterfactual explainable conversational recommendation. IEEE Transactions on Knowledge and Data Engineering 36, 6 (2023), 2388–2400
- [39] Hamed Zamani, Susan Dumais, Nick Craswell, Paul Bennett, and Gord Lueck. 2020. Generating clarifying questions for information retrieval. In Proceedings of The Web Conference 2020. 418–428
SIGIR ’26, July 20–24, 2026, Melbourne, VIC, Australia. Ashmi Banerjee, Adithi Satish, Wolfgang Wörndl, and Yashar Deldjoo