Self-CriTeach lets an LLM generate symbolic domains that supply both chain-of-thought training data and structured rewards, producing a planning-enhanced model with better success rates and generalization.
In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.RO (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Self-CriTeach: LLM Self-Teaching and Self-Critiquing for Improving Robotic Planning via Automated Domain Generation