PR-CAD unifies text-to-CAD generation and editing via progressive refinement with LLMs, a new interaction dataset, and RL-enhanced reasoning to achieve better controllability and faithfulness.
InAnnual Meeting of the Association for Computational Linguistics
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
This systematic survey organizes prompt engineering into a taxonomy of 58 LLM techniques and 40 others, supplies a shared vocabulary, and offers guidelines for state-of-the-art models.
MDS selects better multi-turn dialogues for instruction tuning by combining bin-wise global coverage with local entity-topic and format consistency scoring, outperforming prior selectors on benchmarks.
OutSafe-Bench supplies the first large-scale four-modality safety dataset and evaluation framework that exposes persistent unsafe outputs in nine leading multimodal LLMs.
The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.
citing papers explorer
-
PR-CAD: Progressive Refinement for Unified Controllable and Faithful Text-to-CAD Generation with Large Language Models
PR-CAD unifies text-to-CAD generation and editing via progressive refinement with LLMs, a new interaction dataset, and RL-enhanced reasoning to achieve better controllability and faithfulness.
-
The Prompt Report: A Systematic Survey of Prompt Engineering Techniques
This systematic survey organizes prompt engineering into a taxonomy of 58 LLM techniques and 40 others, supplies a shared vocabulary, and offers guidelines for state-of-the-art models.
-
Data Selection for Multi-turn Dialogue Instruction Tuning
MDS selects better multi-turn dialogues for instruction tuning by combining bin-wise global coverage with local entity-topic and format consistency scoring, outperforming prior selectors on benchmarks.
-
OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models
OutSafe-Bench supplies the first large-scale four-modality safety dataset and evaluation framework that exposes persistent unsafe outputs in nine leading multimodal LLMs.
-
The Rise and Potential of Large Language Model Based Agents: A Survey
The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.
-
LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods
A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.