BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Recent advancements in deep learning have actively addressed complex challenges within the Computer-Aided Design (CAD) domain. However, most existing approaches rely on task-specific models requiring structural modifications for new tasks, and they predominantly focus on point clouds or images rather than the industry-standard Boundary Representation (B-rep) format. To address these limitations, we propose BrepCoder, a unified Multimodal Large Language Model (MLLM) that performs diverse CAD tasks from B-rep inputs. By leveraging the code generation capabilities of Large Language Models (LLMs), we convert CAD modeling sequences into Python-like code and align them with B-rep. We then adopt a two-stage training strategy: First, pre-training on reverse engineering to learn geometric features and design logic. Second, effectively extending the model to various downstream tasks such as completion, error correction, and CAD-QA. Consequently, by interpreting B-rep as structural code, BrepCoder achieves superior generalization across diverse tasks, demonstrating its potential as a general-purpose CAD agent.
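To make the idea of "converting CAD modeling sequences into Python-like code" concrete, here is a minimal illustrative sketch. It is not the paper's actual code format: the command classes (`Line`, `Circle`, `Extrude`), the emitted names (`Sketch`, `sketch.line`, `sketch.circle`, `sketch.extrude`), and the function `to_python_like_code` are all hypothetical, chosen only to show how a sketch-and-extrude sequence could be serialized as text that an LLM can read and generate.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical command types for a sketch-and-extrude modeling sequence.
@dataclass
class Line:
    start: Tuple[float, float]
    end: Tuple[float, float]


@dataclass
class Circle:
    center: Tuple[float, float]
    radius: float


@dataclass
class Extrude:
    distance: float


Command = Union[Line, Circle, Extrude]


def to_python_like_code(commands: List[Command]) -> str:
    """Serialize a modeling sequence as Python-like text.

    The emitted API (Sketch, line, circle, extrude) is an assumption made
    for illustration; the output is a string and is never executed here.
    """
    lines = ["sketch = Sketch()"]
    for cmd in commands:
        if isinstance(cmd, Line):
            lines.append(f"sketch.line({cmd.start}, {cmd.end})")
        elif isinstance(cmd, Circle):
            lines.append(f"sketch.circle({cmd.center}, {cmd.radius})")
        elif isinstance(cmd, Extrude):
            lines.append(f"solid = sketch.extrude({cmd.distance})")
        else:
            raise TypeError(f"unsupported command: {cmd!r}")
    return "\n".join(lines)


# A cylinder: one circle profile extruded along its normal.
example = [Circle((0.0, 0.0), 1.0), Extrude(2.0)]
code = to_python_like_code(example)
print(code)
```

In a setup like the one the abstract describes, text of this kind would serve as the target sequence paired with a B-rep input during reverse-engineering pre-training; how the actual pairing and tokenization are done is not specified here.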
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding
BRepCLIP is the first contrastive pretraining framework that tokenizes BRep CAD geometry into surface and curve vocabularies and aligns the resulting embeddings with CLIP text and image encoders, reporting large gains in retrieval and zero-shot classification over point-based baselines.