Unified Modeling of Lane and Lane Topology for Driving Scene Reasoning
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-12 01:58 UTC · model grok-4.3
The pith
Modeling lane topology as connected predecessor and successor lanes enables direct perception of both lane geometry and topology from raw images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We propose an innovative method called UniTopo, which represents the topological relationships between lanes as connected lanes, encompassing predecessor lanes, successor lanes, and their interconnections. This unified representation of lanes and lane topology allows us to simultaneously obtain both the positions and topological information of lanes within a shared perception pipeline, establishing a new paradigm for directly perceiving lane topology from original image features.
What carries the argument
Unified representation of lanes and their topology as predecessor and successor connections processed inside a single perception pipeline starting from raw image features.
If this is right
- Lane positions and topology are obtained simultaneously without post-processing from detections.
- The method reports TOP_ll scores of 30.1% and 31.8% on the two OpenLane-V2 subsets.
- These scores exceed the prior best method, T^2SG, by 6.0% and 8.6% respectively.
- A direct-perception paradigm replaces the previous reasoning-by-detection workflow.
Where Pith is reading between the lines
- Error accumulation between detection and topology stages may be reduced because both are learned jointly.
- Similar unification of detection and relational reasoning could apply to other scene elements such as traffic lights.
- End-to-end driving stacks might absorb this single-stage lane module without modular hand-offs.
Load-bearing premise
Representing lane topology through predecessor and successor connections is enough to capture the needed relationships without separate detection steps or major loss of information.
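This premise can be made concrete with a minimal sketch. The lane ids, points, and connections below are invented toy data: storing only a successor list per lane recovers both the predecessors and the full pairwise adjacency matrix that a detect-then-reason pipeline would output, which is what "without major loss of information" would require.

```python
# Hedged sketch: a lane scene where topology is stored directly as
# successor lists per lane, as the load-bearing premise describes.
lanes = {
    # lane_id: {"points": BEV centerline points, "succ": successor lane ids}
    "a": {"points": [(0, 0), (0, 10)], "succ": ["b", "c"]},  # lane "a" splits
    "b": {"points": [(0, 10), (-3, 20)], "succ": []},
    "c": {"points": [(0, 10), (3, 20)], "succ": []},
}

def predecessors(lanes, lane_id):
    """Predecessors are recoverable from successor lists alone."""
    return sorted(i for i, v in lanes.items() if lane_id in v["succ"])

def adjacency(lanes):
    """The lane-to-lane matrix a detect-then-reason pipeline would
    produce, rebuilt from the connected-lane representation."""
    ids = sorted(lanes)
    return [[1 if j in lanes[i]["succ"] else 0 for j in ids] for i in ids]

print(predecessors(lanes, "b"))  # ['a']
print(adjacency(lanes))          # [[0, 1, 1], [0, 0, 0], [0, 0, 0]]
```

The point of the sketch is only that the two representations are interconvertible; whether a network can learn the connected-lane form directly from images is the empirical question.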
What would settle it
On the OpenLane-V2 test sets, the unified model would lose its claimed advantage only if a pipeline that first detects lanes and then computes topology separately matched or exceeded the reported TOP_ll scores of 30.1% and 31.8%.
Original abstract
Autonomous vehicles need to perceive not only physical elements in the driving scene, such as lane lines and traffic lights, but also logical elements like lane centerlines and their topology. Existing lane topology reasoning methods typically follow a reasoning-by-detection paradigm, where lane topological relationships are primarily derived from lane detection results. In this paper, we propose an innovative method called Unified Modeling of Lane and Lane Topology (UniTopo), which represents the topological relationships between lanes as connected lanes, encompassing predecessor lanes, successor lanes, and their interconnections. This unified representation of lanes and lane topology allows us to simultaneously obtain both the positions and topological information of lanes within a shared perception pipeline, establishing a new paradigm for directly perceiving lane topology from original image features. We validate our method on the driving scene reasoning benchmark OpenLane-V2, which consists of two subsets, built based on Argoverse2 and nuScenes, respectively. Our method achieves TOP_ll of 30.1% and 31.8% on the two subsets, significantly surpassing the existing state-of-the-art method T^2SG by 6.0% and 8.6%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UniTopo, a unified modeling method for lanes and lane topology in driving scenes. It represents topological relationships as connected predecessor/successor lanes (and their interconnections) to enable simultaneous perception of lane geometry and topology directly from raw image features within a single pipeline, departing from the conventional reasoning-by-detection paradigm. On the OpenLane-V2 benchmark (Argoverse2 and nuScenes subsets), the method reports TOP_ll scores of 30.1% and 31.8%, outperforming prior SOTA T^2SG by 6.0% and 8.6%.
Significance. If the unified representation truly supports direct topology perception without implicit detection stages or feature bottlenecks, the work could shift the field toward more integrated perception pipelines for autonomous driving, with potential gains in efficiency and reduced error propagation. The reported benchmark improvements are concrete and would be a meaningful advance if backed by full architectural details, loss formulations, and ablations.
major comments (1)
- Abstract: The load-bearing claim that representing topology as predecessor/successor connections 'allows us to simultaneously obtain both the positions and topological information of lanes within a shared perception pipeline' and establishes 'a new paradigm for directly perceiving lane topology from original image features' requires explicit confirmation that no separate lane detection head or intermediate instance representation is used. Without equations, network diagrams, or loss terms showing how connections are regressed/classified from image features alone, it remains unclear whether the joint objective avoids trading geometric accuracy for topological accuracy or reintroduces implicit detection.
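To make the concern concrete, here is a hedged toy sketch (not the paper's architecture; the embeddings and dot-product head are invented) of what scoring predecessor/successor connections directly from per-lane query embeddings, with no detection stage in between, could look like:

```python
import math

def connectivity_scores(queries):
    """Hedged sketch: score directed lane-to-lane connections directly
    from per-lane query embeddings via a toy dot-product head. In a real
    model this role would be played by learned layers; everything here
    is an illustrative assumption."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    n = len(queries)
    return [[sigmoid(sum(a * b for a, b in zip(queries[i], queries[j])))
             if i != j else 0.0  # no self-connections
             for j in range(n)] for i in range(n)]

q = [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]]  # toy embeddings for 3 lanes
scores = connectivity_scores(q)  # scores[i][j]: lane j follows lane i
```

The referee's question is precisely whether the paper shows layers like this operating on image-derived embeddings alone, or whether an implicit detection step sits in between.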
minor comments (1)
- The abstract references the TOP_ll metric and OpenLane-V2 subsets but provides no definition of the metric, no comparison table, and no mention of other standard metrics (e.g., lane detection mAP or topology-specific scores) used in the benchmark.
Simulated Author's Rebuttal
We thank the referee for their detailed review and constructive comments. We address the major comment below and clarify the unified end-to-end design of UniTopo while revising the manuscript for greater explicitness.
Point-by-point responses
Referee: Abstract: The load-bearing claim that representing topology as predecessor/successor connections 'allows us to simultaneously obtain both the positions and topological information of lanes within a shared perception pipeline' and establishes 'a new paradigm for directly perceiving lane topology from original image features' requires explicit confirmation that no separate lane detection head or intermediate instance representation is used. Without equations, network diagrams, or loss terms showing how connections are regressed/classified from image features alone, it remains unclear whether the joint objective avoids trading geometric accuracy for topological accuracy or reintroduces implicit detection.
Authors: We thank the referee for this observation. UniTopo is designed as a single end-to-end network: a shared image backbone extracts features that feed directly into a unified prediction head. This head simultaneously regresses lane geometry (as ordered points) and classifies predecessor/successor connections between lane instances, with no separate detection head, no post-processing instance grouping, and no intermediate lane representations. Topology is not derived after detection but is an explicit output of the same feature embeddings via a connectivity classification branch. The full architecture (including network diagram in Figure 2), the joint loss (geometric regression plus topology cross-entropy, Equation 4), and the direct regression of connections from image features are detailed in Section 3. Ablation studies confirm that joint optimization improves rather than trades off geometric and topological accuracy. To make the abstract claim self-contained, we have revised it to explicitly state the absence of a separate detection stage and added a short clarifying sentence referencing the unified head. Revision: yes.
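The joint objective described in the rebuttal (its Equation 4 is not reproduced on this page) can be sketched, under assumed terms, as an L1 point-regression loss plus a binary cross-entropy connectivity loss. The weights and exact terms below are illustrative assumptions, not the paper's formulation:

```python
import math

def joint_lane_loss(pred_pts, gt_pts, pred_conn_prob, gt_conn,
                    w_geo=1.0, w_topo=1.0):
    """Hedged sketch of a joint objective: mean L1 error on lane points
    plus binary cross-entropy on pairwise connectivity probabilities.
    The loss weights and terms are invented for illustration."""
    # Geometric term: mean L1 distance over all lane points.
    n_pts = sum(len(lane) for lane in gt_pts)
    geo = sum(abs(px - gx) + abs(py - gy)
              for pl, gl in zip(pred_pts, gt_pts)
              for (px, py), (gx, gy) in zip(pl, gl)) / n_pts
    # Topology term: binary cross-entropy over connectivity entries.
    eps = 1e-7
    pairs = [(p, y) for pr, gr in zip(pred_conn_prob, gt_conn)
             for p, y in zip(pr, gr)]
    topo = -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for p, y in pairs) / len(pairs)
    return w_geo * geo + w_topo * topo
```

A joint objective like this is what would let gradients from topology errors reach the shared features, which is the mechanism behind the rebuttal's claim that joint optimization avoids trading geometric for topological accuracy.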
Circularity Check
No significant circularity; empirical results on external benchmarks
Full rationale
The paper introduces a unified representation of lanes and topology as predecessor/successor connections to enable direct perception from image features in one pipeline, contrasting it with prior reasoning-by-detection approaches. Validation consists of performance numbers (TOP_ll 30.1% and 31.8%) on the public OpenLane-V2 benchmark subsets, with direct numerical comparison to an external SOTA method T^2SG. No equations, loss terms, fitted parameters, or self-citations appear in the provided text that would reduce the claimed improvements or the new paradigm to a tautology or input fit by construction. The derivation chain therefore remains open to external falsification via benchmark results rather than closing on its own definitions or prior author work.
Axiom & Free-Parameter Ledger
free parameters (1)
- neural network weights and hyperparameters
axioms (2)
- domain assumption: Lane topology can be faithfully represented as a graph of predecessor and successor connections.
- domain assumption: Image features contain sufficient information to infer both geometry and topology jointly.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tagged unclear
Unclear: relation between the paper passage and the cited Recognition theorem.
UniTopo defines two groups of queries for piecewise lanes and connected lanes, uses a shared lane decoder to interact with BEV features, and employs a shared lane head to obtain lane positions and the lane-to-lane topology relationships. In addition, we design a Topology-Aware Attention Module (TAM) to incorporate lane connection information into the features of piecewise lanes.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tagged unclear
Unclear: relation between the paper passage and the cited Recognition theorem.
We propose a method for unified modeling of lane and lane topology that concurrently perceives lanes and their topological structures, establishing a new paradigm distinct from the reasoning-by-detection approach.
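The Topology-Aware Attention Module (TAM) quoted above, which "incorporates lane connection information into the features of piecewise lanes", can be sketched hypothetically as attention whose logits receive an additive boost between connected lanes. The bias value and feature layout below are invented:

```python
import math

def topology_aware_attention(feats, conn, bias=2.0):
    """Hedged sketch of topology-aware attention: dot-product attention
    over per-lane features, with an additive logit boost where conn[i][j]
    marks a lane connection. All values here are toy assumptions."""
    n = len(feats)
    out = []
    for i in range(n):
        logits = [sum(a * b for a, b in zip(feats[i], feats[j]))
                  + (bias if conn[i][j] else 0.0)
                  for j in range(n)]
        m = max(logits)                       # stabilize the softmax
        exps = [math.exp(v - m) for v in logits]
        z = sum(exps)
        weights = [e / z for e in exps]
        # Each lane's output is a connectivity-weighted mix of features.
        out.append([sum(weights[j] * feats[j][d] for j in range(n))
                    for d in range(len(feats[i]))])
    return out
```

Under this sketch, a lane attends more strongly to lanes it is connected to than an unbiased attention would, which is one plausible reading of how connection information reaches piecewise-lane features.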
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, "nuScenes: A multimodal dataset for autonomous driving," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 11621–11631.
- [2] B. Wilson, W. Qi, T. Agarwal, J. Lambert, J. Singh, S. Khandelwal, B. Pan, R. Kumar, A. Hartnett, J. K. Pontes et al., "Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting," arXiv preprint arXiv:2301.00493, 2023.
- [3] P. Sun, H. Kretzschmar, X. Dotiwalla, A. Chouard, V. Patnaik, P. Tsui, J. Guo, Y. Zhou, Y. Chai, B. Caine et al., "Scalability in Perception for Autonomous Driving: Waymo Open Dataset," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2446–2454.
- [4] Y. Hu, J. Yang, L. Chen, K. Li, C. Sima, X. Zhu, S. Chai, S. Du, T. Lin, W. Wang et al., "Planning-oriented Autonomous Driving," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17853–17862.
- [5] B. Jiang, S. Chen, Q. Xu, B. Liao, J. Chen, H. Zhou, Q. Zhang, W. Liu, C. Huang, and X. Wang, "VAD: Vectorized Scene Representation for Efficient Autonomous Driving," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 8340–8350.
- [6] H. Wang, T. Li, Y. Li, L. Chen, C. Sima, Z. Liu, B. Wang, P. Jia, Y. Wang, S. Jiang et al., "OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping," Advances in Neural Information Processing Systems, vol. 36, pp. 18873–18884, 2024.
- [7] T. Li, L. Chen, X. Geng, H. Wang, Y. Li, Z. Liu, S. Jiang, Y. Wang, H. Xu, C. Xu et al., "Graph-based Topology Reasoning for Driving Scenes," arXiv preprint arXiv:2304.05277, 2023.
- [8] D. Wu, J. Chang, F. Jia, Y. Liu, T. Wang, and J. Shen, "TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning," arXiv preprint arXiv:2310.06753, 2023.
- [9] H. Li, Z. Huang, Z. Wang, W. Rong, N. Wang, and S. Liu, "Enhancing 3D Lane Detection and Topology Reasoning with 2D Lane Priors," arXiv preprint arXiv:2406.03105, 2024.
- [10] Y. Fu, W. Liao, X. Liu, Y. Ma, F. Dai, Y. Zhang et al., "TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes," arXiv preprint arXiv:2405.14747, 2024.
- [11] Z. Ma, S. Liang, Y. Wen, W. Lu, and G. Wan, "RoadPainter: Points Are Ideal Navigators for Topology TransformER," in European Conference on Computer Vision, 2024, pp. 179–195.
- [12] F. Rong, W. Peng, M. Lan, Q. Zhang, and L. Zhang, "Driving Scene Understanding with Traffic Scene-Assisted Topology Graph Transformer," in Proceedings of the 32nd ACM International Conference on Multimedia, 2024, pp. 10075–10084.
- [13] C. Lv, M. Qi, L. Liu, and H. Ma, "T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving," in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 17197–17206.
- [14] K. Z. Luo, X. Weng, Y. Wang, S. Wu, J. Li, K. Q. Weinberger, Y. Wang, and M. Pavone, "Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps," in International Conference on Robotics and Automation, 2024, pp. 4029–4035.
- [15] T. Li, P. Jia, B. Wang, L. Chen, K. Jiang, J. Yan, and H. Li, "LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving," arXiv preprint arXiv:2312.16108, 2023.
- [16] T. N. Kipf and M. Welling, "Semi-Supervised Classification with Graph Convolutional Networks," arXiv preprint arXiv:1609.02907, 2016.
- [17] X. Zhu, W. Su, L. Lu, B. Li, X. Wang, and J. Dai, "Deformable DETR: Deformable Transformers for End-to-End Object Detection," arXiv preprint arXiv:2010.04159, 2020.
- [18] X. Li, J. Li, X. Hu, and J. Yang, "Line-CNN: End-to-End Traffic Line Detection With Line Proposal Unit," IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 1, pp. 248–258, 2019.
- [19] L. Tabelini, R. Berriel, T. M. Paixao, C. Badue, A. F. De Souza, and T. Oliveira-Santos, "Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 294–302.
- [20] T. Zheng, Y. Huang, Y. Liu, W. Tang, Z. Yang, D. Cai, and X. He, "CLRNet: Cross Layer Refinement Network for Lane Detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 898–907.
- [21] T. Zheng, Y. Huang, Y. Liu, B. Lin, Z. Yang, D. Cai, and X. He, "CLRNetV2: A Faster and Stronger Lane Detector," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 6, pp. 4271–4284, 2025.
- [22] Y. Wu, L. Zhao, J. Lu, and H. Yan, "Dense Hybrid Proposal Modulation for Lane Detection," IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 11, pp. 6845–6859, 2023.
- [23] S. Tan, Y. Zhang, and S. Zhu, "SMFRNet: Complex Scene Lane Detection With Start Point-Guided Multi-Dimensional Feature Refinement," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 12, pp. 13364–13372, 2024.
- [24] Y. Zhang, L. Zhu, W. Feng, H. Fu, M. Wang, Q. Li, C. Li, and S. Wang, "VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 15681–15690.
- [25] D. Jin, D. Kim, and C.-S. Kim, "Recursive Video Lane Detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 8473–8482.
- [26] K. He, J. Xie, X. Dai, K. Chang, F. Chen, and Z. Wang, "STADet: Streaming Timing-Aware Video Lane Detection," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 9, pp. 8644–8656, 2024.
- [27] K. Zhou, L. Li, W. Zhou, Y. Wang, H. Feng, and H. Li, "LaneTCA: Enhancing Video Lane Detection With Temporal Context Aggregation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 9, pp. 8574–8585, 2025.
- [28] N. Garnett, R. Cohen, T. Pe'er, R. Lahav, and D. Levi, "3D-LaneNet: End-to-End 3D Multiple Lane Detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 2921–2930.
- [29] Y. Guo, G. Chen, P. Zhao, W. Zhang, J. Miao, J. Wang, and T. E. Choe, "Gen-LaneNet: A Generalized and Scalable Approach for 3D Lane Detection," in European Conference on Computer Vision, 2020, pp. 666–681.
- [30] N. Efrat, M. Bluvstein, S. Oron, D. Levi, N. Garnett, and B. E. Shlomo, "3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation," arXiv preprint arXiv:2011.01535, 2020.
- [31] R. Liu, D. Chen, T. Liu, Z. Xiong, and Z. Yuan, "Learning to Predict 3D Lane Shape and Camera Pose from a Single Image via Geometry Constraints," in Proceedings of the AAAI Conference on Artificial Intelligence, 2022, pp. 1765–1772.
- [32] L. Chen, C. Sima, Y. Li, Z. Zheng, J. Xu, X. Geng, H. Li, C. He, J. Shi, Y. Qiao et al., "PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark," in European Conference on Computer Vision, 2022, pp. 550–567.
- [33] S. Huang, Z. Shen, Z. Huang, Z.-h. Ding, J. Dai, J. Han, N. Wang, and S. Liu, "Anchor3DLane: Learning to Regress 3D Anchors for Monocular 3D Lane Detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 17451–17460.
- [34] S. Huang, Z. Shen, Z. Huang, Y. Liao, J. Han, N. Wang, and S. Liu, "Anchor3DLane++: 3D Lane Detection via Sample-Adaptive Sparse 3D Anchor Regression," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 47, no. 3, pp. 1660–1673, 2025.
- [35] Y. Luo, C. Zheng, X. Yan, T. Kun, C. Zheng, S. Cui, and Z. Li, "LATR: 3D Lane Detection from Monocular Images with Transformer," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 7941–7952.
- [36] B. Pan, J. Sun, H. Y. T. Leung, A. Andonian, and B. Zhou, "Cross-view Semantic Segmentation for Sensing Surroundings," IEEE Robotics and Automation Letters, vol. 5, no. 3, pp. 4867–4873, 2020.
- [37] B. Zhou and P. Krähenbühl, "Cross-View Transformers for Real-Time Map-View Semantic Segmentation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 13760–13769.
- [38] S. Chen, T. Cheng, X. Wang, W. Meng, Q. Zhang, and W. Liu, "Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer," arXiv preprint arXiv:2206.04584, 2022.
- [39] Q. Li, Y. Wang, Y. Wang, and H. Zhao, "HDMapNet: An Online HD Map Construction and Evaluation Framework," in International Conference on Robotics and Automation, 2022, pp. 4628–4634.
- [40] Y. Liu, T. Yuan, Y. Wang, Y. Wang, and H. Zhao, "VectorMapNet: End-to-end Vectorized HD Map Learning," in International Conference on Machine Learning, 2023, pp. 22352–22369.
- [41] B. Liao, S. Chen, X. Wang, T. Cheng, Q. Zhang, W. Liu, and C. Huang, "MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction," arXiv preprint arXiv:2208.14437, 2022.
- [42] B. Liao, S. Chen, Y. Zhang, B. Jiang, Q. Zhang, W. Liu, C. Huang, and X. Wang, "MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction," arXiv preprint arXiv:2308.05736, 2023.
- [43] Z. Liu, X. Zhang, G. Liu, J. Zhao, and N. Xu, "Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction," in European Conference on Computer Vision, 2025, pp. 461–477.
- [44] T. Yuan, Y. Liu, Y. Wang, Y. Wang, and H. Zhao, "StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction," in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024, pp. 7356–7365.
- [45] Y. B. Can, A. Liniger, D. P. Paudel, and L. Van Gool, "Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 15661–15670.
- [46] N. Carion, F. Massa, G. Synnaeve, N. Usunier, A. Kirillov, and S. Zagoruyko, "End-to-End Object Detection with Transformers," in European Conference on Computer Vision, 2020, pp. 213–229.
- [47] Z. Xu, Y. Liu, Y. Sun, M. Liu, and L. Wang, "CenterLineDet: CenterLine Graph Detection for Road Lanes with Vehicle-mounted Sensors by Transformer for HD Map Generation," in International Conference on Robotics and Automation, 2023, pp. 3553–3559.
- [48] B. Liao, S. Chen, B. Jiang, T. Cheng, Q. Zhang, W. Liu, C. Huang, and X. Wang, "Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction," in European Conference on Computer Vision, 2025, pp. 334–351.
- [49] Y. Han, K. Yu, and Z. Li, "Continuity Preserving Online CenterLine Graph Learning," in European Conference on Computer Vision, 2024, pp. 342–359.
- [50] H. Li, S. Huang, L. Xu, Y. Gao, B. Mu, and S. Liu, "RATopo: Improving Lane Topology Reasoning via Redundancy Assignment," in Proceedings of the 33rd ACM International Conference on Multimedia, 2025, pp. 777–786.
- [51] H. Ye, M. Qi, Z. Liu, L. Liu, and H. Ma, "SafeDriveRAG: Towards Safe Autonomous Driving with Knowledge Graph-based Retrieval-Augmented Generation," in Proceedings of the 33rd ACM International Conference on Multimedia, 2025, pp. 11170–11178.
- [52] C. Lv, M. Qi, X. Li, Z. Yang, and H. Ma, "SGFormer: Semantic Graph Transformer for Point Cloud-based 3D Scene Graph Generation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 5, 2024, pp. 4035–4043.
- [53] M. Qi, W. Li, Z. Yang, Y. Wang, and J. Luo, "Attentive Relational Networks for Mapping Images to Scene Graphs," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 3957–3966.
- [54] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- [55] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature Pyramid Networks for Object Detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125.
- [56] Z. Li, W. Wang, H. Li, E. Xie, C. Sima, T. Lu, Y. Qiao, and J. Dai, "BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers," in European Conference on Computer Vision, 2022, pp. 1–18.
- [57] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, "Focal Loss for Dense Object Detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2017, pp. 2980–2988.
- [58] H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, "Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 658–666.
- [59] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, pp. 1137–1149, 2016.
- [60] Q. Chen, X. Chen, J. Wang, S. Zhang, K. Yao, H. Feng, J. Han, E. Ding, G. Zeng, and J. Wang, "Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 6633–6642.
- [61] D. Jia, Y. Yuan, H. He, X. Wu, H. Yu, W. Lin, L. Sun, C. Zhang, and H. Hu, "DETRs with Hybrid Matching," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19702–19712.
- [62] I. Loshchilov, "Decoupled Weight Decay Regularization," arXiv preprint arXiv:1711.05101, 2017.
- [63] Y. Luo, L. Zheng, T. Guan, J. Yu, and Y. Yang, "Taking A Closer Look at Domain Shift: Category-level Adversaries for Semantics Consistent Domain Adaptation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 2507–2516.
- [64] Y. Luo, P. Liu, L. Zheng, T. Guan, J. Yu, and Y. Yang, "Category-Level Adversarial Adaptation for Semantic Segmentation using Purified Features," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 8, pp. 3940–3956, 2021.
- [65] Y. Luo, P. Liu, and Y. Yang, "Kill Two Birds with One Stone: Domain Generalization for Semantic Segmentation via Network Pruning," International Journal of Computer Vision, vol. 133, no. 1, pp. 335–352, 2025.