View-Consistent 3D Scene Editing via Dual-Path Structural Correspondence and Semantic Continuity
Pith reviewed 2026-05-10 05:44 UTC · model grok-4.3
The pith
Recasting 3D scene editing as joint distribution modeling across viewpoints, combined with a dual-path consistency mechanism, yields precise, multi-view-consistent text-driven edits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We recast multi-view consistent 3D editing from a distributional perspective: 3D scene editing essentially requires a joint distribution modeling across viewpoints. Based on this insight, we propose a view-consistent 3D editing framework that explicitly introduces cross-view dependencies into the editing process. Furthermore, motivated by the observation that structural correspondence and semantic continuity rely on different cross-view cues, we introduce a dual-path consistency mechanism consisting of projection-guided structural guidance and patch-level semantic propagation for effective cross-view editing. Further, we construct a paired multi-view editing dataset that provides reliable supervision for learning cross-view consistency in edited scenes.
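The distributional recast can be stated compactly. A hedged formalization (notation ours, not the paper's): let x_1, …, x_V be the edited views and c the editing prompt.

```latex
% Render-edit-optimize pipelines edit each view in isolation,
% implicitly sampling from a factorized distribution:
p_{\text{indep}}(x_1, \dots, x_V \mid c) \;=\; \prod_{v=1}^{V} p(x_v \mid c)

% The paper's claim is that consistent 3D editing requires modeling
% the joint distribution, whose cross-view dependencies the
% factorized form discards:
p_{\text{joint}}(x_1, \dots, x_V \mid c) \;\neq\; \prod_{v=1}^{V} p(x_v \mid c)
```

Under the factorized form, any consistency must be recovered post hoc (e.g., by inference-time synchronization); under the joint form, cross-view dependencies enter the editing process itself.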
What carries the argument
dual-path consistency mechanism of projection-guided structural guidance and patch-level semantic propagation that handles different cross-view cues separately
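For illustration, projection-guided structural correspondence typically reduces to standard pinhole reprojection of one view's pixels into another. A minimal sketch of that cue, assuming known depth and camera parameters; `project_to_view` and its interface are ours, not the paper's implementation:

```python
import numpy as np

def project_to_view(pix, depth, K_src, K_tgt, T_src2tgt):
    """Lift pixels (u, v) in the source view to 3D using the source
    depth map, then project them into the target view. Standard
    pinhole geometry; a sketch of the kind of structural cue a
    projection-guided path could rely on."""
    u, v = pix[:, 0], pix[:, 1]
    z = depth[v.astype(int), u.astype(int)]
    # Back-project to source camera coordinates.
    x_cam = np.stack([(u - K_src[0, 2]) * z / K_src[0, 0],
                      (v - K_src[1, 2]) * z / K_src[1, 1],
                      z], axis=1)
    # Apply the 4x4 rigid transform into the target camera frame.
    x_h = np.concatenate([x_cam, np.ones((len(z), 1))], axis=1)
    x_tgt = (T_src2tgt @ x_h.T).T[:, :3]
    # Project with the target intrinsics and dehomogenize.
    uv = (K_tgt @ x_tgt.T).T
    return uv[:, :2] / uv[:, 2:3]
```

Pixels matched this way give point-to-point structural correspondence, whereas semantic continuity (appearance of the edit across views) is exactly what such geometric projection cannot capture on its own, motivating the separate patch-level propagation path.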
If this is right
- Superior editing performance with precise and consistent views for complex scenes compared to prior render-edit-optimize methods.
- Reduced reliance on inference-time synchronization improves robustness and generalization.
- Dedicated paths allow structural correspondence and semantic continuity to be maintained using their respective cross-view cues.
- The paired multi-view editing dataset supplies reliable supervision for learning cross-view consistency.
Where Pith is reading between the lines
- The distributional framing could be tested on other multi-view tasks such as consistent novel-view synthesis from edits.
- The separation of structural and semantic paths may suggest similar decompositions for consistency problems in video or 4D generation.
- If the paired dataset generalizes, it could support supervised training of additional cross-view editing models beyond the current framework.
Load-bearing premise
The dual-path mechanism of projection-guided structural guidance and patch-level semantic propagation, combined with supervision from the new paired dataset, will produce reliable cross-view consistency without limiting generalization or introducing new artifacts in real-world scenes.
What would settle it
Edited multi-view images of a complex real-world scene exhibiting visible structural mismatches or semantic drifts between viewpoints would show the dual-path approach has not delivered the claimed consistency.
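Such mismatches can be operationalized as a warp-and-compare check: reproject view A's edit into view B's frame and measure residual disagreement. A sketch metric, not the paper's evaluation protocol; it assumes the warp (e.g., via known depth and cameras) has already been applied:

```python
import numpy as np

def cross_view_mismatch(edit_a_warped, edit_b, valid_mask):
    """Mean absolute color difference between view A's edit warped
    into view B's frame and view B's own edit, over pixels where
    the warp is valid. High values indicate the structural or
    semantic drift described above."""
    diff = np.abs(edit_a_warped.astype(float) - edit_b.astype(float))
    return diff[valid_mask].mean()
```

A perfectly consistent edit would drive this toward the scene's re-rendering noise floor; persistent high values on real-world captures would be evidence against the claimed consistency.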
Original abstract
Text-driven 3D scene editing has recently attracted increasing attention. Most existing methods follow a render-edit-optimize pipeline, where multi-view images are rendered from a 3D scene, edited with 2D image editors, and then used to optimize the underlying 3D representation. However, cross-view inconsistency remains a major bottleneck. Although recent methods introduce geometric cues, cross-view interactions, or video priors to mitigate this issue, they still largely rely on inference-time synchronization and thus remain limited in robustness and generalization. In this work, we recast multi-view consistent 3D editing from a distributional perspective: 3D scene editing essentially requires a joint distribution modeling across viewpoints. Based on this insight, we propose a view-consistent 3D editing framework that explicitly introduces cross-view dependencies into the editing process. Furthermore, motivated by the observation that structural correspondence and semantic continuity rely on different cross-view cues, we introduce a dual-path consistency mechanism consisting of projection-guided structural guidance and patch-level semantic propagation for effective cross-view editing. Further, we construct a paired multi-view editing dataset that provides reliable supervision for learning cross-view consistency in edited scenes. Extensive experiments demonstrate that our method achieves superior editing performance with precise and consistent views for complex scenes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript recasts text-driven 3D scene editing as joint distribution modeling across viewpoints to overcome cross-view inconsistency in the standard render-edit-optimize pipeline. It introduces a dual-path consistency mechanism (projection-guided structural guidance combined with patch-level semantic propagation) and constructs a paired multi-view editing dataset to provide supervision for learning cross-view dependencies. The central claim is that this framework delivers superior editing performance with precise and consistent multi-view results on complex scenes.
Significance. If the experimental claims hold, the distributional perspective and dual-path mechanism could provide a more robust, training-based alternative to inference-time synchronization techniques for enforcing view consistency. The new paired dataset would also supply a concrete resource for future work on supervised cross-view editing, with potential impact on applications requiring coherent 3D content generation.
Major comments (2)
- [Method and Experiments] The central performance claim depends on the dual-path mechanism producing reliable cross-view consistency that generalizes beyond the training distribution. The manuscript provides no explicit cross-dataset validation or testing on real-world captures outside the custom paired set (see Method description and Experiments section), which directly bears on whether the distributional insight mitigates inconsistency in complex scenes or merely fits the training data.
- [Abstract and Experiments] The abstract asserts that 'extensive experiments demonstrate superior editing performance with precise and consistent views,' yet the provided text contains no quantitative metrics, baseline comparisons, ablation results, or error analysis to support this. Without these details, the superiority and consistency claims cannot be verified as load-bearing evidence.
Minor comments (2)
- [Title] The title contains a spelling error ('Correspondense' should be 'Correspondence').
- [Method] Notation for the dual-path components (projection-guided structural guidance and patch-level semantic propagation) is introduced at a high level; explicit equations or pseudocode would clarify how the two paths interact during training and inference.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions where appropriate to strengthen the presentation of our distributional perspective and dual-path framework.
Point-by-point responses
-
Referee: [Method and Experiments] The central performance claim depends on the dual-path mechanism producing reliable cross-view consistency that generalizes beyond the training distribution. The manuscript provides no explicit cross-dataset validation or testing on real-world captures outside the custom paired set (see Method description and Experiments section), which directly bears on whether the distributional insight mitigates inconsistency in complex scenes or merely fits the training data.
Authors: We appreciate the referee's emphasis on the need to demonstrate generalization. Our paired multi-view editing dataset was constructed from diverse 3D scenes and editing instructions specifically to encourage learning of cross-view dependencies that are not tied to a narrow distribution. The dual-path mechanism relies on projection-based structural cues and patch-level semantic propagation, which draw from general geometric and semantic principles. That said, we agree that explicit testing on external real-world captures would provide stronger evidence that the approach mitigates inconsistency rather than overfitting the training set. In the revised manuscript, we will add experiments on additional real-world multi-view datasets to evaluate generalization. revision: yes
-
Referee: [Abstract and Experiments] The abstract asserts that 'extensive experiments demonstrate superior editing performance with precise and consistent views,' yet the provided text contains no quantitative metrics, baseline comparisons, ablation results, or error analysis to support this. Without these details, the superiority and consistency claims cannot be verified as load-bearing evidence.
Authors: We apologize for any lack of prominence in the experimental details. The full manuscript contains a dedicated Experiments section that reports quantitative metrics on editing quality and multi-view consistency, direct comparisons against relevant baselines, ablation studies isolating the contributions of the structural and semantic paths, and error analysis across scene types. The abstract is written as a concise summary of these results. To make the claims more immediately verifiable, we will revise the abstract to include a brief reference to key quantitative outcomes and ensure the Experiments section is clearly signposted from the abstract. revision: partial
Circularity Check
No significant circularity detected in the derivation chain
Full rationale
The paper introduces a new view-consistent 3D editing framework by recasting the problem as joint distribution modeling across viewpoints, proposing a dual-path mechanism (projection-guided structural guidance plus patch-level semantic propagation), and constructing a new paired multi-view editing dataset for supervision. No load-bearing steps reduce by construction to self-defined quantities, fitted parameters renamed as predictions, or self-citation chains. The abstract and method framing present an independent proposal whose claims rest on experimental validation rather than tautological equivalence to inputs. This matches the reader's assessment of no evident circular reasoning.
Reference graph
Works this paper leans on
- [1] A. Haque, M. Tancik, A. A. Efros, A. Holynski, and A. Kanazawa, "Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 19740–19750, 2023.
- [2] S. Liu, X. Zhang, Z. Zhang, R. Zhang, J.-Y. Zhu, and B. Russell, "Editing Conditional Radiance Fields," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5773–5783, 2021.
- [3] J. Wu, J.-W. Bian, X. Li, G. Wang, I. Reid, P. Torr, and V. A. Prisacariu, "GaussCtrl: Multi-View Consistent Text-Driven 3D Gaussian Splatting Editing," in European Conference on Computer Vision, pp. 55–71, Springer, 2024.
- [4] C. Luo, D. Di, X. Yang, Y. Ma, Z. Xue, W. Chen, X. Gou, and Y. Liu, "TrAME: Trajectory-Anchored Multi-View Editing for Text-Guided 3D Gaussian Manipulation," IEEE Transactions on Multimedia, vol. 27, pp. 2886–2898, 2025.
- [5] D. I. Lee, H. Park, J. Seo, E. Park, H. Park, H. D. Baek, S. Shin, S. Kim, and S. Kim, "EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11135–11145, 2025.
- [6] M. Chen, I. Laina, and A. Vedaldi, "DGE: Direct Gaussian 3D Editing by Consistent Multi-View Editing," in European Conference on Computer Vision, pp. 74–92, Springer, 2024.
- [7] L. Chen, R. Li, G. Zhang, P. Wang, and L. Zhang, "Fast Multi-View Consistent 3D Editing with Video Priors," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 40, pp. 2948–2956, 2026.
- [8] Y. Chen, Z. Chen, C. Zhang, F. Wang, X. Yang, Y. Wang, Z. Cai, L. Yang, H. Liu, and G. Lin, "GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 21476–21485, 2024.
- [9] S. Hong, J. Karras, R. Martin-Brualla, and I. Kemelmacher-Shlizerman, "Perturb-and-Revise: Flexible 3D Editing with Generative Trajectories," in Proceedings of the Computer Vision and Pattern Recognition Conference, pp. 16293–16303, 2025.
- [10] J. Ho, A. Jain, and P. Abbeel, "Denoising Diffusion Probabilistic Models," Advances in Neural Information Processing Systems, vol. 33, pp. 6840–6851, 2020.
- [11] J. Song, C. Meng, and S. Ermon, "Denoising Diffusion Implicit Models," arXiv preprint arXiv:2010.02502, 2020.
- [12] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-Resolution Image Synthesis with Latent Diffusion Models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684–10695, 2022.
- [13] Z. Chen, F. Long, Z. Qiu, T. Yao, W. Zhou, J. Luo, and T. Mei, "Tuning-Free High-Resolution Video Diffusion with Spatial-Temporal Latent Grouping," IEEE Transactions on Multimedia, vol. 28, pp. 42–56, 2026.
- [14] T. Luo, R. Hu, Z. He, G. Jiang, H. Xu, Y. Song, and C.-C. Chang, "DiffW: Multi-Encoder Based on Conditional Diffusion Model for Robust Image Watermarking," IEEE Transactions on Multimedia, vol. 28, pp. 837–852, 2026.
- [15] S. Zheng, J. Jiang, and W. Li, "V-Bridge: Bridging Video Generative Priors to Versatile Few-Shot Image Restoration," arXiv preprint arXiv:2603.13089, 2026.
- [16] H. Chen, B. Du, S. Luo, and W. Hu, "Deep Point Set Resampling via Gradient Fields," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 2913–2930, 2022.
- [17] H. Zhang, T. Wu, and Y. Wei, "Multi-View User Preference Modeling for Personalized Text-to-Image Generation," IEEE Transactions on Multimedia, vol. 27, pp. 3082–3091, 2025.
- [18] Y. Hou, W. Zhang, Z. Zhu, and H. Yu, "CLIP-GAN: Stacking CLIPs and GAN for Efficient and Controllable Text-to-Image Synthesis," IEEE Transactions on Multimedia, vol. 27, pp. 3702–3715, 2025.
- [19] L. Zhang, A. Rao, and M. Agrawala, "Adding Conditional Control to Text-to-Image Diffusion Models," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3836–3847, 2023.
- [20] H. Ye, J. Zhang, S. Liu, X. Han, and W. Yang, "IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models," arXiv preprint arXiv:2308.06721, 2023.
- [21] W. Peebles and S. Xie, "Scalable Diffusion Models with Transformers," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4195–4205, 2023.
- [22] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is All You Need," Advances in Neural Information Processing Systems, vol. 30, 2017.
- [23] S. Zheng and H. Wang, "Free-Merging: Fourier Transform for Efficient Model Merging," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 3863–3873, 2025.
- [24] S. Zheng, H. Wang, and T. Mu, "DCLP: Neural Architecture Predictor with Curriculum Contrastive Learning," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, pp. 17051–17059, 2024.
- [25] M. Albergo and E. Vanden-Eijnden, "Building Normalizing Flows with Stochastic Interpolants," in International Conference on Learning Representations (ICLR), 2023.
- [26] Y. Lipman, R. T. Chen, H. Ben-Hamu, M. Nickel, and M. Le, "Flow Matching for Generative Modeling," in The Eleventh International Conference on Learning Representations, 2023.
- [27] P. Li, B. Du, and W. Hu, "Geometry and Perception Guided Gaussians for Multiview-Consistent 3D Generation from a Single Image," arXiv preprint arXiv:2506.21152, 2025.
- [28] S. Luo and W. Hu, "Diffusion Probabilistic Models for 3D Point Cloud Generation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2837–2845, 2021.
- [29] A. Vahdat, F. Williams, Z. Gojcic, O. Litany, S. Fidler, et al., "LION: Latent Point Diffusion Models for 3D Shape Generation," Advances in Neural Information Processing Systems, vol. 35, pp. 10021–10039, 2022.
- [30] A. Nichol, H. Jun, P. Dhariwal, P. Mishkin, and M. Chen, "Point-E: A System for Generating 3D Point Clouds from Complex Prompts," arXiv preprint arXiv:2212.08751, 2022.
- [31] B. Du, D. Liu, P. Li, and W. Hu, "From Part to Whole: 3D Generative World Model with an Adaptive Structural Hierarchy," arXiv preprint arXiv:2603.21557, 2026.
- [32] B. Du, W. Hu, and R. Liao, "Multi-Scale Latent Point Consistency Models for 3D Shape Generation," arXiv preprint arXiv:2412.19413, 2024.
- [33] B. Du, X. Gao, W. Hu, and R. Liao, "Generative 3D Part Assembly via Part-Whole-Hierarchy Message Passing," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 20850–20859, 2024.
- [34] R. Mokady, A. Hertz, K. Aberman, Y. Pritch, and D. Cohen-Or, "Null-Text Inversion for Editing Real Images Using Guided Diffusion Models," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6038–6047, 2023.
- [35] N. Tumanyan, M. Geyer, S. Bagon, and T. Dekel, "Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1921–1930, 2023.
- [36] X. Hou, X. Zhang, Y. Li, and L. Shen, "TextFace: Text-to-Style Mapping based Face Generation and Manipulation," IEEE Transactions on Multimedia, vol. 25, pp. 3409–3419, 2022.
- [37] K. E. Ak, Y. Sun, and J. H. Lim, "Learning by Imagination: A Joint Framework for Text-Based Image Manipulation and Change Captioning," IEEE Transactions on Multimedia, vol. 25, pp. 3006–3016, 2022.
- [38] L. Gao, Q. Zhao, J. Zhu, S. Su, L. Cheng, and L. Zhao, "From External to Internal: Structuring Image for Text-to-Image Attributes Manipulation," IEEE Transactions on Multimedia, vol. 25, pp. 7248–7261, 2022.
- [39] T. Brooks, A. Holynski, and A. A. Efros, "InstructPix2Pix: Learning to Follow Image Editing Instructions," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 18392–18402, 2023.
- [40] B. F. Labs, S. Batifol, A. Blattmann, F. Boesel, S. Consul, C. Diagne, T. Dockhorn, J. English, Z. English, P. Esser, et al., "FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space," arXiv preprint arXiv:2506.15742, 2025.
- [41] C. Wu, J. Li, J. Zhou, J. Lin, K. Gao, K. Yan, S.-m. Yin, S. Bai, X. Xu, Y. Chen, et al., "Qwen-Image Technical Report," arXiv preprint arXiv:2508.02324, 2025.
- [42] Y. Yu, R. Wu, Y. Men, S. Lu, M. Cui, X. Xie, and C. Miao, "MorphNeRF: Text-Guided 3D-Aware Editing via Morphing Generative Neural Radiance Fields," IEEE Transactions on Multimedia, vol. 26, pp. 8516–8528, 2024.
- [43] B. Du, L. Meng, and W. Hu, "AugGS: Self-Augmented Gaussians with Structural Masks for Sparse-View 3D Reconstruction," arXiv preprint arXiv:2408.04831, 2024.
- [44] B. Kerbl, G. Kopanas, T. Leimkühler, G. Drettakis, et al., "3D Gaussian Splatting for Real-Time Radiance Field Rendering," ACM Transactions on Graphics, vol. 42, no. 4, article 139, 2023.
- [45] H. Lin, S. Chen, J. Liew, D. Y. Chen, Z. Li, G. Shi, J. Feng, and B. Kang, "Depth Anything 3: Recovering the Visual Space from Any Views," arXiv preprint arXiv:2511.10647, 2025.
- [46] L. Ling, Y. Sheng, Z. Tu, W. Zhao, C. Xin, K. Wan, L. Yu, Q. Guo, Z. Yu, et al., "DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-Based 3D Vision," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22160–22169, 2024.
- [47] H. Xia, Y. Fu, S. Liu, and X. Wang, "RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 22378–22389, 2024.
- [48] O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa, et al., "DINOv3," arXiv preprint arXiv:2508.10104, 2025.
- [49] J. T. Barron, B. Mildenhall, D. Verbin, P. P. Srinivasan, and P. Hedman, "Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5470–5479, 2022.
- [50] Y. Yao, Z. Luo, S. Li, J. Zhang, Y. Ren, L. Zhou, T. Fang, and L. Quan, "BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1790–1799, 2020.
- [51] B. Mildenhall, P. P. Srinivasan, R. Ortiz-Cayon, N. K. Kalantari, R. Ramamoorthi, R. Ng, and A. Kar, "Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines," ACM Transactions on Graphics, vol. 38, no. 4, pp. 1–14, 2019.
- [52] R. Gal, O. Patashnik, H. Maron, A. H. Bermano, G. Chechik, and D. Cohen-Or, "StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators," ACM Transactions on Graphics, vol. 41, no. 4, pp. 1–13, 2022.