High-Fidelity Mural Restoration via a Unified Hybrid Mask-Aware Transformer

Chi Zhang; Jincheng Jiang; Qianhao Han; Zheng Zheng

arxiv: 2604.03984 · v1 · submitted 2026-04-05 · 💻 cs.CV

High-Fidelity Mural Restoration via a Unified Hybrid Mask-Aware Transformer

Jincheng Jiang , Qianhao Han , Chi Zhang , Zheng Zheng This is my paper

Pith reviewed 2026-05-13 17:35 UTC · model grok-4.3

classification 💻 cs.CV

keywords mural restorationhybrid transformerimage inpaintingcultural heritagemask-aware filteringdigital restorationdeep learning

0 comments

The pith

The Hybrid Mask-Aware Transformer restores ancient murals by combining local texture modeling with long-range structural inference while preserving undamaged regions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents HMAT, a unified framework for high-fidelity restoration of degraded ancient murals that must reconstruct large missing structures without altering authentic areas. It pairs Mask-Aware Dynamic Filtering for local textures with a Transformer bottleneck for global structure, then adds mask-conditional style fusion to adapt to varied degradation shapes. A Teacher-Forcing Decoder with hard-gated skip connections further enforces fidelity in valid pixels. Tests on the DHMural dataset and a curated Nine-Colored Deer dataset show the method matches or exceeds prior approaches in structural coherence and visual accuracy across different degradation levels.

Core claim

HMAT integrates Mask-Aware Dynamic Filtering for robust local texture modeling, a Transformer bottleneck for long-range structural inference, a mask-conditional style fusion module that dynamically guides generation, and a Teacher-Forcing Decoder with hard-gated skip connections that enforce fidelity in undamaged regions while focusing reconstruction on missing areas.

What carries the argument

The Hybrid Mask-Aware Transformer (HMAT) framework, which uses mask-aware dynamic filtering and a transformer bottleneck together with mask-conditional style fusion and hard-gated skip connections.

Load-bearing premise

The mask-conditional style fusion and hard-gated skip connections will generalize across diverse real-world mural degradation patterns beyond the DHMural and Nine-Colored Deer datasets.

What would settle it

Running HMAT on a fresh collection of murals that exhibit degradation morphologies absent from the two training datasets and observing clear drops in structural coherence or fidelity scores compared with baseline methods.

Figures

Figures reproduced from arXiv: 2604.03984 by Chi Zhang, Jincheng Jiang, Qianhao Han, Zheng Zheng.

**Figure 1.** Figure 1: Overview of the proposed Hybrid Mask-Aware Transformer (HMAT). The core architecture is a unified generator featuring a Hybrid Encoder (MADF + Transformer) for robust feature extraction, Mask-Conditional Style Fusion (SF) to dynamically guide synthesis, and a Teacher-Forcing Decoder (TFD) to enforce absolute historical fidelity in undamaged regions. The resulting structural completion is subsequently proce… view at source ↗

**Figure 2.** Figure 2: Qualitative comparison of style dimensionality configurations on the NineColored Deer dataset. To evaluate capacity distribution, we compare our Baseline (simg = 360, slatent = 180, smask = 64) against Equal Capacity (simg = 180, slatent = 180, smask = 180) and Heavy Semantic Bias (simg = 360, slatent = 64, smask = 16). 4.4 Comparison with State of the Arts Our hybrid framework achieves the best performan… view at source ↗

**Figure 3.** Figure 3: Qualitative comparison with state-of-the-art methods on the DHMural dataset. structures that are difficult for CNNs to recover because of their limited receptive fields. At the same time, its highly uniform style, since all samples are cropped from a single painting, allows the Transformer bottleneck to learn global structural patterns effectively. By combining MADF for local texture preservation with a T… view at source ↗

read the original abstract

Ancient murals are valuable cultural artifacts, but many have suffered severe degradation due to environmental exposure, material aging, and human activity. Restoring these artworks is challenging because it requires both reconstructing large missing structures and strictly preserving authentic, undamaged regions. This paper presents the Hybrid Mask-Aware Transformer (HMAT), a unified framework for high-fidelity mural restoration. HMAT integrates Mask-Aware Dynamic Filtering for robust local texture modeling with a Transformer bottleneck for long-range structural inference. To further address the diverse morphology of degradation, we introduce a mask-conditional style fusion module that dynamically guides the generative process. In addition, a Teacher-Forcing Decoder with hard-gated skip connections is designed to enforce fidelity in valid regions and focus reconstruction on missing areas. We evaluate HMAT on the DHMural dataset and a curated Nine-Colored Deer dataset under varying degradation levels. Experimental results demonstrate that the proposed method achieves competitive performance compared to state-of-the-art approaches, while producing more structurally coherent and visually faithful restorations. These findings suggest that HMAT provides an effective solution for the digital restoration of cultural heritage murals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper builds a hybrid transformer for mural restoration that adds mask-aware filtering and gated decoder skips to preserve real areas while filling damage, but the gains rest on limited dataset tests without clear ablations.

read the letter

The main thing to know is that this work assembles a Hybrid Mask-Aware Transformer for ancient mural restoration. It combines Mask-Aware Dynamic Filtering for local textures, a transformer bottleneck for global structure, mask-conditional style fusion, and a Teacher-Forcing Decoder with hard-gated skips to keep undamaged regions untouched while reconstructing missing parts. That specific mix is presented as new for this domain, and it directly targets the constraint that you cannot alter authentic pigment or structure in cultural artifacts. On the DHMural and Nine-Colored Deer datasets the method reportedly yields more coherent and faithful outputs than prior approaches, which is a reasonable engineering response to the problem. The gated skips and conditional fusion are straightforward ways to enforce fidelity, and the overall framing makes sense for a niche but worthwhile application. The soft spot is the evaluation scope. Results are shown only on those two datasets under controlled degradation levels, with no cross-dataset or out-of-distribution tests reported. Without the actual metrics, baseline comparisons, or ablation numbers visible in the abstract, it is hard to judge how much each module contributes or whether the style fusion overfits to the morphology of these particular murals. Real-world degradations vary in crack patterns, pigment fading, and lighting, so the claimed structural coherence may not transfer as cleanly as hoped. This paper is for researchers in digital heritage preservation or constrained image restoration. A reader working on mask-guided inpainting or transformer applications to art conservation would find the module ideas useful and could adapt them. It deserves peer review. The architecture is motivated by a clear practical need, the claims are falsifiable, and the work is coherent on its own terms even if more diverse testing would strengthen the conclusions.

Referee Report

2 major / 2 minor

Summary. The manuscript presents the Hybrid Mask-Aware Transformer (HMAT) for high-fidelity restoration of degraded ancient murals. It integrates Mask-Aware Dynamic Filtering for local texture modeling with a Transformer bottleneck for long-range structural inference, introduces a mask-conditional style fusion module to dynamically guide generation according to degradation morphology, and proposes a Teacher-Forcing Decoder with hard-gated skip connections to enforce fidelity in valid regions while focusing reconstruction on missing areas. The method is evaluated on the DHMural dataset and a curated Nine-Colored Deer dataset under varying degradation levels, with the central claim that it achieves competitive performance against state-of-the-art approaches while producing more structurally coherent and visually faithful restorations.

Significance. If the empirical results hold, this work offers a targeted advance for digital cultural heritage preservation by providing a unified hybrid architecture that simultaneously handles large missing structures and strict preservation of authentic regions. The mask-aware components and gated skips represent a concrete contribution to conditional image restoration, with potential applicability to other domains involving partial degradation. The paper's emphasis on both local filtering and global Transformer inference is a strength, as is the focus on real cultural artifacts rather than synthetic benchmarks alone.

major comments (2)

[§4] The central empirical claim (abstract and §4) that HMAT produces more structurally coherent restorations rests on evaluation solely on DHMural and Nine-Colored Deer under controlled degradation levels. No cross-dataset, cross-domain, or out-of-distribution tests are reported, which directly bears on whether the mask-conditional style fusion and hard-gated skip connections generalize to arbitrary real-world mural patterns (e.g., different crack topologies or pigment fading). This is a load-bearing gap for the generalization assertion.
[§4] §4 (and associated tables/figures): the abstract asserts competitive results with superior coherence, yet the evaluation description provides no concrete metrics (PSNR, SSIM, LPIPS, or user-study scores), no listed baselines, no ablation tables isolating the contribution of the style fusion or gated skips, and no error analysis. Without these quantitative details, the support for the performance claim cannot be verified.

minor comments (2)

[Abstract] Abstract: consider adding one or two key quantitative results (e.g., average PSNR improvement) to make the 'competitive performance' claim immediately concrete for readers.
[§3] Notation: the distinction between 'mask-conditional style fusion' and 'Mask-Aware Dynamic Filtering' should be clarified with a short equation or diagram reference in §3 to avoid reader confusion about module roles.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the two major comments below and will revise the paper to strengthen the empirical evaluation and generalization analysis.

read point-by-point responses

Referee: [§4] The central empirical claim (abstract and §4) that HMAT produces more structurally coherent restorations rests on evaluation solely on DHMural and Nine-Colored Deer under controlled degradation levels. No cross-dataset, cross-domain, or out-of-distribution tests are reported, which directly bears on whether the mask-conditional style fusion and hard-gated skip connections generalize to arbitrary real-world mural patterns (e.g., different crack topologies or pigment fading). This is a load-bearing gap for the generalization assertion.

Authors: We agree that cross-dataset and out-of-distribution evaluation is necessary to substantiate the generalization of the mask-aware components. In the revision we will add experiments on additional real mural images with varied degradation patterns (e.g., different crack topologies and pigment fading) drawn from public cultural-heritage collections, together with controlled synthetic OOD degradations, to directly test the robustness of the style fusion and gated-skip mechanisms. revision: yes
Referee: [§4] §4 (and associated tables/figures): the abstract asserts competitive results with superior coherence, yet the evaluation description provides no concrete metrics (PSNR, SSIM, LPIPS, or user-study scores), no listed baselines, no ablation tables isolating the contribution of the style fusion or gated skips, and no error analysis. Without these quantitative details, the support for the performance claim cannot be verified.

Authors: We acknowledge that the current presentation of §4 lacks sufficient quantitative detail. The revised manuscript will explicitly report PSNR, SSIM, LPIPS, and user-study scores; list all baselines with implementation details; include ablation tables that isolate the mask-conditional style fusion and hard-gated skip connections; and add a dedicated error-analysis subsection with failure-case visualizations and quantitative breakdown by degradation type. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on independent evaluation

full rationale

The paper describes a neural architecture (HMAT) with modules such as Mask-Aware Dynamic Filtering, mask-conditional style fusion, and hard-gated skip connections, then reports competitive performance on DHMural and Nine-Colored Deer datasets. No equations, derivations, or parameter-fitting steps are present that could reduce a claimed prediction to its own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims are therefore self-contained empirical statements rather than tautological reductions, consistent with a standard computer-vision methods paper.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Only abstract available so ledger is limited to high-level assumptions; model relies on standard deep learning training but introduces no explicit new entities.

free parameters (1)

model hyperparameters
Learning rates, layer counts, and fusion weights are implicitly fitted during training on mural data.

axioms (1)

domain assumption Degradation can be accurately represented by binary masks that separate valid and missing regions
Invoked throughout the mask-aware modules and decoder design.

pith-pipeline@v0.9.0 · 5494 in / 1064 out tokens · 34059 ms · 2026-05-13T17:35:07.524708+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

[1]

Jiang et al

Authors, A.: Nine-colored deer mural dataset.https://drive.google.com/file/ d/163XtOx_0A8bo-oU2piKp_w77f0kaS5du/view?usp=sharing (2026), dataset used for evaluation 12 J. Jiang et al

work page 2026
[2]

ACM Trans

Barnes,C.,Shechtman,E.,Finkelstein,A.,Goldman,D.:Patchmatch:Arandomized correspondence algorithm for structural image editing. ACM Trans. Graph.28 (2009)

work page 2009
[3]

In: Pro- ceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH)

Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: Pro- ceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH). pp. 417–424 (2000)

work page 2000
[4]

IEEE Transactions on Image Processing13(9), 1200–1212 (2004)

Criminisi, A., Perez, P., Toyama, K.: Region filling and object removal by exemplar- based image inpainting. IEEE Transactions on Image Processing13(9), 1200–1212 (2004)

work page 2004
[5]

Pattern Recognition145, 109897 (2024)

Huang, W., Deng, Y., Hui, S., Wu, Y., Zhou, S., Wang, J.: Sparse self-attention transformer for image inpainting. Pattern Recognition145, 109897 (2024)

work page 2024
[6]

Visual Informatics6(1), 1–13 (2022)

Li, M., Wang, Y., Xu, Y.Q.: Computing for chinese cultural heritage. Visual Informatics6(1), 1–13 (2022)

work page 2022
[7]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

Li, W., Lin, Z., Zhou, K., Qi, L., Wang, Y., Jia, J.: Mat: Mask-aware transformer for large hole image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

work page 2022
[8]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., Guo, B.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 10012– 10022 (2021)

work page 2021
[9]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Gool, L.V.: Repaint: Inpainting using denoising diffusion probabilistic models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 11461–11471 (2022)

work page 2022
[10]

In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)

Nazeri, K., Ng, E., Joseph, T., Qureshi, F., Ebrahimi, M.: Edgeconnect: Structure guided image inpainting using edge prediction. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). pp. 3265–3274 (2019)

work page 2019
[11]

International Journal of Computer Vision132(7), 2367–2400 (2024)

Quan, W., Chen, J., Liu, Y., Yan, D.M., Wonka, P.: Deep learning-based image and video inpainting: A survey. International Journal of Computer Vision132(7), 2367–2400 (2024)

work page 2024
[12]

In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed- ical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015. pp. 234–241 (2015)

work page 2015
[13]

In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Shao, H., Xu, Q., Wen, P., Gao, P., Yang, Z., Huang, Q.: Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild . In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 20202–20212. IEEE Computer Society (2023)

work page 2023
[14]

Scientific reports5(1) (2015)

Sun, M., Zhang, D., Wang, Z., Ren, J., Chai, B., Sun, J.: What’s wrong with the murals at the mogao grottoes: a near-infrared hyperspectral imaging method. Scientific reports5(1) (2015)

work page 2015
[15]

In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Suvorov, R., Logacheva, E., Mashikhin, A., Remizova, A., Ashukha, A., Silvestrov, A., Naumov, N., Aliev, H., Chigorin, V.: Resolution-robust large mask inpainting with fourier convolutions. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). pp. 2149–2159 (2022)

work page 2022
[16]

In: Advances in Neural Information Processing Systems (NeurIPS)

Tian, K., Jiang, Y., Yuan, Z., Peng, B., Wang, L.: Visual autoregressive model- ing: Scalable image generation via next-scale prediction. In: Advances in Neural Information Processing Systems (NeurIPS). vol. 37 (2024)

work page 2024
[17]

Pattern Recognition134, 109046 (2023) Hybrid Mask-Aware Transformer for Mural Restoration 13

Xiang, H., Zou, Q., Nawaz, M.A., Huang, X., Zhang, F., Yu, H.: Deep learning for image inpainting: A survey. Pattern Recognition134, 109046 (2023) Hybrid Mask-Aware Transformer for Mural Restoration 13

work page 2023
[18]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.: Free-form image inpainting with gated convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

work page 2019
[19]

In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR) (2018)

Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR) (2018)

work page 2018
[20]

Information Fusion90, 74–94 (2023)

Zhang, X., Zhai, D., Li, T., Zhou, Y., Lin, Y.: Image inpainting based on deep learning: A review. Information Fusion90, 74–94 (2023)

work page 2023
[21]

In: International Conference on Learning Representations (ICLR) (2021)

Zhao, S., Cui, J., Sheng, Y., Dong, Y., Chang, E.I., Chang, Y., et al.: Large scale im- age completion via co-modulated generative adversarial networks. In: International Conference on Learning Representations (ICLR) (2021)

work page 2021
[22]

IEEE Transactions on Image Processing30, 4855–4866 (2021)

Zhu, M., He, D., Li, X., Li, C., Li, F., Liu, X., Ding, E., Zhang, Z.: Image inpainting by end-to-end cascaded refinement with mask awareness. IEEE Transactions on Image Processing30, 4855–4866 (2021)

work page 2021

[1] [1]

Jiang et al

Authors, A.: Nine-colored deer mural dataset.https://drive.google.com/file/ d/163XtOx_0A8bo-oU2piKp_w77f0kaS5du/view?usp=sharing (2026), dataset used for evaluation 12 J. Jiang et al

work page 2026

[2] [2]

ACM Trans

Barnes,C.,Shechtman,E.,Finkelstein,A.,Goldman,D.:Patchmatch:Arandomized correspondence algorithm for structural image editing. ACM Trans. Graph.28 (2009)

work page 2009

[3] [3]

In: Pro- ceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH)

Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: Pro- ceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH). pp. 417–424 (2000)

work page 2000

[4] [4]

IEEE Transactions on Image Processing13(9), 1200–1212 (2004)

Criminisi, A., Perez, P., Toyama, K.: Region filling and object removal by exemplar- based image inpainting. IEEE Transactions on Image Processing13(9), 1200–1212 (2004)

work page 2004

[5] [5]

Pattern Recognition145, 109897 (2024)

Huang, W., Deng, Y., Hui, S., Wu, Y., Zhou, S., Wang, J.: Sparse self-attention transformer for image inpainting. Pattern Recognition145, 109897 (2024)

work page 2024

[6] [6]

Visual Informatics6(1), 1–13 (2022)

Li, M., Wang, Y., Xu, Y.Q.: Computing for chinese cultural heritage. Visual Informatics6(1), 1–13 (2022)

work page 2022

[7] [7]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

Li, W., Lin, Z., Zhou, K., Qi, L., Wang, Y., Jia, J.: Mat: Mask-aware transformer for large hole image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)

work page 2022

[8] [8]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., Guo, B.: Swin transformer: Hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 10012– 10022 (2021)

work page 2021

[9] [9]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Lugmayr, A., Danelljan, M., Romero, A., Yu, F., Timofte, R., Gool, L.V.: Repaint: Inpainting using denoising diffusion probabilistic models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 11461–11471 (2022)

work page 2022

[10] [10]

In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)

Nazeri, K., Ng, E., Joseph, T., Qureshi, F., Ebrahimi, M.: Edgeconnect: Structure guided image inpainting using edge prediction. In: 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). pp. 3265–3274 (2019)

work page 2019

[11] [11]

International Journal of Computer Vision132(7), 2367–2400 (2024)

Quan, W., Chen, J., Liu, Y., Yan, D.M., Wonka, P.: Deep learning-based image and video inpainting: A survey. International Journal of Computer Vision132(7), 2367–2400 (2024)

work page 2024

[12] [12]

In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015

Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomed- ical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015. pp. 234–241 (2015)

work page 2015

[13] [13]

In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV)

Shao, H., Xu, Q., Wen, P., Gao, P., Yang, Z., Huang, Q.: Building Bridge Across the Time: Disruption and Restoration of Murals In the Wild . In: 2023 IEEE/CVF International Conference on Computer Vision (ICCV). pp. 20202–20212. IEEE Computer Society (2023)

work page 2023

[14] [14]

Scientific reports5(1) (2015)

Sun, M., Zhang, D., Wang, Z., Ren, J., Chai, B., Sun, J.: What’s wrong with the murals at the mogao grottoes: a near-infrared hyperspectral imaging method. Scientific reports5(1) (2015)

work page 2015

[15] [15]

In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Suvorov, R., Logacheva, E., Mashikhin, A., Remizova, A., Ashukha, A., Silvestrov, A., Naumov, N., Aliev, H., Chigorin, V.: Resolution-robust large mask inpainting with fourier convolutions. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). pp. 2149–2159 (2022)

work page 2022

[16] [16]

In: Advances in Neural Information Processing Systems (NeurIPS)

Tian, K., Jiang, Y., Yuan, Z., Peng, B., Wang, L.: Visual autoregressive model- ing: Scalable image generation via next-scale prediction. In: Advances in Neural Information Processing Systems (NeurIPS). vol. 37 (2024)

work page 2024

[17] [17]

Pattern Recognition134, 109046 (2023) Hybrid Mask-Aware Transformer for Mural Restoration 13

Xiang, H., Zou, Q., Nawaz, M.A., Huang, X., Zhang, F., Yu, H.: Deep learning for image inpainting: A survey. Pattern Recognition134, 109046 (2023) Hybrid Mask-Aware Transformer for Mural Restoration 13

work page 2023

[18] [18]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.: Free-form image inpainting with gated convolution. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)

work page 2019

[19] [19]

In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR) (2018)

Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR) (2018)

work page 2018

[20] [20]

Information Fusion90, 74–94 (2023)

Zhang, X., Zhai, D., Li, T., Zhou, Y., Lin, Y.: Image inpainting based on deep learning: A review. Information Fusion90, 74–94 (2023)

work page 2023

[21] [21]

In: International Conference on Learning Representations (ICLR) (2021)

Zhao, S., Cui, J., Sheng, Y., Dong, Y., Chang, E.I., Chang, Y., et al.: Large scale im- age completion via co-modulated generative adversarial networks. In: International Conference on Learning Representations (ICLR) (2021)

work page 2021

[22] [22]

IEEE Transactions on Image Processing30, 4855–4866 (2021)

Zhu, M., He, D., Li, X., Li, C., Li, F., Liu, X., Ding, E., Zhang, Z.: Image inpainting by end-to-end cascaded refinement with mask awareness. IEEE Transactions on Image Processing30, 4855–4866 (2021)

work page 2021