Geometric Coastline Localization using Vision-Language Models

Bernhard Pfahringer; Eibe Frank; Karin Bryan; Mark Dickson; Rafia Malik

arxiv: 2606.10468 · v1 · pith:KVXJICY2new · submitted 2026-06-09 · 💻 cs.CV


Pith reviewed 2026-06-27 14:08 UTC · model grok-4.3

classification 💻 cs.CV

keywords coastline detection · vision-language models · geometric boundary localization · remote sensing · polyline prediction · Hausdorff distance · Earth Mover's Distance


The pith

Coastline extraction improves when formulated as direct polyline prediction by a vision-language model rather than pixel segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that treating coastline detection as geometric boundary localization, instead of pixel-wise segmentation, better matches how coastlines are actually defined by geomorphic proxies and used in coastal change analysis. It builds CoastlineVLM-7B on a GeoChat-7B base; the model detects coastline presence, classifies the proxy type, and grounds the coastline as a polyline in high-resolution aerial imagery. Under strict one-pixel boundary supervision on the New Zealand Coastal Change Dataset (NZCCD), it achieves lower Hausdorff distance (31.84 m vs 37.74 m) and Earth Mover's Distance (17.32 m vs 21.12 m) than segmentation baselines. The paper also argues that geometry-based metrics are better suited than overlap metrics such as IoU for assessing coastline localization.

Core claim

Formulating the task as geometric boundary localization and training a vision-language model to output polylines directly produces better global geometric alignment with reference coastlines than mask-based segmentation approaches, as measured by reduced Hausdorff and Earth Mover's distances on the NZCCD under strict one-pixel boundary supervision.
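To make the headline metric concrete, here is a minimal Python sketch of a symmetric Hausdorff distance between two polylines. It is not the paper's evaluation code: the resampling step, the function names, and the toy coordinates (assumed to be in metres) are all illustrative assumptions.

```python
import math


def resample(polyline, step):
    """Return points spaced at most `step` apart along a polyline, vertices included."""
    points = [polyline[0]]
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(seg_len / step))
        for i in range(1, n + 1):
            t = i / n
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return points


def directed_hausdorff(a, b):
    """Largest distance from any point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def polyline_hausdorff(a, b, step=1.0):
    """Symmetric Hausdorff distance, approximated on densely resampled polylines."""
    pa, pb = resample(a, step), resample(b, step)
    return max(directed_hausdorff(pa, pb), directed_hausdorff(pb, pa))


# Toy coordinates in metres (assumed); not data from the paper.
reference = [(0.0, 0.0), (100.0, 0.0)]
offset = [(0.0, 3.0), (100.0, 3.0)]                                    # uniform 3 m offset
excursion = [(0.0, 0.0), (40.0, 0.0), (50.0, 12.0), (60.0, 0.0), (100.0, 0.0)]

print(polyline_hausdorff(reference, offset))     # 3.0
print(polyline_hausdorff(reference, excursion))  # 12.0
```

Because the Hausdorff distance is a maximum, a single 12 m excursion sets the score for the whole line even though most of the prediction lies exactly on the reference. That is why it reads as a measure of worst-case alignment.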

What carries the argument

CoastlineVLM-7B, a vision-language model on the GeoChat-7B/LLaVA-1.5 architecture that jointly performs presence detection, proxy-type classification, and coastline grounding to predict a polyline.

If this is right

Geometry-based evaluation metrics should be prioritized over IoU when assessing coastline localization quality.
Direct polyline output aligns the learning objective with the geomorphic proxies used in coastal monitoring.
Vision-language models can incorporate semantic reasoning for proxy classification alongside geometric grounding.
Output representation choice affects how well automated methods match operational coastal change workflows.
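The first point can be illustrated with a toy case. The sketch below, which assumes hypothetical one-pixel-wide boundaries stored as sets of pixel coordinates, shows why IoU carries little information for thin lines: a prediction that misses by one pixel and one that misses by thirty pixels both score an IoU of zero, while a distance metric separates them.

```python
import math


def iou(a, b):
    """Intersection over Union of two pixel sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two pixel sets, in pixels."""
    def directed(s, t):
        return max(min(math.dist(p, q) for q in t) for p in s)
    return max(directed(a, b), directed(b, a))


# Hypothetical one-pixel-wide horizontal boundaries (row index is the y value).
reference = [(x, 10) for x in range(50)]
near_miss = [(x, 11) for x in range(50)]   # shifted by 1 pixel
far_miss = [(x, 40) for x in range(50)]    # shifted by 30 pixels

print(iou(reference, near_miss), hausdorff(reference, near_miss))  # 0.0 1.0
print(iou(reference, far_miss), hausdorff(reference, far_miss))    # 0.0 30.0
```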

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same direct-polyline formulation could apply to other linear boundary tasks such as river banks or road edges in remote sensing.
Adding textual prompts describing expected proxy types might further improve generalization across different coastal environments.
If the VLM's proxy classification proves robust, it could enable automated proxy selection without manual post-processing steps.

Load-bearing premise

That geometry-based metrics such as Hausdorff distance and Earth Mover's Distance are more appropriate than pixel-overlap metrics for judging operational coastline quality, and that the model's detection, classification, and grounding steps generalize beyond the NZCCD training distribution.
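The two geometry metrics named in this premise measure different things, which a small sketch can show. The code below computes an Earth Mover's Distance between two equal-size point sets with uniform weights by brute force over all one-to-one matchings. This is an assumption made for illustration: the paper's exact EMD formulation is not given here, and the brute-force search is only practical for a handful of points (an assignment solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice at scale).

```python
import itertools
import math


def emd_uniform(a, b):
    """EMD between equal-size point sets with uniform weights (brute-force matching)."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size point sets")
    n = len(a)
    best_total = min(
        sum(math.dist(a[i], b[j]) for i, j in enumerate(perm))
        for perm in itertools.permutations(range(n))
    )
    return best_total / n


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    def directed(s, t):
        return max(min(math.dist(p, q) for q in t) for p in s)
    return max(directed(a, b), directed(b, a))


# Toy vertices in metres (assumed); the last predicted vertex is an outlier.
reference = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0), (30.0, 0.0)]
predicted = [(0.0, 2.0), (10.0, 2.0), (20.0, 2.0), (30.0, 10.0)]

print(emd_uniform(reference, predicted))  # 4.0
print(hausdorff(reference, predicted))    # 10.0
```

In this toy case the EMD averages the displacement over all matched points, while the Hausdorff distance reports only the worst one. Reporting both, as the paper does, therefore covers typical and worst-case misalignment.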

What would settle it

Evaluation on an independent coastal dataset from another region where the VLM shows no reduction in Hausdorff or EMD distances, or where IoU better correlates with field-measured change accuracy.

Figures

Figures reproduced from arXiv: 2606.10468 by Bernhard Pfahringer, Eibe Frank, Karin Bryan, Mark Dickson, Rafia Malik.

**Figure 2.** Overview of the CoastlineVLM-7B framework. Input aerial images are encoded using a CLIP-based vision encoder, and the resulting visual tokens are ...

**Figure 3.** Normalized confusion matrix for proxy-type classification (Task II).

**Figure 4.** Comparison of geometric localization distance metrics from Table 5.

**Figure 5.** Distribution of coastline proxy types in the training dataset before ...

**Figure 7.** Qualitative comparison across challenging coastal environments in the West Coast test set. Columns are grouped by common failure modes: road adjacency ...

Original abstract

Coastline detection in remote sensing imagery is commonly formulated as a pixel-wise segmentation problem, where the final coastline is extracted from a predicted mask through post-processing. This formulation relegates coastline geometry, the primary representation used in coastal change analysis, to a secondary artifact rather than the learning objective. In practice, coastlines are defined by geomorphic proxies such as vegetation lines, dune toes, or cliff edges, rather than an instantaneous land-water boundary often used in pixel-based segmentation approaches. In this work, we revisit coastline extraction from a representation perspective and formulate the task as geometric boundary localization. We use the New Zealand Coastal Change Dataset (NZCCD) and high-resolution aerial imagery from Land Information New Zealand (LINZ) to develop CoastlineVLM-7B, a vision-language model (VLM) built on the GeoChat-7B/LLaVA-1.5 architecture that jointly performs coastline presence detection, proxy-type classification, and coastline grounding. The model directly predicts a coastline as a polyline rather than a dense segmentation mask. We evaluate CoastlineVLM-7B against segmentation baselines under strict one-pixel boundary supervision. Results show that geometry-based metrics are more suitable for assessing coastline localization quality than pixel-overlap metrics such as Intersection over Union (IoU). CoastlineVLM-7B improves global geometric alignment with reference coastlines, reducing Hausdorff distance from 37.74 m to 31.84 m and Earth Mover's Distance from 21.12 m to 17.32 m. These results indicate that output representation is a critical design choice in coastline extraction, and that geometry-oriented learning, combined with the semantic reasoning capabilities of vision-language models, aligns well with how coastlines are defined and evaluated in operational coastal monitoring.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gets a 7B VLM to output coastlines directly as polylines and reports better Hausdorff and EMD numbers than segmentation baselines, but the comparison does not isolate the representation change from model capacity.


The main takeaway is that reframing coastline extraction as joint VLM-based proxy classification plus polyline grounding produces lower geometric distances on the NZCCD dataset than standard segmentation approaches. CoastlineVLM-7B reduces Hausdorff from 37.74 m to 31.84 m and EMD from 21.12 m to 17.32 m under one-pixel boundary supervision.

What the work actually does is take the existing GeoChat/LLaVA backbone and add explicit detection, proxy-type classification, and direct polyline prediction in one forward pass. This matches the geomorphic definitions used in coastal monitoring more closely than land-water masks, and the choice of geometry metrics follows from that. The empirical numbers are the concrete part of the contribution.

The soft spot is the baseline comparison. The paper pits a 7B-parameter VLM against segmentation models without matching parameter counts, pretraining, or training protocols, so the gains cannot be cleanly attributed to the polyline output rather than scale or semantic pretraining. The claim that geometry metrics are more suitable also rests on the alignment argument rather than a direct test against operational correction effort.

This is for people working on high-resolution coastal remote sensing who already use or are considering VLMs. A reader focused on representation choices in applied vision tasks will see a practical example, but the paper does not claim broader impact.

It deserves peer review because the task reformulation is clear and the reported metrics are specific, even though the controls need tightening in revision.

Referee Report

3 major / 2 minor

Summary. The paper claims that reformulating coastline detection as direct geometric polyline prediction via a 7B-parameter VLM (CoastlineVLM-7B, built on GeoChat/LLaVA) yields superior global alignment to reference coastlines compared to segmentation baselines, with Hausdorff distance reduced from 37.74 m to 31.84 m and EMD from 21.12 m to 17.32 m on NZCCD/LINZ imagery under one-pixel boundary supervision. It further argues that geometry-based metrics are more suitable than IoU for evaluation and that output representation is a critical design choice, enabled by the VLM's joint detection-classification-grounding capabilities.

Significance. If the central empirical comparison can be made robust, the work provides evidence that aligning the learning objective with polyline geometry (rather than post-processed masks) better matches operational coastal monitoring practices that rely on geomorphic proxies. The concrete metric gains on held-out high-resolution aerial imagery and the use of semantic reasoning in a VLM are strengths that could inform representation choices in remote-sensing tasks.

major comments (3)

[Experiments] Experiments section: the segmentation baselines are not described with parameter counts, architectures, or training protocols matched to the 7B VLM (including its vision encoder and language-based proxy steps), so the reported Hausdorff/EMD gains cannot be isolated to the polyline formulation versus differences in model scale or pretraining. This directly undermines the claim that output representation is the critical factor.
[Results] Results section: no statistical significance testing, confidence intervals, or variance estimates accompany the metric improvements (37.74 m to 31.84 m Hausdorff; 21.12 m to 17.32 m EMD), and baseline hyperparameter/training details are absent, weakening the quantitative support for the central claim.
[Discussion] Discussion section: the assertion that geometry-based metrics are more suitable than pixel-overlap metrics rests on qualitative argument rather than a quantitative validation (e.g., correlation with operational coastal-change metrics or expert utility), which is load-bearing for the recommendation to prefer them.

minor comments (2)

[Methods] Clarify in the methods how the VLM's polyline output is evaluated under the same 'strict one-pixel boundary supervision' applied to segmentation masks.
The abstract and results would benefit from explicit statement of the number of test images and any cross-validation procedure used for the held-out evaluation.
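The first minor comment asks how a polyline can be compared under the same one-pixel boundary protocol as a mask. The paper's procedure is not described here, so the following is only one plausible approach, assumed for illustration: rasterize the predicted polyline into a one-pixel-wide set of pixels with Bresenham's line algorithm, so it can be compared with a one-pixel reference boundary. The function names and integer pixel coordinates are hypothetical.

```python
def rasterize_segment(x0, y0, x1, y1):
    """Bresenham's line algorithm: integer pixels from (x0, y0) to (x1, y1)."""
    pixels = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # step along x
            err += dy
            x0 += sx
        if e2 <= dx:   # step along y
            err += dx
            y0 += sy
    return pixels


def rasterize_polyline(vertices):
    """Union of the rasterized segments of a polyline with integer pixel vertices."""
    pixels = set()
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        pixels.update(rasterize_segment(x0, y0, x1, y1))
    return pixels


# Hypothetical predicted polyline in pixel coordinates.
boundary = rasterize_polyline([(0, 0), (3, 0), (3, 2)])
print(sorted(boundary))
# [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
```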

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our results. We respond to each major comment below and indicate planned revisions.


Referee: [Experiments] Experiments section: the segmentation baselines are not described with parameter counts, architectures, or training protocols matched to the 7B VLM (including its vision encoder and language-based proxy steps), so the reported Hausdorff/EMD gains cannot be isolated to the polyline formulation versus differences in model scale or pretraining. This directly undermines the claim that output representation is the critical factor.

Authors: We will expand the Experiments section to fully describe the segmentation baselines, including their architectures (e.g., specific CNN variants), approximate parameter counts, training protocols, hyperparameters, and how they were adapted to the one-pixel boundary supervision on NZCCD/LINZ data. While exact matching of scale and pretraining to the 7B VLM is not feasible given the architectural differences (standard segmentation networks versus a VLM with language-based proxy classification), we will explicitly discuss these differences as a potential confounding factor and clarify that the comparison highlights the benefit of direct polyline prediction enabled by the VLM's joint capabilities. This addresses the isolation concern without overstating the representation effect alone.
revision: partial
Referee: [Results] Results section: no statistical significance testing, confidence intervals, or variance estimates accompany the metric improvements (37.74 m to 31.84 m Hausdorff; 21.12 m to 17.32 m EMD), and baseline hyperparameter/training details are absent, weakening the quantitative support for the central claim.

Authors: We will add the baseline hyperparameter and training details in the revised Experiments section. For the metric improvements, we will include confidence intervals computed over the test set images and, where computationally feasible, report results from multiple training seeds or cross-validation folds to provide variance estimates. If full statistical significance testing (e.g., paired t-tests) cannot be performed without additional runs, we will note this limitation explicitly while still reporting the observed differences with the available data.
revision: yes
Referee: [Discussion] Discussion section: the assertion that geometry-based metrics are more suitable than pixel-overlap metrics rests on qualitative argument rather than a quantitative validation (e.g., correlation with operational coastal-change metrics or expert utility), which is load-bearing for the recommendation to prefer them.

Authors: We will strengthen the Discussion by adding references to how operational coastal monitoring (e.g., in NZCCD documentation) prioritizes geomorphic proxies and geometric representations over instantaneous pixel boundaries. We will also attempt a quantitative check by correlating Hausdorff/EMD values with known coastal change indicators in the dataset; if this correlation analysis is limited by data availability, we will acknowledge the primarily qualitative basis of the argument and frame the preference for geometry metrics as aligned with domain practice rather than as a fully validated superiority claim.
revision: partial
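The confidence intervals promised in the second response could be computed in several ways. One common option, sketched below under stated assumptions, is a paired bootstrap over per-image metric differences between the baseline and the VLM. The per-image values are synthetic placeholders, not results from the paper, and the function name and defaults are hypothetical.

```python
import random
import statistics


def paired_bootstrap_ci(baseline, model, n_boot=2000, alpha=0.05, seed=0):
    """Mean per-image improvement (baseline - model) with a percentile bootstrap CI."""
    if len(baseline) != len(model):
        raise ValueError("paired data must have the same length")
    diffs = [b - m for b, m in zip(baseline, model)]
    rng = random.Random(seed)
    n = len(diffs)
    # Resample images with replacement and record the mean difference each time.
    means = sorted(statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot))
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return statistics.fmean(diffs), means[lo_idx], means[hi_idx]


# Synthetic per-image Hausdorff distances in metres (illustrative only).
baseline_errors = [12.0, 15.0, 9.0, 14.0, 11.0, 13.0, 10.0, 16.0]
model_errors = [10.0, 12.0, 8.0, 11.0, 10.0, 9.0, 9.0, 13.0]

mean_gain, ci_low, ci_high = paired_bootstrap_ci(baseline_errors, model_errors)
print(f"mean improvement {mean_gain:.2f} m, 95% CI [{ci_low:.2f}, {ci_high:.2f}] m")
```

Pairing by image matters here: both methods are scored on the same test images, so resampling the per-image differences removes variation in image difficulty from the interval. An interval that excludes zero would support the claim that the improvement is not an artifact of which images happened to be in the test set.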

Circularity Check

0 steps flagged

No circularity: purely empirical comparison with no derivations or self-referential reductions


The paper contains no equations, derivations, or parameter-fitting steps that could reduce to inputs by construction. It reports direct empirical results from training and evaluating CoastlineVLM-7B on held-out NZCCD imagery, measuring Hausdorff and EMD distances against segmentation baselines. The central claim rests on these held-out metric comparisons rather than any self-definitional loop, fitted-input prediction, or self-citation chain. Model architecture is referenced to external GeoChat/LLaVA work with no overlapping authors, and no uniqueness theorems or ansatzes are invoked. This is a standard empirical ML evaluation paper whose results are falsifiable on external data.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard assumptions of supervised fine-tuning of large vision-language models and the representativeness of the NZCCD dataset for New Zealand coastal conditions; no new physical constants or invented entities are introduced.

axioms (2)

domain assumption: Fine-tuned VLMs can reliably perform joint classification and spatial grounding tasks when trained on paired image-text data. Invoked by the choice of LLaVA-1.5/GeoChat base and the joint training objective described in the abstract.
domain assumption: One-pixel boundary supervision is sufficient to train and evaluate polyline outputs against reference coastlines. Stated as the evaluation protocol in the abstract.



Reference graph

Works this paper leans on

36 extracted references · 20 canonical work pages

[1] E. H. Boak, I. L. Turner, Shoreline definition and detection: A review, Journal of Coastal Research 21 (4) (2005) 688–703
[2] E. R. Thieler, E. A. Himmelstoss, J. L. Zichichi, A. Ergul, Digital shoreline analysis system (dsas) version 4.0—an arcgis extension for calculating shoreline change, Tech. Rep. 2008-1278, U.S. Geological Survey (2009). doi:10.3133/ofr20081278
[3] P. Scala, G. Manno, G. Ciraolo, Coastal dynamics analyzer (cda): A qgis plugin for transect based analysis of coastal erosion, SoftwareX 28 (2024) 101894. doi:10.1016/j.softx.2024.101894. URL https://www.sciencedirect.com/science/article/pii/S2352711024002644
[4] W. Sun, C. Chen, W. Liu, G. Yang, X. Meng, L. Wang, K. Ren, Coastline extraction using remote sensing: A review, GIScience & Remote Sensing 60 (1) (2023) 2243671. doi:10.1080/15481603.2023.2243671
[5] S. Toure, O. Diop, K. Kpalma, A. S. Maiga, Shoreline detection using optical remote sensing: A review, ISPRS International Journal of Geo-Information 8 (2) (2019) 75. doi:10.3390/ijgi8020075
[6] Z. Yang, G. Wang, L. Feng, Y. Wang, G. Wang, S. Liang, A transformer model for coastline prediction in weitou bay, china, Remote Sensing 15 (19) (2023) 4771. doi:10.3390/rs15194771
[7] R. Gens, Remote sensing of coastlines: Detection, extraction and monitoring, International Journal of Remote Sensing 31 (7) (2010) 1819–1836. doi:10.1080/01431160902926673
[8] L. Tang, T. Ai, M. Yang, J. Stoter, An adaptive simplification method for coastlines using bridging skeleton lines, ISPRS International Journal of Geo-Information 13 (5) (2024) 155. doi:10.3390/ijgi13050155
[9] J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431–3440
[10] O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234–241
[11] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, H. Adam, Encoder-decoder with atrous separable convolution for semantic image segmentation, in: Proceedings of the European conference on computer vision (ECCV), 2018, pp. 801–818
[12] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, P. Luo, Segformer: Simple and efficient design for semantic segmentation with transformers, NeurIPS (2021)
[13] M. Tuck, M. Dickson, E. Ryan, M. Ford, T. Konlechner, A national scale coastal change dataset for aotearoa new zealand, Data in Brief 57 (2024) 111104. doi:10.1016/j.dib.2024.111104
[14] Department of Energy, Environment and Climate Action (Victoria), Vcmp sites – shorelines, victorian Coastal Monitoring Program (VCMP) shoreline dataset. Accessed 10 March 2026 (2026). URL https://discover.data.vic.gov.au/dataset/vcmp-sites-shorelines
[15] K. Kuckreja, M. S. Danish, M. Naseer, A. Das, S. Khan, F. S. Khan, Geochat: Grounded large vision-language model for remote sensing, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 27831–27840. doi:10.1109/CVPR52733.2024.02629
[16] H. Liu, C. Li, Q. Wu, Y. J. Lee, Visual instruction tuning, Advances in neural information processing systems 36 (2023) 34892–34916
[17] H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, H. C. Wolf, Parametric correspondence and chamfer matching: Two new techniques for image matching, Proceedings of the 5th International Joint Conference on Artificial Intelligence (1977) 659–663
[18] D. P. Huttenlocher, G. A. Klanderman, W. J. Rucklidge, Comparing images using the Hausdorff distance, IEEE Transactions on Pattern Analysis and Machine Intelligence 15 (9) (1993) 850–863. doi:10.1109/34.232073
[19] T. Eiter, H. Mannila, Computing discrete fréchet distance, Tech. Rep. CD-TR 94/64, Technical University of Vienna (1994). URL https://www.kr.tuwien.ac.at/staff/eiter/et-archive/files/cdtr9464.pdf
[20] Y. Rubner, C. Tomasi, L. J. Guibas, The earth mover’s distance as a metric for image retrieval, International Journal of Computer Vision 40 (2) (2000) 99–121. doi:10.1023/A:1026543900054
[21] K. Vos, M. D. Harley, K. D. Splinter, J. A. Simmons, I. L. Turner, Coastsat: A google earth engine-enabled python toolkit to extract shorelines from publicly available satellite imagery, Environmental Modelling & Software 122 (2019) 104528
[22] M. S. J. Rogers, M. Bithell, S. M. Brooks, T. Spencer, Vedge_detector: automated coastal vegetation edge detection using a convolutional neural network, International Journal of Remote Sensing 42 (13) (2021) 4805–4835. doi:10.1080/01431161.2021.1897185. URL https://doi.org/10.1080/01431161.2021.1897185
[23] G. Mattyus, W. Luo, R. Urtasun, Deeproadmapper: Extracting road topology from aerial images, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 3438–3446
[24] F. Bastani, S. He, S. Abbar, M. Alizadeh, H. Balakrishnan, S. Chawla, S. Madden, D. DeWitt, Roadtracer: Automatic extraction of road networks from aerial images, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 4720–4728. doi:10.1109/CVPR.2018.00496
[25] D. Acuna, H. Ling, A. Kar, S. Fidler, Efficient interactive annotation of segmentation datasets with polygon-rnn++, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 859–
[26] doi:10.1109/CVPR.2018.00096
[27] L. Castrejon, J. Pont-Tuset, J. T. Barron, F. Marques, J. Malik, Polygon-rnn: Annotating object instances with a polygon, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5230–5238
[28] S. Wei, S. Ji, M. Lu, From lines to polygons: Polygonal building contour extraction from high-resolution remote sensing imagery, ISPRS Journal of Photogrammetry and Remote Sensing 210 (2024) 107–121. doi:10.1016/j.isprsjprs.2024.02.006
[29] F. Isikdogan, A. C. Bovik, P. Passalacqua, Surface water mapping by deep learning, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 10 (11) (2017) 4909–4918
[30] G. Tetteh, V. Efremov, N. D. Forkert, M. Schneider, J. Kirschke, B. Weber, C. Zimmer, M. Piraud, B. H. Menze, Deepvesselnet: Vessel segmentation, centerline prediction, and bifurcation detection in 3-d angiographic volumes, Frontiers in Neuroscience Volume 14 - 2020 (2020). doi:10.3389/fnins.2020.592352. URL https://www.frontiersin.org/journals/neu...
[31] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever, Learning transferable visual models from natural language supervision, in: M. Meila, T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning, Vol. 139 of Proceedings of Machine Learning Re...
[32] W. Zhang, M. Cai, T. Zhang, Y. Zhuang, X. Mao, Earthgpt: A universal multimodal large language model for multi-sensor image comprehension in remote sensing domain, IEEE Transactions on Geoscience and Remote Sensing 62 (2024) 1–20. doi:10.1109/TGRS.2024.3409624
[33] Y. Zhan, Z. Xiong, Y. Yuan, Skyeyegpt: Unifying remote sensing vision-language tasks via instruction tuning with large language model, ISPRS Journal of Photogrammetry and Remote Sensing 221 (2025) 64–77. doi:10.1016/j.isprsjprs.2025.01.020. URL https://www.sciencedirect.com/science/article/pii/S0924271625000206
[34] Land Information New Zealand, Linz data service: Open geospatial data portal, accessed: 2024 (2024). URL https://data.linz.govt.nz
[35] C. Ortega-Adame, M. Gonzalez-Audicana, A. Salinas-Castillo, et al., Swed: A benchmark dataset for semantic segmentation of water bodies from sentinel-2 imagery, Remote Sensing of Environment 273 (2022) 112859. doi:10.1016/j.rse.2022.112859
[36] D. H. Douglas, T. K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, The Canadian Cartographer 10 (2) (1973) 112–122. doi:10.3138/FM57-6770-U75U-7727

[1] [1]

E. H. Boak, I. L. Turner, Shoreline definition and detection: A review, Journal of Coastal Research 21 (4) (2005) 688– 703

2005

[2] [2]

E. R. Thieler, E. A. Himmelstoss, J. L. Zichichi, A. Er- gul, Digital shoreline analysis system (dsas) version 4.0—an arcgis extension for calculating shoreline change, Tech. Rep. 2008-1278, U.S. Geological Survey (2009). doi:10.3133/ofr20081278

work page doi:10.3133/ofr20081278 2008

[3] [3]

Learning and Instruction , author =

P. Scala, G. Manno, G. Ciraolo, Coastal dynamics an- alyzer (cda): A qgis plugin for transect based analy- sis of coastal erosion, SoftwareX 28 (2024) 101894. doi:https://doi.org/10.1016/j.softx.2024.101894. URL https://www.sciencedirect.com/science/ article/pii/S2352711024002644

work page doi:10.1016/j.softx.2024.101894 2024

[4] [4]

W. Sun, C. Chen, W. Liu, G. Yang, X. Meng, L. Wang, K. Ren, Coastline extraction using remote sensing: A re- view, GIScience & Remote Sensing 60 (1) (2023) 2243671. doi:10.1080/15481603.2023.2243671

work page doi:10.1080/15481603.2023.2243671 2023

[5] [5]

Toure, O

S. Toure, O. Diop, K. Kpalma, A. S. Maiga, Shoreline detection using optical remote sensing: A review, ISPRS International Journal of Geo-Information 8 (2) (2019) 75. doi:10.3390/ijgi8020075

work page doi:10.3390/ijgi8020075 2019

[6] [6]

Z. Yang, G. Wang, L. Feng, Y . Wang, G. Wang, S. Liang, A transformer model for coastline prediction in weitou bay, china, Remote Sensing 15 (19) (2023) 4771. doi:10.3390/rs15194771

work page doi:10.3390/rs15194771 2023

[7] [7]

Gens, Remote sensing of coastlines: Detec- tion, extraction and monitoring, International Jour- nal of Remote Sensing 31 (7) (2010) 1819–1836

R. Gens, Remote sensing of coastlines: Detec- tion, extraction and monitoring, International Jour- nal of Remote Sensing 31 (7) (2010) 1819–1836. doi:10.1080/01431160902926673

work page doi:10.1080/01431160902926673 2010

[8] [8]

L. Tang, T. Ai, M. Yang, J. Stoter, An adaptive simplifica- tion method for coastlines using bridging skeleton lines, ISPRS International Journal of Geo-Information 13 (5) (2024) 155. doi:10.3390/ijgi13050155

work page doi:10.3390/ijgi13050155 2024

[9] [9]

J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431–3440

2015

[10] [10]

Ronneberger, P

O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: Inter- national Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234– 241

2015

[11] [11]

L.-C. Chen, Y . Zhu, G. Papandreou, F. Schroff, H. Adam, Encoder-decoder with atrous separable convolution for semantic image segmentation, in: Proceedings of the Eu- ropean conference on computer vision (ECCV), 2018, pp. 801–818

2018

[12] [12]

E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, P. Luo, Segformer: Simple and efficient design for seman- tic segmentation with transformers, NeurIPS (2021)

2021

[13] [13]

M. Tuck, M. Dickson, E. Ryan, M. Ford, T. Kon- lechner, A national scale coastal change dataset for aotearoa new zealand, Data in Brief 57 (2024) 111104. doi:https://doi.org/10.1016/j.dib.2024.111104

work page doi:10.1016/j.dib.2024.111104 2024

[14] [14]

Accessed 10 March 2026 (2026)

Department of Energy, Environment and Climate Action (Victoria), Vcmp sites – shorelines, victorian Coastal Mon- itoring Program (VCMP) shoreline dataset. Accessed 10 March 2026 (2026). URL https://discover.data.vic.gov.au/datas et/vcmp-sites-shorelines

2026

[15] [15]

In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

K. Kuckreja, M. S. Danish, M. Naseer, A. Das, S. Khan, F. S. Khan, Geochat: Grounded large vision- language model for remote sensing, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 27831–27840. doi:10.1109/CVPR52733.2024.02629

work page doi:10.1109/cvpr52733.2024.02629 2024

[16] [16]

H. Liu, C. Li, Q. Wu, Y . J. Lee, Visual instruction tuning, Advances in neural information processing systems 36 (2023) 34892–34916

2023

[17] [17]

H. G. Barrow, J. M. Tenenbaum, R. C. Bolles, H. C. Wolf, Parametric correspondence and chamfer matching: Two new techniques for image matching, Proceedings of the 5th International Joint Conference on Artificial Intelligence (1977) 659–663

1977

[18] [18]

D. P. Huttenlocher, G. A. Klanderman, W. J. Rucklidge, Comparing images using the hausdorffdistance, IEEE Transactions on Pattern Analysis and Machine Intelligence 15 (9) (1993) 850–863. doi:10.1109/34.232073

work page doi:10.1109/34.232073 1993

[19] [19]

Eiter, H

T. Eiter, H. Mannila, Computing discrete fréchet distance, Tech. Rep. CD-TR 94/64, Technical University of Vienna (1994). URL https://www.kr.tuwien.ac.at/staff/eite r/et-archive/files/cdtr9464.pdf 10

1994

[20] [20]

Rubner, C

Y . Rubner, C. Tomasi, L. J. Guibas, The earth mover’s distance as a metric for image retrieval, International Journal of Computer Vision 40 (2) (2000) 99–121. doi:10.1023/A:1026543900054

work page doi:10.1023/a:1026543900054 2000

[21] [21]

K. V os, M. D. Harley, K. D. Splinter, J. A. Simmons, I. L. Turner, Coastsat: A google earth engine-enabled python toolkit to extract shorelines from publicly available satellite imagery, Environmental Modelling & Software 122 (2019) 104528

[22] M. S. J. Rogers, M. Bithell, S. M. Brooks, T. Spencer, VEdge_Detector: automated coastal vegetation edge detection using a convolutional neural network, International Journal of Remote Sensing 42 (13) (2021) 4805–4835. doi:10.1080/01431161.2021.1897185. URL https://doi.org/10.1080/01431161.2021.1897185

[23] G. Mattyus, W. Luo, R. Urtasun, DeepRoadMapper: Extracting road topology from aerial images, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 3438–3446

[24] F. Bastani, S. He, S. Abbar, M. Alizadeh, H. Balakrishnan, S. Chawla, S. Madden, D. DeWitt, RoadTracer: Automatic extraction of road networks from aerial images, in: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 4720–4728. doi:10.1109/CVPR.2018.00496

[25] D. Acuna, H. Ling, A. Kar, S. Fidler, Efficient interactive annotation of segmentation datasets with Polygon-RNN++, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 859–868. doi:10.1109/CVPR.2018.00096

[27] L. Castrejon, J. Pont-Tuset, J. T. Barron, F. Marques, J. Malik, Polygon-RNN: Annotating object instances with a polygon, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5230–5238

[28] S. Wei, S. Ji, M. Lu, From lines to polygons: Polygonal building contour extraction from high-resolution remote sensing imagery, ISPRS Journal of Photogrammetry and Remote Sensing 210 (2024) 107–121. doi:10.1016/j.isprsjprs.2024.02.006

[29] F. Isikdogan, A. C. Bovik, P. Passalacqua, Surface water mapping by deep learning, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 10 (11) (2017) 4909–4918

[30] G. Tetteh, V. Efremov, N. D. Forkert, M. Schneider, J. Kirschke, B. Weber, C. Zimmer, M. Piraud, B. H. Menze, DeepVesselNet: Vessel segmentation, centerline prediction, and bifurcation detection in 3-D angiographic volumes, Frontiers in Neuroscience 14 (2020). doi:10.3389/fnins.2020.592352. URL https://www.frontiersin.org/journals/neu...

[31] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark, G. Krueger, I. Sutskever, Learning transferable visual models from natural language supervision, in: M. Meila, T. Zhang (Eds.), Proceedings of the 38th International Conference on Machine Learning, Vol. 139 of Proceedings of Machine Learning Re...

[32] W. Zhang, M. Cai, T. Zhang, Y. Zhuang, X. Mao, EarthGPT: A universal multimodal large language model for multi-sensor image comprehension in remote sensing domain, IEEE Transactions on Geoscience and Remote Sensing 62 (2024) 1–20. doi:10.1109/TGRS.2024.3409624

[33] Y. Zhan, Z. Xiong, Y. Yuan, SkyEyeGPT: Unifying remote sensing vision-language tasks via instruction tuning with large language model, ISPRS Journal of Photogrammetry and Remote Sensing 221 (2025) 64–77. doi:10.1016/j.isprsjprs.2025.01.020. URL https://www.sciencedirect.com/science/article/pii/S0924271625000206

[34] Land Information New Zealand, LINZ Data Service: Open geospatial data portal, accessed: 2024 (2024). URL https://data.linz.govt.nz

[35] C. Ortega-Adame, M. Gonzalez-Audicana, A. Salinas-Castillo, et al., SWED: A benchmark dataset for semantic segmentation of water bodies from Sentinel-2 imagery, Remote Sensing of Environment 273 (2022) 112859. doi:10.1016/j.rse.2022.112859

[36] D. H. Douglas, T. K. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, The Canadian Cartographer 10 (2) (1973) 112–122. doi:10.3138/FM57-6770-U75U-7727.

Appendix A. Geometric Distance Metrics

This appendix provides the formal definitions of the geometric distance metrics used to evalu...