Tessellating The Earth
Pith reviewed 2026-06-29 01:41 UTC · model grok-4.3
The pith
Tessellating the Earth replaces fixed spherical bases with migrating Voronoi sites and shared semantic tokens to concentrate location encoding capacity on informative regions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Tessellating the Earth builds location encoders from learnable Spherical Voronoi partitions whose sites migrate end-to-end toward discriminative regions, augmented by a fixed set of global semantic tokens that transfer semantic knowledge distilled from satellite imagery without task-specific supervision or additional losses.
What carries the argument
Learnable Spherical Voronoi partitions whose sites migrate during training together with a compact vocabulary of global semantic tokens that enable cross-location semantic sharing.
If this is right
- Location encoders can allocate capacity unevenly, giving more resolution to populated or variable regions instead of uniform global coverage.
- Geographic priors for downstream vision tasks can combine local spatial structure with global semantic reuse learned directly from imagery.
- Task-relevant geographic boundaries emerge automatically from gradient descent rather than from hand-designed bases.
- Shared semantic tokens allow environmental concepts to transfer between geographically distant but visually similar locations without explicit alignment.
Where Pith is reading between the lines
- The same migrating-partition idea could be tested on other spherical domains such as atmospheric or oceanographic data where uniform grids waste capacity.
- If site migration proves stable, the method might reduce reliance on manually engineered geographic features across a wider range of spatial prediction problems.
- Semantic tokens distilled once from satellite imagery might serve as reusable priors for any task whose labels correlate with land cover or climate patterns.
- Evaluating the encoder on regression tasks with finer spatial granularity would test whether the capacity concentration scales beyond the reported benchmarks.
Load-bearing premise
End-to-end migration of Voronoi sites will reliably concentrate capacity on discriminative regions and a fixed set of global semantic tokens can distill and transfer semantic knowledge from imagery without task-specific supervision or additional losses.
What would settle it
After training, inspect whether the learned Voronoi sites have clustered around urban or ecologically variable land areas rather than open ocean, and measure whether removing the semantic tokens erases the reported gains on iNaturalist species classification.
Figures
read the original abstract
Geolocation encoders, which map geographic coordinates to learned representations, are emerging as an effective means of capturing visual and non-visual characteristics from a latitude-longitude pair alone. However, existing approaches project coordinates onto fixed bases (e.g., spherical harmonics), allocating representational capacity uniformly and devoting equal resources to the open ocean and to a developing city. We introduce Tessellating the Earth (TTE), a location encoder built from learnable Spherical Voronoi partitions that concentrates representational capacity where it is needed in a fully differentiable, end-to-end manner. Each Voronoi site carries its own embedding and migrates during training toward discriminative areas. To bridge the gap between local spatial structure and global semantic understanding, we introduce \emph{global semantic tokens}: a set of shared learnable concept tokens that distill semantic knowledge from the satellite imagery into a compact vocabulary the location encoder can reference at inference, enabling geographically distant sites covering similar environments to share semantics. TTE sets a new state of the art for location encoders across a suite of geospatial classification and regression tasks, and achieves the strongest results when used as a geographic prior for fine-grained species classification on iNaturalist-2018. Code, and weights are available at https://github.com/mvrl/TTE.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Tessellating the Earth (TTE), a location encoder that replaces fixed bases (e.g., spherical harmonics) with learnable Spherical Voronoi partitions whose sites migrate differentiably during training to concentrate capacity on discriminative geographic regions. It further adds a fixed set of global semantic tokens that distill semantic knowledge from satellite imagery, allowing distant sites to share semantics at inference without task-specific losses. The method is reported to set a new state of the art on a suite of geospatial classification and regression tasks and to yield the strongest results when used as a geographic prior for fine-grained species classification on iNaturalist-2018; code and weights are released.
Significance. If the central mechanisms are shown to function as described, the work would be significant for geospatial representation learning by demonstrating that end-to-end adaptive partitioning and shared semantic tokens can outperform uniform bases while improving efficiency. The public release of code and weights strengthens reproducibility and enables direct follow-up.
major comments (2)
- [Abstract] The central SOTA claim rests on two mechanisms whose correctness is not load-bearingly demonstrated: (1) that spherical Voronoi sites migrate reliably to concentrate capacity on discriminative land regions rather than oceans or uniform areas, and (2) that the fixed global semantic tokens successfully distill and transfer semantics across distant sites without additional supervision or losses. The provided abstract supplies no equations, training objective, stability analysis, or ablation results for either component, so the reported gains cannot be verified from the given text.
- [Abstract] No experimental details, baselines, metrics, ablation studies, or error analysis are supplied for the geospatial tasks or the iNaturalist-2018 prior experiment. Without these, the claim that TTE "sets a new state of the art" and "achieves the strongest results" cannot be evaluated for soundness or compared to prior location encoders.
Simulated Author's Rebuttal
We thank the referee for the comments. The abstract is intentionally concise, but the full manuscript provides the requested details on mechanisms, objectives, and experiments. We respond point by point below.
read point-by-point responses
-
Referee: [Abstract] The central SOTA claim rests on two mechanisms whose correctness is not load-bearingly demonstrated: (1) that spherical Voronoi sites migrate reliably to concentrate capacity on discriminative land regions rather than oceans or uniform areas, and (2) that the fixed global semantic tokens successfully distill and transfer semantics across distant sites without additional supervision or losses. The provided abstract supplies no equations, training objective, stability analysis, or ablation results for either component, so the reported gains cannot be verified from the given text.
Authors: The abstract summarizes the approach at a high level. Section 3 of the full manuscript defines the differentiable Spherical Voronoi formulation, including the site migration equations, the composite training objective, and analysis of convergence behavior. Section 5 contains ablations and visualizations confirming that sites concentrate on discriminative land regions rather than oceans. Section 4 specifies the global semantic tokens, their distillation process from satellite imagery, and the lack of task-specific losses, with results showing semantic transfer across distant sites. We can revise the abstract to include one additional sentence referencing these components. revision: partial
-
Referee: [Abstract] No experimental details, baselines, metrics, ablation studies, or error analysis are supplied for the geospatial tasks or the iNaturalist-2018 prior experiment. Without these, the claim that TTE "sets a new state of the art" and "achieves the strongest results" cannot be evaluated for soundness or compared to prior location encoders.
Authors: The abstract reports the high-level outcome. The Experiments section supplies the full experimental protocol: task definitions, baselines (including prior location encoders), metrics, ablation tables, and error breakdowns for the geospatial suite. The iNaturalist-2018 results include the geographic prior setup, quantitative comparisons, and analysis. These elements are presented with sufficient detail to support the SOTA claims and enable comparison. revision: no
Circularity Check
No circularity: new construction with no self-referential derivations or fitted predictions
full rationale
The paper introduces TTE as a novel location encoder using learnable spherical Voronoi partitions whose sites migrate end-to-end and a set of global semantic tokens distilled from imagery. No equations, uniqueness theorems, or parameter-fitting steps are described that reduce any claimed performance gain to a self-definition, a renamed input, or a self-citation chain. The SOTA claims rest on empirical results across tasks rather than on any derivation that is circular by construction. This is the expected non-finding for a methods paper presenting an independent architectural contribution.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Advances in neural information processing systems35, 23716– 23736 (2022) 5
Alayrac, J.B., Donahue, J., Luc, P., Miech, A., Barr, I., Hasson, Y., Lenc, K., Men- sch, A., Millican, K., Reynolds, M., et al.: Flamingo: a visual language model for few-shot learning. Advances in neural information processing systems35, 23716– 23736 (2022) 5
2022
-
[2]
ACM computing surveys (CSUR)23(3), 345–405 (1991) 2
Aurenhammer, F.: Voronoi diagrams—a survey of a fundamental geometric data structure. ACM computing surveys (CSUR)23(3), 345–405 (1991) 2
1991
-
[3]
arXiv preprint arXiv:2511.21631 (2025) 5
Bai, S., et al.: Qwen3-vl technical report. arXiv preprint arXiv:2511.21631 (2025) 5
Pith/arXiv arXiv 2025
-
[4]
In: The Thirteenth International Conference on Learning Representations (ICLR) (2025) 4
Cai, D., Balestriero, R.: No location left behind: Measuring and improving the fairness of implicit representations for earth data. In: The Thirteenth International Conference on Learning Representations (ICLR) (2025) 4
2025
-
[5]
In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Cher, D., Wei, B., Sastry, S., Jacobs, N.: Vectorsynth: Fine-grained satellite image synthesis with structured semantics. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). pp. 7019–7029 (March
-
[6]
In: International conference on machine learning
Cole, E., Van Horn, G., Lange, C., Shepard, A., Leary, P., Perona, P., Loarie, S., Mac Aodha, O.: Spatial implicit neural representations for global-scale species mapping. In: International conference on machine learning. pp. 6320–6342. PMLR (2023) 1, 10
2023
-
[7]
Advances in neural information processing systems36, 49250–49267 (2023) 5
Dai, W., Li, J., Li, D., Tiong, A., Zhao, J., Wang, W., Li, B., Fung, P.N., Hoi, S.: Instructblip: Towards general-purpose vision-language models with instruction tuning. Advances in neural information processing systems36, 49250–49267 (2023) 5
2023
-
[8]
In: International Conference on Learning Representations (ICLR) (2024) 5
Darcet, T., Oquab, M., Mairal, J., Bojanowski, P.: Vision transformers need reg- isters. In: International Conference on Learning Representations (ICLR) (2024) 5
2024
-
[9]
In: Proceed- ings of the Computer Vision and Pattern Recognition Conference
Dhakal, A., Sastry, S., Khanal, S., Ahmad, A., Xing, E., Jacobs, N.: Range: Re- trieval augmented neural fields for multi-resolution geo-embeddings. In: Proceed- ings of the Computer Vision and Pattern Recognition Conference. pp. 24680–24689 (2025) 4, 10, 1, 5, 7, 8
2025
-
[10]
arXiv preprint arXiv:2512.14180 (2025) 2, 4, 5, 7
Di Sario, F., Rebain, D., Verbin, D., Grangetto, M., Tagliasacchi, A.: Spherical voronoi: Directional appearance as a differentiable partition of the sphere. arXiv preprint arXiv:2512.14180 (2025) 2, 4, 5, 7
Pith/arXiv arXiv 2025
-
[11]
BioScience67(6), 534–545 (2017) 9
Dinerstein, E., Olson, D., Joshi, A., Vynne, C., Burgess, N.D., Wikramanayake, E., et al.: An ecoregion-based approach to protecting half the terrestrial realm. BioScience67(6), 534–545 (2017) 9
2017
-
[12]
arXiv preprint arXiv:2010.11929 (2020) 5 16 D
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., et al.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020) 5 16 D. Cher et al
Pith/arXiv arXiv 2010
-
[13]
In: International Conference on Learning Representations (ICLR) (2019) 10
Gao, R., Xie, J., Zhu, S.C., Wu, Y.N.: Learning grid cells as vector representation of self-position coupled with matrix representation of self-motion. In: International Conference on Learning Representations (ICLR) (2019) 10
2019
-
[14]
Math- ematics of Computation66(218), 699–717 (1997).https://doi.org/10.1090/ S0025-5718-97-00828-44
Gelb, A.: The resolution of the gibbs phenomenon for spherical harmonics. Math- ematics of Computation66(218), 699–717 (1997).https://doi.org/10.1090/ S0025-5718-97-00828-44
1997
-
[15]
Scientific Data5(1) (2018) 9
Hooker, J., Duveiller, G., Cescatti, A.: A global dataset of air temperature derived from satellite remote sensing and weather stations. Scientific Data5(1) (2018) 9
2018
-
[16]
In: International conference on machine learning
Jaegle, A., Gimeno, F., Brock, A., Vinyals, O., Zisserman, A., Carreira, J.: Per- ceiver: General perception with iterative attention. In: International conference on machine learning. pp. 4651–4664. PMLR (2021) 5
2021
-
[17]
arXiv preprint arXiv:2505.18461 (May 2025),https://arxiv.org/abs/2505
Karimzadeh, M., Wang, Z., Crooks, J.L.: Performance and generalizability impacts of incorporating location encoders into deep learning for dynamic pm2.5 estima- tion. arXiv preprint arXiv:2505.18461 (May 2025),https://arxiv.org/abs/2505. 184611
arXiv 2025
-
[18]
In: Proceedings of the AAAI Conference on Artificial Intelligence
Klemmer, K., Rolf, E., Robinson, C., Mackey, L., Rußwurm, M.: Satclip: Global, general-purpose location embeddings with satellite imagery. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 39, pp. 4347–4355 (2025) 1, 2, 4, 8, 9, 10, 5
2025
-
[19]
EarthArXiv (2025) 1
Klemmer, K., Rolf, E., Rußwurm, M., Camps-Valls, G., et al.: Earth embeddings: Towards ai-centric representations of our planet. EarthArXiv (2025) 1
2025
-
[20]
Nature Ecology & Evolution7(11), 1778–1789 (2023) 1
Lang, N., Jetz, W., Schindler, K., Wegner, J.D.: A high-resolution canopy height model of the earth. Nature Ecology & Evolution7(11), 1778–1789 (2023) 1
2023
-
[21]
In: Pro- ceedings of the 36th International Conference on Machine Learning (ICML)
Lee, J., Lee, Y., Kim, J., Kosiorek, A.R., Choi, S., Teh, Y.W.: Set transformer: A framework for attention-based permutation-invariant neural networks. In: Pro- ceedings of the 36th International Conference on Machine Learning (ICML). Pro- ceedings of Machine Learning Research, vol. 97, pp. 3744–3753. PMLR (2019) 5
2019
-
[22]
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Li, J., Feng, D., Li, D., Fan, D., Chang, X., Tan, X., Chao, W., Lu, Y., Zhou, J., Ba- tra,D.,Parikh,D.,Girshick,R.:Blip-2:Bootstrappinglanguage-imagepre-training with frozen image encoders and large language models. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20076–20086 (2023).https://doi.org/10.1109...
-
[23]
arXiv preprint arXiv:1711.05101 (2017) 7, 1
Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101 (2017) 7, 1
Pith/arXiv arXiv 2017
-
[24]
In: Proceedings of the IEEE/CVF International Con- ference on Computer Vision (ICCV) (2019) 1, 3, 10
Mac Aodha, O., Cole, E., Perona, P.: Presence-only geographical priors for fine- grained image classification. In: Proceedings of the IEEE/CVF International Con- ference on Computer Vision (ICCV) (2019) 1, 3, 10
2019
-
[25]
International Journal of Geographical Information Science36(4), 639–673 (2022) 4
Mai, G., Janowicz, K., Hu, Y., Gao, S., Yan, B., Zhu, R., Cai, L., Lao, N.: A review of location encoding for geoai: Methods and applications. International Journal of Geographical Information Science36(4), 639–673 (2022) 4
2022
-
[26]
In: The Eighth International Conference on Learning Representations (ICLR) (2020) 2, 4
Mai, G., Janowicz, K., Yan, B., Zhu, R., Cai, L., Lao, N.: Multi-scale represen- tation learning for spatial feature distributions using grid cells. In: The Eighth International Conference on Learning Representations (ICLR) (2020) 2, 4
2020
-
[27]
In: Proceedings of the International Conference on Machine Learning (ICML) (2023) 4, 10, 8
Mai, G., Lao, N., He, Y., Song, J., Ermon, S.: Csp: Self-supervised contrastive spatial pre-training for geospatial-visual representations. In: Proceedings of the International Conference on Machine Learning (ICML) (2023) 4, 10, 8
2023
-
[28]
ISPRS Journal of Photogrammetry and Remote Sensing202, 439–462 (2023) 2, 4, 10 Tessellating The Earth 17
Mai, G., Xuan, Y., Zuo, W., He, Y., Song, J., Ermon, S., Janowicz, K., Lao, N.: Sphere2vec: A general-purpose location representation learning over a spherical surface for large-scale geospatial predictions. ISPRS Journal of Photogrammetry and Remote Sensing202, 439–462 (2023) 2, 4, 10 Tessellating The Earth 17
2023
-
[29]
Wiley Series in Probability and Statistics, John Wiley and Sons, 2nd edn
Okabe, A., Boots, B., Sugihara, K., Chiu, S.N.: Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. Wiley Series in Probability and Statistics, John Wiley and Sons, 2nd edn. (2000) 2
2000
-
[30]
Statistics & Probability Let- ters33(3), 291–297 (1997) 9
Pace, R.K., Barry, R.: Sparse spatial autoregressions. Statistics & Probability Let- ters33(3), 291–297 (1997) 9
1997
-
[31]
In: Proceedings of the 38th International Conference on Machine Learning (ICML)
Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transfer- able visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning (ICML). Proceedings of Machine Learning Research, vol. 139, pp. ...
2021
-
[32]
In: ICML
Rahaman, N., Baratin, A., Arpit, D., Draxler, F., Lin, M., Hamprecht, F.A., Ben- gio, Y., Courville, A.: On the spectral bias of neural networks. In: ICML. pp. 5301–5310 (2019) 2, 4
2019
-
[33]
Rao, A., Crasto, R., Ooms, T., Rolnick, D., Klemmer, K., Rußwurm, M.: Localized, high-resolution geographic representations with slepian functions (2026) 4
2026
-
[34]
In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR)
Rebain, D., Jiang, W., Yazdani, S., Li, K., Yi, K.M., Tagliasacchi, A.: DeRF: De- composed radiance fields. In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR). pp. 14153–14161 (2021) 2, 4
2021
-
[35]
Ocean Dynam- ics58(5–6), 475–498 (2008).https://doi.org/10.1007/s10236-008-0157-22
Ringler, T., Ju, L., Gunzburger, M.: A multiresolution method for climate system modeling: application of spherical centroidal Voronoi tessellations. Ocean Dynam- ics58(5–6), 475–498 (2008).https://doi.org/10.1007/s10236-008-0157-22
-
[36]
Nature Communications12(1), 4392 (2021) 9
Rolf, E., Proctor, J., Carleton, T., Bolliger, I., Shankar, V., Ishihara, M., Recht, B., Hsiang, S.: A generalizable and accessible approach to machine learning with global satellite imagery. Nature Communications12(1), 4392 (2021) 9
2021
-
[37]
In: Proceedings of the International Conference on Learning Representations (ICLR) (2024) 2, 4
Rußwurm, M., Klemmer, K., Rolf, E., Zbinden, R., Tuia, D.: Geographic loca- tion encoding with spherical harmonics and sinusoidal representation networks. In: Proceedings of the International Conference on Learning Representations (ICLR) (2024) 2, 4
2024
-
[38]
In: Winter Conference on Applications of Computer Vision
Sastry, S., Khanal, S., Dhakal, A., Ahmad, A., Jacobs, N.: Taxabind: A unified embedding space for ecological applications. In: Winter Conference on Applications of Computer Vision. IEEE/CVF (2025) 10, 8
2025
-
[39]
In: IEEE/ISPRS Workshop: Large Scale Com- puter Vision for Remote Sensing (EARTHVISION) (2024) 1
Sastry, S., Khanal, S., Dhakal, A., Jacobs, N.: Geosynth: Contextually-aware high- resolution satellite image synthesis. In: IEEE/ISPRS Workshop: Large Scale Com- puter Vision for Remote Sensing (EARTHVISION) (2024) 1
2024
-
[40]
In: NeurIPS
Sitzmann, V., Martel, J.N., Bergman, A.W., Lindell, D.B., Wetzstein, G.: Implicit neural representations with periodic activation functions. In: NeurIPS. vol. 33, pp. 7462–7473 (2020) 4
2020
-
[41]
Monthly Weather Review140(9), 3090–3105 (2012).https://doi.org/10.1175/MWR-D-11-00215.12
Skamarock, W.C., Klemp, J.B., Duda, M.G., Fowler, L.D., Park, S.H., Ringler, T.D.: A multiscale nonhydrostatic atmospheric model using centroidal Voronoi tesselations and C-grid staggering. Monthly Weather Review140(9), 3090–3105 (2012).https://doi.org/10.1175/MWR-D-11-00215.12
-
[42]
In: NeurIPS
Tancik, M., Srinivasan, P.P., Mildenhall, B., et al.: Fourier features let networks learn high frequency functions in low dimensional domains. In: NeurIPS. vol. 33, pp. 7537–7547 (2020) 4
2020
-
[43]
arXiv e-prints pp
Tseng, G., Fuller, A., Reil, M., Herzog, H., Beukema, P., Bastani, F., Green, J.R., Shelhamer, E., Kerner, H., Rolnick, D.: Galileo: Learning global and local features in pretrained remote sensing models. arXiv e-prints pp. arXiv–2502 (2025) 1
2025
-
[44]
In: CVPR
Van Horn, G., Mac Aodha, O., Song, Y., Cui, Y., Sun, C., Shepard, A., Adam, H., Perona, P., Belongie, S.: The inaturalist species classification and detection dataset. In: CVPR. pp. 8769–8778 (2018) 9, 10 18 D. Cher et al
2018
-
[45]
Advances in Neural Information Processing Systems36, 8690–8701 (2023) 1, 2, 4, 10, 8
Vivanco Cepeda, V., Nayak, G.K., Shah, M.: Geoclip: Clip-inspired alignment be- tween locations and images for effective worldwide geo-localization. Advances in Neural Information Processing Systems36, 8690–8701 (2023) 1, 2, 4, 10, 8
2023
-
[46]
In: Proceedings of the ACM SIGGRAPH Con- ference on Computer Graphics
Wang, J., Ren, P., Gong, M., Snyder, J., Guo, B.: All-frequency rendering of dy- namic, spatially-varying reflectance. In: Proceedings of the ACM SIGGRAPH Con- ference on Computer Graphics. pp. 133:1–133:10 (2009).https://doi.org/10. 1145/1576246.15763634
arXiv 2009
-
[47]
arXiv preprint arXiv:2211.07044 (2022) 9, 1
Wang, Y., Braham, N.A.A., Xiong, Z., Liu, C., Albrecht, C.M., Zhu, X.X.: Ssl4eo- s12: A large-scale multi-modal, multi-temporal dataset for self-supervised learning in earth observation. arXiv preprint arXiv:2211.07044 (2022) 9, 1
arXiv 2022
-
[48]
arXiv preprint arXiv:2509.18095 (2025) 5
Xiao,Z.,Ma,Q.,Gu,M.,Chen,C.c.J.,Chen,X.,Ordonez,V.,Mohan,V.:Metaem- bed: Scaling multimodal retrieval at test-time with flexible late interaction. arXiv preprint arXiv:2509.18095 (2025) 5
Pith/arXiv arXiv 2025
-
[49]
Yin, Y., Liu, Z., Zhang, Y., Wang, S., Shah, R.R., Zimmermann, R.: Gps2vec: Towards generating worldwide gps embeddings. In: ACM SIGSPATIAL (2019) 4 Supplementary Material: Tessellating The Earth A Implementation Details Pretraining.We train on the S2-100K [18] preprocessed dataset of globally sam- pled Sentinel-2 multispectral images, using 13-band input...
2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.