The Hierarchical Morphotope Classification: A Theory-Driven Framework for Large-Scale Analysis of Built Form

Anna Brázdová; Daniela Dančejová; Krasen Samardzhiev; Lisa Winkler; Martin Fleischmann

arxiv: 2509.10083 · v1 · submitted 2025-09-12 · 💻 cs.CY

Pith reviewed 2026-05-18 17:57 UTC · model grok-4.3

classification 💻 cs.CY

keywords: morphotope classification, urban morphology, built form, hierarchical classification, open data, regionalisation, morphometric analysis, scalable urban analysis

The pith

HiMoC classifies built form by first delineating morphotopes, the smallest localities with a distinctive character, from open building and street data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents the Hierarchical Morphotope Classification as a method to group urban patterns into contiguous localities called morphotopes and then arrange them in a hierarchy. The approach starts from open data on buildings and streets, applies a regionalisation step to create morphologically distinct units, and builds a taxonomic tree based on dissimilarity in morphometric profiles. A sympathetic reader would care because the resulting framework supports scalable, reproducible classification that works across countries rather than remaining limited to single cities. It offers a theory-grounded alternative that complements land-use maps by focusing directly on physical form.

Core claim

The paper claims that the morphotope concept can be operationalised through the SA3 regionalisation method to produce contiguous, morphologically distinct localities from open data on buildings and streets, after which these units are organised into a hierarchical taxonomic tree that reflects their morphometric dissimilarity and permits flexible, interpretable classification of built fabric at continental scales.

What carries the argument

The morphotope, operationalised as the smallest locality with a distinctive character via the SA3 (Spatial Agglomerative Adaptive Aggregation) regionalisation applied to morphometric profiles of buildings and streets.
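
To make the idea of a contiguity-preserving regionalisation with a minimum size concrete, here is a minimal sketch. It is not the paper's SA3 algorithm: the function name `merge_small_regions`, the scalar stand-in for a morphometric profile, and the greedy merge rule are all assumptions chosen for illustration. It only shows how adjacency can restrict which units are allowed to merge.

```python
from statistics import fmean


def merge_small_regions(values, edges, min_size):
    """Toy contiguity-constrained agglomeration (an illustration, NOT SA3).

    values:   dict unit_id -> scalar value (1-D stand-in for a morphometric profile)
    edges:    list of (u, v) pairs of spatially adjacent units
    min_size: hypothetical minimum number of units per region
    Returns a dict unit_id -> region label.
    """
    label = {u: u for u in values}        # every unit starts as its own region
    members = {u: [u] for u in values}    # region label -> member units

    def region_mean(region):
        return fmean(values[m] for m in members[region])

    while True:
        small = {r for r, m in members.items() if len(m) < min_size}
        if not small:
            break
        best = None
        for u, v in edges:
            a, b = label[u], label[v]
            # only adjacent, distinct regions may merge, and one must be too small
            if a == b or (a not in small and b not in small):
                continue
            gap = abs(region_mean(a) - region_mean(b))
            key = (gap, min(a, b), max(a, b))
            if best is None or key < best:
                best = key
        if best is None:
            break  # remaining small regions have no neighbours to merge with
        _, keep, drop = best
        for m in members[drop]:
            label[m] = keep
        members[keep].extend(members.pop(drop))
    return label


# Five units along a street; unit 4 resembles units 0 and 1 but only touches unit 3.
values = {0: 1.0, 1: 1.1, 2: 5.0, 3: 5.2, 4: 1.05}
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
regions = merge_small_regions(values, edges, min_size=2)
```

In this toy run, unit 4 joins the region of units 2 and 3 despite its dissimilar value, because contiguity leaves it no other neighbour. That is the kind of behaviour a non-spatial clustering would not produce.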

If this is right

The method groups over 90 million building footprints into more than 500,000 morphotopes across Central Europe.
Classification becomes applicable beyond a single country because the hierarchy is built from open data and a reproducible algorithm.
Users obtain flexible, interpretable categories of built fabric that complement existing land-use products.
The framework supports applications in urban planning, environmental analysis, and socio-spatial studies by providing a nuanced view of urban structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The hierarchy could be linked to socio-economic datasets to test whether morphological similarity predicts patterns of social or economic outcomes.
Environmental models of urban heat or air quality might incorporate morphotope boundaries as units for simulation rather than arbitrary grid cells.
The same open-data pipeline could be rerun after major construction events to measure morphological change over time.
Cross-validation against historical maps might reveal whether the derived morphotopes align with longstanding urban districts.

Load-bearing premise

The premise is twofold: that morphometric profiles derived from open building and street data are sufficient to capture the distinctive character of a locality, and that SA3 regionalisation produces contiguous, morphologically distinct morphotopes.

What would settle it

Empirical comparison showing that the morphotopes produced by HiMoC do not correspond to areas identified as morphologically distinct through independent field surveys or expert visual assessment would falsify the central claim.

Original abstract

Built environment, formed of a plethora of patterns of building, streets, and plots, has a profound impact on how cities are perceived and function. While various methods exist to classify urban patterns, they often lack a strong theoretical foundation, are not scalable beyond a local level, or sacrifice detail for broader application. This paper introduces the Hierarchical Morphotope Classification (HiMoC), a novel, theory-driven, and computationally scalable method of classification of built form. HiMoC operationalises the idea of a morphotope - the smallest locality with a distinctive character - using a bespoke regionalisation method SA3 (Spatial Agglomerative Adaptive Aggregation), to delineate contiguous, morphologically distinct localities. These are further organised into a hierarchical taxonomic tree reflecting their dissimilarity based on morphometric profile derived from buildings and streets retrieved from open data, allowing flexible, interpretable classification of built fabric, that can be applied beyond a scale of a single country. The method is tested on a subset of countries of Central Europe, grouping over 90 million building footprints into over 500,000 morphotopes. The method extends the capabilities of available morphometric analyses, while offering a complementary perspective to existing large scale data products, which are focusing primarily on land use or use conceptual definition of urban fabric types. This theory-grounded, reproducible, unsupervised and scalable method facilitates a nuanced understanding of urban structure, with broad applications in urban planning, environmental analysis, and socio-spatial studies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Hierarchical Morphotope Classification (HiMoC) framework, which operationalizes the morphotope concept—the smallest locality with a distinctive character—via a custom SA3 (Spatial Agglomerative Adaptive Aggregation) regionalisation algorithm applied to morphometric profiles derived from open building and street data. These morphotopes are organized into a hierarchical taxonomic tree reflecting dissimilarity, enabling flexible classification. The approach is demonstrated on Central European data, grouping over 90 million building footprints into over 500,000 morphotopes, and positioned as a scalable, theory-driven, unsupervised complement to land-use-focused urban classification products.

Significance. If the core claims hold, the work provides a notable contribution by delivering a reproducible, computationally scalable, and theory-grounded method for large-scale built-form analysis using open data. Strengths include the hierarchical structure for flexible interpretation and the explicit linkage to morphotope theory, which could support applications in urban planning, environmental analysis, and socio-spatial studies while complementing existing large-scale data products. The unsupervised and scalable design is a clear asset for continental or global extensions.

major comments (2)

[SA3 regionalisation description] The description of the SA3 regionalisation (methods section): the claim that SA3 produces contiguous, morphologically distinct morphotopes—the central operationalisation step—is not supported by any reported cluster quality metrics such as silhouette scores, Davies-Bouldin indices, or external validation against known morphological typologies. Without these, the assertion that the 500,000 morphotopes reflect genuine 'distinctive character' rather than artifacts from OSM coverage gaps remains unsubstantiated and is load-bearing for all downstream hierarchical and application claims.
[Results / Central Europe test] Central Europe test results: no baseline comparisons, sensitivity tests on SA3 agglomeration parameters, or error analysis are provided despite grouping 90 million buildings; this leaves the quantitative performance of the method without demonstrated support and weakens the claim of scalability and nuance over existing approaches.
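
For readers unfamiliar with the internal validation indices named in the first major comment, the following is a minimal sketch of a mean silhouette coefficient in plain Python. The function name `silhouette` and the toy points are illustrative assumptions. The index uses only distances between profiles and takes no account of spatial contiguity.

```python
from math import dist
from statistics import fmean


def silhouette(points, labels):
    """Mean silhouette coefficient for points (tuples) and their cluster labels."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(fmean(dist(p, q) for q in other)
                for key, other in clusters.items() if key != lab)
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return fmean(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])   # compact, well separated clusters
bad = silhouette(points, [0, 1, 0, 1])    # each cluster straddles both groups
```

Values close to 1 indicate compact, well separated clusters, while negative values indicate points that sit closer to another cluster than to their own.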

minor comments (2)

[Abstract and methods] The abstract and methods would benefit from an explicit table listing the morphometric variables (e.g., building density, street network metrics) used to construct profiles, to improve reproducibility and clarity of the input to SA3.
[Hierarchical tree construction] Notation for the hierarchical taxonomic tree levels and dissimilarity measure is introduced without a formal definition or pseudocode; adding this would aid readers in understanding the claimed flexibility.
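
As a rough illustration of what a hierarchical tree built from profile dissimilarity involves, the sketch below performs a plain agglomerative merge over toy profiles. The centroid linkage, the Euclidean distance, and the name `build_tree` are assumptions made for this example. The text above does not state which linkage or dissimilarity measure the paper uses.

```python
from itertools import combinations
from math import dist


def build_tree(profiles):
    """Agglomerative merge history over profiles (dict name -> tuple of floats).

    Returns a list of (left, right, height) tuples, one per merge.
    Centroid linkage is an illustrative stand-in, not the paper's stated choice.
    """
    clusters = {name: [vec] for name, vec in profiles.items()}
    merges = []

    def centroid(vectors):
        return tuple(sum(column) / len(vectors) for column in zip(*vectors))

    def gap(pair):
        return dist(centroid(clusters[pair[0]]), centroid(clusters[pair[1]]))

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=gap)
        merges.append((a, b, gap((a, b))))
        clusters[f"({a}+{b})"] = clusters.pop(a) + clusters.pop(b)
    return merges


tree = build_tree({"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (10.0, 0.0)})
```

Cutting the merge history at a chosen height yields a flat classification. Lower cuts give more, finer classes, which is one way to read the flexibility the tree is meant to provide.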

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which have helped us identify areas where the manuscript can be strengthened. We address each major comment point by point below, providing clarifications on the methodological choices and outlining specific revisions.

Referee: [SA3 regionalisation description] The description of the SA3 regionalisation (methods section): the claim that SA3 produces contiguous, morphologically distinct morphotopes—the central operationalisation step—is not supported by any reported cluster quality metrics such as silhouette scores, Davies-Bouldin indices, or external validation against known morphological typologies. Without these, the assertion that the 500,000 morphotopes reflect genuine 'distinctive character' rather than artifacts from OSM coverage gaps remains unsubstantiated and is load-bearing for all downstream hierarchical and application claims.

Authors: We agree that additional quantitative support would strengthen the presentation. However, SA3 is a spatially constrained regionalisation procedure that enforces contiguity via adaptive aggregation; standard internal validation indices such as silhouette scores or Davies-Bouldin indices assume non-spatial, distance-based clusters and are therefore not directly applicable. We will revise the methods section to clarify this design rationale and to include sensitivity tests on the agglomeration parameters. We will also expand the discussion of OSM data limitations, describing the preprocessing steps taken to mitigate coverage gaps, and add qualitative comparisons against existing morphological studies for selected cities as a form of external reference. revision: partial
Referee: [Results / Central Europe test] Central Europe test results: no baseline comparisons, sensitivity tests on SA3 agglomeration parameters, or error analysis are provided despite grouping 90 million buildings; this leaves the quantitative performance of the method without demonstrated support and weakens the claim of scalability and nuance over existing approaches.

Authors: We accept that the current results section would benefit from these additions to better demonstrate performance. In the revised manuscript we will insert a dedicated subsection reporting sensitivity tests on the principal SA3 parameters, showing stability of the resulting morphotope counts and profiles. We will also provide a baseline comparison against a non-hierarchical, non-spatial clustering method applied to a representative subsample, and we will expand the error analysis to quantify the influence of data-quality issues. These changes will directly support the scalability and comparative claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained operationalisation of prior theory

The paper introduces HiMoC as a new operationalisation of the existing morphotope concept via the bespoke SA3 regionalisation algorithm applied to morphometric profiles derived from open building and street data. No equations or steps reduce outputs to fitted parameters by construction, nor do self-citations form load-bearing justifications for uniqueness or ansatzes. The hierarchical tree and 500k morphotopes emerge from the described clustering process rather than renaming known results or importing unverified self-citations as external facts. The derivation chain remains independent of the target claims.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The central claim rests on the existence of morphotopes as meaningful units and on the validity of SA3 for delineating them; these are introduced without independent empirical grounding or falsifiable tests beyond the descriptive application.

free parameters (1)

SA3 agglomeration parameters
The bespoke regionalisation method SA3 is described as adaptive but no specific thresholds or stopping criteria are stated in the abstract.

axioms (1)

domain assumption: A morphotope is the smallest locality with a distinctive character that can be captured by morphometric profiles of buildings and streets.
This premise is invoked when the paper states it operationalises the morphotope idea using SA3.

invented entities (2)

morphotope (no independent evidence)
purpose: Smallest locality with distinctive character as the base unit of classification
Core new conceptual unit introduced to ground the classification.
SA3 (Spatial Agglomerative Adaptive Aggregation) (no independent evidence)
purpose: Regionalisation algorithm to delineate contiguous morphotopes
Bespoke method created for this framework.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

HiMoC operationalises the idea of a morphotope – the smallest locality with a distinctive character – using a bespoke regionalisation method SA3 (Spatial Agglomerative Adaptive Aggregation), to delineate contiguous, morphologically distinct localities. These are further organised into a hierarchical taxonomic tree reflecting their dissimilarity based on morphometric profile derived from buildings and streets retrieved from open data
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

The only parameter required by SA3 is the minimum number of buildings to form a morphotope... we selected a value of 75

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages

[1]

On the Discovery of Urban Typologies: Data Mining the Multi-Dimensional Character of Neighbourhoods

https://doi.org/10.5311/JOSIS.2024.28.319. Fleischmann, Martin, Anastassia Vybornova, James D. Gaboardi, Anna Brázdová, and Daniela Dančejová. 2025. Adaptive Continuity-Preserving Simplification of Street Networks. arXiv:2504.16198. arXiv. https://doi.org/10.48550/arXiv.2504.16198. Gil, Jorge, Nuno Montenegro, J N Beirão, and J P Duarte. 2012. “On the Dis...

work page doi:10.5311/josis.2024.28.319 2024
[2]

Beyond Housing Preferences: Urban Structure and Actualisation of Residential Area Preferences

Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29063-3_18. Hasanzadeh, Kamyar, Marketta Kyttä, and Greg Brown. 2019. “Beyond Housing Preferences: Urban Structure and Actualisation of Residential Area Preferences.” Urban Science 3 (1): 21. https://doi.org/10.3390/urbansci3010021. Hijazi, Ihab, Xin Li, Reinhard Koenig, et al. 2016. “Measu...

work page doi:10.1007/978-3-642-29063-3_18 2019
[3]

Classifying Settlement Types from Multi-Scale Spatial Patterns of Building Footprints

https://doi.org/10.1016/j.landurbplan.2007.02.010. Jochem, Warren C, Douglas R Leasure, Oliver Pannell, Heather R Chamberlain, Patricia Jones, and Andrew J Tatem. 2020. “Classifying Settlement Types from Multi-Scale Spatial Patterns of Building Footprints.” Environment and Planning B: Urban Analytics and City Science, May, 239980832092120. https://doi.or...

work page doi:10.1016/j.landurbplan.2007.02.010 2007
[4]

Clustering Patterns of Urban Built-up Areas with Curves of Fractal Scaling Behaviour

“Clustering Patterns of Urban Built-up Areas with Curves of Fractal Scaling Behaviour.” Environment and Planning B: Planning and Design 37 (5): 942–54. https://doi.org/10.1068/b36039. Van den Bossche, Joris, Kelsey Jordahl, Martin Fleischmann, et al. 2025. Geopandas/Geopandas: V1.1.1. Version v1.1.1. Zenodo, released June. https://doi.org/10.5281/zenodo....

work page doi:10.1068/b36039 2025
[5]

Area of a building is denoted as (1) $a_{blg}$ and defined as an area covered by a building footprint in m²

work page
[6]

Perimeter of a building is denoted as (2) $p_{blg}$ and defined as the sum of lengths of the building exterior walls in m

work page
[7]

Courtyard area of a building is denoted as (3) $a_{blg_c}$ and defined as the sum of areas of interior holes in footprint polygons in m²

work page
[8]

Circular compactness of a building is denoted as (4) $CCo_{blg} = \frac{a_{blg}}{a_{blg_C}}$ where $a_{blg_C}$ is an area of minimal enclosing circle. It captures the relation of building footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle (Dibble et al. 2015)

work page 2015
[9]

Corners of a building is denoted as (5) $Cor_{blg} = \sum_{i=1}^{n} c_{blg}$ where $c_{blg}$ is defined as a vertex of building exterior shape with an angle between adjacent line segments ≤ 170 degrees. It uses only external shape (shapely.geometry.exterior); courtyards are not included. Character is adapted from Steiniger et al. (2008) to exclude non-corner-like vertices

work page 2008
[10]

Squareness of a building is denoted as (6) $Squ_{blg} = \frac{\sum_{i=1}^{n} D_{c_{blg_i}}}{n}$ where $D$ is the deviation of angle of corner $c_{blg_i}$ from 90 degrees and $n$ is a number of corners

work page
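
A minimal sketch of characters (5) and (6), assuming the exterior ring is given as a plain list of (x, y) vertices and that the angle at a vertex is the unsigned angle between the two adjacent segments. The function names are hypothetical and the code is not taken from the paper's implementation.

```python
from math import acos, degrees, hypot


def corner_angles(ring, max_angle=170.0):
    """Angles in degrees at the vertices of a closed ring that count as corners.

    ring: list of (x, y) vertices, first point NOT repeated at the end.
    A vertex is a corner when the angle between its adjacent segments
    is at most max_angle (170 degrees in the definition above).
    """
    angles = []
    n = len(ring)
    for i in range(n):
        px, py = ring[i - 1]
        cx, cy = ring[i]
        nx, ny = ring[(i + 1) % n]
        ax, ay = px - cx, py - cy          # vector to the previous vertex
        bx, by = nx - cx, ny - cy          # vector to the next vertex
        cos_t = (ax * bx + ay * by) / (hypot(ax, ay) * hypot(bx, by))
        angle = degrees(acos(max(-1.0, min(1.0, cos_t))))
        if angle <= max_angle:
            angles.append(angle)
    return angles


def corners(ring):
    """Character (5): number of corner-like vertices."""
    return len(corner_angles(ring))


def squareness(ring):
    """Character (6): mean absolute deviation of corner angles from 90 degrees."""
    angles = corner_angles(ring)
    return sum(abs(a - 90.0) for a in angles) / len(angles)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_with_midpoint = [(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)]
triangle = [(0, 0), (1, 0), (0, 1)]
```

The extra vertex at (1, 0) lies on a straight edge, so its angle is 180 degrees and it is not counted as a corner.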
[11]

Equivalent rectangular index of a building is denoted as (7) $ERI_{blg} = \sqrt{\frac{a_{blg}}{a_{blg_B}}} \times \frac{p_{blg_B}}{p_{blg}}$ where $a_{blg_B}$ is an area of a minimal rotated bounding rectangle (MBR) of a building footprint and $p_{blg_B}$ is the perimeter of that MBR. It is a measure of shape complexity identified by Basaraner and Cetinkaya (2017) as a shape character with the best performance

work page 2017
[12]

Elongation of a building is denoted as (8) $Elo_{blg} = \frac{l_{blg_B}}{w_{blg_B}}$ where $l_{blg_B}$ is length of MBR and $w_{blg_B}$ is width of MBR. It captures the ratio of shorter to the longer dimension of MBR to indirectly capture the deviation of the shape from a square (Schirmer and Axhausen 2015)

work page 2015
[13]

Longest axis length of a tessellation cell is denoted as (9) $LAL_{cell} = d_{cell_C}$ where $d_{cell_C}$ is a diameter of the minimal circumscribed circle around the tessellation cell polygon. The axis itself does not have to be fully within the polygon. It could be seen as a proxy of plot depth for tessellation-based analysis

work page
[14]

Area of a tessellation cell is denoted as (10) $a_{cell}$ and defined as an area covered by a tessellation cell footprint in m²

work page
[15]

Circular compactness of a tessellation cell is denoted as (11) $CCo_{cell} = \frac{a_{cell}}{a_{cell_C}}$ where $a_{cell_C}$ is an area of minimal enclosing circle. It captures the relation of tessellation cell footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle.

work page
[16]

Equivalent rectangular index of a tessellation cell is denoted as (12) $ERI_{cell} = \sqrt{\frac{a_{cell}}{a_{cell_B}}} \times \frac{p_{cell_B}}{p_{cell}}$ where $a_{cell_B}$ is an area of the minimal rotated bounding rectangle (MBR) of a tessellation cell footprint and $p_{cell_B}$ is the perimeter of that MBR. It is a measure of shape complexity identified by Basaraner and Cetinkaya (2017) as a shape charac...

work page 2017
[17]

Coverage area ratio of a tessellation cell is denoted as (13) $CAR_{cell} = \frac{a_{blg}}{a_{cell}}$ where $a_{blg}$ is an area of a building and $a_{cell}$ is an area of related tessellation cell (Schirmer and Axhausen 2015). Coverage area ratio (CAR) is one of the commonly used characters capturing intensity of development. However, the definitions vary based on the spatial unit

work page 2015
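
The coverage area ratio in (13) is a ratio of two polygon areas. A minimal sketch, assuming simple polygons without holes given as vertex lists and using the shoelace formula for area. The function names and the example coordinates are illustrative.

```python
def shoelace_area(ring):
    """Area of a simple polygon given as a list of (x, y) vertices."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def coverage_area_ratio(building, cell):
    """Character (13): building footprint area over tessellation cell area."""
    return shoelace_area(building) / shoelace_area(cell)


building = [(0, 0), (10, 0), (10, 10), (0, 10)]       # 10 m x 10 m footprint
cell = [(-5, -5), (15, -5), (15, 20), (-5, 20)]       # 20 m x 25 m cell
car = coverage_area_ratio(building, cell)
```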
[18]

Length of a street segment is denoted as (14) $l_{edg}$ and defined as a length of a LineString geometry in metres (Dibble et al. 2015; Gil et al. 2012)

work page 2015
[19]

Width of a street profile is denoted as (15) $w_{sp} = \frac{1}{n} \sum_{i=1}^{n} w_i$ where $w_i$ is width of a street section $i$. The algorithm generates street sections every 3 meters alongside the street segment, and measures mean value. In the case of the open-ended street, 50 metres is used as a perception-based proximity limit (Araldi and Fusco 2019)

work page 2019
[20]

Openness of a street profile is denoted as (16) $Ope_{sp} = 1 - \frac{\sum hit}{2 \sum sec}$ where $\sum hit$ is a sum of section lines (left and right sides separately) intersecting buildings and $\sum sec$ total number of street sections. The algorithm generates street sections every 3 meters alongside the street segment

work page
[21]

Width deviation of a street profile is denoted as (17) $wDev_{sp} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (w_i - w_{sp})^2}$ where $w_i$ is width of a street section $i$ and $w_{sp}$ is mean width. The algorithm generates street sections every 3 meters alongside the street segment

work page
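
Characters (15) to (17) all derive from the same set of street sections. A minimal sketch, assuming the section widths and the number of section sides that hit a building have already been measured. The function name `street_profile` and its inputs are hypothetical.

```python
from statistics import fmean, pstdev


def street_profile(widths, hits):
    """Summarise one street segment from its perpendicular sections.

    widths: width of each street section (one value per section)
    hits:   number of section sides (left and right counted separately)
            that intersect a building
    """
    n = len(widths)
    return {
        "width": fmean(widths),             # (15) mean width
        "openness": 1 - hits / (2 * n),     # (16) share of sides that hit nothing
        "width_deviation": pstdev(widths),  # (17) population std, i.e. the 1/n form
    }


profile = street_profile(widths=[10.0, 12.0, 14.0], hits=3)
```

`statistics.pstdev` is used rather than `stdev` because formula (17) divides by $n$, not $n - 1$.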
[22]

Linearity of a street segment is denoted as (18) $Lin_{edg} = \frac{l_{eucl}}{l_{edg}}$ where $l_{eucl}$ is Euclidean distance between endpoints of a street segment and $l_{edg}$ is a street segment length. It captures the deviation of a segment shape from a straight line. It is adapted from Araldi and Fusco (2019)

work page 2019
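
Linearity (18) can be checked by hand on a short polyline. A minimal sketch, assuming the segment is given as a list of (x, y) coordinates. The function name is illustrative.

```python
from math import dist


def linearity(coords):
    """Character (18): straight-line distance between endpoints over path length."""
    length = sum(dist(a, b) for a, b in zip(coords, coords[1:]))
    return dist(coords[0], coords[-1]) / length


straight = linearity([(0, 0), (5, 0), (10, 0)])
bent = linearity([(0, 0), (3, 0), (3, 4)])   # endpoints 5 apart, path length 7
```

A straight segment scores 1, and the value falls towards 0 as the segment winds away from the line joining its endpoints.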
[23]

Area covered by a street segment is denoted as (19) $a_{edg} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is an area of tessellation cell $i$ belonging to the street segment. It captures the area which is likely served by each segment

work page
[24]

Buildings per meter of a street segment is denoted as (20) $BpM_{edg} = \frac{\sum blg}{l_{edg}}$ where $\sum blg$ is a number of buildings belonging to a street segment and $l_{edg}$ is a length of a street segment. It reflects the granularity of development along each segment

work page
[25]

Area covered by a street node is denoted as (21) $a_{node} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is an area of tessellation cell $i$ belonging to the street node. It captures the area which is likely served by each node

work page
[26]

Shared walls ratio of adjacent buildings is denoted as (22) $SWR_{blg} = \frac{p_{blg_{shared}}}{p_{blg}}$ where $p_{blg_{shared}}$ is a length of a perimeter shared with adjacent buildings and $p_{blg}$ is a perimeter of a building. It captures the amount of wall space facing the open space (Hamaina et al. 2012)

work page 2012
[27]

Mean distance to neighbouring buildings is denoted as (23) $NDi_{blg} = \frac{1}{n} \sum_{i=1}^{n} d_{blg,blg_i}$ where $d_{blg,blg_i}$ is a distance between building and building $i$ on a neighbouring tessellation cell. It is adapted from Hijazi et al. (2016). It captures the average proximity to other buildings

work page 2016
[28]

Weighted neighbours of a tessellation cell is denoted as (24) $WNe_{cell} = \frac{\sum cell_n}{p_{cell}}$ where $\sum cell_n$ is a number of cell neighbours and $p_{cell}$ is a perimeter of a cell. It reflects granularity of morphological tessellation

work page
[29]

Area covered by neighbouring cells is denoted as (25) $a_{cell_n} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is area of tessellation cell $i$ within topological distance 1. It captures the scale of morphological tessellation

work page
[30]

Reached area by neighbouring segments is denoted as (26) $a_{edg_n} = \sum_{i=1}^{n} a_{edg_i}$ where $a_{edg_i}$ is an area covered by a street segment $i$ within topological distance 1. It captures an accessible area

work page
[31]

Degree of a street node is denoted as (27) $deg_{node_i} = \sum_{j} edg_{ij}$ where $edg_{ij}$ is an edge of a street network between node $i$ and node $j$. It reflects the basic degree centrality

work page
[32]

Mean distance to neighbouring nodes from a street node is denoted as (28) $MDi_{node} = \frac{1}{n} \sum_{i=1}^{n} d_{node,node_i}$ where $d_{node,node_i}$ is a distance between node and node $i$ within topological distance 1. It captures the average proximity to other nodes

work page
[33]

Reached cells by neighbouring nodes is denoted as (29) $RC_{node_n} = \sum_{i=1}^{n} cells_{node_i}$ where $cells_{node_i}$ is number of tessellation cells on node $i$ within topological distance 1. It captures accessible granularity

work page
[34]

Reached area by neighbouring nodes is denoted as (30) $a_{node_n} = \sum_{i=1}^{n} a_{node_i}$ where $a_{node_i}$ is an area covered by a street node $i$ within topological distance 1. It captures an accessible area

work page
[35]

Number of courtyards of adjacent buildings is denoted as (31) $NCo_{blg_{adj}}$ where $NCo_{blg_{adj}}$ is a number of interior rings of a polygon composed of footprints of adjacent buildings (Schirmer and Axhausen 2015)

work page 2015
[36]

Perimeter wall length of adjacent buildings is denoted as (32) $p_{blg_{adj}}$ where $p_{blg_{adj}}$ is a length of an exterior ring of a polygon composed of footprints of adjacent buildings

work page
[37]

Mean inter-building distance between neighbouring buildings is denoted as (33) $IBD_{blg} = \frac{1}{n} \sum_{i=1}^{n} d_{blg,blg_i}$ where $d_{blg,blg_i}$ is a distance between building and building $i$ on a tessellation cell within topological distance 3. It is adapted from Caruso et al. (2017). It captures the average proximity between buildings

work page 2017
[38]

Building adjacency of neighbouring buildings is denoted as (34) $BuA_{blg} = \frac{\sum blg_{adj}}{\sum blg}$ where $\sum blg_{adj}$ is a number of joined built-up structures within topological distance three and $\sum blg$ is a number of buildings within topological distance 3. It is adapted from Vanderhaegen and Canters (2017).

work page 2017
[39]

Weighted reached blocks of neighbouring tessellation cells is denoted as (35) $WRB_{cell} = \frac{\sum blk}{\sum_{i=1}^{n} a_{cell_i}}$ where $\sum blk$ is a number of blocks within topological distance three and $a_{cell_i}$ is an area of tessellation cell $i$ within topological distance three

work page
[40]

Local meshedness of a street network is denoted as (36) $Mes_{node} = \frac{e - v + 1}{2v - 5}$ where $e$ is a number of edges in a subgraph, and $v$ is the number of nodes in a subgraph (Feliciotti 2018). A subgraph is defined as a network within topological distance five around a node

work page 2018
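
Meshedness (36) depends only on the edge and node counts of the subgraph. A minimal sketch, assuming the subgraph around a node has already been extracted and is passed as a list of undirected edges. The function name and the example graphs are illustrative.

```python
def meshedness(edges):
    """Character (36): (e - v + 1) / (2v - 5) for an undirected subgraph.

    edges: list of (u, v) node pairs; needs at least three nodes.
    """
    nodes = {node for edge in edges for node in edge}
    e, v = len(edges), len(nodes)
    return (e - v + 1) / (2 * v - 5)


# 3 x 3 street grid: 9 nodes, 12 edges, 4 enclosed blocks
grid = [((r, c), (r, c + 1)) for r in range(3) for c in range(2)]
grid += [((r, c), (r + 1, c)) for r in range(2) for c in range(3)]
grid_meshedness = meshedness(grid)

# a dead-end street made of three segments has no cycles
tree_meshedness = meshedness([(0, 1), (1, 2), (2, 3)])
```

The numerator counts independent cycles and the denominator is the maximum possible number of cycles in a planar graph with $v$ nodes, so tree-like networks score 0 and dense grids score higher.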
[41]

Mean segment length of a street network is denoted as (37) $MSL_{edg} = \frac{1}{n} \sum_{i=1}^{n} l_{edg_i}$ where $l_{edg_i}$ is a length of a street segment $i$ within a topological distance 3 around a segment

work page
[42]

Cul-de-sac length of a street network is denoted as (38) $CDL_{node} = \sum_{i=1}^{n} l_{edg_i}$, if $edg_i$ is cul-de-sac, where $l_{edg_i}$ is a length of a street segment $i$ within a topological distance 3 around a node

work page
[43]

Reached cells by street network segments is denoted as (39) $RC_{edg} = \sum_{i=1}^{n} cells_{edg_i}$ where $cells_{edg_i}$ is number of tessellation cells on segment $i$ within topological distance 3. It captures accessible granularity

work page
[44]

Node density of a street network is denoted as (40) $D_{node} = \frac{\sum node}{\sum_{i=1}^{n} l_{edg_i}}$ where $\sum node$ is a number of nodes within a subgraph and $l_{edg_i}$ is a length of a segment $i$ within a subgraph. A subgraph is defined as a network within topological distance five around a node

work page
[45]

Reached cells by street network nodes is denoted as (41) $RC_{node_{net}} = \sum_{i=1}^{n} cells_{node_i}$ where $cells_{node_i}$ is number of tessellation cells on node $i$ within topological distance 3. It captures accessible granularity

work page
[46]

Reached area by street network nodes is denoted as (42) $a_{node_{net}} = \sum_{i=1}^{n} a_{node_i}$ where $a_{node_i}$ is an area covered by a street node $i$ within topological distance 3. It captures an accessible area

work page
[47]

Proportion of cul-de-sacs within a street network is denoted as (43) $pCD_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 1}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from Boeing (2017)

work page 2017
[48]

Proportion of 3-way intersections within a street network is denoted as (44) $p3W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 3}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from Boeing (2017)

work page 2017
[49]

Proportion of 4-way intersections within a street network is denoted as (45) $p4W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 4}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from Boeing (2017)

work page 2017
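
Characters (43) to (45) share one computation: count the nodes of a given degree and divide by the number of nodes in the subgraph. A minimal sketch, assuming the subgraph is supplied as an undirected edge list. The function name `degree_proportions` is hypothetical.

```python
from collections import Counter


def degree_proportions(edges, degrees=(1, 3, 4)):
    """Share of subgraph nodes with each requested degree.

    Degree 1 corresponds to cul-de-sacs (43), 3 to 3-way intersections (44)
    and 4 to 4-way intersections (45).
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = len(degree)
    return {k: sum(1 for d in degree.values() if d == k) / total for k in degrees}


# one 3-way junction (node 0) with three dead ends
shares = degree_proportions([(0, 1), (0, 2), (0, 3)])
```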
[50]

Weighted node density of a street network is denoted as (46) $wD_{node} = \frac{\sum_{i=1}^{n} deg_{node_i} - 1}{\sum_{i=1}^{n} l_{edg_i}}$ where $deg_{node_i}$ is a degree of a node $i$ within a subgraph and $l_{edg_i}$ is a length of a segment $i$ within a subgraph. A subgraph is defined as a network within topological distance five around a node

work page
[51]

A subgraph is defined as a network within topological distance five around a node

Local closeness centrality of a street network is denoted as The Hierarchical Morphotope Classification 26 (47) 𝑙𝐶𝐶𝑛𝑜𝑑𝑒 = 𝑛−1 ∑𝑛 −1 𝑣=1 𝑑(𝑣,𝑢) where 𝑑(𝑣, 𝑢) is the shortest-path distance between 𝑣 and 𝑢, and 𝑛 is the number of nodes within a subgraph. A subgraph is defined as a network within topological distance five around a node

work page
[52]

Square clustering of a street network is denoted as (48) 𝑠𝐶𝑙𝑛𝑜𝑑𝑒 = ∑𝑘𝑣 𝑢 =1 ∑𝑘𝑣 𝑤=𝑢 +1 𝑞𝑣(𝑢,𝑤) ∑𝑘𝑣 𝑢 =1 ∑𝑘𝑣 𝑤=𝑢 +1 [𝑎𝑣(𝑢,𝑤)+𝑞𝑣(𝑢,𝑤)] where 𝑞𝑣(𝑢, 𝑤) are the number of common neighbours of 𝑢 and 𝑤 other than 𝑣 (ie squares), and 𝑎𝑣(𝑢, 𝑤)= (𝑘𝑢 − (1 + 𝑞𝑣(𝑢, 𝑤)+ 𝜃𝑢𝑣))(𝑘𝑤− (1 + 𝑞𝑣(𝑢, 𝑤)+ 𝜃𝑢𝑤)), where 𝜃𝑢𝑤 = 1 if 𝑢 and 𝑤 are connected and 0 otherwise (Lind et al. 2005)

work page 2005
[53]

Connected buildings count is denoted as (49) 𝑐𝑏𝑙𝑔 and defined as number of buildings directly adjacent to the target building

work page
[54]

Connected buildings area is denoted as (50) 𝑎𝑐𝑏𝑙𝑔 and defined as total area of all buildings directly adjacent to the target building

work page
[55]

Connected buildings perimeter is denoted as (51) 𝑝𝑐𝑏𝑙𝑔 and defined as total perimeter of all buildings directly adjacent to the target building

work page
[56]

Connected buildings elongation is denoted as (52) 𝑚𝑖𝑏𝐸𝑙𝑜𝑐𝑏𝑙𝑔 = 𝐸𝑙𝑜(𝑐𝑏𝑙𝑔) where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝐸𝑙𝑜 is the elongation formula defined previously

work page
[57]

Connected buildings elongation is denoted as (53) 𝑚𝑖𝑏𝐸𝑅𝐼𝑐𝑏𝑙𝑔 = 𝐸𝑅𝐼(𝑐𝑏𝑙𝑔) where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝐸𝑅𝐼 is the elongation formula defined previously

work page
[58]

Connected buildings circular compactness is denoted as (54) 𝑚𝑖𝑏𝐶𝐶𝑜𝑐𝑏𝑙𝑔 = 𝐶𝐶𝑜(𝑐𝑏𝑙𝑔) where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝐶𝐶𝑜 is the elongation formula defined previously

work page
[59]

Connected buildings longest axis length is denoted as (55) 𝑚𝑖𝑏𝐿𝐴𝐿𝑐𝑏𝑙𝑔 = 𝐿𝐴𝐿(𝑐𝑏𝑙𝑔) where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝐿𝐴𝐿 is the elongation formula defined previously

work page
[60]

Connected buildings facade ratio is denoted as (56) 𝑚𝑖𝑏𝐹𝑅𝑐𝑏𝑙𝑔 = 𝑚𝑖𝑏𝐴𝑟𝑒𝑐𝑏 𝑙 𝑔 𝑚𝑖𝑏𝑃 𝑒𝑟𝑐𝑏 𝑙 𝑔 where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝑚𝑖𝑏𝐴𝑟𝑒 and 𝑚𝑖𝑏𝑃𝑒𝑟 are the formulas defined previously

work page
[61]

Connected buildings square compactness is denoted as (57) 𝑚𝑖𝑏𝑆𝐶𝑜𝑐𝑏𝑙𝑔 = ( 4√𝑚𝑖𝑏𝐴𝑟𝑒𝑐𝑏 𝑙 𝑔 𝑚𝑖𝑏𝑃 𝑒𝑟𝑐𝑏 𝑙 𝑔 ) 2 where 𝑐𝑏𝑙𝑔 are all buildings adjacent to the target building and 𝑚𝑖𝑏𝐴𝑟𝑒 and 𝑚𝑖𝑏𝑃𝑒𝑟 are the formulas defined previously

work page
[62]

Deviation of building area in tessellation neighbourhood is denoted as (58) 𝑚𝑖𝑐𝐵𝐴𝐷𝑐𝑒𝑙𝑙 and is defined as the standard deviation in the areas of all buildings within tessellation cells, directly adjacent to the target tesellation cell

work page
[63]

Likely Occupied Area

Deviation of building area in node-attached buildings is denoted as (59) 𝑚𝑖𝑑𝐵𝐴𝐷𝑛𝑜𝑑𝑒 and is defined as the standard deviation in the areas of all buildings attached to the target node. The Hierarchical Morphotope Classification 27 There are three additional indicator variable calculated per morphotope - “Likely Occupied Area”, “Area of the largest ten conn...

work page
[64]

First, it generates a full Ward clustering tree based on differences in feature space, and adjacency in geographic space

work page
[65]

Second it uses Leaf extraction to generate a set of clusters from the dendrogram. The linkage matrix is generated by computing distances between observations based on the Ward formula, subject to a restriction that new connections must be spatially adjacent enclosed tessel- lation cells. The leaf extraction algorithm processes the resulting dendrogram as follows:

work page
[66]

The dendrogram is cut at all possible levels - one for each connection - starting from the lowest to the highest distance value

work page
[67]

If a cluster has more than N ETCs it is marked for extraction

At every level the number of members within each cluster and its constituent children are counted. If a cluster has more than N ETCs it is marked for extraction

work page
[68]

Since the members of a marked cluster keeps increasing until a merger occurs, typically each extracted cluster has more than N members

When one cluster marked for extraction merges with another, both are extracted from the dendrogram as separate clusters. Since the members of a marked cluster keeps increasing until a merger occurs, typically each extracted cluster has more than N members

work page
[69]

Perimeter of the largest ten connected structures

All points that are never part of a marked cluster are treated as outliers and marked as noise. Before applying the clustering algorithm, the all variables are preprocessed using a Quantile Transformer with a uniform distribution. This data transformation produces a relatively more equal weighing of all features when calculating distances between observat...

work page 2024

https://doi.org/10.5311/JOSIS.2024.28.319.

Fleischmann, Martin, Anastassia Vybornova, James D. Gaboardi, Anna Brázdová, and Daniela Dančejová. 2025. Adaptive Continuity-Preserving Simplification of Street Networks. arXiv:2504.16198. arXiv. https://doi.org/10.48550/arXiv.2504.16198.

Gil, Jorge, Nuno Montenegro, J N Beirão, and J P Duarte. 2012. “On the Discovery of Urban Typologies: Data Mining the Multi-Dimensional Character of Neighbourhoods ...

Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29063-3_18.

Hasanzadeh, Kamyar, Marketta Kyttä, and Greg Brown. 2019. “Beyond Housing Preferences: Urban Structure and Actualisation of Residential Area Preferences.” Urban Science 3 (1): 21. https://doi.org/10.3390/urbansci3010021.

Hijazi, Ihab, Xin Li, Reinhard Koenig, et al. 2016. “Measu...

https://doi.org/10.1016/j.landurbplan.2007.02.010.

Jochem, Warren C, Douglas R Leasure, Oliver Pannell, Heather R Chamberlain, Patricia Jones, and Andrew J Tatem. 2020. “Classifying Settlement Types from Multi-Scale Spatial Patterns of Building Footprints.” Environment and Planning B: Urban Analytics and City Science, May, 239980832092120. https://doi.or...

“Clustering Patterns of Urban Built-up Areas with Curves of Fractal Scaling Behaviour.” Environment and Planning B: Planning and Design 37 (5): 942–54. https://doi.org/10.1068/b36039.

Van den Bossche, Joris, Kelsey Jordahl, Martin Fleischmann, et al. 2025. Geopandas/Geopandas: V1.1.1. Version v1.1.1. Zenodo, released June. https://doi.org/10.5281/zenodo....

Area of a building is denoted as (1) $a_{blg}$ and defined as an area covered by a building footprint in m².

Perimeter of a building is denoted as (2) $p_{blg}$ and defined as the sum of lengths of the building exterior walls in m.

Courtyard area of a building is denoted as (3) $a_{blg_c}$ and defined as the sum of areas of interior holes in footprint polygons in m².

Circular compactness of a building is denoted as (4) $CCo_{blg} = \frac{a_{blg}}{a_{blg_C}}$ where $a_{blg_C}$ is an area of minimal enclosing circle. It captures the relation of building footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle (Dibble et al. 2015).

Corners of a building is denoted as (5) $Cor_{blg} = \sum_{i=1}^{n} c_{blg}$ where $c_{blg}$ is defined as a vertex of building exterior shape with an angle between adjacent line segments ≤ 170 degrees. It uses only external shape (shapely.geometry.exterior), courtyards are not included. Character is adapted from (Steiniger et al. 2008) to exclude non-corner-like vertices.

Squareness of a building is denoted as (6) $Squ_{blg} = \frac{\sum_{i=1}^{n} D_{c_{blg_i}}}{n}$ where $D$ is the deviation of angle of corner $c_{blg_i}$ from 90 degrees and $n$ is a number of corners.
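As a rough illustration of characters (5) and (6), the sketch below counts the corner-like vertices of an exterior ring and averages their deviation from 90 degrees. It is a minimal pure-Python sketch and not the authors' implementation: the function name `corner_deviations`, the plain vertex-list input, and the use of the absolute deviation are assumptions made for illustration.

```python
import math


def corner_deviations(exterior, max_angle=170.0):
    """Return |90 - angle| for every corner-like vertex of a closed ring.

    `exterior` is a list of (x, y) vertices without a repeated closing point
    (hypothetical input format). A vertex counts as a corner when the angle
    between its two adjacent segments is <= max_angle degrees.
    """
    n = len(exterior)
    deviations = []
    for i in range(n):
        px, py = exterior[i - 1]
        cx, cy = exterior[i]
        nx, ny = exterior[(i + 1) % n]
        ax, ay = px - cx, py - cy  # vector towards the previous vertex
        bx, by = nx - cx, ny - cy  # vector towards the next vertex
        cos_angle = (ax * bx + ay * by) / (
            math.hypot(ax, ay) * math.hypot(bx, by)
        )
        # Clamp to guard against rounding just outside [-1, 1].
        angle = math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
        if angle <= max_angle:
            deviations.append(abs(90.0 - angle))
    return deviations


# A rectangle with one extra collinear vertex at (2, 0), which is not a corner.
footprint = [(0, 0), (2, 0), (4, 0), (4, 3), (0, 3)]
deviations = corner_deviations(footprint)
corners = len(deviations)                      # character (5)
squareness = sum(deviations) / len(deviations)  # character (6)
```

For the rectangle above, the collinear vertex is filtered out by the 170 degree threshold, so only the four right-angled corners contribute to the squareness.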

Equivalent rectangular index of a building is denoted as (7) $ERI_{blg} = \sqrt{\frac{a_{blg}}{a_{blg_B}}} \times \frac{p_{blg_B}}{p_{blg}}$ where $a_{blg_B}$ is the area of the minimal rotated bounding rectangle (MBR) of a building footprint and $p_{blg_B}$ is the perimeter of the MBR. It is a measure of shape complexity identified by Basaraner and Cetinkaya (2017) as the shape character with the best performance.

Elongation of a building is denoted as (8) $Elo_{blg} = \frac{l_{blg_B}}{w_{blg_B}}$ where $l_{blg_B}$ is the length of the MBR and $w_{blg_B}$ is the width of the MBR. It captures the ratio of the shorter to the longer dimension of the MBR to indirectly capture the deviation of the shape from a square (Schirmer and Axhausen 2015).

Longest axis length of a tessellation cell is denoted as (9) $LAL_{cell} = d_{cell_C}$ where $d_{cell_C}$ is the diameter of the minimal circumscribed circle around the tessellation cell polygon. The axis itself does not have to be fully within the polygon. It could be seen as a proxy of plot depth for tessellation-based analysis.

Area of a tessellation cell is denoted as (10) $a_{cell}$ and defined as the area covered by a tessellation cell footprint in m².

Circular compactness of a tessellation cell is denoted as (11) $CCo_{cell} = \frac{a_{cell}}{a_{cell_C}}$ where $a_{cell_C}$ is the area of the minimal enclosing circle. It captures the relation of tessellation cell footprint shape to its minimal enclosing circle, illustrating the similarity of shape and circle.

Equivalent rectangular index of a tessellation cell is denoted as (12) $ERI_{cell} = \sqrt{\frac{a_{cell}}{a_{cell_B}}} \times \frac{p_{cell_B}}{p_{cell}}$ where $a_{cell_B}$ is the area of the minimal rotated bounding rectangle (MBR) of a tessellation cell footprint and $p_{cell_B}$ is the perimeter of the MBR. It is a measure of shape complexity identified by Basaraner and Cetinkaya (2017) as a shape character of the best performance.

Coverage area ratio of a tessellation cell is denoted as (13) $CAR_{cell} = \frac{a_{blg}}{a_{cell}}$ where $a_{blg}$ is the area of a building and $a_{cell}$ is the area of the related tessellation cell (Schirmer and Axhausen 2015). Coverage area ratio (CAR) is one of the commonly used characters capturing intensity of development. However, the definitions vary based on the spatial unit.
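To make the coverage area ratio (13) concrete, the following sketch computes it for one building and its tessellation cell using the shoelace formula. The helper names `polygon_area` and `coverage_area_ratio` and the coordinates are hypothetical and chosen only for illustration; a real workflow would more likely rely on a geometry library.

```python
def polygon_area(ring):
    """Shoelace area of a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def coverage_area_ratio(building_ring, cell_ring):
    """Building footprint area divided by the area of its tessellation cell."""
    return polygon_area(building_ring) / polygon_area(cell_ring)


# Illustrative coordinates in metres: a 10 m x 8 m building in a 20 m x 16 m cell.
building = [(2, 2), (12, 2), (12, 10), (2, 10)]
cell = [(0, 0), (20, 0), (20, 16), (0, 16)]
car = coverage_area_ratio(building, cell)
```

Here the building covers 80 m² of a 320 m² cell, so a quarter of the cell is built up.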

Length of a street segment is denoted as (14) $l_{edg}$ and defined as a length of a LineString geometry in metres (Dibble et al. 2015; Gil et al. 2012).

Width of a street profile is denoted as (15) $w_{sp} = \frac{1}{n}\left(\sum_{i=1}^{n} w_i\right)$ where $w_i$ is width of a street section $i$. The algorithm generates street sections every 3 meters alongside the street segment, and measures mean value. In the case of the open-ended street, 50 metres is used as a perception-based proximity limit (Araldi and Fusco 2019).

Openness of a street profile is denoted as (16) $Ope_{sp} = 1 - \frac{\sum hit}{2 \sum sec}$ where $\sum hit$ is a sum of section lines (left and right sides separately) intersecting buildings and $\sum sec$ total number of street sections. The algorithm generates street sections every 3 meters alongside the street segment.

Width deviation of a street profile is denoted as (17) $wDev_{sp} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(w_i - w_{sp}\right)^2}$ where $w_i$ is width of a street section $i$ and $w_{sp}$ is mean width. The algorithm generates street sections every 3 meters alongside the street segment.
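The three street profile characters (15) to (17) can be sketched from per-section measurements as follows. This is a hedged illustration rather than the algorithm used in the paper: the input format (one width per section, `None` for an open-ended section, and a count of 0, 1 or 2 building hits per section) and the function name `street_profile` are assumptions, and substituting the 50 m limit for open-ended sections is one possible reading of the text.

```python
import math


def street_profile(widths, hits, open_width=50.0):
    """Return (mean width, openness, width deviation) of a street profile.

    widths: measured width of each section in metres, or None when the
            section is open-ended (assumed input convention).
    hits:   per section, how many of its two sides intersect a building.
    """
    # Assumption: an open-ended section is assigned the 50 m proximity limit.
    filled = [open_width if w is None else w for w in widths]
    n = len(filled)
    mean_width = sum(filled) / n
    openness = 1 - sum(hits) / (2 * n)
    deviation = math.sqrt(sum((w - mean_width) ** 2 for w in filled) / n)
    return mean_width, openness, deviation


# Four sections; the third one has no buildings on either side.
width, openness, width_deviation = street_profile(
    widths=[12.0, 14.0, None, 16.0],
    hits=[2, 2, 0, 1],
)
```

The deviation divides by $n$, matching equation (17) as written, rather than by $n - 1$.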

Linearity of a street segment is denoted as (18) $Lin_{edg} = \frac{l_{eucl}}{l_{edg}}$ where $l_{eucl}$ is Euclidean distance between endpoints of a street segment and $l_{edg}$ is a street segment length. It captures the deviation of a segment shape from a straight line. It is adapted from Araldi and Fusco (2019).
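A short sketch of linearity (18) for a segment stored as a list of coordinates is given below. The function name `linearity` and the coordinate-list representation are assumptions for illustration only.

```python
import math


def linearity(coords):
    """Euclidean distance between endpoints divided by the segment length."""
    length = sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))
    return math.dist(coords[0], coords[-1]) / length


straight = linearity([(0, 0), (5, 0), (10, 0)])  # no deviation from a line
bent = linearity([(0, 0), (3, 0), (3, 4)])       # right-angle bend
```

A perfectly straight segment yields 1, and the value decreases towards 0 as the segment becomes more sinuous; the bent example has a length of 7 m between endpoints that are 5 m apart.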

Area covered by a street segment is denoted as (19) $a_{edg} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is an area of tessellation cell $i$ belonging to the street segment. It captures the area which is likely served by each segment.

Buildings per meter of a street segment is denoted as (20) $BpM_{edg} = \frac{\sum blg}{l_{edg}}$ where $\sum blg$ is a number of buildings belonging to a street segment and $l_{edg}$ is a length of a street segment. It reflects the granularity of development along each segment.

Area covered by a street node is denoted as (21) $a_{node} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is an area of tessellation cell $i$ belonging to the street node. It captures the area which is likely served by each node.

Shared walls ratio of adjacent buildings is denoted as (22) $SWR_{blg} = \frac{p_{blg_{shared}}}{p_{blg}}$ where $p_{blg_{shared}}$ is a length of a perimeter shared with adjacent buildings and $p_{blg}$ is a perimeter of a building. It captures the amount of wall space facing the open space (Hamaina et al. 2012).

Mean distance to neighbouring buildings is denoted as (23) $NDi_{blg} = \frac{1}{n}\sum_{i=1}^{n} d_{blg,blg_i}$ where $d_{blg,blg_i}$ is a distance between building and building $i$ on a neighbouring tessellation cell. It is adapted from Hijazi et al. (2016). It captures the average proximity to other buildings.

Weighted neighbours of a tessellation cell is denoted as (24) $WNe_{cell} = \frac{\sum cell_n}{p_{cell}}$ where $\sum cell_n$ is a number of cell neighbours and $p_{cell}$ is a perimeter of a cell. It reflects granularity of morphological tessellation.

Area covered by neighbouring cells is denoted as (25) $a_{cell_n} = \sum_{i=1}^{n} a_{cell_i}$ where $a_{cell_i}$ is area of tessellation cell $i$ within topological distance 1. It captures the scale of morphological tessellation.

Reached area by neighbouring segments is denoted as (26) $a_{edg_n} = \sum_{i=1}^{n} a_{edg_i}$ where $a_{edg_i}$ is an area covered by a street segment $i$ within topological distance 1. It captures an accessible area.

Degree of a street node is denoted as (27) $deg_{node_i} = \sum_{j} edg_{ij}$ where $edg_{ij}$ is an edge of a street network between node $i$ and node $j$. It reflects the basic degree centrality.

Mean distance to neighbouring nodes from a street node is denoted as (28) $MDi_{node} = \frac{1}{n}\sum_{i=1}^{n} d_{node,node_i}$ where $d_{node,node_i}$ is a distance between node and node $i$ within topological distance 1. It captures the average proximity to other nodes.

Reached cells by neighbouring nodes is denoted as (29) $RC_{node_n} = \sum_{i=1}^{n} cells_{node_i}$ where $cells_{node_i}$ is number of tessellation cells on node $i$ within topological distance 1. It captures accessible granularity.

Reached area by neighbouring nodes is denoted as (30) $a_{node_n} = \sum_{i=1}^{n} a_{node_i}$ where $a_{node_i}$ is an area covered by a street node $i$ within topological distance 1. It captures an accessible area.

Number of courtyards of adjacent buildings is denoted as (31) $NCo_{blg_{adj}}$ where $NCo_{blg_{adj}}$ is a number of interior rings of a polygon composed of footprints of adjacent buildings (Schirmer and Axhausen 2015).

Perimeter wall length of adjacent buildings is denoted as (32) $p_{blg_{adj}}$ where $p_{blg_{adj}}$ is a length of an exterior ring of a polygon composed of footprints of adjacent buildings.

Mean inter-building distance between neighbouring buildings is denoted as (33) $IBD_{blg} = \frac{1}{n}\sum_{i=1}^{n} d_{blg,blg_i}$ where $d_{blg,blg_i}$ is a distance between building and building $i$ on a tessellation cell within topological distance 3. It is adapted from Caruso et al. (2017). It captures the average proximity between buildings.

Building adjacency of neighbouring buildings is denoted as (34) $BuA_{blg} = \frac{\sum blg_{adj}}{\sum blg}$ where $\sum blg_{adj}$ is a number of joined built-up structures within topological distance three and $\sum blg$ is a number of buildings within topological distance 3. It is adapted from Vanderhaegen and Canters (2017).

Weighted reached blocks of neighbouring tessellation cells is denoted as (35) $WRB_{cell} = \frac{\sum blk}{\sum_{i=1}^{n} a_{cell_i}}$ where $\sum blk$ is a number of blocks within topological distance three and $a_{cell_i}$ is an area of tessellation cell $i$ within topological distance three.

Local meshedness of a street network is denoted as (36) $Mes_{node} = \frac{e - v + 1}{2v - 5}$ where $e$ is a number of edges in a subgraph, and $v$ is the number of nodes in a subgraph (Feliciotti 2018). A subgraph is defined as a network within topological distance five around a node.
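Local meshedness (36) can be illustrated with a small breadth-first search that collects the subgraph around a node and then applies the formula. The sketch below assumes a simple undirected graph stored as an adjacency dictionary with integer node ids; `ego_nodes`, `local_meshedness` and `grid_graph` are hypothetical helper names, not functions from the paper's code.

```python
from collections import deque


def ego_nodes(adjacency, source, radius):
    """Nodes reachable from `source` within `radius` topological steps."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == radius:
            continue  # do not expand beyond the radius
        for neighbour in adjacency[node]:
            if neighbour not in depth:
                depth[neighbour] = depth[node] + 1
                queue.append(neighbour)
    return set(depth)


def local_meshedness(adjacency, source, radius=5):
    """(e - v + 1) / (2v - 5) on the subgraph around `source`."""
    nodes = ego_nodes(adjacency, source, radius)
    v = len(nodes)
    # Count every undirected edge inside the subgraph exactly once.
    e = sum(1 for a in nodes for b in adjacency[a] if b in nodes and a < b)
    return (e - v + 1) / (2 * v - 5)


def grid_graph(rows, cols):
    """Adjacency dictionary of a rows x cols street grid (toy data)."""
    adjacency = {r * cols + c: set() for r in range(rows) for c in range(cols)}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            if c + 1 < cols:
                adjacency[node].add(node + 1)
                adjacency[node + 1].add(node)
            if r + 1 < rows:
                adjacency[node].add(node + cols)
                adjacency[node + cols].add(node)
    return adjacency


grid = grid_graph(3, 3)
meshedness = local_meshedness(grid, source=4)  # centre node of the grid
```

The 3 by 3 grid has 9 nodes and 12 edges, all within five steps of the centre, so the value is (12 - 9 + 1) / (2 · 9 - 5) = 4/13.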

Mean segment length of a street network is denoted as (37) $MSL_{edg} = \frac{1}{n}\sum_{i=1}^{n} l_{edg_i}$ where $l_{edg_i}$ is the length of a street segment $i$ within topological distance 3 around a segment.

Cul-de-sac length of a street network is denoted as (38) $CDL_{node} = \sum_{i=1}^{n} l_{edg_i}, \text{ if } edg_i \text{ is cul-de-sac}$ where $l_{edg_i}$ is the length of a street segment $i$ within topological distance 3 around a node.

Reached cells by street network segments is denoted as (39) $RC_{edg} = \sum_{i=1}^{n} cells_{edg_i}$ where $cells_{edg_i}$ is the number of tessellation cells on segment $i$ within topological distance 3. It captures accessible granularity.

Node density of a street network is denoted as (40) $D_{node} = \frac{\sum node}{\sum_{i=1}^{n} l_{edg_i}}$ where $\sum node$ is the number of nodes within a subgraph and $l_{edg_i}$ is the length of a segment $i$ within a subgraph. A subgraph is defined as a network within topological distance five around a node.

Reached cells by street network nodes is denoted as (41) $RC_{node_{net}} = \sum_{i=1}^{n} cells_{node_i}$ where $cells_{node_i}$ is the number of tessellation cells on node $i$ within topological distance 3. It captures accessible granularity.

Reached area by street network nodes is denoted as (42) $a_{node_{net}} = \sum_{i=1}^{n} a_{node_i}$ where $a_{node_i}$ is the area covered by a street node $i$ within topological distance 3. It captures an accessible area.

Proportion of cul-de-sacs within a street network is denoted as (43) $pCD_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 1}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from (Boeing 2017).

Proportion of 3-way intersections within a street network is denoted as (44) $p3W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 3}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from (Boeing 2017).

Proportion of 4-way intersections within a street network is denoted as (45) $p4W_{node} = \frac{\sum_{i=1}^{n} node_i, \text{ if } deg_{node_i} = 4}{\sum_{i=1}^{n} node_i}$ where $node_i$ is a node within topological distance five around a node. Adapted from (Boeing 2017).
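The three proportions (43) to (45) share one computation: count the nodes of a given degree and divide by the number of nodes. The sketch below assumes the subgraph within topological distance five has already been extracted and is passed as a plain edge list; the function name `degree_proportions` and the toy network are illustrative assumptions.

```python
from collections import Counter


def degree_proportions(edges, degrees=(1, 3, 4)):
    """Share of nodes with each requested degree in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    counts = Counter(degree.values())
    return {k: counts[k] / n for k in degrees}


# Toy subgraph: node 1 is a 3-way and node 3 a 4-way intersection,
# the remaining five nodes are dead ends.
edges = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5), (3, 6)]
proportions = degree_proportions(edges)
p_cul_de_sac = proportions[1]
p_three_way = proportions[3]
p_four_way = proportions[4]
```

In this toy network five of the seven nodes have degree 1, so the cul-de-sac proportion is 5/7, while the 3-way and 4-way proportions are 1/7 each.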

Weighted node density of a street network is denoted as (46) $wD_{node} = \frac{\sum_{i=1}^{n} deg_{node_i} - 1}{\sum_{i=1}^{n} l_{edg_i}}$ where $deg_{node_i}$ is a degree of a node $i$ within a subgraph and $l_{edg_i}$ is a length of a segment $i$ within a subgraph. A subgraph is defined as a network within topological distance five around a node.
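Node density (40) and weighted node density (46) can be compared on the same subgraph with the sketch below. Two points are assumptions: the subgraph is given as an edge list with segment lengths in metres, and the numerator of (46) is read as the sum of (degree minus 1) over all nodes, which is only one plausible reading of the printed formula. The function name `node_densities` is hypothetical.

```python
def node_densities(edges):
    """Return (node density, weighted node density) for a subgraph.

    edges: iterable of (u, v, length_m) tuples describing street segments.
    """
    degree = {}
    total_length = 0.0
    for u, v, length in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        total_length += length
    plain = len(degree) / total_length
    # Assumption: each node contributes (degree - 1) to the numerator.
    weighted = sum(k - 1 for k in degree.values()) / total_length
    return plain, weighted


# A T-junction: three segments meeting at node 1, 200 m of street in total.
density, weighted_density = node_densities(
    [(0, 1, 100.0), (1, 2, 50.0), (1, 3, 50.0)]
)
```

Under this reading, dead ends add nothing to the weighted numerator, so the weighted value emphasises intersections over the raw node count.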

Local closeness centrality of a street network is denoted as (47) $lCC_{node} = \frac{n - 1}{\sum_{v=1}^{n-1} d(v,u)}$ where $d(v,u)$ is the shortest-path distance between $v$ and $u$, and $n$ is the number of nodes within a subgraph. A subgraph is defined as a network within topological distance five around a node.
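Local closeness centrality (47) can be sketched with a radius-limited breadth-first search. The example below measures shortest paths in topological steps; whether the paper uses step counts or metric lengths as $d(v,u)$ is not stated here, so the unweighted distance is an assumption, as are the function name `local_closeness` and the adjacency-dictionary input.

```python
from collections import deque


def local_closeness(adjacency, source, radius=5):
    """(n - 1) / sum of shortest-path distances within `radius` steps."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue  # stay inside the local subgraph
        for neighbour in adjacency[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    n = len(distance)
    total = sum(distance.values())
    return (n - 1) / total if total > 0 else 0.0


# A straight street with five nodes in a row: 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
closeness_middle = local_closeness(path, 2)
closeness_end = local_closeness(path, 0)
```

The middle node reaches the others in 1 + 1 + 2 + 2 = 6 steps and the end node in 1 + 2 + 3 + 4 = 10 steps, so the middle node has the higher closeness (4/6 against 4/10).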

Square clustering of a street network is denoted as (48) $sCl_{node} = \frac{\sum_{u=1}^{k_v}\sum_{w=u+1}^{k_v} q_v(u,w)}{\sum_{u=1}^{k_v}\sum_{w=u+1}^{k_v}\left[a_v(u,w) + q_v(u,w)\right]}$ where $q_v(u,w)$ is the number of common neighbours of $u$ and $w$ other than $v$ (i.e. squares), and $a_v(u,w) = (k_u - (1 + q_v(u,w) + \theta_{uv}))(k_w - (1 + q_v(u,w) + \theta_{uw}))$, where $\theta_{uw} = 1$ if $u$ and $w$ are connected and 0 otherwise (Lind et al. 2005).

Connected buildings count is denoted as (49) $c_{blg}$ and defined as the number of buildings directly adjacent to the target building.

Connected buildings area is denoted as (50) $a_{cblg}$ and defined as the total area of all buildings directly adjacent to the target building.

Connected buildings perimeter is denoted as (51) $p_{cblg}$ and defined as the total perimeter of all buildings directly adjacent to the target building.

Connected buildings elongation is denoted as (52) $mibElo_{cblg} = Elo(cblg)$ where $cblg$ are all buildings adjacent to the target building and $Elo$ is the elongation formula defined previously.

Connected buildings equivalent rectangular index is denoted as (53) $mibERI_{cblg} = ERI(cblg)$ where $cblg$ are all buildings adjacent to the target building and $ERI$ is the equivalent rectangular index formula defined previously.

Connected buildings circular compactness is denoted as (54) $mibCCo_{cblg} = CCo(cblg)$ where $cblg$ are all buildings adjacent to the target building and $CCo$ is the circular compactness formula defined previously.

Connected buildings longest axis length is denoted as (55) $mibLAL_{cblg} = LAL(cblg)$ where $cblg$ are all buildings adjacent to the target building and $LAL$ is the longest axis length formula defined previously.

Connected buildings facade ratio is denoted as (56) $mibFR_{cblg} = \frac{mibAre_{cblg}}{mibPer_{cblg}}$ where $cblg$ are all buildings adjacent to the target building and $mibAre$ and $mibPer$ are the formulas defined previously.

Connected buildings square compactness is denoted as (57) $mibSCo_{cblg} = \left(\frac{4\sqrt{mibAre_{cblg}}}{mibPer_{cblg}}\right)^2$ where $cblg$ are all buildings adjacent to the target building and $mibAre$ and $mibPer$ are the formulas defined previously.
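The facade ratio (56) and square compactness (57) both depend only on the area and perimeter of the connected structure, which the sketch below makes explicit. The function names and the example dimensions are assumptions chosen for illustration.

```python
import math


def facade_ratio(area, perimeter):
    """Area of the connected structure divided by its perimeter."""
    return area / perimeter


def square_compactness(area, perimeter):
    """(4 * sqrt(area) / perimeter) squared; equals 1 for a perfect square."""
    return (4 * math.sqrt(area) / perimeter) ** 2


# Two connected structures with the same 100 m² footprint area.
square_block = square_compactness(area=100.0, perimeter=40.0)  # 10 m x 10 m
row_block = square_compactness(area=100.0, perimeter=50.0)     # 20 m x 5 m
row_facade_ratio = facade_ratio(area=100.0, perimeter=50.0)
```

With equal area, the elongated 20 m by 5 m row has the longer perimeter and therefore a lower square compactness (0.64) than the square block (1.0).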

Deviation of building area in tessellation neighbourhood is denoted as (58) $micBAD_{cell}$ and is defined as the standard deviation of the areas of all buildings within tessellation cells directly adjacent to the target tessellation cell.

Deviation of building area in node-attached buildings is denoted as (59) $midBAD_{node}$ and is defined as the standard deviation of the areas of all buildings attached to the target node.

There are three additional indicator variables calculated per morphotope - “Likely Occupied Area”, “Area of the largest ten conn...

First, it generates a full Ward clustering tree based on differences in feature space and adjacency in geographic space.

Second, it uses leaf extraction to generate a set of clusters from the dendrogram. The linkage matrix is generated by computing distances between observations based on the Ward formula, subject to the restriction that new connections must be between spatially adjacent enclosed tessellation cells (ETCs). The leaf extraction algorithm processes the resulting dendrogram as follows:

1. The dendrogram is cut at all possible levels, one for each connection, starting from the lowest to the highest distance value.
2. At every level, the number of members within each cluster and its constituent children is counted. If a cluster has more than N ETCs, it is marked for extraction.
3. When one cluster marked for extraction merges with another, both are extracted from the dendrogram as separate clusters. Since the membership of a marked cluster keeps increasing until a merger occurs, each extracted cluster typically has more than N members.
4. All points that are never part of a marked cluster are treated as outliers and marked as noise.

Before applying the clustering algorithm, all variables are preprocessed using a Quantile Transformer with a uniform distribution. This data transformation produces a relatively more equal weighting of all features when calculating distances between observat...
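The leaf extraction steps described above can be sketched on a toy merge sequence. This is a simplified illustration under stated assumptions, not the authors' implementation: merges are given as pairs of cluster ids ordered by increasing distance, with new clusters numbered as in the first two columns of a SciPy-style linkage matrix; a marked cluster that meets an already split branch is extracted on its own; and a marked cluster that reaches the top of the tree without meeting another one is extracted as a single cluster. The name `extract_leaves` is hypothetical.

```python
UNMARKED, MARKED, CLOSED = range(3)


def extract_leaves(n_points, merges, min_size):
    """Label points by leaf-style extraction; -1 marks noise.

    merges: (a, b) pairs sorted by increasing merge distance; the cluster
            created by merges[i] receives the id n_points + i.
    min_size: the threshold N; clusters with more than N members are marked.
    """
    members = {i: [i] for i in range(n_points)}
    state = {i: (MARKED if 1 > min_size else UNMARKED) for i in range(n_points)}
    labels = [-1] * n_points
    n_clusters = 0

    def extract(points):
        nonlocal n_clusters
        for point in points:
            labels[point] = n_clusters
        n_clusters += 1

    for step, (a, b) in enumerate(merges):
        parent = n_points + step
        children = [(state.pop(a), members.pop(a)),
                    (state.pop(b), members.pop(b))]
        child_states = [child_state for child_state, _ in children]
        if CLOSED in child_states or child_states == [MARKED, MARKED]:
            # Marked children become separate clusters; points of unmarked
            # children joining a closed branch keep the noise label.
            for child_state, points in children:
                if child_state == MARKED:
                    extract(points)
            state[parent], members[parent] = CLOSED, []
        else:
            merged = children[0][1] + children[1][1]
            state[parent] = MARKED if len(merged) > min_size else UNMARKED
            members[parent] = merged

    # Assumption: a marked cluster that never merged with another is kept.
    for cluster_id, cluster_state in state.items():
        if cluster_state == MARKED:
            extract(members[cluster_id])
    return labels


# Seven cells: two groups of three grow separately, then merge; cell 6 joins last.
merges = [(0, 1), (2, 3), (7, 4), (8, 5), (9, 10), (11, 6)]
labels = extract_leaves(n_points=7, merges=merges, min_size=2)
```

In this toy case cells 0, 1 and 4 form one cluster and cells 2, 3 and 5 another, while cell 6 joins only after both were extracted and is therefore labelled as noise.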