hood2vec: Identifying Similar Urban Areas Using Mobility Networks

Alexandros Labrinidis; Konstantinos Pelechrinis; Xin Liu

arxiv: 1907.11951 · v1 · pith:7MFAHXUJnew · submitted 2019-07-17 · 💻 cs.SI · cs.LG · stat.ML

Xin Liu, Konstantinos Pelechrinis, Alexandros Labrinidis

Pith reviewed 2026-05-24 20:06 UTC · model grok-4.3

classification 💻 cs.SI cs.LG stat.ML

keywords urban areas, mobility networks, node embeddings, Foursquare, neighborhood similarity, venue types, check-ins, hood2vec

The pith

Mobility networks from check-ins can measure urban area similarity in ways that differ from venue type comparisons.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents hood2vec as a way to learn embeddings for urban neighborhoods directly from a mobility network built on Foursquare check-in data. The goal is to capture time-varying resident movements as a signal of how areas function and relate to each other. When these mobility-derived similarities are compared to similarities computed from the mix of venue types in each area, the two rankings show low correlation. A sympathetic reader would take this to mean that dynamic movement patterns and static place inventories are picking up separate dimensions of what makes neighborhoods alike. The work therefore positions mobility data as an additional lens rather than a replacement for traditional venue-based descriptions.
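The venue-type baseline mentioned above can be illustrated with a minimal sketch. It assumes each area is described by a list of venue-type labels and that similarity is the cosine between the resulting count vectors; the abstract does not say which measure the authors use, so both the function name `venue_type_similarity` and the choice of cosine similarity are assumptions made for illustration.

```python
import math
from collections import Counter


def venue_type_similarity(venues_a, venues_b):
    """Cosine similarity between the venue-type count vectors of two areas.

    Assumption: cosine similarity is one plausible baseline, not
    necessarily the measure used in the paper.
    """
    counts_a, counts_b = Counter(venues_a), Counter(venues_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[k] * counts_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an area with no venues is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical venue inventories for two areas.
area_x = ["bar", "bar", "cafe", "gallery"]
area_y = ["bar", "cafe", "cafe", "gym"]
sim_xy = venue_type_similarity(area_x, area_y)
print(round(sim_xy, 3))
```

A measure of this kind sees only what is located in an area, not who moves through it or when, which is the gap the mobility-based view is meant to fill.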

Core claim

hood2vec applies node embedding methods to a mobility network constructed from Foursquare check-ins, producing vector representations of urban areas whose pairwise similarities exhibit low correlation with similarities obtained by comparing the venue-type profiles of those same areas, which indicates that mobility dynamics and venue types capture different aspects of similarity between urban areas.
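As a rough illustration of how such a comparison can be computed, the sketch below takes two sets of pairwise area similarities and returns their Spearman rank correlation. The abstract reports only a "low correlation" without naming the coefficient, so the use of Spearman's rho, the helper names, and the toy values are all assumptions.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j correspond to ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_of_pairwise(sim_a, sim_b, areas):
    """Spearman correlation over all unordered pairs of areas."""
    pairs = list(combinations(areas, 2))
    xs = [sim_a[p] for p in pairs]
    ys = [sim_b[p] for p in pairs]
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical similarities for four areas, keyed by (area, area) pairs.
areas = ["A", "B", "C", "D"]
mobility_sim = {("A", "B"): 0.9, ("A", "C"): 0.1, ("A", "D"): 0.4,
                ("B", "C"): 0.3, ("B", "D"): 0.7, ("C", "D"): 0.2}
venue_sim = {("A", "B"): 0.5, ("A", "C"): 0.3, ("A", "D"): 0.9,
             ("B", "C"): 0.1, ("B", "D"): 0.6, ("C", "D"): 0.8}
rho = spearman_of_pairwise(mobility_sim, venue_sim, areas)
print(rho)
```

Working on ranks rather than raw values keeps the comparison insensitive to the different scales of embedding-based and venue-based similarity scores.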

What carries the argument

hood2vec, a node-embedding procedure applied to a directed mobility network whose edges are weighted by check-in flows between urban areas.
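One way such a network could be assembled is sketched below. The abstract does not describe the actual construction, so the rule used here (consecutive check-ins by the same user, in different areas, within a fixed time gap, count as one transition) and the names `build_mobility_network` and `max_gap_seconds` are assumptions for illustration only.

```python
from collections import defaultdict


def build_mobility_network(checkins, max_gap_seconds=4 * 3600):
    """Aggregate per-user check-in sequences into directed area-to-area flows.

    checkins: iterable of (user_id, unix_timestamp, area_id) tuples.
    Returns {(origin_area, destination_area): transition_count}.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, area_id in checkins:
        by_user[user_id].append((timestamp, area_id))

    flows = defaultdict(int)
    for visits in by_user.values():
        visits.sort()  # chronological order per user
        for (t0, a0), (t1, a1) in zip(visits, visits[1:]):
            # Assumption: only consecutive check-ins that are close in time
            # and fall in different areas count as a movement.
            if a0 != a1 and (t1 - t0) <= max_gap_seconds:
                flows[(a0, a1)] += 1
    return dict(flows)


# Toy data with hypothetical users and area labels.
toy_checkins = [
    ("u1", 0, "LES"), ("u1", 1800, "SoHo"), ("u1", 90000, "LES"),
    ("u2", 100, "LES"), ("u2", 2000, "SoHo"), ("u2", 4000, "Chelsea"),
]
network = build_mobility_network(toy_checkins)
print(network)
```

Restricting the input to check-ins from a given time window before aggregating would yield one network per window, which is one possible reading of the "time-variant" aspect the paper emphasizes.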

If this is right

Areas judged similar by venue mix can still be functionally distinct when their mobility patterns are examined.
Mobility-based similarity can surface cross-city neighborhood matches that venue lists alone would miss.
Combining both signals could produce richer characterizations of urban neighborhoods than either signal alone.
The approach extends naturally to any city with timestamped location data that can be aggregated into area-to-area flows.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Urban planners might use mobility embeddings to identify neighborhoods that serve similar daily roles even when their building stock differs.
The same embedding technique could be applied to transportation or cell-phone flow data to test whether the low-correlation result holds beyond check-in sources.
If the two similarity views remain distinct, recommendation systems for new residents or tourists could offer both venue-style and movement-style matches.

Load-bearing premise

Foursquare check-ins supply a sufficiently complete and unbiased record of how people actually move through the city over time.

What would settle it

Re-running the same embedding procedure on a different mobility dataset for the same cities and obtaining high correlation with the venue-type similarities would falsify the claim that the two sources capture distinct aspects.

Original abstract

Which area in NYC is the most similar to Lower East Side? What about the NoHo Arts District in Los Angeles? Traditionally this task utilizes information about the type of places located within the areas and some popularity/quality metric. We take a different approach. In particular, urban dwellers' time-variant mobility is a reflection of how they interact with their city over time. Hence, in this paper, we introduce an approach, namely hood2vec, to identify the similarity between urban areas through learning a node embedding of the mobility network captured through Foursquare check-ins. We compare the pairwise similarities obtained from hood2vec with the ones obtained from comparing the types of venues in the different areas. The low correlation between the two indicates that the mobility dynamics and the venue types potentially capture different aspects of similarity between urban areas.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

hood2vec applies node embeddings to a Foursquare mobility graph and reports low correlation with venue-type similarity, but the result rests on whether the check-in data actually captures resident mobility flows.

hood2vec builds a mobility network from Foursquare check-ins, treats urban areas as nodes, learns embeddings on that graph, and then measures pairwise area similarity from the embeddings. It compares those similarities to ones derived from venue-type distributions and finds low correlation, which the authors read as evidence that mobility patterns and venue composition pick up different aspects of neighborhood likeness.

The setup is straightforward and the baseline comparison is a natural one to run. The paper does a clean job of framing the task and showing that the two signals are not redundant on this dataset. That is the main concrete contribution: a new empirical signal derived from mobility rather than static place attributes.

The soft spot is the data. Foursquare check-ins are not a representative sample of how residents move through a city; they skew by age, income, venue popularity, and tourist presence. If those biases dominate the graph, the embeddings will encode the sampling artifacts rather than genuine time-variant mobility, and the low correlation could simply reflect noise in one or both measures instead of orthogonal information. The abstract gives no robustness checks, no validation against other mobility sources, and no details on graph construction or embedding hyperparameters. Without those, it is hard to know how much weight to put on the result.

This is a narrow methods paper aimed at researchers who already work with location-based social media data and network embeddings for urban analytics. It will not shift broader theory or practice, but it supplies one additional signal that could be tested in combination with others.

I would send it to peer review. The method is simple enough that referees can evaluate whether the data limitations undermine the central claim or whether the approach still adds usable information despite them.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes hood2vec, a method that constructs a mobility network from Foursquare check-ins and learns node embeddings to quantify similarity between urban areas. It reports a low correlation between the resulting pairwise similarities and those obtained by comparing venue types across areas, concluding that mobility dynamics and venue composition capture distinct aspects of urban-area similarity.

Significance. If the low-correlation result is shown to be robust and the embeddings are demonstrated to faithfully encode resident mobility (rather than data artifacts), the work would indicate that dynamic interaction patterns supply a complementary signal to static venue inventories. This could inform applications in urban planning and neighborhood recommendation. The approach applies standard network-embedding techniques to mobility data, but the absence of methodological specifics, statistical validation, or robustness checks in the provided text limits any assessment of its incremental contribution.

major comments (2)

[Abstract] Abstract: the central claim that the observed low correlation demonstrates that 'mobility dynamics and the venue types potentially capture different aspects' is not supported by any reported statistical test, confidence interval, or robustness check on the correlation value; without these, measurement error or sampling bias in the Foursquare source could equally explain the result.
[Abstract] Abstract: no description is given of how the mobility network is built from check-ins (edge weighting, temporal aggregation, resident vs. tourist filtering), the embedding algorithm, or its hyperparameters; these omissions make it impossible to evaluate whether the embeddings actually encode time-variant resident mobility flows as asserted.
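The kind of statistical check the first comment asks for could look like the Mantel-style permutation test sketched below. It is not taken from the paper: the function name `mantel_test`, the choice of Pearson correlation, the number of permutations, and the synthetic similarity values are all assumptions. The idea is to shuffle the area labels of one similarity matrix and count how often the shuffled correlation is at least as extreme as the observed one.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def mantel_test(sim_a, sim_b, areas, permutations=999, seed=0):
    """Two-sided permutation test for the correlation of two similarity maps.

    sim_a, sim_b: dicts mapping frozenset({area_1, area_2}) -> similarity.
    Returns (observed_correlation, p_value).
    """
    rng = random.Random(seed)
    pairs = list(combinations(areas, 2))
    xs = [sim_a[frozenset(p)] for p in pairs]
    observed = pearson(xs, [sim_b[frozenset(p)] for p in pairs])

    extreme = 0
    for _ in range(permutations):
        shuffled = list(areas)
        rng.shuffle(shuffled)
        relabel = dict(zip(areas, shuffled))
        # Relabel areas in sim_b only; this keeps each matrix internally
        # consistent while breaking the correspondence between the two.
        ys = [sim_b[frozenset((relabel[a], relabel[b]))] for a, b in pairs]
        if abs(pearson(xs, ys)) >= abs(observed):
            extreme += 1
    p_value = (extreme + 1) / (permutations + 1)
    return observed, p_value


# Synthetic similarities for six hypothetical areas.
areas = list("ABCDEF")
data_rng = random.Random(1)
sim_a = {frozenset(p): data_rng.random() for p in combinations(areas, 2)}
sim_b = {pair: value + 0.1 * data_rng.random() for pair, value in sim_a.items()}
observed, p_value = mantel_test(sim_a, sim_b, areas)
print(observed, p_value)
```

Permuting labels rather than individual pairs respects the fact that pairwise similarities sharing an area are not independent observations, which is why an ordinary significance test on the pair list would be too optimistic.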

minor comments (1)

[Abstract] Abstract: the phrase 'namely hood2vec' introduces the method name without indicating its relationship to established embedding frameworks such as node2vec or DeepWalk.
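For readers unfamiliar with the frameworks named in this comment, the sketch below shows the first stage of a DeepWalk-style pipeline: sampling random walks over a directed, weighted graph, with each step chosen in proportion to edge weight. Whether hood2vec follows this recipe is not stated in the abstract, so this is an illustration of the general family rather than of the authors' method. The function name, parameters, and toy flows are hypothetical, and the subsequent skip-gram training step that would turn walks into vectors is omitted.

```python
import random


def weighted_random_walks(flows, walk_length=5, walks_per_node=2, seed=0):
    """Sample random walks on a directed graph given as {(src, dst): weight}.

    Each step picks an outgoing edge with probability proportional to its
    weight. A walk stops early at a node with no outgoing edges.
    """
    rng = random.Random(seed)
    out_edges = {}
    for (src, dst), weight in flows.items():
        out_edges.setdefault(src, []).append((dst, weight))
    nodes = sorted({node for edge in flows for node in edge})

    walks = []
    for _ in range(walks_per_node):
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length and walk[-1] in out_edges:
                neighbors, weights = zip(*out_edges[walk[-1]])
                walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
            walks.append(walk)
    return walks


# Hypothetical area-to-area flow counts.
toy_flows = {("LES", "SoHo"): 2, ("SoHo", "Chelsea"): 1,
             ("Chelsea", "LES"): 3, ("SoHo", "LES"): 1}
walks = weighted_random_walks(toy_flows)
for walk in walks:
    print(" -> ".join(walk))
```

In the DeepWalk and node2vec line of work, walks like these are treated as sentences and fed to a word2vec-style model, so areas that appear in similar walk contexts end up with nearby embeddings.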

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below and will revise the manuscript accordingly.

Referee: [Abstract] Abstract: the central claim that the observed low correlation demonstrates that 'mobility dynamics and the venue types potentially capture different aspects' is not supported by any reported statistical test, confidence interval, or robustness check on the correlation value; without these, measurement error or sampling bias in the Foursquare source could equally explain the result.

Authors: The abstract uses cautious language ('indicates' and 'potentially') rather than asserting that the result demonstrates distinct aspects. We agree, however, that the claim would be strengthened by reporting the actual correlation value along with any available statistical measures or caveats about data artifacts. We will revise the abstract to include the correlation coefficient and qualify the interpretation to acknowledge possible influences from sampling bias or measurement error in the Foursquare data.

revision: yes

Referee: [Abstract] Abstract: no description is given of how the mobility network is built from check-ins (edge weighting, temporal aggregation, resident vs. tourist filtering), the embedding algorithm, or its hyperparameters; these omissions make it impossible to evaluate whether the embeddings actually encode time-variant resident mobility flows as asserted.

Authors: The abstract is intentionally concise. The full manuscript describes the mobility network construction from Foursquare check-ins (including edge weighting and temporal aspects) and specifies the embedding algorithm with its hyperparameters. To improve clarity, we will revise the abstract to include a brief high-level description of the network construction and embedding approach.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison of independent similarity measures

The paper defines hood2vec as node embeddings learned from a mobility network constructed directly from Foursquare check-ins, then reports an empirical low correlation between the resulting pairwise similarities and a separately computed venue-type similarity baseline. This is a direct measurement against an external reference rather than any derivation, fitted parameter, or self-citation that reduces the claimed result to its own inputs by construction. No equations, ansatzes, or uniqueness theorems are invoked that would create self-definitional or load-bearing circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that check-in data faithfully represent mobility.

pith-pipeline@v0.9.0 · 5674 in / 1022 out tokens · 18770 ms · 2026-05-24T20:06:14.397465+00:00 · methodology
