
arxiv: 2606.27513 · v1 · pith:H4DLLFNYnew · submitted 2026-06-25 · ⚛️ physics.soc-ph

Toward a Hybrid Digital Twin of Society: Quantifying Cognitive-Spatial Linkages Through Online-Offline Feedback Networks

Pith reviewed 2026-06-29 00:59 UTC · model grok-4.3

classification ⚛️ physics.soc-ph
keywords feedback network · urban mobility · online-offline interaction · cognitive-spatial linkages · concentration entropy · digital twin · Budapest data · COVID-19 disruption

The pith

A Feedback Network framework reveals that urban mobility arises from interactions between online searches and physical visits.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a computational method called the Feedback Network to track how digital information seeking and real-world movement influence each other in the same individuals. It applies this to Google data from Budapest residents over several years, adapting measures of geographic spread to compare semantic exploration online with physical mobility offline. The analysis finds that online searches cluster more tightly around repeated topics while physical movements cover more varied ground, yet specific areas like retail create enduring links between the two domains. The work shows these patterns held even as the pandemic altered routines, with movement disrupted more than search habits. If accurate, this means models of city life must treat online and offline activity as coupled rather than separate.

Core claim

The Feedback Network models transitions between search-related and location-related activity clusters drawn from the same people's data. Concentration Entropy evaluates the network by distinguishing routine flows from exploratory ones. The results indicate that online exploration is more concentrated than offline mobility, that stable linkages persist in retail and business services, and that the COVID-19 period widened the gap by disrupting spatial routines more than cognitive ones. Together, these findings support the claim that urban mobility depends on the interaction between informational exposure and spatial encounter.

What carries the argument

The Feedback Network, which captures co-evolution of cognitive activity clusters from searches and spatial activity clusters from visits and is assessed by Concentration Entropy to quantify whether flows concentrate on routines or spread across exploratory transitions.
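The text here does not state how Concentration Entropy is computed, so the following is only a minimal sketch of one plausible reading: a Shannon entropy over the share of flow carried by each directed cluster-to-cluster transition, normalized to lie between 0 and 1. The function name `normalized_flow_entropy`, the cluster labels, and the normalization choice are assumptions made for illustration, not the authors' definition.

```python
import math
from collections import Counter


def normalized_flow_entropy(transitions):
    """Normalized Shannon entropy of transition flow shares.

    ASSUMPTION: the paper's exact Concentration Entropy formula is not given
    in this text; this is a generic stand-in, not the authors' definition.
    `transitions` is a list of (origin_cluster, destination_cluster) pairs.
    Returns 0.0 when all flow uses a single pathway and 1.0 when flow is
    spread evenly over the observed pathways.
    """
    counts = Counter(transitions)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts.values()]
    entropy = -sum(p * math.log(p) for p in probs)
    # Divide by the maximum entropy for this number of observed pathways.
    return entropy / math.log(len(counts))


# Hypothetical cluster labels: "S:" marks search clusters, "V:" visit clusters.
routine = [("S:retail", "V:retail")] * 9 + [("S:news", "V:food")]
exploratory = [
    ("S:retail", "V:retail"),
    ("S:news", "V:food"),
    ("S:travel", "V:transport"),
    ("S:health", "V:services"),
]

print(round(normalized_flow_entropy(routine), 3))
print(round(normalized_flow_entropy(exploratory), 3))
```

Under this reading, values near 0 correspond to flows concentrated on routine pathways and values near 1 to flows spread across exploratory transitions.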

If this is right

  • Online search patterns remain narrower and more repetitive than the diverse range of physical movements.
  • Stable cognitive-spatial behavioral loops form around retail and business services.
  • The pandemic affected realized movement more strongly than digital exploration, increasing the separation between the two.
  • Urban mobility modeling requires joint treatment of informational exposure and spatial encounters rather than isolated study.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same network approach could be applied to forecast how platform changes might shift physical traffic patterns in specific neighborhoods.
  • Testing the framework across multiple cities would show whether the concentration difference between online and offline activity is general or Budapest-specific.
  • City services could monitor these loops to anticipate how online trends translate into demand for physical locations.
  • Alternative data sources beyond one platform would test whether the observed patterns depend on the particular digital environment.

Load-bearing premise

Google Search and Location History data collected via donation in Budapest accurately and representatively capture individuals' cognitive activity and physical behavior without major selection biases or platform distortions.

What would settle it

Repeating the analysis on a comparable dataset that shows no measurable difference in Concentration Entropy between online and offline activities, or that finds no persistent retail linkages after basic controls, would undermine the claimed distinction in feedback loops.

Figures

Figures reproduced from arXiv: 2606.27513 by Julia Koltai, Rafiazka Hilman.

Figure 1. Distribution of online search clusters, derived from the corresponding 25 Google Trends categories. This pipeline was chosen for its scalable and language-agnostic classification capabilities when handling unstructured digital trace data. The clustering results are presented in …
Figure 2. Distribution of offline visit clusters, derived from the corresponding 10 Foursquare categories.
Figure 3. Radius of gyration is computed on the geolocations of visited places (spatial) and on query embeddings (cognitive) (Fig. 3a), and normalized by their mean values (Fig. 3b). Concentration in the online search space is higher than in the offline physical space. Exploration profile (Fig. 3c) with four archetypes based on cognitive and spatial exploration where high alignment in both domains (HH-red) accounts …
Figure 4. The individual feedback network in 2021 is presented as directed connections at the activity level (Fig. 4a) and the cluster level (Fig. 4b), magnifying the interaction between online and offline activities. An individual feedback network from 2021 is presented in …
Figure 5. Panel (b) illustrates the Collective Cognitive-Spatial Feedback Network aggregated across all individuals for the year 2018. In this network, spatial events (location visits) and cognitive events (online searches) that happened within a one-hour timeframe are linked. Each tie goes from the event with the earlier timestamp to the event with the later timestamp. In order to observe robust results, instead of…
Figure 6. Collective Feedback Network and cognitive-spatial linkage structure in 2022. The linkage matrix (Fig. 6a) illustrates forward linkage probabilities between online search categories ("S") and offline visitation categories ("V") and vice versa, where brighter cells correspond to stronger linkage likelihoods from origin cluster $c_i$ to destination cluster $c_j$ (displayed for $w_{c_i,c_j} \geq 0.1$). Compared with earlier…
Figure 7. The average forward linkage matrix (Fig. 7a) summarizes heterogeneous linkage probabilities between online search categories ("S") and offline visitation categories ("V") over the entire observation period (2018-2022). Each cell value represents the mean probability of transitioning from origin cluster $c_i$ to destination cluster $c_j$, with annotations displayed for stronger linkages ($w_{c_i,c_j} \geq 0.1$). Hierarchical…
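The Figure 5 caption describes how ties are formed: a search event and a visit event are linked when they fall within one hour of each other, and the tie runs from the earlier event to the later one. A minimal sketch of that rule follows. It rests on several assumptions that the captions do not confirm: that only cross-domain pairs are linked, that every qualifying pair within the window counts (not just the nearest one), and that the "forward linkage probability" is the row-normalized tie count. The function name `forward_linkage` and the event tuples are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def forward_linkage(events, window=timedelta(hours=1)):
    """Row-normalized cluster-to-cluster tie weights.

    events: iterable of (timestamp, domain, cluster), where domain is
    "S" (search) or "V" (visit).
    ASSUMPTION: only cross-domain pairs are linked, and weights are tie
    counts divided by the origin's total outgoing ties.
    """
    ordered = sorted(events, key=lambda e: e[0])
    counts = defaultdict(lambda: defaultdict(int))
    for i, (t_i, dom_i, cl_i) in enumerate(ordered):
        for t_j, dom_j, cl_j in ordered[i + 1:]:
            if t_j - t_i > window:
                break  # later events are even further away in time
            if dom_i != dom_j:
                # Tie runs from the earlier event to the later event.
                counts[(dom_i, cl_i)][(dom_j, cl_j)] += 1
    weights = {}
    for origin, dests in counts.items():
        total = sum(dests.values())
        weights[origin] = {dest: n / total for dest, n in dests.items()}
    return weights


# Hypothetical events for one individual on one morning.
events = [
    (datetime(2018, 5, 1, 9, 0), "S", "retail"),
    (datetime(2018, 5, 1, 9, 30), "V", "retail"),
    (datetime(2018, 5, 1, 9, 50), "S", "food"),
    (datetime(2018, 5, 1, 12, 0), "V", "food"),
]
weights = forward_linkage(events)
for origin, dests in weights.items():
    print(origin, dests)
```

Events with identical timestamps have no defined direction under the caption's rule; this sketch simply keeps their input order.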
Original abstract

Digital platforms increasingly shape how people experience and navigate cities, linking virtual information seeking with physical mobility. Despite this interdependence, online and offline activities are often studied separately in urban mobility research. This paper introduces the Feedback Network, a computational framework that captures interactions between cognitive activity in digital environments and behavior in physical space. Using Google Search and Location History data from the same individuals, collected through a data donation framework in Budapest, Hungary, between 2018 and 2022, we examine how online search patterns and offline visitation behavior co-evolve. We combine semantic and spatial analytical approaches. Radius of gyration is adapted to measure variation in geographic mobility and semantic exploration, enabling comparison between physical movement and online cognitive dispersion. A Feedback Network models transitions between search-related and location-related activity clusters and is evaluated using Concentration Entropy, which measures whether behavioral flows are concentrated around routine pathways or distributed across exploratory transitions. The results show that online exploration is more concentrated than offline mobility, suggesting narrower and more repetitive semantic interests, while physical movement remains relatively diverse. Persistent linkages between search and visitation activities related to retail and business services indicate stable cognitive-spatial behavioral loops. The COVID-19 pandemic disrupted spatial routines more strongly than cognitive exploration, widening the gap between digital engagement and realized movement. The findings demonstrate that urban mobility depends on the interaction between informational exposure and spatial encounter and provide a foundation for Hybrid Digital Twins of Society.
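The abstract's adaptation of radius of gyration can be illustrated with the standard definition, the root-mean-square distance of a set of points from their centroid, applied once to visit coordinates and once to query embeddings. This is a sketch under stated assumptions: it uses plain Euclidean distance in both spaces (the paper may use geodesic distance for locations or a different metric for embeddings), and the helper names `radius_of_gyration` and `normalize_by_mean` as well as all sample values are hypothetical.

```python
import math


def radius_of_gyration(points):
    """Root-mean-square distance of points from their centroid.

    ASSUMPTION: Euclidean distance; for geographic data this presumes
    coordinates already projected to a planar system (e.g. kilometres).
    """
    n = len(points)
    if n == 0:
        raise ValueError("at least one point is required")
    dim = len(points[0])
    center = [sum(p[k] for p in points) / n for k in range(dim)]
    mean_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(dim)) for p in points
    ) / n
    return math.sqrt(mean_sq)


def normalize_by_mean(values):
    """Divide each value by the mean, so domains become comparable."""
    mean = sum(values) / len(values)
    return [v / mean for v in values]


# Hypothetical projected visit coordinates in kilometres.
visits_km = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0), (3.0, 4.0)]
# Hypothetical 3-dimensional query embeddings.
query_embeddings = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

spatial_rg = radius_of_gyration(visits_km)
cognitive_rg = radius_of_gyration(query_embeddings)
print(spatial_rg, cognitive_rg)
print(normalize_by_mean([2.5, 7.5]))
```

Normalizing each domain by its own mean, as the Figure 3 caption describes, is what allows kilometres and embedding-space distances to be placed on a common scale.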

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper introduces the Feedback Network, a framework modeling transitions between semantic clusters from Google Search and spatial clusters from Location History for the same Budapest individuals (2018-2022 data donation). It adapts radius of gyration to compare semantic and geographic dispersion, evaluates the network via Concentration Entropy (measuring concentration of behavioral flows), and reports that online exploration is more concentrated/repetitive than offline mobility, with persistent retail/business linkages, stronger COVID disruption to spatial than cognitive patterns, and overall evidence that urban mobility depends on informational-spatial interactions, providing a basis for Hybrid Digital Twins of Society.

Significance. If the empirical patterns hold after addressing data and methodological issues, the work offers a potentially useful computational bridge between digital cognitive traces and physical mobility, with implications for urban analytics and digital twin modeling. The data-donation approach and entropy-based evaluation are novel elements, but significance is limited by the absence of robustness checks or external validation.

major comments (3)
  1. [Data collection / Methods] Data and methods sections: The central claim of general cognitive-spatial linkages rests on the Budapest Google data-donation cohort being representative, yet no post-stratification, inverse-probability weighting, or comparison to census/mobility surveys is described; this selection bias (tech-savvy, privacy-consenting users) directly undermines extrapolation to 'Hybrid Digital Twins of Society'.
  2. [Feedback Network / Concentration Entropy] Feedback Network and Concentration Entropy definition (likely §3-4): The entropy metric for evaluating transitions risks circularity if cluster definitions or thresholds are derived from the same data partitions used to build the network, with no independent benchmarking or sensitivity analysis reported; this affects the reported concentration gap and loop findings.
  3. [Results] Results on COVID differential impact and retail/business loops: These key empirical claims lack reported error bars, robustness to alternative clusterings, or controls for platform-specific distortions in search/location data, making it unclear whether the patterns are load-bearing or artifactual.
minor comments (1)
  1. [Abstract] Abstract and introduction: The term 'Concentration Entropy' is introduced without a concise formula or reference to its computation details, reducing immediate clarity for readers.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the thoughtful and constructive comments, which help clarify the scope and limitations of our work. We address each major comment below and indicate the revisions planned for the manuscript.

  1. Referee: [Data collection / Methods] Data and methods sections: The central claim of general cognitive-spatial linkages rests on the Budapest Google data-donation cohort being representative, yet no post-stratification, inverse-probability weighting, or comparison to census/mobility surveys is described; this selection bias (tech-savvy, privacy-consenting users) directly undermines extrapolation to 'Hybrid Digital Twins of Society'.

    Authors: We agree that the data-donation sample is subject to self-selection and is not statistically representative of the Budapest or Hungarian population. The manuscript presents the Feedback Network as a methodological framework demonstrated on this linked individual-level dataset rather than a population-representative study. In revision we will add an explicit Limitations subsection that discusses selection biases, the exploratory character of the findings, and the challenges of generalizing from data-donation cohorts. Where aggregate public mobility statistics permit, we will include brief comparisons to contextualize the sample; however, the linked semantic-spatial structure is unique to this donation and cannot be re-weighted to census margins without additional individual-level covariates that are unavailable. revision: partial

  2. Referee: [Feedback Network / Concentration Entropy] Feedback Network and Concentration Entropy definition (likely §3-4): The entropy metric for evaluating transitions risks circularity if cluster definitions or thresholds are derived from the same data partitions used to build the network, with no independent benchmarking or sensitivity analysis reported; this affects the reported concentration gap and loop findings.

    Authors: Semantic clusters are obtained via topic modeling on search queries and spatial clusters via density-based or k-means partitioning of location coordinates; these steps are performed independently before the transition network is constructed. Concentration Entropy is then computed on the resulting directed graph of cluster transitions. To address the concern, the revised manuscript will include (i) a null-model benchmark that randomizes transitions while preserving cluster sizes and (ii) sensitivity analyses that vary the number of clusters and the clustering hyperparameters, reporting the stability of the online-offline concentration gap and the retail/business loop statistics across these choices. revision: yes

  3. Referee: [Results] Results on COVID differential impact and retail/business loops: These key empirical claims lack reported error bars, robustness to alternative clusterings, or controls for platform-specific distortions in search/location data, making it unclear whether the patterns are load-bearing or artifactual.

    Authors: We accept that the current results section would benefit from additional statistical support. In the revision we will (a) attach bootstrap or permutation-based error bars to the Concentration Entropy differences and to the pre-/post-COVID comparisons, (b) repeat the main analyses under at least two alternative clustering schemes (different topic-model initializations and spatial clustering algorithms), and (c) add a short discussion of known platform-specific features of Google Search and Location History data together with the safeguards already present in the linked-donation design. These additions will clarify which patterns remain stable under reasonable methodological variation. revision: yes
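The null-model benchmark promised in the second response (randomizing transitions while preserving cluster sizes) could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' procedure: destinations are shuffled across transitions so that each cluster keeps its number of appearances as origin and as destination, and an unnormalized Shannon entropy of the transition flows stands in for Concentration Entropy, whose formula is not given in this text. The names `flow_entropy` and `null_entropies` and the sample data are hypothetical.

```python
import math
import random
from collections import Counter


def flow_entropy(pairs):
    """Shannon entropy of (origin, destination) flow shares.

    ASSUMPTION: a stand-in for Concentration Entropy, not its definition.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def null_entropies(pairs, n_draws=200, seed=0):
    """Entropy under random re-pairing of origins and destinations.

    Shuffling destinations keeps each cluster's count as origin and as
    destination fixed while breaking any origin-destination coupling.
    """
    rng = random.Random(seed)
    origins = [o for o, _ in pairs]
    dests = [d for _, d in pairs]
    draws = []
    for _ in range(n_draws):
        shuffled = dests[:]
        rng.shuffle(shuffled)
        draws.append(flow_entropy(list(zip(origins, shuffled))))
    return draws


# Hypothetical tightly coupled transitions between search and visit clusters.
observed_pairs = [("S:retail", "V:retail")] * 6 + [("S:news", "V:food")] * 6
observed = flow_entropy(observed_pairs)
draws = null_entropies(observed_pairs)

# Share of null draws at least as concentrated as the observed network.
share_as_low = sum(d <= observed for d in draws) / len(draws)
print(observed, min(draws), share_as_low)
```

A small `share_as_low` would suggest the observed coupling is tighter than chance re-pairing produces; the same resampling loop could be adapted to attach permutation-based intervals to pre- and post-COVID differences.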

standing simulated objections not resolved
  • External validation against independent, non-Google linked semantic-spatial datasets is not currently feasible; no comparable public resource exists that records both search queries and precise location histories for the same individuals over multiple years.

Circularity Check

0 steps flagged

No significant circularity detected


The paper introduces an empirical computational framework (Feedback Network + Concentration Entropy) applied to donated Google Search and Location History data. Clustering, transition modeling, and entropy calculation are standard data-analytic steps performed on observed activity sequences; the reported findings (online concentration vs. offline diversity, retail/business loops, COVID disruption) are presented as outcomes of these computations rather than quantities defined by construction from the same partitions. No equations, self-citations, or uniqueness claims appear in the provided text that would reduce the central results to tautological inputs. The derivation chain remains self-contained against external data.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

Based solely on the abstract, the central claim rests on the validity of the data donation approach, the meaningful adaptation of radius of gyration to search semantics, and the definition of the Feedback Network; no independent evidence or external benchmarks are referenced.

free parameters (1)
  • Cluster definitions or entropy thresholds in Feedback Network
    Likely required to compute Concentration Entropy and transitions but unspecified in abstract.
axioms (1)
  • [domain assumption] Radius of gyration can be validly adapted from physical mobility to measure semantic exploration in search data
    Invoked when combining semantic and spatial analytical approaches.
invented entities (1)
  • Feedback Network (no independent evidence)
    purpose: Models transitions between search-related and location-related activity clusters
    Newly introduced computational framework to capture online-offline interactions.


