
arxiv: 1906.08874 · v1 · pith:DWFSZU4Wnew · submitted 2019-06-20 · 💻 cs.AI · cs.DB

Customer Segmentation of Wireless Trajectory Data

Pith reviewed 2026-05-25 19:24 UTC · model grok-4.3

classification 💻 cs.AI cs.DB
keywords semantic trajectory clustering · wireless trajectory data · customer segmentation · beacon data · London Underground · commute patterns · truncated trajectories

The pith

Wireless trajectory data from beacons can be clustered semantically by location types without using geographical coordinates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method for clustering trajectories recorded only as sequences of (time, point) entries from wireless access points and BLE beacons. These points carry location identifiers such as place names but no latitude or longitude values. Similarity is defined through non-geographic features such as the category of place visited rather than physical route or distance. The method is tested on truncated commute data from the London Underground, where each trajectory lacks a true origin or destination. In this setting the data shows a variety of travel patterns but no sharply separated clusters, which informs how context-specific journey recommendations might still be generated.
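The data layout described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the `PLACE_CATEGORY` lookup, the location identifiers, the category names, and the choice to collapse consecutive repeats are hypothetical and are not taken from the paper.

```python
from datetime import datetime

# Hypothetical lookup from location identifier to a semantic category.
# Identifiers and categories are illustrative, not taken from the paper.
PLACE_CATEGORY = {
    "station_a_platform": "platform",
    "station_a_ticket_hall": "ticket_hall",
    "station_b_platform": "platform",
    "station_b_exit": "exit",
}


def to_semantic_sequence(trajectory, lookup=PLACE_CATEGORY):
    """Turn (time, point) entries into a time-ordered list of categories.

    Consecutive repeats are collapsed, so several pings from the same kind
    of place count as one visit. Unknown points map to "unknown".
    """
    ordered = sorted(trajectory, key=lambda entry: entry[0])
    categories = []
    for _, point in ordered:
        category = lookup.get(point, "unknown")
        if not categories or categories[-1] != category:
            categories.append(category)
    return categories


# One device's trajectory; entries may arrive out of time order.
trajectory = [
    (datetime(2018, 2, 4, 8, 5), "station_a_platform"),
    (datetime(2018, 2, 4, 8, 1), "station_a_ticket_hall"),
    (datetime(2018, 2, 4, 8, 6), "station_a_platform"),
    (datetime(2018, 2, 4, 8, 31), "station_b_platform"),
    (datetime(2018, 2, 4, 8, 34), "station_b_exit"),
]
semantic = to_semantic_sequence(trajectory)
```

Note that the two platforms at different stations collapse into a single `platform` visit: the sequence records what kind of place was visited, not where it is.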

Core claim

The authors present a new approach to semantic trajectory clustering for wireless data consisting of (time, point) entries tied to location identifiers. The approach operates without geographical coordinates and is applied to truncated trajectories from the London Underground rail network. Analysis of the resulting clusters indicates a range of travel patterns without the existence of distinct groups, leading to suggestions for on-line recommendation systems and notes on route and destination prediction.

What carries the argument

Semantic similarity measure defined from non-geographic characteristics such as the type of location visited.

If this is right

  • Context-specific on-line recommendations for onward journeys become feasible using only beacon-derived trajectory data.
  • The method extends semantic trajectory clustering literature to cases lacking latitude and longitude.
  • Prediction of journey routes and destinations can be approached even with truncated trajectories.
  • A range of travel patterns can be identified without requiring distinct clusters to exist.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same beacon-based clustering could be tested on indoor positioning systems in airports or shopping centers where GPS is unavailable.
  • Privacy advantages arise because no precise geographic coordinates need to be collected or stored.
  • If clusters remain indistinct, hybrid methods that combine semantic features with time-of-day patterns might still support useful recommendations.

Load-bearing premise

Semantic similarity based on non-geographic characteristics such as type of location visited can be defined and used to cluster trajectories meaningfully even when the data consists only of truncated (time, point) entries without true origins or destinations.

What would settle it

A concrete test would be to apply the clustering procedure to the London Underground beacon data and check whether the resulting groups correspond to recognizable commute behaviors or appear as one undifferentiated distribution.

Figures

Figures reproduced from arXiv: 1906.08874 by Matthew R Karlsen, Sotiris K. Moschoyiannis.

Figure 1. Scatter-plots for the four numerical variables used in the clustering. The scatter-plots are …
Figure 2. Hybrid output from the Principal Component Analysis and DBSCAN-based clustering. Points deemed to be noise are coloured grey, whilst each cluster identified is assigned its own colour. Present results indicate that only a single cluster is present within the data. … the clustering process. It should be noted that though 10,000 customers were used, 74,165 trajectories were first discarded due to the filters…
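The caption of Figure 2 refers to DBSCAN-based clustering in which some points are labelled as noise. The following is a minimal pure-Python sketch of how DBSCAN assigns cluster and noise labels, not the authors' implementation. The function name `dbscan`, the parameters `eps` and `min_pts`, and the toy points are assumptions, and the Principal Component Analysis projection mentioned in the caption is omitted.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index (0, 1, ...) or NOISE."""

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = neighbours(j)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
    return labels


# Toy data: four nearby points and one far-away point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

On this toy input the four nearby points share one cluster label and the distant point is marked as noise, which is the same shape of outcome the caption describes (a single cluster plus noise points).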
Abstract

Wireless trajectory data consists of a number of (time, point) entries where each point is associated with a particular wireless device (WAP or BLE beacon) tied to a location identifier, such as a place name. A trajectory relates to a particular mobile device. Such data can be clustered 'semantically' to identify similar trajectories, where similarity relates to non-geographic characteristics such as the type of location visited. Here we present a new approach to semantic trajectory clustering for such data. The approach is applicable to interpreting data that does not contain geographical coordinates, and thus contributes to the current literature on semantic trajectory clustering. The literature does not appear to provide such an approach, instead focusing on trajectory data where latitude and longitude data is available. We apply the techniques developed above in the context of the Onward Journey Planner Application, with the motivation of providing on-line recommendations for onward journey options in a context-specific manner. The trajectories analysed indicate commute patterns on the London Underground. Points are only recorded for communication with WAP and BLE beacons within the rail network. This context presents additional challenge since the trajectories are 'truncated', with no true origin and destination details. In the above context we find that there are a range of travel patterns in the data, without the existence of distinct clusters. Suggestions are made concerning how to approach the problem of provision of on-line recommendations with such a data set. Thoughts concerning the related problem of prediction of journey route and destination are also provided.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript claims to introduce a new approach to semantic trajectory clustering applicable to wireless (time, point) data from WAP/BLE beacons that lacks geographical coordinates. It applies the method to truncated London Underground commute trajectories collected via the Onward Journey Planner Application, reports a range of travel patterns but no distinct clusters, and offers suggestions for context-specific recommendations and journey prediction.

Significance. If a reproducible semantic clustering procedure were supplied and shown to be novel relative to existing geo-coordinate-focused work, the contribution would address a documented gap in the semantic trajectory literature and could inform practical recommendation systems for incomplete trajectory data.

major comments (2)
  1. [Abstract] Abstract: The central claim that a 'new approach' to semantic trajectory clustering is presented is unsupported because no mapping from place-name identifiers to semantic categories, no similarity metric on the resulting sequences, and no clustering algorithm are defined anywhere in the manuscript. This definition is load-bearing for both the novelty assertion and the downstream claim that 'no distinct clusters' were found.
  2. [Application to Onward Journey Planner] Application section (Onward Journey Planner context): The observation that trajectories 'indicate commute patterns' yet exhibit 'no distinct clusters' cannot be assessed or reproduced, as the manuscript supplies neither the distance function nor the clustering procedure applied to the truncated beacon sequences.
minor comments (1)
  1. [Abstract] The phrase 'techniques developed above' appears in the abstract without any preceding methodological section or reference, leaving the reader without context for the claimed approach.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The major comments correctly identify that the manuscript does not define the core components of the claimed semantic trajectory clustering approach. We will revise the manuscript to supply these definitions and procedures.

  1. Referee: [Abstract] Abstract: The central claim that a 'new approach' to semantic trajectory clustering is presented is unsupported because no mapping from place-name identifiers to semantic categories, no similarity metric on the resulting sequences, and no clustering algorithm are defined anywhere in the manuscript. This definition is load-bearing for both the novelty assertion and the downstream claim that 'no distinct clusters' were found.

    Authors: We agree that the manuscript does not define a mapping from place-name identifiers to semantic categories, a similarity metric on sequences, or a clustering algorithm. This is a substantive omission that prevents evaluation of novelty and reproducibility of the 'no distinct clusters' result. In the revised version we will insert a dedicated methods section that specifies: (1) the semantic mapping rules applied to London Underground station names, (2) the sequence similarity function (a semantic edit-distance variant), and (3) the clustering algorithm and its parameters. These additions will also allow direct comparison with existing geo-coordinate semantic trajectory methods. revision: yes

  2. Referee: [Application to Onward Journey Planner] Application section (Onward Journey Planner context): The observation that trajectories 'indicate commute patterns' yet exhibit 'no distinct clusters' cannot be assessed or reproduced, as the manuscript supplies neither the distance function nor the clustering procedure applied to the truncated beacon sequences.

    Authors: We accept that the distance function and clustering procedure are not supplied, rendering the commute-pattern and 'no distinct clusters' observations unreproducible. The text refers to 'techniques developed above' without providing their concrete instantiation on the truncated beacon sequences. The revision will add an explicit description of the distance function (adapted for truncation) and the clustering procedure (including any parameter settings) used on the Onward Journey Planner data. revision: yes
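The first response above mentions a semantic edit-distance variant as the sequence similarity function, without defining it. As a rough illustration of that family of measures, the sketch below computes a plain Levenshtein distance over sequences of category labels and a length-normalised version. The function names, the unit costs, and the normalisation are assumptions, and the authors' variant, including any adaptation for truncation, may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of category labels."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (item_a != item_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def normalised_distance(a, b):
    """Edit distance divided by the longer length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


# Hypothetical category sequences for two trajectories.
direct = ["ticket_hall", "platform", "exit"]
with_change = ["ticket_hall", "platform", "interchange", "platform", "exit"]
distance = edit_distance(direct, with_change)
```

Here the two sequences differ by two inserted visits, so the distance is 2 and the normalised distance is 0.4. A pairwise matrix of such distances is the kind of input a clustering procedure could consume.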

Circularity Check

0 steps flagged

No derivation chain, equations, or load-bearing predictions present; claims are high-level empirical observations.


The manuscript describes a new semantic trajectory clustering approach for wireless data lacking geographic coordinates and applies it to truncated London Underground commute patterns, reporting a range of travel patterns without distinct clusters. No equations, algorithms, similarity metrics, or derivation steps are supplied in the provided text. No self-citations, uniqueness theorems, fitted parameters renamed as predictions, or ansatzes are invoked. The central claim therefore reduces to an empirical statement rather than any chain that could be circular by construction. The work is self-contained as an application report.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on abstract; the central claim rests on the unstated assumption that semantic features can substitute for geographic ones in clustering, with no free parameters or invented entities explicitly listed.

axioms (1)
  • domain assumption: Semantic similarity defined via location type can be used to cluster trajectories without geographic coordinates
    Invoked to justify the new clustering approach for non-geo data.

pith-pipeline@v0.9.0 · 5791 in / 1139 out tokens · 24419 ms · 2026-05-25T19:24:25.155342+00:00 · methodology

