Proximity Measure of Information Object Features for Solving the Problem of Their Identification in Information Systems

Volodymyr Yuzefovych

arxiv: 2604.04939 · v1 · submitted 2026-02-17 · 💻 cs.AI

Pith reviewed 2026-05-15 21:19 UTC · model grok-4.3

classification 💻 cs.AI

keywords proximity measure · information objects · feature identification · quantitative features · qualitative features · probabilistic measure · possibility measure · information systems

The pith

A proximity measure for information object features combines probabilistic and possibility assessments to identify matches from multiple sources without data transformation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a new measure to determine how close features of information objects are when data arrives from independent sources. It uses a probabilistic approach for quantitative values and a possibility measure for qualitative ones, explicitly handling determination errors. This setup aims to decide whether features belong to the same physical object. The author verifies the measure meets standard axioms and offers variants for multiple features. Such a tool could streamline identification tasks in information systems by avoiding the need to convert all data into comparable forms first.

Core claim

The author establishes a quantitative-qualitative proximity measure for the features of information objects. This measure accounts for possible differences in individual feature values due to determination errors. Quantitative features are assessed via a probabilistic measure while qualitative features use a possibility measure. The resulting measure satisfies the axioms required of any proximity measure and supports variants for groups of features, all without requiring transformation of the original feature values.
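To make the two kinds of per-feature score concrete, here is a minimal Python sketch. It is not the author's formula, which the abstract does not give. For the quantitative case it assumes independent zero-mean Gaussian determination errors with known standard deviations and scores a pair by the two-sided tail probability of the observed difference. For the qualitative case it assumes each source reports a normalized possibility distribution over labels and uses the sup-min consistency degree from possibility theory. The function names and example values are hypothetical.

```python
import math


def quantitative_proximity(x1, x2, sigma1, sigma2):
    """Probability of a difference at least this large if both values
    describe the same object (ASSUMPTION: independent Gaussian errors)."""
    sigma = math.hypot(sigma1, sigma2)  # std of the difference x1 - x2
    if sigma == 0.0:
        # No measurement error: values either coincide or they do not.
        return 1.0 if x1 == x2 else 0.0
    return math.erfc(abs(x1 - x2) / (math.sqrt(2.0) * sigma))


def qualitative_proximity(pi1, pi2):
    """Sup-min consistency of two possibility distributions given as
    {label: possibility} dicts (ASSUMPTION: each has maximum 1.0)."""
    labels = set(pi1) | set(pi2)
    return max(
        (min(pi1.get(c, 0.0), pi2.get(c, 0.0)) for c in labels),
        default=0.0,
    )


if __name__ == "__main__":
    # Hypothetical range readings (metres) from two sources.
    print(quantitative_proximity(1000.0, 1030.0, 20.0, 15.0))
    # Hypothetical colour reports with some confusion between labels.
    print(qualitative_proximity({"red": 1.0, "orange": 0.4},
                                {"orange": 1.0, "red": 0.3}))
```

Both scores lie in [0, 1], are symmetric in their arguments, and equal 1 for identical inputs, and neither requires rescaling the raw feature values. That is the property the core claim emphasizes, although the paper's own definitions may differ.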

What carries the argument

The quantitative-qualitative proximity measure, formed by combining probabilistic measures for quantitative features and possibility measures for qualitative features to evaluate similarity directly.

Load-bearing premise

That probabilistic and possibility measures can be meaningfully combined into one unified proximity measure that satisfies all axioms and permits reliable identification without transforming feature values.

What would settle it

Construct a test set of information objects with known matches and mismatches, compute the proximity scores using the proposed measure, and check whether the scores reliably separate matches from non-matches better than chance or if any axiom is violated in the calculations.
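One way to run such a check is sketched below in Python. The statistic is the area under the ROC curve (AUC), computed as the fraction of (match, non-match) pairs in which the match receives the higher score; 0.5 corresponds to chance and 1.0 to perfect separation. The function name `auc` and the score lists are hypothetical placeholders, not results from the paper.

```python
def auc(match_scores, nonmatch_scores):
    """Probability that a random true match outscores a random non-match.

    Ties count as half a win, which is the usual Mann-Whitney convention.
    """
    if not match_scores or not nonmatch_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in match_scores:
        for n in nonmatch_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(match_scores) * len(nonmatch_scores))


if __name__ == "__main__":
    # ILLUSTRATIVE placeholder scores only; real values would come from
    # applying the proposed measure to pairs with known ground truth.
    scores_for_known_matches = [0.91, 0.78, 0.66, 0.85]
    scores_for_known_nonmatches = [0.12, 0.40, 0.70, 0.05]
    print(f"AUC = {auc(scores_for_known_matches, scores_for_known_nonmatches):.3f}")
```

An AUC near 0.5 on real labelled pairs would indicate that the measure does not separate matches from non-matches, which is the failure condition described above.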

Original abstract

The paper considers a new quantitative-qualitative proximity measure for the features of information objects, where data enters a common information resource from several sources independently. The goal is to determine the possibility of their relation to the same physical object (observation object). The proposed measure accounts for the possibility of differences in individual feature values - both quantitative and qualitative - caused by existing determination errors. To analyze the proximity of quantitative feature values, the author employs a probabilistic measure; for qualitative features, a measure of possibility is used. The paper demonstrates the feasibility of the proposed measure by checking its compliance with the axioms required of any measure. Unlike many known measures, the proposed approach does not require feature value transformation to ensure comparability. The work also proposes several variants of measures to determine the proximity of information objects (IO) based on a group of diverse features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a mixed proximity measure using probability for numbers and possibility for categories but omits the explicit combination formula needed to verify its axioms on heterogeneous features.

The one thing to know is that this paper proposes combining a probabilistic measure for quantitative features and a possibility measure for qualitative features into a single proximity score for identifying information objects, all without transforming the feature values. It claims this satisfies the required axioms but doesn't show how the two outputs are actually merged when a feature vector mixes both types.

The new part is the direct handling of mixed data types from multiple sources while accounting for determination errors. The approach avoids the common step of normalizing everything to one type, which could simplify some identification pipelines in information systems. Checking axiom compliance is a standard way to validate a measure, and suggesting variants for groups of features shows some thought about practical use.

The soft spot is the lack of an explicit aggregation rule. When you have a vector with both quantitative and qualitative features, you get one number from probability and one from possibility; the paper never says how those are merged into one scalar that still meets reflexivity, symmetry, and the rest for the mixed case. Without that formula, any derivations, or even a small example with numbers, you can't confirm the claim holds. The abstract mentions demonstrating feasibility through axiom checks, but since no details are provided, the central argument stays untested.

This kind of work would interest people working on entity resolution or data fusion in information systems, especially if they deal with real-world data that mixes numbers and categories. A reader who wants a plug-and-play method or reproducible results won't find it here, though. I wouldn't cite this or bring it to a reading group. It doesn't look ready for peer review; the author should add the missing combination step and some validation before it deserves referee time.

Referee Report

1 major / 2 minor

Summary. The paper proposes a new quantitative-qualitative proximity measure for features of information objects drawn from multiple independent sources. It applies a probabilistic measure to quantitative feature values and a possibility measure to qualitative feature values to account for determination errors, combines the two without any value transformation, verifies compliance with the axioms required of any measure, and outlines several variants for determining proximity of entire information objects based on groups of mixed features.

Significance. If an explicit aggregation rule can be supplied and shown to preserve the axioms for heterogeneous feature vectors, the measure would offer a practical advantage by avoiding the normalization or transformation steps required by many existing proximity functions. The explicit axiom check is a positive element, but the absence of derivations, concrete examples, or data leaves the central claim unverified.

major comments (1)

[Definition of the proposed proximity measure] No explicit formula is supplied for combining the numeric output of the probabilistic measure (applied to quantitative features) with the numeric output of the possibility measure (applied to qualitative features) into a single scalar proximity value. Because axiom compliance is asserted only after this (unspecified) combination, it is impossible to confirm that reflexivity, symmetry, and the remaining axioms hold when a feature vector contains both types—the load-bearing case for the identification task.
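The kind of verification the comment asks for can at least be approximated numerically once a concrete formula exists. The Python sketch below counts violations of range, reflexivity, and symmetry for any candidate proximity function on a finite sample. It also tests the triangle inequality on the derived dissimilarity 1 - s, which is an assumed convention here rather than something the source specifies. The name `check_axioms` and the sample values are hypothetical, and a clean result on a sample is evidence, not a proof.

```python
import itertools
import math


def check_axioms(prox, samples, tol=1e-9):
    """Count axiom violations of prox(a, b) over all sample combinations."""
    violations = {"range": 0, "reflexivity": 0, "symmetry": 0, "triangle": 0}
    for a in samples:
        # Reflexivity: an object is maximally close to itself.
        if abs(prox(a, a) - 1.0) > tol:
            violations["reflexivity"] += 1
    for a, b in itertools.product(samples, repeat=2):
        s = prox(a, b)
        if s < -tol or s > 1.0 + tol:
            violations["range"] += 1
        if abs(s - prox(b, a)) > tol:
            violations["symmetry"] += 1
    for a, b, c in itertools.product(samples, repeat=3):
        # ASSUMPTION: triangle inequality is tested on d = 1 - s.
        if (1.0 - prox(a, c)) > (1.0 - prox(a, b)) + (1.0 - prox(b, c)) + tol:
            violations["triangle"] += 1
    return violations


if __name__ == "__main__":
    # Stand-in proximity for demonstration; not the paper's measure.
    def stand_in(a, b):
        return math.exp(-abs(a - b))

    print(check_axioms(stand_in, [0.0, 0.5, 2.0, 7.0]))
```

For mixed feature vectors, `samples` would hold tuples of quantitative and qualitative values and `prox` would be the combined measure, which is exactly the piece the manuscript leaves unspecified.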

minor comments (2)

[Abstract] The abstract states that the approach 'does not require feature value transformation' but supplies no concrete comparison with prior measures that do require such steps.
[Variants section] The several variants of measures for proximity of information objects are mentioned but never defined or compared.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The point raised about the missing explicit combination formula is valid and will be addressed in revision.

Referee: [Definition of the proposed proximity measure] No explicit formula is supplied for combining the numeric output of the probabilistic measure (applied to quantitative features) with the numeric output of the possibility measure (applied to qualitative features) into a single scalar proximity value. Because axiom compliance is asserted only after this (unspecified) combination, it is impossible to confirm that reflexivity, symmetry, and the remaining axioms hold when a feature vector contains both types—the load-bearing case for the identification task.

Authors: We agree with the referee that the manuscript does not supply an explicit aggregation formula for producing a single scalar proximity value from the probabilistic output on quantitative features and the possibility output on qualitative features. This omission prevents direct verification of axiom preservation for mixed feature vectors. In the revised manuscript we will add a dedicated subsection defining the combination rule (a normalized product of the two measures, scaled to [0,1] and weighted by feature-type reliability when available) together with a short proof that the resulting function satisfies reflexivity, symmetry, and the other measure axioms. A concrete numerical example with one quantitative and one qualitative feature will also be included to demonstrate both the computation and the axiom checks.

Revision: yes
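The simulated rebuttal names a reliability-weighted normalized product but gives no formula. One possible reading, offered only as an assumption, is a weighted geometric mean of the two scores. The Python sketch below shows that reading; the function name `combined_proximity` and the weight parameter `w_quant` are hypothetical and do not come from the paper.

```python
def combined_proximity(p_quant, q_qual, w_quant=0.5):
    """Weighted geometric mean of a probabilistic and a possibilistic score.

    ASSUMPTION: this is one interpretation of "normalized product weighted
    by feature-type reliability"; the source does not state the formula.
    w_quant is the weight on the quantitative score, 1 - w_quant the
    weight on the qualitative score.
    """
    for name, value in (("p_quant", p_quant),
                        ("q_qual", q_qual),
                        ("w_quant", w_quant)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return (p_quant ** w_quant) * (q_qual ** (1.0 - w_quant))


if __name__ == "__main__":
    # Hypothetical per-feature scores for one pair of information objects.
    print(combined_proximity(0.64, 0.9, w_quant=0.5))
```

Under this reading the result stays in [0, 1], equals 1 when both inputs are 1, and is symmetric whenever the two per-feature scores are symmetric. A feature with positive weight and a score of 0 forces the combined score to 0, which may or may not be the behavior the author intends.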

Circularity Check

0 steps flagged

No significant circularity in the proposed proximity measure

The paper constructs a new quantitative-qualitative proximity measure by applying a probabilistic measure to quantitative features and a possibility measure to qualitative features, then verifies compliance with required axioms directly on the resulting construction. No load-bearing steps reduce by definition or self-citation to fitted inputs, prior results, or tautological renaming; the combination is presented as an independent proposal that avoids feature transformation. This matches the default expectation for non-circular papers, with the central claim retaining independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the new measure satisfies standard axioms for any proximity measure and can be directly applied to mixed feature types. No specific free parameters, new entities, or additional axioms are detailed in the abstract.

axioms (1)

domain assumption: The measure must comply with the axioms required of any measure. Stated as the basis for demonstrating feasibility of the proposed measure.

pith-pipeline@v0.9.0 · 5438 in / 1133 out tokens · 36258 ms · 2026-05-15T21:19:31.856209+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "To analyze the proximity of quantitative feature values, the author employs a probabilistic measure; for qualitative features, a measure of possibility is used... the proposed approach does not require feature value transformation"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Verification against Axioms... non-negativity, symmetry, identity... triangle inequality"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

[1] I.I. Obod, I.V. Svyd, and O.S. Maltsev, Data Processing of Airspace Surveillance Radar Systems: Tutorial. Kharkiv: Madryd, 2021, 255 p.

[2] V.Ye. Muhin, V.V. Zavgorodniy, Ya.I. Kornaga, and L.V. Baranovska, "Specialising algorithm for the information space formation," in Information Technology and Security: Papers of XXI International Scientific and Practical Conference (ITS-2021), vol. 21. Kyiv: Engineering, 2021, pp. 123–128.

[3] I.D. Mandel, Cluster Analysis. Moscow: Finansy i Statistika, 1988, 176 p.

[4] V. Hryhorovych, "Analysis of metrics for intelligent information systems," Information Systems and Networks, vol. 9, pp. 96–111, 2021. DOI: https://doi.org/10.23939/sisn2021.09.096

[5] A.O. Fedorov, P.V. Notovskiy, and A.E.Yu. Peredriy, "Using the proximity measure in the problem of distributing the production program by planning periods based on the similarity coefficients of Dake, Jaccard, Maxfedor, Otiai, Rao, Tanimoto," Bulletin of the National Technical University "KhPI". Economical Science, no. 1 (3), pp. 32–35, 2020. DOI: https:... (work page: doi:10.20998/2519-4461.2020.1.32)

[6] I.P. Gamayun and O.M. Bezmenova, "Formation of similarity indicators between objects characterised by parameters measured in different measurement scales," Bulletin of the National Technical University "KhPI", no. 55 (1097), pp. 88–91, 2014. DOI: https://doi.org/10.20998/%x

[7] "Measurement scales." [Online]. Available: https://elib.lntu.edu.ua/sites/default/files/elib_upload/%D0%95%D0%9D%D0%9F_%D0%AF%D0%BA%D0%B8%D0%BC%D1%87%D1%83%D0%BA_%D0%A1%D0%B5%D0%BB%D0%B5%D0%BF%D0%B8%D0%BD%D0%B0/page8.html. [Accessed: 07-Feb-2026]

[8] D. Dubois and H. Prade, "Possibility theory, probability theory and multiple-valued logics: A clarification," Annals of Mathematics and Artificial Intelligence, vol. 32, pp. 35–66, 2001.

[9] L.B. Levenchuk, O.L. Tymoshchuk, V.H. Guskova, and P.I. Bidyuk, "Uncertainties in data processing, forecasting and decision-making," System Research & Information Technologies, no. 3, pp. 66–80, 2023. DOI: 10.20535/SRIT.2308-8893.2023.3.05
