pith. sign in

arxiv: 2504.03605 · v2 · submitted 2025-04-04 · 💻 cs.DM · cs.CC· cs.DS· cs.IT· math.CO· math.IT

Constant Rate Isometric Embeddings of Hamming Metric into Edit Metric

Pith reviewed 2026-05-22 21:14 UTC · model grok-4.3

classification 💻 cs.DM cs.CCcs.DScs.ITmath.COmath.IT
keywords isometric embeddingHamming metricedit metricsynchronization stringsmisalignersinsertion-deletion codesmetric embeddings
0
0 comments X

The pith

There exists an isometric embedding from the Hamming metric to the edit metric with constant rate 1/8.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a mapping from binary strings of length n to longer binary strings such that the Hamming distance between any two inputs exactly equals the edit distance between their images. This mapping works at a fixed rate of 1/8, so the output strings are only eight times as long as the inputs. The authors achieve this by linking the problem to synchronization strings and introducing a misaligners framework that controls how edits can occur. The result improves conditional lower bounds for optimization problems over edit distance, now with optimal dependence on the input dimension. They also prove that no binary embedding can exceed rate 15/32 and that rates can approach 1 when input and output alphabets are allowed to differ.

Core claim

We present an isometric embedding with rate 1/8 by discovering connections to synchronization strings, which were studied in the context of insertion-deletion codes. At a technical level, we introduce a framework for obtaining high-rate isometric embeddings using a novel object called misaligners. As an immediate consequence of our constant-rate isometric embedding, we improve known conditional lower bounds for various optimization problems in the edit metric, now with optimal dependence on the dimension. We complement our results by showing that no isometric embedding can have rate greater than 15/32 for all positive integers n.

What carries the argument

The misaligners framework, which builds on synchronization strings to produce mappings where edit alignments preserve exact Hamming-to-edit distance equality.

If this is right

  • Conditional lower bounds for edit-metric optimization problems now hold with optimal dependence on dimension.
  • Constant-rate isometric embeddings exist between the Hamming and edit metrics.
  • No binary isometric embedding can exceed rate 15/32.
  • Embeddings over differing alphabets can achieve rates arbitrarily close to 1.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The distance-preservation property could let algorithms for Hamming-space problems run directly on edit-distance instances with only constant blowup.
  • The structural properties of required alignments might extend to other string metrics such as longest common subsequence distance.
  • Small-scale computational verification of the embedding on n up to 10 would give direct evidence that the distance equality holds in practice.

Load-bearing premise

The misaligners built from synchronization strings produce an exact isometric embedding at rate 1/8 that works for every input length.

What would settle it

Take all pairs of length-4 binary strings, apply the embedding, and check whether any Hamming distance fails to match the edit distance of the images.

read the original abstract

A function $\varphi:\{0,1\}^n \to \{0,1\}^N$ is called an isometric embedding of the $n$-dimensional Hamming metric space to the $N$-dimensional edit metric space if, for all $x,y\in\{0,1\}^n$, the Hamming distance between $x$ and $y$ is equal to the edit distance between $\varphi(x)$ and $\varphi(y)$. The rate of such an embedding is defined as the ratio $n/N$. It is well known in the literature how to construct isometric embeddings with rate $\Omega(1/\log n)$. However, achieving even near-isometric embeddings with positive constant rate has remained elusive until now. In this paper, we present an isometric embedding with rate $1/8$ by discovering connections to synchronization strings, which were studied in the context of insertion-deletion codes (Haeupler-Shahrasbi [JACM'21]). At a technical level, we introduce a framework for obtaining high-rate isometric embeddings using a novel object called misaligners. As an immediate consequence of our constant-rate isometric embedding, we improve known conditional lower bounds for various optimization problems in the edit metric, now with optimal dependence on the dimension. We complement our results by showing that no isometric embedding $\varphi:\{0,1\}^n \to \{0,1\}^N$ can have rate greater than $15/32$ for all positive integers $n$. En route to proving this upper bound, we uncover fundamental structural properties necessary for every Hamming-to-edit isometric embedding. We also prove similar upper and lower bounds for embeddings over larger alphabets. Finally, we consider embeddings $\varphi:\Sigma_{\mathrm{in}}^n \to \Sigma_{\mathrm{out}}^N$ between different input and output alphabets, where the rate is given by $\frac{n\log|\Sigma_{\mathrm{in}}|}{N\log|\Sigma_{\mathrm{out}}|}$. In this setting, we show that the rate can be made arbitrarily close to $1$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims to construct the first constant-rate (exactly 1/8) isometric embedding from the n-dimensional Hamming metric to the N-dimensional edit metric via a new misaligners framework built on synchronization-string properties, proves that no such embedding can exceed rate 15/32 for any n, improves conditional lower bounds for edit-metric optimization problems, and shows that the rate can be made arbitrarily close to 1 when input and output alphabets are allowed to differ.

Significance. If the central construction and upper-bound argument hold, the result is a substantial advance: prior isometric embeddings achieved only rate Ω(1/log n), while this delivers a positive constant rate together with a concrete upper bound and an alphabet-generalization result. The misaligners framework and the structural properties derived for the upper bound constitute new technical tools with potential reuse in coding theory and metric embeddings. The paper supplies both matching-style bounds and an application to conditional lower bounds, which strengthens its contribution.

major comments (2)
  1. [misaligners construction] The misaligners framework (introduced to obtain the rate-1/8 embedding) must be shown to produce exact distance preservation for every pair; the reduction from synchronization-string properties needs an explicit argument that no extra insertions or deletions are introduced beyond those controlled by the framework.
  2. [upper-bound proof] The upper-bound argument establishing that rate >15/32 is impossible for all n relies on newly identified structural properties of any Hamming-to-edit isometry; these properties should be stated precisely (e.g., any necessary alignment or periodicity constraints) so that the 15/32 calculation can be verified directly from them.
minor comments (1)
  1. [abstract and construction] Clarify whether the rate-1/8 construction holds for every n or only sufficiently large n; the abstract states the upper bound for all positive integers n, so consistency should be explicit.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation and the recommendation of minor revision. The two major comments identify opportunities to improve the explicitness of our arguments. We address each below and will incorporate the requested clarifications in the revised manuscript.

read point-by-point responses
  1. Referee: [misaligners construction] The misaligners framework (introduced to obtain the rate-1/8 embedding) must be shown to produce exact distance preservation for every pair; the reduction from synchronization-string properties needs an explicit argument that no extra insertions or deletions are introduced beyond those controlled by the framework.

    Authors: The distance-preservation argument appears in the proof of Theorem 3.1, where we reduce to the synchronization-string properties established by Haeupler-Shahrasbi and then invoke the misaligners construction to control all insertions and deletions. We agree that isolating this control into a dedicated lemma (showing that the framework introduces no uncontrolled indels) would make the reduction fully explicit. We will add such a lemma in the revised version. revision: yes

  2. Referee: [upper-bound proof] The upper-bound argument establishing that rate >15/32 is impossible for all n relies on newly identified structural properties of any Hamming-to-edit isometry; these properties should be stated precisely (e.g., any necessary alignment or periodicity constraints) so that the 15/32 calculation can be verified directly from them.

    Authors: The structural properties (alignment and periodicity constraints) are derived in Lemma 4.2 and applied in the proof of Theorem 4.3 to obtain the 15/32 bound. We concur that restating these properties as a self-contained list of necessary conditions, followed by an explicit calculation of the resulting rate upper bound, will allow direct verification. We will revise the presentation accordingly. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces a novel misaligners framework to construct an isometric embedding of rate exactly 1/8, leveraging properties of synchronization strings from the independent prior result Haeupler-Shahrasbi (JACM'21). This citation supplies external, published support on insertion-deletion codes and is not load-bearing for the new derivation; the embedding and distance-preservation arguments are derived within the present work. The complementary upper bound of 15/32 follows from newly proven structural properties of any Hamming-to-edit isometry. No equations reduce by construction to fitted inputs, no self-definitional loops appear, and no uniqueness theorems or ansatzes are imported via overlapping-author citations. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The paper adds the misaligners framework and the explicit rate-1/8 construction; it draws the synchronization-string properties from prior literature and proves structural upper bounds from first principles.

axioms (1)
  • domain assumption Existence and distance-preserving properties of synchronization strings as established in insertion-deletion coding literature
    The construction explicitly invokes connections to synchronization strings studied in Haeupler-Shahrasbi [JACM'21].
invented entities (1)
  • misaligners no independent evidence
    purpose: Framework for obtaining high-rate isometric embeddings
    Novel object introduced to facilitate the constant-rate construction.

pith-pipeline@v0.9.0 · 5959 in / 1476 out tokens · 136572 ms · 2026-05-22T21:14:47.589785+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.