Pith Number

pith:7E2NW6ZB

pith:2026:7E2NW6ZBNSOU2XL7N2GUWA47IE
not attested · not anchored · not stored · refs resolved

Real-time Speech Restoration using Data Prediction Mean Flows

Sebastian Braun

Data Prediction Mean Flows let generative speech restoration run in real time with 120 times less compute than prior methods.

arxiv:2605.16251 v1 · 2026-05-15 · eess.AS

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7E2NW6ZBNSOU2XL7N2GUWA47IE}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Compared to state-of-the-art, our proposed mean flow model uses 120x less compute and introduces no algorithmic latency other than the STFT, while achieving similar audio quality.

C2 weakest assumption

That the novel low-latency architecture combined with few-step Data Prediction Mean Flows can preserve audio quality comparable to large offline generative models under strict real-time constraints.

C3 one-line summary

A Data Prediction Mean Flow model enables real-time speech restoration with 120x lower compute and no algorithmic latency beyond the STFT while matching state-of-the-art offline quality.

References

28 extracted · 28 resolved · 3 Pith anchors

[1] Speech restoration aims to address and fix all of those non-linear and destructive degradations by using generative modeling
[2] Real-time Speech Restoration using Data Prediction Mean Flows 2026 · arXiv:2605.16251
[3] DATA AND IMPLEMENTATION 3.1. Training data We generate degraded and target speech pairs with on-the-fly augmentation with a similar pipeline as in [7] using studio-quality clean speech from EARS [
[4] Test set and metrics We use the Signal Improvement Challenge 2024 (SIG2024) [21] test set, 500 real-world recordings with typical degradations from devices and VoIP processing 2024
[5] CONCLUSIONS This work paved the way to drastically reduce computational cost and latency for general speech restoration flow-matching models. We demonstrate a 120x gain at increased quality by adoptin

Formal links

2 machine-checked theorem links

Cited by

1 paper in Pith

Receipt and verification
First computed 2026-05-20T00:02:00.157755Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

f934db7b216c9d4d5d7f6e8d4b039f412fda21b7afa8469db495be054df8b88b

Aliases

arxiv: 2605.16251 · arxiv_version: 2605.16251v1 · doi: 10.48550/arxiv.2605.16251 · pith_short_12: 7E2NW6ZBNSOU · pith_short_16: 7E2NW6ZBNSOU2XL7 · pith_short_8: 7E2NW6ZB
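
The identifiers on this record appear to be related in a simple way. Each `pith_short_N` alias is the first N characters of the full Pith Number, and the full number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash. This is an observation from the values shown here, not a documented rule of the `pith-number/v1.0` schema. The function names below are hypothetical, and the sketch only illustrates the relationship for this one record.

```python
import base64

# Canonical hash and Pith Number as printed on this record.
CANONICAL_HASH = "f934db7b216c9d4d5d7f6e8d4b039f412fda21b7afa8469db495be054df8b88b"


def pith_number_from_hash(hex_digest: str) -> str:
    """Base32-encode the first 16 bytes of a hex digest and drop '=' padding.

    Assumption: inferred from this record's values, not from a published spec.
    """
    first_16_bytes = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(first_16_bytes).decode("ascii").rstrip("=")


def short_alias(pith_number: str, length: int) -> str:
    """Return the prefix used as a short alias (8, 12, or 16 characters here)."""
    return pith_number[:length]


number = pith_number_from_hash(CANONICAL_HASH)
print(number)                    # 7E2NW6ZBNSOU2XL7N2GUWA47IE
print(short_alias(number, 8))    # 7E2NW6ZB
print(short_alias(number, 12))   # 7E2NW6ZBNSOU
print(short_alias(number, 16))   # 7E2NW6ZBNSOU2XL7
```

If the relationship holds in general, a client could check that a Pith Number and a canonical hash belong together without contacting the service.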
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7E2NW6ZBNSOU2XL7N2GUWA47IE \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: f934db7b216c9d4d5d7f6e8d4b039f412fda21b7afa8469db495be054df8b88b
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "29a97fc6f62c517e207476a1cf628905f2f66003c80009facfd2f322398b2323",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "eess.AS",
    "submitted_at": "2026-05-15T17:56:04Z",
    "title_canon_sha256": "e7f0115ee427a11cb6c3b9cff7e4512ab2cf16ec333f5f35dee343190f52e0b4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16251",
    "kind": "arxiv",
    "version": 1
  }
}
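
The verification command above canonicalizes the record (sorted keys, compact separators, no ASCII escaping) and hashes the UTF-8 bytes with SHA-256. The same steps can be sketched offline in Python against the record as displayed on this page. Whether the displayed JSON is byte-for-byte what the API returns under `canonical_record` is an assumption, so the sketch reports a match or mismatch rather than presuming one; `canonical_sha256` is a hypothetical helper name.

```python
import hashlib
import json

EXPECTED = "f934db7b216c9d4d5d7f6e8d4b039f412fda21b7afa8469db495be054df8b88b"

# Record copied from this page; assumed to equal the API's canonical_record.
record = {
    "metadata": {
        "abstract_canon_sha256": "29a97fc6f62c517e207476a1cf628905f2f66003c80009facfd2f322398b2323",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "eess.AS",
        "submitted_at": "2026-05-15T17:56:04Z",
        "title_canon_sha256": "e7f0115ee427a11cb6c3b9cff7e4512ab2cf16ec333f5f35dee343190f52e0b4",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16251", "kind": "arxiv", "version": 1},
}


def canonical_sha256(obj: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Sorting keys makes the digest independent of the order in which a mirror or client happens to store the fields, which is what allows independent parties to recompute the same hash.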