FaSST: Fast Sparsifying Secondary Transform

arxiv: 2605.15086 · v1 · pith:MLJD76SLnew · submitted 2026-05-14 · 📡 eess.IV · eess.SP

FaSST: Fast Sparsifying Secondary Transform

Darukeesan Pakiyarajah , Samuel Fern\'andez-Mendui\~na , Eduardo Pavez , Antonio Ortega , Debargha Mukherjee This is my paper

Pith reviewed 2026-05-15 03:14 UTC · model grok-4.3

classification 📡 eess.IV eess.SP

keywords FaSSTsecondary transformGivens rotationsLFNSTAV2sparse orthonormal transformintra predictionvideo coding

0 comments p. Extension

pith:MLJD76SL Add to your LaTeX paper

What is a Pith Number?

\usepackage{pith}
\pithnumber{MLJD76SL}

Prints a linked pith:MLJD76SL badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more

The pith

FaSST factorizes data-driven secondary transforms into mode-adaptive Givens rotations to match LFNST performance at far lower complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Fast Sparsifying Secondary Transforms (FaSST) as a framework to design low-complexity data-dependent secondary transforms for video coding. It approximates full sparse orthonormal transforms by breaking them into sequences of Givens rotations, with the rotations found through alternating minimization and an approximate factorization step. The number of rotations can vary with each intra prediction mode, which cuts the total work. A sympathetic reader would care because this lets codecs keep the coding gains of learned transforms without the heavy compute that has blocked their use in practice.

Core claim

Our approach approximates data-driven sparse orthonormal transforms (SOTs) by factorizing them into a sequence of Givens rotations. The rotations are efficiently determined using an alternating minimization strategy combined with an approximate Givens factorization procedure. Our method adapts the number of rotations based on the prediction mode, further reducing computational complexity. We design mode-dependent secondary transforms for intra-prediction residuals in AV2 using FaSST.

What carries the argument

Approximate factorization of sparse orthonormal transforms into mode-dependent sequences of Givens rotations via alternating minimization.

If this is right

Mode-adaptive FaSST matches the RD performance of LFNST while cutting computations by 83.67 percent.
By skipping fixed-coefficient truncation, FaSST delivers up to 1.80 percent BD-rate savings over LFNST at 66.24 percent lower complexity.
The transforms are explicitly designed for intra-prediction residuals inside AV2.
Mode-specific rotation counts let the encoder trade complexity against performance per block.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same factorization method could be applied to inter-prediction residuals or to other stages such as primary transforms.
Hardware implementations would see lower power draw because each block uses only the rotations required for its mode.
Extending the design to a larger set of video content beyond the AV2 test set might expose additional gains or edge cases.
Combining FaSST with existing low-complexity approximations already present in VVC could produce cumulative efficiency improvements.

Load-bearing premise

The alternating-minimization procedure combined with approximate Givens factorization produces transforms whose rate-distortion performance remains close to the original data-driven SOTs across all intra prediction modes and content types.

What would settle it

Direct BD-rate measurements on AV2 test sequences showing FaSST underperforms LFNST by more than 0.5 percent on average, or computation counts that fail to deliver at least 60 percent savings.

read the original abstract

Data-dependent secondary transforms, which aim to decorrelate coefficients of a separable primary transform, can improve residual coding efficiency; however, their deployment is often constrained by computational complexity. Recent video codecs use variants of the low-frequency non-separable transform (LFNST), which discards some high-frequency secondary transform coefficients, limiting achievable coding gains. Moreover, existing data-dependent secondary transforms lack explicit rate-distortion (RD) optimal design criteria. In this work, we propose a framework for designing low-complexity data-dependent secondary transforms, termed Fast Sparsifying Secondary Transforms (FaSSTs). Our approach approximates data-driven sparse orthonormal transforms (SOTs) by factorizing them into a sequence of Givens rotations. The rotations are efficiently determined using an alternating minimization strategy combined with an approximate Givens factorization procedure. Our method adapts the number of rotations based on the prediction mode, further reducing computational complexity. We design mode-dependent secondary transforms for intra-prediction residuals in AV2 using FaSST. Experimental results show that mode-adaptive FaSST matches the RD performance of LFNST while reducing the number of computations by 83.67%. Moreover, by avoiding fixed-coefficient truncation, FaSST achieves up to 1.80% BD-rate savings relative to LFNST while operating at 66.24% lower complexity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

FaSST gives a workable way to turn data-driven secondary transforms into fast mode-adaptive Givens sequences that cut complexity sharply while holding or improving BD-rate over LFNST in AV2 intra coding.

read the letter

The main thing here is a concrete design recipe: start with sparse orthonormal transforms fitted to intra residuals, then factor them into a short sequence of Givens rotations whose count is chosen per prediction mode. They use alternating minimization plus an approximate factorization step to pick the rotations. That combination is new for this setting and lets them drop the fixed truncation that LFNST uses. The reported numbers are 83.67% fewer computations for matched RD performance and up to 1.80% BD-rate gain at 66% lower complexity. Those are the practical payoffs the paper actually delivers. The approach is self-contained once the training residuals are fixed, and the mode-adaptive rotation count is a straightforward engineering knob that directly addresses encoder speed. What is missing is any direct check on how much the Givens version drifts from the original SOT. No Frobenius residuals, no quadratic-form deviation, and no RD curves for the unapproximated SOTs appear in the text. Without that link it is hard to know whether the observed gains come from the approximation itself or from the particular test clips and the LFNST baseline. The abstract also gives no error bars or breakdown of the test set size and content variety, so the statistical weight of the 1.80% figure stays unclear. Still, the construction is reproducible from the description, the baseline comparison is external, and the complexity claims rest on operation counts rather than fitted parameters. This is the kind of targeted improvement that codec implementers and transform researchers will want to look at. It is not foundational, but it is solid enough on the engineering side to deserve referee time. I would bring it to a reading group focused on video coding tools and would send it out for review rather than desk-reject.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces FaSST, a framework for designing low-complexity data-dependent secondary transforms by approximating sparse orthonormal transforms (SOTs) through factorization into Givens rotations using alternating minimization. Applied to intra-prediction residuals in AV2, mode-adaptive FaSST is shown to match LFNST rate-distortion performance while reducing computations by 83.67%, and achieve up to 1.80% BD-rate savings at 66.24% lower complexity by avoiding fixed-coefficient truncation.

Significance. If the approximation quality holds across modes and content, FaSST supplies a concrete route to deploy data-driven secondary transforms in video codecs at substantially lower complexity than LFNST while removing the truncation penalty, which could translate into measurable coding gains for AV2 and successor standards.

major comments (3)

[Experimental Results] Experimental Results section: the approximation residual between the original data-driven SOTs and the FaSST versions (e.g., ||SOT − FaSST||_F or the deviation in the quadratic form governing coefficient variance) is never reported, so the claim that mode-adaptive FaSST preserves RD performance cannot be verified independently of the particular test set.
[Experimental Results] Experimental Results section: no RD curves or tabulated performance figures are given for the unapproximated SOTs themselves, leaving the 1.80 % BD-rate gain relative to LFNST unanchored and potentially attributable to test-set specifics rather than the proposed construction.
[Method] Method section: the alternating-minimization procedure and approximate Givens factorization lack convergence criteria, initialization strategy, and any bound on the number of rotations per mode, all of which are load-bearing for reproducing the reported complexity reductions of 83.67 % and 66.24 %.

minor comments (2)

[Abstract] Abstract: the phrase 'up to 1.80 % BD-rate savings' is stated without indicating the sequences or QP range that attain the maximum, making the headline claim difficult to interpret.
[Experimental Results] The manuscript does not specify the size or diversity of the AV2 test set used for the BD-rate measurements, which is a standard requirement for reproducibility of the numerical claims.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the manuscript. We address each major point below and commit to revisions that enhance reproducibility and verifiability without altering the core claims.

read point-by-point responses

Referee: Experimental Results section: the approximation residual between the original data-driven SOTs and the FaSST versions (e.g., ||SOT − FaSST||_F or the deviation in the quadratic form governing coefficient variance) is never reported, so the claim that mode-adaptive FaSST preserves RD performance cannot be verified independently of the particular test set.

Authors: We agree that reporting the approximation residual is necessary for independent verification. In the revised manuscript we will add explicit tables listing the Frobenius norm ||SOT − FaSST||_F per prediction mode together with the resulting deviation in the quadratic form that governs coefficient variance. These quantities will be presented separately from the RD curves so that approximation quality can be assessed on its own. revision: yes
Referee: Experimental Results section: no RD curves or tabulated performance figures are given for the unapproximated SOTs themselves, leaving the 1.80 % BD-rate gain relative to LFNST unanchored and potentially attributable to test-set specifics rather than the proposed construction.

Authors: The unapproximated SOTs constitute the theoretical target but are computationally prohibitive for full codec integration; therefore their RD curves were not included in the original submission. The 1.80 % BD-rate advantage of FaSST over LFNST is attributable to the removal of fixed-coefficient truncation rather than to test-set idiosyncrasies. We will add a clarifying paragraph that explicitly links the observed gain to the non-truncated design and, where computationally feasible, will supply RD figures for SOTs on a representative subset of modes. revision: partial
Referee: Method section: the alternating-minimization procedure and approximate Givens factorization lack convergence criteria, initialization strategy, and any bound on the number of rotations per mode, all of which are load-bearing for reproducing the reported complexity reductions of 83.67 % and 66.24 %.

Authors: We acknowledge that the algorithmic description must be made fully reproducible. The revised Method section will specify (i) the convergence criterion (relative change in the objective below 10^{-4}), (ii) the initialization (identity matrix followed by a single random Givens sweep), and (iii) the mode-adaptive bound on the number of rotations (chosen to meet a per-mode sparsity target derived from the original SOT). These additions will allow exact reproduction of the stated complexity figures. revision: yes

Circularity Check

0 steps flagged

No significant circularity: FaSST construction and RD claims are externally anchored

full rationale

The paper defines FaSST as an explicit approximation procedure (alternating minimization + approximate Givens factorization) applied to pre-computed data-driven SOTs; the resulting transforms are then evaluated by direct RD measurement against the independent LFNST baseline on AV2 test sequences. No equation or claim reduces the reported BD-rate savings or complexity reduction to a fitted parameter or self-citation by construction. The design procedure is self-contained once the training residuals are supplied, and the headline performance numbers are obtained from external codec integration rather than from any internal redefinition of the objective.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the existence of a data-driven sparse orthonormal transform that can be well-approximated by a short product of Givens rotations without significant rate-distortion loss; this approximation quality is treated as an empirical fact rather than derived from first principles.

free parameters (1)

number of Givens rotations per mode
Chosen adaptively per prediction mode to trade off complexity against approximation fidelity; the specific counts are not stated in the abstract.

axioms (1)

domain assumption A sparse orthonormal transform exists for each intra-prediction residual class that improves coefficient decorrelation beyond the primary transform.
Invoked when the paper states that data-dependent secondary transforms can improve residual coding efficiency.

pith-pipeline@v0.9.0 · 5553 in / 1448 out tokens · 42865 ms · 2026-05-15T03:14:30.967662+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 2 internal anchors

[1]

FaSST: Fast Sparsifying Secondary Transform

INTRODUCTION The non-separable Karhunen–Lo `eve Transform (KLT) is optimal for linear decorrelation under common statistical assumptions [1–5]. However, the non-separable KLT is rarely used in practice due to its lack of low-complexity implementations and memory require- ments. Separable trigonometric transforms, such as the discrete cosine transform (DCT...

work page internal anchor Pith review Pith/arXiv arXiv 2026
[2]

coefficient dropping

PRELIMINARIES 2.1. Secondary transforms Letx∈R N 2 be the vector obtained by stacking the columns of anN×Nblock. Given a primary transformG, and a permutation matrixPthat reorders the transform coefficients (e.g., zig-zag scan), we write the primary transform coefficients as ˆx=PG ⊤x. Let ˜T be ann×nnon-separable orthonormal transform, referred to as the ...

work page
[3]

Moreover, KLTs and their approx- imations do not account for quantization effects

FAST SPARSIFYING SECONDARY TRANSFORMS LFNSTs significantly improve residual coding efficiency but rely on coefficient truncation to control computational complexity, which limits achievable coding gains. Moreover, KLTs and their approx- imations do not account for quantization effects. We address these limitations via our proposed Fast Sparsifying Seconda...

work page
[4]

given the coefficients{ ˆyi}me i=1, we optimize the transform: min SJ meX i=1 ∥ˆxi −S J ˆyi∥2 2 s.t.S J = JY j=1 G(mj, nj, θj).(6) It can be shown that the above problem is equivalent to max SJ tr(ΓSJ )s.t.S J = JY j=1 G(mj, nj, θj),(7) wheretr(·)denotes the trace of a matrix,Γ= ˆY ˆX⊤ denotes the cross-covariance matrix between the thresholded secondary ...

work page
[5]

Specifically, images from the CLIC dataset [20] are compressed using A V2, and residual blocks of sizes 8×8,16×16, and32×32are collected

SIMULA TIONS We use intra-prediction residuals obtained from A V2 for both trans- form learning and evaluation. Specifically, images from the CLIC dataset [20] are compressed using A V2, and residual blocks of sizes 8×8,16×16, and32×32are collected. Among these blocks, some used the DCT and others the ADST as primary transforms, and some used only a prima...

work page
[6]

CONCLUSION In this work, we proposed an FaSST framework for designing fast, data-dependent secondary transforms for residual coding. We formulate secondary transform learning as a rate-distortion-cost- inspired SOT problem with the additional constraint that the trans- form be factorized into a sequence of Givens rotations, enabling efficient implementati...

work page
[7]

A comparative study of image correlation models for directional two-dimensional sources

S. Zhu and B. Zeng, “A comparative study of image correla- tion models for directional two-dimensional sources,” in2011 IEEE 13th International Workshop on Multimedia Signal Pro- cessing, 2011, pp. 1–5

work page 2011
[8]

Mode-Dependent Transforms for Coding Directional Intra Prediction Residuals

C. Yeo, Y . H. Tan, Z. Li, and S. Rahardja, “Mode-dependent transforms for coding directional intra prediction residuals,” IEEE Transactions on Circuits and Systems for Video Technol- ogy, vol. 22, no. 4, pp. 545–554, 2012

work page 2012
[9]

Signal-Independent Separable KLT by Offline Training for Video Coding

K. Fan, R. Wang, W. Lin, L.-Y . Duan, and W. Gao, “Signal- independent separable KLT by offline training for video cod- ing,”IEEE Access, vol. 7, pp. 33087–33093, 2019

work page 2019
[10]

Video content dependent directional transform for intra frame coding

L. Xu, K. N. Ngan, and M. Wang, “Video content dependent directional transform for intra frame coding,” in2012 Picture Coding Symposium. IEEE, 2012, pp. 197–200

work page 2012
[11]

Non-separable mode dependent transforms for intra coding in HEVC

A. Arrufat, P. Philippe, and O. D´eforges, “Non-separable mode dependent transforms for intra coding in HEVC,” in2014 IEEE Visual Communications and Image Processing Confer- ence. IEEE, 2014, pp. 61–64

work page 2014
[12]

Overview of the Versatile Video Coding (VVC) Standard and its Applications

B. Bross, Y .-K. Wang, Y . Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the versatile video coding (VVC) standard and its applications,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736– 3764, 2021

work page 2021
[13]

Low Frequency Non-Separable Transform (LFNST)

M. Koo, M. Salehifar, J. Lim, and S.-H. Kim, “Low frequency non-separable transform (LFNST),” in2019 Picture Coding Symposium (PCS), 2019, pp. 1–5

work page 2019
[14]

Study On Coding Tools Beyond AV1

X. Zhao, L. Zhao, M. Krishnan, Y . Du, S. Liu, D. Mukherjee, Y . Xu, and A. Grange, “Study on coding tools beyond A V1,” in 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2021, pp. 1–6

work page 2021
[15]

Approximation and Compression With Sparse Orthonormal Transforms

O. G. Sezer, O. G. Guleryuz, and Y . Altunbasak, “Approxi- mation and compression with sparse orthonormal transforms,” IEEE Transactions on Image Processing, vol. 24, no. 8, pp. 2328–2343, 2015

work page 2015
[16]

Highly efficient non-separable transforms for next generation video coding

A. Said, X. Zhao, M. Karczewicz, H. E. Egilmez, V . Seregin, and J. Chen, “Highly efficient non-separable transforms for next generation video coding,” in2016 Picture Coding Sym- posium (PCS), 2016, pp. 1–5

work page 2016
[17]

An algorithm for the machine calculation of complex fourier series,

J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex fourier series,”Mathematics of compu- tation, vol. 19, no. 90, pp. 297–301, 1965

work page 1965
[18]

Robust Learning of 2-D Separable Transforms for Next-Generation Video Coding

O. G. Sezer, R. Cohen, and A. Vetro, “Robust learning of 2- D separable transforms for next-generation video coding,” in 2011 Data Compression Conference. IEEE, 2011, pp. 63–72

work page 2011
[19]

NSST: Non-separable secondary transforms for next generation video coding

X. Zhao, J. Chen, A. Said, V . Seregin, H. E. Egilmez, and M. Karczewicz, “NSST: Non-separable secondary transforms for next generation video coding,” in2016 Picture Coding Symposium (PCS), 2016, pp. 1–5

work page 2016
[20]

On secondary transforms for intra prediction residual

A. Saxena and F. C. Fernandes, “On secondary transforms for intra prediction residual,” in2012 IEEE International Confer- ence on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 1201–1204

work page 2012
[21]

Alternating minimization algorithms: From blahut-arimoto to expectation-maximization,

J. A. O’Sullivan, “Alternating minimization algorithms: From blahut-arimoto to expectation-maximization,” inCodes, curves, and signals: Common threads in communications, pp. 173–192. Springer, 1998

work page 1998
[22]

Convergence rates and source conditions for Tikhonov regularization with sparsity constraints

D. A. Lorenz, “Convergence rates and source conditions for tikhonov regularization with sparsity constraints,”arXiv preprint arXiv:0801.1774, 2008

work page internal anchor Pith review Pith/arXiv arXiv 2008
[23]

A generalized solution of the orthogonal procrustes problem,

P. H. Sch ¨onemann, “A generalized solution of the orthogonal procrustes problem,”Psychometrika, vol. 31, no. 1, pp. 1–10, 1966

work page 1966
[24]

Matrix Computations.

G. Golub and C. Van Loan, “Matrix computations, vol. 3 bal- timore,”MD: JHU Press., 2012

work page 2012
[25]

Approximate Fast Graph Fourier Transforms via Multilayer Sparse Approximations

L. Le Magoarou, R. Gribonval, and N. Tremblay, “Approx- imate fast graph fourier transforms via multilayer sparse ap- proximations,”IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 2, pp. 407–420, 2018

work page 2018
[26]

5th challenge on learned image compression,

“5th challenge on learned image compression,” 2022, Online

work page 2022
[27]

Joint Optimization of Primary and Secondary Transforms Using Rate-Distortion Optimized Transform Design

D. Pakiyarajah, E. Pavez, A. Ortega, D. Mukherjee, O. Guleryuz, K.-S. Lu, and K. K. Sivakumar, “Joint optimiza- tion of primary and secondary transforms using rate-distortion optimized transform design,” in2025 IEEE International Con- ference on Image Processing (ICIP), 2025, pp. 2594–2599

work page 2025
[28]

Image quality assessment: from error visibility to structural similarity

Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similar- ity,”IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004

work page 2004
[29]

The disparity between optimal and practical Lagrangian multiplier estimation in video encoders

D. J. Ringis, Vibhoothi, F. Piti´e, and A. Kokaram, “The dispar- ity between optimal and practical Lagrangian multiplier esti- mation in video encoders,”Frontiers in Signal Processing, vol. 3, pp. 1205104, 2023

work page 2023
[30]

High Throughput CABAC Entropy Coding in HEVC

V . Sze and M. Budagavi, “High throughput cabac entropy cod- ing in hevc,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1778–1791, 2012

work page 2012
[31]

Calculation of average PSNR differences be- tween RD-curves,

G. Bjontegaard, “Calculation of average PSNR differences be- tween RD-curves,”ITU-T SG16 Q, vol. 6, 2001

work page 2001

[1] [1]

FaSST: Fast Sparsifying Secondary Transform

INTRODUCTION The non-separable Karhunen–Lo `eve Transform (KLT) is optimal for linear decorrelation under common statistical assumptions [1–5]. However, the non-separable KLT is rarely used in practice due to its lack of low-complexity implementations and memory require- ments. Separable trigonometric transforms, such as the discrete cosine transform (DCT...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[2] [2]

coefficient dropping

PRELIMINARIES 2.1. Secondary transforms Letx∈R N 2 be the vector obtained by stacking the columns of anN×Nblock. Given a primary transformG, and a permutation matrixPthat reorders the transform coefficients (e.g., zig-zag scan), we write the primary transform coefficients as ˆx=PG ⊤x. Let ˜T be ann×nnon-separable orthonormal transform, referred to as the ...

work page

[3] [3]

Moreover, KLTs and their approx- imations do not account for quantization effects

FAST SPARSIFYING SECONDARY TRANSFORMS LFNSTs significantly improve residual coding efficiency but rely on coefficient truncation to control computational complexity, which limits achievable coding gains. Moreover, KLTs and their approx- imations do not account for quantization effects. We address these limitations via our proposed Fast Sparsifying Seconda...

work page

[4] [4]

given the coefficients{ ˆyi}me i=1, we optimize the transform: min SJ meX i=1 ∥ˆxi −S J ˆyi∥2 2 s.t.S J = JY j=1 G(mj, nj, θj).(6) It can be shown that the above problem is equivalent to max SJ tr(ΓSJ )s.t.S J = JY j=1 G(mj, nj, θj),(7) wheretr(·)denotes the trace of a matrix,Γ= ˆY ˆX⊤ denotes the cross-covariance matrix between the thresholded secondary ...

work page

[5] [5]

Specifically, images from the CLIC dataset [20] are compressed using A V2, and residual blocks of sizes 8×8,16×16, and32×32are collected

SIMULA TIONS We use intra-prediction residuals obtained from A V2 for both trans- form learning and evaluation. Specifically, images from the CLIC dataset [20] are compressed using A V2, and residual blocks of sizes 8×8,16×16, and32×32are collected. Among these blocks, some used the DCT and others the ADST as primary transforms, and some used only a prima...

work page

[6] [6]

CONCLUSION In this work, we proposed an FaSST framework for designing fast, data-dependent secondary transforms for residual coding. We formulate secondary transform learning as a rate-distortion-cost- inspired SOT problem with the additional constraint that the trans- form be factorized into a sequence of Givens rotations, enabling efficient implementati...

work page

[7] [7]

A comparative study of image correlation models for directional two-dimensional sources

S. Zhu and B. Zeng, “A comparative study of image correla- tion models for directional two-dimensional sources,” in2011 IEEE 13th International Workshop on Multimedia Signal Pro- cessing, 2011, pp. 1–5

work page 2011

[8] [8]

Mode-Dependent Transforms for Coding Directional Intra Prediction Residuals

C. Yeo, Y . H. Tan, Z. Li, and S. Rahardja, “Mode-dependent transforms for coding directional intra prediction residuals,” IEEE Transactions on Circuits and Systems for Video Technol- ogy, vol. 22, no. 4, pp. 545–554, 2012

work page 2012

[9] [9]

Signal-Independent Separable KLT by Offline Training for Video Coding

K. Fan, R. Wang, W. Lin, L.-Y . Duan, and W. Gao, “Signal- independent separable KLT by offline training for video cod- ing,”IEEE Access, vol. 7, pp. 33087–33093, 2019

work page 2019

[10] [10]

Video content dependent directional transform for intra frame coding

L. Xu, K. N. Ngan, and M. Wang, “Video content dependent directional transform for intra frame coding,” in2012 Picture Coding Symposium. IEEE, 2012, pp. 197–200

work page 2012

[11] [11]

Non-separable mode dependent transforms for intra coding in HEVC

A. Arrufat, P. Philippe, and O. D´eforges, “Non-separable mode dependent transforms for intra coding in HEVC,” in2014 IEEE Visual Communications and Image Processing Confer- ence. IEEE, 2014, pp. 61–64

work page 2014

[12] [12]

Overview of the Versatile Video Coding (VVC) Standard and its Applications

B. Bross, Y .-K. Wang, Y . Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the versatile video coding (VVC) standard and its applications,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736– 3764, 2021

work page 2021

[13] [13]

Low Frequency Non-Separable Transform (LFNST)

M. Koo, M. Salehifar, J. Lim, and S.-H. Kim, “Low frequency non-separable transform (LFNST),” in2019 Picture Coding Symposium (PCS), 2019, pp. 1–5

work page 2019

[14] [14]

Study On Coding Tools Beyond AV1

X. Zhao, L. Zhao, M. Krishnan, Y . Du, S. Liu, D. Mukherjee, Y . Xu, and A. Grange, “Study on coding tools beyond A V1,” in 2021 IEEE International Conference on Multimedia and Expo (ICME). IEEE, 2021, pp. 1–6

work page 2021

[15] [15]

Approximation and Compression With Sparse Orthonormal Transforms

O. G. Sezer, O. G. Guleryuz, and Y . Altunbasak, “Approxi- mation and compression with sparse orthonormal transforms,” IEEE Transactions on Image Processing, vol. 24, no. 8, pp. 2328–2343, 2015

work page 2015

[16] [16]

Highly efficient non-separable transforms for next generation video coding

A. Said, X. Zhao, M. Karczewicz, H. E. Egilmez, V . Seregin, and J. Chen, “Highly efficient non-separable transforms for next generation video coding,” in2016 Picture Coding Sym- posium (PCS), 2016, pp. 1–5

work page 2016

[17] [17]

An algorithm for the machine calculation of complex fourier series,

J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex fourier series,”Mathematics of compu- tation, vol. 19, no. 90, pp. 297–301, 1965

work page 1965

[18] [18]

Robust Learning of 2-D Separable Transforms for Next-Generation Video Coding

O. G. Sezer, R. Cohen, and A. Vetro, “Robust learning of 2- D separable transforms for next-generation video coding,” in 2011 Data Compression Conference. IEEE, 2011, pp. 63–72

work page 2011

[19] [19]

NSST: Non-separable secondary transforms for next generation video coding

X. Zhao, J. Chen, A. Said, V . Seregin, H. E. Egilmez, and M. Karczewicz, “NSST: Non-separable secondary transforms for next generation video coding,” in2016 Picture Coding Symposium (PCS), 2016, pp. 1–5

work page 2016

[20] [20]

On secondary transforms for intra prediction residual

A. Saxena and F. C. Fernandes, “On secondary transforms for intra prediction residual,” in2012 IEEE International Confer- ence on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 1201–1204

work page 2012

[21] [21]

Alternating minimization algorithms: From blahut-arimoto to expectation-maximization,

J. A. O’Sullivan, “Alternating minimization algorithms: From blahut-arimoto to expectation-maximization,” inCodes, curves, and signals: Common threads in communications, pp. 173–192. Springer, 1998

work page 1998

[22] [22]

Convergence rates and source conditions for Tikhonov regularization with sparsity constraints

D. A. Lorenz, “Convergence rates and source conditions for tikhonov regularization with sparsity constraints,”arXiv preprint arXiv:0801.1774, 2008

work page internal anchor Pith review Pith/arXiv arXiv 2008

[23] [23]

A generalized solution of the orthogonal procrustes problem,

P. H. Sch ¨onemann, “A generalized solution of the orthogonal procrustes problem,”Psychometrika, vol. 31, no. 1, pp. 1–10, 1966

work page 1966

[24] [24]

Matrix Computations.

G. Golub and C. Van Loan, “Matrix computations, vol. 3 bal- timore,”MD: JHU Press., 2012

work page 2012

[25] [25]

Approximate Fast Graph Fourier Transforms via Multilayer Sparse Approximations

L. Le Magoarou, R. Gribonval, and N. Tremblay, “Approx- imate fast graph fourier transforms via multilayer sparse ap- proximations,”IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 2, pp. 407–420, 2018

work page 2018

[26] [26]

5th challenge on learned image compression,

“5th challenge on learned image compression,” 2022, Online

work page 2022

[27] [27]

Joint Optimization of Primary and Secondary Transforms Using Rate-Distortion Optimized Transform Design

D. Pakiyarajah, E. Pavez, A. Ortega, D. Mukherjee, O. Guleryuz, K.-S. Lu, and K. K. Sivakumar, “Joint optimiza- tion of primary and secondary transforms using rate-distortion optimized transform design,” in2025 IEEE International Con- ference on Image Processing (ICIP), 2025, pp. 2594–2599

work page 2025

[28] [28]

Image quality assessment: from error visibility to structural similarity

Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similar- ity,”IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004

work page 2004

[29] [29]

The disparity between optimal and practical Lagrangian multiplier estimation in video encoders

D. J. Ringis, Vibhoothi, F. Piti´e, and A. Kokaram, “The dispar- ity between optimal and practical Lagrangian multiplier esti- mation in video encoders,”Frontiers in Signal Processing, vol. 3, pp. 1205104, 2023

work page 2023

[30] [30]

High Throughput CABAC Entropy Coding in HEVC

V . Sze and M. Budagavi, “High throughput cabac entropy cod- ing in hevc,”IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1778–1791, 2012

work page 2012

[31] [31]

Calculation of average PSNR differences be- tween RD-curves,

G. Bjontegaard, “Calculation of average PSNR differences be- tween RD-curves,”ITU-T SG16 Q, vol. 6, 2001

work page 2001