How to Design a Compact High-Throughput Video Camera?

Chenxi Qiu; Tao Yue; Xuemei Hu

arxiv: 2604.10619 · v1 · submitted 2026-04-12 · 💻 cs.CV

Pith reviewed 2026-05-10 16:38 UTC · model grok-4.3

keywords gradient camera · high-throughput video · low-bit quantization · image reconstruction · multi-scale CNN · readout bottleneck · single-chip imaging · video acquisition

The pith

A low-bit gradient camera scheme with multi-scale CNN reconstruction can resolve readout and transmission bottlenecks for high-throughput video on a single compact sensor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper observes that today's high-throughput video systems splice together many sub-images, which makes the hardware complex. Gradient cameras capture intensity differences instead of absolute values, allowing faster readout and a more compact data representation. By quantizing these gradients to low bit depths with existing sensor technology, the design aims to shrink the data volume enough to fit current readout and transmission limits, while a multi-scale convolutional neural network reconstructs full-resolution frames. This matters because pixel counts keep rising faster than readout and transmission speeds, so efficient intermediate representations become essential for practical single-chip high-throughput cameras. The authors report experiments on both simulated and real captured data in support of the method's quality and feasibility.

Core claim

A low-bit gradient camera scheme based on existing technologies resolves the readout and transmission bottlenecks for high throughput video imaging by exploiting the fast readout and efficient representation strengths of gradient information, with a multi-scale reconstruction CNN recovering high-resolution images from the captured low-bit gradient data.

What carries the argument

The low-bit gradient camera scheme that records quantized spatial gradients and the multi-scale reconstruction CNN that inverts those gradients into full-resolution video frames.
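
As a rough illustration of the sensing half of this mechanism, the sketch below takes horizontal forward differences of a tiny 8-bit image and quantizes them uniformly to a few bits. The function names, the forward-difference operator, and the uniform quantizer are assumptions made for illustration; this page does not give the paper's actual gradient operator or quantization function.

```python
def horizontal_gradient(img):
    """Forward difference along each row: g[y][x] = img[y][x+1] - img[y][x]."""
    return [[row[x + 1] - row[x] for x in range(len(row) - 1)] for row in img]


def quantize(grad, bits, max_abs=255):
    """Map gradients in [-max_abs, max_abs] to integer codes in [0, 2**bits - 1].

    Assumption: a plain uniform quantizer; a real sensor may behave differently.
    """
    step = (2 * max_abs) / (2 ** bits - 1)
    return [[round((g + max_abs) / step) for g in row] for row in grad]


def dequantize(codes, bits, max_abs=255):
    """Inverse mapping from integer codes back to approximate gradient values."""
    step = (2 * max_abs) / (2 ** bits - 1)
    return [[c * step - max_abs for c in row] for row in codes]


# Tiny hypothetical 8-bit image (two rows, four columns).
image = [[10, 12, 200, 198],
         [0, 0, 255, 255]]

grad = horizontal_gradient(image)
codes = quantize(grad, bits=2)       # each gradient sample now needs only 2 bits
approx = dequantize(codes, bits=2)   # coarse gradients a reconstruction step would start from
```

The coarse values in `approx` show how much information the quantizer discards, which is the loss the reconstruction network is meant to compensate for.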

If this is right

High-throughput video acquisition becomes feasible on a single chip without the need to splice hundreds of sub-sensors.
Readout and output bandwidth no longer scale directly with pixel count or frame rate.
The overall camera system remains compact while supporting higher spatial and temporal resolution.
Reconstruction quality holds across both simulated data and real captured sequences.
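
The bandwidth point above can be made concrete with a back-of-the-envelope calculation. The sketch below only shows how the per-sample bit depth enters the raw output rate; the sensor size, frame rate, and bit depths are hypothetical values chosen for illustration, not figures from the paper.

```python
def raw_data_rate_gbps(width, height, fps, bits_per_sample):
    """Uncompressed output rate in gigabits per second."""
    return width * height * fps * bits_per_sample / 1e9


# Hypothetical 100-megapixel sensor at 30 frames per second (illustrative only).
full_rate = raw_data_rate_gbps(10_000, 10_000, 30, bits_per_sample=10)  # 10-bit intensity
low_rate = raw_data_rate_gbps(10_000, 10_000, 30, bits_per_sample=1)    # 1-bit gradient

print(f"10-bit intensity: {full_rate:.1f} Gbit/s")
print(f"1-bit gradient:   {low_rate:.1f} Gbit/s")
```

Under these assumed numbers the raw rate drops from 30 Gbit/s to 3 Gbit/s. Any side information the scheme also transmits would add to the lower figure.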

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the hardware proves manufacturable, the approach could be adapted to other gradient-based sensors in scientific or industrial imaging.
The CNN reconstruction step might allow trading sensor bit depth for computational post-processing in future video pipelines.
Extending the multi-scale network to handle motion or varying illumination could broaden applicability beyond the tested conditions.
Integration testing with actual low-bit readout circuits would directly validate whether the information loss remains recoverable.

Load-bearing premise

Current sensor and readout hardware can implement a low-bit gradient camera that still supplies enough information for the multi-scale CNN to recover accurate high-resolution frames at the target throughput.

What would settle it

Build a prototype low-bit gradient sensor, stream its output through the proposed multi-scale CNN, and measure whether the reconstructed video maintains acceptable quality and frame rate compared with a conventional high-bit-depth camera of the same pixel count.
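
One common way to put a number on "acceptable quality" in such a comparison is PSNR, a metric the referee report on this page also names. The sketch below is a minimal pure-Python version for images stored as lists of rows; the function name and the 8-bit peak value are assumptions, and the paper's own evaluation protocol is not described here.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for ref_row, rec_row in zip(reference, reconstructed):
        for ref_px, rec_px in zip(ref_row, rec_row):
            squared_error += (ref_px - rec_px) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical ground-truth frame and a slightly perturbed reconstruction.
truth = [[10, 20, 30], [40, 50, 60]]
recon = [[11, 19, 30], [42, 50, 58]]
score = psnr(truth, recon)
```

A higher score means the reconstruction is closer to the reference. What threshold counts as acceptable depends on the application and is not something this sketch decides.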

Figures

Figures reproduced from arXiv: 2604.10619 by Chenxi Qiu, Tao Yue, Xuemei Hu.

**Figure 2.** The principle verification of the proposed method with [PITH_FULL_IMAGE:figures/full_fig_p003_2.png]

**Figure 3.** The diagram of our multi-scale fusion reconstruction network. The network takes low-bit HRG map and LRI image as input, and [PITH_FULL_IMAGE:figures/full_fig_p004_3.png]

**Figure 4.** The reconstruction results of different HRG acquisition schemes, the first row is the schematic diagram of HRG gradient acqui [PITH_FULL_IMAGE:figures/full_fig_p007_4.png]

**Figure 5.** Experimental results on different datasets. top: Stanford [PITH_FULL_IMAGE:figures/full_fig_p007_5.png]

**Figure 6.** Influence of reference HR image on YUP++ dataset. (a). [PITH_FULL_IMAGE:figures/full_fig_p008_6.png]

**Figure 9.** Real captured images, compressive ratio: 0.066. HRG map: top, reconstruction result: middle, ground truth: bottom. [PITH_FULL_IMAGE:figures/full_fig_p009_9.png]

Original abstract

High throughput video acquisition is a challenging problem and has been drawing increasing attention. Existing high throughput imaging systems splice hundreds of sub-images/videos into high throughput videos, suffering from extremely high system complexity. Alternatively, with pixel sizes reducing to sub-micrometer levels, integrating ultra-high throughput on a single chip is becoming feasible. Nevertheless, the readout and output transmission speed cannot keep pace with the increasing pixel numbers. To this end, this paper analyzes the strength of gradient cameras in fast readout and efficient representation, and proposes a low-bit gradient camera scheme based on existing technologies that can resolve the readout and transmission bottlenecks for high throughput video imaging. A multi-scale reconstruction CNN is proposed to reconstruct high-resolution images. Extensive experiments on both simulated and real data are conducted to demonstrate the promising quality and feasibility of the proposed method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper sketches a single-chip high-throughput video camera via low-bit gradient sensing and a multi-scale CNN, which looks like a grounded engineering step but still needs hardware validation.

This paper gives a workable path to single-chip high-throughput video by combining low-bit gradient capture with a multi-scale CNN for image reconstruction. It sidesteps the complexity of multi-camera arrays by pushing more of the load onto smarter sensing and post-processing.

The authors start by noting that gradient cameras already allow faster readout and a more efficient data representation than standard intensity sensors. From there they suggest dropping to low-bit gradients, using existing technology, to cut the amount of data that has to be read out and transmitted. The CNN then reconstructs the high-resolution frames from these limited measurements, using multiple scales to recover details.

The experiments on simulated and real data are the strongest part. Showing results from both helps, and the authors claim promising quality without overhyping. It feels like a direct engineering response to the pixel-count versus readout-speed problem.

The main uncertainty is on the hardware side. Can you really build a low-bit gradient sensor that keeps enough signal for the CNN at the desired throughput? The paper positions the experiments as evidence that it works, but without the specific bit rates, frame rates, and error metrics it is hard to gauge how close it is to practical use. Reconstruction quality might also vary with scene content, though their tests cover some of that.

This would interest researchers in compact imaging hardware and in computer vision applications that need high-speed video in small packages. It deserves a serious referee because the problem is real, the proposal is concrete, and the evaluation includes real data. If the numbers hold up under review, it could influence how people design future sensors.

Referee Report

2 major / 3 minor

Summary. The paper analyzes the advantages of gradient cameras for fast readout and compact representation, then proposes a low-bit gradient camera architecture built on existing sensor and readout technologies to overcome readout and transmission bottlenecks in high-pixel-count single-chip video sensors. It introduces a multi-scale CNN to reconstruct full-resolution frames from the low-bit gradient measurements and reports experiments on both simulated and real data that demonstrate promising reconstruction quality and overall feasibility for high-throughput video acquisition.

Significance. If the low-bit gradient scheme proves realizable with current hardware and the multi-scale CNN delivers high-fidelity reconstruction at the target throughputs, the work could enable simpler, more compact high-throughput video systems without the complexity of splicing hundreds of sub-cameras. The emphasis on gradient-domain sensing for data efficiency and the CNN-based recovery pipeline represent a practical contribution to computational imaging for high-speed applications.

major comments (2)

[§4] §4 (Experiments on real data): the manuscript states that experiments demonstrate 'promising quality and feasibility,' yet provides no quantitative metrics (e.g., PSNR, SSIM, or throughput measurements) or direct comparisons against existing high-throughput baselines; without these numbers it is impossible to verify whether the multi-scale CNN recovers sufficient detail from the low-bit gradients to support the central feasibility claim.
[§3.2] §3.2 (Low-bit gradient camera scheme): the claim that the design 'can be realized with existing technologies' rests on qualitative analysis of readout speeds; a concrete calculation or reference to measured sensor parameters (e.g., ADC bit-depth, row readout time) showing that the proposed bit reduction actually meets the target frame rate is missing and is load-bearing for the throughput-resolution argument.
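
The kind of calculation the second major comment asks for might look like the sketch below. It assumes a row-by-row readout with column-parallel single-slope ADCs, whose worst-case conversion takes 2^N clock cycles for N bits, and it ignores settling, sampling, and transmission time. The row count and clock frequency are hypothetical, and this page does not confirm that the paper's design uses this ADC type.

```python
def single_slope_conversion_time_s(bits, clock_hz):
    """Worst-case ramp time of a single-slope ADC: 2**bits clock cycles."""
    return (2 ** bits) / clock_hz


def max_frame_rate(rows, row_readout_time_s):
    """Upper bound on frames per second when rows are read out one after another."""
    return 1.0 / (rows * row_readout_time_s)


# Hypothetical sensor: 10,000 rows, 100 MHz ADC clock (illustrative values only).
ROWS = 10_000
CLOCK_HZ = 100e6

fps_10bit = max_frame_rate(ROWS, single_slope_conversion_time_s(10, CLOCK_HZ))
fps_2bit = max_frame_rate(ROWS, single_slope_conversion_time_s(2, CLOCK_HZ))

print(f"10-bit readout bound: {fps_10bit:.2f} frames/s")
print(f"2-bit readout bound:  {fps_2bit:.0f} frames/s")
```

With these made-up numbers the bound rises from about 9.8 frames per second at 10 bits to 2500 at 2 bits. A real argument would replace the assumed values with measured sensor parameters, as the referee requests.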

minor comments (3)

[Abstract] The abstract and introduction use 'promising quality' without defining the target quality metric or acceptable error threshold for the intended applications.
[§3] Notation for the gradient operator and the low-bit quantization function is introduced without a dedicated symbols table or consistent equation numbering, making it difficult to follow the data-flow description in §3.
[Figures] Figure captions for the reconstruction results should include the exact bit-depth, frame rate, and sensor parameters used in each experiment to allow reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the insightful comments, which help us improve the clarity and rigor of our work. We provide point-by-point responses to the major comments below.

Referee: [§4] §4 (Experiments on real data): the manuscript states that experiments demonstrate 'promising quality and feasibility,' yet provides no quantitative metrics (e.g., PSNR, SSIM, or throughput measurements) or direct comparisons against existing high-throughput baselines; without these numbers it is impossible to verify whether the multi-scale CNN recovers sufficient detail from the low-bit gradients to support the central feasibility claim.

Authors: We agree that quantitative evaluation is important for validating the reconstruction quality. Although the manuscript includes visual results on real data to demonstrate feasibility, we will add PSNR and SSIM metrics for the real data experiments in the revised version. We will also include throughput calculations and comparisons to relevant baselines to better support the claims. Revision: yes.
Referee: [§3.2] §3.2 (Low-bit gradient camera scheme): the claim that the design 'can be realized with existing technologies' rests on qualitative analysis of readout speeds; a concrete calculation or reference to measured sensor parameters (e.g., ADC bit-depth, row readout time) showing that the proposed bit reduction actually meets the target frame rate is missing and is load-bearing for the throughput-resolution argument.

Authors: The analysis in §3.2 is based on standard sensor characteristics, but we acknowledge the need for more concrete support. In the revision, we will include a specific calculation using typical values for row readout time and ADC bit-depth from commercial sensors, along with references to relevant datasheets, to demonstrate how the low-bit gradient scheme achieves the required throughput. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity detected

The paper analyzes gradient camera properties for readout efficiency and proposes a low-bit scheme using existing sensor technologies plus a multi-scale CNN for reconstruction, supported by experiments on simulated and real data. No equations, derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The claims rest on external analysis and empirical validation rather than reducing to self-definition or input-by-construction, making the work self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The proposal relies on the unproven assumption that existing technologies can support the low-bit gradient readout at the required scale and that the CNN can recover full images from the reduced data. No free parameters, axioms, or invented entities are explicitly introduced in the abstract.

pith-pipeline@v0.9.0 · 5431 in / 1086 out tokens · 24072 ms · 2026-05-10T16:38:41.661335+00:00 · methodology

[1] [1]

High resolution large format tile- scan camera: Design, calibration, and extended depth of field

Moshe Ben-Ezra. High resolution large format tile- scan camera: Design, calibration, and extended depth of field. In2010 IEEE International Conference on Computational Photography (ICCP), pages 1–8,

work page

[2] [2]

Low-complexity single-image super-resolution based on nonnegative neighbor embedding

Marco Bevilacqua, Aline Roumy, Christine Guille- mot, and Marie line Alberi Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. InProceedings of the British Machine Vision Conference (BMVC), pages 135.1– 135.10, 2012. 6

work page 2012

[3] [3]

Multiscale gigapixel photography.Nature, 486(7403):386–389, 2012

David J Brady, Michael E Gehm, Ronald A Stack, Daniel L Marks, David S Kittle, Dathon R Golish, EM Vera, and Steven D Feller. Multiscale gigapixel photography.Nature, 486(7403):386–389, 2012. 1, 2

work page 2012

[4] [4]

120MXS CMOS sensor.https://canon- cmos-sensors.com/canon-120mxs-cmos- sensor/

Canon. 120MXS CMOS sensor.https://canon- cmos-sensors.com/canon-120mxs-cmos- sensor/. 1, 2, 6

work page

[5] [5]

A dual camera sys- tem for high spatiotemporal resolution video acquisi- tion.IEEE Computer Architecture Letters, (01):1–1,

Ming Cheng, Zhan Ma, Salman Asif, Yiling Xu, Hao- jie Liu, Wenbo Bao, and Jun Sun. A dual camera sys- tem for high spatiotemporal resolution video acquisi- tion.IEEE Computer Architecture Letters, (01):1–1,

work page

[6] [6]

Gigapixel computational imaging

Oliver S Cossairt, Daniel Miau, and Shree K Nayar. Gigapixel computational imaging. In2011 IEEE In- ternational Conference on Computational Photogra- phy (ICCP), pages 1–8, 2011. 1, 2

work page 2011

[7] [7]

Second-order attention network for single image super-resolution

Tao Dai, Jianrui Cai, Yongbing Zhang, Shu-Tao Xia, and Lei Zhang. Second-order attention network for single image super-resolution. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 11065–11074, 2019. 5

work page 2019

[8] [8]

Video-rate imaging of bio- logical dynamics at centimetre scale and micrometre resolution.Nature Photonics, 13(11):809–816, 2019

Jingtao Fan, Jinli Suo, Jiamin Wu, Hao Xie, Yibing Shen, Feng Chen, Guijin Wang, Liangcai Cao, Guofan Jin, Quansheng He, et al. Video-rate imaging of bio- logical dynamics at centimetre scale and micrometre resolution.Nature Photonics, 13(11):809–816, 2019. 1, 2

work page 2019

[9] [9]

Temporal residual networks for dynamic scene recognition

Christoph Feichtenhofer, Axel Pinz, and Richard P Wildes. Temporal residual networks for dynamic scene recognition. InProceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition, pages 4728–4737, 2017. 6, 7

work page 2017

[10] [10]

Retrieving gray-level in- formation from a binary sensor and its application to gesture detection

Orazio Gallo, Iuri Frosio, Leonardo Gasparini, Kari Pulli, and Massimo Gottardi. Retrieving gray-level in- formation from a binary sensor and its application to gesture detection. InProceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 21–26, 2015. 2

work page 2015

[11] [11]

A 100 w 128 64 pixels contrast-based asyn- chronous binary vision sensor for sensor networks applications.IEEE Journal of Solid-State Circuits (JSSC), 44(5):1582–1592, 2009

Massimo Gottardi, Nicola Massari, and Syed Arsalan Jawed. A 100 w 128 64 pixels contrast-based asyn- chronous binary vision sensor for sensor networks applications.IEEE Journal of Solid-State Circuits (JSSC), 44(5):1582–1592, 2009. 2, 3

work page 2009

[12] [12]

Closed-loop matters: Dual regression networks for single image super-resolution

Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, and Mingkui Tan. Closed-loop matters: Dual regression networks for single image super-resolution. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5407–5416, 2020. 2, 5 10

work page 2020

[13] [13]

Deep back-projection networks for super-resolution

Muhammad Haris, Gregory Shakhnarovich, and Norimichi Ukita. Deep back-projection networks for super-resolution. InProceedings of the IEEE con- ference on computer vision and pattern recognition, pages 1664–1673, 2018. 2, 5

work page 2018

[14] [14]

Lsst: from science drivers to reference design and anticipated data products.The Astrophysical Jour- nal, 873(2):111, 2019

ˇZeljko Ivezi´c, Steven M Kahn, J Anthony Tyson, Bob Abel, Emily Acosta, Robyn Allsman, David Alonso, Yusra AlSayyad, Scott F Anderson, John Andrew, et al. Lsst: from science drivers to reference design and anticipated data products.The Astrophysical Jour- nal, 873(2):111, 2019. 1, 2

work page 2019

[15] [15]

Jayasuriya, O

S. Jayasuriya, O. Gallo, J. Gu, T. Aila, and J. Kautz. Reconstructing intensity images from binary spatial gradient cameras. InProceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 337–343, 2017. 2

[16] Nicholas Kaiser, Herve Aussel, Barry E Burke, Hans Boesgaard, Ken Chambers, Mark Richard Chun, James N Heasley, Klaus-Werner Hodapp, Bobby Hunt, Robert Jedicke, et al. Pan-STARRS: a large synoptic survey telescope array. In Survey and Other Telescope Technologies and Discoveries, volume 4836, pages 154–164, 2002. 1, 2

[17] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014. 6

[18] Wei-Sheng Lai, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Deep Laplacian pyramid networks for fast and accurate super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 624–632, 2017. 2, 5

[19] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4681–4690, 2017. 5

[20] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 136–144,

[21] Moritz Menze and Andreas Geiger. Object scene flow for autonomous vehicles. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3061–3070, 2015. 6

[22] MIPI. CSI2. https://mipi.org/specifications/csi-2. 1, 3, 6

[23] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015. 4

[24] Tamar Rott Shaham, Tali Dekel, and Tomer Michaeli. SinGAN: Learning a generative model from a single natural image. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2019. 2

[25] Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1874–1883, 2016. 5

[26] Sony. IMX411, 461. https://www.sony-semicon.co.jp/e/products/IS/camera/product.html. 3

[27] Sony. IMX411. https://www.sony-semicon.co.jp/products/common/pdf/IMX411ALR_AQR_Flyer.pdf. 1, 2

[28] Sony. IMX461. https://www.sony-semicon.co.jp/products/common/pdf/IMX461ALR_AQR_Flyer.pdf. 1, 2, 6

[29] Stanford. The (new) Stanford light field archive. http://lightfield.stanford.edu/lfs.html. 6

[30] Samsung. ISOCELL Bright HMX. https://www.samsung.com/semiconductor/minisite/isocell/mobile-image-sensors/isocell-bright-hmx/. 1, 2, 6, 9

[31] Samsung. ISOCELL S5KGH1. https://www.samsung.com/semiconductor/image-sensor/mobile-image-sensor/S5KGH1/. 1

[32] Radu Timofte, Eirikur Agustsson, Luc Van Gool, Ming-Hsuan Yang, and Lei Zhang. NTIRE 2017 challenge on single image super-resolution: Methods and results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 114–125, 2017. 6

[33] Tong Tong, Gen Li, Xiejie Liu, and Qinquan Gao. Image super-resolution using dense skip connections. In Proceedings of the IEEE International Conference on Computer Vision, pages 4799–4807, 2017. 5

[34] Hirofumi Totsuka, Toshiki Tsuboi, Takashi Muto, Daisuke Yoshida, Yasushi Matsuno, Masanobu Ohmura, Hidekazu Takahashi, Katsuhito Sakurai, Takeshi Ichikawa, Hiroshi Yuzurihara, et al. 6.4 An APS-H-size 250Mpixel CMOS image sensor using column single-slope ADCs with dual-gain amplifiers. In 2016 IEEE International Solid-State Circuits Conference (ISSCC)...

[35] Jack Tumblin, Amit Agrawal, and Ramesh Raskar. Why I want a gradient camera. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), volume 1, pages 103–110, 2005. 2, 5

[36] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. ESRGAN: Enhanced super-resolution generative adversarial networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 0–0, 2018. 4

[37] Xueyang Wang, Xiya Zhang, Yinheng Zhu, Yuchen Guo, Xiaoyun Yuan, Liuyu Xiang, Zerun Wang, Guiguang Ding, David Brady, Qionghai Dai, and Lu Fang. PANDA: A gigapixel-level human-centric video dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020. 5

[38] Xueyang Wang, Xiya Zhang, Yinheng Zhu, Yuchen Guo, Xiaoyun Yuan, Liuyu Xiang, Zerun Wang, Guiguang Ding, David Brady, Qionghai Dai, et al. PANDA: A gigapixel-level human-centric video dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3268–3278, 2020. 8

[39] Roman Zeyde, Michael Elad, and Matan Protter. On single image scale-up using sparse-representations. In Curves and Surfaces, pages 711–730, 2012. 6

[40] Wei Zhang and Wai-Kuen Cham. Gradient-directed multiexposure composition. IEEE Transactions on Image Processing (TIP), 21(4):2318–2323, 2011. 2

[41] Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, and Yun Fu. Image super-resolution using very deep residual channel attention networks. In Proceedings of the European Conference on Computer Vision (ECCV), pages 286–301, 2018. 5

[42] Haitian Zheng, Mengqi Ji, Haoqian Wang, Yebin Liu, and Lu Fang. CrossNet: An end-to-end reference-based super resolution network using cross-scale warping. In Proceedings of the European Conference on Computer Vision (ECCV), pages 88–104, 2018. 2, 5, 6, 7
