SciPostGen: Bridging the Gap between Scientific Papers and Poster Layouts
Pith reviewed 2026-05-17 04:42 UTC · model grok-4.3
The pith
SciPostGen dataset shows paper structures are tied to the number of elements in their posters and supports retrieval-augmented layout generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Paper structures are associated with the number of layout elements in posters. The SciPostGen dataset provides paired annotations at scale to examine this link. A retrieval-augmented framework retrieves layouts consistent with a given paper's structure and employs them as guidance to generate new layouts that satisfy additional constraints specified by poster creators.
What carries the argument
Retrieval-Augmented Poster Layout Generation framework, which retrieves past layouts aligned with a paper's structure and uses them to guide generation of new constraint-aware layouts.
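The retrieval step can be pictured with a minimal sketch. It assumes each stored paper is already reduced to an embedding vector and ranks stored layouts by cosine similarity to the query paper. The embeddings, the `database` structure, and the function names are hypothetical illustrations, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_layouts(query_embedding, database, k=2):
    """Return the k stored layouts whose paper embeddings best match the query."""
    scored = [(cosine_similarity(query_embedding, emb), layout)
              for emb, layout in database]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [layout for _, layout in scored[:k]]


# Toy database of (paper embedding, poster layout) pairs; all values are made up.
database = [
    ([1.0, 0.0, 0.0], {"id": "poster_a", "elements": ["title", "text", "figure"]}),
    ([0.9, 0.1, 0.0], {"id": "poster_b", "elements": ["title", "text", "text"]}),
    ([0.0, 1.0, 0.0], {"id": "poster_c", "elements": ["title", "table"]}),
]
query = [1.0, 0.05, 0.0]
top_layouts = retrieve_layouts(query, database, k=2)
```

In a full system the retrieved layouts would then be passed to a generator as guidance; that stage is out of scope for this sketch.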
If this is right
- Paper structure can be used to estimate the appropriate number and type of elements for a poster.
- Retrieval from a database of past posters improves alignment between paper content and generated layout.
- The framework produces usable layouts under both constrained and unconstrained conditions.
- Public release of the dataset enables further study of paper-to-poster mappings.
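One way to read the first two implications together is as a neighbor-based estimate: given layouts retrieved for structurally similar papers, average how often each element type appears. The sketch below assumes a layout is a plain list of element-type strings. The averaging rule is an illustrative assumption, not a method described in the paper.

```python
from collections import Counter


def estimate_element_counts(retrieved_layouts):
    """Mean count of each element type across the retrieved layouts."""
    if not retrieved_layouts:
        return {}
    totals = Counter()
    for layout in retrieved_layouts:
        totals.update(layout)  # adds one per occurrence of each type
    n = len(retrieved_layouts)
    return {etype: total / n for etype, total in totals.items()}


# Two hypothetical retrieved layouts, each a list of element types.
retrieved = [
    ["title", "text", "text", "figure"],
    ["title", "text", "text", "text", "figure", "figure"],
]
estimate = estimate_element_counts(retrieved)
```

The result is a per-type mean, so fractional values are expected. How to round them into an actual layout is a separate design decision.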
Where Pith is reading between the lines
- Automated poster tools could become standard aids for researchers preparing conference submissions.
- The same retrieval idea might extend to generating slides or figure arrangements from paper text.
- Large-scale paired datasets like this could reveal broader patterns in how scientists choose to visualize their work.
Load-bearing premise
Layouts retrieved from past papers can reliably guide generation for new papers without introducing uncorrectable style mismatches or content omissions.
What would settle it
A test set of papers with novel structures where the generated layouts consistently violate given constraints or show element counts far from those predicted by retrieved similar papers.
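The proposed test can be made concrete with two simple checks on a generated layout: which constraints it violates, and how far its element count falls from the count predicted from retrieved papers. The sketch assumes constraints are required counts per element type. The paper does not specify its constraint format in the text available here, so this format and all names are hypothetical.

```python
from collections import Counter


def constraint_violations(layout, required_counts):
    """Map each violated element type to (required, actual) counts."""
    actual = Counter(element["type"] for element in layout)
    return {
        etype: (required, actual.get(etype, 0))
        for etype, required in required_counts.items()
        if actual.get(etype, 0) != required
    }


def count_deviation(layout, predicted_total):
    """Absolute gap between the generated and predicted number of elements."""
    return abs(len(layout) - predicted_total)


# Hypothetical generated layout and creator constraints.
generated = [
    {"type": "title"},
    {"type": "text"},
    {"type": "text"},
    {"type": "figure"},
]
violations = constraint_violations(generated, {"title": 1, "figure": 2})
deviation = count_deviation(generated, predicted_total=6)
```

Consistently non-empty `violations` or large `deviation` values on papers with novel structures would count as evidence against the load-bearing premise.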
Original abstract
As the number of scientific papers continues to grow, there is a demand for approaches that can effectively convey research findings, with posters serving as a key medium for presenting paper contents. Poster layouts determine how effectively research is communicated and understood, highlighting their growing importance. In particular, a gap remains in understanding how papers correspond to the layouts that present them, which calls for datasets with paired annotations at scale. To bridge this gap, we introduce SciPostGen, a large-scale dataset for understanding and generating poster layouts from scientific papers. Our analyses based on SciPostGen show that paper structures are associated with the number of layout elements in posters. Based on this insight, we explore a framework, Retrieval-Augmented Poster Layout Generation, which retrieves layouts consistent with a given paper and uses them as guidance for layout generation. We conducted experiments under two conditions: with and without layout constraints typically specified by poster creators. The results show that the retriever estimates layouts aligned with paper structures, and our framework generates layouts that also satisfy given constraints. The dataset and code are publicly available at https://omron-sinicx.github.io/paper2layout/.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces SciPostGen, a large-scale dataset pairing scientific papers with annotated poster layouts. Analyses on the dataset establish associations between paper structures and the number of layout elements in posters. The authors propose a Retrieval-Augmented Poster Layout Generation framework that retrieves past layouts consistent with a new paper and uses them to guide generation; experiments in constrained and unconstrained settings report that the retriever produces layouts aligned with paper structures and that the generated layouts satisfy given constraints. The dataset and code are released publicly.
Significance. If the central claims hold, the work supplies a practical resource for automating poster layout creation from papers, a task relevant to scientific communication. The public dataset and code release is a clear strength that supports reproducibility and follow-on research. The retrieval-augmented approach offers a concrete way to leverage historical structure-layout correlations, though its value hinges on the fidelity of the retrieval step.
major comments (2)
- Abstract and retrieval framework description: the claim that 'the retriever estimates layouts aligned with paper structures' provides no explicit definition or isolated metric for structural alignment (e.g., section hierarchy, figure-to-text ratio, or element ordering) separate from generic embedding similarity. This leaves open whether reported alignment reflects logical structure or surface-level topic/style features, directly affecting the reliability of the downstream generation step and the claimed association between paper structures and layout element counts.
- Experiments section: the reported results under constrained and unconstrained settings assert alignment and constraint satisfaction, yet the abstract and available description do not include quantitative tables, baseline comparisons, or error-bar details that would allow verification of improvement margins or robustness.
minor comments (1)
- The abstract would benefit from a brief statement of dataset scale (number of paper-poster pairs) and annotation methodology to give readers an immediate sense of scope.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the clarity and rigor of the presentation.
- Referee: Abstract and retrieval framework description: the claim that 'the retriever estimates layouts aligned with paper structures' provides no explicit definition or isolated metric for structural alignment (e.g., section hierarchy, figure-to-text ratio, or element ordering) separate from generic embedding similarity. This leaves open whether reported alignment reflects logical structure or surface-level topic/style features, directly affecting the reliability of the downstream generation step and the claimed association between paper structures and layout element counts.
  Authors: We agree that an explicit definition and isolated metric would strengthen the claim. Our dataset analyses demonstrate a clear statistical association between paper section structures (e.g., counts of sections such as Introduction, Methods, Results) and the number and type of layout elements (figures, text blocks, tables) in the corresponding posters. The retriever operates on embeddings derived from the full paper text, which encode both topical content and structural cues such as section ordering and relative lengths. To address the concern directly, we will add a new subsection in the revised manuscript that formally defines structural alignment as the degree to which retrieved layouts preserve the paper's section-to-element mapping (measured by normalized element-count correlation per section type). We will also report an isolated metric, such as the Pearson correlation between paper section statistics and retrieved poster element distributions, separate from overall embedding cosine similarity, to demonstrate that the alignment is not reducible to surface-level topic or style features.
  Revision: yes
- Referee: Experiments section: the reported results under constrained and unconstrained settings assert alignment and constraint satisfaction, yet the abstract and available description do not include quantitative tables, baseline comparisons, or error-bar details that would allow verification of improvement margins or robustness.
  Authors: The full experiments section contains quantitative tables reporting alignment scores (e.g., layout element matching and structural similarity) and constraint satisfaction rates for both settings, along with comparisons against non-retrieval baselines. However, we acknowledge that these details are not summarized in the abstract and that error bars and additional robustness checks could improve verifiability. In the revision we will (i) add error bars to all reported metrics, (ii) expand the baseline comparisons to include at least one additional retrieval-free generation method, and (iii) include a concise summary of the key quantitative improvements in the abstract or introduction to facilitate quick verification.
  Revision: yes
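The two promised additions, an isolated Pearson correlation and error bars, can be sketched together. The example computes the Pearson correlation between per-paper section counts and retrieved-poster element counts, then attaches a percentile bootstrap interval. The data, the function names, and the choice of a percentile bootstrap are illustrative assumptions; the rebuttal does not say how the error bars will be computed.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def bootstrap_interval(xs, ys, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the Pearson correlation."""
    rng = random.Random(seed)  # seeded so the interval is reproducible
    n = len(xs)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
        except ValueError:
            continue  # skip degenerate resamples with zero variance
    if not stats:
        raise ValueError("no valid bootstrap resamples")
    stats.sort()
    low = stats[int((alpha / 2) * len(stats))]
    high = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return low, high


# Made-up values: sections per paper and elements in the retrieved poster.
section_counts = [3, 4, 5, 6, 7, 8]
element_counts = [5, 7, 8, 11, 12, 15]
r = pearson(section_counts, element_counts)
low, high = bootstrap_interval(section_counts, element_counts)
```

Reporting `r` with `(low, high)` alongside embedding cosine similarity would let a reader judge whether structural alignment is a distinct signal, which is the point of the referee's first comment.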
Circularity Check
No circularity: empirical results rest on external dataset and public code
The paper introduces the SciPostGen dataset with paired paper-poster annotations and reports empirical analyses linking paper structures to layout element counts. The Retrieval-Augmented Poster Layout Generation framework retrieves prior layouts for guidance and is evaluated under constraint conditions, with all reported alignment and generation outcomes derived from the released dataset and code. No equations, self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text that would collapse the central claims to inputs by construction. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Poster layouts can be decomposed into a finite set of countable visual elements whose counts correlate with paper section structure.
- domain assumption: Embedding similarity between papers is a reliable indicator of compatible poster layouts.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Our analyses based on SciPostGen show that paper structures are associated with the number of layout elements in posters... the retriever estimates layouts aligned with paper structures
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: layout retriever... trained with a contrastive loss... cosine similarity
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Supplementary Material
A. Dataset Details
Overview. Figure 8 illustrates an example of the annotated components in SciPostGen, a dataset comprising 18,097 pairs of scientific papers and their corresponding posters. In the main text, we focused on the paper content annotation...