Reddit's Appetite: Predicting User Engagement with Nutritional Content
Pith reviewed 2026-05-23 04:14 UTC · model grok-4.3
The pith
Nutritional features improve prediction of Reddit food post engagement by nearly 5%.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Analysis of almost half a million food-related posts on Reddit shows that nutritional features improve the accuracy of models predicting user engagement, measured by number of comments, by almost 5 percent, while calorie density contributes positively to the prediction, indicating that higher nutritional content associates with higher engagement levels in food-related posts.
What carries the argument
XGBoost models that incorporate nutritional features (calories and macronutrients) extracted from post text or images to predict comment counts on food-related Reddit posts.
If this is right
- Posts showing meals with higher calorie density attract more comments than lower-density meals.
- Nutritional data helps separate posts that will resonate with the community from those that will not.
- Online initiatives to encourage healthy eating can increase reach by emphasizing nutritional aspects of the content.
- Platforms could adjust content ranking or recommendations using nutritional signals to boost engagement with certain food posts.
Where Pith is reading between the lines
- If the pattern holds, content creators might deliberately highlight calorie information to increase comment volume on their posts.
- The finding raises the question of whether similar nutritional-engagement links appear on image-heavy platforms such as Instagram.
- Automated nutrition estimation tools could be integrated into posting interfaces to surface high-engagement food content in real time.
- Public-health campaigns might test whether framing messages around calorie density increases user interaction compared with other framings.
Load-bearing premise
The calories and macronutrients of meals can be accurately determined from the text or images in Reddit posts.
What would settle it
Re-estimate nutrition values by hand for a random sample of several hundred posts, retrain the models on the corrected values, and check whether the reported accuracy gain of nearly 5 percent disappears.
Figures
read the original abstract
Food communities on online platforms enjoy great popularity among social media users. Due to the far-reaching consequences of food-related content on user eating behavior, recent research has studied the factors that drive user online engagement with food. While most of these studies have focused on visual aspects of food content in social media, only a few initial studies have explored the impact of nutritional content on user engagement. In this paper, we set out to close this gap and analyze food-related posts on Reddit, focusing on the association between the calories and macronutrients of a meal and engagement levels, particularly the number of comments. To that end, we collect and analyze almost half a million food-related posts and uncover differences in nutritional content between engaging and non-engaging posts. Moreover, we train a series of XGBoost models, and evaluate the importance of nutritional content while predicting user engagement and how posts will resonate with the community. We find that nutritional features improve the baseline model's accuracy by almost 5%, with a positive contribution of calorie density towards the prediction of engagement, suggesting that higher nutritional content is associated with higher levels of user engagement in food-related posts. Our results provide valuable insights for the design of more engaging online initiatives aimed at, for example, encouraging healthy eating habits.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript collects and analyzes nearly 500,000 food-related Reddit posts to study the association between nutritional content (calories and macronutrients) and user engagement, measured primarily by comment count. It compares nutritional profiles of engaging versus non-engaging posts and trains XGBoost models showing that adding nutritional features improves baseline prediction accuracy by almost 5%, with calorie density making a positive contribution, suggesting higher nutritional content is linked to greater engagement.
Significance. If the nutritional extraction is reliable and the reported lift is robust, the work provides empirical evidence on how nutritional attributes influence engagement with food content online, with potential value for designing health-related social media interventions. The scale of the dataset is a clear strength.
major comments (2)
- [Abstract] Abstract: the central claim of a ~5% accuracy lift from nutritional features and the positive contribution of calorie density is presented without any description of the nutrition extraction pipeline, baseline feature set, cross-validation scheme, statistical significance tests, or confidence intervals, leaving the result only partially supported.
- [Methods] Methods (nutrition extraction section): the assumption that calories and macronutrients can be recovered from post text or images is load-bearing for both the feature-importance results and the suggested association, yet no validation against ground-truth labels, MAE, or inter-rater metrics is supplied; confounding with post length or other textual cues cannot be ruled out.
minor comments (1)
- [Abstract] Abstract: replace the vague 'almost half a million' with the exact post count.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments, which highlight important aspects of clarity and methodological rigor. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim of a ~5% accuracy lift from nutritional features and the positive contribution of calorie density is presented without any description of the nutrition extraction pipeline, baseline feature set, cross-validation scheme, statistical significance tests, or confidence intervals, leaving the result only partially supported.
Authors: We agree that the abstract, as a concise summary, would benefit from additional methodological context to better support the central claims. The nutrition extraction pipeline, baseline features (textual and metadata), 5-fold cross-validation, and statistical tests (including paired t-tests for accuracy differences) are detailed in the Methods and Results sections, with confidence intervals reported for the performance metrics. In the revised manuscript, we will expand the abstract to briefly reference the extraction approach, baseline model, cross-validation, and the statistical significance of the ~5% lift (p < 0.01). This change will make the abstract more self-contained without altering its length substantially. revision: yes
-
Referee: [Methods] Methods (nutrition extraction section): the assumption that calories and macronutrients can be recovered from post text or images is load-bearing for both the feature-importance results and the suggested association, yet no validation against ground-truth labels, MAE, or inter-rater metrics is supplied; confounding with post length or other textual cues cannot be ruled out.
Authors: This point is well-taken and identifies a genuine gap in the current presentation. The nutrition values were extracted via a combination of rule-based parsing of textual descriptions and a vision-language model applied to images, but the manuscript does not include a dedicated validation against ground-truth nutritional labels or inter-rater agreement metrics. We will add a new subsection in Methods describing the extraction process in greater detail, any available internal consistency checks, and an explicit discussion of potential confounders including post length, vocabulary richness, and image quality. Where ground-truth validation data are unavailable, we will acknowledge this limitation and report sensitivity analyses that control for textual length. These additions will directly address the load-bearing nature of the extraction step. revision: yes
Circularity Check
No significant circularity; standard ML feature extraction and prediction pipeline
full rationale
The paper collects Reddit posts, extracts nutritional features (calories/macros) via unspecified external means, compares engaging vs non-engaging posts, and trains XGBoost models to measure feature importance for engagement prediction. The reported ~5% accuracy lift and positive calorie-density contribution are empirical outcomes of supervised learning on independently extracted features, not quantities defined by the model's own fitted parameters or reduced to self-citation. No self-definitional steps, fitted-input-as-prediction, or load-bearing self-citations appear in the abstract or described pipeline.
Axiom & Free-Parameter Ledger
free parameters (1)
- XGBoost model hyperparameters
axioms (1)
- domain assumption Nutritional content of meals can be reliably inferred from Reddit post text or images
Reference graph
Works this paper leans on
-
[1]
Sofiane Abbar, Yelena Mejova, and Ingmar Weber. 2015. You Tweet What You Eat: Studying Food Consumption Through Twitter. InProceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI ’15) . Association for Computing Machinery, 3197–3206. https: //doi.org/10.1145/2702123.2702153
-
[2]
Butler, Elisabeth Joyce, Robert Kraut, Kimberly S
Jaime Arguello, Brian S. Butler, Elisabeth Joyce, Robert Kraut, Kimberly S. Ling, Carolyn Rosé, and Xiaoqing Wang. 2006. Talk to me: foundations for successful individual-group interactions in online communities. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems . Association for Computing Machinery, 959–968. https://doi.org/10...
-
[3]
Eytan Bakshy, Jake M. Hofman, Winter A. Mason, and Duncan J. Watts. 2011. Everyone’s an influencer: quantifying influence on twitter. In Proceedings of the fourth ACM international conference on Web search and data mining . Association for Computing Machinery, 65–74. https: //doi.org/10.1145/1935826.1935845
-
[4]
Amy M. Barklamb, Annika Molenaar, Linda Brennan, Stephanie Evans, Jamie Choong, Emma Herron, Mike Reid, and Tracy A. McCaffrey. 2020. Learning the Language of Social Media: A Comparison of Engagement Metrics and Social Media Strategies Used by Food and Nutrition-Related Social Media Accounts. Nutrients 12, 9 (2020), 2839. https://doi.org/10.3390/nu12092839
-
[5]
Jason Baumgartner, Savvas Zannettou, Brian Keegan, Megan Squire, and Jeremy Blackburn. 2020. The Pushshift Reddit Dataset. , 830–839 pages. https://doi.org/10.1609/icwsm.v14i1.7347
-
[6]
Qiang Chen, Chen Min, Wei Zhang, Xiaoyue Ma, and Richard Evans. 2021. Factors driving citizen engagement with government TikTok accounts during the COVID-19 pandemic: Model development and analysis. Journal of medical internet research 23, 2 (2021), e21463. https://doi.org/10.2196/ 21463
work page 2021
-
[7]
Tianqi Chen and Carlos Guestrin. 2016. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16) . Association for Computing Machinery, New York, NY, USA, 785–794. https: //doi.org/10.1145/2939672.2939785
-
[8]
Nicholas A. Christakis and James H. Fowler. 2007. The Spread of Obesity in a Large Social Network over 32 Years. New England Journal of Medicine 357, 4 (2007), 370–379. https://doi.org/10.1056/NEJMsa066082
-
[9]
Emily Denniss, Rebecca Lindberg, and Sarah A. McNaughton. 2023. Nutrition-Related Information on Instagram: A Content Analysis of Posts by Popular Australian Accounts. Nutrients 15, 10 (2023), 2332. https://doi.org/10.3390/nu15102332
-
[10]
Anja Feldmann, Oliver Gasser, Franziska Lichtblau, Enric Pujol, Ingmar Poese, Christoph Dietzel, Daniel Wagner, Matthias Wichtlhuber, Juan Tapiador, Narseo Vallina-Rodriguez, Oliver Hohlfeld, and Georgios Smaragdakis. 2020. The Lockdown Effect: Implications of the COVID-19 Pandemic on Internet Traffic. In Proceedings of the ACM Internet Measurement Confer...
-
[11]
Stacey Finkelstein and Ayelet Fishbach. 2010. When Healthy Food Makes You Hungry. Journal of Consumer Research 37, 3 (2010), 357–367. https://doi.org/10.1086/652248 Manuscript submitted to ACM Reddit’s Appetite: Predicting User Engagement with Nutritional Content 19
-
[12]
Marion Garaus and Lidija Lalicic. 2021. The unhealthy-tasty intuition for online recipes–when healthiness perceptions backfire. Appetite 159 (2021), 105066. https://doi.org/10.1016/j.appet.2020.105066
-
[13]
Kristina Gligorić, Arnaud Chiolero, Emre Kıcıman, Ryen W White, Eric Horvitz, and Robert West. 2024. Food choice mimicry on a large university campus. PNAS Nexus 3, 12 (2024), pgae517. https://doi.org/10.1093/pnasnexus/pgae517
-
[14]
Kristina Gligorić, Arnaud Chiolero, Emre Kıcıman, Ryen W. White, and Robert West. 2022. Population-scale dietary interests during the COVID-19 pandemic. Nature Communications 13, 1 (2022), 1073. https://doi.org/10.1038/s41467-022-28498-z
-
[15]
White, Emre Kiciman, Eric Horvitz, Arnaud Chiolero, and Robert West
Kristina Gligorić, Ryen W. White, Emre Kiciman, Eric Horvitz, Arnaud Chiolero, and Robert West. 2021. Formation of Social Ties Influences Food Choice: A Campus-wide Longitudinal Study. Proc. ACM Hum.-Comput. Interact. 5, CSCW1 (2021), 184:1–184:25. https://doi.org/10.1145/3449297
-
[16]
T. Hanifawati, U. S. Ritonga, and E. E. Puspitasari. 2019. Managing Brands’ Popularity on Facebook: Post Time, Content, and Brand Communication Strategies. Journal of Indonesian Economy and Business 34, 2 (2019), 185. https://doi.org/10.22146/jieb.45755
-
[17]
Jack Hessel, Lillian Lee, and David Mimno. 2017. Cats and Captions vs. Creators and the Clock: Comparing Multimodal Content to Context in Predicting Relative Popularity. In Proceedings of the 26th International Conference on World Wide Web (WWW ’17) . International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, CHE, 927–936....
-
[18]
Katrijn Houben, Anne Roefs, and Anita Jansen. 2010. Guilty pleasures. Implicit preferences for high calorie food in restrained eating. Appetite 55, 1 (2010), 18–24. https://doi.org/10.1016/j.appet.2010.03.003
-
[19]
Moran, Siona Prasad, Anna Li, Denise Simon, Lauren Cleveland, Jared B
Yulin Hswen, Alyssa J. Moran, Siona Prasad, Anna Li, Denise Simon, Lauren Cleveland, Jared B. Hawkins, John S. Brownstein, and Jason Block. 2021. The Federal Menu Labeling Law and Twitter Discussions about Calories in the United States: An Interrupted Time-Series Analysis. International Journal of Environmental Research and Public Health 18, 20 (2021), 10...
-
[20]
Paweł Kabata, Dorota Winniczuk-Kabata, Piotr Maciej Kabata, Janusz Jaśkiewicz, and Karol Połom. 2022. Can Social Media Profiles Be a Reliable Source of Information on Nutrition and Dietetics? Healthcare (Basel, Switzerland) 10, 2 (2022), 397. https://doi.org/10.3390/healthcare10020397
-
[21]
William D.S. Killgore and Deborah A. Yurgelun-Todd. 2006. Affect modulates appetite-related brain activity to images of food. International Journal of Eating Disorders 39, 5 (2006), 357–363. https://doi.org/10.1002/eat.20240
-
[22]
William D. S Killgore, Ashley D Young, Lisa A Femia, Piotr Bogorodzki, Jadwiga Rogowska, and Deborah A Yurgelun-Todd. 2003. Cortical and limbic activation during viewing of high- versus low-calorie foods. NeuroImage 19, 4 (2003), 1381–1394. https://doi.org/10.1016/S1053-8119(03)00191-5
-
[23]
Dokyun Lee, Kartik Hosanagar, and Harikesh S Nair. 2018. Advertising content and consumer engagement on social media: Evidence from Facebook. Management science 64, 11 (2018), 5105–5131. https://doi.org/10.1287/mnsc.2017.2902
-
[24]
Jinha Lee and Heejin Lim. 2023. Visual aesthetics and multisensory engagement in online food delivery services. International Journal of Retail & Distribution Management 51, 8 (2023), 975–990. https://doi.org/10.1108/IJRDM-09-2021-0451
-
[25]
Lister, Hannah Melville, and Hiba Jebeile
Natalie B. Lister, Hannah Melville, and Hiba Jebeile. 2024. What adolescents see on Instagram: Content analysis of #intermittentfasting, #keto, and #lowcarb. Nutrition & Dietetics 81, 3 (2024), 316–324. https://doi.org/10.1111/1747-0080.12853
-
[26]
Could You Define That in Bot Terms
Kiel Long, John Vines, Selina Sutton, Phillip Brooker, Tom Feltwell, Ben Kirman, Julie Barnett, and Shaun Lawson. 2017. "Could You Define That in Bot Terms"? Requesting, Creating and Using Bots on Reddit. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems . Association for Computing Machinery, 3488–3500. https://doi.org/10.114...
-
[27]
Lundberg, Gabriel Erion, Hugh Chen, Alex DeGrave, Jordan M
Scott M. Lundberg, Gabriel Erion, Hugh Chen, Alex DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. 2020. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence 2, 1 (2020), 56–67. https://doi.org/10.1038/s42256-019-0138-9
-
[28]
A Unified Approach to Interpreting Model Predictions
Scott M. Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17). Curran Associates Inc., Red Hook, NY, USA, 4768–4777. https://doi.org/10.48550/arXiv.1705.07874
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1705.07874 2017
-
[29]
Brent McFerran, Darren W. Dahl, Gavan J. Fitzsimons, and Andrea C. Morales. 2010. I’ll Have What She’s Having: Effects of Social Influence and Body Type on the Food Choices of Others. Journal of Consumer Research 36, 6 (2010), 915–929. https://doi.org/10.1086/644611
-
[30]
Kyle McKillop, James Harnly, Pamela Pehrsson, Naomi Fukagawa, and John Finley. 2021. FoodData Central, USDA’s Updated Approach to Food Composition Data Systems. Current Developments in Nutrition 5 (2021), 596. https://doi.org/10.1093/cdn/nzab044_027
-
[31]
Gillian Moran, Laurent Muzellec, and Devon Johnson. 2020. Message content features and social media engagement: evidence from the media industry. Journal of Product & Brand Management 29, 5 (2020), 533–545. https://doi.org/10.1108/jpbm-09-2018-2014
-
[32]
Ludovit Nastisin, Richard Fedorko, Beata Gavurova, and Radovan Bacik. 2024. Examination of Content Types and Social Media Engagement Indicators on Facebook: Case Analysis of a 5-Star Hotels of Visegrad Group Countries. Marketing and Management of Innovations 15, 1 (2024), 112–119. https://doi.org/10.21272/mmi.2024.1-09
-
[33]
Ethan Pancer, Matthew Philp, Maxwell Poole, and Theodore J. Noseworthy. 2022. Content Hungry: How the Nutrition of Food Media Influences Social Media Engagement. Journal of Consumer Psychology 32, 2 (2022), 336–349. https://doi.org/10.1002/jcpy.1246
-
[34]
Matthew Philp, Jenna Jacobson, and Ethan Pancer. 2022. Predicting social media engagement with computer vision: An examination of food marketing on Instagram. Journal of Business Research 149 (2022), 736–747. https://doi.org/10.1016/j.jbusres.2022.05.078
-
[35]
Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) . Association for Computational Linguistics, 3982–3992. https://doi.org/10....
-
[36]
Markus Rokicki, Eelco Herder, and Christoph Trattner. 2017. How Editorial, Temporal and Social Biases Affect Online Food Popularity and Appreciation. Proceedings of the International AAAI Conference on Web and Social Media 11, 1 (2017), 192–200. https://doi.org/10.1609/icwsm.v11i1. Manuscript submitted to ACM 20 Ozegovic et al. 14905
-
[37]
Thorsten Ruprechter. 2025. NutriTransform: Estimating Nutritional Information from Online Food Posts. In Under review
work page 2025
-
[38]
2017.Learning Cross-Modal Embeddings for Cooking Recipes and Food Images
Amaia Salvador, Nicholas Hynes, Yusuf Aytar, Javier Marin, Ferda Ofli, Ingmar Weber, and Antonio Torralba. 2017.Learning Cross-Modal Embeddings for Cooking Recipes and Food Images . IEEE Computer Society. 3068–3076 pages. https://doi.org/10.1109/CVPR.2017.327
-
[39]
Herting, Seung-Lark Lim, Nicolette J
Monica Serrano-Gonzalez, Megan M. Herting, Seung-Lark Lim, Nicolette J. Sullivan, Robert Kim, Juan Espinoza, Christina M. Koppin, Joyce R. Javier, Mimi S. Kim, and Shan Luo. 2021. Developmental Changes in Food Perception and Preference. Frontiers in Psychology 12 (2021). https: //doi.org/10.3389/fpsyg.2021.654200
-
[40]
Sharma and Munmun De Choudhury
Sanket S. Sharma and Munmun De Choudhury. 2015. Measuring and Characterizing Nutritional Information of Food and Ingestion Content in Instagram. In Proceedings of the 24th International Conference on World Wide Web . Association for Computing Machinery, 115–116. https: //doi.org/10.1145/2740908.2742754
-
[41]
Alain D. Starke, Martijn C. Willemsen, and Christoph Trattner. 2021. Nudging Healthy Choices in Food Search Through Visual Attractiveness. Frontiers in Artificial Intelligence 4 (2021). https://doi.org/10.3389/frai.2021.621743
-
[42]
Bongwon Suh, Lichan Hong, Peter Pirolli, and Ed H. Chi. 2010. Want to be Retweeted? Large Scale Analytics on Factors Impacting Retweet in Twitter Network. In 2010 IEEE Second International Conference on Social Computing . IEEE Computer Society, 177–184. https://doi.org/10.1109/ SocialCom.2010.33
work page 2010
-
[43]
Safa Enes Turkoglu and Alev Mutlu. 2023. Improving Retweet Prediction via Tweet Features. In 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA) . IEEE Computer Society, 1–5. https://doi.org/10.1109/HORA58378.2023.10156692
-
[44]
Bradley P. Turnwald, Kathryn G. Anderson, Hazel Rose Markus, and Alia J. Crum. 2022. Nutritional Analysis of Foods and Beverages Posted in Social Media Accounts of Highly Followed Celebrities. JAMA Network Open 5, 1 (2022), e2143087. https://doi.org/10.1001/jamanetworkopen.2021.43087
-
[45]
Risqo Wahid and Muhammad Wadud. 2020. Social Media Marketing on Instagram: When is The Most Effective Posting Timing? EPRA International Journal of Multidisciplinary Research (IJMR) 6, 7 (2020), 312–321. https://doi.org/10.36713/epra4834
-
[46]
Robert West, Ryen W. White, and Eric Horvitz. 2013. From cookies to cooks: insights on dietary patterns via analysis of web usage logs. In Proceedings of the 22nd international conference on World Wide Web (WWW ’13) . Association for Computing Machinery, New York, NY, USA, 1399–1410. https://doi.org/10.1145/2488388.2488510
-
[47]
Jonas, Kyoko Ohno-Matsui, James Chen, Marcus Ang, and Daniel Shu Wei Ting
Chee Wai Wong, Andrew Tsai, Jost B. Jonas, Kyoko Ohno-Matsui, James Chen, Marcus Ang, and Daniel Shu Wei Ting. 2021. Digital Screen Time During the COVID-19 Pandemic: Risk for a Further Myopia Boom? American Journal of Ophthalmology 223 (2021), 333–337. https: //doi.org/10.1016/j.ajo.2020.07.034
-
[48]
Yiming Yan. 2024. The evolution and impact of Multi-Armed Bandit algorithms in social media. Applied and Computational Engineering 68 (2024), 150–158. https://doi.org/10.54254/2755-2721/68/20241418 Manuscript submitted to ACM
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.