GroupAffect-4: A Multimodal Dataset of Four-Person Collaborative Interaction

Alice Modica; Andrew Burke Dittberner; Anna Obara; Daniel Barratt; Daniel Overholt; Fabricio Batista Narcizo; Jesper Bunsow Boldt; Karim Haddad; Meisam Jamshidi Seikavandi; Paolo Burelli

arXiv: 2605.19765 · v1 · pith:QQ46CAYR · submitted 2026-05-19 · cs.AI · cs.DB

Meisam Jamshidi Seikavandi, Alice Modica, Anna Obara, Shan Ahmed Shaffi, Fabricio Batista Narcizo, Tanya Ignatenko, Ted Vucurevich, Karim Haddad, Daniel Barratt, Daniel Overholt, Jesper Bunsow Boldt, Paolo Burelli, Andrew Burke Dittberner

Pith reviewed 2026-05-20 04:57 UTC · model grok-4.3

classification: cs.AI, cs.DB

keywords: multimodal dataset, group affect, collaborative interaction, physiology, eye tracking, affective computing, social signals, four-person groups

The pith

GroupAffect-4 supplies a synchronized multimodal dataset from ten four-person groups to study affect at individual, interpersonal, and collective levels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents GroupAffect-4, a new corpus that records multimodal signals from ten groups of four people each as they work on four different collaborative tasks. The signals include wrist physiology, eye tracking, audio from close-talk mics, continuous affect ratings, personality scores, and task results, all synchronized to one clock. By collecting these in one place for co-located groups, the dataset aims to support research on how emotions and affect play out not just inside one person but between people and across the whole group. High data quality is reported with over 91 percent coverage for physiology and 98 percent for eye tracking, plus checks that the tasks actually change affect as intended. The release includes benchmarks for testing models on within-person states, between-person traits, and group dynamics.

Core claim

The authors create and release GroupAffect-4 as a multimodal dataset of 40 participants in 10 four-person groups completing four tasks: information pooling, negotiation, idea generation, and a public-goods game. Each participant is equipped with a wrist physiology sensor, eye-tracking glasses, and close-talk microphone, with all data time-aligned along with self-reports, questionnaires, task outcomes, and Big-Five personality scores. The dataset achieves high coverage rates and includes fifteen benchmark targets across three analysis levels with initial feasibility baselines.

What carries the argument

The GroupAffect-4 corpus, a synchronized collection of physiology, eye movement, audio, and report data from collaborative group sessions that enables joint analysis of affect at individual, interpersonal, and group scales.
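
To make the three scales concrete, here is a minimal Python sketch, not taken from the paper, of how a single four-person affect probe could be summarised at each level. The ratings are invented, and the chosen summaries (pairwise absolute gap for the interpersonal level, mean and standard deviation for the group level) are illustrative assumptions rather than the paper's benchmark definitions.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical valence ratings (1-9 scale) from one probe in one group.
valence = {"P1": 6, "P2": 4, "P3": 7, "P4": 3}

# Individual level: each participant's own rating.
individual = dict(valence)

# Interpersonal level: absolute rating gap for every pair (6 pairs for 4 people).
pairwise_gap = {
    (a, b): abs(valence[a] - valence[b])
    for a, b in combinations(sorted(valence), 2)
}

# Group level: central tendency and spread across the four members.
group_mean = mean(valence.values())
group_spread = pstdev(valence.values())
```

The same ratings feed all three summaries, which is the practical benefit of having every participant's signals aligned to one clock.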

If this is right

Affective computing models can now be evaluated on aligned signals that link personal states to interpersonal and group patterns.
The defined leave-one-group-out baselines provide a starting point for standardized tests of group dynamics prediction.
High coverage of physiology and eye-tracking windows supports extraction of continuous features across entire sessions.
Public structure with quality reports and processing scripts enables direct replication and extension by other teams.
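
As a rough illustration of what a leave-one-group-out split means for this corpus, the sketch below builds the folds in plain Python. It is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name, the `grp-XX` labels, and the one-sample-per-participant layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        # Every sample from every other group goes into training.
        train_idx = sorted(
            i for g, idxs in by_group.items() if g != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical layout: 10 groups x 4 participants, one sample per participant.
group_ids = [f"grp-{g:02d}" for g in range(1, 11) for _ in range(4)]
folds = list(leave_one_group_out(group_ids))
```

Holding out a whole group at a time keeps all four members of a session on the same side of the split, so a model is never tested on people whose groupmates it saw during training.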

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This kind of aligned multi-level recording could support tools that monitor real-time team emotional climate during meetings.
Future comparisons with remote or virtual groups could test whether co-location changes the strength of interpersonal affect links.
Combining these recordings with existing meeting datasets might allow larger-scale studies of how group size influences affective coordination.

Load-bearing premise

The four selected collaborative tasks and the chosen sensor suite of wrist physiology monitors, eye-tracking glasses, and close-talk microphones produce recordings that reflect natural affective processes at multiple levels without major intrusion or distortion.

What would settle it

Absence of expected affective differences in self-reports during the negotiation task or data coverage falling below levels needed for reliable multi-level modeling would show that the dataset does not support its intended analyses of coupled group affect.

Figures

Figures reproduced from arXiv: 2605.19765 by Alice Modica, Andrew Burke Dittberner, Anna Obara, Daniel Barratt, Daniel Overholt, Fabricio Batista Narcizo, Jesper Bunsow Boldt, Karim Haddad, Meisam Jamshidi Seikavandi, Paolo Burelli, Shan Ahmed Shaffi, Tanya Ignatenko, Ted Vucurevich.

**Figure 2.** Per-benchmark ranked feature importance: top-15 features by mean normalised …

**Figure 3.** Valence, arousal, and dominance probe summaries by task. T2 negotiation produces the …

**Figure 4.** Participant demographics (age, sex, education) and group composition across 40 participants …

**Figure 5.** T1 (Hidden-Profile Decision) big-screen layout. The shared display shows the task brief, …

**Figure 6.** T2 (Mini-Negotiation) big-screen layout. The shared display shows the topic and format …

**Figure 7.** T3 (Idea Generation and Selection) big-screen layout. The shared display shows the …

**Figure 8.** T4 (Public-Goods Micro-Game) big-screen layout. The shared display reveals individual …

**Figure 9.** Participant tablet view of a VAD affect probe during Task 1. Valence, Arousal, and Dominance are shown simultaneously on a 1–9 Likert scale.

**Figure 10.** Participant tablet view of the post-block questionnaire (T4 shown, 7 items shown). All …

**Figure 11.** Task Outcomes Dashboard (T1–T4). Panel 1: Hidden-profile decision outcome (T1). All 10 groups selected Candidate C. Panel 2: Mini-negotiation outcome (T2). Topics and formats are colour-differentiated. Most common: “AI for productivity” (5 groups). Panel 3: Idea Generation outcome (T3). Winning ideas are grouped by themes. Panel 4: Public-Goods Contribution (T4). Per-group mean contributions (0–10 scale) …

**Figure 12.** Time-to-decision durations (T1–T3). Boxplots show discussion durations from onset to the moderator “finish” prompt for T1–T3. The dashed horizontal line marks the nominal 8-minute guideline; groups typically overran this duration, especially in T2. T4 is excluded because its discussion phase did not terminate in a single group decision. Durations were derived directly from events_grp-XX.tsv files (one pe…

**Figure 13.** Task-level physiological feature effect sizes (Cohen’s …

**Figure 14.** Cross-modal Spearman correlation matrix (physiological and audio features vs. …

**Figure 15.** Modality ablation heatmap. Colour encodes performance relative to chance (white = …

**Figure 16.** Per-benchmark ranked feature importance: top-15 features by mean normalised …

**Figure 17.** Feasibility baselines for the 31-feature set (biomarker composites and annotation process …

**Figure 18.** Cross-benchmark feature importance heatmap. Rows are all 31 sensor/behavioural features …

**Figure 19.** LOGO-CV per-fold performance strip plot across all benchmarks. Each dot is one fold; …

Original abstract

Existing affective-computing, social-signal-processing, and meeting corpora capture important parts of human interaction, but they rarely support analysis of affect in co-located groups as a coupled individual, interpersonal, and group-level process. The required signals (per-participant physiology, eye movement, audio, self-report, task outcomes, and personality) are usually fragmented across separate dataset traditions. We introduce GroupAffect-4, a multimodal corpus of 40 participants in 10 four-person groups, each completing four ecologically varied collaborative tasks spanning information pooling, negotiation, idea generation, and a public-goods game. Each participant is instrumented with a wrist-worn physiology sensor, eye-tracking glasses, and a close-talk microphone; sessions include continuous affect self-reports, post-task questionnaires, task outcomes, and Big-Five personality scores, all time-aligned to a shared clock. The dataset covers over 91% of expected physiology windows and 98% of eye-tracking windows, with strong task validity confirmed by a clear affective manipulation check across the negotiation block. We define fifteen benchmarkable targets spanning three analysis levels -- within-person state, between-person traits, and group dynamics -- and report leave-one-group-out feasibility baselines establishing the dataset's evaluative scope. GroupAffect-4 is released with a BIDS-inspired structure, Croissant metadata, a datasheet, per-session quality reports, and open processing scripts. Code and processing scripts are available at https://github.com/meisamjam/GroupAffect-4; the dataset is publicly archived at https://zenodo.org/records/20037847.
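
The coverage figures in the abstract can be read as the share of expected analysis windows that contain usable data. The sketch below shows that arithmetic under stated assumptions: the window length, the block duration, and the rule for what counts as a valid window are all hypothetical, since the abstract does not define them.

```python
import math
from typing import Iterable


def expected_windows(duration_s: float, window_s: float) -> int:
    """Number of non-overlapping windows that fit fully in a recording span."""
    return math.floor(duration_s / window_s)


def coverage(valid_window_ids: Iterable[int], n_expected: int) -> float:
    """Fraction of expected windows with usable data, in [0, 1]."""
    if n_expected <= 0:
        raise ValueError("n_expected must be positive")
    # Ignore duplicates and any window index outside the expected range.
    valid = {w for w in valid_window_ids if 0 <= w < n_expected}
    return len(valid) / n_expected


# Hypothetical example: an 8-minute block cut into 10-second windows,
# with windows 5, 6, 20, and 31 lost to sensor dropout.
n_expected = expected_windows(duration_s=480, window_s=10)
valid_ids = [w for w in range(n_expected) if w not in {5, 6, 20, 31}]
physio_coverage = coverage(valid_ids, n_expected)
```

Under these made-up numbers, 44 of 48 windows survive. The released per-session quality reports are the place to check how coverage is actually defined and computed.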

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GroupAffect-4 supplies a new multimodal corpus for four-person affect research with decent coverage and open release, but the ecological-validity claim rests on an untested assumption about sensor intrusiveness.

The paper brings together wrist physiology, eye-tracking glasses, close-talk audio, continuous self-reports, task outcomes, and Big-Five scores for ten groups of four people working on four varied collaborative tasks. That combination, time-aligned to a shared clock and released with BIDS-style structure plus quality reports and scripts, is the concrete step forward. Existing corpora tend to split these signals across separate collections, so having them in one place for group-level analysis is useful. The reported coverage (91% physiology, 98% eye-tracking) and the affective manipulation check on the negotiation task give a practical sense of what the data actually contain. The leave-one-group-out baselines on the fifteen targets also help set expectations for downstream work.

The soft spot is the lack of direct evidence that the sensors left interaction dynamics unchanged. The abstract and methods description mention post-task questionnaires but do not report comfort ratings, behavioral reactivity measures, or any uninstrumented control comparison. For a dataset positioned as capturing natural individual, interpersonal, and group affect, that gap matters even if it is fixable. The rest of the documentation looks standard for a dataset paper and shows no circularity or hidden fitting.

This is for researchers in affective computing and social signal processing who need group-scale multimodal data rather than single-person or dyadic sets. A reader who wants to test models across within-person states, between-person traits, and group dynamics will find the benchmarks and open materials worth examining. The work shows clear thinking about what signals are missing in prior corpora and makes an honest effort to fill that hole.

I would send it to peer review; the core contribution is real and the main concern is addressable with targeted additions.

Referee Report

2 major / 2 minor

Summary. The paper introduces GroupAffect-4, a multimodal dataset of 40 participants in 10 four-person groups completing four collaborative tasks (information pooling, negotiation, idea generation, public-goods game). Participants are instrumented with wrist physiology sensors, eye-tracking glasses, and close-talk microphones; data include continuous affect self-reports, post-task questionnaires, task outcomes, and Big-Five personality scores, all time-aligned. The dataset reports >91% physiology and 98% eye-tracking coverage, a manipulation check for affective validity in negotiation, 15 benchmark targets across within-person, between-person, and group levels, and leave-one-group-out feasibility baselines. It is released with BIDS-inspired structure, Croissant metadata, datasheet, quality reports, and open processing scripts.

Significance. If the coverage, alignment, and validity claims hold, the dataset fills a notable gap by providing time-synchronized multimodal signals for studying affect as a coupled individual-interpersonal-group process in co-located settings. The open release with standardized metadata, per-session quality reports, and reproducible scripts strengthens its utility for the community. The leave-one-group-out baselines establish a concrete evaluative scope without introducing new fitted parameters.

major comments (2)

Abstract: The central claim that the recordings capture ecologically valid affect at individual, interpersonal, and group levels rests on the untested premise that the chosen sensor suite (eye-tracking glasses + wrist physiology + close-talk mic) is minimally intrusive. No quantitative evidence (comfort ratings, behavioral reactivity metrics, or uninstrumented control comparisons) is reported despite mention of post-task questionnaires, leaving the ecological-validity foundation unsupported.
Abstract and methods description: Participant selection criteria, exact baseline implementations for the 15 benchmark targets, and any post-collection data exclusions are not detailed. These omissions directly affect reproducibility of the reported 91% physiology and 98% eye-tracking coverage figures and the leave-one-group-out feasibility results.

minor comments (2)

Abstract: The phrase 'strong task validity confirmed by a clear affective manipulation check' would benefit from a brief parenthetical note on the specific measure (e.g., self-report scale or statistical test) used in the negotiation block.
Release description: The GitHub and Zenodo links are helpful; adding a short table summarizing per-task sensor coverage statistics would improve immediate usability for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive evaluation of GroupAffect-4's contribution and for the constructive comments on ecological validity and reproducibility. We address each major comment below and will incorporate the suggested clarifications in the revised manuscript.

Referee: Abstract: The central claim that the recordings capture ecologically valid affect at individual, interpersonal, and group levels rests on the untested premise that the chosen sensor suite (eye-tracking glasses + wrist physiology + close-talk mic) is minimally intrusive. No quantitative evidence (comfort ratings, behavioral reactivity metrics, or uninstrumented control comparisons) is reported despite mention of post-task questionnaires, leaving the ecological-validity foundation unsupported.

Authors: We agree that explicit quantitative support for minimal intrusiveness would strengthen the ecological-validity claim. Although post-task questionnaires were administered and contain relevant items, comfort and reactivity metrics were not analyzed or reported in the submitted version. In the revision we will add a short subsection (or appendix table) presenting mean comfort ratings, any self-reported interference, and observed behavioral reactivity indicators drawn directly from those questionnaires. This addition will provide the requested quantitative grounding without requiring new data collection.
Revision: yes

Referee: Abstract and methods description: Participant selection criteria, exact baseline implementations for the 15 benchmark targets, and any post-collection data exclusions are not detailed. These omissions directly affect reproducibility of the reported 91% physiology and 98% eye-tracking coverage figures and the leave-one-group-out feasibility results.

Authors: We concur that these methodological details are necessary for full reproducibility. The revised manuscript will expand the Participants and Benchmark Targets subsections to specify: (i) inclusion/exclusion criteria and recruitment procedures, (ii) precise algorithmic descriptions and any hyper-parameters used for each of the 15 benchmark targets, and (iii) the exact post-collection exclusion rules together with the number of sessions or segments removed. These additions will allow readers to replicate the coverage statistics and leave-one-group-out baselines exactly.
Revision: yes

Circularity Check

0 steps flagged

No circularity: descriptive dataset paper with no derivations or fitted predictions

This is a dataset introduction paper whose central claims consist of describing the collection protocol, reporting coverage statistics (91% physiology, 98% eye-tracking), confirming a manipulation check, and releasing benchmark targets with leave-one-group-out baselines. No equations, first-principles derivations, parameter fits, or predictions are presented that could reduce to their own inputs. The fifteen benchmarkable targets are defined explicitly rather than derived; the feasibility baselines are reported as evaluative scope rather than claimed as novel predictions. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results appear in the provided text. The paper is therefore self-contained as a descriptive corpus release.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that the selected tasks and sensors yield representative group-affect data; no free parameters or new entities are introduced.

axioms (1)

Domain assumption: The four chosen tasks are ecologically valid representations of collaborative interaction.
Invoked when the abstract states the tasks span information pooling, negotiation, idea generation, and a public-goods game.

pith-pipeline@v0.9.0 · 5882 in / 1416 out tokens · 65506 ms · 2026-05-20T04:57:52.297621+00:00 · methodology

Reference graph

Works this paper leans on

66 extracted references · 66 canonical work pages

[1]

Croissant: A metadata format for ML-ready datasets

Mubashara Akhtar, Omar Benjelloun, Costanza Conforti, Luca Foschini, Joan Giner-Miguelez, Pieter Gijsbers, Sujata Goswami, Nitisha Jain, Michalis Karamousadakis, Michael Kuchnik, et al. Croissant: A metadata format for ML-ready datasets. InAdvances in Neural Information Processing Systems, 2024. Also available as arXiv:2403.19546

work page arXiv 2024
[2]

Sigal G. Barsade. The ripple effect: Emotional contagion and its influence on group behavior. Administrative Science Quarterly, 47(4):644–675, 2002

work page 2002
[3]

Indrani Bhattacharya, Daniel Foley, Nicholas Zhang, Tong Zhang, Christopher Mine, Qi Ji, and Richard J. Radke. UGI: An unobtrusive group interaction dataset. InProceedings of the 10th ACM Multimedia Systems Conference, 2019

work page 2019
[4]

G-REx: A real-world dataset of group emotion experiences based on physiological data

Patricia Bota, Joana Brito, Ana Fred, Pablo Cesar, and Hugo Plácido Silva. G-REx: A real-world dataset of group emotion experiences based on physiological data. 2023

work page 2023
[5]

Bradley and Peter J

Margaret M. Bradley and Peter J. Lang. Measuring emotion: The self-assessment manikin and the semantic differential.Journal of Behavior Therapy and Experimental Psychiatry, 25(1):49–59, 1994

work page 1994
[6]

Chang, Sungbok Lee, and Shrikanth S

Carlos Busso, Murtaza Bulut, Chi-Chun Lee, Abe Kazemzadeh, Emily Mower, Samuel Kim, Jeannette N. Chang, Sungbok Lee, and Shrikanth S. Narayanan. IEMOCAP: Interactive emotional dyadic motion capture database.Language Resources and Evaluation, 42(4):335– 359, 2008

work page 2008
[7]

Gebre, and Hayley Hung

Laura Cabrera-Quiros, Andrew Demetriou, Egor Balog, Astrid van der Meijden, Esma Gedik, Binyam G. Gebre, and Hayley Hung. The MatchNMingle dataset: A novel multi-sensor resource for the analysis of social interactions and nonverbal communication in unstructured mingle and speed-dating scenarios.IEEE Transactions on Affective Computing, 12(1):148–164, 2021

work page 2021
[8]

The AMI meeting corpus: A pre-announcement

Jean Carletta, Simone Ashby, Sebastien Bourban, Mike Flynn, Mael Guillemot, Thomas Hain, Jaroslav Kadlec, Vasilis Karaiskos, Wessel Kraaij, Melissa Kronenthal, Guillaume Lathoud, Mike Lincoln, Agnes Lisowska, Iain McCowan, Wilfried Post, Dennis Reidsma, and Pierre Wellner. The AMI meeting corpus: A pre-announcement. InProceedings of the Second Inter- nati...

work page 2005
[9]

Cawley and Nicola L

Gavin C. Cawley and Nicola L. C. Talbot. On over-fitting in model selection and subsequent selection bias in performance evaluation.Journal of Machine Learning Research, 11:2079–2107, 2010

work page 2079
[10]

Sustaining cooperation in laboratory public goods experiments: A selective survey of the literature.Experimental Economics, 14(1):47–83, 2011

Ananish Chaudhuri. Sustaining cooperation in laboratory public goods experiments: A selective survey of the literature.Experimental Economics, 14(1):47–83, 2011

work page 2011
[11]

The gamma corpus of danish polyadic conversations with gaze speech and motion data in quiet and noise.Scientific Data, 2026

Mark Dourado, Henrik Gert Hassager, Jesper Udesen, and Stefania Serafin. The gamma corpus of danish polyadic conversations with gaze speech and motion data in quiet and noise.Scientific Data, 2026

work page 2026
[12]

Cooperation and punishment in public goods experiments

Ernst Fehr and Simon Gächter. Cooperation and punishment in public goods experiments. American Economic Review, 90(4):980–994, 2000

work page 2000
[13]

Datasheets for datasets.Communications of the ACM, 64(12):86–92, 2021

Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daume III, and Kate Crawford. Datasheets for datasets.Communications of the ACM, 64(12):86–92, 2021

work page 2021
[14]

Larentzakis, Nader N

Kyriakos Georgiou, Andreas V . Larentzakis, Nader N. Khamis, Ghadah I. Alsuhaibani, Yasser A. Alaska, and Elias J. Giallafos. Can wearable devices accurately measure heart rate variability? a systematic review.Folia Medica, 60(1):7–20, 2018

work page 2018
[15]

Academic Press, 1981

Charles Goodwin.Conversational Organization: Interaction Between Speakers and Hearers. Academic Press, 1981. 10

work page 1981
[16]

Gorgolewski, Tibor Auer, Vince D

Krzysztof J. Gorgolewski, Tibor Auer, Vince D. Calhoun, R. Cameron Craddock, Samir Das, Eugene P. Duff, Guillaume Flandin, Satrajit S. Ghosh, Tristan Glatard, Yaroslav O. Halchenko, et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments.Scientific Data, 3:160044, 2016

work page 2016
[17]

The ICSI meeting corpus

Adam Janin, Don Baron, Jane Edwards, Dan Ellis, David Gelbart, Nelson Morgan, Barbara Peskin, Thilo Pfau, Elizabeth Shriberg, Andreas Stolcke, and Chuck Wooters. The ICSI meeting corpus. InProceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003

work page 2003
[18]

O. P. John and S. Srivastava. The big five trait taxonomy: History, measurement, and theoretical perspectives. In L. A. Pervin and O. P. John, editors,Handbook of personality: Theory and research, pages 102–138. Guilford Press, 2nd edition, 1999

work page 1999
[19]

DEAP: A database for emotion analysis using physiological signals.IEEE Transactions on Affective Computing, 3(1):18–31, 2012

Sander Koelstra, Christian Muhl, Mohammad Soleymani, Jong-Seok Lee, Ashkan Yazdani, Touradj Ebrahimi, Thierry Pun, Anton Nijholt, and Ioannis Patras. DEAP: A database for emotion analysis using physiological signals.IEEE Transactions on Affective Computing, 3(1):18–31, 2012

work page 2012
[20]

Grivich, Fiorenzo Artoni, Tim Mullen, Arnaud Delorme, and Scott Makeig

Christian Kothe, Seyed Yahya Shirazi, Tristan Stenner, David Medine, Chadwick Boulay, Matthew I. Grivich, Fiorenzo Artoni, Tim Mullen, Arnaud Delorme, and Scott Makeig. The lab streaming layer for synchronized multimodal recording.Imaging Neuroscience, 3:IMAG.a.136, 2025

work page 2025
[21]

Lazarus.Emotion and Adaptation

Richard S. Lazarus.Emotion and Adaptation. Oxford University Press, 1991

work page 1991
[22]

Connie Yuan, and Poppy Lauretta McLeod

Li Lu, Y . Connie Yuan, and Poppy Lauretta McLeod. Twenty-five years of hidden profiles in group decision making: A meta-analysis.Personality and Social Psychology Review, 16(1):54– 75, 2012

work page 2012
[23]

Marks, John E

Michelle A. Marks, John E. Mathieu, and Stephen J. Zaccaro. A temporally based framework and taxonomy of team processes.Academy of Management Review, 26(3):356–376, 2001

work page 2001
[24]

Pupillometry: Psychology, physiology, and function.Journal of Cognition, 1(1):16, 2018

Sebastiaan Mathot. Pupillometry: Psychology, physiology, and function.Journal of Cognition, 1(1):16, 2018

work page 2018
[25]

Russell.An Approach to Environmental Psychology

Albert Mehrabian and James A. Russell.An Approach to Environmental Psychology. MIT Press, 1974

work page 1974
[26]

A multimodal experimental dataset on agile software development team interactions.Data in Brief, 61:111828, 2025

Diego Miranda, Carlos Escobedo, Dayana Palma, Rene Noel, Adrián Fernández, Cristian Cechinel, Jaime Godoy, and Roberto Munoz. A multimodal experimental dataset on agile software development team interactions.Data in Brief, 61:111828, 2025

work page 2025
[27]

AMI- GOS: A dataset for affect, personality and mood research on individuals and groups.IEEE Transactions on Affective Computing, 12(2):479–493, 2021

Juan Abdon Miranda-Correa, Mojtaba Khomami Abadi, Nicu Sebe, and Ioannis Patras. AMI- GOS: A dataset for affect, personality and mood research on individuals and groups.IEEE Transactions on Affective Computing, 12(2):479–493, 2021

work page 2021
[28]

Model cards for model reporting

Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchin- son, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. Model cards for model reporting. InProceedings of the Conference on Fairness, Accountability, and Transparency, pages 220– 229, 2019

work page 2019
[29]

Croissant format specification, version 1.0

MLCommons. Croissant format specification, version 1.0. https://docs.mlcommons.org/ croissant/docs/croissant-spec.html, 2024. Published 2024-03-01; accessed 2026-05- 03

work page 2024
[30]

Detecting low rapport during natural interactions in small groups from non-verbal behaviour

Philipp Müller, Michael Xuelin Huang, and Andreas Bulling. Detecting low rapport during natural interactions in small groups from non-verbal behaviour. pages 153–164, 2018

work page 2018
[31]

Preserving privacy in speaker and speech characterisation.Computer Speech & Language, 58:441–480, 2019

Andreas Nautsch, Andrés Jiménez, Alexander Treiber, Jan Kolberg, Catherine Jasserand, Els Kindt, Héctor Delgado, Massimiliano Todisco, Pierre Héroux, Nicholas Evans, et al. Preserving privacy in speaker and speech characterisation.Computer Speech & Language, 58:441–480, 2019. 11

work page 2019
[32]

NeurIPS 2026 evaluations & datasets hosting guidelines

NeurIPS. NeurIPS 2026 evaluations & datasets hosting guidelines. https://neurips.cc/ Conferences/2026/EvaluationsDatasetsHosting, 2026. Accessed 2026-05-01

work page 2026
[33]

NeurIPS 2026 evaluations & datasets track — call for papers

NeurIPS. NeurIPS 2026 evaluations & datasets track — call for papers. https://neurips. cc/Conferences/2026/CallForEvaluationsDatasets, 2026. Accessed 2026-05-01

work page 2026
[34]

Cristina Palmero, Javier Selva, Sorina Smeureanu, Julio C. S. Jacques Junior, Albert Clapés, Alba Moseguí, Zejian Zhang, David Gallardo, Georgina Guilera, David Leiva, Hugo Jair Escalante, Isabelle Guyon, Xavier Baró, and Sergio Escalera. Context-aware personality inference in dyadic scenarios: Introducing the UDIV A dataset. InProceedings of the IEEE/CVF...

work page 2021
[35]

Khandoker, Leontios J

Cheul Young Park, Narae Cha, Soowon Kang, Auk Kim, Ahsan H. Khandoker, Leontios J. Hadjileontiadis, Alice Oh, Youn-Byoung Jeong, and Uichin Lee. K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations.Scientific Data, 7(1):293, 2020

work page 2020
[36]

Emotions are social.British Journal of Psychology, 87(4):663–683, 1996

Brian Parkinson. Emotions are social.British Journal of Psychology, 87(4):663–683, 1996

work page 1996
[37]

Posada-Quintero and Ki H

Hugo F. Posada-Quintero and Ki H. Chon. Innovations in electrodermal activity data collection and signal processing: A systematic review.Sensors, 20(2):479, 2020

work page 2020
[38]

Rietzschel, Bernard A

Eric F. Rietzschel, Bernard A. Nijstad, and Wolfgang Stroebe. The selection of creative ideas after individual idea generation: Choosing between creativity and impact.British Journal of Psychology, 101(1):47–68, 2010

work page 2010
[39]

Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions

Fabien Ringeval, Andreas Sonderegger, Jürgen Sauer, and Denis Lalanne. Introducing the RECOLA multimodal corpus of remote collaborative and affective interactions. InProceedings of the IEEE International Conference on Automatic Face and Gesture Recognition Workshops, 2013

work page 2013
[40]

James A. Russell. A circumplex model of affect.Journal of Personality and Social Psychology, 39(6):1161–1178, 1980

work page 1980
[41]

Schegloff, and Gail Jefferson

Harvey Sacks, Emanuel A. Schegloff, and Gail Jefferson. A simplest systematics for the organization of turn-taking for conversation.Language, 50(4):696–735, 1974

work page 1974
[42]

A multimodal corpus for the study of small group interactions

Dairazalia Sanchez-Cortes, Oya Aran, Marianne Schmid Mast, and Daniel Gatica-Perez. A multimodal corpus for the study of small group interactions. InProceedings of the International Conference on Multimodal Interaction Workshops, 2011

work page 2011
[43]

Klaus R. Scherer. Appraisal considered as a process of multilevel sequential checking. In Klaus R. Scherer, Angela Schorr, and Tom Johnstone, editors,Appraisal Processes in Emotion: Theory, Methods, Research, pages 92–120. Oxford University Press, 2001

work page 2001
[44]

Björn Schuller, Anton Batliner, Stefan Steidl, and Dino Seppi. Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge. Speech Communication, 53(9–10):1062–1087, 2011.
[45]

Meisam J. Seikavandi, Laurits Dixen, Jostein Fimland, Sree Keerthi Desu, Antonia-Bianca Zserai, Ye Sul Lee, Maria Barrett, and Paolo Burelli. Advancing face-to-face emotion communication: A multimodal dataset (AFFEC). arXiv preprint arXiv:2504.18969, 2025.
[46]

Meisam J. Seikavandi, Jostein Fimland, Fabricio Batista Narcizo, Maria Barrett, Ted Vucurevich, Jesper Bünsow Boldt, Andrew Burke Dittberner, and Paolo Burelli. Modelling the interplay of eye-tracking temporal dynamics and personality for emotion detection in face-to-face settings. arXiv preprint arXiv:2510.24720, 2025.
[47]

Meisam Jamshidi Seikavandi and Maria Jung Barrett. Gaze reveals emotion perception: Insights from modelling naturalistic face viewing. In Proceedings of the 22nd IEEE International Conference on Machine Learning and Applications (ICMLA), pages 2022–2025. IEEE, 2023.
[48]

Meisam Jamshidi Seikavandi, Fabricio Batista Narcizo, Ted Vucurevich, Andrew Burke Dittberner, and Paolo Burelli. MuMTAffect: A multimodal multitask affective framework for personality and emotion recognition from physiological signals. In Proceedings of the 3rd International Workshop on Multimodal and Responsible Affective Computing, pages 100–108, 2025.
[49]

Fred Shaffer and J. P. Ginsberg. An overview of heart rate variability metrics and norms. Frontiers in Public Health, 5:258, 2017.
[50]

Garold Stasser and William Titus. Pooling of unshared information in group decision making: Biased information sampling during discussion. Journal of Personality and Social Psychology, 48(6):1467–1478, 1985.
[51]

Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology. Heart rate variability: Standards of measurement, physiological interpretation, and clinical use. Circulation, 93(5):1043–1065, 1996.
[52]

Sara Taylor, Natasha Jaques, Weixuan Chen, Szymon Fedor, Akane Sano, and Rosalind W. Picard. Automatic identification of artifacts in electrodermal activity data. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pages 1934–1937, 2015.
[53]

Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O’Brien, et al. The VoicePrivacy 2020 challenge: Results and findings. In Proceedings of Interspeech, pages 1399–1403, 2021.
[54]

Olga Troyanskaya, Michael Cantor, Gavin Sherlock, Pat Brown, Trevor Hastie, Robert Tibshirani, David Botstein, and Russ B. Altman. Missing value estimation methods for DNA microarrays. Bioinformatics, 17(6):520–525, 2001.
[55]

Gerben A. Van Kleef. How emotions regulate social life: The emotions as social information (EASI) model. Current Directions in Psychological Science, 18(3):184–188, 2009.
[56]

Gerben A. Van Kleef, Carsten K. W. De Dreu, and Antony S. R. Manstead. The interpersonal effects of emotions in negotiations: A motivated information processing approach. Journal of Personality and Social Psychology, 87(4):510–528, 2004.
[57]

Sudhir Varma and Richard Simon. Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7:91, 2006.
[58]

Roel Vertegaal, Robert Slagter, Gerrit van der Veer, and Anton Nijholt. Eye gaze patterns in conversations: There is more to conversational agents than meets the eyes. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 301–308, 2001.
[59]

Alessandro Vinciarelli, Maja Pantic, and Hervé Bourlard. Social signal processing: Survey of an emerging domain. Image and Vision Computing, 27(12):1743–1759, 2009.
[60]

Mark D. Wilkinson, Michel Dumontier, IJsbrand Jan Aalbersberg, et al. The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3:160018, 2016.
[61]

Zhihong Zeng, Maja Pantic, Glenn I. Roisman, and Thomas S. Huang. A survey of affect recognition methods: Audio, visual, and spontaneous expressions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(1):39–58, 2009.
[62]

Justine Zhang, Ravi Kumar, and Cristian Danescu-Niculescu-Mizil. GAP corpus: Group affect and performance corpus. https://convokit.cornell.edu/documentation/gap.html
[63]

Dataset documentation.

Appendix Table of Contents: A List of Acronyms; B Stimuli and Task Orchestration; C Extended Limitations and Caveats; D BFI-44 Scoring and Item List; E Audio T0 Baseline Reliability; F Synchronisation Pipeline Detail; G Preprocessing Steps; H Extended Dataset Characterization; I Extended Benchmarks: Sequential Conversation Tasks; J Benchmark Interpr...
[64]

Within-person z-score (affects B0–B3d). Normalisation statistics (median, median absolute deviation (MAD)) are computed over all four task rows per participant before the LOGO-CV split, so held-out participants contribute their own unsupervised T1–T4 distribution to test-time normalisation. No directional label bias, but within-person effect sizes are inf...
[65]

Global feature selection. Missing-rate and correlation-based filtering is applied to the full dataset before splitting. Cannot favour any particular label direction; impact on AUC is negligible, but represents a strict-protocol deviation.

Table 13: Retained feature set after global feature selection (35 features total; 31 used in all reported benchmarks...
[66]

Annotation process-metadata features. The four annotation features (answers_n, ann_total_events_n, ann_response_postblock_n, ann_event_span_s) are excluded from the 31-feature benchmark set. ann_event_span_s and answers_n vary systematically by task (T2 has more VAD probes and a longer annotation span), creating a direct shortcut for the B0 task-classif...
