Racist or Sexist Meme? Classifying Memes beyond Hateful
2 Pith papers cite this work.
citing papers explorer
- AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models
AGIEval shows GPT-4 exceeding average human scores on SAT Math at 95% and Chinese college entrance English at 92.5%, while revealing weaker results on complex reasoning tasks.
- When Jokes Cross the Line: Analyzing Regular Humor and Dark Humor in YouTube Shorts
The TwistedHumor dataset shows that dark humor in YouTube Shorts clusters around critique, coping, awkwardness, and identity, and draws more mixed and toxic audience reactions than regular humor.