In: 2018 IEEE 31st Computer Security Foundations Symposium (CSF)
9 Pith papers cite this work.
citing papers explorer
- Measuring What Matters Beyond Text: Evaluating Multimodal Summaries by Quality, Alignment, and Diversity
  MM-Eval unifies evaluation of multimodal summaries by integrating factual text quality, cross-modal relevance via MLLM judge, and visual diversity via truncated CLIP entropy, then calibrates their combination on human preferences.
- Towards Visually Grounded Multimodal Summarization via Cross-Modal Transformer and Gated Attention
  SPeCTrA-Sum uses hierarchical cross-modal fusion via DVP and DPP-distilled image selection via VRP to generate more accurate and visually grounded multimodal summaries.
- Conjecture and Inquiry: Quantifying Software Performance Requirements via Interactive Retrieval-Augmented Preference Elicitation
  IRAP quantifies ambiguous performance requirements into mathematical functions via interactive retrieval-augmented preference elicitation and outperforms ten prior methods on four real-world datasets with up to 40x gains in five interaction rounds.
- Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips
  Flipping 1-2 sign bits in DNN parameters, located without data or optimization, drops accuracy to near zero across image classification, detection, segmentation, and language models.
- Annotary: A Concolic Execution System for Developing Secure Smart Contracts
  Annotary is a concolic execution system for smart contract vulnerability detection that uses source-code annotations, symbolic EVM execution, and blockchain data resolution to handle inter-contract and inter-transaction flows.
- Using Learning Progressions to Guide AI Feedback for Science Learning
  Learning progression-derived rubrics produce AI feedback on student science writing that matches expert rubric quality in key dimensions.
- SPEAR: An Engineering Case Study of Multi-Agent Coordination for Smart Contract Auditing
  SPEAR applies multi-agent systems with planning, execution, and repair agents using negotiation protocols to smart contract auditing and compares it empirically to centralized and pipeline approaches.
- From Paradigm Shift to Audit Rift: Empirical Analysis and Validation of Security Audit Methodologies for Asynchronous Smart Contract Systems
  Empirical review of 233 real-world vulnerabilities from 34 TON audits produces a specialized checklist for asynchronous message handling, supported by case studies and an 11-person practitioner survey.
- Generative AI Feedback, English Writing and Teacher Rubrics: A Multiple-Case Study of CyberScholar
  A study of 143 students found that rubric-based generative AI feedback helped students revise their writing for better organization and style and saved teachers time, but produced inconsistent automated ratings.