Reefknot is a benchmark for evaluating and mitigating relation hallucinations in multimodal large language models (MLLMs). It pairs a detailed evaluation framework, built on 20,000 real-world samples drawn from the Visual Genome scene graph dataset, with a confidence-based mitigation strategy reported to reduce the hallucination rate by 9.75%. The stated goal is more trustworthy multimodal models.
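This entry does not describe how the confidence-based mitigation works, so the following is only a minimal sketch of the general idea, under stated assumptions: a yes/no relation question is treated as uncertain when the entropy of the model's answer distribution exceeds a threshold, and only those uncertain answers are passed to a mitigation step. The function names, the 0.6 threshold, and the yes/no framing are all hypothetical and are not taken from Reefknot itself.

```python
import math


def binary_entropy(p_yes: float) -> float:
    """Shannon entropy, in bits, of a yes/no answer distribution."""
    if p_yes <= 0.0 or p_yes >= 1.0:
        return 0.0  # a fully certain answer carries no entropy
    p_no = 1.0 - p_yes
    return -(p_yes * math.log2(p_yes) + p_no * math.log2(p_no))


def needs_mitigation(p_yes: float, entropy_threshold: float = 0.6) -> bool:
    """Flag an answer as low-confidence (hypothetical threshold, not from the source)."""
    return binary_entropy(p_yes) > entropy_threshold


def hallucination_rate(predictions, labels) -> float:
    """Fraction of answers that disagree with the ground-truth labels."""
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


if __name__ == "__main__":
    # Illustrative probabilities only; these are not benchmark results.
    for p in (0.50, 0.80, 0.99):
        print(p, round(binary_entropy(p), 3), needs_mitigation(p))
```

Gating on confidence keeps the extra work limited to answers the model is unsure about, while confident answers pass through unchanged. What the mitigation step does once an answer is flagged is not specified in this entry.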
Confidence-based mitigation strategy
Not specified
Visual Genome scene graph dataset
Reduces hallucination rate by 9.75%
Not specified
No
Yes
Comprehensive benchmark, confidence-based mitigation, improved trustworthiness
Yes
Not specified
Not specified
Not specified
Not specified
Not specified
Not specified
No
Not specified
Not specified
20,000 real-world samples
Not specified
Not specified
Not specified
Not specified
Not specified
Technology, Research
Evaluation of multimodal models, hallucination mitigation
Not specified
Not specified
High
Not specified
Not specified
Not specified
No
Not specified
Not specified
No
Not specified
Not specified
Not specified
Not specified
Not specified
No
Not specified
Not specified
0.00
Not specified
Not specified
Not specified
Not specified
Not specified
Not specified
Yes