Hallucination Leaderboard

Name: Hallucination Leaderboard
Author: vectara

Free2AITools Nexus Index

48.0

S: Semantic 50

Query-time baseline · scored live at search

A: Authority 59

P: Popularity 70

R: Recency 90

Q: Quality 70

Task categories from upstream metadata

💬Chat & Dialogue

Technical Constraints

Experimental / High Latency

Low FNI signal 48 FNI Score

Tiny - Params

- Context

0 Downloads

Commercial APACHE License

Model Information Summary
Entity Passport
Registry ID	vectara/hallucination-leaderboard
License	Apache-2.0
Provider	github

📜

Cite this model

Academic & Research Attribution

BibTeX

@misc{vectara_hallucination_leaderboard,
  author = {vectara},
  title = {Hallucination Leaderboard Model},
  year = {2026},
  howpublished = {\url{https://github.com/vectara/hallucination-leaderboard}},
  note = {Accessed via Free2AITools.}
}

APA Style

vectara. (2026). Hallucination Leaderboard [Model]. Free2AITools. https://github.com/vectara/hallucination-leaderboard

🔬Technical Deep Dive

Quick Commands

🐙 Git Clone

git clone https://github.com/vectara/hallucination-leaderboard

⚖️ Free2AITools Nexus Index V2.0

💬 Index Insight

FNI V2.0 for Hallucination Leaderboard: Authority (A:59), Popularity (P:70), Recency (R:90), Quality (Q:70). Semantic (S) is a query-time baseline scored live at search.

Data Sources / Provenance

HuggingFace API · GitHub Metadata · Arxiv Citation DB · Methodology

Open data Updated: Live data

---

Technical Deep Dive

Hallucination Leaderboard

Public LLM leaderboard computed using Vectara's Hallucination Evaluation Model, also known as HHEM. This evaluates how often an LLM introduces hallucinations when summarizing a document. We plan to update this regularly as our model and the LLMs get updated over time.
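As a rough illustration of how the leaderboard's headline metrics relate to one another, the sketch below computes them from per-document results. It assumes each summary has already received a factual-consistency score between 0 and 1 from an evaluation model such as HHEM, that a score below a 0.5 cutoff counts as a hallucination, and that rates are computed over answered documents only. The cutoff, the `SummaryResult` type, and the `leaderboard_metrics` function are hypothetical names for this example and are not taken from the Vectara repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SummaryResult:
    """One document's outcome (hypothetical structure for this sketch)."""
    summary: Optional[str]        # None if the model declined to summarize
    consistency: Optional[float]  # score in [0, 1]; None when there is no summary


def leaderboard_metrics(results, threshold=0.5):
    """Compute leaderboard-style percentages from per-document results.

    Assumption: a summary scoring below `threshold` counts as hallucinated,
    and rates are taken over answered documents only.
    """
    if not results:
        raise ValueError("no results")
    answered = [
        r for r in results
        if r.summary is not None and r.consistency is not None
    ]
    if not answered:
        raise ValueError("no answered summaries to score")

    hallucinated = sum(1 for r in answered if r.consistency < threshold)
    hallucination_rate = 100.0 * hallucinated / len(answered)
    total_words = sum(len(r.summary.split()) for r in answered)
    return {
        "hallucination_rate": hallucination_rate,
        "factual_consistency_rate": 100.0 - hallucination_rate,
        "answer_rate": 100.0 * len(answered) / len(results),
        "avg_summary_length_words": total_words / len(answered),
    }
```

Under these assumptions the hallucination rate and the factual consistency rate always sum to 100 %, while the answer rate is tracked separately because a model that declines more often is scored on fewer summaries.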

Feel free to check out the interactive hallucination leaderboard on Hugging Face.

If you are interested in previous versions of this leaderboard:

The first version, based on HHEM-1.0, is available here.
The most recent version based on the previous dataset is available here.

In loving memory of Simon Mark Hughes...

Last updated on April 28, 2026

Plot: hallucination rates of various LLMs

Model	Hallucination Rate	Factual Consistency Rate	Answer Rate	Average Summary Length (Words)
antgroup/finix_s1_32b	1.8 %	98.2 %	99.5 %	172.4
openai/gpt-5.4-nano-2026-03-17	3.1 %	96.9 %	100.0 %	144.4
google/gemini-2.5-flash-lite	3.3 %	96.7 %	99.5 %	95.7
microsoft/Phi-4	3.7 %	96.3 %	80.7 %	120.9
meta-llama/Llama-3.3-70B-Instruct-Turbo	4.1 %	95.9 %	99.5 %	64.6
snowflake/snowflake-arctic-instruct	4.3 %	95.7 %	62.7 %	81.4
google/gemma-3-12b-it	4.4 %	95.6 %	97.4 %	89.7
mistralai/mistral-large-2411	4.5 %	95.5 %	99.9 %	85.0
qwen/qwen3-8b	4.8 %	95.2 %	99.9 %	83.6
amazon/nova-pro-v1:0	5.1 %	94.9 %	99.3 %	66.2
amazon/nova-2-lit
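
The rows above are tab-separated, so they can be loaded with Python's standard `csv` module. The sketch below parses three rows copied from the table, checks that the two rate columns sum to 100 %, and lists models whose answer rate falls below a cutoff. The 90 % cutoff and the helper names `parse_percent` and `load_rows` are assumptions made for this example only.

```python
import csv
import io

# Three rows copied from the table above (tab-separated, as in the source).
TABLE = (
    "Model\tHallucination Rate\tFactual Consistency Rate\tAnswer Rate\t"
    "Average Summary Length (Words)\n"
    "antgroup/finix_s1_32b\t1.8 %\t98.2 %\t99.5 %\t172.4\n"
    "microsoft/Phi-4\t3.7 %\t96.3 %\t80.7 %\t120.9\n"
    "snowflake/snowflake-arctic-instruct\t4.3 %\t95.7 %\t62.7 %\t81.4\n"
)


def parse_percent(cell):
    """Turn a cell such as '1.8 %' into the float 1.8."""
    return float(cell.replace("%", "").strip())


def load_rows(text):
    """Parse the tab-separated table into a list of dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {
            "model": rec["Model"],
            "hallucination": parse_percent(rec["Hallucination Rate"]),
            "consistency": parse_percent(rec["Factual Consistency Rate"]),
            "answer": parse_percent(rec["Answer Rate"]),
            "avg_words": float(rec["Average Summary Length (Words)"]),
        }
        for rec in reader
    ]


rows = load_rows(TABLE)

# Sanity check: the two rate columns should be complements of each other.
for row in rows:
    assert abs(row["hallucination"] + row["consistency"] - 100.0) < 0.05

# Illustrative cutoff (an assumption): flag models that answered under 90 %.
low_answer = [r["model"] for r in rows if r["answer"] < 90.0]
print(low_answer)  # ['microsoft/Phi-4', 'snowflake/snowflake-arctic-instruct']
```

A filter like this can be useful when comparing models, since a hallucination rate measured on fewer answered documents may not be directly comparable to one measured on nearly all of them.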

⚠️ Incomplete Data

Some information about this model is not available. Use with caution and verify details against the original source before relying on this data.

📝 Limitations & Considerations

• Benchmark scores may vary based on evaluation methodology and hardware configuration.
• VRAM requirements are estimates; actual usage depends on quantization and batch size.
• FNI scores are relative rankings and may change as new models are added.
• License: Apache-2.0 according to upstream metadata; verify licensing terms before commercial use.

Social Proof

GitHub Repository

3.2K Stars

103 Forks

Repo Issues

🐙 Data Source: GitHub ↗

🔄 Updated daily

Source summary: Based on GitHub metadata. Not a recommendation.

🛡️ Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: gh-model--vectara--hallucination-leaderboard
slug: vectara--hallucination-leaderboard
source: github
author: vectara
license: Apache-2.0
tags: generative-ai, hallucinations, llm, python

⚙️ Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag: text-generation

📊 Engagement & Metrics

downloads: 0
stars: 3,179
forks: 103

Data indexed from public sources. Updated daily.
