📊
Dataset

Terminal Bench 2 Leaderboard

by sadahiy hf-dataset--sadahiy--terminal-bench-2-leaderboard
Nexus Index
41.6 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 49
R: Recency 89
Q: Quality 30
Entity Passport
Registry ID hf-dataset--sadahiy--terminal-bench-2-leaderboard
License Apache-2.0
Provider huggingface
📜

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__sadahiy__terminal_bench_2_leaderboard,
  author = {sadahiy},
  title = {Terminal Bench 2 Leaderboard Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/sadahiy/terminal-bench-2-leaderboard}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
sadahiy. (2026). Terminal Bench 2 Leaderboard [Dataset]. Free2AITools. https://huggingface.co/datasets/sadahiy/terminal-bench-2-leaderboard

⬇️
Downloads
22,326

Dataset Specification

Terminal-Bench 2.0 Leaderboard Submissions

This repository accepts leaderboard submissions for Terminal-Bench 2.0.

How to Submit

  1. Fork this repository
  2. Create a new branch for your submission
  3. Add your submission (a job or folder of jobs) under submissions/terminal-bench/2.0/<agent>__<model(s)>/
  4. Open a Pull Request

Submission Structure

text
submissions/
  terminal-bench/
    2.0/
      <agent>__<model(s)>/
        metadata.yaml       # Required: agent and model info
        <job>/              # One or more job directories
          config.json
          <trial>/result.json
          <trial>/result.json
          ...
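
The layout above can be checked locally before opening a Pull Request. The following is a minimal sketch, not the repository's actual validator: the function name `check_submission_layout` is hypothetical, every subdirectory of the submission folder is assumed to be a job directory, every subdirectory of a job is assumed to be a trial directory, and "valid" is approximated as "parses as JSON".

```python
import json
from pathlib import Path


def check_submission_layout(submission_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means none were found.

    Assumption: job and trial directories are identified purely by being
    subdirectories, since the README does not specify a naming scheme.
    """
    problems = []
    if not (submission_dir / "metadata.yaml").is_file():
        problems.append("missing metadata.yaml")

    job_dirs = [p for p in sorted(submission_dir.iterdir()) if p.is_dir()]
    if not job_dirs:
        problems.append("no job directories found")

    for job in job_dirs:
        if not (job / "config.json").is_file():
            problems.append(f"{job.name}: missing config.json")
        trial_dirs = [p for p in sorted(job.iterdir()) if p.is_dir()]
        if not trial_dirs:
            problems.append(f"{job.name}: no trial directories found")
        for trial in trial_dirs:
            result = trial / "result.json"
            try:
                # Only checks that the file exists and is well-formed JSON.
                json.loads(result.read_text(encoding="utf-8"))
            except FileNotFoundError:
                problems.append(f"{job.name}/{trial.name}: missing result.json")
            except json.JSONDecodeError:
                problems.append(
                    f"{job.name}/{trial.name}: result.json is not valid JSON"
                )
    return problems
```

Calling `check_submission_layout(Path("submissions/terminal-bench/2.0/<agent>__<model(s)>"))` with the real folder name substituted returns the problems found, which may be a quicker feedback loop than waiting for the validation bot.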

Required: metadata.yaml

Each submission must include a metadata.yaml file with the following fields:

yaml
agent_url: https://...         # Required: link to agent repo/docs
agent_display_name: "My Agent" # Required: display name for leaderboard
agent_org_display_name: "Org"  # Required: organization name

models:                              # Required: list of models used
  - model_name: gpt-5                # Required: model identifier
    model_provider: openai           # Required: provider (openai, anthropic, etc.)
    model_display_name: "GPT-5"      # Required
    model_org_display_name: "OpenAI" # Required
  # - Other models if your agent used multiple
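
The required fields can be checked before submitting. This is a minimal sketch that works on an already-parsed mapping (for example, the result of `yaml.safe_load` from PyYAML); the function name `check_metadata` and the rule that empty values count as missing are assumptions, not the behavior of the official validator.

```python
# Field names are taken from the metadata.yaml example above.
REQUIRED_TOP_LEVEL = ("agent_url", "agent_display_name", "agent_org_display_name")
REQUIRED_MODEL_FIELDS = (
    "model_name",
    "model_provider",
    "model_display_name",
    "model_org_display_name",
)


def check_metadata(metadata: dict) -> list[str]:
    """Return a list of problems found in a parsed metadata.yaml mapping."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        # Assumption: an empty string or null is treated like a missing field.
        if not metadata.get(key):
            problems.append(f"missing field: {key}")

    models = metadata.get("models")
    if not isinstance(models, list) or not models:
        problems.append("models must be a non-empty list")
        return problems

    for i, model in enumerate(models):
        if not isinstance(model, dict):
            problems.append(f"models[{i}] must be a mapping")
            continue
        for key in REQUIRED_MODEL_FIELDS:
            if not model.get(key):
                problems.append(f"models[{i}]: missing field: {key}")
    return problems
```

An empty return value means every field marked Required in the example is present for the agent and for each listed model.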

Job Directory Requirements

Each job directory must contain all of the contents of your run.

Validation Rules

Your submission will be automatically validated. To pass:

  • timeout_multiplier must equal 1.0
  • No agent timeout overrides (override_timeout_sec, max_timeout_sec)
  • No verifier timeout overrides
  • No resource overrides (override_cpus, override_memory_mb, override_storage_mb)
  • All trial directories must have valid result.json files
  • Trial directories must contain other artifacts from the run
  • Each task must be evaluated with a minimum of five trials. We recommend the -k 5 flag for convenience.
  • Agents must not access the Terminal-Bench website or GitHub repository (this would be reward hacking)
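
Several of the rules above can be pre-checked mechanically. The sketch below is illustrative only: the README does not describe the schema of config.json, so the code searches for the named keys anywhere in the nested structure, treats a null value as "not overridden", and does not flag a missing `timeout_multiplier`. Verifier overrides are only caught if they use the same key names as the agent overrides, which is an assumption. The helper names and the idea of passing one task name per trial are hypothetical.

```python
from collections import Counter

# Key names come from the validation rules; where they live in config.json
# is not documented, so the whole structure is searched.
FORBIDDEN_OVERRIDES = {
    "override_timeout_sec",
    "max_timeout_sec",
    "override_cpus",
    "override_memory_mb",
    "override_storage_mb",
}
MIN_TRIALS_PER_TASK = 5


def iter_items(node):
    """Yield every (key, value) pair found anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key, value
            yield from iter_items(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_items(item)


def check_config(config: dict) -> list[str]:
    """Flag a non-1.0 timeout_multiplier and any non-null forbidden override."""
    problems = []
    for key, value in iter_items(config):
        if key == "timeout_multiplier" and value != 1.0:
            problems.append(f"timeout_multiplier is {value}, expected 1.0")
        elif key in FORBIDDEN_OVERRIDES and value is not None:
            problems.append(f"{key} is set to {value}")
    return problems


def check_trial_counts(task_names: list[str]) -> list[str]:
    """Given one task name per trial, list tasks with fewer than five trials."""
    counts = Counter(task_names)
    return [
        f"{task}: {n} trials, need at least {MIN_TRIALS_PER_TASK}"
        for task, n in sorted(counts.items())
        if n < MIN_TRIALS_PER_TASK
    ]
```

How to obtain the task name for each trial depends on the contents of result.json, which this document does not specify, so that step is left to the caller.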

Submission Process

  1. Open PR: When you open a Pull Request, our bot will automatically validate your submission
  2. Fix Issues: If validation fails, the bot will comment with specific errors to fix
  3. Merge: Once validation passes, a maintainer will review and merge your PR
  4. Import: After merge, results are automatically imported to the leaderboard

Questions?

Open an issue in this repository or contact [email protected].


🛡️ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-dataset--sadahiy--terminal-bench-2-leaderboard
slug: sadahiy--terminal-bench-2-leaderboard
source: huggingface
author: sadahiy
license: Apache-2.0
tags: license:apache-2.0, region:us

📊 Engagement & Metrics

downloads: 22,326
stars: 0
forks: 0

Data indexed from public sources. Updated daily.