📊
Dataset

SWE-Bench Pro

by ScaleAI hf-dataset--scaleai--swe-bench_pro
Nexus Index
41.9 Top 100%
S: Semantic 50
A: Authority 0
P: Popularity 68
R: Recency 73
Q: Quality 30
Entity Passport
Registry ID hf-dataset--scaleai--swe-bench_pro
Provider huggingface
📜

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__scaleai__swe_bench_pro,
  author = {ScaleAI},
  title = {SWE-Bench Pro Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/scaleai/swe-bench_pro}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
ScaleAI. (2026). SWE-Bench Pro [Dataset]. Free2AITools. https://huggingface.co/datasets/scaleai/swe-bench_pro

🔬 Technical Deep Dive

FNI V2.0 for Swe Bench Pro: Semantic (S:50), Authority (A:0), Popularity (P:68), Recency (R:73), Quality (Q:30).

Downloads: 854,918

đŸ‘ī¸ Data Preview

📊

Row-level preview not available for this dataset.

Schema structure is shown in the Field Logic panel when available.

🔗 Explore Full Dataset ↗

đŸ§Ŧ Field Logic

đŸ§Ŧ

Schema not yet indexed for this dataset.

Dataset Specification

Dataset Summary

SWE-Bench Pro is a challenging, enterprise-level dataset for testing agent ability on long-horizon software engineering tasks.

Paper: https://static.scale.com/uploads/654197dc94d34f66c0f5184e/SWEAP_Eval_Scale%20(9).pdf

See the related evaluation GitHub repository: https://github.com/scaleapi/SWE-bench_Pro-os

Dataset Structure

We follow SWE-Bench Verified (https://huggingface.co/datasets/SWE-bench/SWE-bench_Verified) in terms of dataset structure, with several extra fields.

Data Fields

repo (string): Repository identifier - one of 11 repository classes

instance_id (string): Unique identifier for each instance (65-120 characters)

base_commit (string): Git commit hash of the base version (40 characters)

patch (string): The golden code patch/diff (1.44k - 180k characters)

test_patch (string): Test cases related to the patch (325 - 322k characters)

problem_statement (string): Description of the issue being addressed (419 - 8.04k characters)

requirements (string): Project requirements or dependencies (124 - 6.7k characters, may be null)

interface (string): API or interface specifications (1 - 12.2k characters, may be null)

repo_language (string): Programming language of the repository - one of 4 language classes

fail_to_pass (string): Test cases that fail before the patch and should pass after it is applied (10 - 155k characters)

pass_to_pass (string): Test cases that should continue passing (2 - 532k characters)

issue_specificity (string): Specificity of the issue (12-77 characters)

issue_categories (string): Categories or tags for the issue type

before_repo_set_cmd (string): Command run to set up the repository before testing

selected_test_files_to_run (string): Files selected for testing
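
The fail_to_pass and pass_to_pass fields are stored as strings rather than lists. The following is a minimal sketch of decoding them and checking whether an instance counts as resolved. It assumes each field holds either a JSON array or a Python-style list literal of test names (the field descriptions above do not state the encoding), and it assumes a hypothetical `results` mapping of test name to pass/fail produced by your own test runner. The helper names and the example record are illustrative, not part of the dataset or its evaluation harness.

```python
import ast
import json


def parse_test_list(raw):
    """Decode a stringified list of test names.

    Assumption: the string is either a JSON array or a Python list literal.
    """
    if raw is None or not raw.strip():
        return []
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        # Fall back to a Python literal such as "['a', 'b']".
        value = ast.literal_eval(raw)
    if not isinstance(value, list):
        raise ValueError("expected a list of test names")
    return [str(name) for name in value]


def is_resolved(record, results):
    """Return True if every fail_to_pass and pass_to_pass test passed.

    `results` is a hypothetical mapping {test name: bool}; a test that is
    missing from the mapping is treated as failed.
    """
    required = parse_test_list(record["fail_to_pass"])
    required += parse_test_list(record["pass_to_pass"])
    return all(results.get(name, False) for name in required)


# Made-up record for illustration; not a real row from the dataset.
example = {
    "fail_to_pass": '["tests/test_api.py::test_new_behavior"]',
    "pass_to_pass": "['tests/test_api.py::test_existing']",
}
```

Treating a missing test as a failure is a conservative choice here: a test that never ran should not count toward a resolved instance.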

📊 Structured Schema (Zero-Fabrication)

Feature Key                  Data Type
repo                         string
instance_id                  string
base_commit                  string
patch                        string
test_patch                   string
problem_statement            string
requirements                 string
interface                    string
repo_language                string
fail_to_pass                 string
pass_to_pass                 string
issue_specificity            string
issue_categories             string
before_repo_set_cmd          string
selected_test_files_to_run   string
dockerhub_tag                string
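
Since every field in the schema is a string, a loaded row can be sanity-checked with a few lines of code. The sketch below is illustrative only: `validate_record` is a hypothetical helper, the nullable set reflects the two fields described as "may be null", and the lowercase-hex check on base_commit is an assumption layered on the stated 40-character length.

```python
import re

# Field names copied from the schema listing; every field is typed as string.
EXPECTED_FIELDS = (
    "repo", "instance_id", "base_commit", "patch", "test_patch",
    "problem_statement", "requirements", "interface", "repo_language",
    "fail_to_pass", "pass_to_pass", "issue_specificity", "issue_categories",
    "before_repo_set_cmd", "selected_test_files_to_run", "dockerhub_tag",
)

# Fields described as possibly null.
NULLABLE_FIELDS = {"requirements", "interface"}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for field in EXPECTED_FIELDS:
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        if value is None:
            if field not in NULLABLE_FIELDS:
                problems.append(f"{field} must not be null")
        elif not isinstance(value, str):
            problems.append(f"{field} must be a string")
    commit = record.get("base_commit")
    # Assumption: the 40-character commit hash is lowercase hexadecimal.
    if isinstance(commit, str) and not re.fullmatch(r"[0-9a-f]{40}", commit):
        problems.append("base_commit is not a 40-character hex hash")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report everything wrong with a row in a single pass.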

Estimated Rows: 731

Social Proof

HuggingFace Hub
854.9K Downloads
🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.


đŸ›Ąī¸ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-dataset--scaleai--swe-bench_pro
slug: scaleai--swe-bench_pro
source: huggingface
author: ScaleAI
license: (not listed)
tags: size_categories:n<1k, format:csv, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, region:us, format:parquet, benchmark:official, benchmark:eval-yaml
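
The tags value is a flat, comma-separated list of `key:value` pairs in which a key such as `format` or `library` can repeat. A small sketch of grouping the pairs by key follows; `parse_tags` is a hypothetical helper written for illustration, and the tag string is copied from the metadata shown here.

```python
from collections import defaultdict

TAGS = (
    "size_categories:n<1k, format:csv, modality:text, library:datasets, "
    "library:pandas, library:mlcroissant, library:polars, region:us, "
    "format:parquet, benchmark:official, benchmark:eval-yaml"
)


def parse_tags(raw):
    """Group comma-separated key:value tags into {key: [values]}."""
    grouped = defaultdict(list)
    for tag in raw.split(","):
        tag = tag.strip()
        if not tag:
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = tag.partition(":")
        grouped[key].append(value if sep else "")
    return dict(grouped)


tags = parse_tags(TAGS)
```

Keeping a list per key preserves repeated entries, so both listed formats remain visible instead of one overwriting the other.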

âš™ī¸ Technical Specs

architecture
null
params billions
null
context length
null
pipeline tag

📊 Engagement & Metrics

downloads: 854,918
stars: 102
forks: 0

Data indexed from public sources. Updated daily.