📊

Dataset

SWE-bench Verified

Name: SWE-bench Verified
Creator: Princeton NLP

Nexus Index

32.3 Top 100%

S: Semantic 50

A: Authority 0

P: Popularity 67

R: Recency 12

Q: Quality 30

Dataset Information Summary
Entity Passport
Registry ID	hf-dataset--princeton-nlp--swe-bench_verified
Provider	huggingface

📜

Cite this dataset

Academic & Research Attribution

BibTeX

@misc{hf_dataset__princeton_nlp__swe_bench_verified,
  author = {Princeton Nlp},
  title = {Swe Bench Verified Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/princeton-nlp/swe-bench_verified}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

Princeton Nlp. (2026). Swe Bench Verified [Dataset]. Free2AITools. https://huggingface.co/datasets/princeton-nlp/swe-bench_verified

🔬Technical Deep Dive


⚖️ Nexus Index V2.0

Methodology Index Protocol


💬 Index Insight

FNI V2.0 for SWE-bench Verified: Semantic (S:50), Authority (A:0), Popularity (P:67), Recency (R:12), Quality (Q:30).

Free2AITools Nexus Index

Verification Authority

HuggingFace API, GitHub Metadata, Arxiv Citation DB, System Audit

Unbiased Data Node Refresh: VFS Live

⬇️

Downloads

731,917


Dataset Specification

Dataset Summary

SWE-bench Verified is a subset of 500 samples from the SWE-bench test set that have been human-validated for quality. SWE-bench is a dataset that tests systems’ ability to solve GitHub issues automatically. See this post for more details on the human-validation process.

The dataset collects 500 test Issue-Pull Request pairs from popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.

The original SWE-bench dataset was released as part of SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Want to run inference now? This dataset only contains the problem_statement (i.e. issue text) and the base_commit which represents the state of the codebase before the issue has been resolved. If you want to run inference using the "Oracle" or BM25 retrieval settings mentioned in the paper, consider the following datasets.

princeton-nlp/SWE-bench_Lite_oracle

princeton-nlp/SWE-bench_Lite_bm25_13K

princeton-nlp/SWE-bench_Lite_bm25_27K
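Because this dataset carries only the issue text and the pre-fix commit, an inference harness has to assemble its own model input. The following is a minimal sketch, assuming the Hugging Face `datasets` library is installed and that the dataset is published under the id `princeton-nlp/SWE-bench_Verified` with a `test` split. The helper names and the prompt layout are hypothetical and not part of any official tooling.

```python
def load_verified():
    """Load the benchmark from the Hugging Face Hub (needs network access).

    Assumption: the `datasets` package is installed and the dataset id and
    split name below match the Hub listing.
    """
    from datasets import load_dataset  # lazy import so the formatter below works offline
    return load_dataset("princeton-nlp/SWE-bench_Verified", split="test")


def format_inference_input(row):
    """Build a plain-text task description from one row (hypothetical layout)."""
    return (
        f"Repository: {row['repo']}\n"
        f"Base commit: {row['base_commit']}\n\n"
        f"Issue:\n{row['problem_statement']}\n"
    )
```

Calling `format_inference_input(load_verified()[0])` would produce the text for the first instance. Retrieval of relevant files, as in the "Oracle" or BM25 settings, is deliberately left out of this sketch.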

Supported Tasks and Leaderboards

SWE-bench proposes a new task: issue resolution given a full repository and a GitHub issue. The leaderboard can be found at www.swebench.com.

Languages

The text of the dataset is primarily English, but we make no effort to filter or otherwise clean based on language type.

Dataset Structure

Each SWE-bench datum has the following fields:

instance_id: (str) - A formatted instance identifier, usually as repo_owner__repo_name-PR-number.
patch: (str) - The gold patch, the patch generated by the PR (minus test-related code), that resolved the issue.
repo: (str) - The repository owner/name identifier from GitHub.
base_commit: (str) - The commit hash of the repository representing the HEAD of the repository before the solution PR is applied.
hints_text: (str) - Comments made on the issue prior to the creation date of the solution PR’s first commit.
created_at: (str) - The creation date of the pull request.
test_patch: (str) - A test-file patch that was contributed by the solution PR.
problem_statement: (str) - The issue title and body.
version: (str) - Installation version to use for running evaluation.
environment_setup_commit: (str) - The commit hash to use for environment setup and installation.
FAIL_TO_PASS: (str) - A JSON list of strings that represent the set of tests resolved by the PR and tied to the issue resolution.
PASS_TO_PASS: (str) - A JSON list of strings that represent tests that should pass before and after the PR application.
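Since FAIL_TO_PASS and PASS_TO_PASS are stored as JSON lists inside strings, they need decoding before use. The sketch below shows one reading of the field descriptions above: an instance counts as resolved when every listed test passes after a candidate patch is applied. The function names, the `passed_tests` argument, and the example row are hypothetical, and an actual evaluation harness may apply further rules.

```python
import json


def parse_test_list(field_value):
    """Decode a FAIL_TO_PASS or PASS_TO_PASS field (a JSON list stored as a string)."""
    tests = json.loads(field_value)
    if not isinstance(tests, list):
        raise ValueError("expected a JSON list of test names")
    return tests


def is_resolved(row, passed_tests):
    """Return True when every FAIL_TO_PASS and PASS_TO_PASS test is in `passed_tests`.

    `passed_tests` is a hypothetical set of test names that passed after
    applying a candidate patch; producing it is up to the harness.
    """
    required = parse_test_list(row["FAIL_TO_PASS"]) + parse_test_list(row["PASS_TO_PASS"])
    return all(name in passed_tests for name in required)


# Illustrative row with made-up test names, not taken from the dataset.
example_row = {
    "FAIL_TO_PASS": '["tests/test_a.py::test_fixed"]',
    "PASS_TO_PASS": '["tests/test_a.py::test_existing", "tests/test_b.py::test_other"]',
}
```

Checking both lists matters: a patch that fixes the target test but breaks a previously passing one should not count as a resolution under this reading.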

📊 Structured Schema (Zero-Fabrication)

Feature Key	Data Type
`repo`	`string`
`instance_id`	`string`
`base_commit`	`string`
`patch`	`string`
`test_patch`	`string`
`problem_statement`	`string`
`hints_text`	`string`
`created_at`	`string`
`version`	`string`
`FAIL_TO_PASS`	`string`
`PASS_TO_PASS`	`string`
`environment_setup_commit`	`string`
`difficulty`	`string`

Estimated Rows: 500
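The schema above can be turned into a quick sanity check for rows produced by custom tooling. This is a rough sketch that only verifies the thirteen listed keys are present and hold strings; the function name and the message wording are hypothetical.

```python
# Field names copied from the schema table; every feature is typed `string`.
EXPECTED_FIELDS = (
    "repo", "instance_id", "base_commit", "patch", "test_patch",
    "problem_statement", "hints_text", "created_at", "version",
    "FAIL_TO_PASS", "PASS_TO_PASS", "environment_setup_commit", "difficulty",
)


def schema_problems(row):
    """Return a list of problems; an empty list means the row matches the schema."""
    problems = []
    for field in EXPECTED_FIELDS:
        if field not in row:
            problems.append(f"missing field: {field}")
        elif not isinstance(row[field], str):
            problems.append(f"{field} is {type(row[field]).__name__}, expected str")
    return problems
```

A common slip this catches is a `version` value parsed as a number, which would not match the `string` type in the table.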


🤗 Data Source: Hugging Face ↗

🔄 Daily sync (03:00 UTC)

AI Summary: Based on Hugging Face metadata. Not a recommendation.


🛡️ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-dataset--princeton-nlp--swe-bench_verified
slug: princeton-nlp--swe-bench_verified
source: huggingface
author: Princeton Nlp
license
tags: size_categories:n<1k, format:parquet, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, region:us

⚙️ Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag

📊 Engagement & Metrics

downloads: 731,917
stars: 331
forks: 0

Data indexed from public sources. Updated daily.
