hellaswag
Pillar scores are computed during the next indexing cycle.
| Entity Passport | |
|---|---|
| Registry ID | hf-dataset--rowan--hellaswag |
| Provider | huggingface |
Cite this dataset
Academic & Research Attribution
@misc{hf_dataset__rowan__hellaswag,
author = {Rowan},
title = {hellaswag Dataset},
year = {2026},
howpublished = {\url{https://huggingface.co/datasets/rowan/hellaswag}},
note = {Accessed via Free2AITools Knowledge Fortress}
}
Technical Deep Dive
Nexus Index V2.0
Index Insight
FNI V2.0 for hellaswag: Semantic (S:0), Authority (A:0), Popularity (P:0), Recency (R:0), Quality (Q:0).
Verification Authority
Data Preview
Row-level preview not available for this dataset.
Schema structure is shown in the Field Logic panel when available.
Field Logic
Schema not yet indexed for this dataset.
Dataset Specification
Dataset Card for "hellaswag"
Table of Contents
- Dataset Description
- Dataset Structure
- Dataset Creation
- Considerations for Using the Data
- Additional Information
Dataset Description
- Homepage: https://rowanzellers.com/hellaswag/
- Repository: https://github.com/rowanz/hellaswag/
- Paper: HellaSwag: Can a Machine Really Finish Your Sentence?
- Point of Contact: More Information Needed
- Size of downloaded dataset files: 71.49 MB
- Size of the generated dataset: 65.32 MB
- Total amount of disk used: 136.81 MB
Dataset Summary
HellaSwag is a dataset for commonsense natural language inference (NLI). It was introduced in the paper "HellaSwag: Can a Machine Really Finish Your Sentence?", published at ACL 2019.
Supported Tasks and Leaderboards
Languages
Dataset Structure
Data Instances
default
An example of 'train' looks as follows.
This example was too long and was cropped:
{
"activity_label": "Removing ice from car",
"ctx": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles. then",
"ctx_a": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles.",
"ctx_b": "then",
"endings": "[\", the man adds wax to the windshield and cuts it.\", \", a person board a ski lift, while two men supporting the head of the per...",
"ind": 4,
"label": "3",
"source_id": "activitynet~v_-1IBHYS3L-Y",
"split": "train",
"split_type": "indomain"
}
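The relationship between the fields in this example can be sketched in a few lines of Python. In the record shown, `ctx` is `ctx_a` and `ctx_b` joined by a single space, and `label` is a string holding the index of the correct item in `endings`. The helper names below (`joined_context`, `gold_index`, `gold_ending`) are hypothetical, and the four endings are placeholders because the real ones are cropped in the example. Treating an empty label as "no gold answer" is an assumption, not something the card states.

```python
from typing import Optional

# Values copied from the cropped 'train' example. The endings are cropped in
# the source, so the four strings here are hypothetical placeholders.
record = {
    "ctx_a": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles.",
    "ctx_b": "then",
    "ctx": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles. then",
    "endings": ["<ending 0>", "<ending 1>", "<ending 2>", "<ending 3>"],
    "label": "3",
}


def joined_context(rec: dict) -> str:
    """Join ctx_a and ctx_b with one space, as the example's ctx suggests."""
    return f"{rec['ctx_a']} {rec['ctx_b']}".strip()


def gold_index(rec: dict) -> Optional[int]:
    """Convert the string label to an integer index (None if it is empty)."""
    return int(rec["label"]) if rec["label"] != "" else None


def gold_ending(rec: dict) -> Optional[str]:
    """Return the ending selected by the label, if the label is present."""
    idx = gold_index(rec)
    return None if idx is None else rec["endings"][idx]


print(joined_context(record) == record["ctx"])
print(gold_index(record), gold_ending(record))
```

Because `label` is stored as a string, converting it once before indexing avoids a `TypeError` when selecting from `endings`.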
Data Fields
The data fields are the same among all splits.
default
- ind: an int32 feature.
- activity_label: a string feature.
- ctx_a: a string feature.
- ctx_b: a string feature.
- ctx: a string feature.
- endings: a list of string features.
- source_id: a string feature.
- split: a string feature.
- split_type: a string feature.
- label: a string feature.
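A minimal sketch of checking a record against the fields listed above, assuming records arrive as plain Python dictionaries. The names `EXPECTED_FIELDS` and `schema_problems` are hypothetical, and mapping `int32` to Python's `int` is a simplification that does not check the 32-bit range. The sample endings are placeholders, not real dataset content.

```python
from typing import List

# Field names and types taken from the Data Fields list; int32 is
# approximated by Python's int (no range check).
EXPECTED_FIELDS = {
    "ind": int,
    "activity_label": str,
    "ctx_a": str,
    "ctx_b": str,
    "ctx": str,
    "endings": list,
    "source_id": str,
    "split": str,
    "split_type": str,
    "label": str,
}


def schema_problems(rec: dict) -> List[str]:
    """Return human-readable schema problems; an empty list means none found."""
    problems = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in rec:
            problems.append(f"missing field: {name}")
        elif not isinstance(rec[name], expected):
            problems.append(f"{name}: expected {expected.__name__}")
    endings = rec.get("endings")
    if isinstance(endings, list) and not all(isinstance(e, str) for e in endings):
        problems.append("endings: every item should be a string")
    return problems


# Hypothetical record: scalar values follow the card's example, endings are placeholders.
sample = {
    "ind": 4,
    "activity_label": "Removing ice from car",
    "ctx_a": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles.",
    "ctx_b": "then",
    "ctx": "Then, the man writes over the snow covering the window of a car, and a woman wearing winter clothes smiles. then",
    "endings": ["<ending 0>", "<ending 1>", "<ending 2>", "<ending 3>"],
    "source_id": "activitynet~v_-1IBHYS3L-Y",
    "split": "train",
    "split_type": "indomain",
    "label": "3",
}
print(schema_problems(sample))
```

A check like this would flag the cropped preview shown under Data Instances, where `endings` is rendered as a single string rather than a list.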
Data Splits
| name | train | validation | test |
|---|---|---|---|
| default | 39905 | 10042 | 10003 |
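The split sizes in the table can be totalled and turned into proportions with a short calculation. This is only an arithmetic sketch over the three numbers above; the names `SPLIT_ROWS`, `total_rows`, and `split_share` are hypothetical.

```python
# Row counts copied from the Data Splits table.
SPLIT_ROWS = {"train": 39905, "validation": 10042, "test": 10003}

# Total number of rows across all three splits.
total_rows = sum(SPLIT_ROWS.values())

# Fraction of the whole dataset that each split represents.
split_share = {name: rows / total_rows for name, rows in SPLIT_ROWS.items()}

for name, rows in SPLIT_ROWS.items():
    print(f"{name:<10} {rows:>6} {split_share[name]:.1%}")
print(f"{'total':<10} {total_rows:>6}")
```

The total comes to 59,950 rows, which agrees with the estimated row count reported elsewhere on this page.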
Dataset Creation
Curation Rationale
Source Data
Initial Data Collection and Normalization
Who are the source language producers?
Annotations
Annotation process
Who are the annotators?
Personal and Sensitive Information
Considerations for Using the Data
Social Impact of Dataset
Discussion of Biases
Other Known Limitations
Additional Information
Dataset Curators
Licensing Information
MIT https://github.com/rowanz/hellaswag/blob/master/LICENSE
Citation Information
@inproceedings{zellers2019hellaswag,
title={HellaSwag: Can a Machine Really Finish Your Sentence?},
author={Zellers, Rowan and Holtzman, Ari and Bisk, Yonatan and Farhadi, Ali and Choi, Yejin},
booktitle ={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
year={2019}
}
Contributions
Thanks to @albertvillanova, @mariamabarham, @thomwolf, @patrickvonplaten, @lewtun for adding this dataset.
Structured Schema (Zero-Fabrication)
| Feature Key | Data Type |
|---|---|
| ind | int32 |
| activity_label | string |
| ctx_a | string |
| ctx_b | string |
| ctx | string |
| endings | unknown |
| source_id | string |
| split | string |
| split_type | string |
| label | string |
Estimated Rows: 59,950
Social Proof
AI Summary: Based on Hugging Face metadata. Not a recommendation.
Dataset Transparency Report
Verified data manifest for traceability and transparency.
Identity & Source
- id: hf-dataset--rowan--hellaswag
- slug: rowan--hellaswag
- source: huggingface
- author: Rowan
- license:
- tags: language:en, size_categories:10k
Technical Specs
- architecture: null
- params billions: null
- context length: null
- pipeline tag:
Engagement & Metrics
- downloads: 300,906
- stars: 166
- forks: 0