boolq
- annotations_creators: crowdsourced
- language_creators: found
- language: en
- license: cc-by-sa-3.0
- multilinguality: monolingual
- size_categories: 10K
Dataset Card for BoolQ
Table of Contents
- Dataset Summary
- Supported Tasks and Leaderboards
- Languages
- Data Instances
- Data Fields
- Data Splits
- Curation Rationale
- Source Data
- Annotations
- Personal and Sensitive Information
- Social Impact of Dataset
- Discussion of Biases
- Other Known Limitations
- Dataset Curators
- Licensing Information
- Citation Information
- Contributions
Dataset Description
- Homepage: More Information Needed
- Repository: https://github.com/google-research-datasets/boolean-questions
- Paper: https://arxiv.org/abs/1905.10044
- Point of Contact: More Information Needed
- Size of downloaded dataset files: 8.77 MB
- Size of the generated dataset: 7.83 MB
- Total amount of disk used: 16.59 MB
Dataset Summary
BoolQ is a question-answering dataset of yes/no questions containing 15,942 examples. The questions are naturally occurring: they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context. The text-pair classification setup is similar to existing natural language inference tasks.
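The text-pair framing described above can be sketched in a few lines of Python. This is only an illustration: the `TextPair` type, the `to_text_pair` helper, the way an optional title is prepended to the passage, and the abbreviated passage text are all assumptions made for this example, not something the card or the paper prescribes.

```python
from typing import NamedTuple, Optional


class TextPair(NamedTuple):
    """Hypothetical container for one text-pair classification example."""
    text_a: str  # the question
    text_b: str  # the passage, optionally prefixed with the page title
    label: int   # 1 for a "yes" answer, 0 for a "no" answer


def to_text_pair(example: dict, title: Optional[str] = None) -> TextPair:
    """Convert a (question, passage, answer) triplet into a labeled text pair."""
    passage = example["passage"]
    if title:
        # Assumption: the optional title is simply prepended to the passage.
        passage = f"{title}. {passage}"
    return TextPair(example["question"], passage, int(example["answer"]))


# Record modeled on the cropped instance shown in this card (passage abbreviated).
example = {
    "question": "does ethanol take more energy make that produces",
    "passage": "All biomass goes through at least some of these steps...",
    "answer": False,
}
pair = to_text_pair(example)
print(pair.label)  # 0
```

Mapping the boolean answer to an integer label keeps the record compatible with standard binary classifiers, which is one common way to treat NLI-style inputs.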
Supported Tasks and Leaderboards
Languages
Dataset Structure
Data Instances
#### default
- Size of downloaded dataset files: 8.77 MB
- Size of the generated dataset: 7.83 MB
- Total amount of disk used: 16.59 MB
This example was too long and was cropped:
{
"answer": false,
"passage": "\"All biomass goes through at least some of these steps: it needs to be grown, collected, dried, fermented, distilled, and burned...",
"question": "does ethanol take more energy make that produces"
}
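A record like the one above could be retrieved with the Hugging Face `datasets` library. The sketch below assumes that the library is installed, that network access is available, and that the dataset is published under the Hub identifier `google/boolq`; that identifier is an assumption and is not stated in this card. The `preview` helper is hypothetical and only mimics the cropping shown above.

```python
DATASET_ID = "google/boolq"  # assumed Hub identifier; not stated in this card


def load_boolq(split: str = "train"):
    """Load one BoolQ split. Requires `pip install datasets` and network access."""
    # Imported lazily so that `preview` below works without the library installed.
    from datasets import load_dataset
    return load_dataset(DATASET_ID, split=split)


def preview(record: dict, width: int = 60) -> dict:
    """Return a copy of a record with the passage cropped to `width` characters."""
    passage = record["passage"]
    if len(passage) > width:
        passage = passage[:width] + "..."
    return {
        "question": record["question"],
        "passage": passage,
        "answer": record["answer"],
    }


# Example usage (downloads the data on first call):
#   validation = load_boolq("validation")
#   print(preview(validation[0]))
```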
Data Fields
The data fields are the same among all splits.
#### default
- question: a string feature.
- answer: a bool feature.
- passage: a string feature.
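The three fields listed above can be checked programmatically before records are fed into a pipeline. The following is a minimal sketch; `EXPECTED_FIELDS` and `validate_record` are hypothetical names introduced here for illustration.

```python
# Field names and types as listed in this card.
EXPECTED_FIELDS = {"question": str, "answer": bool, "passage": str}


def validate_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record is valid."""
    problems = []
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            problems.append(f"{name}: expected {expected_type.__name__}, got {actual}")
    return problems


good = {"question": "is this valid", "answer": True, "passage": "Some passage."}
bad = {"question": "is this valid", "answer": "yes"}
print(validate_record(good))  # []
print(validate_record(bad))   # ['answer: expected bool, got str', 'missing field: passage']
```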
Data Splits
| name | train | validation |
|---|---:|---:|
| default | 9427 | 3270 |
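As a quick sanity check, the split sizes can be compared against the total quoted in the Dataset Summary. The arithmetic below uses only numbers that appear in this card. The two listed splits do not add up to 15942, and the card does not say where the remaining examples live; they may belong to a split that is not listed here, but that is an assumption.

```python
SPLIT_SIZES = {"train": 9427, "validation": 3270}  # from the Data Splits table
TOTAL_IN_SUMMARY = 15942                           # from the Dataset Summary

listed = sum(SPLIT_SIZES.values())        # 9427 + 3270 = 12697
unlisted = TOTAL_IN_SUMMARY - listed      # 15942 - 12697 = 3245

for name, size in SPLIT_SIZES.items():
    share = size / listed
    print(f"{name}: {size} examples ({share:.1%} of the listed splits)")
print(f"examples not covered by the listed splits: {unlisted}")
```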
Dataset Creation
Curation Rationale
Source Data
#### Initial Data Collection and Normalization
#### Who are the source language producers?
Annotations
#### Annotation process
#### Who are the annotators?
Personal and Sensitive Information
Considerations for Using the Data
Social Impact of Dataset
Discussion of Biases
Other Known Limitations
Additional Information
Dataset Curators
Licensing Information
BoolQ is released under the Creative Commons Share-Alike 3.0 license.
Citation Information
@inproceedings{clark2019boolq,
title = {BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions},
author = {Clark, Christopher and Lee, Kenton and Chang, Ming-Wei and Kwiatkowski, Tom and Collins, Michael and Toutanova, Kristina},
booktitle = {NAACL},
year = {2019},
}
Contributions
Thanks to @lewtun, @lhoestq, @thomwolf, @patrickvonplaten, @albertvillanova for adding this dataset.