Dataset Card for Mostly Basic Python Problems (mbpp)

- Table of Contents
- Dataset Description
  - Dataset Summary
  - Supported Tasks and Leaderboards
  - Languages
- Dataset Structure
  - Data Instances
  - Data Fields
  - Data Splits
- Dataset Creation
  - Curation Rationale
  - Source Data
    - Initial Data Collection and Normalization
    - Who are the source language producers?
  - Annotations
    - Annotation process
    - Who are the annotators?
  - Personal and Sensitive Information
- Considerations for Using the Data
  - Social Impact of Dataset
  - Discussion of Biases
  - Other Known Limitations
- Additional Information
  - Dataset Curators
  - Licensing Information
  - Citation Information
  - Contributions

Dataset Description

Repository: https://github.com/google-research/google-research/tree/master/mbpp
Paper: Program Synthesis with Large Language Models

Dataset Summary

The benchmark consists of around 1,000 crowd-sourced Python programming problems, designed to be solvable by entry-level programmers and covering programming fundamentals, standard library functionality, and so on. Each problem consists of a task description, a code solution, and 3 automated test cases. As described in the paper, a subset of the data has been hand-verified by the authors.

Released here as part of Program Synthesis with Large Language Models, Austin et al., 2021.

Supported Tasks and Leaderboards

This dataset is used to evaluate code generation models.

Languages

English - Python code

Dataset Structure

python

from datasets import load_dataset

dataset_full = load_dataset("mbpp")
DatasetDict({
    test: Dataset({
        features: ['task_id', 'text', 'code', 'test_list', 'test_setup_code', 'challenge_test_list'],
        num_rows: 974
    })
})

dataset_sanitized = load_dataset("mbpp", "sanitized")
DatasetDict({
    test: Dataset({
        features: ['source_file', 'task_id', 'prompt', 'code', 'test_imports', 'test_list'],
        num_rows: 427
    })
})

Data Instances

#### mbpp - full

code

{
    'task_id': 1,
    'text': 'Write a function to find the minimum cost path to reach (m, n) from (0, 0) for the given cost matrix cost[][] and a position (m, n) in cost[][].',
    'code': 'R = 3\r\nC = 3\r\ndef min_cost(cost, m, n): \r\n\ttc = [[0 for x in range(C)] for x in range(R)] \r\n\ttc[0][0] = cost[0][0] \r\n\tfor i in range(1, m+1): \r\n\t\ttc[i][0] = tc[i-1][0] + cost[i][0] \r\n\tfor j in range(1, n+1): \r\n\t\ttc[0][j] = tc[0][j-1] + cost[0][j] \r\n\tfor i in range(1, m+1): \r\n\t\tfor j in range(1, n+1): \r\n\t\t\ttc[i][j] = min(tc[i-1][j-1], tc[i-1][j], tc[i][j-1]) + cost[i][j] \r\n\treturn tc[m][n]',
    'test_list': [
        'assert min_cost([[1, 2, 3], [4, 8, 2], [1, 5, 3]], 2, 2) == 8',
        'assert min_cost([[2, 3, 4], [5, 9, 3], [2, 6, 4]], 2, 2) == 12',
        'assert min_cost([[3, 4, 5], [6, 10, 4], [3, 7, 5]], 2, 2) == 16'],
    'test_setup_code': '',
    'challenge_test_list': []
}

#### mbpp - sanitized

code

{
    'source_file': 'Benchmark Questions Verification V2.ipynb',
    'task_id': 2,
    'prompt': 'Write a function to find the shared elements from the given two lists.',
    'code': 'def similar_elements(test_tup1, test_tup2):\n  res = tuple(set(test_tup1) & set(test_tup2))\n  return (res) ',
    'test_imports': [],
    'test_list': [
        'assert set(similar_elements((3, 4, 5, 6),(5, 7, 4, 10))) == set((4, 5))',
        'assert set(similar_elements((1, 2, 3, 4),(5, 4, 3, 7))) == set((3, 4))',
        'assert set(similar_elements((11, 12, 14, 13),(17, 15, 14, 13))) == set((13, 14))'
        ]
}
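
As a rough illustration of how a record's `code` and `test_list` fit together, the sketch below executes a reference solution and then each assertion string. `run_tests` is a hypothetical helper, not part of the dataset or of any library, and it uses in-process `exec`, so it is only appropriate for trusted code such as the reference solutions.

```python
# Sketch: check a reference solution against its own test_list.
# `run_tests` is a hypothetical helper written for this example.

def run_tests(code, tests, setup=""):
    """Execute `code`, then each assert string; return pass/fail per test."""
    namespace = {}
    exec(setup, namespace)  # e.g. test_setup_code, often empty
    exec(code, namespace)   # defines the solution function
    results = []
    for test in tests:
        try:
            exec(test, namespace)
            results.append(True)
        except Exception:   # AssertionError or any runtime error
            results.append(False)
    return results


# Fields copied from the sanitized example above (two of its three tests).
record = {
    "code": "def similar_elements(test_tup1, test_tup2):\n  res = tuple(set(test_tup1) & set(test_tup2))\n  return (res) ",
    "test_list": [
        "assert set(similar_elements((3, 4, 5, 6),(5, 7, 4, 10))) == set((4, 5))",
        "assert set(similar_elements((1, 2, 3, 4),(5, 4, 3, 7))) == set((3, 4))",
    ],
}

results = run_tests(record["code"], record["test_list"])
print(results)  # [True, True]
```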

Data Fields

source_file: unknown
text/prompt: description of programming task
code: solution for programming task
test_setup_code/test_imports: necessary code imports to execute tests
test_list: list of tests to verify solution
challenge_test_list: list of more challenging tests to further probe solution
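
Because the two versions name some fields differently (text vs. prompt, test_setup_code vs. test_imports), downstream code may want a single schema. The sketch below is one possible way to do that. `normalize_record` and its output keys are hypothetical, and it assumes `test_imports` is a list of import statement strings, which the empty list in the example above suggests but does not confirm.

```python
# Sketch: map a full or sanitized record onto one set of keys.
# `normalize_record` and the output key names are illustrative assumptions.

def normalize_record(record):
    if "prompt" in record:  # sanitized version
        description = record["prompt"]
        # Assumption: test_imports is a list of import statements.
        setup = "\n".join(record.get("test_imports", []))
    else:                   # full version
        description = record["text"]
        setup = record.get("test_setup_code", "")
    return {
        "task_id": record["task_id"],
        "description": description,
        "code": record["code"],
        "setup": setup,
        "tests": list(record["test_list"]),
        "challenge_tests": list(record.get("challenge_test_list", [])),
    }


# Fields copied from the sanitized example in this card (one test kept).
sanitized = {
    "source_file": "Benchmark Questions Verification V2.ipynb",
    "task_id": 2,
    "prompt": "Write a function to find the shared elements from the given two lists.",
    "code": "def similar_elements(test_tup1, test_tup2):\n  res = tuple(set(test_tup1) & set(test_tup2))\n  return (res) ",
    "test_imports": [],
    "test_list": [
        "assert set(similar_elements((3, 4, 5, 6),(5, 7, 4, 10))) == set((4, 5))",
    ],
}

unified = normalize_record(sanitized)
print(unified["description"])
```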

Data Splits

There are two versions of the dataset (full and sanitized), each with four splits:

train
validation
test
prompt

The prompt split corresponds to samples used for few-shot prompting and not for training.
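
To make the role of the prompt split concrete, the sketch below assembles a few-shot prompt from solved examples followed by an unsolved target task. The template ("Task:", "Tests:", "Solution:") and both helper names are illustrative assumptions, not the format used in the paper, and the two toy records are made up for the example rather than taken from the dataset.

```python
# Sketch: build a few-shot prompt from solved examples plus one target task.
# The template and helper names are hypothetical.

def format_example(text, tests, code=None):
    """Render one task; omit the solution when `code` is None."""
    parts = ["Task: " + text, "Tests:"]
    parts.extend(tests)
    parts.append("Solution:")
    if code is not None:
        parts.append(code)
    return "\n".join(parts)


def build_few_shot_prompt(shots, target):
    """Concatenate solved shots, then the target task left open for the model."""
    blocks = [format_example(s["text"], s["test_list"], s["code"]) for s in shots]
    blocks.append(format_example(target["text"], target["test_list"]))
    return "\n\n".join(blocks)


# Toy records in the shape of the full version (made up for illustration).
shots = [{
    "text": "Write a function to add two numbers.",
    "code": "def add(a, b):\n    return a + b",
    "test_list": ["assert add(1, 2) == 3"],
}]
target = {
    "text": "Write a function to double a number.",
    "test_list": ["assert double(2) == 4"],
}

prompt = build_few_shot_prompt(shots, target)
print(prompt)
```

With the real data, `shots` would be drawn from the prompt split and `target` from the split being evaluated.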

Dataset Creation

See Section 2.1 of the original paper.

Curation Rationale

Evaluating code generation models requires a set of simple programming tasks together with reference solutions, which this dataset provides.

Source Data

#### Initial Data Collection and Normalization

The dataset was manually created from scratch.

#### Who are the source language producers?

The dataset was created with an internal crowdsourcing effort at Google.

Annotations

#### Annotation process

The full dataset was created first and a subset then underwent a second round to improve the task descriptions.

#### Who are the annotators?

The dataset was created with an internal crowdsourcing effort at Google.

Personal and Sensitive Information

None.

Considerations for Using the Data

Make sure you execute generated Python code in a safe environment when evaluating against this dataset, as generated code could be harmful.
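
One partial measure is to run each candidate in a separate interpreter process with a time limit, as sketched below. `run_in_subprocess` is a hypothetical helper. A subprocess with a timeout only bounds wall-clock time and keeps crashes out of the evaluating process. It is not a sandbox, so untrusted code should additionally be confined with a container, virtual machine, or similar isolation.

```python
# Sketch: run candidate code plus its tests in a separate Python process.
# This limits runtime only; it does NOT restrict file or network access.
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_subprocess(code, tests, timeout=5.0):
    """Return True if the code and all assert strings run without error."""
    program = code + "\n\n" + "\n".join(tests) + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(program, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=tmp,
                capture_output=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return False  # e.g. an infinite loop in generated code
    return proc.returncode == 0


passed = run_in_subprocess("def f(x):\n    return x + 1", ["assert f(1) == 2"])
print(passed)  # True
```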

Social Impact of Dataset

With this dataset, code-generating models can be evaluated more reliably, which should lead to fewer issues being introduced when such models are used.

Discussion of Biases

Other Known Limitations

The task descriptions might not be expressive enough to solve the task. The sanitized version aims to address this issue through a second round of annotation that improved the task descriptions.

Additional Information

Dataset Curators

Google Research

Licensing Information

CC-BY-4.0

Citation Information

code

@article{austin2021program,
  title={Program Synthesis with Large Language Models},
  author={Austin, Jacob and Odena, Augustus and Nye, Maxwell and Bosma, Maarten and Michalewski, Henryk and Dohan, David and Jiang, Ellen and Cai, Carrie and Terry, Michael and Le, Quoc and others},
  journal={arXiv preprint arXiv:2108.07732},
  year={2021}
}

Contributions

Thanks to @lvwerra for adding this dataset.
