📊

Dataset

FinCoT

Name: FinCoT
Creator: TheFinAI

Nexus Index

29.1 Top 100%

S: Semantic 50

A: Authority 0

P: Popularity 60

R: Recency 25

Q: Quality 30

Dataset Information Summary
Entity Passport
Registry ID	hf-dataset--thefinai--fincot
Provider	huggingface

📜

Cite this dataset

Academic & Research Attribution

BibTeX

@misc{hf_dataset__thefinai__fincot,
  author = {TheFinAI},
  title = {FinCoT Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/thefinai/fincot}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

TheFinAI. (2026). FinCoT [Dataset]. Free2AITools. https://huggingface.co/datasets/thefinai/fincot

⬇️

Downloads

151,052

Dataset Specification

Fino1 is a financial reasoning dataset built from FinQA, ConvFinQA, TATQA, DocMath-Eval, Econ-Logic, BizBench-QA, and DocFinQA, with GPT-4o-generated reasoning paths added to enhance structured financial question answering.

For more details, see our paper, Fin-o1 (arxiv.org/abs/2502.08127).

Source Data

Initial Data Collection and Normalization

The dataset is derived from seven source datasets: FinQA, ConvFinQA, TATQA, DocMath-Eval, Econ-Logic, BizBench-QA, and DocFinQA.

FinQA (Apache 2.0): A dataset for financial question answering, incorporating structured tables and textual context to test multi-step reasoning abilities.

TATQA (CC BY 4.0): A tabular question-answering dataset that includes diverse financial reports, allowing for multi-step reasoning over tables and text.

DocMath-Eval (MIT License): A dataset designed to evaluate mathematical reasoning over financial documents, focusing on quantitative financial statements.

Econ-Logic (CC BY-NC-SA 4.0): A dataset that requires logical reasoning over economic and financial texts, with restrictions on commercial use.

BizBench-QA (Apache 2.0): A business-focused question-answering dataset that tests contextual understanding and financial reasoning.

DocFinQA (MIT License): A financial QA dataset that includes multi-document reasoning, designed for comprehensive financial statement analysis.

ConvFinQA (MIT License): A dataset for conversational financial QA, allowing for multi-turn interactions and progressive information extraction.
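Because the sources carry different licenses, it can help to keep them in one lookup and flag any that restrict commercial use. The following is a minimal sketch: the license strings are copied from the list above, while the names `SOURCE_LICENSES` and `non_commercial_sources` are illustrative and not part of the dataset or any library.

```python
# Licenses as stated in the list above, keyed by source dataset name.
SOURCE_LICENSES = {
    "FinQA": "Apache 2.0",
    "TATQA": "CC BY 4.0",
    "DocMath-Eval": "MIT",
    "Econ-Logic": "CC BY-NC-SA 4.0",
    "BizBench-QA": "Apache 2.0",
    "DocFinQA": "MIT",
    "ConvFinQA": "MIT",
}


def non_commercial_sources(licenses):
    """Return the sources whose license string contains the Creative Commons NC clause."""
    flagged = []
    for name, license_name in licenses.items():
        # "CC BY-NC-SA 4.0" becomes the tokens CC, BY, NC, SA, 4.0
        tokens = license_name.replace("-", " ").split()
        if "NC" in tokens:
            flagged.append(name)
    return sorted(flagged)


print(non_commercial_sources(SOURCE_LICENSES))  # prints ['Econ-Logic']
```

This only inspects the license names given here; the license text of each source remains the authority on what is permitted.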

Annotations

Annotation Process

We employ an iterative verification and refinement strategy, utilizing GPT-4o to generate a comprehensive reasoning process for each question-answer pair.
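The card does not describe the verification and refinement loop in detail, so the following is only a sketch of one plausible shape for it: generate a reasoning path, check the resulting answer against the reference, and retry with feedback if they differ. The function `refine_reasoning`, the `generate` callback signature, the exact-match check, and the stub generator are all assumptions made for illustration; the real pipeline calls GPT-4o and may verify answers differently.

```python
from typing import Callable, Optional, Tuple


def refine_reasoning(
    question: str,
    gold_answer: str,
    generate: Callable[[str, Optional[str]], Tuple[str, str]],
    max_rounds: int = 3,
) -> Tuple[str, str, bool]:
    """Regenerate a reasoning path until its answer matches the reference.

    generate(question, feedback) returns (reasoning, answer).
    Returns (reasoning, answer, verified) for the last attempt made.
    """
    feedback = None
    reasoning, answer = "", ""
    for _ in range(max_rounds):
        reasoning, answer = generate(question, feedback)
        # Hypothetical verification step: exact string match with the reference.
        if answer.strip() == gold_answer.strip():
            return reasoning, answer, True
        feedback = f"Previous answer {answer!r} did not match the reference; re-check each step."
    return reasoning, answer, False


def stub_generate(question, feedback):
    """Stand-in for a model call: wrong on the first try, correct once given feedback."""
    if feedback is None:
        return ("100 + 20 = 130", "130")
    return ("100 + 20 = 120", "120")


result = refine_reasoning("What is 100 + 20?", "120", stub_generate)
print(result)  # prints ('100 + 20 = 120', '120', True)
```

Passing the generator in as a callback keeps the loop testable without a model; in practice `generate` would wrap whatever API client is in use.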

💡 Citation

If you use this dataset in your research, please cite our paper and the original papers for the source datasets:

@article{qian2025fino1,
  title={Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance},
  author={Qian, Lingfei and Zhou, Weipeng and Wang, Yan and Peng, Xueqing and Huang, Jimin and Xie, Qianqian},
  journal={arXiv preprint arXiv:2502.08127},
  year={2025}
}

@article{chen2021finqa,
  title={Finqa: A dataset of numerical reasoning over financial data},
  author={Chen, Zhiyu and Chen, Wenhu and Smiley, Charese and Shah, Sameena and Borova, Iana and Langdon, Dylan and Moussa, Reema and Beane, Matt and Huang, Ting-Hao and Routledge, Bryan and others},
  journal={arXiv preprint arXiv:2109.00122},
  year={2021}
}

@article{chen2022convfinqa,
  title={Convfinqa: Exploring the chain of numerical reasoning in conversational finance question answering},
  author={Chen, Zhiyu and Li, Shiyang and Smiley, Charese and Ma, Zhiqiang and Shah, Sameena and Wang, William Yang},
  journal={arXiv preprint arXiv:2210.03849},
  year={2022}
}

@misc{zhu2021tatqaquestionansweringbenchmark,
  title={TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance},
  author={Fengbin Zhu and Wenqiang Lei and Youcheng Huang and Chao Wang and Shuo Zhang and Jiancheng Lv and Fuli Feng and Tat-Seng Chua},
  year={2021},
  eprint={2105.07624},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2105.07624}
}

@inproceedings{zhao2024docmath,
  title={DocMath-eval: Evaluating math reasoning capabilities of LLMs in understanding long and specialized documents},
  author={Zhao, Yilun and Long, Yitao and Liu, Hongjun and Kamoi, Ryo and Nan, Linyong and Chen, Lyuhao and Liu, Yixin and Tang, Xiangru and Zhang, Rui and Cohan, Arman},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={16103--16120},
  year={2024}
}

@article{quan2024econlogicqa,
  title={Econlogicqa: A question-answering benchmark for evaluating large language models in economic sequential reasoning},
  author={Quan, Yinzhu and Liu, Zefang},
  journal={arXiv preprint arXiv:2405.07938},
  year={2024}
}

@inproceedings{krumdick2024bizbench,
  title={BizBench: A Quantitative Reasoning Benchmark for Business and Finance},
  author={Krumdick, Michael and Koncel-Kedziorski, Rik and Lai, Viet Dac and Reddy, Varshini and Lovering, Charles and Tanner, Chris},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={8309--8332},
  year={2024}
}

@article{reddy2024docfinqa,
  title={Docfinqa: A long-context financial reasoning dataset},
  author={Reddy, Varshini and Koncel-Kedziorski, Rik and Lai, Viet Dac and Krumdick, Michael and Lovering, Charles and Tanner, Chris},
  journal={arXiv preprint arXiv:2401.06915},
  year={2024}
}

## 📊 Structured Schema (Zero-Fabrication)
| Feature Key | Data Type |
| :--- | :--- |
| `Question` | `string` |
| `Reasoning_process` | `string` |
| `Final_response` | `string` |
| `Negative_reasoning_process` | `string` |
| `Negative_response` | `string` |

**Estimated Rows:** `9,186`
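As a rough illustration of how the five string fields might be consumed, the sketch below validates a record against the schema and reshapes it into a prompt/chosen/rejected triple. Treating the `Negative_*` fields as the rejected side of a preference pair is an assumption inferred from the field names, not something the card states; the helper names and the sample record are hypothetical.

```python
# Field names taken from the schema table; every field is a string.
REQUIRED_FIELDS = (
    "Question",
    "Reasoning_process",
    "Final_response",
    "Negative_reasoning_process",
    "Negative_response",
)


def validate_record(record):
    """Return (missing_fields, non_string_fields) for one record."""
    missing = [k for k in REQUIRED_FIELDS if k not in record]
    wrong_type = [
        k for k in REQUIRED_FIELDS if k in record and not isinstance(record[k], str)
    ]
    return missing, wrong_type


def to_preference_pair(record):
    """Reshape a valid record into prompt/chosen/rejected (assumed usage)."""
    missing, wrong_type = validate_record(record)
    if missing or wrong_type:
        raise ValueError(f"missing={missing}, non_string={wrong_type}")
    return {
        "prompt": record["Question"],
        "chosen": record["Reasoning_process"] + "\n\n" + record["Final_response"],
        "rejected": record["Negative_reasoning_process"]
        + "\n\n"
        + record["Negative_response"],
    }


# Invented sample record, for illustration only.
sample = {
    "Question": "Revenue rose from 100 to 120. What is the growth rate?",
    "Reasoning_process": "(120 - 100) / 100 = 0.20",
    "Final_response": "20%",
    "Negative_reasoning_process": "120 / 100 = 1.2",
    "Negative_response": "120%",
}
pair = to_preference_pair(sample)
print(pair["chosen"])
```

Validating before reshaping makes a malformed row fail loudly instead of producing a truncated training example.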

🛡️ Dataset Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: hf-dataset--thefinai--fincot
slug: thefinai--fincot
source: huggingface
author: TheFinAI
license
tags: size_categories:1k<n<10k, format:parquet, modality:text, library:datasets, library:pandas, library:mlcroissant, library:polars, arxiv:2502.08127, arxiv:2109.00122, arxiv:2210.03849, arxiv:2105.07624, arxiv:2405.07938, arxiv:2401.06915, region:us
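The tag line follows a simple `key:value` pattern, so it can be grouped by key to pull out, for example, the referenced arXiv IDs or the supported libraries. This is a small sketch; `parse_tags` is an illustrative helper, not part of any Hugging Face API.

```python
# Tag string copied from the metadata line above.
TAGS = (
    "size_categories:1k<n<10k, format:parquet, modality:text, library:datasets, "
    "library:pandas, library:mlcroissant, library:polars, arxiv:2502.08127, "
    "arxiv:2109.00122, arxiv:2210.03849, arxiv:2105.07624, arxiv:2405.07938, "
    "arxiv:2401.06915, region:us"
)


def parse_tags(tag_string):
    """Group comma-separated key:value tags into a dict of lists."""
    grouped = {}
    for tag in tag_string.split(","):
        # Split on the first colon only, so values keep any later characters.
        key, _, value = tag.strip().partition(":")
        grouped.setdefault(key, []).append(value)
    return grouped


tags = parse_tags(TAGS)
print(tags["library"])  # prints ['datasets', 'pandas', 'mlcroissant', 'polars']
print(len(tags["arxiv"]))  # prints 6
```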

📊 Engagement & Metrics

downloads: 151,052
stars: 15
forks: 0
