Dataset

NDL Core – Structured Data

by theodi
Entity Passport
Registry ID hf-dataset--theodi--ndl-core-structured-data
Provider huggingface

Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__theodi__ndl_core_structured_data,
  author = {theodi},
  title = {Ndl Core Structured Data Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/theodi/ndl-core-structured-data}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
theodi. (2026). Ndl Core Structured Data [Dataset]. Free2AITools. https://huggingface.co/datasets/theodi/ndl-core-structured-data

Downloads
81,965


Dataset Specification

NDL Core – Structured Data

Overview

NDL Core – Structured Data is a curated collection of structured UK public sector datasets, converted into Apache Parquet format for efficient analytics and machine learning workflows.

This repository is part of the broader NDL Core Corpus, which combines both textual and structured data sourced from authoritative UK government and public sector platforms. Textual sources (e.g. GOV.UK, Hansard, legislation.gov.uk) are hosted separately, while this repository focuses exclusively on tabular / structured datasets.

The goal of this dataset is to provide a clean, analysis-ready foundation for research, policy analysis, data science, and downstream AI applications.


Data Sources

The structured data in this repository has been collected from the following UK public sector sources:

1. data.gov.uk

  • Top 10 most recent datasets per category at the time of crawling
  • Covers a wide range of domains including transport, environment, health, education, and government operations
  • Original formats varied (CSV, XLSX, JSON), all normalized and converted to Parquet
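The "top 10 most recent per category" selection described above can be sketched as follows. The crawler itself is not published here, so this is only an illustration under stated assumptions: the record fields `category`, `title`, and `modified`, the helper name `most_recent_per_category`, and the sample entries are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def most_recent_per_category(records, n=10):
    """Return up to n most recently modified records for each category."""
    by_category = defaultdict(list)
    for record in records:
        by_category[record["category"]].append(record)
    # Sort each group newest-first and keep the first n entries.
    return {
        category: sorted(items, key=lambda r: r["modified"], reverse=True)[:n]
        for category, items in by_category.items()
    }


# Hypothetical catalogue entries; real data.gov.uk metadata will differ.
catalogue = [
    {"category": "transport", "title": "Bus stops", "modified": date(2024, 5, 1)},
    {"category": "transport", "title": "Road traffic counts", "modified": date(2024, 6, 12)},
    {"category": "health", "title": "GP practice list", "modified": date(2024, 3, 20)},
]

selected = most_recent_per_category(catalogue, n=1)
print([r["title"] for r in selected["transport"]])  # ['Road traffic counts']
```

Because the selection depends on the modification dates at crawl time, rerunning the same logic later would generally pick a different set of datasets.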

2. Office for National Statistics (ONS)

  • Official UK statistics on population, economy, labour market, health, and more
  • Includes high-value national datasets frequently used in research and policymaking

3. Defra (Department for Environment, Food & Rural Affairs)

  • Environmental, agricultural, and rural data
  • Includes datasets related to land use, climate, farming, and natural resources

Intended Use Cases

  • Policy analysis and evaluation
  • Socioeconomic and environmental research
  • Public sector analytics
  • Feature engineering for ML models
  • Retrieval-augmented generation (RAG) pipelines
  • Data integration with textual government corpora

Relationship to NDL Core Corpus

This repository contains only structured data. Related textual datasets (government guidance, parliamentary debates, legislation) are hosted separately in the NDL Core Corpus repository.

Together, these datasets enable hybrid structured + unstructured analysis across UK public sector information.


Limitations

  • Data reflects the state of the source datasets at crawl time
  • Some datasets may be incomplete, deprecated, or superseded upstream
  • Schema consistency varies across sources
  • No guarantee of real-time updates
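Since schema consistency varies across sources, it can help to compare column names and types before combining tables. The following is a minimal sketch using plain dictionaries; the helper `compare_schemas`, the table names, and the column names are hypothetical and are not taken from the actual files in this repository.

```python
def compare_schemas(schemas):
    """Compare column layouts across tables.

    `schemas` maps a table name to a {column_name: type_name} dict.
    Returns (shared, conflicting, extras):
      shared      - columns present in every table
      conflicting - shared columns whose declared types differ
      extras      - per table, the columns not shared by all tables
    """
    column_sets = [set(columns) for columns in schemas.values()]
    shared = set.intersection(*column_sets) if column_sets else set()
    conflicting = {
        name
        for name in shared
        if len({columns[name] for columns in schemas.values()}) > 1
    }
    extras = {table: set(columns) - shared for table, columns in schemas.items()}
    return shared, conflicting, extras


# Hypothetical schemas for illustration only.
schemas = {
    "ons_population": {"area_code": "string", "year": "int64", "population": "int64"},
    "defra_land_use": {"area_code": "string", "year": "string", "hectares": "double"},
}

shared, conflicting, extras = compare_schemas(schemas)
print(sorted(shared))       # ['area_code', 'year']
print(sorted(conflicting))  # ['year']
```

In practice the column-to-type mapping could be read from each Parquet file's metadata (for example with `pyarrow.parquet.read_schema`) rather than written by hand; that step is left out here to keep the sketch dependency-free.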

Identity & Source

id: hf-dataset--theodi--ndl-core-structured-data
slug: theodi--ndl-core-structured-data
source: huggingface
author: theodi
license: (not listed)
tags: size_categories:100m

Engagement & Metrics

downloads: 81,965
stars: 0
forks: 0