Hnm Search Data
| Entity Passport | |
|---|---|
| Registry ID | hf-dataset--rajeev-gupta--hnm-search-data |
| Provider | huggingface |
Cite this dataset
Academic & Research Attribution
@misc{hf_dataset__rajeev_gupta__hnm_search_data,
author = {Rajeev Gupta},
title = {Hnm Search Data Dataset},
year = {2026},
howpublished = {\url{https://huggingface.co/datasets/rajeev-gupta/hnm-search-data}},
note = {Accessed via Free2AITools Knowledge Fortress}
}
Technical Deep Dive
Nexus Index V2.0
Index Insight
FNI V2.0 for Hnm Search Data: Semantic (S:0), Authority (A:0), Popularity (P:0), Recency (R:0), Quality (Q:0).
Verification Authority
Data Preview
Row-level preview not available for this dataset.
Schema structure is shown in the Field Logic panel when available.
Field Logic
Schema not yet indexed for this dataset.
Dataset Specification
HnM Search Dataset Created from Recommendations Dataset
This synthetic dataset is built on top of a recommendations dataset, which is available from:
- https://huggingface.co/datasets/einrafh/hnm-fashion-recommendations-data (Use of this dataset is subject to the terms and conditions set forth on the original distribution page. This dataset is intended for non-commercial and research use.)
- https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/data (DATA ACCESS AND USE: Non-Commercial Purposes & Academic Research.)
The base dataset is a recommendations dataset whose transaction data records the articles purchased by each user. This dataset adds the search queries that a user may have issued before buying an article, along with candidate results for each query. The license for our additions is https://cdla.dev/permissive-2-0/
Search Queries Dataset
| File | Rows | Description |
|---|---|---|
| queries.csv | 253,685 | List of queries for transactions. |
| qrels.csv | 253,685 | List of positive and negative article IDs retrieved for each query. |
Base Dataset
| File | Rows | Description |
|---|---|---|
| articles.csv | 105,542 | List of unique products/articles with their properties/features. |
| customers.csv | 1,371,980 | List of unique customers/users with their properties/features. |
| transactions_train.csv | 31,788,324 | List of historical transactions/purchases of different articles by customers. |
Dataset Structure & Components
All search query data is located in the data/search/ directory.
- data/search/queries.csv: Queries generated from individual transactions (transactions_train.csv). (253,685 rows, 3 columns: query_id, transaction_id, and query_text)
- data/search/qrels.csv: Candidate results for each query: positives (from the transaction) and close negatives (article_ids from articles.csv). (253,685 rows, 3 columns: query_id, positive_ids, and negative_ids (space separated))
All raw (recommendations) data is located in the data/raw/ directory.
- data/raw/transactions_train.csv: A historical record of all purchase transactions. This file serves as a central table connecting customers with the articles they purchased. (31,788,324 rows, 5 columns)
- data/raw/customers.csv: This dimension table contains attributes for each unique customer. (1,371,980 rows, 7 columns)
- data/raw/articles.csv: This dimension table contains highly detailed attributes for each unique product (article). (105,542 rows, 25 columns)
- data/raw/images/: This directory contains product images, organized into subdirectories based on the first 3 digits of the article_id.
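The image layout described above can be sketched as a small path helper. This is an illustration only: the helper name `article_image_path`, the zero-padding of IDs to 10 digits, and the `.jpg` extension are assumptions (they follow the layout commonly seen in the original Kaggle release) and are not confirmed by this card.

```python
from pathlib import PurePosixPath


def article_image_path(article_id, root="data/raw/images", width=10, ext=".jpg"):
    """Return the expected image path for an article (hypothetical helper).

    Assumptions, not stated in the card: IDs are zero-padded to `width`
    digits, and image files are named `<padded id><ext>`.
    """
    # article_id may arrive as an int (leading zeros lost) or as a string.
    padded = str(article_id).zfill(width)
    # The subdirectory is the first 3 digits of the (padded) article_id.
    return PurePosixPath(root) / padded[:3] / f"{padded}{ext}"


print(article_image_path(108775015))
```

If the IDs in your copy of the data are stored as strings with their leading zeros intact, the padding step is a no-op and the same helper still applies.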
Relationships Between Search Data
These files can be combined (joined) to create a comprehensive dataset for analysis:
- query_id joins queries.csv with qrels.csv, giving each textual query together with the articles returned for it.
- transaction_id (from queries.csv) links each query to the details of its transaction in transactions_train.csv.
- positive_ids and negative_ids (from qrels.csv) join with articles.csv to give the details of the result articles, both the positives (which the user purchased) and the negatives.
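A minimal sketch of these joins, using only the Python standard library and tiny in-memory stand-ins for the real files. All sample rows, IDs, and product names below are hypothetical, and the negatives column is assumed to be named `negative_ids` as in the schema table. The join to transactions_train.csv is left out because this card does not describe how transaction_id appears in that file.

```python
import csv
import io

# Hypothetical sample rows standing in for data/search/*.csv and data/raw/articles.csv.
QUERIES_CSV = """query_id,transaction_id,query_text
q1,t1,black strap top
q2,t2,striped shirt
"""
QRELS_CSV = """query_id,positive_ids,negative_ids
q1,1001,1002 1003
q2,1003,1001
"""
ARTICLES_CSV = """article_id,prod_name
1001,Example top A
1002,Example top B
1003,Example shirt
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_search_data(queries, qrels, articles):
    """Attach positive and negative article names to each query via query_id."""
    qrels_by_query = {row["query_id"]: row for row in qrels}
    names = {row["article_id"]: row["prod_name"] for row in articles}
    joined = []
    for query in queries:
        rel = qrels_by_query.get(query["query_id"])
        if rel is None:
            continue  # query without judgments; skipped in this sketch
        joined.append({
            "query_id": query["query_id"],
            "query_text": query["query_text"],
            # IDs are space separated, so split() handles one or many.
            "positives": [(a, names.get(a)) for a in rel["positive_ids"].split()],
            "negatives": [(a, names.get(a)) for a in rel["negative_ids"].split()],
        })
    return joined


joined = join_search_data(
    read_rows(QUERIES_CSV), read_rows(QRELS_CSV), read_rows(ARTICLES_CSV)
)
for row in joined:
    print(row["query_text"], row["positives"], row["negatives"])
```

For the full files (tens of millions of transaction rows), a dataframe library or a database would likely be a better fit than plain dictionaries; the join keys stay the same.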
Data Schema
The schemas for transactions_train.csv, customers.csv, and articles.csv are available at https://huggingface.co/datasets/einrafh/hnm-fashion-recommendations-data.
Here is the schema for the search data.
`queries.csv`
| column | Description | Type |
|---|---|---|
| query_id | Unique ID for the query (Primary Key) | object (String) |
| transaction_id | Unique ID for the transaction (Foreign Key) | object (String) |
| query_text | Text of the query | object (String) |
`qrels.csv`
| column | Description | Type |
|---|---|---|
| query_id | ID for the query (Foreign Key) | object (String) |
| positive_ids | ID for the positive result (Foreign Key) which the user clicked/purchased | object (String) |
| negative_ids | Space-separated list of IDs for the negative results (Foreign Key) which the user didn't click/purchase | object (String) |
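Because both ID columns are stored as strings, a common first step is to expand each qrels row into one labeled record per article. The sketch below assumes a label of 1 for positives and 0 for negatives; that labeling convention, the function name `qrels_to_triples`, and the sample rows are illustrative choices, not part of the dataset.

```python
import csv
import io

# Hypothetical rows in the qrels.csv layout described above.
QRELS_CSV = """query_id,positive_ids,negative_ids
q1,1001,1002 1003
q2,2001,
"""


def qrels_to_triples(rows):
    """Yield (query_id, article_id, label): 1 for positives, 0 for negatives."""
    for row in rows:
        for article_id in row["positive_ids"].split():
            yield row["query_id"], article_id, 1
        # An empty negative_ids field splits to an empty list, so it adds nothing.
        for article_id in row["negative_ids"].split():
            yield row["query_id"], article_id, 0


triples = list(qrels_to_triples(csv.DictReader(io.StringIO(QRELS_CSV))))
for triple in triples:
    print(triple)
```

The flat triples are convenient as input for ranking or classification training, where each (query, article) pair needs its own label.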
Source
The base dataset is provided to the public by H&M Group through the Kaggle platform for analysis and research purposes. We have added search queries over the base dataset.
- Platform: Kaggle, H&M Personalized Fashion Recommendations
License
The use of this dataset is subject to the terms and conditions stated on its original distribution page. This dataset is intended for non-commercial and research purposes.
Structured Schema (Zero-Fabrication)
| Feature Key | Data Type |
|---|---|
| article_id | int64 |
| product_code | int64 |
| prod_name | string |
| product_type_no | int64 |
| product_type_name | string |
| product_group_name | string |
| graphical_appearance_no | int64 |
| graphical_appearance_name | string |
| colour_group_code | int64 |
| colour_group_name | string |
| perceived_colour_value_id | int64 |
| perceived_colour_value_name | string |
| perceived_colour_master_id | int64 |
| perceived_colour_master_name | string |
| department_no | int64 |
| department_name | string |
| index_code | string |
| index_name | string |
| index_group_no | int64 |
| index_group_name | string |
| section_no | int64 |
| section_name | string |
| garment_group_no | int64 |
| garment_group_name | string |
| detail_desc | string |
Estimated Rows: 105,542
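As a rough illustration of how this schema might be applied when reading the articles table with the standard `csv` module (which returns every field as a string), the sketch below converts the int64-typed columns and leaves the rest untouched. The function name `coerce_article`, the handling of empty fields as `None`, and the sample row are assumptions made for the example.

```python
import csv
import io

# Columns typed int64 in the schema above; every other column stays a string.
INT_COLUMNS = {
    "article_id", "product_code", "product_type_no", "graphical_appearance_no",
    "colour_group_code", "perceived_colour_value_id", "perceived_colour_master_id",
    "department_no", "index_group_no", "section_no", "garment_group_no",
}


def coerce_article(row):
    """Convert int64-typed fields to int; leave other fields as strings."""
    typed = {}
    for key, value in row.items():
        if key in INT_COLUMNS:
            # Assumption: an empty field is treated as missing (None).
            typed[key] = int(value) if value != "" else None
        else:
            typed[key] = value
    return typed


# Hypothetical three-column excerpt of articles.csv.
SAMPLE = "article_id,prod_name,department_no\n0000001001,Example top,12\n"
articles = [coerce_article(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
print(articles[0])
```

Note that parsing article_id as an integer discards any leading zeros, which matters if the ID is later used as a string key or to build file paths.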
Social Proof
AI Summary: Based on Hugging Face metadata. Not a recommendation.
Dataset Transparency Report
Verified data manifest for traceability and transparency.
Identity & Source
- id: hf-dataset--rajeev-gupta--hnm-search-data
- slug: rajeev-gupta--hnm-search-data
- source: huggingface
- author: Rajeev Gupta
- license:
- tags: task_categories:text-ranking, task_categories:text-retrieval, task_categories:text-classification, language:en, size_categories:10m
Technical Specs
- architecture: null
- params billions: null
- context length: null
- pipeline tag:
Engagement & Metrics
- downloads: 166,093
- stars: 0
- forks: 0