Dummy G1
| Entity Passport | |
| Registry ID | hf-dataset--sihatafnan--dummy_g1 |
| Provider | huggingface |
Cite this dataset
Academic & Research Attribution
@misc{hf_dataset__sihatafnan__dummy_g1,
author = {sihatafnan},
title = {Dummy G1 Dataset},
year = {2026},
howpublished = {\url{https://huggingface.co/datasets/sihatafnan/dummy_g1}},
note = {Accessed via Free2AITools Knowledge Fortress}
}
Technical Deep Dive
Free2AITools Nexus Index V2.0
Index Insight
FNI V2.0 for Dummy G1: Semantic (S:50), Authority (A:62), Popularity (P:52), Recency (R:95), Quality (Q:50).
Dataset Specification
Anonymized G1 Trajectories
About BONES-SEED
This release derives from BONES-SEED (Skeletal Everyday Embodiment Dataset), a large multi-actor motion-capture corpus collected for studying humanoid robot teleoperation. Roughly 522 operators contribute ~142K motion clips sampled at 120 fps, covering locomotion, communication, dance, everyday actions, sport, gaming, and interactions. Each operator is annotated with biometric attributes (height, weight, age, gender) plus per-segment bone lengths.
In BONES-SEED, every clip ships in three parallel formats so the same motion can be studied at different levels of body-shape disclosure:
- SOMA Proportional BVH: original mocap on each operator's true bone lengths.
- SOMA Uniform BVH: the same motion retargeted onto a single shared skeleton (body shape stripped).
- G1 CSV: the same motion retargeted again to the Unitree G1 humanoid robot, as 29-DOF joint angles.
Clip identifiers follow {motion_name}__A{actor_uid}[__M].csv, where the
optional __M suffix denotes the mirrored variant. Clips are bucketed by
capture date (YYMMDD).
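The naming convention above can be split into its parts with a regular expression. The following is a minimal sketch, not code shipped with the dataset: the `ClipId` type and `parse_clip_name` function are hypothetical names, the example file names are made up, and the pattern assumes the actor UID itself contains no `__M` sequence.

```python
import re
from typing import NamedTuple

# Pattern for {motion_name}__A{actor_uid}[__M].csv.
# Assumption: the last "__A" in the name starts the actor UID, and the
# actor UID does not itself contain "__M".
CLIP_RE = re.compile(r"^(?P<motion>.+)__A(?P<actor>.+?)(?P<mirror>__M)?\.csv$")


class ClipId(NamedTuple):
    motion_name: str
    actor_uid: str
    mirrored: bool


def parse_clip_name(filename: str) -> ClipId:
    """Split a clip file name into motion name, actor UID, and mirror flag."""
    match = CLIP_RE.match(filename)
    if match is None:
        raise ValueError(f"not a clip file name: {filename!r}")
    return ClipId(match["motion"], match["actor"], match["mirror"] is not None)


# Hypothetical file names, for illustration only.
print(parse_clip_name("walk_forward__A012.csv"))
print(parse_clip_name("walk_forward__A012__M.csv"))
```

Grouping parsed clips by `actor_uid` is one way to build per-operator splits when evaluating re-identification.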
UNVEIL's central finding is that even after both retargeting steps strip away the operator's body shape, the G1 joint-angle stream still carries enough operator-specific dynamics (velocity profiles, ranges of motion, coordination rhythms) to re-identify the original operator and recover their height, weight, age, and gender. Our paper proposes an operator-aware anonymizer that closes this leak; this repository is the result of applying it to every G1 clip.
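To make two of the cues named above concrete, the sketch below computes a finite-difference velocity profile and a range of motion for a single joint-angle series. These are illustrative descriptors only; the source does not state which features UNVEIL actually uses, and the function names are hypothetical. The 120 fps rate and degree units come from the dataset description.

```python
FPS = 120  # sampling rate stated for the clips


def joint_velocity(angles_deg):
    """Finite-difference angular velocity (degrees per second) for one joint."""
    return [(b - a) * FPS for a, b in zip(angles_deg, angles_deg[1:])]


def range_of_motion(angles_deg):
    """Span between the smallest and largest angle (degrees) in the clip."""
    return max(angles_deg) - min(angles_deg)


# Made-up three-frame angle series for a single joint.
angles = [0.0, 1.0, 3.0]
print(joint_velocity(angles))
print(range_of_motion(angles))
```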
What's in this repository
For every clip in BONES-SEED, this repository ships the G1-retargeted CSV
after anonymization. The folder layout mirrors the source's
g1/csv/{date}/{motion}__A{actor}[__M].csv convention, so paths line up
one-to-one with the BONES-SEED G1 split.
csv/
└── {date}/
    └── {motion}__A{actor}[__M].csv
CSV columns: Frame, root_translate{X,Y,Z}, root_rotate{X,Y,Z}, <29 joint DOFs>
in centimetres and degrees, sampled at 120 fps. Joint columns cover the
G1's hip / knee / ankle, waist, and shoulder / elbow / wrist degrees of
freedom.
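A minimal sketch of locating and loading one clip, using only the Python standard library. The helper names (`clip_path`, `read_clip`, `duration_seconds`) are hypothetical, and the exact header spellings in the real files are not confirmed here: the demo uses placeholder joint names and only checks the column count implied by the description (1 frame index, 3 root translations, 3 root rotations, 29 joint DOFs).

```python
import csv
import io

FPS = 120                    # sampling rate stated in the description
N_COLUMNS = 1 + 3 + 3 + 29   # Frame + root translation + root rotation + joint DOFs


def clip_path(date, motion, actor, mirrored=False):
    """Build a repository-relative path: csv/{date}/{motion}__A{actor}[__M].csv."""
    suffix = "__M" if mirrored else ""
    return f"csv/{date}/{motion}__A{actor}{suffix}.csv"


def read_clip(handle):
    """Return (header, rows), with every cell parsed as a float."""
    reader = csv.reader(handle)
    header = next(reader)
    if len(header) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} columns, found {len(header)}")
    rows = [[float(value) for value in row] for row in reader if row]
    return header, rows


def duration_seconds(rows):
    """Clip length in seconds at the stated sampling rate."""
    return len(rows) / FPS


# Synthetic clip with placeholder joint names (not the real column names).
demo_header = (
    ["Frame"]
    + [f"root_translate{axis}" for axis in "XYZ"]
    + [f"root_rotate{axis}" for axis in "XYZ"]
    + [f"joint_{i}" for i in range(29)]
)
demo_lines = [",".join(demo_header)]
demo_lines += [",".join([str(frame)] + ["0.0"] * 35) for frame in range(240)]
header, rows = read_clip(io.StringIO("\n".join(demo_lines) + "\n"))
print(len(header), len(rows), duration_seconds(rows))
```

For a real file, open it with `open(path, newline="")` and pass the handle to `read_clip`.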
How the trajectories were anonymized (high level)
We train an encoder-decoder defence with two parallel encoders and a shared decoder:
- A motion encoder captures what the operator is doing: the task content of the trajectory.
- An operator encoder captures how they are doing it: the velocity profiles, ranges of motion, and coordination rhythms that UNVEIL identifies as the source of biometric leakage.
- A decoder reconstructs a trajectory from a pair of latent codes (motion + operator).
The encoders are trained with self- and cross-reconstruction objectives, triplet losses (motion code groups by action, operator code groups by operator), a cooperative classifier on the operator code, and an adversarial classifier on the motion code so that the motion code carries no information about who the operator is.
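As an illustration of the triplet term only, here is the standard margin-based triplet loss in plain Python. The source does not give the exact formulation, distance, or margin used in training, so the Euclidean distance and `margin=1.0` below are assumptions. For the motion code, the positive would share the anchor's action; for the operator code, it would share the anchor's operator.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length latent codes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Made-up 2-D codes: the positive is close to the anchor, the negative is far.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```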
At inference, each clip is encoded into its (motion, operator) pair, the operator code is swapped with that of a deliberately dissimilar operator (chosen to maximise distance in biometric attributes: height, weight, age, gender), and the decoder re-renders the trajectory. The result preserves the action so downstream task learning still works, but destroys the operator-specific dynamics that allowed UNVEIL to re-identify the source operator.
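The selection of a dissimilar operator could be sketched as follows. This is one plausible reading, not the paper's procedure: the source does not say how the attributes are scaled or combined, so min-max scaling, Euclidean distance, and a numeric 0/1 gender encoding are all assumptions, and the operator IDs and values are made up.

```python
import math


def most_dissimilar_operator(source_id, operators):
    """Return the operator ID farthest from `source_id` in scaled attribute space.

    `operators` maps an ID to a tuple (height, weight, age, gender), with
    gender encoded numerically (an assumption of this sketch).
    """
    ids = list(operators)
    dims = range(len(operators[source_id]))
    lows = [min(operators[i][d] for i in ids) for d in dims]
    highs = [max(operators[i][d] for i in ids) for d in dims]

    def scaled(op_id):
        # Min-max scale each attribute to [0, 1] so no single unit dominates.
        return [
            (operators[op_id][d] - lows[d]) / (highs[d] - lows[d])
            if highs[d] > lows[d] else 0.0
            for d in dims
        ]

    candidates = [i for i in ids if i != source_id]
    if not candidates:
        raise ValueError("need at least one other operator to swap with")
    source = scaled(source_id)
    return max(candidates, key=lambda i: math.dist(source, scaled(i)))


# Hypothetical operators: (height cm, weight kg, age, gender code).
operators = {
    "A1": (160, 50, 22, 0),
    "A2": (165, 55, 25, 0),
    "A3": (190, 95, 60, 1),
}
print(most_dissimilar_operator("A1", operators))
```

The chosen operator's code would then replace the clip's own operator code before decoding.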
Intended use
This release is intended for studying the trade-off between operator privacy and downstream utility on humanoid robot demonstration data. It is not a drop-in replacement for the original BONES-SEED G1 trajectories.
Citation
@inproceedings{anonymous2026unveil,
title = {Inverting Retargeting: Humanoid Datasets Remember Their Operators},
author = {Anonymous},
booktitle = {Anonymous Submission to NeurIPS},
year = {2026}
}
Dataset Transparency Report
Technical metadata sourced from upstream repositories.
Identity & Source
- id: hf-dataset--sihatafnan--dummy_g1
- slug: sihatafnan--dummy_g1
- source: huggingface
- author: sihatafnan
- license: (not listed)
- tags: region:us
Technical Specs
- architecture: null
- params billions: null
- context length: null
- pipeline tag: (not listed)
Engagement & Metrics
- downloads: 36,304
- stars: 0
- forks: null
Data indexed from public sources. Updated daily.