Quovadis Speakeasy

by wy777


Cite this dataset

Academic & Research Attribution

BibTeX
@misc{hf_dataset__wy777__quovadis_speakeasy,
  author = {wy777},
  title = {Quovadis Speakeasy Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/wy777/quovadis-speakeasy}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
wy777. (2026). Quovadis Speakeasy [Dataset]. Free2AITools. https://huggingface.co/datasets/wy777/quovadis-speakeasy

Dataset Specification


license: apache-2.0

About Dataset

Citation

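This dataset was created and further refined as part of the following two publications:

  • "Quo Vadis: Hybrid Machine Learning Meta-Model Based on Contextual and Behavioral Malware Representations", Trizna et al., 2022, https://dl.acm.org/doi/10.1145/3560830.3563726
  • "Nebula: Self-Attention for Dynamic Malware Analysis", Trizna et al., 2024, https://ieeexplore.ieee.org/document/10551436

If you use it in your research, please cite us: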

@inproceedings{quovadis,
author = {Trizna, Dmitrijs},
title = {Quo Vadis: Hybrid Machine Learning Meta-Model Based on Contextual and Behavioral Malware Representations},
year = {2022},
isbn = {9781450398800},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3560830.3563726},
doi = {10.1145/3560830.3563726},
booktitle = {Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security},
pages = {127–136},
numpages = {10},
keywords = {reverse engineering, neural networks, malware, emulation, convolutions},
location = {Los Angeles, CA, USA},
series = {AISec'22}
}
@ARTICLE{nebula,
  author={Trizna, Dmitrijs and Demetrio, Luca and Biggio, Battista and Roli, Fabio},
  journal={IEEE Transactions on Information Forensics and Security}, 
  title={Nebula: Self-Attention for Dynamic Malware Analysis}, 
  year={2024},
  volume={19},
  number={},
  pages={6155-6167},
  keywords={Malware;Feature extraction;Data models;Analytical models;Long short term memory;Task analysis;Encoding;Malware;transformers;dynamic analysis;convolutional neural networks},
  doi={10.1109/TIFS.2024.3409083}}

arXiv versions of both papers: arxiv.org/abs/2208.12248 (Quo Vadis) and arxiv.org/abs/2310.10664 (Nebula).

Description

This dataset contains behavioral reports obtained with the Speakeasy emulator from 93533 32-bit portable executable (PE) files.

It is a complementary dataset to https://huggingface.co/datasets/dtrizna/quovadis-ember, which contains static EMBER features of the same samples.

To reflect concept drift in malware, the training and test sets were collected three months apart:

  • 76126 files that form the training set were collected in January 2022.
  • 17407 files that form the test set were collected in April 2022.
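
The page does not describe how the reports are packaged, so a reasonable first step is to fetch a local copy and take stock of what it contains. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the repository ID matches the dataset URL on this page; the helper names `download_dataset` and `inventory` are hypothetical and not part of any library.

```python
from collections import Counter
from pathlib import Path

# Assumption: repository ID taken from the dataset URL on this page.
REPO_ID = "wy777/quovadis-speakeasy"


def download_dataset(local_dir: str) -> Path:
    """Fetch a snapshot of the dataset repository (requires network access)."""
    # Imported lazily so inventory() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        local_dir=local_dir,
    )
    return Path(path)


def inventory(root) -> Counter:
    """Count files under `root` by extension to see how the reports are stored."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "<none>".
            counts[path.suffix.lower() or "<none>"] += 1
    return counts
```

Calling `inventory(download_dataset("quovadis-speakeasy"))` would then show whether the copy holds individual report files, archives, or Parquet shards, which determines how the reports should be read.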

Labels

Files located in report_clean and report_windows_syswow64 are clean (benign). All other folders contain malware, distributed over 7 families. The number of files in each folder:

  • Training set

    • report_backdoor : 11062
    • report_clean : 24434
    • report_coinminer : 6891
    • report_dropper : 8243
    • report_keylogger : 4378
    • report_ransomware : 9627
    • report_rat : 1697
    • report_trojan : 8733
    • report_windows_syswow64 : 236
  • Test set

    • report_backdoor : 1940
    • report_clean : 7944
    • report_coinminer : 1684
    • report_dropper : 252
    • report_keylogger : 1041
    • report_ransomware : 2139
    • report_rat : 1258
    • report_trojan : 1085
    • report_windows_syswow64 : 59
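
The folder-to-label rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the per-folder counts are copied from the list above, and mapping both benign folders to a single "clean" class is an assumption.

```python
# Folders whose reports are benign, per the Labels section.
BENIGN_FOLDERS = {"report_clean", "report_windows_syswow64"}

# Per-folder file counts as listed above.
TRAIN_COUNTS = {
    "report_backdoor": 11062,
    "report_clean": 24434,
    "report_coinminer": 6891,
    "report_dropper": 8243,
    "report_keylogger": 4378,
    "report_ransomware": 9627,
    "report_rat": 1697,
    "report_trojan": 8733,
    "report_windows_syswow64": 236,
}
TEST_COUNTS = {
    "report_backdoor": 1940,
    "report_clean": 7944,
    "report_coinminer": 1684,
    "report_dropper": 252,
    "report_keylogger": 1041,
    "report_ransomware": 2139,
    "report_rat": 1258,
    "report_trojan": 1085,
    "report_windows_syswow64": 59,
}


def binary_label(folder: str) -> int:
    """Return 0 for benign folders and 1 for malware folders."""
    return 0 if folder in BENIGN_FOLDERS else 1


def family_label(folder: str) -> str:
    """Return the malware family, or "clean" for either benign folder (assumption)."""
    if folder in BENIGN_FOLDERS:
        return "clean"
    prefix = "report_"
    return folder[len(prefix):] if folder.startswith(prefix) else folder


def summarize(counts: dict) -> dict:
    """Aggregate per-folder counts into benign, malware, and total file counts."""
    benign = sum(n for folder, n in counts.items() if binary_label(folder) == 0)
    total = sum(counts.values())
    return {"benign": benign, "malware": total - benign, "total": total}


if __name__ == "__main__":
    for split, counts in (("train", TRAIN_COUNTS), ("test", TEST_COUNTS)):
        print(split, summarize(counts))
```

With the counts as listed, this gives 24670 benign and 50631 malware files for training (75301 in total) and 8003 benign and 9399 malware files for testing (17402 in total). These sums are slightly below the 76126 and 17407 files stated in the description; the source does not explain the difference.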