Dataset

Quovadis Speakeasy

Name: Quovadis Speakeasy
Creator: wy777
License: Apache-2.0

Nexus Index

33.5 Top 100%

S: Semantic 50

A: Authority 0

P: Popularity 57

R: Recency 53

Q: Quality 30

Dataset Information Summary
Entity Passport
Registry ID	hf-dataset--wy777--quovadis-speakeasy
License	Apache-2.0
Provider	huggingface

Cite this dataset

Academic & Research Attribution

BibTeX

@misc{hf_dataset__wy777__quovadis_speakeasy,
  author = {wy777},
  title = {Quovadis Speakeasy Dataset},
  year = {2026},
  howpublished = {\url{https://huggingface.co/datasets/wy777/quovadis-speakeasy}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

wy777. (2026). Quovadis Speakeasy [Dataset]. Free2AITools. https://huggingface.co/datasets/wy777/quovadis-speakeasy

Downloads

92,520

Dataset Specification

About Dataset

Citation

This dataset was created and further refined as part of the following two publications:

- "Quo Vadis: Hybrid Machine Learning Meta-Model Based on Contextual and Behavioral Malware Representations", Trizna, 2022, https://dl.acm.org/doi/10.1145/3560830.3563726
- "Nebula: Self-Attention for Dynamic Malware Analysis", Trizna et al., 2024, https://ieeexplore.ieee.org/document/10551436

If you use it in your research, please cite:

BibTeX

@inproceedings{quovadis,
  author = {Trizna, Dmitrijs},
  title = {Quo Vadis: Hybrid Machine Learning Meta-Model Based on Contextual and Behavioral Malware Representations},
  year = {2022},
  isbn = {9781450398800},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3560830.3563726},
  doi = {10.1145/3560830.3563726},
  booktitle = {Proceedings of the 15th ACM Workshop on Artificial Intelligence and Security},
  pages = {127--136},
  numpages = {10},
  keywords = {reverse engineering, neural networks, malware, emulation, convolutions},
  location = {Los Angeles, CA, USA},
  series = {AISec'22}
}

@article{nebula,
  author = {Trizna, Dmitrijs and Demetrio, Luca and Biggio, Battista and Roli, Fabio},
  journal = {IEEE Transactions on Information Forensics and Security},
  title = {Nebula: Self-Attention for Dynamic Malware Analysis},
  year = {2024},
  volume = {19},
  pages = {6155--6167},
  keywords = {Malware;Feature extraction;Data models;Analytical models;Long short term memory;Task analysis;Encoding;Malware;transformers;dynamic analysis;convolutional neural networks},
  doi = {10.1109/TIFS.2024.3409083}
}

arXiv versions of the two papers: arxiv.org/abs/2208.12248 (Quo Vadis) and arxiv.org/abs/2310.10664 (Nebula).

Description

This dataset contains behavioral reports obtained with the Speakeasy emulator from 93533 32-bit Portable Executable (PE) files.

It is a complementary dataset to https://huggingface.co/datasets/dtrizna/quovadis-ember, which contains static EMBER features of the same samples.

To reflect concept drift in malware, the training and test sets were collected at different times:

- Training set: 76126 files, collected in Jan 2022.
- Test set: 17407 files, collected in Apr 2022.
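
The split described above is chronological rather than random: every training file predates every test file. A minimal sketch of that idea follows. The `Sample` record, its `collected` field, the placeholder names and the 1 April 2022 cutoff are illustrative assumptions and are not part of the dataset.

```python
from dataclasses import dataclass
from datetime import date

# Assumed cutoff: any date between the two collection periods
# (Jan 2022 and Apr 2022) would separate them equally well.
CUTOFF = date(2022, 4, 1)


@dataclass(frozen=True)
class Sample:
    """Hypothetical record: an identifier plus its collection date."""
    name: str
    collected: date


def temporal_split(samples, cutoff=CUTOFF):
    """Put samples collected before `cutoff` in train, the rest in test."""
    train = [s for s in samples if s.collected < cutoff]
    test = [s for s in samples if s.collected >= cutoff]
    return train, test


samples = [
    Sample("sample_a", date(2022, 1, 10)),
    Sample("sample_b", date(2022, 4, 12)),
    Sample("sample_c", date(2022, 1, 25)),
]
train, test = temporal_split(samples)
print([s.name for s in train])  # ['sample_a', 'sample_c']
print([s.name for s in test])   # ['sample_b']
```

Because the test files are newer than anything seen in training, a model scored this way is measured on how well it generalizes forward in time, which a random shuffle would hide.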

Labels

Files in report_clean and report_windows_syswow64 are clean (benign). All other folders contain malware, distributed over 7 families. The number of files in each folder:

Training set
- report_backdoor : 11062
- report_clean : 24434
- report_coinminer : 6891
- report_dropper : 8243
- report_keylogger : 4378
- report_ransomware : 9627
- report_rat : 1697
- report_trojan : 8733
- report_windows_syswow64 : 236
Test set
- report_backdoor : 1940
- report_clean : 7944
- report_coinminer : 1684
- report_dropper : 252
- report_keylogger : 1041
- report_ransomware : 2139
- report_rat : 1258
- report_trojan : 1085
- report_windows_syswow64 : 59
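
The folder-to-label rule above can be sketched in a few lines of Python. The folder names and counts are copied from the lists above. The helper names (`label_for_folder`, `label_for_path`, `summarize`) and the example path `report_rat/sample.json` are hypothetical, since the source does not describe the file layout inside each folder.

```python
from pathlib import PurePosixPath

BENIGN_FOLDERS = {"report_clean", "report_windows_syswow64"}
MALWARE_FAMILIES = {
    "backdoor", "coinminer", "dropper", "keylogger",
    "ransomware", "rat", "trojan",
}

TRAIN_COUNTS = {
    "report_backdoor": 11062, "report_clean": 24434,
    "report_coinminer": 6891, "report_dropper": 8243,
    "report_keylogger": 4378, "report_ransomware": 9627,
    "report_rat": 1697, "report_trojan": 8733,
    "report_windows_syswow64": 236,
}
TEST_COUNTS = {
    "report_backdoor": 1940, "report_clean": 7944,
    "report_coinminer": 1684, "report_dropper": 252,
    "report_keylogger": 1041, "report_ransomware": 2139,
    "report_rat": 1258, "report_trojan": 1085,
    "report_windows_syswow64": 59,
}


def label_for_folder(folder):
    """Return (binary_label, class_name); 0 = benign, 1 = malware."""
    if folder in BENIGN_FOLDERS:
        return 0, "clean"
    family = folder[len("report_"):]
    if folder.startswith("report_") and family in MALWARE_FAMILIES:
        return 1, family
    raise ValueError(f"unknown report folder: {folder!r}")


def label_for_path(path):
    """Label a report by its parent folder (assumed layout: folder/file)."""
    return label_for_folder(PurePosixPath(path).parent.name)


def summarize(counts):
    """Total, benign and malware file counts for one split."""
    total = sum(counts.values())
    benign = sum(n for f, n in counts.items() if label_for_folder(f)[0] == 0)
    return {"total": total, "benign": benign, "malware": total - benign}


print(label_for_path("report_rat/sample.json"))  # (1, 'rat')
print(summarize(TRAIN_COUNTS))  # {'total': 75301, 'benign': 24670, 'malware': 50631}
print(summarize(TEST_COUNTS))   # {'total': 17402, 'benign': 8003, 'malware': 9399}
```

The per-folder counts sum to 75301 (training) and 17402 (test), slightly below the 76126 and 17407 files quoted in the description. The source does not explain the difference.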

Open Metadata

🆔 Identity & Source

id: hf-dataset--wy777--quovadis-speakeasy
slug: wy777--quovadis-speakeasy
source: huggingface
author: wy777
license: Apache-2.0
tags: license:apache-2.0, arxiv:2310.10664, arxiv:2208.12248, region:us

📊 Engagement & Metrics

downloads: 92,520
stars: 0
forks: 0
