📄
Paper

An Empirical Survey of Data Augmentation for Limited Data Learning in NLP

by Independent / Community 013eb12ce5468f79d58bf859653f4929c5a2bd14
Free2AITools Nexus Index
71.2
S: Semantic 50

Query-time baseline · scored live at search

A: Authority 89
P: Popularity 67
R: Recency 100
Q: Quality 65

NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmenta...

High Impact 232 Citations
Paper Information Summary
Entity Passport
Registry ID 013eb12ce5468f79d58bf859653f4929c5a2bd14
License ArXiv
Provider semantic_scholar
📜

Cite this paper

Academic & Research Attribution

BibTeX
@misc{013eb12ce5468f79d58bf859653f4929c5a2bd14,
  author = {Unknown},
  title = {An Empirical Survey of Data Augmentation for Limited Data Learning in NLP},
  year = {2026},
  howpublished = {\url{https://api.semanticscholar.org/013eb12ce5468f79d58bf859653f4929c5a2bd14}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
Unknown. (2026). An Empirical Survey of Data Augmentation for Limited Data Learning in NLP [Paper]. Free2AITools. https://api.semanticscholar.org/013eb12ce5468f79d58bf859653f4929c5a2bd14

🔬 Technical Deep Dive

âš–ī¸ Free2AITools Nexus Index V2.0

💬 Index Insight

FNI V2.0 for An Empirical Survey of Data Augmentation for Limited Data Learning in NLP: Authority (A:89), Popularity (P:67), Recency (R:100), Quality (Q:65). Semantic (S) is a query-time baseline scored live at search.

Abstract & Analysis

NLP has achieved great progress in the past decade through the use of neural models and large labeled datasets. The dependence on abundant data prevents NLP models from being applied to low-resource settings or novel tasks where significant time, money, or expertise is required to label massive amounts of textual data. Recently, data augmentation methods have been explored as a means of improving data efficiency in NLP. To date, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings. In this paper, we provide an empirical survey of recent progress on data augmentation for NLP in the limited labeled data setting, summarizing the landscape of methods (including token-level augmentations, sentence-level augmentations, adversarial augmentations, and hidden-space augmentations) and carrying out experiments on 11 datasets covering topics/news classification, inference tasks, paraphrasing tasks, and single-sentence tasks. Based on the results, we draw several conclusions to help practitioners choose appropriate augmentations in different settings and discuss the current challenges and future directions for limited data learning in NLP.
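
As a rough illustration of the first category the abstract names, token-level augmentation, the following minimal sketch creates perturbed copies of a sentence. The abstract does not list specific operations, so random deletion and random swap are assumptions chosen here as common examples. The function names, parameters, and default values are hypothetical and are not taken from the paper.

```python
import random


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p (illustrative choice).

    Always returns at least one token so the example never becomes empty.
    """
    if len(tokens) <= 1:
        return list(tokens)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def augment(sentence, n_copies=2, p_delete=0.1, n_swaps=1, seed=0):
    """Return 2 * n_copies augmented variants of a whitespace-tokenized sentence.

    The defaults are placeholders for illustration, not tuned or recommended values.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    variants = []
    for _ in range(n_copies):
        variants.append(" ".join(random_deletion(tokens, p_delete, rng)))
        variants.append(" ".join(random_swap(tokens, n_swaps, rng)))
    return variants


if __name__ == "__main__":
    for variant in augment("data augmentation can help when labeled text is scarce"):
        print(variant)
```

In a limited-data setting, each variant would typically inherit the label of the sentence it was derived from. Whether such label-preserving perturbations actually help on a given task is the kind of question the survey's experiments address.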

📦 Data Source: semantic_scholar
🔄 Daily sync (03:00 UTC)

AI Summary: Based on semantic_scholar metadata. Not a recommendation.

đŸ›Ąī¸ Paper Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

source: semantic_scholar
author: Unknown
license: ArXiv
tags: paper, research, academic

âš™ī¸ Technical Specs

architecture
null
params billions
null
context length
null
pipeline tag

📊 Engagement & Metrics

downloads: 0
stars: 0
forks: null
citations: 232

Data indexed from public sources. Updated daily.