layoutlm-document-qa
"This is a fine-tuned version of the multi-modal Layou......"
Quick Commands
huggingface-cli download impira/layoutlm-document-qa
pip install -U transformers

Engineering Specs
Est. VRAM Benchmark: ~2.6 GB*
* Technical estimation for FP16/Q4 weights. Does not include OS overhead or long-context batching. For Technical Reference Only.
Hardware Compatibility
Multi-Tier Validation Matrix
RTX 3060 / 4060 Ti
RTX 4070 Super
RTX 4080 / Mac M3
RTX 3090 / 4090
RTX 6000 Ada
A100 / H100
Pro Tip: Compatibility is estimated for 4-bit quantization (Q4). High-precision (FP16) or ultra-long context windows will significantly increase VRAM requirements.
README
LayoutLM for Visual Question Answering
This is a fine-tuned version of the multi-modal LayoutLM model for the task of question answering on documents. It has been fine-tuned using both the SQuAD2.0 and DocVQA datasets.
Getting started with the model
To run these examples, you must have PIL, pytesseract, and PyTorch installed in addition to transformers.
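One way to set up those dependencies is shown below (a sketch; these are the usual PyPI package names, and the Tesseract OCR binary itself must be installed separately through your system's package manager):

pip install pillow pytesseract torch transformers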
from transformers import pipeline

nlp = pipeline(
    "document-question-answering",
    model="impira/layoutlm-document-qa",
)

nlp(
    "https://templates.invoicehome.com/invoice-template-us-neat-750px.png",
    "What is the invoice number?"
)
# {'score': 0.9943977, 'answer': 'us-001', 'start': 15, 'end': 15}

nlp(
    "https://miro.medium.com/max/787/1*iECQRIiOGTmEFLdWkVIH2g.jpeg",
    "What is the purchase amount?"
)
# {'score': 0.9912159, 'answer': '$1,000,000,000', 'start': 97, 'end': 97}

nlp(
    "https://www.accountingcoach.com/wp-content/uploads/2013/10/[email protected]",
    "What are the 2020 net sales?"
)
# {'score': 0.59147286, 'answer': '$ 3,750', 'start': 19, 'end': 20}
NOTE: This model and pipeline were recently added to transformers via PR #18407 and PR #18414, so you'll need to use a recent version of transformers, for example:
pip install git+https://github.com/huggingface/transformers.git@2ef774211733f0acf8d3415f9284c49ef219e991
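The pipeline also accepts local files or PIL images in place of URLs. Below is a minimal sketch along those lines; "invoice.png" is a hypothetical local path, and with pytesseract installed the OCR step runs automatically.

from PIL import Image
from transformers import pipeline

# Build the document-QA pipeline once and reuse it across questions.
nlp = pipeline(
    "document-question-answering",
    model="impira/layoutlm-document-qa",
)

# "invoice.png" is a placeholder for a document image on disk.
image = Image.open("invoice.png").convert("RGB")

# Keyword arguments are equivalent to the positional calls shown above.
result = nlp(image=image, question="What is the invoice number?")
print(result)  # same score/answer/start/end fields as in the examples above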
About us
This model was created by the team at Impira.
Limitations & Considerations
- Benchmark scores may vary based on evaluation methodology and hardware configuration.
- VRAM requirements are estimates; actual usage depends on quantization and batch size.
- FNI scores are relative rankings and may change as new models are added.
- License: MIT according to the model tags; verify licensing terms before commercial use.
- Source: Hugging Face model metadata for impira/layoutlm-document-qa.
Cite this model
Academic & Research Attribution
@misc{hf_model__impira__layoutlm_document_qa,
  author = {impira},
  title = {layoutlm-document-qa},
  year = {2026},
  howpublished = {\url{https://huggingface.co/impira/layoutlm-document-qa}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
AI Summary: Based on Hugging Face metadata. Not a recommendation.
Model Transparency Report
Verified data manifest for traceability and transparency.
đ Identity & Source
- id
- hf-model--impira--layoutlm-document-qa
- author
- impira
- tags
- transformerspytorchtfsafetensorslayoutlmdocument-question-answeringpdfenlicense:mitendpoints_compatibleregion:us
Technical Specs
- architecture: layoutlm
- parameters: 127,793,412 (about 0.13 billion)
- context length: 4,096
- vram gb: ~2.6
- vram is estimated: true
- vram formula: VRAM ≈ (params in billions × 0.75) + 2 GB (KV) + 0.5 GB (OS); see the quick check below
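As a quick sanity check, the sketch below applies the stated formula with the parameter count converted to billions; the constants (0.75, 2 GB, 0.5 GB) come straight from the formula above, and the result is an estimate rather than measured usage.

# Rough VRAM estimate per the formula above:
# (params in billions * 0.75) + 2 GB (KV cache) + 0.5 GB (OS headroom)
param_count = 127_793_412            # parameter count from the spec above
params_billions = param_count / 1e9  # ~0.128 B
vram_gb = params_billions * 0.75 + 2.0 + 0.5
print(f"~{vram_gb:.1f} GB")          # ~2.6 GB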
Engagement & Metrics
- likes: 1,152
- downloads: 17,295