🧠

Model

CodeGen

Name: CodeGen
Author: salesforce

by salesforce gh-tool--salesforce--codegen

Free2AITools Nexus Index

44.4 Top 100%

S: Semantic 50

A: Authority 0

P: Popularity 65

R: Recency 68

Q: Quality 70

Tech Context

Vital Performance

0 DL / 30D

0.0%

Source →

Audited 44.4 FNI Score

Tiny - Params

- Context

0 Downloads

Commercial APACHE License

Model Information Summary
Entity Passport
Registry ID	gh-tool--salesforce--codegen
License	Apache-2.0
Provider	github

📜

Cite this model

Academic & Research Attribution

BibTeX

@misc{gh_tool__salesforce__codegen,
  author = {salesforce},
  title = {CodeGen Model},
  year = {2026},
  howpublished = {\url{https://github.com/salesforce/CodeGen}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}

APA Style

salesforce. (2026). CodeGen [Model]. Free2AITools. https://github.com/salesforce/CodeGen

🔬Technical Deep Dive

Full Specifications [+]

Quick Commands

🐙 Git Clone

git clone https://github.com/salesforce/CodeGen

⚖️ Free2AITools Nexus Index V2.0

Methodology Index Protocol

Semantic (S) 50

Authority (A) 0

Popularity (P) 65

Recency (R) 68

Quality (Q) 70

💬 Index Insight

FNI V2.0 for CodeGen: Semantic (S:50), Authority (A:0), Popularity (P:65), Recency (R:68), Quality (Q:70).

Free2AITools Nexus Index

Verification Authority

HuggingFace API GitHub Metadata Arxiv Citation DB System Audit

Unbiased Data Node Refresh: VFS Live

---

🚀 What's Next?

📊

Find Training Datasets

Discover datasets compatible with this model

📈

Compare Benchmarks

See how this model ranks on standard tests

⚡

Technical Deep Dive

CodeGen

Official release for the CodeGen1 and CodeGen2 models (350M, 1B, 3B, 7B 16B) for Program Synthesis by Salesforce AI Research.

News

July 2023

CodeGen2.5 released outperforming 16B parameter models with only 7B.

May 2023

CodeGen2.0 released with strong infill sampling capability.

March 2022

CodeGen1.0 released on par with OpenAI Codex at the time.

Publications

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Erik Nijkamp*, Bo Pang*, Hiroaki Hayashi*, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, and Caiming Xiong
ICLR, 2023

CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp*, Hiroaki Hayashi*, Caiming Xiong, Silvio Savarese, and Yingbo Zhou
ICLR, 2023

Usage

The models are available on the Hugging Face Hub.

CodeGen1.0

python

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen-2B-mono")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-2B-mono")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]))

CodeGen2.0

python

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen2-7B")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen2-7B", trust_remote_code=True, revision="main")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0], truncate_before_pattern=[r"\n\n^#", "^'''", "\n\n\n"]))

CodeGen2.5

python

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/codegen25-7b-mono", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen25-7b-mono")
inputs = tokenizer("# this function prints hello world", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))

Training

The Jaxformer library for data pre-processing, training and fine-tuning the CodeGen models can be found here:

https://github.com/salesforce/jaxformer

Citation

If you find our code or paper useful, please cite the paper:

bibtex

@article{nijkamp2022codegen,
  title={CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={ICLR},
  year={2023}
}

@article{nijkamp2023codegen2,
  title={CodeGen2: Lessons for Training LLMs on Programming and Natural Languages},
  author={Nijkamp, Erik and Hayashi, Hiroaki and Xiong, Caiming and Savarese, Silvio and Zhou, Yingbo},
  journal={ICLR},
  year={2023}
}

Ethics disclaimer for Salesforce AI models, data, code

This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people’s lives, rights, or safety. For further guidance on use cases, refer to our standard AUP and AI AUP.

⚠️ Incomplete Data

Some information about this model is not available. Use with Caution - Verify details from the original source before relying on this data.

View Original Source →

📝 Limitations & Considerations

• Benchmark scores may vary based on evaluation methodology and hardware configuration.
• VRAM requirements are estimates; actual usage depends on quantization and batch size.
• FNI scores are relative rankings and may change as new models are added.
⚠ License Unknown: Verify licensing terms before commercial use.

Social Proof

GitHub Repository

421Forks

Repo Issues

🐙 Data Source: GitHub ↗

🔄 Daily sync (03:00 UTC)

AI Summary: Based on GitHub metadata. Not a recommendation.

📊 FNI Methodology 📚 Knowledge Baseℹ️ Verify with original source

🛡️ Model Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: gh-tool--salesforce--codegen
slug: salesforce--codegen
source: github
author: salesforce
license: Apache-2.0
tags: programsynthesis, generativemodel, codex, languagemodel, llm, tpu-acceleration, python

⚙️ Technical Specs

architecture: null
params billions: null
context length: null
pipeline tag: other

📊 Engagement & Metrics

downloads: 0
stars: 0
forks: 421

Data indexed from public sources. Updated daily.

Cite this model

🔬Technical Deep Dive

Quick Commands

⚖️ Free2AITools Nexus Index V2.0

💬 Index Insight

Verification Authority

🚀 What's Next?

Find Training Datasets

Compare Benchmarks

Deployment Guide

Technical Deep Dive

CodeGen

News

Publications

Usage

Training

Citation

Ethics disclaimer for Salesforce AI models, data, code

⚠️ Incomplete Data

📝 Limitations & Considerations

Social Proof

🛡️ Model Transparency Report

🆔 Identity & Source

⚙️ Technical Specs

📊 Engagement & Metrics