Initial release: June 9, 2021
Type: Language model
License: Open-source

GPT-J is an open-source artificial intelligence language model developed by EleutherAI.[1] It generally follows the GPT-2 architecture, the main difference being its so-called parallel decoder: instead of placing the feed-forward multilayer perceptron after the masked multi-head attention, the two are computed in parallel from the same input, which achieves higher throughput in distributed training.[2]
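The difference between the two layouts can be sketched as follows. This is a minimal illustration with placeholder `attn` and `mlp` functions operating on plain numbers, not a real transformer implementation; the function names and scaling factors are hypothetical stand-ins.

```python
def attn(x):
    # Placeholder for masked multi-head attention.
    return 0.5 * x

def mlp(x):
    # Placeholder for the feed-forward multilayer perceptron.
    return 0.25 * x

def gpt2_block(x):
    # GPT-2 style (sequential): the MLP consumes the attention output,
    # so the two sub-layers must run one after the other.
    h = x + attn(x)
    return h + mlp(h)

def gptj_block(x):
    # GPT-J style (parallel): attention and MLP both read the block
    # input, so they can be computed concurrently across devices.
    return x + attn(x) + mlp(x)
```

Because both sub-layers in the parallel layout depend only on the block input, their computation (and the matrix multiplications inside them) can be overlapped, which is where the throughput gain in distributed training comes from.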

GPT-J performs very similarly to similarly sized versions of OpenAI's GPT-3 on various zero-shot downstream tasks, and can even outperform them on code generation tasks.[3] The newest version, GPT-J-6B, is a language model trained on a dataset called The Pile.[4] The Pile is an open-source, 825-gigabyte language-modelling dataset split into 22 smaller datasets.[5]

Unlike ChatGPT, GPT-J does not function as a chatbot out of the box, only as a text predictor.[6] In March 2023, Databricks released Dolly, an Apache-licensed instruction-following model based on GPT-J, fine-tuned on the Stanford Alpaca dataset.[7]


  1. ^ Demo, GPT-3. "GPT-J | Discover AI use cases". Retrieved 2023-02-28.
  2. ^
  3. ^ "GPT-J-6B: An Introduction to the Largest Open Source GPT Model | Forefront". Retrieved 2023-02-28.
  4. ^ Wang, Ben (2023-02-28), Table of contents, retrieved 2023-02-28
  5. ^ "The Pile". Retrieved 2023-02-28.
  6. ^ Mueller, Vincent (2022-01-25). "How you can use GPT-J". Medium. Retrieved 2023-02-28.
  7. ^ Conover, Mike; Hayes, Matt; Mathur, Ankit; Meng, Xiangrui; Xie, Jianwei; Wan, Jun; Ghodsi, Ali; Wendell, Patrick; Zaharia, Matei (24 March 2023). "Hello Dolly: Democratizing the magic of ChatGPT with open models". Retrieved 2023-04-05.