Initial release: June 9, 2021
GPT-J is an open-source artificial intelligence language model developed by EleutherAI. It generally follows the GPT-2 architecture; the only major difference is its so-called parallel decoder blocks: instead of placing the feed-forward multilayer perceptron after the masked multi-head attention, the two are computed in parallel from the same input, which achieves higher throughput in distributed training.
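The difference between the two block layouts described above can be sketched in a few lines. This is a minimal illustration of the residual wiring only, not EleutherAI's implementation: the function names (`sequential_block`, `parallel_block`), the use of plain Python lists as vectors, and the toy stand-ins for normalization, attention, and the MLP are all assumptions made for the example.

```python
from typing import Callable, List

Vector = List[float]
SubLayer = Callable[[Vector], Vector]


def add(*vectors: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(values) for values in zip(*vectors)]


def sequential_block(x: Vector, norm1: SubLayer, attn: SubLayer,
                     norm2: SubLayer, mlp: SubLayer) -> Vector:
    """GPT-2 style wiring: the MLP reads the output of the attention step."""
    h = add(x, attn(norm1(x)))
    return add(h, mlp(norm2(h)))


def parallel_block(x: Vector, norm: SubLayer, attn: SubLayer,
                   mlp: SubLayer) -> Vector:
    """Parallel wiring: attention and MLP both read the same normalized input,
    so neither has to wait for the other before the results are summed."""
    normed = norm(x)
    return add(x, attn(normed), mlp(normed))


# Toy stand-ins (hypothetical): real sub-layers are learned transformations.
identity: SubLayer = lambda v: list(v)
toy_attn: SubLayer = lambda v: [2.0 * a for a in v]
toy_mlp: SubLayer = lambda v: [a + 1.0 for a in v]

x = [1.0, 2.0]
seq_out = sequential_block(x, identity, toy_attn, identity, toy_mlp)
par_out = parallel_block(x, identity, toy_attn, toy_mlp)
print(seq_out)  # [7.0, 13.0]
print(par_out)  # [5.0, 9.0]
```

The two layouts give different outputs for the same sub-layers, so a parallel model is a different architecture rather than a reordering of the same computation. In the parallel form the attention and MLP branches have no data dependency on each other, which is what lets them be computed at the same time.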
GPT-J performs very similarly to similarly sized versions of OpenAI's GPT-3 on various zero-shot downstream tasks and can even outperform them on code generation tasks. The newest version, GPT-J-6B, is a language model trained on a data set called The Pile. The Pile is an open-source, 825-gigabyte language modelling data set composed of 22 smaller datasets.
Unlike ChatGPT, GPT-J was not originally designed to function as a chatbot, only as a text predictor. In March 2023, Databricks released Dolly, an Apache-licensed, instruction-following model based on GPT-J and fine-tuned on the Stanford Alpaca dataset.
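Because a text predictor only continues whatever text it is given, instruction-following fine-tuning typically works by wrapping each instruction and answer in a fixed template, so that the answer becomes the natural continuation. The sketch below illustrates that idea. The template wording is modeled on the one published with Stanford Alpaca (no-input variant) and should be treated as an assumption, and the helper names are hypothetical; this is not Databricks' actual training code.

```python
# Template wording modeled on the Stanford Alpaca prompt (assumption).
ALPACA_STYLE_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_training_text(instruction: str, response: str) -> str:
    """Join one instruction/response pair into a single training string."""
    return ALPACA_STYLE_TEMPLATE.format(instruction=instruction) + response


def build_inference_prompt(instruction: str) -> str:
    """Format a prompt so that the model's continuation is the answer."""
    return ALPACA_STYLE_TEMPLATE.format(instruction=instruction)


if __name__ == "__main__":
    print(build_training_text("Name a primary color.", "Blue."))
```

After fine-tuning on many strings of this shape, the model is still only predicting the next token; the template is what makes the predicted continuation look like a reply to the instruction.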