R
R terminal
Paradigms	Multi-paradigm: procedural, object-oriented, functional, reflective, imperative, array^[1]
Designed by	Ross Ihaka and Robert Gentleman
Developer	R Core Team
First appeared	August 1993
Stable release	4.4.1^[2] / 14 June 2024
Typing discipline	Dynamic
Platform	arm64 and x86-64
License	GNU GPL v2^[3]
Filename extensions	.r^[4] .rdata .rhistory .rds .rda^[5]
Website	www.r-project.org
Influenced by	Lisp, S^[6], Scheme^[1]
Influenced	Julia^[7]

R is a programming language for statistical computing and data visualization. It has been adopted in the fields of data mining, bioinformatics, and data analysis.^[8]

The core R language is augmented by a large number of extension packages, containing reusable code, documentation, and sample data.

R is free and open-source software. It is part of the GNU Project and is available under the GNU General Public License.^[3] It is written primarily in C, Fortran, and R itself. Precompiled executables are provided for various operating systems.

As an interpreted language, R has a native command line interface. Multiple third-party graphical user interfaces are also available, such as RStudio, an integrated development environment, and Jupyter, a notebook interface.

History

R was started by professors Ross Ihaka and Robert Gentleman as a programming language to teach introductory statistics at the University of Auckland.^[9] The language was inspired by the S programming language, with most S programs able to run unaltered in R.^[6] The language was also inspired by Scheme's lexical scoping, allowing for local variables.^[1]

The name of the language, R, reflects both its status as a successor to the S language and the shared first initial of its authors, Ross and Robert.^[10] In August 1993, Ihaka and Gentleman posted a binary of R on StatLib, a data archive website, and announced the posting on the s-news mailing list.^[11] On December 5, 1997, R became a GNU project when version 0.60 was released.^[12] On February 29, 2000, the first official 1.0 version was released.^[13]

Packages

Main article: R package

R packages are collections of functions, documentation, and data that extend R.^[14] For example, packages add reporting features such as RMarkdown, Quarto, knitr, and Sweave. Easy package installation and use have contributed to the language's adoption in data science.^[15]

Base packages are immediately available when starting R and provide the necessary syntax and commands for programming, computing, graphics production, basic arithmetic, and statistical functionality.^[16]

The Comprehensive R Archive Network (CRAN) was founded in 1997 by Kurt Hornik and Fritz Leisch to host R's source code, executable files, documentation, and user-created packages.^[17] Its name and scope mimic the Comprehensive TeX Archive Network and the Comprehensive Perl Archive Network.^[17] CRAN originally had three mirrors and 12 contributed packages.^[18] As of June 2024, it has 104 mirrors^[19] and 20,853 contributed packages.^[20] Packages are also available on repositories R-Forge, Omegahat, and GitHub.

The Task Views on the CRAN website list packages in fields such as finance, genetics, high-performance computing, machine learning, medical imaging, meta-analysis, social sciences, and spatial statistics.

The Bioconductor project provides packages for genomic data analysis, complementary DNA, microarray, and high-throughput sequencing methods.

Packages add the capability to implement various statistical techniques such as linear, generalized linear and nonlinear modeling, classical statistical tests, spatial analysis, time-series analysis, and clustering.

One example is the tidyverse package. It provides a common interface for accessing and processing "tidy data": data held in a data frame, a two-dimensional table of rows and columns.^[21] Each function in the package is designed to work together with all the other functions in the package.^[22]

Installing a package occurs only once. To install tidyverse:^[22]

> install.packages("tidyverse")

A package must be loaded in each session before use. To load the functions, data, and documentation of a package, call the library() function. To load tidyverse:^[a]

> library(tidyverse)
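
The difference between installing once and loading per session can be sketched as follows. This is a minimal illustration, not part of the cited sources: the helper name load_if_installed is hypothetical, while requireNamespace(), library(), and the :: operator are base R features. The example uses the stats package, which ships with R, so it runs without network access.

```r
# Hypothetical helper: attach a package only if it is already installed.
load_if_installed <- function(pkg) {
  # requireNamespace() returns TRUE when the package can be loaded
  if (requireNamespace(pkg, quietly = TRUE)) {
    # character.only = TRUE lets library() accept the name as a string
    library(pkg, character.only = TRUE)
    return(TRUE)
  }
  message("Package '", pkg, "' is not installed; ",
          "run install.packages(\"", pkg, "\") first")
  FALSE
}

# stats is a base package, so this succeeds on any R installation
ok <- load_if_installed("stats")

# A single function can also be called without attaching its package
m <- stats::median(c(1, 5, 9))
```

Calling a function as package::name avoids attaching the whole package, which can help when two packages export functions with the same name.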

Interfaces

R ships with a command line console. Various integrated development environments (IDEs) can also be installed. IDEs for R include R.app (OSX/macOS only), Rattle GUI, R Commander, RKWard, RStudio, and Tinn-R.

General purpose IDEs that support R include Eclipse via the StatET plugin and Visual Studio via R Tools for Visual Studio.

Editors that support R include Emacs, Vim via the Nvim-R plugin, Kate, LyX via Sweave, WinEdt, and Jupyter.

Scripting languages that support R include Python, Perl, Ruby, F#, and Julia.

General purpose programming languages that support R include Java via the Rserve socket server, and .NET C#.

Statistical frameworks which use R in the background include Jamovi and JASP.

Community

The R Core Team was founded in 1997 to maintain the R source code. The R Foundation for Statistical Computing was founded in April 2003 to provide financial support. The R Consortium is a Linux Foundation project to develop R infrastructure.

The R Journal is an open-access academic journal that features short to medium-length articles on the use and development of R. It includes articles on packages, programming tips, CRAN news, and foundation news.

The R community hosts many conferences and in-person meetups. These groups include:

UseR!: an annual international R user conference
Directions in Statistical Computing (DSC)
R-Ladies: an organization to promote gender diversity in the R community
SatRdays: R-focused conferences held on Saturdays
R Conference
posit::conf (formerly known as rstudio::conf)

Implementations

The main R implementation is written primarily in C, Fortran, and R itself. Other implementations include:

pretty quick R (pqR), by Radford M. Neal, attempts to improve memory management.
Renjin is an implementation of R for the Java Virtual Machine.
CXXR and Riposte^[23] are implementations of R written in C++.
Oracle's FastR is an implementation of R, built on GraalVM.
TIBCO Software, creator of S-PLUS, wrote TERR, an R implementation that integrates with Spotfire.^[24]

Microsoft R Open (MRO) was an R implementation. As of 30 June 2021, Microsoft started to phase out MRO in favor of the CRAN distribution.^[25]

Commercial support

Although R is an open-source project, some companies provide commercial support:

Revolution Analytics provides commercial support for Revolution R.
Oracle provides commercial support for the Big Data Appliance, which integrates R into its other products.
IBM provides commercial support for in-Hadoop execution of R.

Examples

Hello, World!

"Hello, World!" program:

> print("Hello, World!")
[1] "Hello, World!"

Basic syntax

The following examples illustrate the basic syntax of the language and use of the command-line interface. (An expanded list of standard language features can be found in the R manual, "An Introduction to R".^[26])

In R, the generally preferred assignment operator is an arrow made from two characters <-, although = can be used in some cases.^[27]

> x <- 1:6 # Create a numeric vector in the current environment
> y <- x^2 # Create vector based on the values in x.
> print(y) # Print the vector’s contents.
[1]  1  4  9 16 25 36

> z <- x + y # Create a new vector that is the sum of x and y
> z # Typing an object's name at the prompt prints its contents to the console.
[1]  2  6 12 20 30 42

> z_matrix <- matrix(z, nrow = 3) # Create a new matrix that turns the vector z into a 3x2 matrix object
> z_matrix 
     [,1] [,2]
[1,]    2   20
[2,]    6   30
[3,]   12   42

> 2 * t(z_matrix) - 2 # Transpose the matrix, multiply every element by 2, subtract 2 from each element in the matrix, and return the results to the terminal.
     [,1] [,2] [,3]
[1,]    2   10   22
[2,]   38   58   82

> new_df <- data.frame(t(z_matrix), row.names = c("A", "B")) # Create a new data.frame object that contains the data from a transposed z_matrix, with row names 'A' and 'B'
> names(new_df) <- c("X", "Y", "Z") # Set the column names of new_df as X, Y, and Z.
> print(new_df)  # Print the current results.
   X  Y  Z
A  2  6 12
B 20 30 42

> new_df$Z # Output the Z column
[1] 12 42

> all(new_df$Z == new_df['Z'] & new_df[3] == new_df$Z) # The data.frame column Z can be accessed using $Z, ['Z'], or [3] syntax and the values are the same. The elementwise & with all() is used because && accepts only length-one operands.
[1] TRUE

> attributes(new_df) # Print attributes information about the new_df object
$names
[1] "X" "Y" "Z"

$row.names
[1] "A" "B"

$class
[1] "data.frame"

> attributes(new_df)$row.names <- c("one", "two") # Access and then change the row.names attribute; can also be done using rownames()
> new_df
     X  Y  Z
one  2  6 12
two 20 30 42
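
The column-access forms and the rownames() alternative mentioned above can be compared in a short, self-contained sketch. The object name df and its values are illustrative, chosen to mirror new_df; this is one way to show the difference, not the only one.

```r
# Rebuild a small data frame with the same values as new_df above
df <- data.frame(X = c(2, 20), Y = c(6, 30), Z = c(12, 42),
                 row.names = c("A", "B"))

# rownames() is the accessor-style alternative to editing attributes() directly
rownames(df) <- c("one", "two")

# $ and [[ ]] return the column as a plain vector
z_vec  <- df$Z
z_vec2 <- df[["Z"]]

# Single brackets return a one-column data frame instead
z_df <- df["Z"]

identical(z_vec, z_vec2)  # both forms give the same numeric vector
is.data.frame(z_df)       # single brackets keep the data.frame class
```

The distinction matters when passing a column to a function that expects a vector: df$Z and df[["Z"]] work directly, while df["Z"] may need to be unwrapped first.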

Structure of a function

One of R's strengths is the ease of creating new functions.^[28] Objects in the function body remain local to the function, and any data type may be returned. In R, almost all functions and all user-defined functions are closures.^[29]

Create a function:

# The input parameters are x and y.
# The function returns a linear combination of x and y.
f <- function(x, y) {
  z <- 3 * x + 4 * y

  # this return() statement is optional
  return(z)
}

Usage output:

> f(1, 2)
[1] 11

> f(c(1, 2, 3), c(5, 3, 4))
[1] 23 18 25

> f(1:3, 4)
[1] 19 22 25
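
Because user-defined functions are closures, a function can capture variables from the environment in which it was created. The sketch below illustrates this with two hypothetical function factories, make_scaler and make_counter; the names are chosen for this example only.

```r
# Hypothetical factory: the returned function remembers `rate`
make_scaler <- function(rate) {
  function(x) x * rate
}

twice  <- make_scaler(2)
thrice <- make_scaler(3)

twice(1:3)   # each element multiplied by 2
thrice(1:3)  # each element multiplied by 3

# Hypothetical factory: the returned function keeps private, mutable state
make_counter <- function() {
  count <- 0
  function() {
    # <<- assigns to `count` in the enclosing environment, not a new local
    count <<- count + 1
    count
  }
}

counter <- make_counter()
first  <- counter()
second <- counter()
```

Each call to make_counter() creates a fresh environment, so separate counters do not share state.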

It is possible to define functions to be used as infix operators with the special syntax `%name%` where "name" is the function variable name:

> `%sumx2y2%` <- function(e1, e2) {e1 ^ 2 + e2 ^ 2}
> 1:3 %sumx2y2% -(1:3)
[1]  2  8 18
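
As a further sketch, an infix operator defined this way is an ordinary function, so it can also be called in prefix form by quoting its name with backticks. The operator name %+% below is hypothetical and chosen only for illustration.

```r
# Hypothetical infix operator that joins two strings with a space
`%+%` <- function(lhs, rhs) paste(lhs, rhs)

greeting <- "Hello," %+% "World!"

# The same function called in prefix form using backticks
same_greeting <- `%+%`("Hello,", "World!")

# Built-in operators are functions too and can be passed to other functions
shifted <- sapply(1:3, `+`, 10)
```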

Since version 4.1.0, functions can be written in a short notation, which is useful for passing anonymous functions to higher-order functions:^[30]

> sapply(1:5, \(i) i^2)    # here \(i) is the same as function(i) 
[1]  1  4  9 16 25
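
The short notation also works with the other higher-order functions in base R. The following sketch, which assumes R 4.1.0 or later, uses Map(), Filter(), and Reduce(); the variable names are illustrative.

```r
# Map() applies a function to corresponding elements and returns a list
products <- Map(\(a, b) a * b, 1:3, 4:6)

# Filter() keeps the elements for which the predicate is TRUE
evens <- Filter(\(n) n %% 2 == 0, 1:10)

# Reduce() folds a vector into a single value, here a running sum
total <- Reduce(\(acc, n) acc + n, 1:5)

unlist(products)  # flatten the list returned by Map() into a vector
```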

Native pipe operator

In R version 4.1.0, a native pipe operator, |>, was introduced.^[31] This operator passes the value on its left as the first argument of the function call on its right, which allows users to chain function calls one after another instead of nesting them.

> nrow(subset(mtcars, cyl == 4)) # Nested without the pipe character
[1] 11

> mtcars |> subset(cyl == 4) |> nrow() # Using the pipe character
[1] 11

Another alternative to nested functions, in contrast to using the pipe character, is using intermediate objects. However, some argue that using the pipe operator will produce code that is easier to read.^[22]

> mtcars_subset_rows <- subset(mtcars, cyl == 4)
> num_mtcars_subset <- nrow(mtcars_subset_rows)
> print(num_mtcars_subset)
[1] 11
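
A smaller sketch of the same idea, using only base functions on a numeric vector, may make the equivalence easier to see. It assumes R 4.1.0 or later; the variable names are illustrative.

```r
# Nested form: read from the inside out
nested <- sum(sqrt(c(1, 4, 9)))

# Piped form: read from left to right
piped <- c(1, 4, 9) |> sqrt() |> sum()

# Extra arguments are written in the call on the right-hand side
rounded <- c(1.234, 5.678) |> round(digits = 1)

# An anonymous function must be wrapped in parentheses and then called
squared <- 1:3 |> (\(v) v^2)()
```

In each case the left-hand value becomes the first argument of the call on the right, so nested and piped hold the same result.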

Object-oriented programming

The R language has native support for object-oriented programming. There are two native frameworks, the so-called S3 and S4 systems. The former is more informal: it supports single dispatch on the first argument, and an object is assigned to a class simply by setting its "class" attribute. The latter is a Common Lisp Object System (CLOS)-like system of formal classes (also derived from S) and generic methods that supports multiple dispatch and multiple inheritance.^[32]

In the example, summary is a generic function that dispatches to different methods depending on whether its argument is a character vector or a "factor":

> data <- c("a", "b", "c", "a", NA)
> summary(data)
   Length     Class      Mode 
        5 character character 
> summary(as.factor(data))
   a    b    c NA's 
   2    1    1    1
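
User-defined classes follow the same pattern. The sketch below defines one S3 class and one S4 class; all class, generic, and variable names (circle, area, Rectangle, s4_area, r1) are hypothetical and chosen for illustration, while structure(), UseMethod(), setClass(), setGeneric(), and setMethod() are the standard base and methods package functions.

```r
library(methods)  # attached by default in most sessions; explicit for clarity

# --- S3: a class is just an attribute, and methods are named generic.class ---
circle <- structure(list(radius = 2), class = "circle")

area <- function(shape, ...) UseMethod("area")          # the generic
area.circle <- function(shape, ...) pi * shape$radius^2  # method for "circle"
area.default <- function(shape, ...) {
  stop("area() is not defined for this class")
}

circle_area <- area(circle)  # dispatches to area.circle

# --- S4: formal class definition with typed slots ---
setClass("Rectangle", representation(width = "numeric", height = "numeric"))

setGeneric("s4_area", function(shape) standardGeneric("s4_area"))
setMethod("s4_area", "Rectangle", function(shape) shape@width * shape@height)

r1 <- new("Rectangle", width = 3, height = 4)
rect_area <- s4_area(r1)  # dispatches on the formal class of r1
```

S3 fields are reached with $ on the underlying list, whereas S4 slots are reached with @ and are checked against their declared types when the object is created.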

Modeling and plotting

The R language has built-in support for data modeling and graphics. The following example shows how R can generate and plot a linear model with residuals.

# Create x and y values
x <- 1:6
y <- x^2

# Linear regression model y = A + B * x
model <- lm(y ~ x)

# Display an in-depth summary of the model
summary(model)

# Create a 2 by 2 layout for figures
par(mfrow = c(2, 2))

# Output diagnostic plots of the model
plot(model)

Output:

Residuals:
      1       2       3       4       5       6
 3.3333 -0.6667 -2.6667 -2.6667 -0.6667  3.3333

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  -9.3333     2.8441  -3.282 0.030453 * 
x             7.0000     0.7303   9.585 0.000662 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.055 on 4 degrees of freedom
Multiple R-squared:  0.9583, Adjusted R-squared:  0.9478
F-statistic: 91.88 on 1 and 4 DF,  p-value: 0.000662
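
The values printed in the summary can also be extracted programmatically. The following sketch refits the same model and uses the standard accessors coef(), residuals(), and predict(); the new input value x = 7 is an arbitrary choice for illustration.

```r
# Refit the model from the example above
x <- 1:6
y <- x^2
model <- lm(y ~ x)

# Named vector holding the intercept and the slope
coefs <- coef(model)

# Residuals: observed y minus fitted y, one per observation
res <- residuals(model)

# Predict y for a new x value; newdata must use the same column name as the formula
pred <- predict(model, newdata = data.frame(x = 7))

# Individual statistics are available as elements of the summary object
r2 <- summary(model)$r.squared
```

With these data the fitted line is y = -9.3333 + 7x, so the prediction at x = 7 is about 39.67, compared with the true value of 49 for the quadratic.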

Mandelbrot set

This Mandelbrot set example highlights the use of complex numbers. It models the first 20 iterations of the equation z = z² + c, where c represents different complex constants.

Install the package that provides the write.gif() function beforehand:

install.packages("caTools")

R Source code:

library(caTools)

jet.colors <-
    colorRampPalette(
        c("green", "pink", "#007FFF", "cyan", "#7FFF7F",
          "white", "#FF7F00", "red", "#7F0000"))

dx <- 1500 # define width
dy <- 1400 # define height

C  <-
    complex(
            real = rep(seq(-2.2, 1.0, length.out = dx), each = dy),
            imag = rep(seq(-1.2, 1.2, length.out = dy), times = dx)
            )

# reshape as matrix of complex numbers
C <- matrix(C, dy, dx)

# initialize output 3D array
X <- array(0, c(dy, dx, 20))

Z <- 0

# loop with 20 iterations
for (k in 1:20) {

  # the central difference equation
  Z <- Z^2 + C

  # capture the results
  X[, , k] <- exp(-abs(Z))
}

write.gif(
    X,
    "Mandelbrot.gif",
    col = jet.colors,
    delay = 100)
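
The same iteration can be followed for a single constant without any packages. The sketch below is a simplified illustration rather than part of the program above: the helper name stays_bounded is hypothetical, and it uses the conventional escape test that a point has diverged once the modulus of z exceeds 2.

```r
# Hypothetical helper: does z stay within modulus 2 while iterating z <- z^2 + c0?
stays_bounded <- function(c0, n_iter = 20) {
  z <- complex(real = 0, imaginary = 0)
  for (k in seq_len(n_iter)) {
    z <- z^2 + c0
    # Mod() gives the modulus (absolute value) of a complex number
    if (Mod(z) > 2) return(FALSE)
  }
  TRUE
}

# c = -1 cycles between -1 and 0, so it stays bounded
inside <- stays_bounded(complex(real = -1, imaginary = 0))

# c = 1 gives 1, 2, 5, ... and escapes after three iterations
outside <- stays_bounded(complex(real = 1, imaginary = 0))
```

The full program applies this update to every entry of the matrix C at once, relying on R's vectorized complex arithmetic instead of a per-point loop.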

Version names

All R version releases from 2.14.0 onward have codenames that make reference to Peanuts comics and films.^[33]^[34]^[35]

In 2018, core R developer Peter Dalgaard presented a history of R releases since 1997.^[36] Some notable early releases before the named releases include:

Version 1.0.0, released on February 29, 2000, a leap day
Version 2.0.0, released on October 4, 2004, "which at least had a nice ring to it"^[36]

The idea of naming R version releases was inspired by the Debian and Ubuntu version naming system. Dalgaard also noted that another reason for using Peanuts references for R codenames is that "everyone in statistics is a P-nut".^[36]

R release codenames
Version	Release date	Name	Peanuts reference	Reference
4.4.1	2024-06-14	Race for Your Life	^[37]	^[38]
4.4.0	2024-04-24	Puppy Cup	^[39]	^[40]
4.3.3	2024-02-29	Angel Food Cake	^[41]	^[42]
4.3.2	2023-10-31	Eye Holes	^[43]	^[44]
4.3.1	2023-06-16	Beagle Scouts	^[45]	^[46]
4.3.0	2023-04-21	Already Tomorrow	^[47]^[48]^[49]	^[50]
4.2.3	2023-03-15	Shortstop Beagle	^[51]	^[52]
4.2.2	2022-10-31	Innocent and Trusting	^[53]	^[54]
4.2.1	2022-06-23	Funny-Looking Kid	^[55]^[56]^[57]^[58]^[59]^[60]	^[61]
4.2.0	2022-04-22	Vigorous Calisthenics	^[62]	^[63]
4.1.3	2022-03-10	One Push-Up	^[62]	^[64]
4.1.2	2021-11-01	Bird Hippie	^[65]^[66]	^[64]
4.1.1	2021-08-10	Kick Things	^[67]	^[68]
4.1.0	2021-05-18	Camp Pontanezen	^[69]	^[70]
4.0.5	2021-03-31	Shake and Throw	^[71]	^[72]
4.0.4	2021-02-15	Lost Library Book	^[73]^[74]^[75]	^[76]
4.0.3	2020-10-10	Bunny-Wunnies Freak Out	^[77]	^[78]
4.0.2	2020-06-22	Taking Off Again	^[79]	^[80]
4.0.1	2020-06-06	See Things Now	^[81]	^[82]
4.0.0	2020-04-24	Arbor Day	^[83]	^[84]
3.6.3	2020-02-29	Holding the Windsock	^[85]	^[86]
3.6.2	2019-12-12	Dark and Stormy Night	See It was a dark and stormy night#Literature^[87]	^[88]
3.6.1	2019-07-05	Action of the Toes	^[89]	^[90]
3.6.0	2019-04-26	Planting of a Tree	^[91]	^[92]
3.5.3	2019-03-11	Great Truth	^[93]	^[94]
3.5.2	2018-12-20	Eggshell Igloos	^[95]	^[96]
3.5.1	2018-07-02	Feather Spray	^[97]	^[98]
3.5.0	2018-04-23	Joy in Playing	^[99]	^[100]
3.4.4	2018-03-15	Someone to Lean On	^[101]	^[102]
3.4.3	2017-11-30	Kite-Eating Tree	See Kite-Eating Tree^[103]	^[104]
3.4.2	2017-09-28	Short Summer	See It Was a Short Summer, Charlie Brown	^[105]
3.4.1	2017-06-30	Single Candle	^[106]	^[107]
3.4.0	2017-04-21	You Stupid Darkness	^[106]	^[108]
3.3.3	2017-03-06	Another Canoe	^[109]	^[110]
3.3.2	2016-10-31	Sincere Pumpkin Patch	^[111]	^[112]
3.3.1	2016-06-21	Bug in Your Hair	^[113]	^[114]
3.3.0	2016-05-03	Supposedly Educational	^[115]	^[116]
3.2.5	2016-04-11	Very, Very Secure Dishes	^[117]	^[118]^[119]^[120]
3.2.4	2016-03-11	Very Secure Dishes	^[117]	^[121]
3.2.3	2015-12-10	Wooden Christmas-Tree	See A Charlie Brown Christmas^[122]	^[123]
3.2.2	2015-08-14	Fire Safety	^[124]^[125]	^[126]
3.2.1	2015-06-18	World-Famous Astronaut	^[127]	^[128]
3.2.0	2015-04-16	Full of Ingredients	^[129]	^[130]
3.1.3	2015-03-09	Smooth Sidewalk	^[131]	^[132]
3.1.2	2014-10-31	Pumpkin Helmet	See You're a Good Sport, Charlie Brown	^[133]
3.1.1	2014-07-10	Sock it to Me	^[134]^[135]^[136]^[137]	^[138]
3.1.0	2014-04-10	Spring Dance	^[89]	^[139]
3.0.3	2014-03-06	Warm Puppy	^[140]	^[141]
3.0.2	2013-09-25	Frisbee Sailing	^[142]	^[143]
3.0.1	2013-05-16	Good Sport	^[144]	^[145]
3.0.0	2013-04-03	Masked Marvel	^[146]	^[147]
2.15.3	2013-03-01	Security Blanket	^[148]	^[149]
2.15.2	2012-10-26	Trick or Treat	^[150]	^[151]
2.15.1	2012-06-22	Roasted Marshmallows	^[152]	^[153]
2.15.0	2012-03-30	Easter Beagle	^[154]	^[155]
2.14.2	2012-02-29	Gift-Getting Season	See It's the Easter Beagle, Charlie Brown^[156]	^[157]
2.14.1	2011-12-22	December Snowflakes	^[158]	^[159]
2.14.0	2011-10-31	Great Pumpkin	See It's the Great Pumpkin, Charlie Brown^[160]	^[161]
r-devel	N/A	Unsuffered Consequences	^[162]	^[36]
