Introduction to R

Author

Digital Causality Lab

Published

April 5, 2023

1 Basics of R (I)

This notebook provides an introduction to programming in R. R is a programming language that is widely used for statistical analysis in research and industry. It is also an important tool to create data products, like presentations, (automated) reports, applications and software packages.

1.1 Why R?

  • It’s one of the most important tools for computational and applied statistics in academia and data science in practice,
  • Wide collection of tools for data analysis,
  • Has grown substantially over the last years,
  • R is open source,
  • Has a very powerful user interface in RStudio,
  • Large community of contributors constantly adding new and powerful packages.

Alternatives to R include, for example, Python, Julia, MATLAB, and their extensions.

1.2 Introduction and Background of Participants

Before we start, we would like to ask you about your background.

  • Have you already acquired any programming skills?
  • What kind of “statistical” software do you know or use in your studies or job?
  • Do you already know R?

1.3 Installing R and RStudio

  • Install R, a free software environment for statistical computing and graphics from the R project website. It is recommended to install a precompiled binary distribution for your operating system.
  • Install RStudio’s IDE (Integrated Development Environment), a powerful user interface for R. RStudio Desktop is freely available.

1.4 First steps in R and RStudio

A plain R installation provides only basic functionality, and its standard GUI is rather spartan, as you can see in Figure 1.

Figure 1: Screenshot of the R GUI.

  • Open the R GUI and try to use R as a calculator, type
2+2
[1] 4
  • Now, open RStudio and see the difference. Look at the different panes. Repeat the calculation 2+2.

1.5 A Brief Outlook on Basic Operations and Objects in R

R uses different classes of objects, e.g. lists, matrices, vectors, data frames…

Class      Example
numeric    2.2, c(5,2)
character  'Hello'
logical    TRUE
list       list('Hello',5)
matrix     matrix(5,3,2)

You can generate a new object, e.g. a vector with the assignment operator <- (or using =).

  • Generate a vector a that collects all integers from 1 to 5. Use the R command c().
a <- c(1,2,3,4,5)
a
[1] 1 2 3 4 5
# Alternatively (shorter)
a <- c(1:5)

# Even shorter
a <- 1:5
  • Generate a matrix M with 3 rows and 2 columns that lists all integer numbers from 1 to 6.
M <- matrix(c(1,2,3,4,5,6), nrow=3, ncol=2) # define a 3x2 matrix
M
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

R supports many operations for vectors and matrices.

Standard calculations

Operation       Operator
Addition        +
Subtraction     -
Multiplication  *
Division        /
Exponentiation  ^
  • Divide each entry in a and M by 2.
a/2
[1] 0.5 1.0 1.5 2.0 2.5
M/2
     [,1] [,2]
[1,]  0.5  2.0
[2,]  1.0  2.5
[3,]  1.5  3.0
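
As a small sketch of the operators listed above, the following lines apply some of them to a and M from the exercises. Note that * multiplies entry by entry; the matrix product uses the separate base R operator %*%, which is not part of the list above and is shown here only for contrast.

```r
a <- 1:5
M <- matrix(1:6, nrow = 3, ncol = 2)

a + 10        # add 10 to every entry of a
a * a         # elementwise product: 1 4 9 16 25
M^2           # square every entry of M
t(M) %*% M    # matrix product of the transposed M (2x3) with M (3x2)
```

The last line returns a 2x2 matrix, whereas M * M would return a 3x2 matrix of squared entries.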

1.6 Installing R packages

A strength of R is that many add-on packages are available. These packages extend the capabilities of R. A package is a collection of related functions, data sets, and documentation. Most packages are hosted on CRAN, the Comprehensive R Archive Network, and can be installed in R/RStudio with the command install.packages().

  • To see which packages are installed on your machine, run the command
library()
  • Install a package, here hdm, by typing
# Install package with name "hdm"
install.packages("hdm")
  • To use the functions that are provided in the hdm package, you have to load the library via
library("hdm")
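
The install and load steps above can be combined so that a package is only downloaded when it is missing. The following is a minimal sketch; the helper name ensure_package() is hypothetical and not part of R. It relies on the base R function requireNamespace(), which returns FALSE if a package is not installed.

```r
# Hypothetical helper: install a package only if it is missing, then load it
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

ensure_package("stats")   # ships with R, so nothing is downloaded
# ensure_package("hdm")   # would install hdm from CRAN if it is missing
```

The argument character.only = TRUE is needed because the package name is passed as a character string stored in a variable.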

1.7 Loading Data and Basic Operations with Data

You can load data from, e.g., .csv files or R data files (.rda). Make sure that you have set the right working directory. You can check and change the working directory with

# Check
getwd()

# Change working directory
setwd("INSERT_YOUR_PATH_HERE")

Download the example data file data_counterfactual.rda and save it in a subdirectory data of your working directory. Provided you work in the right directory (i.e., the directory that contains the subdirectory data), you can load the data with the load() command.

# It's good to start with a clean desk:
# Type the following command to remove all previously created objects
rm(list=ls())

# Load the data
load("data/data_counterfactual.rda") 

# Type ls() to show all objects in your session
ls()
[1] "data_counterfactual"

Now, you can work with the data set called data_counterfactual. Try out the basic commands

dim(data_counterfactual)
summary(data_counterfactual)
plot(data_counterfactual$Y_a1)
table(data_counterfactual)
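
If you do not have the example file at hand, you can try the same workflow with a small self-made data set. The sketch below uses a hypothetical data frame toy_data and temporary files, so it does not depend on your working directory; it also shows read.csv() for .csv files.

```r
# Hypothetical toy data set, used only for this illustration
toy_data <- data.frame(id = 1:3, value = c(0.5, 1.2, 3.4))

# Save it as an R data file in a temporary location
rda_path <- tempfile(fileext = ".rda")
save(toy_data, file = rda_path)

# Remove the object and restore it from the file
rm(toy_data)
loaded_names <- load(rda_path)  # load() returns the names of the restored objects
loaded_names
dim(toy_data)

# .csv files are written with write.csv() and read with read.csv()
csv_path <- tempfile(fileext = ".csv")
write.csv(toy_data, csv_path, row.names = FALSE)
toy_csv <- read.csv(csv_path)
```

Unlike read.csv(), load() restores objects under the names they were saved with, so no assignment is needed.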

1.8 Finding Help in R and Online

R has a comprehensive built-in help system. To get help for the function lm() provided by the stats package, which estimates a linear regression, you can use any of the following commands

  • help.start() - general help and extensive introduction to R
  • help(lm) - help for function lm()
  • ?lm - same result
  • apropos("lm") - list all functions containing the string "lm"
  • ??lm - extensive search on all documents containing the string "lm"
  • example(lm) - show an example of function lm() (if available)
  • RSiteSearch("lm") - search for lm in help manuals and archived mailing lists

2 Basics of R (II)

After the short introduction above, you will find a more detailed version in this section.

2.1 Starting the Session

We start by emptying the workspace and specifying the current directory.

rm(list=ls())
directory <- "YOUR_DIRECTORY"
setwd(directory)

2.2 Objects in R

2.2.1 Vectors

R is based upon different classes of objects. An object is easily created using the assignment symbol <- or the equality symbol “=”.

x <- 5
x
[1] 5
y = 5
y
[1] 5

The most basic object type in R is a vector. A vector is an ordered combination of simple objects that are of the same type, for instance numbers or letters. A vector is created using the c() call. Although you may want to declare a scalar (i.e. a single number) as an object, it effectively is created as a vector of length 1.

# Vector of numbers
vector1 <- c(2,4,1,4,5)
vector1
[1] 2 4 1 4 5
# Vector of characters
vector2 <- c("a","b,","c","d")
vector2
[1] "a"  "b," "c"  "d" 

In the case of strings (or characters, as they are called in R), the value to be stored needs to be declared by quotation marks. If this is not done, R looks for objects with corresponding names:

# We declared x and y already
vector3 <- c(y,x) 
vector3 
[1] 5 5
# We haven't created objects called a,b,c,d before, this is why we get an error
vector4 <- c(a,b,c,d)
Error in eval(expr, envir, enclos): object 'a' not found

Logical statements can be collected in vectors, as well.

vector.logic = c(TRUE, FALSE, TRUE, FALSE)
vector.logic
[1]  TRUE FALSE  TRUE FALSE

You can use vectors to create a new vector, too. However, remember that the elements of a vector need to be of the same class. If they are not, R tries to convert them to a common class. In the second example below, the elements of vector6 all become characters.

vector5 <- c(vector1, vector3)
vector5
[1] 2 4 1 4 5 5 5
vector6 <- c(vector1, vector2)
vector6
[1] "2"  "4"  "1"  "4"  "5"  "a"  "b," "c"  "d" 
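
The conversion follows a fixed order: logical values are converted to numbers, and numbers to characters, whenever a more general class is present. A short sketch using class() to inspect the result:

```r
class(c(TRUE, FALSE))     # "logical"
class(c(TRUE, 2))         # "numeric": TRUE is converted to 1
class(c(TRUE, 2, "a"))    # "character": everything becomes a string

# Converting back is possible when the strings represent numbers
as.numeric(c("2", "4"))   # 2 4
```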

A way to create vectors that is helpful in many applications is to use the functions rep() and seq(). rep() repeats a certain object as many times as specified and seq() creates a sequence of numbers.

# Repeat "a" five times
vector7 <- rep("a", 5)
vector7
[1] "a" "a" "a" "a" "a"
# Repeat 5 five times
vector8 <- rep(5,5)
vector8
[1] 5 5 5 5 5
# Repeat a vector 5 times
vector9 <- rep(vector3, 5)
vector9
 [1] 5 5 5 5 5 5 5 5 5 5
# Create a sequence from 0 to 10
vector10 <- seq(0,10)
vector10
 [1]  0  1  2  3  4  5  6  7  8  9 10
# Create a sequence from 0 to 20 in steps of 5
vector11 <- seq(0,20,5)
vector11
[1]  0  5 10 15 20
# Create a sequence from 5 to 15 in 7 equal steps (i.e., 8 values)
seq(5, 15, length.out = 7 + 1)
[1]  5.000000  6.428571  7.857143  9.285714 10.714286 12.142857 13.571429
[8] 15.000000
# A sequence of integers is also created easily as
vector12 <- 1:5
vector13 <- 5:-4
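
Both functions have further arguments that are worth knowing. The sketch below shows the times and each arguments of rep(), the by argument of seq(), and the base R helpers seq_len() and seq_along(), which are not used elsewhere in this notebook.

```r
rep(c(1, 2), times = 3)       # repeat the whole vector: 1 2 1 2 1 2
rep(c(1, 2), each = 3)        # repeat each element:     1 1 1 2 2 2
seq(0, 1, by = 0.25)          # 0.00 0.25 0.50 0.75 1.00
seq_len(4)                    # 1 2 3 4
seq_along(c("x", "y", "z"))   # index positions of a vector: 1 2 3
```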

Moreover, empty vectors can be created with indication of the length and type of vector.

empty.vector <- vector(mode="numeric", length=4)
empty.vector
[1] 0 0 0 0
empty.vector2 <- vector(mode = "character", length=2)
empty.vector2
[1] "" ""

More details can be found in the help files. You can access them by typing ?vector.

2.2.2 Accessing Elements of Vectors

Base R provides a set of operations that can be performed on vectors. If you want to access a particular element of a vector, you can do so with square brackets.

# Access the first element of vector1
vector1[1]
[1] 2
# Access element 4 of vector1
vector1[4]
[1] 4

It is possible to access more than one element at the same time. In this case you need to create a vector with an index of the elements you want to access.

# Remember we already created a vector called "vector1"
vector1 
[1] 2 4 1 4 5
# Access element 1 and 4 of vector1
vector1[c(1,4)]
[1] 2 4
# The index vector can also be specified before
index = c(1,3,5)
vector1[index]
[1] 2 1 5
# It is also possible to declare which elements should NOT be accessed by using -
vector1[-index]
[1] 4 4

Alternatively, it is also possible to use logical statements.

vector1[vector1<3]
[1] 2 1
vector1[vector1==4]
[1] 4 4
# "OR" operation with |
vector1[vector1==4 | vector1<2]
[1] 4 1 4
# "AND" operation with & 
vector1[vector1>1 & vector1<4]
[1] 2

Sometimes it is also helpful to state conditions on the values of a vector in order to select elements, i.e., to find those entries that satisfy a certain condition. For instance, the next command returns a vector that indicates for each element whether the condition is TRUE or FALSE.

vector1 > 2
[1] FALSE  TRUE FALSE  TRUE  TRUE

Now let’s show the entries that satisfy the condition (as we did before).

vector1[vector1 > 2]
[1] 4 4 5

Sometimes we want to know which of the entries in the vector (in terms of their place in the row of elements) are the ones that satisfy the condition. We can use the which() function.

which(vector1 > 2)
[1] 2 4 5

Hence, we find that the 2nd, 4th, and 5th entries in the vector satisfy the condition (i.e., they are greater than two). To see this, recall what vector1 looks like.

vector1
[1] 2 4 1 4 5
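
Logical vectors like the one above can also be summarized directly, because R treats TRUE as 1 and FALSE as 0 in calculations. A short sketch with base R functions:

```r
vector1 <- c(2, 4, 1, 4, 5)

sum(vector1 > 2)     # number of entries greater than two: 3
mean(vector1 > 2)    # share of entries greater than two: 0.6
any(vector1 > 4)     # is at least one entry greater than four? TRUE
all(vector1 > 0)     # are all entries positive? TRUE
which.max(vector1)   # position of the largest entry: 5
```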

2.2.3 Operations for Vectors

In R it is possible to calculate with vectors. Many mathematical functions that are applied to single numbers can also be applied to the elements of a vector.

# Power of 2
4^2 
[1] 16
5^2
[1] 25
10^2
[1] 100
# Can also be applied to a vector with the same elements
# the operation is executed elementwise
c(4,5,10)^2
[1]  16  25 100
# Alternatively, subtract a number from each of the elements
c(4,5,10) - 10
[1] -6 -5  0
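
The calculations above combine a vector with a single number. More generally, when two vectors of different lengths are combined, R repeats ("recycles") the shorter one until it matches the longer one. The sketch below uses two hypothetical vectors, long and short, to illustrate this rule.

```r
long  <- c(1, 2, 3, 4)
short <- c(10, 20)

long + short   # short is used as 10 20 10 20: 11 22 13 24
long * short   # 10 40 30 80
long - 10      # a single number is recycled for every element: -9 -8 -7 -6
```

If the length of the longer vector is not a multiple of the length of the shorter one, R still recycles but issues a warning.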

R provides mathematical functions to calculate the sum sum() and the mean mean() of the elements of a vector. Moreover, it offers calls to compute the length of a vector.

sum(vector1)
[1] 16
mean(vector1)
[1] 3.2
length(vector1)
[1] 5

It is also possible to put vectors together with append() or simply c(), to reverse a vector with rev(), and to extract its distinct elements with unique().

append(vector1, vector3)
[1] 2 4 1 4 5 5 5
rev(vector1)
[1] 5 4 1 4 2
unique(vector1)
[1] 2 4 1 5

2.2.4 Matrices, Arrays, Lists and Data Frames

2.2.4.1 Matrices

Vectors are probably the most important objects in R. However, in most analyses the data are multivariate, i.e. they comprise more than one variable. First, let us consider matrices.

# By default matrices are "filled" up by columns
mat1 = matrix(c("a","b","c","d"), ncol = 2, nrow = 2)
mat1
     [,1] [,2]
[1,] "a"  "c" 
[2,] "b"  "d" 
# They can also be "filled" up by rows
mat2 = matrix(c("a","b","c","d"), ncol = 2, nrow = 2, byrow = TRUE)
mat2
     [,1] [,2]
[1,] "a"  "b" 
[2,] "c"  "d" 
# Check that mat1 and the transposed mat2 are the same
mat1 == t(mat2) # (elementwise comparison)
     [,1] [,2]
[1,] TRUE TRUE
[2,] TRUE TRUE

A matrix has two dimensions, rows and columns. These can be accessed by square brackets, where the first entry in brackets refers to rows and the second entry (separated by a ,) to columns.

# First row of matrix 1
mat1[1,]
[1] "a" "c"
# First column of matrix 1
mat1[,1]
[1] "a" "b"

A row or a column of a matrix is a vector, again.

# To check this
class(mat1[1,])
[1] "character"
class(mat1[,1])
[1] "character"
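
If you want the selected row or column to remain a matrix, you can pass drop = FALSE inside the square brackets. The sketch below recreates mat1 and compares both variants; the object names row1 and row1_mat are chosen for illustration only.

```r
mat1 <- matrix(c("a", "b", "c", "d"), ncol = 2, nrow = 2)

row1     <- mat1[1, ]                 # plain character vector
row1_mat <- mat1[1, , drop = FALSE]   # stays a 1x2 matrix

dim(row1)       # NULL, because a vector has no dimensions
dim(row1_mat)   # 1 2
mat1[2, 1]      # a single entry (row 2, column 1): "b"
```

This matters when later code expects a matrix, for example when calling apply() on the selection.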

Conversely, matrices can be constructed by binding vectors together column-wise with cbind() or row-wise with rbind().

mat2 = cbind(vector1, rev(vector1))
mat2
     vector1  
[1,]       2 5
[2,]       4 4
[3,]       1 1
[4,]       4 4
[5,]       5 2
mat3 = rbind(vector1, rev(vector1))
mat3
        [,1] [,2] [,3] [,4] [,5]
vector1    2    4    1    4    5
           5    4    1    4    2

2.2.4.2 Arrays

Matrices have two dimensions. However, one might need more dimensions than rows and columns. For instance, imagine you want to draw random numbers, store them in a matrix, and repeat this 4 times. Then arrays are useful.

# Create an empty array with 3 dimensions, for instance 
# an arrangement of 4 2x2 matrices 
arr1 <- array(NA, dim = c(2,2,4))
arr1
, , 1

     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA

, , 2

     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA

, , 3

     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA

, , 4

     [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
# Say you want to draw four random numbers, store them in a 2x2 matrix,
# and repeat this 4 times. The four numbers drawn in the same repetition
# are stored in one matrix.

# A seed is set to make this example reproducible
set.seed(1234)

# Note: You have to run the previous lines of code (creating arr1 and
# setting the seed) before running the loop!

# We can implement this in a loop (although this is not the most elegant way)
for (i in 1:dim(arr1)[3]){
  arr1[,,i] <- rnorm(4)
}

arr1
, , 1

           [,1]      [,2]
[1,] -1.2070657  1.084441
[2,]  0.2774292 -2.345698

, , 2

          [,1]       [,2]
[1,] 0.4291247 -0.5747400
[2,] 0.5060559 -0.5466319

, , 3

           [,1]       [,2]
[1,] -0.5644520 -0.4771927
[2,] -0.8900378 -0.9983864

, , 4

            [,1]       [,2]
[1,] -0.77625389  0.9594941
[2,]  0.06445882 -0.1102855

2.2.4.3 Lists

Another common object type in R is a list. The great advantage of a list is that it allows you to collect items of different classes.

mylist <- list("hello", 1, arr1, vector1, mat1)
mylist
[[1]]
[1] "hello"

[[2]]
[1] 1

[[3]]
, , 1

           [,1]      [,2]
[1,] -1.2070657  1.084441
[2,]  0.2774292 -2.345698

, , 2

          [,1]       [,2]
[1,] 0.4291247 -0.5747400
[2,] 0.5060559 -0.5466319

, , 3

           [,1]       [,2]
[1,] -0.5644520 -0.4771927
[2,] -0.8900378 -0.9983864

, , 4

            [,1]       [,2]
[1,] -0.77625389  0.9594941
[2,]  0.06445882 -0.1102855


[[4]]
[1] 2 4 1 4 5

[[5]]
     [,1] [,2]
[1,] "a"  "c" 
[2,] "b"  "d" 

A very useful property of lists is that the collected items can be accessed by their names via a $ operator.

# First let us assign names to the elements of the list
names(mylist) = c("Word", "Number", "Array", "Vector", "Matrix")

# Now we can call the elements by their names 
mylist$Word
[1] "hello"
mylist$Matrix
     [,1] [,2]
[1,] "a"  "c" 
[2,] "b"  "d" 

Alternatively, you can access the elements of a list via their position using [[]].

mylist[[1]]
[1] "hello"
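
Single square brackets also work on lists, but they return a (shorter) list rather than the element itself. The sketch below uses a smaller, hypothetical version of mylist to show the difference and to add a new element by name.

```r
# Smaller list for illustration; names are assigned directly in list()
mylist <- list(Word = "hello", Number = 1, Vector = c(2, 4, 1, 4, 5))

class(mylist["Word"])     # "list": single brackets keep the list structure
class(mylist[["Word"]])   # "character": double brackets extract the element

# Add a new element by assigning to a new name
mylist$Flag <- TRUE
names(mylist)
length(mylist)            # 4
```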

2.2.4.4 Data Frames

Let us now turn to data frames, the class of object that is probably used most frequently in applied analyses. Data frames share the benefits of both matrices, i.e. the rows and columns of a data set can be accessed via square brackets, and lists, i.e. it is possible to use the $ operator and exploit the naming of variables. Moreover, the different variables collected in a data frame can be of different classes, e.g. one characteristic is a numeric variable (e.g. age) and another is categorical (e.g. occupation). A data frame can be constructed from vectors or from a matrix.

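# Age, occupation, and income
age = c(38, 20, 45, 20)
occ = c("bus driver", "barkeeper", "nurse", "student")
inc = c(2000, 1400, 1400, 800)
# stringsAsFactors = TRUE stores occ as a factor (a categorical variable);
# since R 4.0.0 the default is FALSE, which would keep occ as character
df = data.frame(age, occ, inc, stringsAsFactors = TRUE)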
df
  age        occ  inc
1  38 bus driver 2000
2  20  barkeeper 1400
3  45      nurse 1400
4  20    student  800
# A matrix would not work well here because all entries would be
# converted to characters (try!)
#matrix = as.matrix(cbind(age, occ))

# Now variables can be accessed via the dimension operators
df[1,]
  age        occ  inc
1  38 bus driver 2000
df[2,]
  age       occ  inc
2  20 barkeeper 1400
df[,1]
[1] 38 20 45 20
df[,2]
[1] bus driver barkeeper  nurse      student   
Levels: barkeeper bus driver nurse student
# Or via the $ operator
df$age
[1] 38 20 45 20
df$occ
[1] bus driver barkeeper  nurse      student   
Levels: barkeeper bus driver nurse student
# It is also possible to access the data via logical statements
df[df$age <40,]
  age        occ  inc
1  38 bus driver 2000
2  20  barkeeper 1400
4  20    student  800
df[df[,3]>1000,]
  age        occ  inc
1  38 bus driver 2000
2  20  barkeeper 1400
3  45      nurse 1400
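
A few further operations are often needed in practice. The following sketch recreates the data frame and shows how to inspect its structure, add a column, sort the rows, and select rows and columns with subset(). The column inc_year and the object df_sorted are hypothetical additions for illustration.

```r
df <- data.frame(
  age = c(38, 20, 45, 20),
  occ = c("bus driver", "barkeeper", "nurse", "student"),
  inc = c(2000, 1400, 1400, 800)
)

str(df)                        # compact overview of all columns

df$inc_year <- df$inc * 12     # add a new column (hypothetical yearly income)

# Sort by age (ascending) and, within the same age, by income (descending)
df_sorted <- df[order(df$age, -df$inc), ]
df_sorted$occ

# Rows with age < 40 and inc > 1000, keeping only two columns
subset(df, age < 40 & inc > 1000, select = c(occ, inc))
```

subset() is a convenient shortcut for interactive work; the bracket notation shown above does the same job.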

2.3 Working with Objects - Loops and Functions

2.3.1 Loops

Loops are an integral part of basic programming. A loop is a sequence of instructions that are repeated until a certain condition is reached. For instance, one might be interested in the cumulative maximum of the square of every element in the following vector v.

v <- c(1,4,2,10,5)

For our task “compute the cumulative maximum of the square of every element in the vector v”, the loop might look like

# Begin the loop with a `for` statement and choose running index with starting 
# and ending value
for (i in 1:length(v)) {
  
  # For each element in the vector, compute the square
  square <- v[i]^2
  
  # Compute the cumulative maximum
  if (i == 1) {
    # The very first element in the loop is the 
    # cumulative maximum by definition 
    maxsq = square
  }
  
  if (i > 1) {
    # Cumulative maximum for the 
    # second, third, ...., last element in 
    # the loop
    maxsq <- max(square, maxsq)
  }
  
  # Print (i.e. show) the cumulative maximum
  
  print(maxsq) 
}
[1] 1
[1] 16
[1] 16
[1] 100
[1] 100
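
The loop above only prints the cumulative maximum. If you want to keep the values for later use, one option is to create a result vector of the right length before the loop and fill it step by step. This is a sketch; the name result is chosen for illustration, and seq_along(v) is used in place of 1:length(v) because it also behaves correctly for an empty vector.

```r
v <- c(1, 4, 2, 10, 5)

# Create the result vector before the loop
result <- numeric(length(v))

for (i in seq_along(v)) {
  square <- v[i]^2
  if (i == 1) {
    # The first square is the cumulative maximum by definition
    result[i] <- square
  } else {
    # Compare the current square with the previous cumulative maximum
    result[i] <- max(square, result[i - 1])
  }
}

result   # 1 16 16 100 100
```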

2.3.2 Functions

A function is a collection of statements that you can execute wherever and whenever you want (Langtangen 2011, 1:93). Functions are central to statistical programming as they allow you to perform any kind of operation that takes an object as input and gives another object as output. The operation itself is defined by the user. Suppose you want to implement the sum of squares of two numbers, i.e., doing something like the following

2^2 + 4^2
[1] 20

You can write a function, which takes any two values as input and returns the sum of the squared input variables as output.

# x and y are the input objects 
myfunc <- function(x, y){
  # this is the operation we want to implement
  z = x^2 + y^2
  # this is the output of the function
  return(z)
}

# Apply function to test if it does what we want it to do
myfunc(2, 4)
[1] 20

Although the function might be a bit artificial in this example, the great advantage of a self-written function is that it can now be applied to any input values. Moreover, we know that the function does nothing other than what we told it to do (except for programming errors!).

# Run the function for another pair of values 
myfunc(4, 6)
[1] 52
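
Functions can also have default values for their arguments and can check their inputs. The following sketch generalizes the example; the function name sum_of_powers() and the argument power are hypothetical and not part of the notebook.

```r
# Hypothetical generalization of myfunc(): power defaults to 2
sum_of_powers <- function(x, y, power = 2) {
  # Stop with an error if an input is not numeric
  stopifnot(is.numeric(x), is.numeric(y), is.numeric(power))
  x^power + y^power
}

sum_of_powers(2, 4)                 # uses the default power = 2: 20
sum_of_powers(2, 4, power = 3)      # 2^3 + 4^3 = 72
sum_of_powers(c(1, 2), c(3, 4))     # works elementwise on vectors: 10 20
```

Without an explicit return(), an R function returns the value of its last evaluated expression.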

2.4 Optional: Vectorization

2.4.1 Basic Vectorization

R is a vectorized language, i.e. most functions can be applied not only to a single object but be executed for a collection of them. As in the examples above, it is possible to compute not only the square of a single number, but to calculate it for all elements of a vector.

In some cases, a loop is not the best way of implementation. For instance, always writing a loop from scratch might be prone to errors and sometimes a loop can be slow in terms of computation time. The task we want to execute can be implemented in an alternative way which makes use of vectorization in R. Thereby, you can take away two R lessons:

  1. First, make use of the vectorized logic of R and,
  2. Use implemented R functions.

Vectorization in R means that a certain function can be applied to a vector of elements. For instance, v^2 computes the squares for each element of the vector in R making it possible to replace the first line in the loop.

# Use vectorized function instead of a loop
v2 <- v^2

Moreover, base R provides a function to compute a cumulative maximum, cummax(). Therefore, there is no need to use a loop at all. Using built-in R functions helps you to save time and to avoid mistakes. Hence, doing some research on common problems can be very rewarding.

cummax(v2)
[1]   1  16  16 100 100
# or, equivalently
cummax(v^2)
[1]   1  16  16 100 100

2.4.2 More Advanced Vectorization

2.4.3 apply() and lapply()

In many cases, the task to be implemented in an R program or function is more complicated than in the previous example. Additional complexity already arises when using objects that are more complex than vectors, such as matrices or lists.

R provides the apply() family of functions, which allows you to apply a function to a collection of objects. Suppose now that we want to compute the cumulative maximum of the squares not only for one vector but for 5 vectors.

u <- c(5, 8, 1, 0, 2)
w <- c(1, 2, 1, 5, 1)
y <- c(8, 9, 1, 9, 8)
z <- c(10, 10, 10, 12, 10)

The function apply() makes it possible to execute a function along one dimension of a matrix, say on all columns. We could implement our task as follows.

# Bind the vectors to a matrix (the vectors correspond to the columns)
mat <- cbind(u, v, w, y, z)

# apply the function to each column in the matrix (MARGIN = 2)
apply(mat, 2, function(x) cummax(x^2))
      u   v  w  y   z
[1,] 25   1  1 64 100
[2,] 64  16  4 81 100
[3,] 64  16  4 81 100
[4,] 64 100 25 81 144
[5,] 64 100 25 81 144

Alternatively, we can redefine our self-written function myfunc() and call it via apply(), too.

# x is the input object here
myfunc <- function(x){
  # this is the operation we want to implement
  y <- cummax(x^2)
  # this is the output of the function
  return(y)
}

# Call the redefined function myfunc() via apply()
apply(mat, 2, myfunc)
      u   v  w  y   z
[1,] 25   1  1 64 100
[2,] 64  16  4 81 100
[3,] 64  16  4 81 100
[4,] 64 100 25 81 144
[5,] 64 100 25 81 144

Alternatively, one can bind the vectors together as the rows of a matrix and apply the function to the rows. We obtain the same results.

# Bind the vectors to a matrix (the vectors correspond to the rows)
mat = rbind(u, v, w, y, z)

# apply the function to each row in the matrix (MARGIN = 1)
apply(mat, 1, function(x) cummax(x^2))
      u   v  w  y   z
[1,] 25   1  1 64 100
[2,] 64  16  4 81 100
[3,] 64  16  4 81 100
[4,] 64 100 25 81 144
[5,] 64 100 25 81 144

Combining the vectors into a matrix works in this case. However, a problem arises when the vectors have different lengths. Suppose that, in addition, we have a sixth vector x, which is longer than the others. Then the matrix approach would no longer work properly unless the other vectors are extended appropriately (cbind() would recycle the shorter vectors and issue a warning). A solution is to collect the vectors in a list and execute our function via lapply().

x = c(1,2,3,5,6,1,5,6)

list1 = list(u, v, w, y, z, x)

# lapply() executes the function for each element of a list
lapply(list1, function(x) cummax(x^2))
[[1]]
[1] 25 64 64 64 64

[[2]]
[1]   1  16  16 100 100

[[3]]
[1]  1  4  4 25 25

[[4]]
[1] 64 81 81 81 81

[[5]]
[1] 100 100 100 144 144

[[6]]
[1]  1  4  9 25 36 36 36 36
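
lapply() always returns a list. When every result is a single number, sapply() can be more convenient because it simplifies the output to a vector. The sketch below recreates the six vectors in a named list (the names are added here for readability) and computes the largest square and the length of each vector.

```r
list1 <- list(
  u = c(5, 8, 1, 0, 2),
  v = c(1, 4, 2, 10, 5),
  w = c(1, 2, 1, 5, 1),
  y = c(8, 9, 1, 9, 8),
  z = c(10, 10, 10, 12, 10),
  x = c(1, 2, 3, 5, 6, 1, 5, 6)
)

# One number per list element, returned as a named vector
max_sq <- sapply(list1, function(x) max(x^2))
max_sq                  # u = 64, v = 100, w = 25, y = 81, z = 144, x = 36

sapply(list1, length)   # lengths of the vectors: 5 5 5 5 5 8
```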

2.4.4 Relatives to lapply(): Map(), mapply(), sapply()

An equivalent implementation can be achieved using the Map() function or, similarly, the mapply() function.

Map(function(x) cummax(x^2), list1) 
[[1]]
[1] 25 64 64 64 64

[[2]]
[1]   1  16  16 100 100

[[3]]
[1]  1  4  4 25 25

[[4]]
[1] 64 81 81 81 81

[[5]]
[1] 100 100 100 144 144

[[6]]
[1]  1  4  9 25 36 36 36 36
mapply(function(x) cummax(x^2), list1)
[[1]]
[1] 25 64 64 64 64

[[2]]
[1]   1  16  16 100 100

[[3]]
[1]  1  4  4 25 25

[[4]]
[1] 64 81 81 81 81

[[5]]
[1] 100 100 100 144 144

[[6]]
[1]  1  4  9 25 36 36 36 36

The Map() function allows for a yet more general way of vectorization using multiple arguments.

For instance, suppose we are given a second list of the same length. For each of the six elements, we would like to know the maximum of the cumulative maxima of the squared entries across the two lists.

list2 = list(c(1:5), c(5:8), c(10:-1), c(0:5), c(2:8), c(9:4))
mapply(function(x,y) max(cummax(x^2), cummax(y^2)), list1, list2)
[1]  64 100 100  81 144  81
# Check
Map(function(x) cummax(x^2), list1) 
[[1]]
[1] 25 64 64 64 64

[[2]]
[1]   1  16  16 100 100

[[3]]
[1]  1  4  4 25 25

[[4]]
[1] 64 81 81 81 81

[[5]]
[1] 100 100 100 144 144

[[6]]
[1]  1  4  9 25 36 36 36 36
Map(function(x) cummax(x^2), list2)
[[1]]
[1]  1  4  9 16 25

[[2]]
[1] 25 36 49 64

[[3]]
 [1] 100 100 100 100 100 100 100 100 100 100 100 100

[[4]]
[1]  0  1  4  9 16 25

[[5]]
[1]  4  9 16 25 36 49 64

[[6]]
[1] 81 81 81 81 81 81

The functions sapply() and vapply() apply a specified function to each element of a vector or list and simplify the result to a vector or matrix where possible; vapply() additionally checks the results against a pre-specified type and length. However, in our example this would not simplify our code. Vectorization provides elegant and efficient implementations, which pays off in large data sets. Moreover, vectorization paves the way for parallelization, i.e. executing the task on a number of nodes/cores at the same time. For instance, the R package parallel provides parallelized versions of lapply() (mclapply()) and Map() (mcMap()).
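
To make this more concrete, the following sketch uses vapply() with FUN.VALUE = numeric(1) to state that each result must be a single number, and then calls mclapply() from the parallel package, which ships with R. The shortened list and the object names are chosen for illustration. mc.cores = 1 is used so that the code runs on every operating system; on Linux and macOS a larger value distributes the work across several cores.

```r
library(parallel)

# Shortened list of vectors for illustration
list1 <- list(c(5, 8, 1, 0, 2), c(1, 4, 2, 10, 5), c(1, 2, 3, 5, 6, 1, 5, 6))

# vapply(): like sapply(), but the type and length of each result are declared
final_max <- vapply(list1, function(x) max(cummax(x^2)), FUN.VALUE = numeric(1))
final_max   # 64 100 36

# Serial and parallelized versions of the same task
serial_res   <- lapply(list1, function(x) cummax(x^2))
parallel_res <- mclapply(list1, function(x) cummax(x^2), mc.cores = 1)

identical(serial_res, parallel_res)
```

For small tasks like this one, parallelization does not save time; it becomes worthwhile when each function call is expensive.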

References

Langtangen, Hans Petter. 2011. A Primer on Scientific Programming with Python. Vol. 1. Springer.