11  Lists

11.1 Lists

It’s time to introduce another data structure: the list.

Lists are very flexible because you can put anything you want in them: unlike a vector, the elements of a list can have different data types. For example:

list_example <- list(1, "a", TRUE)
list_example
[[1]]
[1] 1

[[2]]
[1] "a"

[[3]]
[1] TRUE
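
To see why this matters, it can help to put the same three values into a vector with c() and compare the result. The following is a small sketch; the names as_vector and as_list are only illustrative and are not used elsewhere in this lesson.

```r
# c() builds an atomic vector, so mixed values are coerced to one common type
as_vector <- c(1, "a", TRUE)
class(as_vector)        # "character": every value became a string

# list() keeps each element's original type
as_list <- list(1, "a", TRUE)
sapply(as_list, class)  # "numeric" "character" "logical"
```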

Like a vector, the “length” of a list corresponds to how many entries it contains:

length(list_example)
[1] 3

The elements of a list can also have names. To name an element, put the name before its value, separated by an equals sign:

another_list <- list(title = "Numbers", numbers = 1:10, data = TRUE)
another_list
$title
[1] "Numbers"

$numbers
 [1]  1  2  3  4  5  6  7  8  9 10

$data
[1] TRUE
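
Names do not have to be supplied when the list is created. As a brief sketch (the object unnamed is a hypothetical name introduced here), names() can be used both to look up the names of a list and to assign them afterwards:

```r
another_list <- list(title = "Numbers", numbers = 1:10, data = TRUE)

# names() returns the element names as a character vector
names(another_list)               # "title" "numbers" "data"

# names can also be assigned after the list has been created
unnamed <- list("Numbers", 1:10, TRUE)
names(unnamed) <- c("title", "numbers", "data")
identical(unnamed, another_list)  # TRUE
```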

Lists can become a lot more complicated than vectors. While each entry of a vector is just a single value, each entry of a list can be any type of object, including vectors and data frames. For example, the following list, complicated_list, has three entries: a numeric vector, a data frame, and a single character value:

# define the cats data frame:
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1),
                   stringsAsFactors = FALSE)
complicated_list <- list(vec = c(1, 2, 9),
                         dataframe = cats, 
                         single_value = "a")
complicated_list
$vec
[1] 1 2 9

$dataframe
    coat weight likes_string
1 calico    2.1            1
2  black    5.0            0
3  tabby    3.2            1

$single_value
[1] "a"
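
When a list holds several different kinds of objects, printing it in full can get long. One way to get an overview, sketched here with the same complicated_list as above, is to ask for its structure with str() or for the class of each entry with sapply():

```r
# same objects as defined above, repeated so this block runs on its own
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1),
                   stringsAsFactors = FALSE)
complicated_list <- list(vec = c(1, 2, 9),
                         dataframe = cats,
                         single_value = "a")

# str() prints a compact summary of each entry
str(complicated_list)

# sapply() applies class() to every top-level entry of the list
sapply(complicated_list, class)
```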

Challenge 1

Create a list of length two containing (1) a character vector with the letters “x”, “y”, and “z” and (2) a data frame with two columns that looks like this:

    name grade
1  Henry     A
2 Hannah     B
3 Harvey     C

Your list output should look like this:

[[1]]
[1] "x" "y" "z"

[[2]]
    name grade
1  Henry     A
2 Hannah     B
3 Harvey     C

list(c("x", "y", "z"),
     data.frame(name = c("Henry", "Hannah", "Harvey"), grade = c("A", "B", "C")))
[[1]]
[1] "x" "y" "z"

[[2]]
    name grade
1  Henry     A
2 Hannah     B
3 Harvey     C

11.1.1 List subsetting

Now we’ll turn to subsetting lists. There are three operators used to subset lists, all of which we’ve already seen when learning about vectors and data frames: [, [[, and $.

Using [ will always return a list. If you want to subset a list, but not extract an element, then you will likely use [.

xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))
xlist[1]
$a
[1] "Software Carpentry"

This returns a list with one element.

We can subset elements of a list the same way as atomic vectors using [. Comparison operations, however, won’t work because they are not recursive: they will try to condition on the data structures in each element of the list, not on the individual elements within those data structures.

xlist[1:2]
$a
[1] "Software Carpentry"

$b
 [1]  1  2  3  4  5  6  7  8  9 10
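
If you do want to select list elements by a condition, one possible approach is to build the logical vector yourself, one value per list element, and then pass it to [. The sketch below uses sapply() with is.numeric(); the name is_num is an illustrative choice. To compare the values inside an element, that element has to be extracted first.

```r
xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))

# one TRUE/FALSE per list element: is the element itself numeric?
is_num <- sapply(xlist, is.numeric)
is_num

# use the logical vector with [ to keep only the matching elements
xlist[is_num]

# to compare values inside an element, extract that element first
xlist$b[xlist$b > 5]
```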

To extract individual elements of a list, you need to use the double-square bracket function: [[.

xlist[[1]]
[1] "Software Carpentry"

Notice that now the result is a vector, not a list.
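
The difference between [ and [[ matters as soon as you pass the result to another function. As a short illustration using the second element of xlist:

```r
xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))

class(xlist[2])    # "list": still a list, with one element
class(xlist[[2]])  # "integer": the vector stored in that element

# functions that expect a vector need the extracted element
sum(xlist[[2]])    # 55
# sum(xlist[2]) would fail, because sum() cannot add up a list
```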

You can’t extract more than one element at once:

xlist[[1:2]]
Error in xlist[[1:2]]: subscript out of bounds

Nor use it to skip elements:

xlist[[-1]]
Error in xlist[[-1]]: invalid negative subscript in get1index <real>
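
Both of these operations do work with single brackets, because [ returns a list. The first error above also has a specific cause: when [[ is given a vector of indices on a list, R applies them one after another, so xlist[[1:2]] is read as xlist[[1]][[2]], and the first element has no second entry. A minimal sketch of these behaviours:

```r
xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))

# [ can select several elements, or drop some with a negative index
names(xlist[1:2])  # "a" "b"
names(xlist[-1])   # "b" "data"

# [[ with a vector indexes step by step: element 2, then entry 3 within it
xlist[[c(2, 3)]]   # 3, the same as xlist[[2]][[3]]
```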

But you can use names to both subset and extract elements:

xlist[["a"]]
[1] "Software Carpentry"

The $ function is shorthand for extracting elements by name:

xlist$data
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
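
One practical difference between $ and [[ is worth knowing: $ takes the name literally, so it cannot be used with a name stored in a variable, while [[ can. In this sketch, element_name is a hypothetical variable introduced only for illustration:

```r
xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))

element_name <- "b"    # the name of the element we want, stored as a string

# [[ evaluates the variable and looks up the element called "b"
xlist[[element_name]]

# $ looks for an element literally called "element_name", which does not exist
xlist$element_name     # NULL
```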

Challenge 2

Given the following list:

xlist <- list(a = "Software Carpentry", b = 1:10, data = head(mtcars))
xlist
$a
[1] "Software Carpentry"

$b
 [1]  1  2  3  4  5  6  7  8  9 10

$data
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Using list and vector subsetting, extract the second entry in the second element of the list (i.e., extract the number 2 from the b entry from xlist). There are several ways to do this. Compare your answer with your neighbor. Did you do it the same way?

xlist$b[2]
[1] 2
xlist[[2]][2]
[1] 2
xlist[["b"]][2]
[1] 2

11.1.2 Data frames as a special case of a list

It turns out that a data frame is a special kind of list. Specifically, a data frame is a list of vectors of the same length.

This is why you can extract vector columns from a data frame using the double brackets notation:

cats
    coat weight likes_string
1 calico    2.1            1
2  black    5.0            0
3  tabby    3.2            1
cats[["coat"]]
[1] "calico" "black"  "tabby" 
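
You can check this for yourself with functions that work on lists. The following sketch uses the cats data frame from earlier in this chapter:

```r
# same data frame as defined earlier, repeated so this block runs on its own
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1),
                   stringsAsFactors = FALSE)

is.list(cats)        # TRUE: a data frame is a list
length(cats)         # 3: the list has one entry per column
names(cats)          # the column names are the names of the list entries

# list tools such as sapply() work column by column
sapply(cats, class)
```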

Note that the df[i, j] index notation is specific to data frames (and does not work for lists).
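
As a brief illustration of this last point, the sketch below converts cats to a plain list with as.list() (the name cats_list is an illustrative choice) and wraps the failing call in try() so the script keeps running:

```r
cats <- data.frame(coat = c("calico", "black", "tabby"),
                   weight = c(2.1, 5.0, 3.2),
                   likes_string = c(1, 0, 1),
                   stringsAsFactors = FALSE)

# row 2, column "weight" of the data frame
cats[2, "weight"]                  # 5

# the same columns as a plain list: two-dimensional indexing is an error
cats_list <- as.list(cats)
result <- try(cats_list[2, "weight"], silent = TRUE)
class(result)                      # "try-error"

# with a plain list, extract the element first, then index into it
cats_list[["weight"]][2]           # 5
```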