8 Introduction to Matrices
In this chapter, our main objectives are to delve into the realm of spatial data analysis by working with raster data. We will introduce you to various packages beyond “base R” that are specifically designed for handling spatial data. We will also explore the fundamental data structures for spatial data analysis, including matrix and array, as well as their spatial counterparts, namely the stars class for single-band rasters and multi-band rasters.
Throughout this chapter, we will learn how to access and manipulate cell values and other properties of rasters. Additionally, we will gain insights into reading and writing raster data, which are essential skills for working with spatial data.
To achieve these aims, we will be utilizing the following R packages:
- terra: This package provides a versatile and efficient framework for working with spatial data, including raster data. It offers a wide range of functionalities for data manipulation, analysis, and visualization.
- tidyterra: An extension of the tidyverse ecosystem, tidyterra complements the terra package by providing a set of tidyverse-style functions and workflows for working with spatial data in a tidy and consistent manner.
- raster: The raster package is a well-established package for handling raster data in R. It offers a comprehensive set of functions for reading, writing, processing, and analyzing raster datasets.
- sf: The sf package is focused on handling vector data, providing classes and functions for working with spatial geometries. However, it also includes support for working with raster data, making it a valuable package for spatial data analysis.
By mastering these packages and their functionalities, you will gain the skills and tools necessary to explore, analyze, and visualize spatial data effectively.
8.1 Matrices
Matrices are a fundamental data structure in R that allows you to store and manipulate data in a two-dimensional format. They consist of rows and columns, where each element is identified by its row and column index. Matrices are useful for organizing and working with structured data, such as numerical or categorical values.
Matrices also support mathematical operations such as matrix multiplication, addition, and transposition, and they are commonly used in statistical analysis and data manipulation tasks. R provides many built-in functions and operators for working with matrices, allowing you to perform calculations, subsetting, and other operations efficiently.
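As a brief illustration of these operations, the sketch below applies addition, matrix multiplication, and transposition to two small matrices. The names a and b and their values are hypothetical, chosen only for this example:

```r
# Two small 2x2 matrices (illustrative values, filled column by column)
a <- matrix(1:4, nrow = 2)
b <- matrix(5:8, nrow = 2)

sum_ab  <- a + b     # element-wise addition
prod_ab <- a %*% b   # matrix multiplication
a_t     <- t(a)      # transposition

print(sum_ab)
print(prod_ab)
print(a_t)
```

Note that the * operator multiplies element by element, while %*% computes the matrix product.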
8.2 What is a matrix?
A matrix is a two-dimensional data structure in R that consists of rows and columns. It is similar to a table or a spreadsheet, where each element is identified by its row and column index. All elements of a matrix must share a single data type, such as numeric, character, or logical.
Unlike a data.frame, whose columns may each hold a different type, a matrix holds values of a single type. As in a data.frame, the number of values in all columns of a matrix is equal, and the same can be said about the rows. It is important to know how to work with matrices because they are a commonly used data structure, with many uses in data processing and analysis, including spatial data. For example, many R functions accept a matrix as an argument, or return a matrix as their result. Moreover, a matrix is used to store raster values.
8.3 Creating a matrix
In R, you can create a matrix using the matrix() function. The function takes a vector of values and parameters specifying the number of rows and columns. The matrix function accepts the following arguments:
- data: A vector of the values to fill into the matrix
- nrow: The number of rows
- ncol: The number of columns
- byrow: Whether the matrix is filled by column (FALSE, the default) or by row (TRUE)
For example, to create a 3x3 matrix with numeric values, you can use the following code:
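A call consistent with the output shown in this section, assuming the values 1 through 9 as the data vector, might look like this:

```r
# Fill a 3x3 matrix with the integers 1..9, column by column (the default)
my_matrix <- matrix(1:9, nrow = 3, ncol = 3)
print(my_matrix)
```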
This will create a matrix my_matrix with 3 rows and 3 columns, where the elements are filled in column-wise order.
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
Note that the class of matrix objects is a vector of length two, with the values "matrix" and "array":
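A minimal check, assuming a 3x3 matrix of the values 1 through 9 stored under the name my_matrix:

```r
my_matrix <- matrix(1:9, nrow = 3, ncol = 3)

print(class(my_matrix))      # "matrix" "array" in R 4.0.0 and later
print(is.array(my_matrix))   # TRUE: a matrix is a two-dimensional array
```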
This reflects the fact that the matrix class inherits from the more general array class. The nrow and ncol parameters determine the number of rows and number of columns, respectively. When only one of them is specified, the other is automatically determined based on the length of the data vector:
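For instance, assuming the same values 1 through 9, specifying only nrow is enough (the name m is illustrative):

```r
# Only nrow is given; ncol is inferred as length(data) / nrow = 9 / 3 = 3
m <- matrix(1:9, nrow = 3)
print(m)
print(ncol(m))
```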
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
Example 8.1 What do you think will happen when we try to create a matrix with fewer, or more, data values than the matrix size nrow*ncol? Run the following expressions to find out.
Solution.
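A sketch of the behavior, using two hypothetical calls chosen for illustration (they are assumptions, not necessarily the expressions intended for this exercise). Warnings are suppressed here so that only the resulting values are shown; depending on the R version, either call may emit a warning when run without suppressWarnings:

```r
# Too few values (4 values, 9 cells): the data vector is recycled
too_few <- suppressWarnings(matrix(1:4, nrow = 3, ncol = 3))
print(too_few)

# Too many values (12 values, 9 cells): only the first 9 are used
too_many <- suppressWarnings(matrix(1:12, nrow = 3, ncol = 3))
print(too_many)
```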
Example 8.2 Create a 3×3 matrix where all values are 1/9.
Finally, the byrow parameter determines the direction of filling the matrix with data values. In both cases the filling starts from the top-left corner (i.e., row 1, column 1). However, with byrow=FALSE the matrix is filled one column at a time (the default), while with byrow=TRUE the matrix is filled one row at a time. For example:
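A minimal sketch comparing the two filling directions, assuming the values 1 through 6 and the illustrative names by_col and by_row:

```r
# Filled one column at a time (byrow = FALSE is the default)
by_col <- matrix(1:6, nrow = 2)
print(by_col)

# Filled one row at a time
by_row <- matrix(1:6, nrow = 2, byrow = TRUE)
print(by_row)
```

The first row of by_col is 1, 3, 5, while the first row of by_row is 1, 2, 3.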
8.4 matrix properties
In R, a matrix is a two-dimensional data structure that contains elements of the same data type organized in rows and columns. Matrices are useful for various mathematical and statistical operations.
8.4.1 Dimensions
A matrix has a defined number of rows and columns, which determine its dimensions. The dimensions of a matrix can be obtained using the dim() function.
For example, if mat is a matrix, dim(mat) will return a vector containing the number of rows and columns.
The length function returns the number of values in a matrix:
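For instance, with a hypothetical 2x3 matrix named x:

```r
x <- matrix(1:6, nrow = 2)
print(length(x))   # 6 values in total (2 rows x 3 columns)
```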
Just like with a data.frame, the nrow and ncol functions return the number of rows and columns in a matrix, respectively:
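A short sketch, again assuming a hypothetical 2x3 matrix named x:

```r
x <- matrix(1:6, nrow = 2)
print(nrow(x))   # 2
print(ncol(x))   # 3
```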
Also like with a data.frame, the dim function gives both dimensions of the matrix as a vector of length 2, i.e., number of rows and columns, respectively:
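For the same kind of hypothetical 2x3 matrix x, this might look as follows:

```r
x <- matrix(1:6, nrow = 2)
d <- dim(x)
print(d)   # 2 3: rows first, then columns
```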
For example, R has a built-in dataset named volcano, which is a matrix of surface elevation. The sample script volcano.R, used in Section 2.1.1 to demonstrate working with R code files, creates a 3D image of elevation based on that matrix (Figure 2.2).
Example 8.3 Find out the number of elements, rows, and columns in the built-in matrix named volcano.
8.4.2 Row and column names
Like a data.frame, matrix objects also have row and column names, which can be accessed or modified using the rownames and colnames functions, respectively. Unlike data.frame row and column names, which are mandatory, matrix row and column names are optional. For example, matrices created with matrix initially do not have row and column names:
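A quick check, assuming a hypothetical 2x3 matrix named x:

```r
x <- matrix(1:6, nrow = 2)
print(rownames(x))   # NULL: no row names yet
print(colnames(x))   # NULL: no column names yet
```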
The matrix row and column names can be initialized, or modified, by assignment to the rownames and colnames properties:
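A minimal sketch of such an assignment. The matrix x and the names used here are illustrative assumptions:

```r
x <- matrix(1:6, nrow = 2)

# Assign names; each vector must match the corresponding dimension
rownames(x) <- c("a", "b")
colnames(x) <- c("var1", "var2", "var3")
print(x)
```

Once names are set, they can also be used for subsetting, e.g., x["b", "var3"].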
8.4.3 matrix conversions
In R, you can convert between different data structures, such as vectors, data frames, and matrices, using various functions and operations. Here are some common ways to convert a matrix into other data structures:
8.4.4 matrix → vector
In R, you can convert a matrix into a vector using the as.vector() or c() function. Here’s how you can perform the conversion:
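A short sketch of both approaches, assuming a hypothetical 2x3 matrix named x:

```r
x <- matrix(1:6, nrow = 2)

v1 <- as.vector(x)   # drops the dimensions, keeps the values
v2 <- c(x)           # same result for a matrix like this one
print(v1)
print(v2)
```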
Note that the matrix values are always arranged by column in the resulting vector!
Example 8.4 Does the volcano matrix contain any NA values? How can we check?
8.4.5 matrix → data.frame
To convert a matrix into a data frame in R, you can use the as.data.frame() function. Here’s how you can perform the conversion:
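For example, assuming a hypothetical 2x3 matrix x without row or column names:

```r
x <- matrix(1:6, nrow = 2)

df <- as.data.frame(x)
print(df)
print(names(df))      # "V1" "V2" "V3"
print(rownames(df))   # "1" "2"
```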
Note that row and column names are automatically generated (if they do not exist) as part of the conversion, since they are mandatory in a data.frame.
8.4.6 Transposing a matrix
In R, you can transpose a matrix, that is, interchange its rows and columns, using the t() function. The operation flips the matrix along its main diagonal: rows become columns and columns become rows. Here’s how you can transpose a matrix in R:
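A minimal sketch, assuming a hypothetical 2x3 matrix named x:

```r
x <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns
tx <- t(x)                   # 3 rows, 2 columns

print(x)
print(tx)
print(dim(tx))               # 3 2
```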
Example 8.5 What will be the result of t(t(x))?