Matrices are rectangular collections of numbers. In this module we will introduce them, review some basic operations, and then offer a sneak peek of why matrices are useful (and cool).
One number (for example, 12) is referred to as a scalar.
\[ a = 12 \]
We can put several scalars together to make a vector. Here is an example:
\[ \overrightarrow b = \begin{bmatrix} 12 \\ 14 \\ 15 \end{bmatrix} \]
Since this is a column of numbers, we cleverly refer to it as a column vector.
Here is another example of a vector, this time represented as a row vector:
\[ \overrightarrow c = \begin{bmatrix} 12 & 14 & 15 \end{bmatrix} \]
Column vectors are possibly more common and useful, but we sometimes write things down as row vectors to save space.
Vectors are fairly easy to construct in R. As we saw before, we can use the c() function to combine elements:
c(5, 25, -2, 1)
[1] 5 25 -2 1
Remember that the code above does not create any objects. To do so, you’d need to use the assignment operator (<-):
vector_example <- c(5, 25, -2, 1)
vector_example
[1] 5 25 -2 1
We can also create vectors from sequences with the : operator or the seq() function:
10:20
[1] 10 11 12 13 14 15 16 17 18 19 20
seq(from = 3, to = 27, by = 3)
[1] 3 6 9 12 15 18 21 24 27
The summation operator \(\sum\) (i.e., the uppercase Greek letter sigma) lets us perform an operation over a sequence of numbers, which is often but not always a vector. Consider the following vector:
\[\overrightarrow d = \begin{bmatrix} 12 & 7 & -2 & 3 & -1 \end{bmatrix}\]
We can then calculate the sum of the first three elements of the vector, which is expressed as follows: \[\sum_{i=1}^3 d_i\]
Then we do the following math: \[12+7+(-2)=17\]
It is also common to use \(n\) in the superscript to indicate that we want to sum all elements:
\[
\sum_{i=1}^n d_i = 12 + 7 + (-2) + 3 + (-1) = 19
\] We can perform these operations using the sum() function in R:
vector_d <- c(12, 7, -2, 3, -1)
sum(vector_d[1:3])
[1] 17
sum(vector_d)
[1] 19
The product operator \(\prod\) (i.e., the uppercase Greek letter pi) can also perform operations over a sequence of elements in a vector. Recall our previous vector:
\[\overrightarrow d = \begin{bmatrix} 12 & 7 & -2 & 3 & -1 \end{bmatrix}\]
We might want to calculate the product of all its elements, which is expressed as follows: \[\prod_{i=1}^n d_i = 12 \cdot 7 \cdot (-2) \cdot 3 \cdot (-1) = 504\]
In R, we can compute products using the prod() function:
prod(vector_d)
[1] 504
Get the product of the first three elements of vector \(d\). Write the notation by hand and use R to obtain the number.
We can append vectors together to form a matrix:
\[A = \begin{bmatrix} 12 & 14 & 15 \\ 115 & 22 & 127 \\ 193 & 29 & 219 \end{bmatrix}\]
The number of rows and columns of a matrix constitute the dimensions of the matrix. The first number is the number of rows (“r”) and the second number is the number of columns (“c”) in the matrix.
Find a way to remember “r x c” permanently. The order of the dimensions never changes.
Matrix \(A\) above, for example, is a \(3 \times 3\) matrix. Sometimes we’d refer to it as \(A_{3 \times 3}\).
It is common to use capital letters (sometimes bold-faced) to represent matrices. In contrast, vectors are usually represented with either bold lowercase letters or lowercase letters with an arrow on top (e.g., \(\overrightarrow v\)).
There are different ways to create matrices in R. One of the simplest is via rbind() or cbind(), which paste vectors together (either by rows or by columns):
# Create some vectors
vector1 <- 1:4
vector2 <- 5:8
vector3 <- 9:12
vector4 <- 13:16
# Using rbind(), each vector will be a row
rbind_mat <- rbind(vector1, vector2, vector3, vector4)
rbind_mat
[,1] [,2] [,3] [,4]
vector1 1 2 3 4
vector2 5 6 7 8
vector3 9 10 11 12
vector4 13 14 15 16
# Using cbind(), each vector will be a column
cbind_mat <- cbind(vector1, vector2, vector3, vector4)
cbind_mat
vector1 vector2 vector3 vector4
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
An alternative is to use the appropriately named matrix() function. The basic syntax is matrix(data, nrow, ncol, byrow):
data is the input vector which becomes the data elements of the matrix.
nrow is the number of rows to be created.
ncol is the number of columns to be created.
byrow is a logical value. If TRUE, the input vector elements are arranged by row. By default (FALSE), elements are arranged by column.
Let’s see some examples:
# Elements are arranged sequentially by row.
M <- matrix(c(1:12), nrow = 4, byrow = TRUE)
M
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
[4,] 10 11 12
# Elements are arranged sequentially by column (byrow = F by default).
N <- matrix(c(1:12), nrow = 4)
N
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
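A quick way to confirm a matrix’s dimensions in R (a small aside, not part of the original examples) is with dim(), nrow(), and ncol():
dim(M)   # returns 4 3 (rows first, then columns)
nrow(N)  # 4
ncol(N)  # 3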
How do we refer to specific elements of the matrix? For example, matrix \(A\) is an \(m\times n\) matrix where \(m=n=3\). This is sometimes called a square matrix.
More generally, matrix \(B\) is an \(m\times n\) matrix where the elements look like this: \[B= \begin{bmatrix} b_{11} & b_{12} & b_{13} & \ldots & b_{1n} \\ b_{21} & b_{22} & b_{23} & \ldots & b_{2n} \\ \vdots & \vdots & \vdots & \ldots & \vdots \\ b_{m1} & b_{m2} & b_{m3} & \ldots & b_{mn} \end{bmatrix}\]
Thus \(b_{23}\) refers to the element in the second row and third column. More generally, we refer to row indices as \(i\) and to column indices as \(j\).
In R, we can access a matrix’s elements using square brackets:
# In matrix N, access the element at 1st row and 3rd column.
N[1, 3]
[1] 9
# In matrix N, access the element at 4th row and 2nd column.
N[4, 2]
[1] 8
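We can also extract an entire row or column by leaving the other index blank. A minimal sketch using matrix N from above:
N[1, ]  # first row: 1 5 9
N[, 2]  # second column: 5 6 7 8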
When trying to identify a specific element, the first subscript is the element’s row and the second subscript is the element’s column (always in that order).
In R, indexing is 1-based, meaning that the first element of a vector, matrix, or any other data structure is accessed with index 1. In other programming tools such as Python, indexing is 0-based, meaning that the first element of a list, array, or any other data structure is accessed with index 0.
# Create a 2x2 matrix in R
matrix_A <- matrix(1:4, nrow = 2, ncol = 2)

# Access the element in the first row, first column
element <- matrix_A[1, 1] # This will return 1

# Create a list in Python
vector = [10, 20, 30, 40]

# Access the first element
first_element = vector[0] # This will return 10

import numpy as np

# Create a 2x2 matrix in Python with NumPy
matrix_A = np.array([[1, 2], [3, 4]])

# Access the element in the first row, first column
element = matrix_A[0, 0] # This will return 1
Addition and subtraction are straightforward operations.
Matrices must have exactly the same dimensions for both of these operations.
We add or subtract each element with the corresponding element from the other matrix.
This is expressed as follows:
\[A \pm B=C\]
\[c_{ij}=a_{ij} \pm b_{ij} \text{ }\forall i,j\]
\[\begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{bmatrix} \pm \begin{bmatrix} b_{11} & b_{12} & b_{13}\\ b_{21} & b_{22} & b_{23}\\ b_{31} & b_{32} & b_{33} \end{bmatrix}\] \[=\] \[\begin{bmatrix} a_{11}\pm b_{11} & a_{12}\pm b_{12} & a_{13}\pm b_{13}\\ a_{21}\pm b_{21} & a_{22}\pm b_{22} & a_{23}\pm b_{23}\\ a_{31}\pm b_{31} & a_{32}\pm b_{32} & a_{33}\pm b_{33} \end{bmatrix}\]
We start by creating two 2x3 matrices:
\[A= \begin{bmatrix} 3 & -1 & 2 \\ 9 & 4 & 6 \end{bmatrix}\]
# Create two 2x3 matrices.
matrix1 <- matrix(c(3, 9, -1, 4, 2, 6), nrow = 2)
matrix1
[,1] [,2] [,3]
[1,] 3 -1 2
[2,] 9 4 6
And \[B= \begin{bmatrix} 5 & 0 & 3 \\ 2 & 9 & 4 \end{bmatrix}\]
matrix2 <- matrix(c(5, 2, 0, 9, 3, 4), nrow = 2)
matrix2
[,1] [,2] [,3]
[1,] 5 0 3
[2,] 2 9 4
We can simply use the + and - operators for addition and subtraction:
matrix1 + matrix2
[,1] [,2] [,3]
[1,] 8 -1 5
[2,] 11 13 10
matrix1 - matrix2
[,1] [,2] [,3]
[1,] -2 -1 -1
[2,] 7 -5 2
(Use code for one of these and do the other one by hand!)
1) Calculate \(A + B\)
\[A= \begin{bmatrix} 1 & 0 \\ -2 & -1 \end{bmatrix}\]
\[B = \begin{bmatrix}
5 & 1 \\
2 & -1
\end{bmatrix}\]
2) Calculate \(A - B\)
\[A= \begin{bmatrix} 6 & -2 & 8 & 12 \\ 4 & 42 & 8 & -6 \end{bmatrix}\]
\[B = \begin{bmatrix} 18 & 42 & 3 & 7 \\ 0 & -42 & 15 & 4 \end{bmatrix}\]
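If you want to check your hand calculations afterward, here is a minimal R sketch for exercise 1 (the object names A and B are our own):
A <- matrix(c(1, -2, 0, -1), nrow = 2)  # filled column by column: A = [1 0; -2 -1]
B <- matrix(c(5, 2, 1, -1), nrow = 2)   # B = [5 1; 2 -1]
A + B                                   # should give [6 1; 0 -2]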
Scalar multiplication is very intuitive. As we know, a scalar is a single number. We multiply each value in the matrix by the scalar to perform this operation.
Formally, this is expressed as follows: \[A = \begin{bmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{bmatrix}\] \[cA = \begin{bmatrix} ca_{11} & ca_{12} & ca_{13}\\ ca_{21} & ca_{22} & ca_{23}\\ ca_{31} & ca_{32} & ca_{33} \end{bmatrix}\]
In R, all we need to do is take an established matrix and multiply it by some scalar:
# matrix1 from our previous example
matrix1
[,1] [,2] [,3]
[1,] 3 -1 2
[2,] 9 4 6
matrix1 * 3
[,1] [,2] [,3]
[1,] 9 -3 6
[2,] 27 12 18
Calculate \(2\times A\) and \(-3 \times B\). Again, do one by hand and the other one using R.
\[A= \begin{bmatrix} 1 & 4 & 8 \\ 0 & -1 & 3 \end{bmatrix}\] \[ B = \begin{bmatrix} -15 & 1 & 5 \\ 2 & -42 & 0 \\ 7 & 1 & 6 \end{bmatrix}\]
Multiplying matrices is slightly trickier than multiplying scalars.
Two matrices must be conformable for them to be multiplied together. This means that the number of columns in the first matrix equals the number of rows in the second.
When multiplying \(A \times B\), if \(A\) is \(m \times n\), \(B\) must have \(n\) rows.
The conformability requirement never changes. Before multiplying anything, check to make sure the matrices are indeed conformable.
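A simple way to check conformability in R before multiplying (a sketch with placeholder matrices of our own):
mat_a <- matrix(1:6, nrow = 2)  # a 2x3 matrix
mat_b <- matrix(1:6, nrow = 3)  # a 3x2 matrix
ncol(mat_a) == nrow(mat_b)      # TRUE, so mat_a %*% mat_b is defined (a 2x2 result)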
Which of the following can we multiply? What will be the dimensions of the resulting matrix? \[\begin{aligned} B_{4 \times 1}= \begin{bmatrix} 2 \\ 3\\ 4\\ 1 \end{bmatrix} M_{3 \times 3} = \begin{bmatrix} 1 & 0 & 2\\ 1 & 2 & 4\\ 2 & 3 & 2 \end{bmatrix} L_{2 \times 3} = \begin{bmatrix} 6 & 5 & -1\\ 1 & 4 & 3 \end{bmatrix} \end{aligned}\]
The only valid multiplication based on the provided matrices is \(L \times M\), which results in a \(2 \times 3\) matrix.
Why can’t we multiply in the opposite order?
The non-commutative property of matrix multiplication is a fundamental aspect of matrix algebra. Matrix multiplication is sensitive to the order in which the matrices are multiplied because of the requirements of dimensional compatibility, the resulting dimensions, and the computation process itself.
When multiplying matrices, order matters. Even if multiplication is possible in both directions, in general \(AB \neq BA\).
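We can see this in R with two small square matrices (an illustrative sketch; these matrices are our own):
P <- matrix(c(1, 3, 2, 4), nrow = 2)  # P = [1 2; 3 4]
S <- matrix(c(0, 1, 1, 0), nrow = 2)  # S = [0 1; 1 0]
P %*% S                               # [2 1; 4 3]
S %*% P                               # [3 4; 1 2], a different result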
The row-by-column procedure is sometimes referred to as a “dot product”: we multiply matching members, then sum them up.
Suppose a company manufactures two kinds of furniture: chairs and sofas.
A chair costs $100 for wood, $270 for cloth, and $130 for feathers.
Each sofa costs $150 for wood, $420 for cloth, and $195 for feathers.
|          | Chair | Sofa |
|----------|-------|------|
| Wood     | 100   | 150  |
| Cloth    | 270   | 420  |
| Feathers | 130   | 195  |
The same information about unit cost (\(C\)) can be presented as a matrix.
\[C = \begin{bmatrix} 100 & 150\\ 270 & 420\\ 130 & 195 \end{bmatrix}\]
Note that each of the three rows of this 3 x 2 matrix represents a material (wood, cloth, or feathers), and each of the two columns represents a product (chair or sofa). The elements are the unit costs (in USD).
Now, suppose that the company will produce 45 chairs and 30 sofas this month. This production quantity can be represented in the following table, and also as a 2 x 1 matrix (\(Q\)):
| Product | Quantity |
|---------|----------|
| Chair   | 45       |
| Sofa    | 30       |
\[Q = \begin{bmatrix} 45 \\ 30 \end{bmatrix}\]
What will be the company’s total cost? The “total expenditure” is equal to the “unit cost” times the “production quantity” (the number of units).
The total expenditure (\(E\)) for each material this month is calculated by multiplying these two matrices.
\[\begin{aligned} E = CQ = \begin{bmatrix} 100 & 150\\ 270 & 420\\ 130 & 195 \end{bmatrix} \begin{bmatrix} 45 \\ 30 \end{bmatrix} = \begin{bmatrix} (100)(45) + (150)(30) \\ (270)(45) + (420)(30) \\ (130)(45) + (195)(30) \end{bmatrix} = \begin{bmatrix} 9,000 \\ 24,750 \\ 11,700 \end{bmatrix} \end{aligned}\]
Multiplying the 3x2 Cost matrix (\(C\)) times the 2x1 Quantity matrix (\(Q\)) yields the 3x1 Expenditure matrix (\(E\)).
As a result of this matrix multiplication, we determine that this month the company will incur expenditures of $9,000 for wood, $24,750 for cloth, and $11,700 for feathers.
Before attempting matrix multiplication, we must make sure the matrices are conformable (as we do for our manual calculations). Then we can multiply our matrices together using the %*% operator.
C <- matrix(c(100, 270, 130, 150, 420, 195), nrow = 3)
C
[,1] [,2]
[1,] 100 150
[2,] 270 420
[3,] 130 195
Q <- matrix(c(45, 30), nrow = 2)
Q
[,1]
[1,] 45
[2,] 30
C %*% Q
[,1]
[1,] 9000
[2,] 24750
[3,] 11700
If you have a missing value or NA in one of the matrices you are trying to multiply (something we will discuss in further detail in the next module), you will have NAs in your resulting matrix.
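Here is a tiny sketch (our own) of how an NA propagates:
M_na <- matrix(c(1, NA, 3, 4), nrow = 2)
M_na %*% diag(2)  # every product that touches the NA becomes NA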
Addition and subtraction:
Associative: \((A \pm B) \pm C = A \pm (B \pm C)\)
Commutative: \(A \pm B = B \pm A\)
Multiplication:
Not commutative (in general): \(AB \neq BA\)
Associative: \(A(BC) = (AB)C\)
Left distributive: \(A(B+C) = AB + AC\)
Right distributive: \((A+B)C = AC + BC\)
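We can spot-check the multiplication properties numerically in R (a quick sketch with arbitrary small matrices of our own):
A <- matrix(1:4, nrow = 2)
B <- matrix(5:8, nrow = 2)
C <- matrix(0:3, nrow = 2)
all(A %*% (B %*% C) == (A %*% B) %*% C)  # associative: TRUE
all(A %*% (B + C) == A %*% B + A %*% C)  # distributive: TRUE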
Square matrix
In a square matrix, the number of rows equals the number of columns (\(m=n\)):
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]
The diagonal of a matrix is the set of elements on the line from the upper-left-hand to the lower-right-hand corner of the matrix, as in \(d(A) = [1, 5, 9]\). Diagonals are particularly useful in square matrices.
The trace of a matrix, denoted as \(tr(A)\), is the sum of the diagonal elements of the matrix. \(tr(A) = 1+5+9 = 15\)
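In R, the diag() function extracts the diagonal, and the trace is simply the sum of that vector:
A <- matrix(1:9, nrow = 3, byrow = TRUE)  # the matrix A shown above
diag(A)       # 1 5 9
sum(diag(A))  # trace: 15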
Diagonal matrix: a square matrix in which all off-diagonal elements are equal to zero.
Scalar matrix: a diagonal matrix whose diagonal elements are all the same number:
\[ S = \begin{bmatrix} 7 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 7 \end{bmatrix} \]
Identity matrix:
\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
The identity matrix is a scalar matrix with all of the diagonal elements equal to one.
Remember that, as with all diagonal matrices, the off-diagonal elements are equal to zero.
The capital letter \(I\) is reserved for the identity matrix. For convenience, a 3x3 identity matrix can be denoted as \(I_3\).
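In R, diag(n) creates an n x n identity matrix, and multiplying it by a scalar gives a scalar matrix:
diag(3)      # the 3x3 identity matrix I_3
7 * diag(3)  # the scalar matrix S shown above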
The transpose is the original matrix with the rows and the columns interchanged.
The notation is either \(J'\) (“J prime”) or \(J^T\) (“J transpose”).
\[J = \begin{bmatrix} 4 & 5\\ 3 & 0\\ 7 & -2 \end{bmatrix}\]
\[J' = J^T = \begin{bmatrix} 4 & 3 & 7 \\ 5 & 0 & -2 \end{bmatrix}\]
In R, we use the t() function to get the transpose.
J <- matrix(c(4, 3, 7, 5, 0, -2), ncol = 2)
J
[,1] [,2]
[1,] 4 5
[2,] 3 0
[3,] 7 -2
t(J)
[,1] [,2] [,3]
[1,] 4 3 7
[2,] 5 0 -2
Just like a number has a reciprocal, a matrix has an inverse.
When we multiply a matrix by its inverse we get the identity matrix (which is like “1” for matrices).
\[A \times A^{-1} = I\]
\[AA^{-1} = A^{-1}A = I\]
For now, don’t worry about calculating the inverse of a matrix manually. This is the type of task we use R for.
We can use the solve() function to calculate the inverse of a matrix:
A <- matrix(c(3, 2, 5, 2, 3, 2, 5, 2, 4), ncol = 3)
A
[,1] [,2] [,3]
[1,] 3 2 5
[2,] 2 3 2
[3,] 5 2 4
solve(A)
[,1] [,2] [,3]
[1,] -0.29629630 -0.07407407 0.4074074
[2,] -0.07407407 0.48148148 -0.1481481
[3,] 0.40740741 -0.14814815 -0.1851852
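We can verify the result: multiplying A by its inverse should return the identity matrix (rounding here to clear floating-point noise):
round(A %*% solve(A), 10)  # prints the 3x3 identity matrix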
A system of equations can be represented by an augmented matrix.
System of equations: \[{\color{red}{3}}x + {\color{green}{6}}y = {\color{blue}{12}}\] \[{\color{red}{5}}x + {\color{green}{10}}y = {\color{blue}{25}}\]
In an augmented matrix, each row represents one equation in the system and each column represents a variable or the constant terms. \[\begin{bmatrix} {\color{red}{3}} & {\color{green}{6}} & {\color{blue}{12}}\\ {\color{red}{5}} & {\color{green}{10}} & {\color{blue}{25}} \end{bmatrix}\]
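As a small illustration (our own construction, not part of the original example), we can build this augmented matrix in R:
# Columns: x coefficients, y coefficients, constants
aug <- matrix(c(3, 5, 6, 10, 12, 25), nrow = 2)
aug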
We can use the logic above to calculate estimates for our ordinary least squares (OLS) models.
OLS is a linear regression technique used to find the best-fitting line for a set of data points (observations) by minimizing the residuals (the differences between the observed and predicted values).
We minimize the sum of the squared errors.
Suppose, for example, we have a sample consisting of \(n\) observations.
The dependent variable is denoted as an \(n \times1\) column vector.
\[Y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}\]
Suppose there are \(k\) independent variables and a constant term, meaning \(k+1\) columns and \(n\) rows.
We can represent these variables as an \(n \times (k+1)\) matrix, expressed as follows:
\[X= \begin{bmatrix} 1 & x_{11} & \dots & x_{1k} \\ 1 & x_{21} & \dots & x_{2k} \\ \vdots & \vdots & \dots & \vdots \\ 1 & x_{n1} & \dots & x_{nk} \end{bmatrix}\]
Let’s say we have 173 observations (\(n = 173\)) and 2 IVs (\(k = 2\)).
This can be expressed as the following linear equation: \[y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon\]
In matrix form, we have: \[\begin{aligned} \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{173} \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{12} \\ 1 & x_{21} & x_{22} \\ \vdots & \vdots & \vdots \\ 1 & x_{173,1} & x_{173,2} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_{173} \end{bmatrix}\end{aligned} \]
All 173 equations can be represented by: \[y=X\beta+\epsilon\]
The OLS estimator is: \[\hat{\beta} = (X'X)^{-1}X'Y\]
Read aloud, we say “X prime X inverse, X prime Y”.
The little hat on our beta (\(\hat{\beta}\)) signifies that these are estimates.
Remember, the OLS method is to choose \(\hat{\beta}\) such that the sum of squared residuals (“SSR”) is minimized.
We will use the mtcars data set (our favorite) for this example, which contains data about many different car models.
cars_df <- mtcars
Now, we want to estimate the association between hp (horsepower) and wt (weight), our independent variables, and mpg (miles per gallon), our dependent variable.
First, we transform our dependent variable into a matrix, using the as.matrix() function and specifying the column of the mtcars data set to create a column vector of our observed values for the DV.
Y <- as.matrix(cars_df$mpg)
Y
[,1]
[1,] 21.0
[2,] 21.0
[3,] 22.8
[4,] 21.4
[5,] 18.7
[6,] 18.1
[7,] 14.3
[8,] 24.4
[9,] 22.8
[10,] 19.2
[11,] 17.8
[12,] 16.4
[13,] 17.3
[14,] 15.2
[15,] 10.4
[16,] 10.4
[17,] 14.7
[18,] 32.4
[19,] 30.4
[20,] 33.9
[21,] 21.5
[22,] 15.5
[23,] 15.2
[24,] 13.3
[25,] 19.2
[26,] 27.3
[27,] 26.0
[28,] 30.4
[29,] 15.8
[30,] 19.7
[31,] 15.0
[32,] 21.4
# create two separate matrices for IVs
X1 <- as.matrix(cars_df$hp)
X2 <- as.matrix(cars_df$wt)

# create constant column
constant <- rep(1, nrow(cars_df))

# bind them all together into one matrix
X <- cbind(constant, X1, X2)
X
constant
[1,] 1 110 2.620
[2,] 1 110 2.875
[3,] 1 93 2.320
[4,] 1 110 3.215
[5,] 1 175 3.440
[6,] 1 105 3.460
[7,] 1 245 3.570
[8,] 1 62 3.190
[9,] 1 95 3.150
[10,] 1 123 3.440
[11,] 1 123 3.440
[12,] 1 180 4.070
[13,] 1 180 3.730
[14,] 1 180 3.780
[15,] 1 205 5.250
[16,] 1 215 5.424
[17,] 1 230 5.345
[18,] 1 66 2.200
[19,] 1 52 1.615
[20,] 1 65 1.835
[21,] 1 97 2.465
[22,] 1 150 3.520
[23,] 1 150 3.435
[24,] 1 245 3.840
[25,] 1 175 3.845
[26,] 1 66 1.935
[27,] 1 91 2.140
[28,] 1 113 1.513
[29,] 1 264 3.170
[30,] 1 175 2.770
[31,] 1 335 3.570
[32,] 1 109 2.780
Don’t forget to use %*% for matrix multiplication!
# X prime X
XpX <- t(X) %*% X

# X prime X inverse
XpXinv <- solve(XpX)

# X prime Y
XpY <- t(X) %*% Y

# beta coefficient estimates
bhat <- XpXinv %*% XpY
bhat
[,1]
constant 37.22727012
-0.03177295
-3.87783074
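As a sanity check, these estimates match the coefficients that R’s built-in lm() function returns for the same model:
coef(lm(mpg ~ hp + wt, data = mtcars))
# (Intercept)          hp          wt
# 37.22727012 -0.03177295 -3.87783074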