Introduction

This is a guide to using R for the basic data analysis used in econometrics, namely OLS regression and its extensions. It shows how to translate basic econometrics knowledge into R commands; that is, it assumes you already understand the concepts. This is not an econometrics textbook: it does not explain the theory behind the concepts, only how to model and apply them in R. For more on the theoretical background, see the lecture slides for my Econometrics class.

How to Read this Guide

As an econometrics student, the core of your data analysis life will be working with data.frames (think “spreadsheets”, where each row is an observation and each column is a variable). You will:

  • import data into a data.frame
  • transform (“wrangle”) data into more useful variables or data.frames
  • plot data from data.frames (in histograms, scatterplots, etc.)
  • run regressions using data from data.frames
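
As a rough preview of those four steps, here is a minimal sketch in base R. It is only an illustration, not the workflow used later in this guide: it relies on R's built-in mtcars data set, writes it to a temporary CSV file so that there is something to import, and the names csv_path, cars, wt_lbs, and model are made up for this example.

```r
# Assumption: no real data file is at hand, so we write R's built-in
# mtcars data to a temporary CSV just to have something to import.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# 1. Import data into a data.frame
cars <- read.csv(csv_path)

# 2. Transform ("wrangle"): wt is recorded in thousands of pounds,
#    so create a new variable measured in pounds
cars$wt_lbs <- cars$wt * 1000

# 3. Plot: a histogram of one variable, then a scatterplot of two
hist(cars$mpg, main = "Miles per gallon", xlab = "mpg")
plot(cars$wt_lbs, cars$mpg, xlab = "Weight (lbs)", ylab = "mpg")

# 4. Run a regression using the data.frame
model <- lm(mpg ~ wt_lbs, data = cars)
summary(model)
```

Each step takes a data.frame as its input, which is why data.frames are worth getting comfortable with early.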

This guide attempts to introduce you to R from the ground up, which means it starts with simpler types of objects than data.frames (namely, vectors). I would not necessarily recommend reading it from beginning to end. The first two sections describe a lot about R as a language and discuss different types of R objects, data types, and commands. If you are starting from the very beginning, reading them may seem overwhelming. They will be more useful as a reference to return to later, once you have some practice under your belt.
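
To give a sense of how vectors relate to data.frames, here is a small sketch in base R. The object names (x, df, id) are hypothetical and chosen only for illustration.

```r
# A vector: an ordered collection of values of the same type
x <- c(2, 4, 6)
mean(x)  # 4

# A data.frame: columns of equal length, each of which is a vector
df <- data.frame(id = 1:3, x = x)

# Pulling a column back out with $ returns an ordinary vector
df$x
```

This is why the guide starts with vectors: every column of a data.frame is one.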

Other Comments

Open Source: The raw (.Rmd) code used to produce this guide, along with the guide itself, is available on GitHub and is updated regularly. GitHub does not automatically render HTML, so download the HTML file and open it, or view it where I host it on my website.

Note to Students: This is a work in progress; check the date at the top to see when it was last updated. It compiles all of my instructions, advice, and examples regarding R from my econometrics class lectures. It also contains some advanced material that I did not or will not cover in class, but that will be useful for future data analysis and for understanding or diagnosing problems.

Note to Everyone Else: This guide is oriented primarily toward my Econometrics class at Hood College, but should be of wider use to anyone interested in learning R for data analysis. Lecture slides, handouts, and guides (both PDFs and source code in R Markdown) are openly available on GitHub.

See also my companion guide to using R Markdown, which shows how to manage your entire workflow (text, data analysis, tables, graphs, and citations!) in a single plain text file and make your work reproducible and shareable. It is hosted on my website, with source available on GitHub.