Getting SPSS and Stata Data Files Into R

I use SPSS and Stata quite a bit and want to know how to get .sav and .dta files into R.

Getting generic data files into R (e.g. .csv & .txt) is fairly simple and handled with R’s base library.  Getting these pre-formatted data files into R takes a bit more work (primarily the use of an additional package).
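For comparison, here is a minimal sketch of the base-R route for generic files. The file contents and column names (id, score) are made up for illustration and written to temporary files so the example runs anywhere.

```r
# Create a small comma-separated file in a temporary location (illustrative data only).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10.5", "2,8.0"), csv_path)

# read.csv() assumes a header row and comma separators by default.
csv_data <- read.csv(csv_path)

# The same data as a tab-delimited text file.
txt_path <- tempfile(fileext = ".txt")
writeLines(c("id\tscore", "1\t10.5", "2\t8.0"), txt_path)

# read.table() needs to be told about the header row and the separator.
txt_data <- read.table(txt_path, header = TRUE, sep = "\t")

print(csv_data)
print(txt_data)
```

Both calls return ordinary data frames, which is also what we want to end up with for the SPSS and Stata files.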

The foreign package is apparently the key to doing this.  Packages are compact and efficient code repositories that the R community has generated and maintained over the years.  I think we’re going to find these quite useful throughout the semester.

So back to the task – getting an SPSS file into R.  There are some working SPSS files in Blackboard under “Course Documents –> R –> R Data”.  Feel free to use your own or download the course data files.  Either way, we 1) have to know where the data is stored once it is downloaded (or where your own data is), or 2) have R shortcuts mapped to the folder where the data is stored (see previous post).

If you haven’t created shortcuts mapped to your working directory you can always set your working directory once you’re in R:

setwd("C:/mydata")

This is a generic folder and path name that you may use but you may also have something more elaborate like:

setwd("C:/Documents and Settings/Rwilliams/My Documents/R Data")

If you don’t set your working directory, you can still get data into R, but you’ll have to enter the entire path name each time (e.g. “C:/Documents and Settings/Rwilliams/My Documents/R Data/data.sav”).

You can also check the contents of your working directory with:

dir()

If you see your data in there, then you’re ready to start calling it.
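The working-directory steps above can be sketched end to end as follows. The folder name r_data_demo and the file example.sav are hypothetical placeholders, created in a temporary location only so the sketch runs on any machine; with real data you would point setwd() at your own folder.

```r
# Hypothetical folder and (empty) placeholder file, created only for this demo.
data_dir <- file.path(tempdir(), "r_data_demo")
dir.create(data_dir, showWarnings = FALSE)
file.create(file.path(data_dir, "example.sav"))

# setwd() invisibly returns the previous working directory, so we can restore it later.
old_wd <- setwd(data_dir)
print(getwd())                                # confirm where R is now looking

# list.files() is what dir() calls; the pattern keeps only names ending in .sav.
sav_files <- list.files(pattern = "\\.sav$")
print(sav_files)

# file.exists() gives a direct yes/no answer for one particular file.
found <- file.exists("example.sav")
print(found)

setwd(old_wd)                                 # put the working directory back
```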

The following code should get the SPSS data file into R:

library(foreign)

This loads the foreign package from your R library so we can read in foreign datasets.

data <- read.spss("NELS88_student.sav", use.value.labels = TRUE, max.value.labels = Inf, to.data.frame = TRUE)

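Since SPSS files often carry value labels for categorical variables, the use.value.labels argument tells R to convert labelled variables to factors (this may not always be appropriate).  Setting max.value.labels to “Inf” removes the cap on how many unique values a labelled variable may have and still be converted.  Finally, the to.data.frame argument tells R to return a data frame rather than a list.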
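Here is a slightly more defensive sketch of the same import. The wrapper read_spss_labelled() is a hypothetical helper of my own, not part of the foreign package; it only checks that the file exists before passing the same three arguments to read.spss(). The last step looks at the "variable.labels" attribute that read.spss() attaches to its result.

```r
library(foreign)

# Hypothetical helper (not part of foreign): fail early with a readable message
# when the file is missing, otherwise import with the same arguments as above.
read_spss_labelled <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (check getwd() and dir())")
  }
  read.spss(path,
            use.value.labels = TRUE,   # labelled variables become factors
            max.value.labels = Inf,    # no cap on unique values for conversion
            to.data.frame = TRUE)      # return a data frame, not a list
}

# The course file name from above; the import runs only if the file is present.
sav_path <- "NELS88_student.sav"
if (file.exists(sav_path)) {
  nels <- read_spss_labelled(sav_path)
  print(is.data.frame(nels))
  print(head(attr(nels, "variable.labels")))  # SPSS variable labels, if any
}
```

Writing TRUE out in full rather than T is a small safety habit, since T is an ordinary variable that can be overwritten while TRUE cannot.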

Your data should now be stored in the R object “data”.  Depending on the size of the data you just imported, it may not be useful to take a look at all of the data at one time.   A couple of things I typically do to make sure my data made it in correctly are below.

summary(data)

This gives a summary of each variable (column) in the data frame.  Sometimes this can be cumbersome but if you know what you’re looking for it can help.

head(data)

This provides the first 6 rows of the data frame.

tail(data)

This provides the last six rows of the data.

nrow(data)

This lets us know how many rows are in the data.

ncol(data)

This lets us know how many columns are in the data.
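A few more checks in the same spirit, sketched on R’s built-in mtcars data frame as a stand-in so the code runs without any course files. In practice you would use your own imported object in place of stand_in.

```r
# Built-in dataset used as a stand-in for an imported data frame.
stand_in <- mtcars

# dim() returns the row and column counts together.
dims <- dim(stand_in)
print(dims)

# str() shows each column's type, which is a quick way to see
# whether categorical variables arrived as factors or as numbers.
str(stand_in)

# Count missing values per column; unexpected NAs often signal an import problem.
missing_per_column <- colSums(is.na(stand_in))
print(missing_per_column)
```

Comparing the row and column counts against what SPSS or Stata reports for the original file is a cheap way to confirm that nothing was dropped.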

Datasets from Stata can be imported just as easily with read.dta(), also from the foreign package (it reads files saved by Stata versions 5 through 12):

library(foreign)
data<-read.dta("data.dta")
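If you don’t have a Stata file handy, a round trip is one way to try this out. The sketch below writes a small made-up data frame (the columns id, score and group are illustrative) to a temporary .dta file with write.dta(), also from foreign, and reads it back.

```r
library(foreign)

# Small made-up data frame; column names avoid dots so they survive the trip unchanged.
original <- data.frame(
  id    = 1:3,
  score = c(10.5, 8.0, 9.25),
  group = factor(c("a", "b", "a"))
)

# Write it out in Stata format to a temporary file.
dta_path <- tempfile(fileext = ".dta")
write.dta(original, dta_path)

# Read it back the same way you would read a file from a colleague.
round_trip <- read.dta(dta_path)

print(round_trip)
str(round_trip)
```

The same head(), summary(), nrow() and ncol() checks apply to the result.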

2 Responses to Getting SPSS and Stata Data Files Into R

  1. Julie says:

    I completed the homework without a problem, or so I thought. After reading this blog, I see you use more specific code for pulling in SPSS files. When I use your code, addressing value labels and max values etc., the code no longer works. Is there another package that needs to be installed or called upon other than foreign?

    my code: data<-read.spss("NELS88_student.sav")
    your code: data<-read.spss("NELS88_student.sav",use.value.labels=T, max.value.labels=Inf, to.data.frame=T)

  2. Ryan Williams says:

    Hi Julie –

    Two things to ask yourself when importing data from SPSS (or really any other file format) are: 1) did all the cases and variables survive the transfer; and 2) are the values (particularly for categorical factor variables) of your variables unchanged? My code should preserve the labels from SPSS.

    You do have to load the foreign package before using read.spss(). I tried the code and it does seem to work for me. I tried your code as well and I am not getting the data into a manipulable format. Were you able to work with variables once you got it in?
