Write APA-Style Manuscripts Directly in RStudio

As penance for my long absence, I’m going to tackle a big topic today. I’m going to show you how to write your APA-Style manuscript directly in RStudio using LaTeX and Sweave. LaTeX is a typesetting system: you mark up your text with commands and LaTeX formats the manuscript automatically, unlike MS Word, where you have to format everything by hand. Sweave lets you embed R code in a LaTeX document; it runs the code, inserts the results into the text, and hands the finished file to LaTeX. My goal isn’t to make you an expert on everything that LaTeX can do. Quite frankly, I don’t know it that well. Instead, my goal is to give you an accessible framework so that you can go write your next manuscript in RStudio.

Quick note: It’s been a while since I set all of this up for the first time on my computer, so I can’t remember if you need to download some LaTeX packages, like the apa6 and apacite packages. If something doesn’t work for you, then shoot me an e-mail and I’ll try to help you through the problem and update this tutorial.

Motivation

Why write your manuscript in RStudio? I’ll give you three reasons.

1. You can do everything in one program. Instead of switching back and forth between windows for SPSS and MS Word, or between R and MS Word, you can do everything in a couple of RStudio tabs.

2. Never forget to update your statistics again. Sometimes you collect more data, or you come up with a better strategy for data analysis, and then you need to update your manuscript. This usually requires running the new analyses in your stats program, and then writing the results in your MS Word document. Well, sometimes you run the analysis and then forget to update your Word document. Or you enter a typo and so your manuscript numbers differ from your stats program numbers. By writing everything in RStudio, you can write the analysis directly in your manuscript. That means if you tweak the analysis, your manuscript is automatically updated to reflect the changes.

3. Never fidget with APA-style formatting again. There are packages in LaTeX that will give you an APA-style document, including references, automatically. This lets you just focus on the writing.

So let’s get started.

Install LaTeX

Sweave is a part of R and RStudio, but you need to install LaTeX. You can find installation instructions at the LaTeX Project website.

Set up RStudio

Now that you’ve installed LaTeX, it’s time to make sure that RStudio is set up properly. Find the RStudio Preferences. Then go to the Sweave section. Where it says “Weave Rnw files using:”, make sure to select Sweave. Where it says “Typeset LaTeX into PDF using:”, make sure to select pdfLaTeX. Save the changes, and now you’re ready to write your manuscript in RStudio!

Create a New R Sweave File

In RStudio, go to File -> New File -> R Sweave to create a new R Sweave file. The file extension will be Rnw, so if I refer to your R Sweave file or your Rnw file, it means exactly the same thing.


You’ll see a new file that should have the following code:

\documentclass{article}

\begin{document}
\SweaveOpts{concordance=TRUE}

\end{document}

This is LaTeX code. In LaTeX syntax, all functions (LaTeX calls them commands) begin with a backslash, then they have the function name, and then the arguments go in curly braces like this: \functionName{arguments}. Optional arguments go in square brackets before the curly braces. The \documentclass{} function specifies what kind of document you’re writing, which will determine how LaTeX formats the document. The section between the \documentclass{} function and the \begin{document} line is called the “preamble”. You can add functions to the preamble that will change the format of the document, which we’ll do next.

Setting Up Your APA-Style Manuscript

Now we’re going to change the R Sweave default code to give you the basic code for an APA-Style manuscript. We’ll do this by accessing the apa6 and apacite packages in LaTeX.

Replace the code in your R Sweave file with the following code:

\documentclass[man,a4paper,noextraspace,apacite]{apa6}
\usepackage{apacite}
\title{}
\shorttitle{}
\author{}
\affiliation{}

\abstract{}
\keywords{}

\authornote{}

\begin{document}
\maketitle

\bibliography{}
\bibliographystyle{apacite}

\end{document}

Okay, there’s a lot going on here, so let’s break it down.

\documentclass[man,a4paper,noextraspace,apacite]{apa6} This line specifies that we want to use the apa6 document class to create an APA-style document. That’s why apa6 appears in the curly braces. The options in the square brackets control the formatting of the apa6 document.

  • man formats the document as a manuscript, whereas jou formats it like a published article (you can try switching back and forth after we compile our pdf later)
  • a4paper sets the paper size to A4
  • noextraspace fixes a formatting problem that sometimes comes up with the apa6 documents, though I can’t actually recall the specific problem
  • apacite specifies that we’ll use the apacite package to format our references section

\usepackage{apacite} This line loads the apacite package, which will give us an APA-style References section.

    \title{}, \shorttitle{}, \author{}, \affiliation{}, \abstract{}, \keywords{} These are exactly what they sound like. You can write the title of your paper, the running head (shorttitle), your name, your affiliation, your entire abstract, and your keywords (separated with commas) in the curly braces of each function. I’ll show you a complete example in a moment.

    \authornote{} This is also what it sounds like, but I set it apart because you need to leave a blank line between different parts of the author note to start a new paragraph. You’ll see this in the example below.

    \maketitle This function has no arguments, it just takes all of the information in your preamble and turns it into a title page.

    \bibliography{} This line specifies the name of the file with your references. So in the curly braces I would enter the file’s name, for example mybibliography for a file saved as mybibliography.bib. The .bib extension is conventionally left out here; many BibTeX installations tolerate it, as in my example below, but some fail to find the file when it is included. The file should be stored in the same folder as your Rnw file. It has to be saved with the .bib extension because LaTeX uses BibTeX files to format the references. I’ll show you how to format these files using the apacite package later.

    \bibliographystyle{apacite} This line tells LaTeX that you’re using the apacite package to format the references section.

    Now that we’ve gone through all of the specific functions, let me show you what a filled-in document looks like.

    \documentclass[man,a4paper,noextraspace,apacite]{apa6}
    \usepackage{apacite}
    \title{Criminals are Punished Less When They Harm Statistical Victims}
    \shorttitle{Criminals and Statistical Victims}
    \author{Joshua D. Wondra and Phoebe C. Ellsworth}
    \affiliation{University of Michigan}

    \abstract{There is a wealth of literature showing that people feel less compassion for statistical victims than for identified victims. But do people punish criminals less when they harm statistical victims than when they harm identified victims? We don't know if this study has been run already, but Josh thinks it's an interesting idea.}
    \keywords{statistical victims, law and emotion, sentencing}

    \authornote{Joshua D. Wondra, Department of Psychology, University of Michigan.

    Phoebe C. Ellsworth, Department of Psychology, University of Michigan.

    We are grateful to PRlab for their comments on an earlier version of this manuscript.

    Correspondence concerning this article should be addressed to Josh Wondra, Department of Psychology, University of Michigan, 530 Church St., Ann Arbor, MI 48109-1043.

    Contact: jdwondra@umich.edu}

    \begin{document}
    \maketitle

    \bibliography{StatisticalVictimsReferences.bib}
    \bibliographystyle{apacite}

    \end{document}

    Go ahead and fill in all of the arguments with information from a manuscript that you might write. Then look right above the syntax window in RStudio and click the Compile PDF button. If all goes well, RStudio will open a PDF that shows the beginning of your APA-style document, starting with the title page.

    By the way, your R Sweave/Rnw file needs to be given a name without spaces, because for some reason LaTeX hates spaces.
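If you ever want to compile without the Compile PDF button, the same two steps can be run from the R console. This is only a minimal sketch: compile_rnw is a hypothetical helper name of my own, and it assumes a working LaTeX installation for the PDF step. It uses Sweave() from the utils package to turn the Rnw file into a .tex file, and texi2pdf() from the tools package to typeset that file.

```r
# Hypothetical helper (not part of RStudio): weave an Rnw file, then typeset it.
compile_rnw <- function(rnw_file, make_pdf = TRUE) {
  rnw_file <- normalizePath(rnw_file)

  # Sweave() and texi2pdf() write into the working directory,
  # so work next to the Rnw file and restore the old directory afterwards.
  old_dir <- setwd(dirname(rnw_file))
  on.exit(setwd(old_dir))

  rnw_name <- basename(rnw_file)
  tex_name <- sub("\\.[Rr]nw$", ".tex", rnw_name)

  # Step 1: run the R chunks and \Sexpr{} calls, producing the .tex file
  utils::Sweave(rnw_name)

  # Step 2 (needs LaTeX installed): typeset the .tex file into a PDF
  if (make_pdf) {
    tools::texi2pdf(tex_name)
  }

  # Return the full path of the .tex file that Sweave created
  file.path(dirname(rnw_file), tex_name)
}

# Example call (assumes a file called MyManuscript.Rnw in the working directory):
# compile_rnw("MyManuscript.Rnw")
```

Setting make_pdf = FALSE stops after the Sweave step, which can help you tell whether an error comes from your R code or from LaTeX.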

    Writing the Manuscript, Except for the Data

    Before we go into how you add your analyses to the document, here’s how you set up the general framework for an empirical paper.

    \begin{document}
    \maketitle

    This is the first paragraph of my manuscript.

    Separate paragraphs with a blank line.

    You can \textit{italicize} text with the textit function, if needed. For some special symbols, such as \%, you need to use a backslash before the symbol. You can even use Greek letters like $\alpha$, $\beta$, $\Sigma$, and $\sigma$, as long as you wrap them in dollar signs for math mode.

    \section{Method}

    The section function creates level 1 headings.

    \subsection{Procedure}
    The subsection function creates level 2 headings. Any idea how to create level 3 headings?

    \subsubsection{First Phase of the Study}
    Subjects did tasks A, B, and C.

    \subsubsection{Second Phase of the Study}
    Subjects did tasks X, Y, and Z.

    \section{Results}

    And here are the results!

    \section{Discussion}

    Now, let's discuss the results...

    \bibliography{StatisticalVictimsReferences.bib}
    \bibliographystyle{apacite}

    \end{document}

    Try filling it out on your own and then compile the PDF again. You should now see your headings and paragraphs after the title page.

    Adding the Analyses with Sweave

    Now we’re ready to add data. You can insert chunks of R syntax using the following code:


    <<ChunkTitle, echo=FALSE>>=
    # R code goes here
    @

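    The @ symbol ends the chunk. The echo=FALSE argument keeps the R code itself from being printed in your manuscript. Anything the code prints will still appear, so either assign your results to variables, as I do below, or add results=hide to the chunk options.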

    I suggest creating an initial chunk of R code at the beginning of your manuscript or at the beginning of your Method section to read in the data, set up any packages that you want to use, and do anything else that will apply throughout your manuscript. Then you can create other chunks that will run your analyses and insert the results into your manuscript. Here’s an example:

    \section{Method}
    <<InitializeData, echo=FALSE>>=
    # I'm going to generate random data, but if you had a real data file you could read it in here

    # Generate data
    set.seed(1234) # I'm setting the seed to 1234 so that you can get the same results if you decide to follow along
    punishment <- rnorm(40, mean=4, sd=1)
    group <- factor(rep(c('identifiable victim','statistical victim'), each=20))
    gender <- sample(c('female','male'), replace=TRUE, size=40)
    myData <- data.frame(punishment, group, gender)

    # Load packages
    library(ggplot2)
    @

    \subsection{Overview}
    Subjects read about a legal case in which the criminal stole money from one identifiable victim, or from many statistical victims. Then they indicated how many years they thought the criminal should spend in prison.

    \subsection{Subjects}
    <<SubjectDemographics, echo=FALSE>>=
    # In this chunk, I'll generate subject information
    totalN <- length(myData$gender)
    femaleN <- sum(myData$gender=='female')
    @
    Subjects were \Sexpr{totalN} students (\Sexpr{femaleN} female) who participated for course credit.

    See what I did? In the Sweave chunks I wrote my R code and created variables that I could insert into my manuscript using the \Sexpr{} function. The name of the function is short for “S expression”, because the programming language R is based on the programming language S. You can do the same thing for the results section.

    \section{Results}
    <<Punishment, echo=FALSE>>=
    t.punishment <- t.test(punishment ~ group, data=myData)

    ## NOTE: Everything is rounded to 2-3 digits

    # Save the mean and standard deviation for each group as variables
    id.mean <- round(mean(myData$punishment[which(myData$group=='identifiable victim')]), digits=2)
    id.sd <- round(sd(myData$punishment[which(myData$group=='identifiable victim')]), digits=2)
    st.mean <- round(mean(myData$punishment[which(myData$group=='statistical victim')]), digits=2)
    st.sd <- round(sd(myData$punishment[which(myData$group=='statistical victim')]), digits=2)

    # Save stats from the t test as variables
    t.value <- round(t.punishment$statistic, digits=2)
    df <- round(t.punishment$parameter, digits=2)
    p <- round(t.punishment$p.value, digits=3)
    @

    Subjects' punishment was no different when they read about identifiable victims (\textit{M} = \Sexpr{id.mean}, \textit{SD} = \Sexpr{id.sd}) than when they read about statistical victims (\textit{M} = \Sexpr{st.mean}, \textit{SD} = \Sexpr{st.sd}), \textit{t}(\Sexpr{df}) = \Sexpr{t.value}, \textit{p} = \Sexpr{p}.

    Notice that I used the \textit{} function a lot to italicize the letters for the statistics. I think there’s a way to make this process more automatic when you report standard statistics, but I don’t know it offhand.

    Try it out and compile the PDF. Each \Sexpr{} call should be replaced by the number that R computed.
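One way to cut down on the repeated \textit{} typing is to let R build the whole statistics string. The function below is a rough sketch rather than an established tool: apa_t is a hypothetical name, and it assumes you pass it the object returned by t.test(). It uses formatC() instead of round() so that trailing zeros are kept (2.50 rather than 2.5), and it drops the leading zero from the p value.

```r
# Hypothetical helper: turn a t.test() result into an APA-style LaTeX string.
apa_t <- function(test) {
  # formatC() keeps trailing zeros, which round() would drop
  t_value <- formatC(unname(test$statistic), format = "f", digits = 2)
  df_text <- formatC(unname(test$parameter), format = "f", digits = 2)

  p_value <- test$p.value
  if (p_value < .001) {
    p_text <- "\\textit{p} < .001"
  } else {
    # Three decimals, with the leading zero removed
    p_digits <- sub("^0", "", formatC(p_value, format = "f", digits = 3))
    p_text <- paste0("\\textit{p} = ", p_digits)
  }

  paste0("\\textit{t}(", df_text, ") = ", t_value, ", ", p_text)
}

# Example with the same simulated data as in the tutorial
set.seed(1234)
punishment <- rnorm(40, mean = 4, sd = 1)
group <- factor(rep(c("identifiable victim", "statistical victim"), each = 20))
example_text <- apa_t(t.test(punishment ~ group))
```

With the function defined in an earlier chunk, the end of the results sentence could be written as \Sexpr{apa_t(t.punishment)} in the manuscript text. Sweave pastes the string into the .tex file as is, so LaTeX still processes the \textit{} commands.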

    Adding Figures

    You can add a Sweave chunk to create a figure anywhere, surround it with a little LaTeX syntax, and you’ll be all set.

    \begin{figure}
    <<Fig1, echo=FALSE, fig=TRUE>>=
    # Make sure to add the fig=TRUE part to the beginning of the chunk!
    # We don't need to load ggplot2 because we did that in our initial chunk

    # Set up the plot data
    plot.means <- c(id.mean, st.mean)
    plot.sds <- c(id.sd, st.sd)
    plot.ses <- plot.sds/sqrt(20)
    plot.groups <- factor(c('identifiable victim','statistical victim'))
    plot.data <- data.frame(plot.means, plot.ses, plot.groups)

    # Create barplot with standard error bars
    ggplot(plot.data, aes(y=plot.means, x=plot.groups)) +
    geom_bar(stat='identity') +
    geom_errorbar(stat='identity', aes(ymin=plot.means-plot.ses, ymax=plot.means+plot.ses))
    @

    \textit{Figure 1.} Average punishment by group. Bars represent standard errors.
    \end{figure}

    This can go anywhere in the document, as long as it’s below all the code chunks that it refers to. So I couldn’t enter this before my initial chunk, because I load the ggplot2 package in the initial chunk. And I couldn’t enter it before the descriptive statistics, because I use the variables for the means and standard deviations that I created in the previous chunk.
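The figure chunk above hard-codes the group size (20) and reuses the rounded means. If your group sizes might change, a small function can compute the plot data straight from the raw data instead. This is a sketch under my own assumptions: summarise_groups is a hypothetical name, and the column names are chosen to match the ones used in the ggplot call above.

```r
# Hypothetical helper: per-group means and standard errors for plotting.
summarise_groups <- function(outcome, group) {
  means <- tapply(outcome, group, mean)
  sds   <- tapply(outcome, group, sd)
  ns    <- tapply(outcome, group, length)

  data.frame(
    plot.groups = factor(names(means)),
    plot.means  = as.vector(means),
    # Standard error = SD / sqrt(n), computed separately for each group
    plot.ses    = as.vector(sds / sqrt(ns)),
    row.names   = NULL
  )
}

# Example with the same simulated data as in the tutorial
set.seed(1234)
punishment <- rnorm(40, mean = 4, sd = 1)
group <- factor(rep(c("identifiable victim", "statistical victim"), each = 20))
plot.data <- summarise_groups(punishment, group)
```

Because the columns keep the same names, the ggplot code from the figure chunk should work unchanged with this plot.data.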

    A Note on Tables

    Writing tables with LaTeX syntax is a bit more involved. I don’t use tables very often in my own manuscripts, so I won’t go into it here. I might revisit this later if someone asks, but for now I recommend that you search online for help formatting tables.

    In-Text Citations and References

    Now we’re on to the last big part of writing the APA-style manuscript, your references.

    Start by creating a text file in TextEdit, Notepad, or my favorite, TextMate. Save it as plain text with the name you want to use for your References document, but make sure to add the extension .bib at the end. For example, I might save it as StatisticalVictimsReferences.bib.

    Inside the .bib file, you’ll write each reference as a BibTeX entry, which apacite then formats in APA style. Here’s the basic format for a journal article.

    @article{,
    author = {},
    title = {},
    journal = {},
    year = {},
    volume = {},
    pages = {x--x},
    doi = {http://dx.doi.org/},
    }

    Aside from filling in the blanks in the curly braces, you need to create your own citekey that you’ll use for your in-text citations. Here’s an example.

    @article{Small2007,
    author = {Small, Deborah A. and Loewenstein, George and Slovic, Paul},
    title = {Sympathy and callousness: The impact of deliberative thought on donations to identifiable and statistical victims},
    journal = {Organizational Behavior and Human Decision Processes},
    year = {2007},
    volume = {102},
    pages = {143--153},
    doi = {http://dx.doi.org/10.1016/j.obhdp.2006.01.005},
    }

    My citekey is Small2007. And here is an example of the apacite entry for a book:

    @book{Box1973,
    author = {Box, G. E. P. and Tiao, G. C.},
    title = {Bayesian inference in statistical analysis},
    year = {1973},
    publisher = {Addison-Wesley Publishing Company},
    address = {Reading, MA},
    }

    You can find other reference formats for the apacite package in the package documentation.

    Most of the errors I run into when writing documents are in my .bib file. I usually forget a comma, add too many commas, or forget the double-hyphen between page numbers.

    To do in-text citations, you use the \cite{} function and feed it the citekey that you created.

    There is a wealth of literature showing that people feel less compassion for statistical victims than for identified victims \cite{Small2007}.

    And that’s it. LaTeX will create the References section and the in-text citation when you compile the pdf. You might need to compile twice for it to work, at least the first time. For other cases of in-text citations, see the apacite documentation to get the syntax.

    The bad news: you need to type in all the information for each reference you use. The good news: once you create a .bib file, you can keep re-using it for every new manuscript you write, and just use the citekeys in future papers without any other typing.
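There is one kind of reference you may not have to type by hand: R itself and the R packages you use. R’s citation() function returns their references, and toBibtex() converts them to BibTeX entries. The sketch below rests on a few assumptions: the citekey RCoreTeam is just a name I made up, the entry is written to a temporary file rather than your real .bib file, and you should still check the result against the apacite documentation before relying on it.

```r
# Get the BibTeX entry for R itself as a plain character vector, one line per element
entry <- as.character(toBibtex(citation()))

# The generated entry has no usable citekey, so put one (hypothetical: RCoreTeam)
# into the first line, which has the form @Type{key,
entry[1] <- sub("^(@[A-Za-z]+)\\{[^,]*,", "\\1{RCoreTeam,", entry[1])

# Append the entry to a .bib file; a temporary file stands in for your own here
bib_file <- tempfile(fileext = ".bib")
cat(entry, "", sep = "\n", file = bib_file, append = TRUE)
```

For a package, citation("ggplot2") works the same way, and \cite{RCoreTeam} would then cite R in the manuscript.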

    Conclusion

    So that’s it! You should now be able to write a manuscript in RStudio. Please send me an e-mail at jdwondra@umich.edu if questions come up, and I’m happy to help, and to revise this tutorial to make it better.
