Automate Professional PowerPoint Presentations Using R

The PowerPoint presentation is the vehicle of choice for information sharing among many businesses. R brings unparalleled power to data analysis and visualization. However, getting those visualizations from a window in RStudio to a formatted slide via copy/paste is tedious and error-prone. There is a better way. Jeff Renz demonstrates how R Markdown can automate the creation of presentation-worthy slides from R code. This feature saves hours of time, eliminates copy/paste errors, and allows a user to update a two-hundred-page slide deck with a keystroke. This matters most for decks that include state-by-state data, profit margins across dozens of product lines, or complex visualizations reliant on constantly updated data.

In this example, we use COVID-19 data provided by the New York Times, which is updated daily. The R code allows users to get the latest statistics by state in a formatted PowerPoint deck instantly. Check out the code and the outputs on GitHub:

The following sections provide step-by-step instructions on how to:

  1. Run R Markdown and Knit to create PowerPoint output.
  2. Create PowerPoint elements:
    • Title slide
    • Side-by-side elements
    • A full-page element, in this case a nicely formatted table
  3. Fix PowerPoint formatting problems.


Creating PowerPoint Output

Open up Covid19_Analysis_by_State.Rmd.

Click Knit -> Knit to PowerPoint. It takes about one minute to generate the PowerPoint file.

Click to view PowerPoint.
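The Knit button can also be driven from the R console, which is handy when the deck needs to be refreshed on a schedule. The sketch below assumes the rmarkdown package is installed, that Pandoc 2.0.5 or later is available (the version R Markdown needs for PowerPoint output), and that the working directory contains the .Rmd file named above. The helper name render_deck() is hypothetical.

```r
# Sketch: render the deck from the R console instead of the Knit button.
# render_deck() is a hypothetical helper; the default file name is the one
# mentioned in this article and is assumed to be in the working directory.
render_deck <- function(rmd_file = "Covid19_Analysis_by_State.Rmd") {
  if (!file.exists(rmd_file)) {
    stop("Cannot find ", rmd_file,
         "; set the working directory to the project folder first.")
  }
  # render() returns the path of the generated .pptx file
  rmarkdown::render(rmd_file, output_format = "powerpoint_presentation")
}

# Usage, once the project folder is the working directory:
# pptx_path <- render_deck()
```

The existence check runs before any rendering starts, so a wrong working directory fails fast with a readable message instead of a Pandoc error.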


Generate Title Slide

This section references the Covid-19-Analysis-By-State.Rmd file on GitHub:

This tutorial assumes some familiarity with R, RStudio, and R Markdown. If you’ve ever used R Markdown to output an HTML file or a PDF, the format will look very familiar. Simply change the output in the header to powerpoint_presentation.

Here we’ve also added `r Sys.Date()` to show the current date and added a PASS Template.

We’ll get into more detail on making advanced changes to the PowerPoint template later. For now, the important point is to retain the default names of the layout slides in the PowerPoint Master View. R Markdown will look for those names when it renders the presentation.
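As a rough illustration of the header described above, the sketch below writes a minimal R Markdown header to a scratch file and prints it. The title and the template file name (PASS_Template.potx) are hypothetical placeholders, not taken from the original document. The reference_doc option is how powerpoint_presentation is pointed at a template, and Pandoc's documentation lists layout names such as Title Slide, Title and Content, Section Header, and Two Content as the ones it looks for.

```r
# Sketch of an R Markdown header for PowerPoint output.
# The title and template file name are hypothetical placeholders.
header <- c(
  "---",
  'title: "COVID-19 Analysis by State"',
  'date: "`r Sys.Date()`"',          # inline R: evaluated when the deck is knit
  "output:",
  "  powerpoint_presentation:",
  "    reference_doc: PASS_Template.potx",
  "---"
)

# Write the header to a scratch file so it can be opened and inspected
demo_file <- file.path(tempdir(), "header_demo.Rmd")
writeLines(header, demo_file)
cat(readLines(demo_file), sep = "\n")
```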

PowerPoint title slide:


Create Full Page Table Slide

For this document, we used the following R libraries:
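Before knitting, it can help to confirm that the packages named in this article are installed. The sketch below checks for rmarkdown, knitr, dplyr, lubridate, ggplot2, and usmap. That set is an assumption drawn from the packages mentioned in the text, and the original document may load others as well.

```r
# Packages mentioned in this article (assumed set; the original may differ)
pkgs <- c("rmarkdown", "knitr", "dplyr", "lubridate", "ggplot2", "usmap")

# requireNamespace() returns TRUE when a package can be loaded
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Install before knitting: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages found.")
}
```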

We are pulling the data from the New York Times GitHub repository. We use the lubridate function as_datetime() to parse the dates and R’s built-in state data to turn the state names into their abbreviations.
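A minimal sketch of those two steps on a tiny stand-in data frame follows. The rows are made-up illustration values, not New York Times figures, and the column names are assumptions. It also assumes the built-in data in question are base R's state.name and state.abb vectors, which cover the 50 states only.

```r
# Stand-in for the downloaded data; values are made up for illustration
covid <- data.frame(
  date  = c("2020-04-01", "2020-04-01", "2020-04-01"),
  state = c("Colorado", "New York", "District of Columbia"),
  cases = c(100, 200, 50),
  stringsAsFactors = FALSE
)

# Parse the text dates into date-times (UTC by default)
covid$date <- lubridate::as_datetime(covid$date)

# Look up each state name in state.name and take the matching abbreviation
covid$state_abb <- state.abb[match(covid$state, state.name)]

print(covid)
```

Names that are not in state.name, such as the District of Columbia or the territories, come back as NA and need separate handling.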

Slides with a single element follow the standard R Markdown pattern. A hash sign (#) precedes the header text. The following encapsulates the code block: ```{ r …} [code] ```. We set the warning, message, and echo chunk options to FALSE to limit the output of this block to the result itself. No one wants to see R warning messages in a presentation.
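Rather than repeating the three options on every chunk, they can be set once in a setup chunk near the top of the .Rmd file. This is a minimal sketch using knitr, the package R Markdown relies on to run chunks. Whether the original document sets the options globally or per chunk is not shown here.

```r
# Apply to every later chunk: hide the code, warnings, and messages
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)

# Inspect the defaults now in effect
str(knitr::opts_chunk$get(c("echo", "warning", "message")))
```

Options given on an individual chunk still override these defaults, so a single chunk can turn echo back on if the code itself belongs on a slide.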

In this block, we use the dplyr library (part of the tidyverse) to organize the data. The dplyr library allows for the use of pipes (%>%), which take the output from the previous function and feed it into the following function. This keeps the code cleaner and easier to read. We’re using the kable() function (part of the knitr package) to output the table.
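The pipe-then-kable() pattern can be sketched as follows. The data frame is a made-up stand-in with assumed column names, and the summary shown (latest cumulative counts per state) is an illustration rather than the article's actual table.

```r
library(dplyr)
library(knitr)

# Made-up cumulative counts for illustration only
covid <- data.frame(
  state  = c("Colorado", "Colorado", "New York", "New York"),
  date   = as.Date(c("2020-04-01", "2020-04-02", "2020-04-01", "2020-04-02")),
  cases  = c(100, 150, 200, 260),
  deaths = c(2, 3, 5, 8)
)

# Each step's output feeds the next: group, summarise, then sort
summary_tbl <- covid %>%
  group_by(state) %>%
  summarise(cases = max(cases), deaths = max(deaths)) %>%
  arrange(desc(cases))

# kable() turns the data frame into a table that R Markdown can place on a slide
kable(summary_tbl)
```

Taking max() per state works here because the counts are cumulative, so the largest value is also the most recent one.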

But wait, the output runs off the page. Don’t worry, we’ll cover this in advanced formatting.


Create Slide with Side-By-Side Elements

To create a slide with two components, we need to invoke the columns pattern. Note the series of colons ( ::::::::::::: ) encapsulating the code for the slide content. We use inline R code (`r [code]`) to set the title of the slide as the state name.

Each new column is denoted with three colons followed by {.column}. Colorado is the 6th state (alphabetically), so index [6] outputs “Colorado”.
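To make the columns pattern concrete, the sketch below assembles the markup for one two-column slide as text and prints it. The helper two_column_slide() is hypothetical, and base R's state.name vector stands in for whatever state list the original document indexes. The printed lines are what would sit in the .Rmd file.

```r
# Hypothetical helper: build the R Markdown lines for one two-column slide
two_column_slide <- function(i) {
  fence <- strrep("`", 3)  # chunk delimiter (three backticks)
  c(
    sprintf("# `r state.name[%d]`", i),   # inline R sets the slide title
    "",
    ":::::::::::::: {.columns}",          # opens the columns container
    "::: {.column}",
    paste0(fence, "{r, echo=FALSE}"),
    "# code for the left-hand element goes here",
    fence,
    ":::",
    "::: {.column}",
    paste0(fence, "{r, echo=FALSE}"),
    "# code for the right-hand element goes here",
    fence,
    ":::",
    "::::::::::::::"                      # closes the columns container
  )
}

slide <- two_column_slide(6)
cat(slide, sep = "\n")

# The inline expression in the title evaluates to the sixth state name
state.name[6]
```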

We used ggplot2 to create the graph with the cumulative cases and deaths per state and the usmap library to create the county-level heat map.
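A minimal ggplot2 sketch of the cumulative line chart follows, using made-up numbers and assumed column names. The styling of the original chart is not reproduced, and the county-level usmap heat map is not sketched here.

```r
library(ggplot2)

# Made-up cumulative counts for one state, in long format
co <- data.frame(
  date    = rep(as.Date(c("2020-04-01", "2020-04-02", "2020-04-03")), times = 2),
  measure = rep(c("cases", "deaths"), each = 3),
  count   = c(100, 150, 210, 2, 3, 5)
)

# One line per measure, coloured by measure
p <- ggplot(co, aes(x = date, y = count, colour = measure)) +
  geom_line() +
  labs(title = "Colorado", x = NULL, y = "Cumulative count", colour = NULL)

print(p)
```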

We used other functions to output additional analysis on the rate of infections. The functions are in the Git repository at scripts>02_load_functions.R.

This outputs a slide with the same formatting issue as the state summary slide. We’ll fix this later.


Advanced Formatting of PowerPoint Template

The table that spilled off the slide is easily fixed by changing its font size from 18 pt to 12 pt. However, this is a little trickier than changing a setting in the Slide Master view.

To accomplish this, we must crack open the template file. To do so, change the extension on the template file from .potx (PowerPoint template) to .7z (a 7-Zip archive).

Now use a zip program such as 7-Zip File Manager to open the file.

Go to ppt>slideMaster>slideMaster.xml and edit the file.

We pasted the code into Notepad++ to take advantage of the program’s XML formatting.

Under <p:otherStyle>, change sz="1800" to sz="1200" (18 pt font to 12 pt font). The sz attribute is measured in hundredths of a point.

Copy/paste the modified code into the edit window and save. Close the window and 7-Zip File Manager, and change the file extension back to .potx. We renamed the file PASS_Template_cracked.potx.
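The same text substitution can be sketched in base R, which is one way to avoid hand-editing if the change has to be repeated. The function below works on the XML as a string and only touches sz values inside <p:otherStyle>. The sample XML is a heavily trimmed, made-up fragment, and shrink_other_style() is a hypothetical helper, not part of the original workflow.

```r
# Hypothetical helper: change sz inside <p:otherStyle> only
shrink_other_style <- function(xml, from = 1800, to = 1200) {
  # (?s) lets "." span line breaks; .*? stops at the first closing tag
  m <- regexpr("(?s)<p:otherStyle>.*?</p:otherStyle>", xml, perl = TRUE)
  if (m == -1) return(xml)  # no otherStyle block: leave the XML untouched
  block <- regmatches(xml, m)
  regmatches(xml, m) <- gsub(sprintf('sz="%d"', from),
                             sprintf('sz="%d"', to),
                             block, fixed = TRUE)
  xml
}

# Trimmed, made-up fragment of a slide master for illustration
xml <- paste0(
  '<p:txStyles>',
  '<p:bodyStyle><a:lvl1pPr><a:defRPr sz="1800"/></a:lvl1pPr></p:bodyStyle>',
  '<p:otherStyle><a:lvl1pPr><a:defRPr sz="1800"/></a:lvl1pPr></p:otherStyle>',
  '</p:txStyles>'
)

fixed_xml <- shrink_other_style(xml)
cat(fixed_xml, "\n")
```

Restricting the replacement to the otherStyle block leaves the body text size alone, which matches the manual edit described above.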

Now the default font size for the table is 12 point. It fits nicely on the page, making for a well-formatted presentation. Producing a 102-page PowerPoint with the latest statewide analysis of COVID-19 infections now requires only a keyboard shortcut (Ctrl+Shift+K).

Here’s the newly formatted state summary slide:

Here’s the fixed slide showing the state-specific analysis of infection rates:

Jeff Renz
About the author

Jeff Renz is a Senior Architect at RevGen Partners currently working as the design architect and implementation lead on several projects for a Fortune Top 50 company. Jeff has worked with SQL Server and BI for 15+ years and has 10 years’ experience with data warehouse design and implementation. In addition to consulting, he is joining the University of Denver adjunct faculty and is expected to begin teaching in the ICT program starting in June. He received his bachelor’s degree in Computer Science, his master’s degree in Operations Research from Colorado School of Mines, and will be graduating from the Harvard Business Analytics Program in December.
