Repeating Yourself with Functions

Coffee and Coding

07 September 2023

Why?

  • Forecasting project, need to do the same thing with data for 6 centres.
  • Copy-paste risks not doing the same thing each time (and is boring, time-consuming and frustrating).
  • Repetition -> function.

What?

Demo with plots, equally applicable to ‘doing stuff’ with data.

# preview data
head(new_rtt)
  provider_code count rtt_yrmon rtt_mon
1           RJE    83  Nov 2015      11
2           RJE    75  Dec 2015      12
3           RJE    82  Jan 2016       1
4           RJE    74  Feb 2016       2
5           RJE    62  Mar 2016       3
6           RJE    76  Apr 2016       4

Remember, this is about writing functions, not creating stunning visualisations!

Repeat this for each of the 6 centres

How?

Do it ‘normally’ for one centre. What are the parameters to change?

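# Packages used below. su_theme() is assumed to come from StrategyUnitTheme
# and ggarrange() from ggpubr; the original slides do not show the setup.
library(dplyr)
library(ggplot2)
library(lubridate)
library(ggpubr)
library(StrategyUnitTheme)

p1 <- new_rtt |> 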
  filter(provider_code == "RJE") |> 
  ggplot(aes(x = rtt_yrmon, y = count)) +
  geom_line() +
  su_theme() +
    theme(legend.position = "none") +
  labs(title = "RJE",
       subtitle = "time trend of new referrals")

p2 <- new_rtt |> 
  filter(provider_code == "RJE") |> 
  ggplot(aes(x = month(rtt_yrmon), y = count)) +
  geom_col() +
  su_theme() +
    theme(legend.position = "none") +
  labs(
       subtitle = "monthly pattern of new referrals")

plots <- ggarrange(p1, p2, nrow = 2)

plots

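The value that changes between centres (here "RJE") becomes the argument for the function.

Choose a name for the argument that is different from any existing variable or column name. For example, calling it provider_code would make filter(provider_code == provider_code) compare the column with itself.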

In this example we will use prov in place of "RJE"

Anatomy of a Function

fn_name <- function(arguments){
  
  # do stuff
  
}

Run the function with fn_name(parameter as argument)
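
As a minimal sketch of that anatomy, here is a tiny function with one argument. The names fn_label and prov are chosen only for illustration; the body simply builds a title string.

```r
# Hypothetical example: fn_label and its argument prov are illustrative names.
fn_label <- function(prov) {
  # do stuff: build a title string for the given provider code
  paste0(prov, " - new referrals")
}

# Run the function, passing "RJE" as the argument
fn_label("RJE")
```

The call returns "RJE - new referrals"; swapping in another provider code changes the result without touching the function body.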

Turning our code into a function

fn_plots <- function(prov){
  
    p1 <- new_rtt |> 
      filter(provider_code == prov) |> 
      ggplot(aes(x = rtt_yrmon, y = count)) +
      geom_line() +
      su_theme() +
        theme(legend.position = "none") +
      labs(title = prov,
           subtitle = "time trend of new referrals")
    
    p2 <- new_rtt |> 
      filter(provider_code == prov) |> 
      ggplot(aes(x = month(rtt_yrmon), y = count)) +
      geom_col() +
      su_theme() +
        theme(legend.position = "none") +
      labs(
           subtitle = "monthly pattern of new referrals")
    
    plots <- ggarrange(p1, p2, nrow = 2)
    
    plots
    
}
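
fn_plots() ends with a bare plots line because an R function returns the value of its last evaluated expression. The sketch below shows the same pattern with a hypothetical helper, fn_summary(), that is not part of the original code.

```r
# Hypothetical helper, not part of the original slides
fn_summary <- function(x) {
  total <- sum(x)
  average <- mean(x)
  # the last evaluated expression is the return value
  c(total = total, average = average)
}

# Assign the returned value so it can be reused later
res <- fn_summary(c(83, 75, 82))
res[["total"]]
```

Objects created inside the function (total, average) stay inside it; only the final expression comes back to the caller.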

Running our function

fn_plots("RKB")

What if we want more than one argument?

Easy! Just add them to the arguments when you define the function.

If I wanted to run this function on multiple dataframes I would change the function to:

fn_plots <- function(df, prov){
  
    p1 <- df |> 
      filter(provider_code == prov) 
    # and the rest as before
}

and run it with fn_plots(new_rtt, "RKB").

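Note that the order of the arguments is important. If I tried to run fn_plots("RKB", new_rtt), R would match by position: it would treat "RKB" as the dataframe and new_rtt as the provider, and the function would fail. Naming the arguments, as in fn_plots(prov = "RKB", df = new_rtt), avoids the problem.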
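
The difference between positional and named matching can be sketched with a small stand-in. Both toy_rtt (made-up values) and fn_total() are hypothetical; fn_total() only mirrors the argument order of fn_plots(df, prov).

```r
# Toy data standing in for new_rtt (values are made up for illustration)
toy_rtt <- data.frame(
  provider_code = c("RJE", "RJE", "RKB"),
  count = c(83, 75, 60)
)

# Hypothetical function with the same argument order as fn_plots(df, prov)
fn_total <- function(df, prov) {
  sum(df$count[df$provider_code == prov])
}

# Positional matching: the first value goes to df, the second to prov
fn_total(toy_rtt, "RJE")

# Named arguments can be given in any order
fn_total(prov = "RJE", df = toy_rtt)
```

Both calls return 158. Calling fn_total("RJE", toy_rtt) would instead raise an error, because the string "RJE" would be used as the dataframe.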

Working through a list of parameters

Avoid manually running fn_plots() for each provider.
Use purrr::map to iterate over a list

# create a vector of all the providers
prov_labels <- c("RJE", "RKB", "RL4", "RRK", "RWE", "RX1")

map(prov_labels, ~ fn_plots(.x))
# Output: a list of six elements, [[1]] to [[6]], each holding the plots for one provider
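
A self-contained sketch of the same iteration, assuming purrr is installed. fn_title() is a hypothetical stand-in for fn_plots() that returns a string rather than a plot, so the example runs without the data or the plotting packages.

```r
library(purrr)

prov_labels <- c("RJE", "RKB", "RL4", "RRK", "RWE", "RX1")

# Hypothetical stand-in for fn_plots(): returns a title instead of a plot
fn_title <- function(prov) {
  paste0(prov, " - time trend of new referrals")
}

# Passing the function by name is equivalent to ~ fn_title(.x)
titles <- map(prov_labels, fn_title)

# Naming the input vector first gives a named list, so results can be
# retrieved by provider code rather than by position
titles_named <- map(set_names(prov_labels), fn_title)
titles_named[["RKB"]]
```

With the two-argument version of the function, the fixed argument can be supplied inside the formula, for example map(prov_labels, ~ fn_plots(new_rtt, .x)).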

Troubleshooting - does the function work?

Crawl before you can walk - make sure fn_plots() works for one parameter.

Insert browser() into the function while testing - it pauses execution and lets you step through the function line by line (don’t forget to remove it when it works!)
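
One way to place browser() is sketched below. fn_check() is a hypothetical function used only to show the position of the call, and the interactive() guard is an added assumption so that the same code can also run unattended in a script.

```r
# Hypothetical function used only to show where browser() would go
fn_check <- function(prov) {
  label <- paste0(prov, "_plot.png")
  # Pause here when run from an interactive session; skipped in scripts.
  # Remove this line once the function behaves as expected.
  if (interactive()) browser()
  nchar(label)
}
```

When fn_check("RJE") is called from the console, execution stops at the browser() line; typing n runs the next line, c continues to the end, and Q quits the debugger.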

This is a new function that saves the monthly-pattern plot for each provider

fn_save_plot <- function(prov){
  
    p <- new_rtt |> 
      filter(provider_code == prov) |> 
      ggplot(aes(x = month(rtt_yrmon), y = count)) +
      geom_col() +
      su_theme() +
        theme(legend.position = "none") +
      labs(
           subtitle = paste0(prov, " - monthly pattern of new referrals"))
    
    ggsave(paste0(prov, "_plot.png"), 
           plot = p)
  
}
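
Because fn_save_plot() is called for its side effect (writing a file) rather than for its return value, purrr::walk() is an alternative to map(). The sketch below assumes purrr is installed and uses a hypothetical stand-in, fn_save_note(), that writes small text files to a temporary folder instead of saving plots.

```r
library(purrr)

prov_labels <- c("RJE", "RKB", "RL4")
out_dir <- tempdir()

# Hypothetical stand-in for fn_save_plot(): writes a small text file
fn_save_note <- function(prov) {
  path <- file.path(out_dir, paste0(prov, "_note.txt"))
  writeLines(paste("placeholder for", prov), path)
}

# walk() calls the function for each element and returns the input invisibly
walk(prov_labels, fn_save_note)

# Check that one file was written per provider
file.exists(file.path(out_dir, paste0(prov_labels, "_note.txt")))
```

Unlike map(), walk() does not print a list of results to the console, which keeps the output tidy when only the saved files matter.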

Troubleshooting - does it walk the walk?

When learning to walk, wrap your function in safely() or possibly() before iterating over it - the run will then report which parameters failed, rather than falling down at the first error.

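# wrap fn_save_plot in safely: each result is a list with $result and $error
safe_pl <- safely(.f = fn_save_plot)

map(prov_labels, ~ safe_pl(.x))


# wrap fn_save_plot in possibly: a failed call returns the `otherwise` value
poss_pl <- possibly(.f = fn_save_plot, otherwise = NULL)

map(prov_labels, ~ poss_pl(.x))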

Console output of wrapping function in possibly
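
To see what the two wrappers return, here is a small sketch assuming purrr is installed. fn_risky() is a hypothetical function that fails on purpose for one provider code, standing in for a plotting function that hits bad data.

```r
library(purrr)

# Hypothetical function that fails for one input, to mimic a bad provider code
fn_risky <- function(prov) {
  if (prov == "RL4") stop("no data for ", prov)
  paste0(prov, "_plot.png")
}

provs <- c("RJE", "RL4", "RKB")

# safely(): each element is a list with $result and $error
safe_res <- map(provs, safely(fn_risky))

# Keep the provider codes whose call produced an error
failed <- provs[map_lgl(safe_res, ~ !is.null(.x$error))]
failed

# possibly(): failures are replaced by the `otherwise` value
poss_res <- map_chr(provs, possibly(fn_risky, otherwise = NA_character_))
poss_res
```

Here failed contains only "RL4", and poss_res holds the two file names with NA in the middle. safely() is the better fit when the error message is needed; possibly() is simpler when a placeholder value is enough.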