Repeating Yourself with Functions

Coffee and Coding

07 September 2023

Why?

  • Forecasting project: need to do the same thing with data for 6 centres.
  • Copy-pasting runs the risk of not doing the same thing each time (and is boring, time-consuming and frustrating).
  • Repetition -> function.

What?

Demo with plots, equally applicable to ‘doing stuff’ with data.

# preview data
head(new_rtt)
  provider_code count rtt_yrmon rtt_mon
1           RJE    83  Nov 2015      11
2           RJE    75  Dec 2015      12
3           RJE    82  Jan 2016       1
4           RJE    74  Feb 2016       2
5           RJE    62  Mar 2016       3
6           RJE    76  Apr 2016       4

Remember, this is about writing functions, not creating stunning visualisations!

Repeat this for each of the 6 centres

How?

Do it ‘normally’ for one centre. What are the parameters to change?

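# packages assumed by the code on these slides
library(dplyr)     # filter()
library(ggplot2)   # ggplot(), geom_line(), geom_col(), labs()
library(lubridate) # month()
library(ggpubr)    # ggarrange() (assumed source; the original does not say)
# su_theme() is a custom theme function, assumed to be loaded already

p1 <- new_rtt |>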
  filter(provider_code == "RJE") |>
  ggplot(aes(x = rtt_yrmon, y = count)) +
  geom_line() +
  su_theme() +
  theme(legend.position = "none") +
  labs(title = "RJE", subtitle = "time trend of new referrals")

p2 <- new_rtt |>
  filter(provider_code == "RJE") |>
  ggplot(aes(x = month(rtt_yrmon), y = count)) +
  geom_col() +
  su_theme() +
  theme(legend.position = "none") +
  labs(
    subtitle = "monthly pattern of new referrals"
  )

plots <- ggarrange(p1, p2, nrow = 2)

plots

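The value that changes between centres (here the provider code "RJE") becomes the argument of the function.

Choose a name for the argument that is not already used as a variable name (!= variable_name).

In this example we will use prov in place of "RJE".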

Anatomy of a Function

fn_name <- function(arguments) {
  # do stuff
}

Run the function with fn_name(parameter), passing the parameter value as the argument.
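As a minimal sketch of this anatomy, the example below uses base R only. The data frame toy_rtt and the function name fn_total are hypothetical, invented for illustration: the first two counts are copied from the preview above and the third is made up.

```r
# Toy data shaped like new_rtt (third row is made up for illustration)
toy_rtt <- data.frame(
  provider_code = c("RJE", "RJE", "RKB"),
  count = c(83, 75, 40)
)

# fn_total is a hypothetical name; prov is the argument
fn_total <- function(prov) {
  rows <- toy_rtt[toy_rtt$provider_code == prov, ]
  sum(rows$count) # the last value evaluated is what the function returns
}

total_rje <- fn_total("RJE")
total_rje # 158
```

The body is ordinary R code; the only change from writing it 'normally' is that the hard-coded provider is replaced by prov.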

Turning our code into a function

fn_plots <- function(prov) {
  p1 <- new_rtt |>
    filter(provider_code == prov) |>
    ggplot(aes(x = rtt_yrmon, y = count)) +
    geom_line() +
    su_theme() +
    theme(legend.position = "none") +
    labs(title = prov, subtitle = "time trend of new referrals")

  p2 <- new_rtt |>
    filter(provider_code == prov) |>
    ggplot(aes(x = month(rtt_yrmon), y = count)) +
    geom_col() +
    su_theme() +
    theme(legend.position = "none") +
    labs(
      subtitle = "monthly pattern of new referrals"
    )

  plots <- ggarrange(p1, p2, nrow = 2)

  plots
}

Running our function

fn_plots("RKB")

What if we want more than one argument?

Easy! Just add them to the arguments when you define the function.

If I wanted to run this function on multiple dataframes I would change the function to:

fn_plots <- function(df, prov) {
  p1 <- df |>
    filter(provider_code == prov)
  # and the rest as before
}

and run it with fn_plots(new_rtt, "RKB").

Note that the order of the parameters matters when they are passed by position. If I tried to run fn_plots("RKB", new_rtt), "RKB" would be used as the dataframe and new_rtt as the provider, and the function would fail.
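One way to avoid depending on position is to name the arguments in the call. The sketch below uses base R and a hypothetical toy_rtt data frame and fn_rows function (neither is from the original project) to show that a named call gives the same result in any order.

```r
# Toy data shaped like new_rtt (values are made up for illustration)
toy_rtt <- data.frame(
  provider_code = c("RJE", "RKB", "RKB"),
  count = c(83, 40, 45)
)

# fn_rows is a hypothetical two-argument function: dataframe first, provider second
fn_rows <- function(df, prov) {
  df[df$provider_code == prov, ]
}

by_position <- fn_rows(toy_rtt, "RKB")           # order matters here
by_name <- fn_rows(prov = "RKB", df = toy_rtt)   # order does not matter here

identical(by_position, by_name)
```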

Working through a list of parameters

Avoid manually running fn_plots() for each provider.
Use purrr::map() to iterate over a vector or list of providers.

# create a vector of all the providers
prov_labels <- c("RJE", "RKB", "RL4", "RRK", "RWE", "RX1")

map(prov_labels, ~ fn_plots(.x))
[Output: a list of six plot objects, [[1]] to [[6]], one per provider; plot images omitted.]
Troubleshooting - does the function work?

Crawl before you can walk - make sure fn_plots() works for one parameter.

Insert browser() into the function while testing - it pauses execution so you can step through the function body (don’t forget to remove it when it works!)
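A rough sketch of where browser() might go is shown below. The debug argument is a hypothetical addition so that the pause can be switched on and off; toy_rtt and fn_total are illustrative names, not part of the original code.

```r
# Toy data shaped like new_rtt (third row is made up for illustration)
toy_rtt <- data.frame(
  provider_code = c("RJE", "RJE", "RKB"),
  count = c(83, 75, 40)
)

# `debug` is a hypothetical extra argument used only while testing
fn_total <- function(prov, debug = FALSE) {
  rows <- toy_rtt[toy_rtt$provider_code == prov, ]
  if (debug) browser() # pauses here so prov and rows can be inspected
  sum(rows$count)
}

res <- fn_total("RJE") # runs normally, no pause
# fn_total("RJE", debug = TRUE) # interactive use: n = next line, c = continue, Q = quit
```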

This is a new function that will save the monthly-pattern plot for each provider.

fn_save_plot <- function(prov) {
  p <- new_rtt |>
    filter(provider_code == prov) |>
    ggplot(aes(x = month(rtt_yrmon), y = count)) +
    geom_col() +
    su_theme() +
    theme(legend.position = "none") +
    labs(
      subtitle = paste0(prov, " - monthly pattern of new referrals")
    )

  ggsave(paste0(prov, "_plot.png"), plot = p)
}
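Because a saving function is called for its side effect (writing a file) rather than its return value, purrr::walk() is a natural fit for running it over every provider. The sketch below assumes purrr is installed; toy_save() is a hypothetical stand-in that writes a small text file to a temporary folder instead of a PNG.

```r
library(purrr)

out_dir <- tempdir()

# Hypothetical stand-in for fn_save_plot(): writes a text file instead of a plot
toy_save <- function(prov) {
  path <- file.path(out_dir, paste0(prov, "_plot.txt"))
  writeLines(paste("plot for", prov), path)
}

prov_labels <- c("RJE", "RKB", "RL4")

# walk() calls toy_save() once per provider and prints nothing
walk(prov_labels, toy_save)

saved <- file.exists(file.path(out_dir, paste0(prov_labels, "_plot.txt")))
saved
```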

Troubleshooting - does it walk the walk?

When learning to walk, wrap your function in safely() or possibly() before passing it to map() or walk() - it will indicate if any parameters have failed, rather than just falling down at the first error.

# wrap fn_save_plot in safely
safe_pl <- safely(.f = fn_save_plot)

map(prov_labels, ~ safe_pl(.x))


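# wrap fn_save_plot in possibly
# otherwise = NULL is the default from purrr 1.0.0; older versions need it stated
poss_pl <- possibly(.f = fn_save_plot, otherwise = NULL)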

map(prov_labels, ~ poss_pl(.x))

Console output of wrapping function in possibly
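To see which parameters failed, the list returned by a safely()-wrapped function can be inspected: each element holds a result and an error, one of which is NULL. The sketch below assumes purrr is installed and uses a hypothetical toy_save() that deliberately fails for one provider code.

```r
library(purrr)

# Hypothetical stand-in for fn_save_plot(): fails for one provider on purpose
toy_save <- function(prov) {
  if (prov == "RL4") stop("no data for ", prov)
  paste0(prov, "_plot.png")
}

prov_labels <- c("RJE", "RKB", "RL4")

# safely(): every call returns list(result = ..., error = ...)
safe_toy <- safely(toy_save)
results <- map(prov_labels, safe_toy) |> set_names(prov_labels)

# keep only the non-NULL errors, then read off which providers failed
failed <- results |> map("error") |> compact()
names(failed)

# possibly(): failures are replaced by the value given in `otherwise`
poss_toy <- possibly(toy_save, otherwise = NA_character_)
saved <- map_chr(prov_labels, poss_toy)
saved
```

safely() is the more informative choice while debugging, since it keeps the error message; possibly() is simpler once a sensible default for failures is known.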