NHS-R Community Webinar
Aug 23, 2023
Software testing is the act of examining the artifacts and the behaviour of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation.
Unit Testing checks each component (or unit) for accuracy independently of one another.
Integration Testing integrates units to ensure that the code works together.
End-to-End Testing (e2e) makes sure that the entire system functions correctly.
User Acceptance Testing (UAT) ensures that the product meets the real user’s requirements.
We have a {shiny} app which grabs some data from a database, manipulates the data, and generates a plot.
Image source: "The Testing Pyramid: Simplified for One and All" (headspin.io)
expect_*() functions
We arrange the environment before running the function,
we act by calling the function,
and we assert that the actual results match our expected results.
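For the example below, assume my_function is a simple stand-in; this definition is not from the original slides, just a plausible helper that divides its arguments, so 5 / 7 ≈ 0.7142857:

my_function <- function(x, y) {
  # hypothetical stand-in for the function under test: divide x by y
  x / y
}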
test_that("my_function works", {
# arrange
x <- 5
y <- 7
expected <- 0.714285
# act
actual <- my_function(x, y)
# assert
expect_equal(actual, expected)
})
── Failure: my_function works ──────────────────────────────────────────────────
`actual` not equal to `expected`.
1/1 mismatches
[1] 0.714 - 0.714 == 7.14e-07
Error:
! Test failed
test_that("my_function works", {
# arrange
x <- 5
y <- 7
expected <- 0.714285
# act
actual <- my_function(x, y)
# assert
expect_equal(actual, expected, tolerance = 1e-6)
})
Test passed 🎊
(this is a slightly artificial example, usually the default tolerance is good enough)
Remember the validation steps we built into our function to handle edge cases?
Let’s write tests for these edge cases: for invalid inputs we expect errors.
(The edge cases covered here are a non-exhaustive list; a sketch of such tests is shown below.)
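A minimal sketch of what these tests might look like, assuming the validation steps raise errors for non-numeric arguments and for division by zero (these specific checks are assumptions, not from the original slides):

test_that("my_function rejects invalid input", {
  # each expectation passes only if the call signals an error
  expect_error(my_function("a", 7))
  expect_error(my_function(5, "b"))
  expect_error(my_function(5, 0))
})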
library(DBI)
library(dplyr)
library(readr)
library(ggplot2)

my_big_function <- function(type) {
  # connect to the database and pull the data table
  con <- dbConnect(RSQLite::SQLite(), "data.db")

  df <- tbl(con, "data_table") |>
    collect() |>
    mutate(across(date, lubridate::ymd))

  # load the conditions lookup and keep only the requested type
  conditions <- read_csv(
    "conditions.csv", col_types = "cc"
  ) |>
    filter(condition_type == type)

  # count the matching rows per date and plot the trend
  df |>
    semi_join(conditions, by = "condition") |>
    count(date) |>
    ggplot(aes(date, n)) +
    geom_line() +
    geom_point()
}
Function to get the data from the database
Function to get the relevant conditions
Function to combine the data and create a count by date
Function to generate a plot from the summarised data
The original function refactored to use the new functions
This is going to be significantly easier to test, because we can now verify that the individual components work correctly, rather than having to consider all of the possibilities at once.
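A sketch of how that refactor might look; summarise_data matches the signature used in the tests below, while the other function names and arguments are assumptions:

get_data <- function(con) {
  # pull the data table and parse the dates
  tbl(con, "data_table") |>
    collect() |>
    mutate(across(date, lubridate::ymd))
}

get_conditions <- function(type) {
  # load the conditions lookup and keep only the requested type
  read_csv("conditions.csv", col_types = "cc") |>
    filter(condition_type == type)
}

summarise_data <- function(df, conditions) {
  # keep only rows whose condition appears in `conditions`, then count by date
  df |>
    semi_join(conditions, by = "condition") |>
    count(date)
}

create_plot <- function(df) {
  ggplot(df, aes(date, n)) +
    geom_line() +
    geom_point()
}

my_big_function <- function(type) {
  con <- dbConnect(RSQLite::SQLite(), "data.db")
  get_data(con) |>
    summarise_data(get_conditions(type)) |>
    create_plot()
}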
summarise_data
Generate some random data to build a reasonably sized data frame.
You could also create a table manually, but part of the trick of writing good tests for this function is to make it so the dates don’t all have the same count.
The reason for this is it’s harder to know for sure that the count worked if every row returns the same value.
We don’t need the values to be exactly like they are in the real data, just close enough. Instead of dates, we can use numbers, and instead of actual conditions, we can use letters.
Tests need to be reproducible, and generating our table at random will give us unpredictable results.
So, we need to set the random seed; now every time this test runs we will generate the same data.
Create the conditions table. We don’t need all of the columns that are present in the real csv, just the ones that will make our code work.
We also need to test that the filtering join (semi_join) is working, so we want to use a subset of the conditions that were used in df.
Because we are generating df randomly, to figure out what our "expected" results are, I simply ran the code inside of the test to generate the "actual" results.
Generally, this isn’t a good idea. You are creating the results of your test from the code; ideally, you want to be thinking about what the results of your function should be.
Imagine your function doesn’t work as intended: there is some subtle bug that you are not yet aware of. By writing tests "backwards" you may write test cases that confirm the results but not expose the bug. This is why it’s good to think about edge cases.
test_that("it summarises the data", {
# arrange
set.seed(123)
df <- tibble(
date = sample(1:10, 300, TRUE),
condition = sample(c("a", "b", "c"), 300, TRUE)
)
conditions <- tibble(condition = c("a", "b"))
expected <- tibble(
date = 1:10,
n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)
)
# act
actual <- summarise_data(df, conditions)
# assert
})
That said, in cases where we can be confident (say, by static analysis of our code) that it is correct, building tests in this way will give us confidence going forward that future changes do not break existing functionality.
In this case, I have created the expected data frame using the results from running the function.
test_that("it summarises the data", {
# arrange
set.seed(123)
df <- tibble(
date = sample(1:10, 300, TRUE),
condition = sample(c("a", "b", "c"), 300, TRUE)
)
conditions <- tibble(condition = c("a", "b"))
expected <- tibble(
date = 1:10,
n = c(19, 18, 12, 14, 17, 18, 24, 18, 31, 21)
)
# act
actual <- summarise_data(df, conditions)
# assert
expect_equal(actual, expected)
})
Test passed 😸
The test works!
{testthat} works best with packages.
The {usethis} package helps with setup:
use_testthat() will set up the folders for test scripts
use_test() will create a test file for the currently open script
The {withr} package has a lot of useful functions that will automatically clean things up when the test finishes.
If you want to test my_big_function (from before) without calling the intermediate functions, then you should look at the {mockery} package.
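For example, inside a package project the setup might look like this (the test name and the option used in the {withr} example are just illustrations, not from the original slides):

usethis::use_testthat()              # creates tests/testthat/ and tests/testthat.R
usethis::use_test("summarise_data")  # creates tests/testthat/test-summarise_data.R

test_that("verbose output is switched on", {
  # {withr} restores the previous value of the option when this test finishes
  withr::local_options(list(verbose = TRUE))
  expect_true(isTRUE(getOption("verbose")))
})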
Questions?
thomas.jemmett@nhs.net / DM me on slack
view slides at the-strategy-unit.github.io/data_science/presentations