Getting my head round ‘mocking’

testing
software development
mocking
learning
reflection
Author
Affiliation

Fran Barton

Published

December 31, 2025

My neurons feel tangled.

There’s something about the concept of ‘mocking’ in software development that just isn’t clicking with me. I’ve had very capable people generously explain it to me. And I’ve read the docs and watched the videos.

No matter. It just doesn’t seem to make sense to my brain in the right way.

But I know that it can click - and I know that I have been here before.

A frame from Taylor Swift’s Anti-Hero video with the lyric “It’s me, hi, I’m the problem, it’s me”

Grasping functions

When I was starting out learning R, I just wrote everything in long scripts, copying and pasting code as I went.

I read more experienced analysts talking about the virtues of writing functions, but I didn’t really know or understand what a function was; it just sounded like something unnecessarily advanced and complicated. I had very humble needs, and my code worked for me, so why would I need to leap into the hyperspace of writing functions and even developing a package? All far too fussy and excessive.

And yet. Eventually I got tired of copying and pasting and noticed - with a kind of igniting excitement in my mind - that the way to do the same thing multiple times with slightly different parameters was to turn it into a function.

OK, fine. But now I have to understand how a function works.

“Which bits of information do I need to provide as arguments, and which things can just be in the script?”

“What if I need to do something inside the function to the variable passed in as x; should I still call it x or do I then need to call it something else?”

“Does the argument x need to have the same name as the name of the variable I am planning to pass in from my environment, for it to work? Or is the opposite true - in order to avoid confusion, you should never call your function argument the same as the variable you’re going to pass in?”

Well, these are all very entertaining newbie questions.

When you’re conceptually out of your depth, you ask questions that don’t make any sense! But at least you’re asking questions.

Once I understood what I was doing with functions, all of these conceptual confusions, about what I can now call the evaluation environment of the function and what it means to write a pure function and so on, seemed so misguided and the right way of thinking about things seemed so obvious.
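A small sketch of how those questions about arguments turn out, using a made-up `add_one()` function (all the names here are purely illustrative):

```r
# The argument name `x` only exists inside the function.
add_one <- function(x) {
  x <- x + 1  # reassigning x changes the function's own copy, nothing outside
  x
}

# The caller's variable does not need to be called `x`...
my_number <- 10
result <- add_one(my_number)

# ...and it does no harm if it happens to share the name.
x <- 5
also <- add_one(x)

# Neither outer variable has been modified by the calls.
stopifnot(result == 11, my_number == 10, also == 6, x == 5)
```

The function only ever sees the value it was handed, under its own name for it.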

Once you know what you’re doing, it can be hard to remember what it was like not to know.

But I have to remember - and I do remember, quite well! - that it doesn’t feel like that when you’re in the swamp of learning, oscillating between conscious and unconscious incompetence.

When you’re new, and overawed, and uncertain, and learning, you don’t know if you’re doing it right or wrong.

You’re often in the dark, and when things don’t work as you expect, you don’t know how to tell the difference between “I’m stupid and I am out of my depth and this is never going to work” and “I just need to change a little thing, I’m nearly there.”

It’s so time-consuming!

(And don’t get me started on when I then started trying to learn how to use purrr::map() and friends!)

From my position as a relatively experienced R coder and developer in 2025, I can look back on the most frustrating bits of my R learning journey with a mixture of fondness, empathy and horror.

It all feels like so much water that I am glad is under the bridge.

But, now, here I am again, trying to learn something new, and experiencing almost exactly the same kinds of brainaches as I did back in 2020.

It’s easy to feel anger generated by frustration, alongside my determination.

“Why does it do that?”
“What does that word mean?”
“WHY do people who write tutorials ever think it’s OK to use the words ‘just’ or ‘simply’?”
“Slow down! I’m lost already!”

Mocks

Given all the above, I know I can get my head around mocking. It is just going to take some time. And, at some point, a breakthrough moment when it finally clicks.

There’s something in my mind - my own conceptual model - of what we are doing with mocking that is consistently wrong. When I read a tutorial, the next line of code or the next sentence is never what I expect it to be, in the way that it would be if I were reading about a technique that better fits my mental model(s).

It’s nice when you read a line of code and can sense what its output is going to be.

When I read about mocking, the output or the effect of the code I read is always different to what I thought it was doing or what I expected to see. That jolt of surprise comes with its own little emotional punch of confusion and inadequacy.

There’s an opacity to it, for me currently.

I wonder what learning techniques I can use to help me.

  • Writing things down?
  • Making notes on tutorials?
  • Looking for analogies?
  • Trying to reframe the language and core concepts of mocking into a different metaphor that better conforms to the shape of my brain and the assumptions I am bringing?
  • Just doing it myself repeatedly until it clicks?

Here’s some resources

Here’s what I currently know about mocking

Off the top of my head:

OK, so sometimes when you are testing a function that receives input from an external source, like data from an API, or a great big (or sensitive) dataset, it’s not practical or ethical to run your tests for that function against the actual data. It’s slow, and unreliable, or it involves revealing aspects of the dataset that are confidential, or it involves pinging an API endpoint over the internet that might not always be available. Or even if you can run those tests locally on your own machine, you can’t share that data, or the access key/token for it, so anyone else that needs to test your package can’t run your tests. And you can’t then run automated tests via GitHub Actions, either.

OK so that’s one set of scenarios at least that explains the need to mock up the data or the response.

I get it: you just need to test what your functions do, not whether the external data source is present and functioning. That makes perfect sense.

OK, so how do I replace the external data in my tests?

Here’s where I’m confused.

If I have a function:

double <- \(x) x * 2

and I do something like

x <- 2

local_mocked_bindings(
  double = function(...) 4
)

then I haven’t tested if double() actually works, I’ve just stipulated that it does by providing what I expect its answer to be.
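A minimal sketch of the pattern that tutorials usually seem to be aiming at: the stand-in replaces something the function under test depends on, not the function itself. Here `fetch_value()` and `double_fetched()` are hypothetical names invented for illustration, and passing the fetcher in as an argument is a simplification so that the example runs as a plain script without a package.

```r
# Hypothetical stand-in for a slow, private or unreliable external call.
fetch_value <- function() {
  stop("no network access in tests")
}

double <- \(x) x * 2

# The function under test. It needs a value from outside,
# but its own job (the part worth testing) is the doubling.
double_fetched <- function(fetch = fetch_value) {
  double(fetch())
}

# In the test, only the external dependency is replaced.
fake_fetch <- function() 2
result <- double_fetched(fetch = fake_fetch)

# double() really ran on the fake input; if it were broken, this would fail.
stopifnot(identical(result, 4))
```

Inside a package's tests, `local_mocked_bindings(fetch_value = function() 2)` should make the same swap without needing the extra `fetch` argument.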

Or at the other end of the spectrum, if I need to mock up a large data frame as a function input, do I need to just create a synthetic replacement out of my own imagination, or take the real data and kind of run some functions over it to mangle and obscure the values?

Let’s say I have a function:

process_data <- function(dat) {
  dat |>
    dplyr::filter(.data$year == 2025) |>
    # Row-wise mean of the two value columns
    dplyr::mutate(mean = (.data$value_x + .data$value_y) / 2)
}

and then I want to do:

test_that("process_data works as expected", {
  actual <- process_data(my_secret_df)
  expect_identical(nrow(actual), 1500L)
  expect_identical(ncol(actual), 8L)
  expect_true(!anyNA(actual$mean))
})

I don’t understand how to replace that with a mocked value.

I think I have to do something like: ?

test_that("process_data works as expected", {
  # having created a snapshot (?) called my_fake_data that has the correct
  # dimensions and characteristics that the "real" actual would have???
  with_mocked_bindings(process_data(my_secret_df), actual = my_fake_data)
  expect_identical(nrow(actual), 1500L)
  expect_identical(ncol(actual), 8L)
  expect_true(!anyNA(actual$mean))
})

Creating my_fake_data sounds like a right pain, and also a massive data protection risk if I don’t adequately obscure the values in my_secret_df.

I must be missing something.
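One possible answer, sketched under assumptions: because process_data() takes its data as an argument, there may be nothing to mock at all. A tiny hand-written data frame with the same column names can stand in for the secret one. Here `fake_df` and its values are invented for illustration, and the `mean` column is assumed to be meant as the row-wise mean of `value_x` and `value_y`.

```r
library(testthat)

process_data <- function(dat) {
  dat |>
    dplyr::filter(.data$year == 2025) |>
    dplyr::mutate(mean = (.data$value_x + .data$value_y) / 2)
}

# Three made-up rows: no real values, only the right column names and types.
fake_df <- data.frame(
  year    = c(2024, 2025, 2025),
  value_x = c(1, 2, 3),
  value_y = c(3, 4, 5)
)

actual <- process_data(fake_df)

test_that("process_data keeps 2025 rows and adds a row-wise mean", {
  expect_identical(nrow(actual), 2L)   # the 2024 row is filtered out
  expect_identical(ncol(actual), 4L)   # three input columns plus `mean`
  expect_equal(actual$mean, c(3, 4))
  expect_false(anyNA(actual$mean))
})
```

Because the input is small enough to work out by hand, the expected values come from arithmetic rather than from a copy of the real output.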

The other thing that is on my mind is: these aren’t real tests. I’m just stipulating what the mocked value should be. I’m creating my own output and then saying I’ve tested the function. But I haven’t really. So what’s the point of having these tests?

Sometimes an API might be inaccessible, or we might not be able to properly test our data pipeline, and that’s just the way the world is. So it feels like people are using mocks just to pretend they are testing things in order to try to get to 100% test coverage, but actually it is all a facade.

I must be missing something.
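A hedged sketch of why such a test might still count as a real test: the stand-in only replaces the part that is outside the code's control, and the assertions check what the code does with whatever comes back, including when the source is down. `safe_total()`, `ok_fetch()` and `down_fetch()` are hypothetical names, and passing the fetcher as an argument is again a simplification so the example runs without a package.

```r
# Hypothetical function: sums the values in an API response,
# and returns NA if the API cannot be reached.
safe_total <- function(fetch) {
  response <- tryCatch(fetch(), error = function(e) NULL)
  if (is.null(response)) {
    return(NA_real_)
  }
  sum(response$values)
}

# Stand-in 1: a canned successful response.
ok_fetch <- function() list(values = c(1, 2, 3))

# Stand-in 2: simulates the API being unreachable.
down_fetch <- function() stop("503 Service Unavailable")

# Both branches of safe_total() are exercised without touching a network.
stopifnot(identical(safe_total(ok_fetch), 6))
stopifnot(is.na(safe_total(down_fetch)))
```

The failure case is hard to trigger on demand against a live service, so a stand-in may be the only practical way to check it.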

Here’s what I’ve learnt since I started recording things in this document

Conclusions

I expect that at some future time I will be able to look back at this blog post with that funny feeling of wincing and puzzlement - “How could I ever have not grasped this very simple concept?”

Such is the path from unknowing to knowing, from being all at sea to grasping how to use a set of tools.