Introducing pytest-r-snapshot: Verifying Python code against R outputs at scale

An earthy, minimalist display blending textured art with dried and fresh botanicals.
Photo by Toa Heftiba.

First package in the new year: I’m glad to announce pytest-r-snapshot, a pytest plugin for snapshot testing Python code against reference outputs generated by R.

You can install it from PyPI:

pip install pytest-r-snapshot

If your project uses uv, add it as a development dependency:

uv add --dev pytest-r-snapshot

Why snapshot testing against R?

When you’re writing Python code that needs to produce identical results to existing R code, testing becomes tricky. You could manually generate expected outputs from R and save them as fixtures with your tests, but that’s tedious and error-prone. You could require R and R packages in CI, but that adds infrastructure complexity and slows things down. What you really want is a way to record the ground truth once, or re-record it on a regular basis, then commit it and replay it everywhere.

That’s exactly what pytest-r-snapshot does.

Workflow ergonomics

The pytest plugin is designed for a portable workflow:

Record locally (requires R): run labeled R code chunks embedded in your Python tests and write snapshot files.
Replay everywhere (default; no R required): read committed snapshot files and compare them to Python outputs.

The R code chunks live right next to your test assertions. This makes it easy to see and update what you’re testing against. The snapshots are stored as plain text files that get committed to version control, so CI runs fast without needing R.

The best part? The R code blocks follow exactly the same syntax as labeled R Markdown code chunks, so you can easily copy-paste them from and to R Markdown or Quarto documents for prototyping and exploration.

The story behind it

This plugin grew directly out of our work on rtflite, a Python package for generating RTF documents for clinical study reports. rtflite aims to match the output of the R package r2rtf, which meant we needed a reliable way to verify that our Python implementation produced RTF files that are byte-for-byte identical to the R output after normalization.

We started with an ad hoc approach a year ago: manually running R scripts to generate reference outputs, saving them to fixture files, and writing custom code to compare against them. This setup worked, but the mechanism was fragile. The R code wasn’t version-controlled alongside the tests, updates were manual and easy to forget, and the whole setup was hard to understand for new contributors.

pytest-r-snapshot formalizes this pattern. You embed the R code directly in your Python test file (as comments or in multiline docstrings), the plugin extracts and runs it when recording, and the snapshot files become the source of truth. When someone reads the test, they can see both the R code that produced the expected output and the Python code being tested.

Since v2.5.1, the rtflite project has formally migrated to pytest-r-snapshot, which simplified the test infrastructure considerably.

How it works

The basic workflow is straightforward. In your test file, embed a labeled R code chunk as a comment or inside multiline docstrings:

def test_summary_matches_r(r_snapshot):
    # ```{r, summary}
    # x <- c(1, 2, 3)
    # summary(x)
    # ```

    actual = my_python_summary(...)
    r_snapshot.assert_match_text(actual, name="summary")

When you run pytest --r-snapshot=record, the plugin extracts the R code, runs it, and saves the output to a snapshot file. After that, regular pytest runs compare your Python output against the recorded snapshot, without needing R at all.

The plugin supports three modes:

replay (default): never runs R; fails if the snapshot is missing.
record: always runs R and overwrites snapshots.
auto: runs R only when a snapshot is missing.
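The three modes boil down to a small decision about when R is allowed to run. The sketch below shows that logic under stated assumptions: resolve_snapshot and the run_r callable are hypothetical names, and the plugin's real internals and snapshot layout may differ.

```python
from pathlib import Path
from typing import Callable


def resolve_snapshot(path: Path, mode: str, run_r: Callable[[], str]) -> str:
    """Return the expected text for one snapshot under the given mode.

    `run_r` is a hypothetical callable that executes the R chunk and
    returns its output as text.
    """
    if mode == "replay":
        # Never runs R; a missing snapshot is an error.
        if not path.exists():
            raise FileNotFoundError(f"snapshot missing: {path}")
        return path.read_text(encoding="utf-8")

    if mode == "auto" and path.exists():
        # Snapshot already recorded, so R is not needed.
        return path.read_text(encoding="utf-8")

    if mode not in ("record", "auto"):
        raise ValueError(f"unknown mode: {mode!r}")

    # "record" always lands here; "auto" only when the snapshot is missing.
    expected = run_r()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(expected, encoding="utf-8")
    return expected
```

Passing the R runner in as a callable keeps the mode logic testable on machines without R, which mirrors the idea behind the replay default.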

You can configure the R execution environment (which Rscript to use, working directory, environment variables, timeout) either through command-line options or through pytest configuration in pyproject.toml or conftest.py.
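To show where those four settings would come into play, here is a rough sketch of running a chunk with Python's standard subprocess module. The function run_r_code and its parameter names are assumptions for illustration; they are not the plugin's option names, which are listed in the documentation.

```python
import os
import subprocess
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def run_r_code(
    code: str,
    rscript: str = "Rscript",                 # which Rscript executable to use
    cwd: Optional[str] = None,                # working directory for the R process
    env: Optional[Mapping[str, str]] = None,  # extra environment variables
    timeout: float = 60.0,                    # seconds before the run is aborted
) -> str:
    """Run R code in a subprocess and return its standard output.

    Hypothetical helper; the real plugin may invoke R differently.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "chunk.R"
        script.write_text(code, encoding="utf-8")
        result = subprocess.run(
            [rscript, str(script)],
            cwd=cwd,
            env={**os.environ, **(env or {})},  # extend, not replace, the environment
            capture_output=True,
            text=True,
            timeout=timeout,  # raises subprocess.TimeoutExpired when exceeded
            check=True,       # raises CalledProcessError on a non-zero exit
        )
    return result.stdout
```

Failing loudly on a non-zero exit or a timeout is a deliberate choice in this sketch: a recording step that silently saved partial output would produce a snapshot nobody should trust.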

When to use pytest-r-snapshot

This pytest plugin is particularly useful when you’re: porting R packages to Python and need to verify equivalence; maintaining parallel implementations in both languages; or building Python tools that need to interoperate with R-based systems.

It’s probably overkill if you just need to test a few simple functions. But when you’re systematically porting a substantial R codebase, having automated, version-controlled snapshot tests against the R reference implementation makes the whole process much more tractable in the long run.

Learn more

The documentation site (now built with Zensical!) covers installation, configuration, and usage patterns in detail. If you run into issues or have suggestions, please open an issue on GitHub.