Testing Bash scripts with Bats

Note: this post was migrated from my old Tumblr-backed blog

In recent years, I’ve become completely dependent on automating the tests for the code I write. Once a project hits a certain size and complexity, the tedium of constantly launching the app, going through the features and checking the output wastes time and saps motivation, so when I found Bats (the Bash Automated Testing System) and was finally able to write some real tests for mcwrapper, I was ecstatic.

I’d considered rewriting mcwrapper in another language on multiple occasions, mainly so that I could write tests and push releases without worrying that I’d broken some key feature, which had happened several times. Sometimes, as in the case of mcwrapper, Bash is the best tool for the job, and the lack of a testing framework is no longer a valid excuse to choose another language.

Bats, itself written in Bash, lets you write a series of tests against not only your shell scripts, but any commandline program you write. If the target isn’t written in Bash, you’re limited to testing its commandline interface, which isn’t a total loss, but the real power comes when testing Bash shell scripts.

A Quick Rundown

A basic Bats test looks like the following:

@test "test for equality" {
  A=1
  B=1
  [ "$A" -eq "$B" ]
}

The above file should be saved as equality.bats and run with the following command:

bats equality.bats
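
Running that should print something along these lines (the exact formatting varies a little between Bats versions):

 ✓ test for equality

1 test, 0 failures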

Looking at the syntax of the example, the test block is prefixed with @test and the name of the test. The contents of the block are executed, and if any line fails, the entire test fails, with output identifying the failing line and what it produced. Because the tests are written in Bash, you can make assertions inside [ ] or with the test command, which leads to pretty readable tests, for the most part.

For testing expected failure, Bats ships with the run function, which not only always returns true (so the test doesn’t abort on that line), but also sets the global variables $status and $output and an array $lines, holding the exit code, the full text output and the output split into lines, for you to run your assertions against. An example follows:

# this function prints some text and returns a non-zero status
failing_function() {
  echo "not good"
  return 1
}

@test "test for failure" {
  run failing_function

  [ "$status" -ne 0 ]
  [ "$output" == "not good" ]
}

It also has the ability to include helpers, which are just Bash scripts filled with functions that you can use in your tests, and it supports setup() and teardown() functions that you can define in individual test files; these are executed before and after each test, respectively. setup() is handy for ensuring a consistent operating environment for each test, such as creating a directory, cd’ing into it and maybe copying some test files, while teardown() can be used for cleaning up after the fact.
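
For example, a setup()/teardown() pair might look something like this (a minimal sketch; the SCRATCH_DIR variable name is my own and not anything Bats requires):

setup() {
  # create a fresh scratch directory for each test and work inside it
  SCRATCH_DIR="$(mktemp -d)"
  cd "$SCRATCH_DIR"
}

teardown() {
  # remove the scratch directory once the test has finished
  cd /
  rm -rf "$SCRATCH_DIR"
}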

I’d give more examples, but the documentation for the project is more than ample and I’d basically just be duplicating that effort.

More Advanced Usage

In writing the test suite for mcwrapper, I ran into a few cases where I had trouble figuring out how to test certain things: for instance, testing internal functions, and verifying that the proper files were created during the backup process.

Testing Internal Bash Functions

In order to test an internal function, you have to source the script containing it. If your Bash script breaks its functions out into separate libraries, testing them becomes easier, as you can source each one individually. But in the case of mcwrapper, all of the functions and the program itself, including all the commandline parsing code, live in a single file.

While poking around at the Bats source code, I discovered how to detect if the script is being sourced and skip some functionality in those cases:

if [ "$BASH_SOURCE" == "$0" ]; then
  # code in here only gets executed if
  # script is run directly on the commandline
fi

From there, my tests get simple:

setup() {
  # source the file, now all mcwrapper functions
  # are available in all my tests
  . "$BATS_TEST_DIRNAME/../libexec/mcwrapper"
}

@test "start_restore sets DOING_RESTORE" {
  [ -z "$DOING_RESTORE" ]
  start_restore
  [ ! -z "$DOING_RESTORE" ]
}

Testing The Result of a Piped Function

When testing the backup functionality of mcwrapper, I needed to run a series of backups and verify that mcwrapper only kept the N most recent ones. This got tricky, since I was calling ls on the directory, grepping the output for backup files, then calling wc -l on it to count them.

I needed to be able to assert that the output of the wc command matched what I expected, so the trick is to use run, but to make the assertion within that stage of the pipeline:

ls "$backup_dir" | grep ^backup | {
  run wc -l
  [ $output -eq 5 ]
}

By moving the wc assertion into a command group using curly braces ({}), I was able to isolate that test and keep everything readable. Parentheses (()) would have a similar effect, though as the last stage of a pipeline the group runs in its own subshell either way. Also, if either the ls or grep commands fail, the count won’t match and the whole test will still fail.
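
Putting it together, a self-contained version of this kind of test might look something like the following (a sketch using made-up backup file names rather than mcwrapper’s real backup output):

@test "backup directory contains the expected number of backups" {
  backup_dir="$(mktemp -d)"

  # create five fake backup files to count
  for i in 1 2 3 4 5; do
    touch "$backup_dir/backup-$i.tar.gz"
  done

  ls "$backup_dir" | grep ^backup | {
    run wc -l
    [ $output -eq 5 ]
  }

  rm -rf "$backup_dir"
}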

Some Shortcomings

Bats is awesome, but there are places where it could be improved. For one, I’d love to be able to customize the output. RSpec has some nice options to print a . for each passing test or an F for each failure, and then dump the details of the failures at the end. When you’ve got more than a handful of tests, Bats’s output can be difficult to visually parse for errors.

I recently ran the tests on a project at work and saw that they broke the output up by file and also printed check marks for passing tests and Xs for failed ones, which also made things pretty easy to visually parse.

The addition of colour would also aid greatly in visual parsing.

One thing that was missing was the ability to print debugging output when building the tests. I have a pull request that I sent in the other day that adds a decho function which prints things to the terminal prefixed with # DEBUG:, but it has yet to be merged.

Lastly, coming from the RSpec world, I’ve gotten used to being able to group my tests into logical units of similar tests. Breaking Bats tests into separate files helps, and each file can share the same setup() and teardown() functions, but having a way to just group them in the file would be cool.

I’ve gone through the source quite a bit and I could probably add some of these features, but because Bats is trying to be TAP compliant, I’m not sure how some of this would affect that. My plan is to organize my thoughts a little better, get a little more experience with using the project and submit some issues.