Lessons from my marathon Rust debugging session
A methodical approach to squashing bugs for good
With the release of Bevy 0.8, I decided that it was finally time to release the next version of leafwing-input-manager. It was going to be great! Support for mouse wheel, gamepad axes, virtual dpads: just crammed with features that users have been asking for since the very beginning.
The examples were behaving as expected, the migration to Bevy 0.8 was painless, I'd squashed a gnarly press duration bug and it felt like it was ready to ship. But, being a good maintainer, I thought: you know what, let's ensure that this all actually works, and write some tests.
And that's where my trouble began...
CleanCut (another community member / user!) and I reviewed and refined the PR; we tested the examples, and everything seemed to be in good working order. There weren't any automated tests, but it was working for end users, and maybe that was good enough?
The PR was massive already, and my opinion was "you know what, I can fix up any other problems later".
leafwing-input-manager is still a small project (well, relative to bevy),
and it's nice to just work through issues and clean up the code without the back-and-forth.
But, of course, while the feature "worked", the supporting input-mocking infrastructure didn't. I want to bring a culture of sophisticated automated testing to the games industry; I should have made sure that the mocking was working as expected before merging.
The examples worked, the input mocking code seemed straightforward: I was confident that adding tests would be a breeze. Of course, it wasn't. In this process, I introduced several additional small input-mechanism-specific bugs. Again: manual verification is no substitute for automated testing.
I knew that the problem had to be on the input-mocking side: manual verification showed the features themselves were working just fine. I couldn't ship like this, but I was overwhelmed and frustrated by the bug. I knew I shouldn't have been so yee-haw about the lack of tests earlier!
I spent a few days ignoring the project, embarrassed and annoyed. But, swallowing my pride, I decided I should ask for help, and asked Brian Merchant to pair program with me on the bug as a junior developer.
Having someone to talk to helped soothe my nerves and keep me motivated. And like always, being forced to explain the code base made it clear which parts needed love.
I still didn't have a great sense of what was wrong, so I decided to start cleaning up the related code. We might stumble across the bug as we worked, but if nothing else, we'd at least have something to show for our time spent debugging.
There's nothing more demoralizing (or wasteful) than an afternoon spent staring at the code with no progress at all. So, I tackled some tech debt: handling gamepad detection more gracefully, abstracting over behavior with the MockInputs trait, removing an overengineered robustness strategy, and swapping away from a lazy tuple type.
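That trait abstraction can be sketched in miniature. Everything below is an illustrative stand-in, not the crate's actual API: the real `MockInputs` work is implemented for Bevy's `App` and `World`, and the input types are far richer.

```rust
// Illustrative sketch of abstracting input mocking behind a trait, so tests
// can send inputs without caring what concrete "app" they target.
// All names here are simplified stand-ins for the real crate's types.

#[derive(Debug, PartialEq)]
enum UserInput {
    Key(char),
    GamepadButton(u8),
}

trait MockInputs {
    fn send_input(&mut self, input: UserInput);
    fn pressed(&self, input: &UserInput) -> bool;
}

// A minimal fake "app" implementing the trait for demonstration.
#[derive(Default)]
struct FakeApp {
    sent: Vec<UserInput>,
}

impl MockInputs for FakeApp {
    fn send_input(&mut self, input: UserInput) {
        self.sent.push(input);
    }
    fn pressed(&self, input: &UserInput) -> bool {
        self.sent.contains(input)
    }
}

fn main() {
    let mut app = FakeApp::default();
    app.send_input(UserInput::Key('a'));
    assert!(app.pressed(&UserInput::Key('a')));
    assert!(!app.pressed(&UserInput::GamepadButton(0)));
}
```

The payoff of the trait is that test helpers can be written once against `MockInputs` and reused for any implementor.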
This helped! The code was easier to work with, and I had a refreshed mental model of what was going on.
Rather than ending the day in empty-handed frustration, I had several nice fixes to show for my time. And we were definitely getting closer to solving the bug(s).
Tests were still failing though. So let's think through this. The basic data model here is:
1. The test tells the app to send a `UserInput`.
2. This is decomposed into its raw inputs.
3. These raw inputs are sent as events.
4. The events are processed by Bevy's `InputPlugin`.
5. `InputManagerPlugin` reads the processed `Input` resources.
6. This is converted to actions via an `InputMap`.
7. These actions are checked in the test again via the `ActionState`.
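As a toy model, the pipeline looks something like this. Every name below is a simplified stand-in; the real flow runs through Bevy's ECS and leafwing-input-manager's actual types, not free functions.

```rust
// Toy end-to-end model of the test pipeline described above.

#[derive(Clone, Copy, Debug, PartialEq)]
enum RawInput {
    KeyA,
    GamepadSouth,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    Jump,
}

// Steps 1-2: the input the test sends, decomposed into raw inputs.
fn decompose(input: &str) -> Vec<RawInput> {
    match input {
        "jump" => vec![RawInput::GamepadSouth],
        _ => vec![],
    }
}

// Steps 3-4: the raw input events are processed into pressed state
// (the real InputPlugin updates Input<T> resources here).
fn process_events(events: &[RawInput]) -> Vec<RawInput> {
    events.to_vec()
}

// Steps 5-6: pressed inputs are mapped to actions.
fn map_to_actions(pressed: &[RawInput]) -> Vec<Action> {
    pressed
        .iter()
        .filter_map(|raw| match raw {
            RawInput::GamepadSouth => Some(Action::Jump),
            RawInput::KeyA => None,
        })
        .collect()
}

fn main() {
    // Step 7: the test asserts on the resulting actions.
    let actions = map_to_actions(&process_events(&decompose("jump")));
    assert_eq!(actions, vec![Action::Jump]);
    assert!(map_to_actions(&[RawInput::KeyA]).is_empty());
}
```

The point of writing the model out is that each arrow between stages is a place the bug could hide, and each stage can be tested in isolation.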
The failure was occurring at step 7, because that's where the assertions are, but I a) had a robust test suite for the core `ActionState` logic, and b) had proof via manual verification that things were kinda working. The bug was almost certainly upstream of the assertions.
Step 1 was trivial to verify by following the code path, so our problem was somewhere in steps 2 or 3. Well, let's start upstream: garbage in, garbage out, at the very least.
Oh. Oh. We're not actually inserting a critical resource. Well, that will make it very hard to pass that test.
When something strange is happening, be sure to check that the problem isn't in the test itself. Writing related tests, adding debugging tools, and good old-fashioned manual inspection can go a long way.
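A miniature reconstruction of that failure mode, with a hand-rolled resource map standing in for Bevy's `World` (the resource type here is hypothetical; the real missing resource was part of the input-mocking setup):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Hand-rolled resource map standing in for Bevy's World.
#[derive(Default)]
struct World {
    resources: HashMap<TypeId, Box<dyn Any>>,
}

impl World {
    fn insert_resource<T: 'static>(&mut self, value: T) {
        self.resources.insert(TypeId::of::<T>(), Box::new(value));
    }
    fn get_resource<T: 'static>(&self) -> Option<&T> {
        self.resources
            .get(&TypeId::of::<T>())
            .and_then(|boxed| boxed.downcast_ref::<T>())
    }
}

// Hypothetical resource the test's assertions quietly depended on.
struct GamepadRegistry {
    connected: Vec<u32>,
}

fn main() {
    let mut world = World::default();

    // The bug: test setup never inserted the resource, so everything
    // downstream silently behaved as if no gamepads existed.
    assert!(world.get_resource::<GamepadRegistry>().is_none());

    // The fix: insert it during setup, as the real plugin would.
    world.insert_resource(GamepadRegistry { connected: vec![0] });
    assert_eq!(
        world.get_resource::<GamepadRegistry>().unwrap().connected,
        vec![0]
    );
}
```

A cheap guard against this class of bug is an assertion in test setup that every resource the test relies on actually exists.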
Of course, that wasn't the last problem...
Long tests suck. While they can reduce the amount of setup boilerplate you need to write and maintain, they're much less useful for actually debugging. Tests have two purposes: they should alert you to problems, and they should let you quickly isolate causes. Lumping together long strings of logic makes the latter much harder (especially because Rust stops the test on the first failing assert).
Splitting my tests made it much easier to identify the critical failure.
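The pattern, in miniature. Plain functions stand in for `#[test]` cases so the example is self-contained, and the stage checks are dummies; the point is the structure, not the content:

```rust
// Sketch: splitting one long test into per-stage checks. In a real suite,
// each `check_*` function would be its own #[test], so each stage passes or
// fails independently and the failure names the broken stage.

fn mock_events_sent() -> bool { true }
fn input_resource_updated() -> bool { true }
fn action_triggered() -> bool { true }

// The long version: the first failing assert aborts the whole test, telling
// you nothing about the later stages.
fn long_test() {
    assert!(mock_events_sent());
    assert!(input_resource_updated());
    assert!(action_triggered());
}

// The split version: each check is independent and names its stage.
fn check_events_sent() {
    assert!(mock_events_sent(), "raw input events were never sent");
}
fn check_resource_updated() {
    assert!(input_resource_updated(), "Input resource was not updated");
}
fn check_action_triggered() {
    assert!(action_triggered(), "action was never triggered");
}

fn main() {
    long_test();
    check_events_sent();
    check_resource_updated();
    check_action_triggered();
}
```

Shared setup boilerplate can go in a helper function, so splitting tests doesn't have to mean duplicating it.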
Something was going seriously wrong when sending input events. Let's write some tests to verify that these are actually getting sent.
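The shape of that kind of test, with a hand-rolled queue standing in for Bevy's `Events<T>` and a made-up event type:

```rust
// Hand-rolled stand-in for Bevy's Events<T>, to show the shape of a
// "did the mock actually send anything?" test.
struct Events<T> {
    queue: Vec<T>,
}

impl<T> Default for Events<T> {
    fn default() -> Self {
        Self { queue: Vec::new() }
    }
}

impl<T> Events<T> {
    fn send(&mut self, event: T) {
        self.queue.push(event);
    }
    // Take everything sent so far, leaving the queue empty.
    fn drain(&mut self) -> Vec<T> {
        std::mem::take(&mut self.queue)
    }
}

// Hypothetical event type; the real tests checked mouse wheel / motion events.
#[derive(Debug, PartialEq)]
struct MouseWheelMoved {
    delta: f32,
}

fn main() {
    let mut events = Events::<MouseWheelMoved>::default();

    // The mocking layer is supposed to emit an event here...
    events.send(MouseWheelMoved { delta: 1.0 });

    // ...and the test verifies it actually arrived *before* asserting
    // anything about downstream action state.
    assert_eq!(events.drain(), vec![MouseWheelMoved { delta: 1.0 }]);
    assert!(events.drain().is_empty());
}
```

Checking the event queue directly pins the failure to the sending side, rather than letting it surface as a confusing assertion failure several stages later.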
Let's fix that, and... oh hey, more of our tests are passing! Mouse wheel and motion tests are all good, but the equivalent gamepad tests are broken??
Tests are green, and it's time to ship! I am so, so happy that that was the last bug.
So, that was "fun". I can't say I'm pleased with letting that slip in, or with the amount of time and frustration it involved, but it could have been much, much worse.
Things I did wrong:
- accepting a feature without automated tests
- thinking that manual testing was a substitute
- building on a feature I was suspicious of
Things I did right:
- having robust docs and a test suite to begin with
- asking for help
- working on targeted incremental improvements
- assuming that there was only one bug ;)
As vindicating as it is to see my "take the time to do things right" mentality pay off, it sure is ironic when I'm the one who needs to learn that lesson.
Thanks for reading: hopefully it was educational, thought-provoking, and/or fun. If you'd like to read more like this in the future, consider signing up for our email list or subscribing to our RSS feed.