Code Coverage in Godot? How?

colelli · July 14, 2025, 10:50pm

Godot Version

v4.4.1

Question

Hi, as part of my ongoing development I am trying to keep my code as production-ready as possible. A way to do so - I would argue - is to keep testing and ensure that everything works as intended. Not necessarily in a strictly TDD way, but rather: add a feature → test it → ensure it works fine → keep working on other stuff → rinse and repeat.

So as part of this mission I am using the GUT Unit Testing addon for Godot, which I find quite great to be honest. It is easy to pick up and use, and it has quite extensive documentation with all you need to know - or almost.

One thing I am deeply missing is code coverage. I want to make sure I am covering all important bits of my code, key conditions, and so on. Also - just as personal satisfaction - I want to achieve a minimum coverage %.

By searching online I was able to find the following Code Coverage addon on GitHub by jamie-pate.

Did any of you use this solution, or any other for that matter? Is this reliable or should I look into implementing my own extension to achieve my goal?

tldr:
do you use code coverage in your project? If so, which addon/extension - perhaps the one I linked - or should I consider implementing my own extension?

dragonforge-dev · July 15, 2025, 1:11am

I love code coverage in larger projects. I will note that it’s been 4 months since it has had any updates, so it seems relatively up to date. One of the commits mentions updating to 4.4 and UIDs, but the README never got updated because it says it supports 3.5.

If it were me, and I really wanted code coverage on a project, I’d use this one and then if it’s not being maintained, fork it and update it. Because honestly unless it’s a hot mess, you’re better off extending something that exists than spending a lot of time writing a tool from scratch.

Having said that, I personally think a code coverage tool is overkill on a single person game project. I love Unit testing professionally, but I honestly don’t use it on all my Godot projects unless they are going to be maintained a long time or are plugins. (But, so far I’ve mainly done game jams, and small professional projects that don’t pay you for good development practices.)

Also, for the record, I prefer GDUnit4. I think it’s more powerful.

Not to hijack, but I’m also curious if anyone knows of a GDScript linter? @gertkeno @hexgrid @mrdicerack @wchc @pennyloafers

pennyloafers · July 15, 2025, 11:21am

I agree with @dragonforge-dev, it is overkill. I will say I started my first major project doing a bunch of testing, and I felt I was spending a lot of time on every trivial game component. It became exhausting. So I decided to pull back, and only used testing for hotter components like the player and for solving bugs. Solving bugs using the test framework was a dream, especially in multi-component systems (an integration test?).

I will end this by saying you will learn a lot about how things work in Godot using a test framework. Knowing that you tested every bit of code could be helpful; knowing you tested every condition combination will maybe become less helpful and more tedious.

colelli · July 15, 2025, 11:30am

It really is overkill in 99% of the cases, but as part of my commitment towards this long-term project I want to get done, I also want to stick to a production-ready pipeline as much as possible. I was also thinking about implementing stages and jobs on GitLab, just for the sake of learning that tool (I use it at work, but mostly as a user rather than a pipeline engineer, so I want to learn).

I also think it is quite a good way for me to learn and get better at game-dev. I already managed to find a few minor bugs in some of my scripts thanks to unit testing. So let’s say I integrated testing just for the sake of practice and motivating myself to continue this project - for me testing is also a way to detach from implementing new mechanics, and rather focus on refining existing ones. It’s much like a breath of fresh air.

Ultimately that is not my goal, I don’t want to cover EVERY condition combination. But not having a code coverage tool makes it difficult to figure out which parts of your code are not covered - or not covered enough. For instance, I’d like something like jacoco’s report which tells you the coverage % per class, and then you can choose to cover more or not depending on your needs. As you said, player-based logic is more important to cover than ui-button callbacks.
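To make the "coverage % per class" idea concrete, here is a minimal sketch in Python rather than GDScript. It assumes some coverage tool has already produced, for each script, a count of executable lines and a count of lines hit by tests. The `FileCoverage` type, the script paths, and the numbers are all hypothetical and not taken from any addon mentioned in this thread.

```python
from dataclasses import dataclass


@dataclass
class FileCoverage:
    """Line counts for one script (hypothetical shape, not a real addon's format)."""
    path: str
    executable_lines: int
    covered_lines: int

    @property
    def percent(self) -> float:
        # A script with nothing executable is treated as fully covered.
        if self.executable_lines == 0:
            return 100.0
        return 100.0 * self.covered_lines / self.executable_lines


def least_covered_first(files):
    """Sort so the scripts that most need attention come first."""
    return sorted(files, key=lambda f: f.percent)


def format_report(files):
    """Render one line per script: percentage, covered/total, path."""
    rows = []
    for f in least_covered_first(files):
        rows.append(
            f"{f.percent:6.1f}%  {f.covered_lines}/{f.executable_lines}  {f.path}"
        )
    return "\n".join(rows)


# Made-up numbers, purely for illustration.
SAMPLE = [
    FileCoverage("res://player/player.gd", 120, 90),
    FileCoverage("res://ui/menu_button.gd", 20, 2),
    FileCoverage("res://world/spawner.gd", 50, 50),
]

if __name__ == "__main__":
    print(format_report(SAMPLE))
```

Sorting by percentage puts the least covered scripts at the top, which is the part that helps decide where more tests are worth writing and where they are not.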

pennyloafers · July 15, 2025, 11:50am

That’s a good point: use code coverage as an analysis tool to see which parts may need testing. I’m usually at the mercy of a coverage tool to write more test cases to meet coverage goals, which are usually 100% coverage.

hexgrid · July 15, 2025, 3:31pm

The thing you have to remember about unit tests is, they’re also code, which means they can also have bugs and need to be maintained. There’s this pernicious idea (particularly among less technical management) that unit tests themselves are somehow exempt from bugs and that if we could just hit true 100% coverage we could fire the QA department.

This is… not how it works. But that green 100% is there, a siren call to the product manager, a mythical utopia where the code is completely bug free and perfect. Instead, it means they’re spending engineering time proving that 1+1 continues to equal 2, rather than (say) actually fixing bugs or building the product.

Unit tests also tend to ossify a codebase; because the tests are there to ensure certain behavior, if you need to change that behavior you also need to change the tests, potentially recursively. I have (multiple times) seen a simple one line bug fix on a large project delay shipping by days or weeks as the implications of that change spooled out through bad assumptions in the testing suite. The actual bug fix is usually lost in the noise of test changes in the PR.

Personally, I only put tests in place for things that have a high likelihood of subtle breakage. If I’m (say) making a data structure like a critbit trie that’s going to see a lot of use, I’ll wire that up with integrity checking, load/abuse tests, maybe set up some fuzzing… If I’m making a simple component that will have obvious failures at runtime I don’t bother larding it down with tests.

As for @dragonforge-dev 's question, I don’t know of a gdscript linter.

dragonforge-dev · July 15, 2025, 3:58pm

I agree with everything @pennyloafers and @hexgrid said.

100% code coverage is a myth. I tell companies when they hire me that expecting higher than 80% code coverage or automation of testing is an unrealistic pipe dream. And I tell them not to hire me if that’s a metric they have set in stone because I won’t do it, and I won’t make the teams I manage do it.

Another problem with too many unit cases is not keeping up with them. I’ve come into projects where the entire test suite was left alone so long that it’s turned off in the pipeline, and 80% of the tests written are no good. And so the “solution” is to get junior engineers (dev or test) to fix it. Or to have outside contractors from another country add a bunch of tests. The problem with both of those approaches is it’s the developers who write the code that need to write the tests, because they actually understood what the code was supposed to be doing at the time.

Hey, if it’s for fun and motivates you, go for it. When I started really developing games in Godot, I was all in on unit testing. I even wrote a whole tutorial on how to use GDUnit. As far as using GitLab, I will say I think there are more pipeline tools for GitHub, i.e. GitHub Actions that people have built. I’ve used Jenkins, Bamboo, Azure, and GitHub Actions. I’ve gotten demos from the sales engineers at GitLab trying to get me to switch my company over. I can honestly say that learning any of them will help you with the others.

I will also say stay away from Azure DevOps. It is an f-ing nightmare if you use any language that isn’t in the .NET suite. They do everything they can to lock you into their ecosystem and it’s horrible. I once spent a month with our DevOps engineers at a huge company just trying to get something to work in Azure DevOps that we already had running in Jenkins in the company because “reasons”. (Those reasons were dumb, especially since Jenkins is free.)

Tirade aside, go for it and get everything running in a pipeline and it’ll be a dream. In professional projects I control, I insist on 80% code coverage on all new code, and I make it impossible for anyone to check code in without that coverage. Once in a while we do get cascading unit test dependencies, but it’s usually over a refactor and I think it’s worth the time.
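For anyone wondering what such a gate boils down to, here is a minimal Python sketch, not the setup from any real pipeline. It assumes the pipeline can already supply two numbers for the change being checked in: how many new executable lines there are and how many of them the tests hit. The function names and the command-line shape are hypothetical.

```python
import sys


def new_code_coverage(covered: int, total: int) -> float:
    """Percentage of new executable lines hit by tests."""
    # A change with no executable lines cannot fail the gate.
    if total == 0:
        return 100.0
    return 100.0 * covered / total


def gate_exit_code(covered: int, total: int, minimum_percent: float = 80.0) -> int:
    """Return 0 if the change meets the threshold, 1 otherwise."""
    percent = new_code_coverage(covered, total)
    if percent >= minimum_percent:
        print(f"coverage gate passed: {percent:.1f}% >= {minimum_percent:.1f}%")
        return 0
    print(
        f"coverage gate failed: {percent:.1f}% < {minimum_percent:.1f}%",
        file=sys.stderr,
    )
    return 1


if __name__ == "__main__":
    # Hypothetical usage: python coverage_gate.py <covered_lines> <total_lines>
    covered_arg, total_arg = int(sys.argv[1]), int(sys.argv[2])
    sys.exit(gate_exit_code(covered_arg, total_arg))
```

A non-zero exit code is what makes a pipeline job fail, so a script along these lines could run as the last step of a test stage and block the merge when the threshold is missed.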

On a personal Godot project I would keep code coverage to things that are important but not tested regularly. For example, I wouldn’t put unit tests on the player because I use the player all the time. But, to @hexgrid 's point, if you do cover it and something breaks your tests will tell you exactly where it breaks and can save beaucoup debugging time.

Demetrius_Dixon · July 15, 2025, 4:38pm

Are you developing a game or software? There’s a big difference.

Games are played for entertainment.
Software is used as a tool.

Personally, I’m more of a designer than a programmer. So all this “unit testing” and whatnot is complete gibberish.

I’d prioritise making a functioning game first, then making that game fun, before anything else.

For games, the real “unit tests” are your playerbase. They’ll spot bugs very well and complain/make a report. It’s just more efficient to find bugs on the fly than to spend all the time and effort on unit test implementation/maintenance.

If there’s a bug in your game, but it’s so rare that no one spots it, then it won’t really matter.

colelli · July 15, 2025, 4:43pm

Thank you all @dragonforge-dev @pennyloafers and @hexgrid for your inputs. As I said I will take it more as a motivational thing and learning experience than anything else. 100% code coverage is surreal and I don’t even want to have that tbh - who has the discipline to cover every single line.

Companies are on a different scale and made up of different devs. Even though I have set myself up for a big project that will probably take years, I don’t want to be the PO who writes tickets for the test team - aka myself - who then asks the dev team - aka also me - to refine the code ahah. It’s rather an adventure, and I want to practice what I absorb passively developing at work for a big corp.

To address your point @Demetrius_Dixon, yes, you’re absolutely right, but again it’s part of my plan to also learn this habit of testing what I write. It’s my first big solo project and I want to try to apply all the knowledge I’ve gathered so far in one place. See what works. See what doesn’t and grow from there. Also consider that I am planning to carry this project along for some years, so documentation, testing, comments, clean code principles, etc. are all going to play a big role in not making me feel lost/overwhelmed and end up throwing my code - together with my pc - out the window.

I’ll still leave this thread open, just in case future developments bring a fresh revamp on how Godot handles testing out-of-the-box. If I had many more years of experience and deeper knowledge of the topic I’d definitely sit down and develop a PR to integrate some sort of testing tool into Godot, but that will not be the case - at least for the near future.

Demetrius_Dixon · July 15, 2025, 5:02pm

I’m also developing a long-term project and I also very much appreciate your enthusiasm.

…But for the love of your sanity, PLEASE only fix problems that actually currently exist!!!

You WILL rewrite code or reimplement something at some point, because that is the nature of iteration. But you cannot iterate a blank page or a bunch of theories that don’t actually exist.

Just start working, fail, and learn from those failures. Trust me, it isn’t as bad as it sounds.

dragonforge-dev · July 15, 2025, 5:32pm

I get where you’re coming from, but I have a caution. I’m going to tell a story.

A number of years ago I was running the beta testing of a very large game for Electronic Arts. There was a woeful lack of unit tests. There were automated tests. There were over 200 manual testers employed by EA testing the game literally around the clock because they worked all over the world. This was all in place before and during the beta phases.

Every other day during beta I watched bugs flip-flop because developers were fixing bugs and breaking other things that other developers were fixing. The beta was a mess. Instead of getting player feedback about balance and dealing with minor issues, we were playing whack-a-mole against a looming deadline.

That game failed. It launched. People bought it. But ultimately, it never overcame the negative first impression. I’m sure you can think of a number of games that had that experience.

You only get one chance to make a first impression, and that impression starts with your demo, or your game going into alpha on Steam, or beta, or them reading someone else’s review/opinion of your game. Most people’s first experience with your game is not going to be playing your fully completed, bug-free game. Making your player base your testers is very risky.

@Demetrius_Dixon you have an active online presence based on what you’re saying you’re doing with YouTube, etc. so you may be able to manage expectations better than some, but having done that as my actual job I can tell you that replying to people about bugs and managing beta testers’ expectations can quickly become a full-time job. (Case in point, I ended up hiring someone just to manage the boards during the beta and launch of the game.)

Demetrius_Dixon · July 15, 2025, 5:56pm

Warning taken. I’ll see how my codebase holds up as I make the first prototype.

On another note, what was the game that flopped? I’d really like to know, not to make a rebuttal, I’m just really curious. @dragonforge-dev

hexgrid · July 15, 2025, 6:00pm

This is one reason smaller shops or individual devs can be more nimble. There’s fewer things going on in parallel that can destructively interfere with each other.

sixrobin · July 16, 2025, 8:55am

I’d like to endorse what @dragonforge-dev has said about players not being the playtesters, as I absolutely agree and believe it’s really important to keep in mind.

I’ve been working (as a programmer) on a game that has been in early access for almost 2 years, and although the game was eventually a success, the early access development was a bit tricky and the codebase was not that good (easy to say with hindsight, I know), and many of the game updates featured lots of bugs, issues with save files backward compatibility, etc.

A thing that we were told a bit too much for my taste was “hey, that’s an early access, it’s normal to have bugs, players know it”. However, while minor bugs may be acceptable for an early access game (that’s debatable already), there are two things to note:

1/ Players do not buy games to be their playtesters. In the case of an early access game, they may buy it to be active in a community and give some feedback about the game, but that would mostly be to improve the game’s balancing, difficulty curve, etc. Basically, the game design. Not the bugs. Of course, if a few bugs are spotted on the fly, players will tend to be more understanding for an early access game and share them so that they can be fixed, but you cannot just say it’s “normal”.
2/ Working on a game that’s not stable, receiving complaints from players when their save file gets corrupted, or when the game crashes, etc. is hard for the development team itself. Having a technical debt that’s growing is just the worst thing, and, well, unit testing and that kind of stuff exist to reduce the technical debt of a piece of software. What I’m trying to say is, bugs are just boring for both the players and the devs, so I don’t see why one would not implement solutions to improve the game’s stability.

Last thing, @Demetrius_Dixon said that “Games are played for entertainment. Software is used as a tool.” I’ve sadly heard that statement too many times as an excuse for game projects of low technical quality (not pinpointing you specifically, Demetrius, as I don’t even know your codebase, but I hope you get my point). I know that games are more applications than software, but they’re developed using specialized tools and programming languages, which makes them closer to software than we tend to think, imho.
I believe that a prototype should focus on fun exclusively, and then, when making it an actual game, care about the technical quality of the project.

As always, all of these things depend on the game, the team, the time available, etc. but that post is already very long so I’ll stop there. Hope that read was worth your time!

colelli · July 16, 2025, 9:32pm

I had some time to give it a quick try. The only thing I prefer about GDUnit4 so far is the interface and ease of use, but overall feels of higher quality. However, I am having a hard time mapping most of the features I use in GUT. I believe GDUnit4 does not provide such dynamicity tbh - or maybe I lack the know-how.

For instance, in GUT I use something called GutInputSender to simulate key presses and user input in general. This is not as straightforward in GDUnit4. You can simulate input but from what I saw you must have a dedicated test-scene to which you can then attach a Runner and execute some simulated input. Although it’s probably more flexible for integration testing, I believe this makes unit testing a bit overkill if I just want to check if the Input is processed correctly in the script I call it in.

Mocks are available for GDUnit4, but I don’t see the same level of depth I found in GUT. In GUT you can define exactly what a stub does, in GDUnit4 you cannot, or rather, I can’t find a way to simply skip the method (GUT offers a stub(<Callable>).to_do_nothing() function). Or perhaps there’s a better way to stub/mock autoloads?

Spying also seems to be harder on GDUnit4. From my testing I can’t find a way to Spy on a Mock, you must create an actual instance to spy on. In GUT it’s exactly the opposite, you must have a Double to spy on it.
Some other features seem less error-prone in GUT as well, for example when asserting for signal emission you can use both the name of the signal (as a String), or the Callable. Might be a personal design choice, but Callables sound like a way better way to protect myself from typos.

So my question at this point I guess is: is it just my experience, or does GDUnit4 actually lack these features? And if so, you mentioned it being more “powerful” - in what way? Because as of right now it just feels less “polished” - interface aside.

I also thought GDUnit4 was better/easier to integrate in the GitHub Action as they offer an official runner, however I’ve also found this Run Gut Tests for Godot4 Action - I haven’t tested it yet though. You also mentioned a Godot Linter, perhaps this is what you are looking for? → Godot GDScript Toolkit

dragonforge-dev · July 17, 2025, 3:15pm

I haven’t looked at this functionality, but two things to keep in mind:

GDScript allows you to create user input already. You don’t need to use something else to do it.
Testing user input is not unit testing. It is potentially integration testing, but really at this point, even if you are using a unit testing framework to do it, you are straight-up doing automated regression testing. There’s nothing wrong with that, but it is not technically unit testing.

colelli:

Mocks are available for GDUnit4, but I don’t see the same level of depth I found in GUT. In GUT you can define exactly what a stub does, in GDUnit4 you cannot, or rather, I can’t find a way to simply skip the method (GUT offers a stub(<Callable>).to_do_nothing() function). Or perhaps there’s a better way to stub/mock autoloads?

Spying also seems to be harder on GDUnit4. From my testing I can’t find a way to Spy on a Mock, you must create an actual instance to spy on. In GUT it’s exactly the opposite, you must have a Double to spy on it.
Some other features seem less error-prone in GUT as well, for example when asserting for signal emission you can use both the name of the signal (as a String), or the Callable. Might be a personal design choice, but Callables sound like a way better way to protect myself from typos.

So my question at this point I guess is: is it just my experience, or does GDUnit4 actually lack these features? And if so, you mentioned it being more “powerful” - in what way? Because as of right now it just feels less “polished” - interface aside.

I do not believe any of this functionality was in GUT when I looked into it a year ago. But it also had shit documentation at the time, so perhaps both the features and the documentation have gotten better. So to answer your question, my opinion seems to be out of date. I will have to revisit GUT the next time I’m looking at unit testing in GDScript.

I would be interested to see what you find.

It is! Thank you.