Their testing system depends on tests printing "OK" after every test. This means that in many cases, tests failing are indicated by the _absence_ of "OK" being printed.
(We've attempted to isolate those parts and write our own stuff testing against upstream in pytest. We once presented a proposal to move them to pytest, offering to do any work and even wrote pytest plugins to seamlessly integrate with their current system. We got a - literal - "Thanks, but no thanks.")
Oof. If they'd instead put "ok" before every test, they might have been accidentally compatible with TAP! https://testanything.org
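For anyone unfamiliar: TAP is just a plan line (`1..N`) followed by one `ok`/`not ok` line per test. A minimal producer sketch (the `run_tap` helper and test names are made up for illustration, not part of any library):

```python
def run_tap(tests):
    """Run (name, fn) pairs and emit Test Anything Protocol output."""
    print(f"1..{len(tests)}")  # the TAP "plan" line
    for i, (name, fn) in enumerate(tests, 1):
        try:
            fn()
            print(f"ok {i} - {name}")
        except Exception as exc:
            print(f"not ok {i} - {name}: {exc}")

def addition_works():
    assert 1 + 1 == 2

def math_is_broken():
    assert 1 + 1 == 3, "expected 3"

run_tap([("addition works", addition_works),
         ("math is broken", math_is_broken)])
# Output:
# 1..2
# ok 1 - addition works
# not ok 2 - math is broken: expected 3
```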
It is nice to not have to depend on the language runtime to do the test.
You could even be a victim of "Embrace, Extend, Extinguish".
My advice is to consider forking it, and poaching contributors, in the interests of common good.
I’d love to investigate this further.
Experience has taught me that the “right” testing framework for a project is whatever the developers are happy and productive with.
If so, that mistake would appear to be common, since it came up right away.
Maybe unit tests need unit tests? (There’s probably a lint rule to catch what I describe above)
Yep - meta-testing (ensuring that every unit test that exists in a project adds unique coverage, remains valid, runs as expected, and I'm sure many other properties) could (and should!) definitely be automated.
Some more advanced meta-testing could involve tracking changes to a project's source history over time (in other words: tests that run with commit history). By that I'm thinking of situations like: "does this test genuinely still test what it used to, after the test and/or application code was modified?"
But yeah, that would be a good thing for a linter to catch. I'm not aware of any that do.
A reviewer should catch this error easily. I kind of think many don't give much attention to unittests when reviewing. Which is bad. Good unittests are far harder to write than good code.
There are much subtler errors of this class (false negatives / tests that always pass).
The fix isn't to blame people for making mistakes. It's to figure out a design that doesn't allow this mistake to happen in the first place.
For example, the method could (today) require the second argument to be a keyword argument. This is also something a good linter should be able to warn on.
edit: rikatee and I wrote essentially the same reply at the same time. :-)
We do code review because we expect human error when code is written by a human, but then we expect no human error when that code is read (reviewed) by a human? Any process that expects zero human error will always fail.
That's where linters add value: they allow devs to do what humans are good at (the creative complex and interesting stuff) while the bots do what bots are good at (the boring repetitive stuff)
Instead popular culture has decided that at best, this is what Schrödinger believed (ha, those crazy scientists), and at worst that somehow the cat being dead and not-dead at the same time is the core idea of quantum physics :/
> assertTrue also accepts a second argument, which is the custom error message to show if the first argument is not truthy. This call signature allows the mistake to be made and the test to pass and therefore possibly fail silently.
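To make that failure mode concrete, here's a minimal sketch: the test below was presumably meant to assert `1 + 1 == 3`, but the comma turns the `3` into the custom message, and since `2` is truthy the test passes:

```python
import unittest

class SilentPass(unittest.TestCase):
    def test_arithmetic(self):
        # Intended: self.assertTrue(1 + 1 == 3)  -- would fail.
        # Actual: the 3 becomes the msg argument; 2 is truthy, so this passes.
        self.assertTrue(1 + 1, 3)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(SilentPass)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True -- the broken test "passes" silently
```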
With modern features in Python you could change the signature to

    assertTrue(expr, *, msg=None)

which would prevent that issue (compare the plain assert statement: `assert expr, "custom message"`). Though given the verbose API, it is OK to require the explicit msg kwarg; duplication in the tests is fine if it makes them more robust.
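A sketch of what a keyword-only `msg` buys you. The `assert_true` wrapper here is hypothetical, not the real unittest method; it just demonstrates how `*` turns the silent pass into a loud error:

```python
def assert_true(expr, *, msg=None):
    """Like assertTrue, but msg must be passed by keyword."""
    if not expr:
        raise AssertionError(msg or f"{expr!r} is not truthy")

assert_true(1 + 1 == 2)                   # fine
assert_true(2 > 1, msg="custom message")  # fine

try:
    assert_true(1 + 1, 3)  # the classic mistake -- now a TypeError
except TypeError as exc:
    print("caught:", exc)
```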
Searching for announced breaking changes to the arguments of functions included with Python...
https://bugs.python.org/issue25628
https://bugs.python.org/issue29193
Bear in mind only 28% of codebases actually use the built-in unittest package that this gotcha affects, so really it's 20 of the 28% of 666, aka 10% ... but that claim would be hard to justify for folks that dig into stats.