#unit-testing | Python | Page 14

proud nebula Oct 9, 2025, 6:53 PM

#

We slice the stack in a different direction.

sturdy plaza Oct 9, 2025, 9:48 PM

#

Hello, all. I have an interesting problem to which I got a suboptimal solution. I'd like to know if you can give me some ideas on it.

I have a legacy class whose structure is like this:

class SomeClass:
    def process(self):
        try:
            <some code>
            self.create_indicators(some_parameters)
            <some code>
        except Exception as e:
            return False
        return True

    def create_indicators(self, params):
        <some_code>
        my_interesting_var, refs = self.create_my_interesting_var(params)
        SomeOtherClass.send_away_my_interesting_var(my_interesting_var)
        <some_code>

So, as you can imagine, I want to inspect my_interesting_var when testing with unittest.TestCase. The only way I could do it was changing SomeClass code to this:

class TestException(Exception):
    def __init__(self, message, obj):
        super().__init__(message)
        self.obj = obj

class SomeClass:
    def process(self):
        try:
            <some code>
            self.create_indicators(some_parameters)
            <some code>
        except TestException as e:
            raise TestException("Error processing my_interesting_var", e.obj)
        except Exception as e:
            return False
        return True

#

And then I created the test like this:

    @patch("mymodule.SomeOtherClass.send_away_my_interesting_var")
    def test_create_report_create_web3_indicators(self, mock_send):
        # The patch replaces send_away_my_interesting_var itself, so the side
        # effect is set on the mock directly, not on its return_value.
        def raise_with_var(my_interesting_var, source_name=None):
            raise TestException("my_interesting_var", my_interesting_var)

        mock_send.side_effect = raise_with_var
        my_interesting_var = None

        template = mymodule.SomeClass(some_inits)
        try:
            template.process()
        except TestException as e:
            my_interesting_var = e.obj

        self.assertEqual(my_interesting_var, something)

The problem is, I needed to change SomeClass code so I could test it.
Is there a way to test this variable without changing the legacy code?

proud nebula Oct 10, 2025, 6:11 AM

#

sturdy plaza Hello, all. I have an interesting problem to which I got a suboptimal solution. ...

Isn't create_my_interesting_var() pure? If it was there would be no need to fiddle with create_indicators at all
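
A minimal sketch of that idea, assuming create_my_interesting_var() really has no side effects and returns a (value, refs) pair as in the snippet above. The class body, the parameters and the expected values are hypothetical stand-ins, not the real legacy code.

```python
import unittest


class SomeClass:
    """Hypothetical stand-in for the legacy class; only the pure helper matters here."""

    def create_my_interesting_var(self, params):
        # Assumed to be pure: the result depends only on params.
        return {"total": sum(params)}, ["ref-1"]


class CreateInterestingVarTest(unittest.TestCase):
    def test_value_is_built_from_params(self):
        # Call the helper directly; process() and create_indicators() are not involved.
        value, refs = SomeClass().create_my_interesting_var([1, 2, 3])
        self.assertEqual(value, {"total": 6})
        self.assertEqual(refs, ["ref-1"])
```

If the helper turns out to touch external state, this stops being a plain unit test and some patching would still be needed.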

molten hollow Oct 10, 2025, 10:18 AM

#

sturdy plaza Hello, all. I have an interesting problem to which I got a suboptimal solution. ...

> Hello, all. I have an interesting problem to which I got a suboptimal solution. I'd like to know if you can give me some ideas on it.
My suggestion would be: don't try to retrofit tests to already existing code. You won't improve the design of your code that way. It's also poor as a regression test, because the test won't cover what's supposed to happen, only what does happen.

So if you want to cover that piece of your feature with a test, start with a test, test-drive a new class, and then replace the old class with the new class.

pearl cliff Oct 10, 2025, 11:23 AM

#

molten hollow > Hello, all. I have an interesting problem to which I got a suboptimal solution...

I strongly disagree

#

First of all what is "supposed" to happen is very very often "whatever already happens"

#

Second, if you don't test for existing behavior parity first, debugging can become a nightmare

molten hollow Oct 10, 2025, 11:28 AM

#

> First of all what is "supposed" to happen is very very often "whatever already happens"
I see the confusion. When I said "supposed to happen", I meant the expected behaviour or outcome, and when I said "whatever already happens" I meant the implementation details. A good test should specify the expected behaviour, leaving the implementation free to vary. If you start with the code and try to retrofit tests to it, you don't get tests that check the expected behaviour; you get tests that check implementation details.

#

> Second, if you don't test for existing behavior parity first, debugging can become a nightmare
Sure, but if you start with tests, very little debugging is needed, because projects like that tend to have very few bugs.

proud nebula Oct 10, 2025, 11:47 AM

#

molten hollow > First of all what is "supposed" to happen is very very often "whatever already...

Well that's clearly wrong. It's called "black box testing" or even "snapshot testing" and it's a very reasonable behavior and might or might not test implementation details.
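
For readers who haven't seen one, here is a rough sketch of a snapshot-style (characterization) test. It assumes a hypothetical legacy function, legacy_format_price, and a table of outputs recorded by running the existing code once; neither comes from the code under discussion.

```python
import unittest


def legacy_format_price(cents):
    """Hypothetical legacy function whose current behaviour we want to pin down."""
    return "$" + str(cents // 100) + "." + str(cents % 100).zfill(2)


class FormatPriceCharacterizationTest(unittest.TestCase):
    # Recorded from the existing code. These values document what it does
    # today, not necessarily what a specification would say.
    RECORDED = {0: "$0.00", 5: "$0.05", 1999: "$19.99"}

    def test_matches_recorded_outputs(self):
        for cents, expected in self.RECORDED.items():
            with self.subTest(cents=cents):
                self.assertEqual(legacy_format_price(cents), expected)
```

The test only goes through the public call, so whether it ends up coupled to implementation details depends on what gets recorded.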

molten hollow Oct 10, 2025, 12:02 PM

#

You can do black box testing either test first or test after, and doing black box testing test-first always yields better results than test after. As to the "very reasonable behaviour", I wouldn't be so sure, I would say it's only slightly better than manual tests. If I were to order testing strategies from the best to worst, then at the very end would be manual tests, and one tick before that would be tests written after the fact. Any kind of test written before the code will give you a better design, better coupling and cohesion, and most people report feeling way better when working with systems like that.

I guess the reason people don't do that is that they find themselves working on legacy systems without tests, so they very rarely get a chance to test-drive something for real. They are forced to either retrofit tests to existing code (which gives very low quality tests), or write new code and test-drive that (which is also often difficult). If someone finds himself in this situation, say working on a project like that for 2-3 years with no way out, it's very painful to admit to oneself "I'm doing bad testing, because the project forces me to". It's much easier to rationalize it by saying "I'm doing test after the fact and that's good".

#

I propose this to you: take any person, show them either a very well written project with tests or a new project, show them how nice and easy it is to work in an environment like that, and 99% of them will never write tests after the fact again. But they need to experience it first.

#

> even "snapshot testing" and it's a very reasonable behavior and might or might not test implementation details.
I agree that it's popular, but it's not good. What you most often get from a snapshot test is a test that couples to the implementation. It doesn't give you the same freedom to refactor your internals as a regular test written before the fact.

proud nebula Oct 10, 2025, 12:29 PM

#

molten hollow > even "snapshot testing" and it's a very reasonable behavior and might or might...

That's a strawman though. No one argued that.

#

Black box testing is great when it's the right tool. TDD is great when it's the right tool. No tests at all is great when it's the right situation. Everything has context.

molten hollow Oct 10, 2025, 1:47 PM

#

proud nebula Black box testing is great when it's the right tool. TDD is great when it's the ...

> Black box testing is great when it's the right tool. TDD is great when it's the right tool. No tests at all is great when it's the right situation. Everything has context.
🧐
This argument can be used to defend any bad idea. LCD or OLED screens are better than old CRT monitors, but you could still say "LCDs are good when they're the right tool and CRTs are good when they're the right tool". That doesn't mean anything. The context for using a CRT is so narrow that it doesn't make sense to recommend it to anyone, same as test-after code.

proud nebula Oct 10, 2025, 1:49 PM

#

molten hollow > Black box testing is great when it's the right tool. TDD is great when it's th...

There are very few outright bad ideas. It's all about context. Ignoring context and dealing with absolutes is fundamentalism and that's no good.

molten hollow Oct 10, 2025, 1:50 PM

#

proud nebula There are very few outright bad ideas. It's all about context. Ignoring context ...

You're leaving the subject and going off to broader areas.

#

The topic came from @sturdy plaza, who showed some code and asked how to retrofit tests into existing code.

#

I suggested that it's a bad idea altogether, and that it would be much better to drive the design of the code from tests. You achieve much better results that way. By retrofitting (or snapshot testing), you couple your tests to the implementation. They assert that the code is the code that is there, but they don't improve the design of the system (like tests-first would) and aren't flexible enough to allow a substantial refactor. Tests-after will break when you refactor your code, because they couple to the implementation.

#

Tests written before will allow a refactor, because they're specifying the intent, not the implementation.

river pilot Oct 10, 2025, 1:52 PM

#

molten hollow I suggested that it's a bad idea all together, and it would be much better to dr...

if you already have code, and you don't yet have tests, you should write tests. It sounds like you're suggesting you shouldn't write those tests.

proud nebula Oct 10, 2025, 1:52 PM

#

molten hollow You're leaving the subject and going off to broader areas.

I'm talking about the basic problem underlying the disagreement.

molten hollow Oct 10, 2025, 1:54 PM

#

river pilot if you already have code, and you don't yet have tests, you should write tests....

Well, yes and no. You're right in some areas.

If you write your code first, you don't generally think about it being testable, because it's already hard to get it to work. So we tend to create software that's not testable, poorly designed and tightly coupled. It's hard to test that code.

On the other hand, if you write your test first, that drives you to create software that is more testable, because you can't do it any other way 😄 The resulting code is much more testable and, by definition, decoupled. It's usually better designed, because when writing the test you're thinking of what you'd like to achieve, not implementation details.

The first is undesired, and the latter is highly desired.

#

Now, how do you achieve the second, if you already have the first? 🤔

proud nebula Oct 10, 2025, 1:55 PM

#

molten hollow Well, yes and no. You're right in some areas. If you write your code first, we ...

That's all irrelevant to the situation. He's asking about his broken leg and you're telling him to not break the leg in the first place.

molten hollow Oct 10, 2025, 1:55 PM

#

The sad part is, you can't. The design decisions are already made, so you can't retrofit proper tests to that code. What you can do, if you have like 100 "old classes" (classes without tests), is take one of them and rewrite it, but with tests. So you have 99 old classes and 1 new class. That one class is an improvement. You use the new class in your code, and when you're done you remove the old one. You do that for all code in your system, and you have a testable system now.

molten hollow Oct 10, 2025, 1:56 PM

#

proud nebula That's all irrelevant to the situation. He's asking about his broken leg and you...

> That's all irrelevant to the situation. He's asking about his broken leg and you're telling him to not break the leg in the first place.
I see why you may think that, and if I had suggested that, it would be poor advice. But I'm not! 😄 Let me explain:

river pilot Oct 10, 2025, 1:56 PM

#

molten hollow The sad part is, you can't. The design decisions are already made, so you can't ...

it's not realistic to say, "just rewrite the whole thing"

proud nebula Oct 10, 2025, 1:56 PM

#

molten hollow The sad part is, you can't. The design decisions are already made, so you can't ...

If you rewrite it with tests but have no way to verify that your new code does the same thing as the old code, that's a problem.
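
One way to get that verification is a parity test that runs the old and the new implementation on the same inputs. This is only a sketch: old_slugify, new_slugify and the sample inputs are hypothetical, and the samples only cover cases somebody thought of.

```python
import unittest


def old_slugify(title):
    """Hypothetical legacy implementation."""
    return "-".join(title.lower().split())


def new_slugify(title):
    """Hypothetical rewritten implementation."""
    words = [word.lower() for word in title.split()]
    return "-".join(words)


class SlugifyParityTest(unittest.TestCase):
    SAMPLES = ["Hello World", "  padded   input ", "", "MiXeD Case"]

    def test_new_matches_old(self):
        # The old code acts as the oracle; no expected values are written by hand.
        for title in self.SAMPLES:
            with self.subTest(title=title):
                self.assertEqual(new_slugify(title), old_slugify(title))
```

Once the old implementation is deleted, a test like this has no oracle left and would have to be replaced by tests with explicit expectations.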

molten hollow Oct 10, 2025, 1:56 PM

#

He has a class without tests, and he wants to add a test. Of course, you should want the good tests and good design - if not, what's the point? And the best way to achieve that, is to write that class test-first.

#

If you do that, you will have what you wanted - a class with good tests.

#

> it's not realistic to say, "just rewrite the whole thing"
I didn't suggest rewriting the whole thing, just this class that he wants tested.

river pilot Oct 10, 2025, 1:57 PM

#

molten hollow > it's not realistic to say, "just rewrite the whole thing" I didn't suggest rew...

if i have no tests, and take your advice, i will rewrite the whole thing.

proud nebula Oct 10, 2025, 1:57 PM

#

molten hollow > That's all irrelevant to the situation. He's asking about his broken leg and y...

I mean, that's what you said, but I am willing to believe you failed to communicate what you really meant :P I do that all the time myself.

pearl cliff Oct 10, 2025, 1:57 PM

#

molten hollow The topic came @sturdy plaza who showed some code, and he asked how to ...

My strong disagreement with your perspective is specifically in the context of writing new tests for existing code

proud nebula Oct 10, 2025, 1:58 PM

#

molten hollow He has a class without tests, and he wants to add a test. Of course, you should ...

wat.. that's just saying the same thing again, which is the part we disagree with

pearl cliff Oct 10, 2025, 1:58 PM

#

The chance of introducing a new bug, or accidentally missing some tiny feature, is way too high

#

In an ideal world you have a specification for the program and you can implement that as the test suite, and then factor your code to match the test suite

#

In practice, the specification is whatever the program already happens to do

sturdy plaza Oct 10, 2025, 1:59 PM

#

Oh, these hints from you all are interesting. I think there are more things here that I can chew for the moment.
I decided to create a side effect and get the arguments passed to send_away_my_interesting_var.
It's not the best solution, I know, but now I see I can use tests to help decoupling and speed up development from now on.
I'm glad it fired such an interesting discussion! 🙂
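
For anyone reading later, a sketch of capturing the argument without touching the legacy code. It relies on the fact that a patched mock records its calls, so call_args is used here; a side_effect that appends to a list would work the same way. The class bodies are hypothetical stand-ins that only mirror the shape of the snippet above.

```python
import unittest
from unittest.mock import patch


class SomeOtherClass:
    @staticmethod
    def send_away_my_interesting_var(my_interesting_var, source_name=None):
        raise RuntimeError("would talk to an external system")


class SomeClass:
    """Hypothetical stand-in mirroring the shape of the legacy class."""

    def process(self):
        try:
            self.create_indicators([1, 2, 3])
        except Exception:
            return False
        return True

    def create_indicators(self, params):
        my_interesting_var, refs = self.create_my_interesting_var(params)
        SomeOtherClass.send_away_my_interesting_var(my_interesting_var)

    def create_my_interesting_var(self, params):
        return {"total": sum(params)}, []


class ProcessTest(unittest.TestCase):
    def test_interesting_var_is_sent_away(self):
        with patch.object(SomeOtherClass, "send_away_my_interesting_var") as mock_send:
            self.assertTrue(SomeClass().process())
        # The mock recorded the call, so the argument can be inspected afterwards.
        mock_send.assert_called_once()
        (sent_var,), _kwargs = mock_send.call_args
        self.assertEqual(sent_var, {"total": 6})
```

Nothing is raised inside process(), so its broad except clause no longer gets in the way of the test.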

molten hollow Oct 10, 2025, 1:59 PM

#

> if i have no tests, and take your advice, i will rewrite the whole thing.
You only rewrite the thing that you want tested. If you want just one class tested, you only need to rewrite that one class.

> If you rewrite it with tests but have no way to verify that your new code does the same thing as the old code, that's a problem.
That is also a very serious and real issue, thank you for bringing it up. It's serious because there are two forces at play: in one corner we have "what the code should do", in the other "what the code does". Good tests should specify what the code should do. Good code should specify what it does.

If you tell me that you "have no way to verify that your new code does the same thing as the old code", to me that means you know what the code currently does (or not), but probably not what it should do. That's a very common issue if you write tests after. Because you implement the stuff, you read the code, and have no idea what it's supposed to be doing.

> The chance of introducing a new bug, or accidentally missing some tiny feature, it is way too high
That is also a very real issue, and exactly the thing you get if you do test-after.

#

> In an ideal world you have a specification for the program and you can implement that as the test suite, and then factor your code to match the test suite
> In practice, the specification is whatever the program already happens to do
That's not entirely true. No one will give you a specification for a program. You're the programmer, you're in charge of developing the application. What you will receive is the wishes of your customer/client/user: what he wants to do, what work he needs to do, what benefit he wants. How it's implemented/designed/developed is up to the programmers.

#

And thus, as a programmer - you must know what the program should do. If not, you're in big, big trouble.

proud nebula Oct 10, 2025, 2:01 PM

#

Dnaron. You sound very junior. I don't know if you are, but it sounds like you're green and excited and have read a lot. I've been that person. 25 years ago.

molten hollow Oct 10, 2025, 2:01 PM

#

Argumentum ad hominem. Thank you.

proud nebula Oct 10, 2025, 2:01 PM

#

molten hollow And thus, as a programmer - you must know what the program should do. If not, yo...

again, that's asking him to not break the leg after the leg is already broken and there's blood squirting wildly. Context.

molten hollow Oct 10, 2025, 2:02 PM

#

That's not what I'm saying, you're misreading my words.

#

He said he's got a class without tests. He wants to add a test. Thus - he wants to have a class with tests. The best way to have that is to write a new test first, and drive that class back from the test, then remove the old class.

#

The class is not written in stone. You can refactor it, update it, remove it and rewrite it.

river pilot Oct 10, 2025, 2:03 PM

#

molten hollow He said he's got a class without tests. He wants to add a test. Thus - he wants ...

there are risks to doing that, not to mention the amount of work it would take.

molten hollow Oct 10, 2025, 2:04 PM

#

What risks? The thing you mentioned already, are that you don't really know what the class is doing, and you're scared of changing it, because you don't know what might happen.

#

And I agree, that's a bad place to be in.

proud nebula Oct 10, 2025, 2:04 PM

#

molten hollow And I agree, that's a bad place to be in.

thus black box testing

molten hollow Oct 10, 2025, 2:04 PM

#

Working in legacy software that does who knows what is stressful.

river pilot Oct 10, 2025, 2:04 PM

#

molten hollow Working in legacy software that does who knows what is stressful.

yes.

molten hollow Oct 10, 2025, 2:04 PM

#

But!

#

If you're in that kind of place, that you don't really know what the software is supposed to do, because it's so bad and old,

#

i'm sorry to say that, but you just aren't able to test it properly. You can't, it's not possible.

#

You can fool yourself into thinking you can blackbox test that,

#

and you can do that, but these tests will not give you any value.

#

they will be slow to execute, break when you refactor, won't catch bugs, won't improve your design, nothing.

river pilot Oct 10, 2025, 2:05 PM

#

molten hollow and you can do that, but these tests will not give you any value.

i'm sorry, that's simply not true. they have value.

molten hollow Oct 10, 2025, 2:06 PM

#

They have like 0.0001% of the value of the tests that would give you 100% if you wrote them test-first.

river pilot Oct 10, 2025, 2:06 PM

#

they aren't ideal, but we started from a non-ideal place.

molten hollow Oct 10, 2025, 2:06 PM

#

If you have a legacy code with classes that you have no idea what they're doing, the only thing you can do to improve it, is to learn what the code is supposed to be doing.

#

Not what it does, but what it's supposed to be doing.

#

If you don't have that, you can do all the black box testing you want, nothing good will come from that.

river pilot Oct 10, 2025, 2:07 PM

#

molten hollow If you don't have that, you can do all the black box testing you want, nothing g...

ok, we get it. this is an extreme way to express your ideals.

molten hollow Oct 10, 2025, 2:07 PM

#

Let me ask you this then. What good are tests, written by someone who doesn't know what the class under test is supposed to be doing?

river pilot Oct 10, 2025, 2:08 PM

#

molten hollow Let me ask you this then. What good are tests, written by someone who doesn't kn...

that wasn't the question. We know what the class is supposed to do. We don't have tests.

molten hollow Oct 10, 2025, 2:08 PM

#

It's like a recipe for a pie, written by someone who doesn't know how to make one.

river pilot Oct 10, 2025, 2:08 PM

#

molten hollow It's like a recipe for a pie, written by someone who doesn't know how to make on...

no one said, "I have no idea what the class is supposed to do"

proud nebula Oct 10, 2025, 2:08 PM

#

He blocked me. So I guess you're on your own ned. Godspeed.

river pilot Oct 10, 2025, 2:08 PM

#

proud nebula He blocked me. So I guess you're on your own ned. Godspeed.

how can you tell that?

proud nebula Oct 10, 2025, 2:09 PM

#

river pilot how can you tell that?

You can't react with emojis on a message of someone who has blocked you. It's a nice funny animation too. The entire window vibrates. It's pretty neat. Confusing as hell though.

molten hollow Oct 10, 2025, 2:10 PM

#

river pilot no one said, "I have no idea what the class is supposed to do"

> no one said, "I have no idea what the class is supposed to do"
I think @proud nebula said that:
> If you rewrite it with tests but have no way to verify that your new code does the same thing as the old code, that's a problem.
He suggested there might be no way to verify that the new code does the same thing as the old code. To me, the only circumstance in which that is true is if you don't know what the code is supposed to be doing.

#

He suggested there are some parts in the code, that do something - but we're not really sure why or how.

river pilot Oct 10, 2025, 2:11 PM

#

molten hollow > no one said, "I have no idea what the class is supposed to do" I think <@69030...

ok, then it wasn't expressed well. We know what the code is supposed to be doing.

molten hollow Oct 10, 2025, 2:11 PM

#

If you do, then what's the problem with writing a new test first, then drive a class from it? 🤔

river pilot Oct 10, 2025, 2:11 PM

#

molten hollow If you do, then what's the problem with writing a new test first, then drive a c...

have you done this with legacy code?

molten hollow Oct 10, 2025, 2:12 PM

#

Yup.

#

You test-drive a small bit of the system, and you replace the usage of the old version with the new version.

proud nebula Oct 10, 2025, 2:12 PM

#

(aka YOLO)

molten hollow Oct 10, 2025, 2:12 PM

#

And you do that with every bit that you want tested properly.

river pilot Oct 10, 2025, 2:12 PM

#

molten hollow If you do, then what's the problem with writing a new test first, then drive a c...

the problem is that there can be unknown edge cases or side effects. It's not that we have no idea what the code should do. It's that we might not understand 100% of what the code does.

molten hollow Oct 10, 2025, 2:13 PM

#

> the problem is that there can be unknown edge cases or side effects.
Back again: if there might be, that means you don't really know what the system is supposed to be doing.

river pilot Oct 10, 2025, 2:13 PM

#

molten hollow > the problem is that there can be unknown edge cases or side effects. Back agai...

you keep switching to "100% don't know."

#

in any case, @sturdy plaza has what they needed.

molten hollow Oct 10, 2025, 2:15 PM

#

These "unknown edge cases" or side effects, that you speak of - if the application was written test-first, there wouldn't be any, because they would be covered by tests.

#

> the problem is that there can be unknown edge cases or side effects
If it's true that there are these edge cases, then doing "blackbox" testing won't help you much either, because that kind of test won't illustrate those edge cases.

river pilot Oct 10, 2025, 2:15 PM

#

molten hollow These "unknown edge cases" or side effects, that you speak of - if the applicati...

again, you are saying "you shouldn't have gotten yourself into that situation in the first place"

molten hollow Oct 10, 2025, 2:16 PM

#

As a sidenote, yes. But I'm also saying how to leave it.

river pilot Oct 10, 2025, 2:16 PM

#

molten hollow As a sidenote, yes. But I'm also saying how to leave it.

and we are saying it might not be feasible to leave it the way you are describing.

molten hollow Oct 10, 2025, 2:16 PM

#

If you want good tests, you do this:

if there are edge cases, find them
write a fresh test
drive the class from that test
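
Those three steps could look roughly like this. It's a sketch only: IndicatorBuilder, its build() method and the empty-input edge case are hypothetical names chosen for illustration, not part of the legacy code in this thread.

```python
import unittest


class IndicatorBuilderTest(unittest.TestCase):
    """Written first: states the intended behaviour, including a known edge case."""

    def test_sums_parameters(self):
        self.assertEqual(IndicatorBuilder().build([1, 2, 3]), {"total": 6})

    def test_empty_parameters_give_zero_total(self):
        # Edge case found by reading the legacy code (hypothetical).
        self.assertEqual(IndicatorBuilder().build([]), {"total": 0})


class IndicatorBuilder:
    """Written second: just enough code to make the tests above pass."""

    def build(self, params):
        return {"total": sum(params)}
```

The tests name the class before it exists, which is the point of the exercise: they fail first, then the class is written until they pass.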

river pilot Oct 10, 2025, 2:16 PM

#

and test-first doesn't ensure that you've fully tested all of the behavior either.

molten hollow Oct 10, 2025, 2:17 PM

#

river pilot and test-first doesn't ensure that you've fully tested all of the behavior eithe...

That's correct, but at least you specified in a test what the code is supposed to do. If you find a missing behaviour, it's trivial to add it, because the behaviour is fully specified in the test.

river pilot Oct 10, 2025, 2:17 PM

#

molten hollow That's correct, but at least you specified in test what the code is supposed to ...

the code could have behavior the test doesn't specify

#

even if you wrote the tests first.

molten hollow Oct 10, 2025, 2:18 PM

#

river pilot the code could have behavior the test doesn't specify

that's right, but if you're doing test-first, you can freely remove that code, because it's not needed for anything.

#

If it was needed, there would be a test for it, that would catch it.

#

If you want some behaviour from a software, you codify it in a test.

river pilot Oct 10, 2025, 2:18 PM

#

molten hollow If it was needed, there would be a test for it, that would catch it.

that's not true. You write a test, you write a class, you write another class that uses the first. the second class depends on the behavior the test didn't test.

river pilot Oct 10, 2025, 2:18 PM

#

molten hollow If you want some behaviour from a software, you codify it in a test.

yes, ideally. the real world gets messy. people make mistakes.

molten hollow Oct 10, 2025, 2:19 PM

#

river pilot yes, ideally. the real world gets messy. people make mistakes.

That's true, but in test-first applications, the mistakes happen once every year maybe. In test-after, you get mistakes daily probably.

#

and even if the mistake happens, it's caught very quickly.

molten hollow Oct 10, 2025, 2:19 PM

#

river pilot that's not true. You write a test, you write a class, you write another class th...

The second class wasn't test-driven?

river pilot Oct 10, 2025, 2:20 PM

#

molten hollow The second class wasn't test-driven?

even if it was, the tests could have missed the secret behavior.

molten hollow Oct 10, 2025, 2:20 PM

#

You're the author of the test. If you missed the secret behaviour, that means it wasn't needed.

river pilot Oct 10, 2025, 2:20 PM

#

i get it: tests first is a good way to write better software. but it's not a magic bullet.

molten hollow Oct 10, 2025, 2:21 PM

#

I never said it was a magic bullet. I just said it was orders of magnitude better than test after. Of course there are mistakes, but way fewer.

> even if it was, the tests could have missed the secret behavior.
When you're writing tests, you're designing your system. If you want your system to do something, because it must, you write a test for it. You don't rely on secret behaviour to simply "emerge" and give you a feature. If you want a feature, you write a test for it.

#

So yes - there might be secret behaviours, but you can remove/change/update them, and if all of the tests pass, you're good to go.

#

In a legacy system, the secret behaviour that's missing might actually be critical - but you don't know it, you have no idea of knowing. If there was a test for that, you would know.

#

I get what you guys are saying. The system was in production for 10 years, some secret behaviour appeared that 50% of your users rely on, and you're afraid of changing the code because of that secret behaviour that no one knows about, but if you were to remove it, half of your users would scream. I get that, I've been there.

#

So due to that fear of breaking the secret behaviour, you don't test-drive your app, and do blackbox/snapshot testing instead, because that's the only thing that you trust not to break your system.

#

That's terrible code to work with. It's awful. It's stressful, you feel the pressure, you can't change it much, because it's so fragile. You rename a variable and suddenly the pagination doesn't work. That sucks.

#

There is no way out of this, other than to properly design your system. You need to start improving it, if the system is still to be developed for years to come. To improve it, you need to know what it's supposed to be doing. You can introduce a small change and roll it out to QA or a small number of people, to verify that you didn't break anything. You can push it to another environment and ask someone who knows the system whether the part you touched still works.

#

I suggest you watch a talk by Kent Beck, "The Forest & The Desert Are Parallel Universes": https://www.youtube.com/watch?v=dtu9Ks2CN-U

pearl cliff Oct 10, 2025, 2:33 PM

#

molten hollow There is no way out of this, other than to properly design your system. You need...

I don't see how this squares with your advice to rewrite the class against a speculative test suite

#

The truth is that you need both

#

And you need to be pragmatic about what you do, and what order

#

Most of the time, it's safer to take the approach of gradually building up tests surrounding existing functionality and building up tests around desired/specified functionality

proud nebula Oct 10, 2025, 2:34 PM

#

Ok, someone tell him no one meant that the blackbox tests should be kept for all eternity. I think he thinks that's what we're all saying.

#

(I hate how stupid blocking in discord is)

pearl cliff Oct 10, 2025, 2:34 PM

#

I guess I'm taking the approach that blackbox tests can be an absolute fucking nightmare and sometimes you actually want to write unit tests for existing code

#

Like you really need all three

pulsar oracle Oct 10, 2025, 2:35 PM

#

Peak TDD is when you drive the design of what you're building with clean interfaces, you fundamentally make it easy and comprehensive to test. If you care about actually testing that your code does what it is supposed to and consider it mandatory then you design it in the easiest way to get there. But I don't think tests after the fact in all situations are bad. You don't need TDD to write testable code, sure it's probably leagues better if you lean into it but some stuff is obvious with what behaviors it should have and it's fine to put ones on after.

pearl cliff Oct 10, 2025, 2:35 PM

#

But here a user wanders in and asks us how to write a unit test for an existing class. The answer can't be to spend several developer weeks or even months building out a sophisticated test infrastructure

pearl cliff Oct 10, 2025, 2:37 PM

#

sturdy plaza Oh, these hints from you all are interesting. I think there are more things here...

IMO this is precisely what "bad" testing tools like mocking are good for. They let you add tests easily and quickly, which allows you to make localized refactoring easy, which has a positive snowball effect

#

@molten hollow I think where your approach makes more sense is in a big team

#

I do not get the sense that this person is in a big team but I suppose I should've checked

molten hollow Oct 10, 2025, 2:40 PM

#

it's safer to take the approach of gradually building up tests surround existing functionality and building up tests around desired/specified functionality
"Safer" as in less chance of breaking secret behaviour? Yes.
"Safer" as in it lets you safely change the code, introduce new feature, fix bugs, refactor? No.

But I don't think tests after the fact in all situations are bad.
To me, writing tests after the fact has all the disadvantages, and no advantages. I'm sure, that if I jumped into your project, all tests I would've written would've been test first. There are ways to do that, that you can learn, and there are obstacles to that, but they can be dealt with.

You don't need TDD to write testable code
That's right, but if you rely on your judgment to create testable code, that's just an untried guess. Sometimes it'll work, sometimes it won't. And you end up with untested code, in some proportion.

I guess I'm taking the approach that blackbox tests can be an absolute fucking nightmare and sometimes you actually want to write unit tests for existing code
Sure, you might want. The question is - what do you hope to achieve by that?

But here a user wanders in and asks us how to write a unit test for an existing class. The answer can't be to spend several developer weeks or even months building out a sophisticated test infrastructure
I never said that. You can do that in a couple of minutes.

And you need to be pragmatic about what you do, and what order
I see how you call yourself "pragmatic" and me "idealistic", but maybe we can leave these unhelpful words? What you call "idealistic" is to me a day-to-day job that I've been doing for many years now. And what you call "pragmatic" feels to me like being in the worst possible situation, one that I would want to improve quickly if I found myself in it. So, how about we keep it civil. If you feel my advice doesn't cover some case, please bring it up in a peaceful manner, and we can talk about it.

pulsar oracle Oct 10, 2025, 2:40 PM

#

pearl cliff @molten hollow I think where your approach makes more sense is in a big t...

I work by myself and I use TDD if what I'm doing isn't exploratory. Either you write and use and expect other people to use your code or application on the trust me bro model, or you add tests to your code to verify it does what you want, stuff can be as simple as a lambda. Design to make it testable, you don't even gotta write a test to do this either, even just "when in doubt write testable code" does wonders. The code isn't done unless there's tests, at least isolate the important pieces and do those, and in that case why not write them first.

river pilot Oct 10, 2025, 2:48 PM

#

@molten hollow "You can do that in a couple of minutes." The discussion will be more helpful if you acknowledge that it might take more than a couple of minutes. You are stating things in very stark terms.

molten hollow Oct 10, 2025, 2:49 PM

#

river pilot @molten hollow "You can do that in a couple of minutes." The discussion ...

Why would writing a test for a new class take more than that? 😮

#

If your code is coupled to some framework, if it's non-deterministic, if it's got a lot of dependencies, if it's badly designed - yes. But these are all code smells. If you stop thinking about "how can I test this already existing class", and think of it in terms of "I need a class that does X", then it's very simple, and very doable in a couple of minutes.

river pilot Oct 10, 2025, 2:50 PM

#

molten hollow Why would writing a test for a new class take more than that? 😮

writing the new class and ensuring it still does what it needs to do will take more than a couple of minutes.

molten hollow Oct 10, 2025, 2:50 PM

#

river pilot writing the new class and ensuring it still does what it needs to do will take m...

Never happens to me. Please, give me an example.

river pilot Oct 10, 2025, 2:50 PM

#

molten hollow Never happens to me. Please, give me an example.

i need to get a job where you work 😄

molten hollow Oct 10, 2025, 2:51 PM

#

I gotta be honest - if it took me hours to create a test, I would be hellishly tired and would probably stop doing that. But it's quick and easy, provided you don't slow yourself down with code smells.

pulsar oracle Oct 10, 2025, 2:51 PM

#

If the class is supposed to produce some sort of json file for example and there's a lot of stuff to make sure is right, maybe in a case where it's not worth bringing in the repository or another data access abstraction pattern.

molten hollow Oct 10, 2025, 2:52 PM

#

pulsar oracle If the class is supposed to produce some sort of json file for example and there...

That's speaking in terms of implementation details. Tell me what the class needs to do, what's the expected behaviour?

pulsar oracle Oct 10, 2025, 2:53 PM

#

I'm just trying to imagine because it's been a while. I don't know tbh. Maybe someone else should provide an example.

molten hollow Oct 10, 2025, 2:54 PM

#

So if your class is coupled to a framework (like uses Spring annotations, Laravel classes, Ruby on Rails stuff), has a lot of dependencies, a lot of static/global state, is coupled to the inputs and outputs, or is reliant on implementation details, then obviously this class is not testable and it would take hours to test. That's exactly the reason why working with it is hard, even if you introduce blackbox testing to it.

#

And that's exactly why I'm suggesting you should create a new test, drive the responsibility of the class from the test, and then use it in place where the original class was used.

proud nebula Oct 10, 2025, 3:06 PM

#

ah, to be young and naive again

pearl cliff Oct 10, 2025, 3:51 PM

#

molten hollow And that's exactly why I'm suggesting you should create a new test, drive the re...

Do you have any success stories of doing this? I don't ask to doubt your experience. I more wonder if there are certain situations where this approach does work, which is useful for those of us who uniformly recommend against it.

Many programmers spanning decades have tried to do this many times and failed, which is where the advice you hear comes from. I personally have tried it and it has only ever ended up in me working through the night super stressed out when I could've been sleeping or having fun, and/or having sheepish 1:1s explaining that I badly underestimated the work.

So there's a mismatch between your recommendation and the recommendations of people who feel that they have learned the hard way not to do what you recommend. Maybe that means you have a different and unique perspective.

molten hollow Oct 10, 2025, 4:00 PM

#

pearl cliff Do you have any success stories of doing this? I don't ask to doubt your experie...

Sure!

Do you have any success stories of doing this? I don't ask to doubt your experience. I more wonder if there are certain situations where this approach does work, which is useful for those of us who uniformly recommend against it.
I mean, I managed to do it in every project I joined. I do stumble upon everything you guys describe, big classes, no tests, secret behaviours, all that. What you experience, is real. But I try to address the issues and deal with them. I tried multiple things, and what I suggest here was just the stuff that works for me. I tried blackbox/sandbox testing, and it didn't do it for me.

Many programmers spanning decades have tried to do this many times and failed, which is where the advice you hear comes from. I personally have tried it and it has only ever ended up in me working through the night super stressed out when I could've been sleeping or having fun, and/or having sheepish 1:1s explaining that I badly underestimated the work.
That's definitely true, and that's a real problem. However, I found that it's not intrinsic; it's not like we're bound to suffer. Most problems like that come from very simple things that we can change. Stuff like:

we believe people must sign off deploys
we believe we must deploy to all of the people at once
we don't trust our developers and testers
we can't work in pairs because it slows us down too much
we should optimise for time spent coding, not talking to people
tests are not part of a releasable, so they're not important
my manager didn't ask me to refactor, so I can't do that.
we must create the whole feature at once in a sprint, we can't split it in chunks

These aren't the only ones; there are more. There are things/assumptions that people hold that sometimes stop them from working in a productive way. The only way for me to address them would be to find them somehow, either by working with your code or by talking to you.

#

Maybe that means you have a different and unique perspective.
It's definitely not unique, I met many people who do the same thing. Did you try reading "Working Effectively with Legacy Code" by Michael Feathers?

molten hollow Oct 10, 2025, 4:01 PM

#

proud nebula ah, to be young and naive again

Argumentum ad hominem, again.

#

When I said before that "you don't know what your software is supposed to be doing", I'm prepared to accept that may have been a bit rough; people might feel personally attacked. But I didn't mean to attack anyone, that was supposed to be a diagnostic observation. Programmers not being fully aware of what the code is supposed to be doing is a real problem that needs addressing. I just stated it, to put myself in a place where I can deal with the issue somehow. If I were to find myself in a project where I don't know what it's doing, then that's the first, second and third thing I would need to fix. Testing would come later.

#

Test-first is useful precisely because you can't really do it, if you don't know what your software ought to do. And I saw that when I suggested that, I got pushback - because some programmers actually didn't know that. So the thing now should be - not to skip the test-first, but to learn what the system is supposed to be doing.

river pilot Oct 10, 2025, 10:43 PM

#

molten hollow Test-first is useful precisely because you can't really do it, if you don't know...

fwiw, i wasn't pushing back on test-first. maybe you mean someone else was.

molten hollow Oct 11, 2025, 10:35 AM

#

I had a feeling I'm talking to 3 different people, and randomly one of them answers my posts 😄

safe bronze Oct 11, 2025, 12:31 PM

#

Bro why was I temporarily muted?

molten hollow Oct 12, 2025, 9:31 AM

#

safe bronze Bro why was I temporarily muted?

I didn't see any of your messages in past 2 days, if that's what you're asking.

cedar wraith Oct 12, 2025, 1:47 PM

#

How are you supposed to unit test, when you actually didnt implement the function first?

#

Property based testing as well

proud nebula Oct 12, 2025, 1:51 PM

#

cedar wraith How are you supposed to unit test, when you actually didnt implement the functio...

You can define the function to return None, then write the tests until they all pass.
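
A minimal sketch of that workflow, assuming a made-up `word_count` function (the names here are hypothetical, just for illustration). The stub only exists so the test can run and fail; the real body comes afterwards.

```python
def word_count_stub(text):
    """Step 1: a placeholder so the test can run (and fail)."""
    return None


def word_count(text):
    """Step 2: the implementation, written after the test existed."""
    return len(text.split())


def check_word_count(func):
    """The test itself: it only relies on the name, signature and purpose."""
    return func("") == 0 and func("one") == 1 and func("two  words") == 2


# The stub fails the check, the finished function passes it.
stub_passes = check_word_count(word_count_stub)
real_passes = check_word_count(word_count)
```

In a real project the stub and the implementation would be the same function at two points in time, and the check would be a normal test function.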

proud nebula Oct 12, 2025, 1:52 PM

#

cedar wraith Property based testing as well

PBT is a method to find edge cases in the logic that you then add to the tests. Mutation testing finds what behavior the code has that isn't tested.
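
Roughly what the PBT half looks like, as a hand-rolled sketch using only `random` (a library like hypothesis generates and shrinks inputs far better; `truncate` and `find_counterexample` are hypothetical names). The property is "the result is never longer than the limit", and random inputs turn up the edge case a handful of hand-picked examples might miss.

```python
import random


def truncate(text, limit):
    """Shorten text to at most `limit` characters, marking cuts with '...'."""
    if len(text) <= limit:
        return text
    # Deliberate bug for the demo: misbehaves when limit is smaller than 3.
    return text[: limit - 3] + "..."


def find_counterexample(trials=1000, seed=0):
    """Try random inputs and return the first one that breaks the property."""
    rng = random.Random(seed)
    for _ in range(trials):
        text = "x" * rng.randint(0, 20)
        limit = rng.randint(0, 20)
        if len(truncate(text, limit)) > limit:
            return text, limit
    return None


counterexample = find_counterexample()
```

Whatever input comes back is then worth adding to the regular tests as a fixed example.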

tired jungle Oct 12, 2025, 9:30 PM

#

greetings, I have a bin where you can see 1 fixture and 1 test function which fails to assert due to MagicMock being compared to a string

following a pdb.set_trace, I wasn't able to return a string from the Mock object, in order to pass the test

https://pastebin.com/3nJmR0LW

Pastebin

@pytest.fixture def mock_tempfile(monkeypatch): mock_tmp_file = m...

#

i have exhausted all of the internets, leaving this channel for the last resort. If you google anything, I have tried it

#

even used GPT

proud nebula Oct 12, 2025, 9:43 PM

#

tired jungle greetings, I have a bin where you can see 1 fixture and 1 test function which fa...

Have you tried to work around the issue by avoiding mocking in the first place?

tired jungle Oct 12, 2025, 9:56 PM

#

its a requirement...

#

i dont know how to mock a namedtempfile

#

otherwise, i am doing it by the book

proud nebula Oct 12, 2025, 10:05 PM

#

tired jungle its a requirement...

What? Is this school work or something?

tired jungle Oct 12, 2025, 10:05 PM

#

work

#

just landed this job and i have little experience with unit testing, otherwise am solid with python overall

odd walrus Oct 12, 2025, 10:08 PM

#

Sometimes for a thing like a temp file you want a "fake", not a "mock"; they can be easier, worth looking up at least.
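
Something like this, as a rough sketch. It assumes the code under test can be handed the thing that opens the temp file (the `open_temp` parameter), which may not match the pastebin code; `FakeNamedTempFile` and `save_report` are made-up names.

```python
import io


class FakeNamedTempFile:
    """In-memory stand-in for tempfile.NamedTemporaryFile (hypothetical helper)."""

    def __init__(self, name="/fake/tmp/report.txt"):
        self.name = name
        self._buffer = io.StringIO()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    def write(self, data):
        return self._buffer.write(data)

    def getvalue(self):
        return self._buffer.getvalue()


def save_report(lines, open_temp):
    """Hypothetical code under test: writes lines, returns the file's path."""
    with open_temp() as tmp:
        for line in lines:
            tmp.write(line + "\n")
        return tmp.name


fake = FakeNamedTempFile()
path = save_report(["a", "b"], open_temp=lambda: fake)
```

The fake has real behaviour (a real string for `name`, real buffered writes), so the assertions compare plain values instead of mock objects.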

proud nebula Oct 12, 2025, 10:19 PM

#

tired jungle work

Why would work fight you on trying to.. work...?

#

Is the requirement that you can't modify the code to test until after you've tested it? Under no circumstances ever?

#

(Redefining the problem is what makes a good programmer imo)

pulsar oracle Oct 13, 2025, 8:32 AM

#

tired jungle its a requirement...

Are you sure you're not confusing mocking with mandatory unit testing? Because mocking doesn't prove that your code works, or necessarily achieve anything meaningful toward what you want. In this case it looks like you want to load a file from somewhere correctly, so it would be better to take a sample file or produce one (whichever is more convenient) and see if the function has the end result you want.

dense bough Oct 13, 2025, 11:16 AM

#

Does pytest monkey patch make any guarantees about what __enter__ returns?

swift pewter Oct 13, 2025, 11:33 AM

#

dense bough Does pytest monkey patch make any guarantees about what `__enter__` returns?

__enter__ of what, monkeypatch.context()?

dense bough Oct 13, 2025, 12:10 PM

#

tired jungle greetings, I have a bin where you can see 1 fixture and 1 test function which fa...

Whether test-entering the mock object in the tested function there would return the mock object itself.

#

The two different context managers make things a little confusing 😅
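
For what it's worth, with `unittest.mock.MagicMock` the object bound by `with ... as tmp` is `mock.__enter__.return_value`, which is a separate MagicMock and not the mock itself. So the string has to be configured on that inner object. A minimal sketch, assuming the code under test calls `tempfile.NamedTemporaryFile()` and reads `.name` (`make_temp_path` is hypothetical, not the pastebin code):

```python
import tempfile
from unittest.mock import MagicMock, patch


def make_temp_path():
    """Hypothetical code under test: returns the temp file's name."""
    with tempfile.NamedTemporaryFile() as tmp:
        return tmp.name


mock_factory = MagicMock()
# `with factory() as tmp` binds tmp to factory().__enter__(), so set .name there.
mock_factory.return_value.__enter__.return_value.name = "/tmp/fake.txt"

with patch("tempfile.NamedTemporaryFile", mock_factory):
    result = make_temp_path()

# Demonstration that entering a plain MagicMock does not hand back the mock itself.
plain = MagicMock()
with plain as entered:
    pass
```

With pytest's `monkeypatch.setattr(tempfile, "NamedTemporaryFile", mock_factory)` the same configuration line should apply, since monkeypatch only swaps the attribute and the `__enter__` behaviour comes from the mock.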

molten hollow Oct 15, 2025, 9:37 AM

#

cedar wraith How are you supposed to unit test, when you actually didnt implement the functio...

It's actually not that hard. Because what do you actually need to test a function? You need to know its name, its signature and arguments, and you need to know what its purpose is. So for example, I can imagine a function that parses roman numerals. I don't have the function written yet, but I can write the first test like so:

def test_parse_roman_numerals():
  assert parse_roman('I') == 1
  assert parse_roman('II') == 2
  assert parse_roman('III') == 3
  assert parse_roman('IV') == 4

Having that, I'm free to implement it however I want. You don't need to know the function implementation to write a test for it.
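
For instance, here is one implementation that would satisfy that test. This is only a sketch: the lookup table and the "subtract when a smaller symbol precedes a larger one" rule are one choice among several, and it doesn't validate malformed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def parse_roman(numeral):
    """Convert a roman numeral string to an int (no validation of bad input)."""
    total = 0
    for index, symbol in enumerate(numeral):
        value = ROMAN_VALUES[symbol]
        has_next = index + 1 < len(numeral)
        next_value = ROMAN_VALUES[numeral[index + 1]] if has_next else 0
        # A smaller symbol before a larger one is subtracted (IV, IX, XL, ...).
        if value < next_value:
            total -= value
        else:
            total += value
    return total


def test_parse_roman_numerals():
    assert parse_roman('I') == 1
    assert parse_roman('II') == 2
    assert parse_roman('III') == 3
    assert parse_roman('IV') == 4
```

The test didn't change at all; only the body behind the name `parse_roman` got filled in.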

swift pewter Oct 15, 2025, 9:39 AM

#

molten hollow It's actually not that hard. Because what do you actually **need** to test a fun...

These pure "mathematical" functions don't occur that often in real-life code though.

molten hollow Oct 15, 2025, 9:40 AM

#

swift pewter These pure "mathematical" functions don't occur _that_ often in real-life code t...

That's right, but there are easy ways to test-drive those too. If you give me an example of what's hard to write test-first for, I can show you how I would test-drive that.

#

There are things that are hard to test of course: UIs, concurrency, distributed systems, 3rd party systems. But there are tricks to side-step it, so that most of your code can be test-driven.

molten hollow Oct 15, 2025, 9:43 AM

#

tired jungle greetings, I have a bin where you can see 1 fixture and 1 test function which fa...

I wanted to help you, but I fail to understand what the code actually needs to do? 🤔

cloud shadow Oct 17, 2025, 2:08 AM

#

pulsar oracle Are you sure you're not confusing mocking with mandatory unit testing? Because ...

Test function for sure and then try try try

cedar wraith Oct 17, 2025, 1:29 PM

#

molten hollow It's actually not that hard. Because what do you actually **need** to test a fun...

Ah, so unit testing mainly relies on planning and defining the expected results beforehand, as I understood

molten hollow Oct 17, 2025, 1:30 PM

#

cedar wraith Ah, so unit testing mainly relies on planning and defining the expected results ...

Not really. You don't need to plan ahead. It's just about specifying the expected outcome.

#

`defining the expected results beforehand` that's correct. But the planning part, not really.

cedar wraith Oct 17, 2025, 1:34 PM

#

molten hollow `defining the expected results beforehand` that's correct. But planning part, no...

Well, I’m kinda confused. I’m currently working as an apprentice in a software development company, and they really emphasize detailed product planning

proud nebula Oct 17, 2025, 2:00 PM

#

cedar wraith Ah, so unit testing mainly relies on planning and defining the expected results ...

Beforehand? Before what?

#

Black box testing is very much after the fact.

molten hollow Oct 17, 2025, 3:38 PM

#

cedar wraith Well, I’m kinda confused. I’m currently working as an apprentice in a software d...

If the product planning is to do with production, then that's okay. If that's to do with software development, then that's a huge mistake.

proud nebula Oct 17, 2025, 6:11 PM

#

molten hollow If the product planning is to do with production, then that's okay. If that's to...

You don't know the context. Context is everything.

For example, at JPL, your statement would be the huge mistake.

cedar wraith Oct 17, 2025, 6:36 PM

#

proud nebula Beforehand? Before what?

I meant beforehand as in defining the expected behavior before running the code, not before writing it.

proud nebula Oct 17, 2025, 7:47 PM

#

cedar wraith I meant beforehand as in defining the expected behavior before running the code,...

If you write the tests before you write the function it's TDD. If you write them after the function exists by looking at the code it's white box. If you write it without understanding the function it's black box. If you use hypothesis it's Property Based Testing. If you use mutmut it's Mutation Testing.

#

All of it is "testing" or (less academically correct, but commonly used) "unit testing".

molten hollow Oct 17, 2025, 9:48 PM

#

@proud nebula I'm sorry, but that message is quite misleading for new developers.

#

If you write the tests before you write the function it's TDD
That's necessary for TDD, but not sufficient. Not without other prerequisites.

If you write it without understanding the function it's black box.
Maybe that's incorrect wording, but I think you mean "without knowing the implementation details"? Because if you truly meant "without understanding the function", then you have no business testing the function, if you don't understand it.

If you use hypothesis it's Property Based Testing.
~~You can use hypothesis in all kinds of testing, not just property based testing, what's the deal?~~
PS: the author didn't mean "hypothesis", he meant "hypothesis library".

If you use mutmut it's Mutation Testing.
All of it is "testing" or (less academically correct, but commonly used) "unit testing".
Mutation testing isn't really testing per se. It's a tool to find holes in your test suite. You can't really find bugs with mutation testing, you can only find mutants (live mutations) that weren't caught by the test suite, but that's not a bug. So it's more of a test-suite quality control than a testing strategy. You can't really catch a regression with mutation testing, and you can't drive the implementation (like TDD) with mutation testing. What you can do with mutation testing is improve the reliability of your test suite.
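
To make "live mutant" concrete, here's a hand-made sketch. A tool like mutmut generates such changes automatically; the function and suite names below are hypothetical, and the mutant is written out by hand purely for illustration.

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    # The kind of change a mutation tool makes: ">=" became ">".
    return age > 18


def weak_suite(func):
    """Passes for the original, but never checks the boundary value."""
    return func(30) is True and func(5) is False


def strong_suite(func):
    """Same checks plus the boundary, which is where the mutant differs."""
    return weak_suite(func) and func(18) is True


mutant_survives_weak = weak_suite(is_adult_mutant)
mutant_survives_strong = strong_suite(is_adult_mutant)
```

The original code is correct in both cases. The surviving mutant only says the weak suite has a hole at `age == 18`.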

If you write them after the function exists by looking at the code it's white box. If you write it without understanding the function it's black box.
That separation is very artificial. I didn't work in a team that would use that distinction. In a proper system, you would never need to couple your tests to the implementation of the method, so what's the point of this "white-box-test"? It may be a cool sounding name "white-box-test"/"black-box-test", but what does it really bring to the table?

river pilot Oct 17, 2025, 10:43 PM

#

I find most distinctions (especially within the testing world) are overdone.

molten hollow Oct 17, 2025, 10:44 PM

#

river pilot I find most distinctions (especially within the testing world) are overdone.

@river pilot Some definitely are. Do you have some examples?

river pilot Oct 17, 2025, 10:45 PM

#

molten hollow @river pilot Some definitely are. Do you have some examples?

i think your point about mutation testing is overly picky. testing your tests is still a kind of testing.

molten hollow Oct 17, 2025, 10:48 PM

#

river pilot i think your point about mutation testing is overly picky. testing your tests is...

I think it's a misused word, that leads you to believe it's testing because it has "testing" in the name: "Mutation Testing".

river pilot Oct 17, 2025, 10:48 PM

#

molten hollow I think it's a misused word, that leads you to believe it's testing because it h...

isn't it testing your tests?

molten hollow Oct 17, 2025, 10:48 PM

#

river pilot isn't it testing your tests?

I wouldn't say so.

river pilot Oct 17, 2025, 10:48 PM

#

molten hollow I wouldn't say so.

ok, we can agree to disagree

molten hollow Oct 17, 2025, 10:48 PM

#

I would call it "auditing your tests", "reviewing your tests", "inspecting your tests" at best.

river pilot Oct 17, 2025, 10:48 PM

#

is load testing a kind of testing?

molten hollow Oct 17, 2025, 10:48 PM

#

"Testing" to me means finding bugs.

molten hollow Oct 17, 2025, 10:49 PM

#

river pilot is load testing a kind of testing?

I would say so, because if it finds a problem with the application, that means the application doesn't do something it's supposed to.

#

And so security testing, performance testing, etc.

#

But I don't think mutation testing qualifies.

river pilot Oct 17, 2025, 10:49 PM

#

molten hollow I would say so, because if it finds a problem with the application, that means t...

mutation testing finds a problem with the tests, the tests aren't doing something they are supposed to do.

molten hollow Oct 17, 2025, 10:49 PM

#

Same as check style, linters, code quality checks, sanity checks, I wouldn't call any of them testing.

river pilot Oct 17, 2025, 10:49 PM

#

BUT: what value is added by saying "mutation testing isn't testing"?

#

how does that help anyone?

molten hollow Oct 17, 2025, 10:50 PM

#

river pilot mutation testing find a problem with the tests, the tests aren't doing something...

Yes, but the tests aren't part of the app.

river pilot Oct 17, 2025, 10:50 PM

#

this is what i mean about overdone distinctions.

molten hollow Oct 17, 2025, 10:50 PM

#

Overdone distinctions are bad, but blurring distinct concepts into one also isn't helpful.

river pilot Oct 17, 2025, 10:50 PM

#

molten hollow Overdone distinctions are bad, but blurring distinct concepts into one also isn'...

what would you call mutation testing instead?

molten hollow Oct 17, 2025, 10:50 PM

#

You might as well call "code reviews" testing, because they can help you find problems, but that's not testing.

molten hollow Oct 17, 2025, 10:51 PM

#

river pilot what would you call mutation testing instead?

I think most precise word would be, "inspecting the quality of tests".

#

the more mutants are alive, the weaker the test suite.

#

But it doesn't necessarily mean that the app has problems.

river pilot Oct 17, 2025, 10:52 PM

#

"inspecting the quality of tests" doesn't quite roll off the tongue

molten hollow Oct 17, 2025, 10:52 PM

#

True. Doesn't make it wrong, tho.

river pilot Oct 17, 2025, 10:52 PM

#

ok, we have different approaches to all of this I think

molten hollow Oct 17, 2025, 10:52 PM

#

Notice, that if load test/security test/performance test fails, then that necessarily means there's a problem that requires fixing.

#

With mutation testing, that's not the case.

#

I mean, "testing" is just a category humans impose on practices.

#

you might add and remove elements from that category, if you'd like. the question is whether or not that's useful.

#

@river pilot If you want to say that mutation testing is testing, then things like:

coverage
linter/checkstyle
code review
cyclomatic complexity

would also need to be added to the "testing" category.

river pilot Oct 17, 2025, 10:56 PM

#

This is something I've been thinking about more and more: https://hachyderm.io/@nedbat/115245272539560254

Level 0: Testing is debugging
Level 1: Testing is to show the program works
Level 2: Testing is to show the program doesn't work
Level 3: Testing is to reduce the risk of using the program
Level 4: Testing is a mental discipline that helps us make better software

youtube.com/watch?v=BKgdrEPYqmM

Ned Batchelder (@nedbat@hachyderm.io)

molten hollow Oct 17, 2025, 10:57 PM

#

river pilot This is something I've been thinking about more and more: https://hachyderm.io/...

Okay, that makes sense to me. But these 5 items, I would call by different name. I would just say it's software development.

#

Level 0: Software development is debugging
Level 1: Software development is to show the program works
Level 2: Software development is to show the program doesn't work
Level 3: Software development is to reduce the risk of using the program
Level 4: Software development is a mental discipline that helps us make better software

#

By that definition "coding == testing".

#

I mean, that's not exactly wrong. If you're using TDD, then basically testing is coding. in a sense 😄

#

so I guess that's all right.

pulsar oracle Oct 17, 2025, 10:58 PM

#

river pilot is load testing a kind of testing?

I think it does because it's the only way to find out if your application meets your capacity requirements (or at least continues to meet them)

molten hollow Oct 17, 2025, 10:58 PM

#

In my definition, testing is a falsification mechanism. If you can use something to falsify the claim that the app/program works as it's supposed to, then that's a test.

#

If something can give you result: "fix immediately", then that's a test.

#

If it gives you "fix maybe", then that's an audit/inspection,something like that.

river pilot Oct 17, 2025, 10:59 PM

#

i don't exclude tests from "my program", maybe that's the difference here

#

i encourage people to include their tests in the total coverage percentage, for example.

molten hollow Oct 17, 2025, 11:00 PM

#

For the same reason I wouldn't say that SEO audits for example are testing.

#

@river pilot I got it!

#

I would say that Mutation Testing would count as measurement.

#

That I would agree with.

#

But not every measurement is testing.

#

Regarding your "doesn't roll off the tongue" 😄 "measurement" sounds good.

#

I'm fine with any kind of measurement giving intermediate results, and what not.

#

But for it to count as test, it would need to give a definitive response.

river pilot Oct 17, 2025, 11:03 PM

#

you also want the criteria to include what it gives a response about, I think

molten hollow Oct 17, 2025, 11:05 PM

#

Basically, I would hate for a junior person to come to #unit-testing , and read "mutation testing is testing", and think that he can use that to do regression test for example - that would mislead him.

river pilot Oct 17, 2025, 11:18 PM

#

definitely these topics are intricate and subtle enough to need discussion

molten hollow Oct 17, 2025, 11:20 PM

#

river pilot definitely these topics are intricate and subtle enough to need discussion

Definitely so, if separate concepts are being blurred into one, because someone feels like they're overdone distinctions.

river pilot Oct 17, 2025, 11:21 PM

#

no need to point fingers 😄

molten hollow Oct 17, 2025, 11:22 PM

#

What I like about testing is that it's not open for interpretation. If an acceptance test, unit test, security test, load test, performance test, or integration test fails, that must mean there's something wrong. You can't argue with it.

#

But with mutation testing, seo audits, checkstyles and stuff like that, it's up to the reader to interpret it.

raven igloo Oct 17, 2025, 11:23 PM

#

and with the advent of AI... I'm finding that TDD is much more enjoyable than before. I write tests and let AI write the code to pass my tests.

river pilot Oct 17, 2025, 11:25 PM

#

molten hollow But with mutation testing, seo audits, checkstyles and stuff like that, it's up ...

i often find that failing tests require my interpretation to understand the failure and decide what it means and what to do about it.

molten hollow Oct 17, 2025, 11:26 PM

#

river pilot i often find that failing tests require my interpretation to understand the fail...

I never found that. I write tests tdd-style, so I decide what the test means at the moment of writing, before any implementation.

#

When the test fails in the future, it's already determined what it means.

molten hollow Oct 17, 2025, 11:26 PM

#

river pilot i often find that failing tests require my interpretation to understand the fail...

That kind of interpretation would be required with tests-written-after-the-fact I think, and in that case I think it's weaker, exactly because of that interpretation.

river pilot Oct 17, 2025, 11:27 PM

#

this is a repeat of a few days ago: your project seems very different than the ones I have worked on.

molten hollow Oct 17, 2025, 11:27 PM

#

river pilot this is a repeat of a few days ago: your project seems very different than the o...

I think projects may have been similar. What differs is the approach I think.

river pilot Oct 17, 2025, 11:28 PM

#

i'm not interested in you telling me i've been doing it wrong.

molten hollow Oct 17, 2025, 11:29 PM

#

I mean, if you were to choose between:

test, that if passes gives you confidence that everything works, and when fails, points you exactly where the issue is

vs.

test, that you must read and interpret what it means, and different people might disagree about what the failure means

Which test would be better? Which more useful? Which would make developers work faster and better?

molten hollow Oct 17, 2025, 11:29 PM

#

river pilot i'm not interested in you telling me i've been doing it wrong.

I'm not big on criticizing people 😄

#

I might criticize ideas, concepts, etc. but with people it's much more complicated.

river pilot Oct 17, 2025, 11:31 PM

#

i work on coverage.py. Its test suite checks that Python code is being measured properly by coverage.py. Python changes from version to version. tests fail. Is it coverage at fault, or Python?

molten hollow Oct 17, 2025, 11:31 PM

#

Python version change isn't forced on you or on the project, right?

#

When you work on that coverage.py, you need to manually add the new version?

river pilot Oct 17, 2025, 11:32 PM

#

molten hollow Python version change isn't forced on you or on the project, right?

coverage.py's goal is to properly measure the next version of Python.

river pilot Oct 17, 2025, 11:32 PM

#

molten hollow When you work on that coverage.py, you need to manually add the new version?

there are often changes in Python that need adaptations in coverage.py

molten hollow Oct 17, 2025, 11:33 PM

#

So your goal is to be very up to date with python, but it's not like your project is immediately compatible with python change.

#

Unless python was dependent on your project, that's not happening.

#

You probably version the python version you run your coverage.py, right?

#

So you set it to be compatible with 3.14 let's say for now.

river pilot Oct 17, 2025, 11:35 PM

#

it supports 3.15 now, and runs nightly against the tip of main of CPython

molten hollow Oct 17, 2025, 11:35 PM

#

So it relies on something outside of your control, then?

#

Well, then I would handle it the same way as any 3rd-party.

#

Like payment providers, etc.

river pilot Oct 17, 2025, 11:35 PM

#

molten hollow Well, then I would handle it the same way as any 3rd-party.

how?

molten hollow Oct 17, 2025, 11:36 PM

#

The same way I treat stuff like stripe, oAuth login, any kind of integration with 3rd party.

river pilot Oct 17, 2025, 11:36 PM

#

molten hollow The same way I treat stuff like stripe, oAuth login, any kind of integration wit...

i don't know what way that is.

molten hollow Oct 17, 2025, 11:36 PM

#

Let's say, when a new python 3.16 comes and there is a number of ways it's incompatible with your coverage.py;

#

there is minimum time you need to update your coverage.py, so it's compatible again. Let's say that's 24 hours.

#

For that 24 hours, your goal is not met.

#

You might want to get that number down, to maybe 12-hours or something, but still. You don't control when python is released, and they don't depend on you, so you can only retroactively react to the changes.

#

So when python introduces an incompatible change, it's neither python failure nor your failure.

#

They're just incompatible.

river pilot Oct 17, 2025, 11:39 PM

#

i don't understand why you are talking about 24 hours, and you haven't talked about how the test failures need interpretation.

molten hollow Oct 17, 2025, 11:39 PM

#

river pilot i don't understand why you are talking about 24 hours, and you haven't talked ab...

Because there's no right way to say "who's at fault".

#

Python is definitely not at fault, they just released an update.

river pilot Oct 17, 2025, 11:39 PM

#

molten hollow Because there's no right way to say "who's at fault".

yup. it needs interpretation. that's what I said.

molten hollow Oct 17, 2025, 11:40 PM

#

Your project is not at fault, because it doesn't control the things it relies on.

river pilot Oct 17, 2025, 11:40 PM

#

Python is often at fault, that's why i test pre-alphas.

molten hollow Oct 17, 2025, 11:40 PM

#

river pilot Python is often at fault, that's why i test pre-alphas.

Even if it is, there's nothing you can do about it, can you?

#

Unless you're also a maintainer/contributor of python that can freely update it.

river pilot Oct 17, 2025, 11:41 PM

#

molten hollow Even if it is, there's nothing you can do about it, can you?

I can ask them to fix it: https://github.com/python/cpython/issues?q=is%3Aissue%20state%3Aopen%20author%3Anedbat

molten hollow Oct 17, 2025, 11:41 PM

#

Sure, but you don't have control at whether they'll merge it, right? That's what I'm talking about.

river pilot Oct 17, 2025, 11:41 PM

#

those issues are mostly, "this is what i see, whose fault is it?"

molten hollow Oct 17, 2025, 11:41 PM

#

It's not like you can merge it yourself.

pulsar oracle Oct 17, 2025, 11:41 PM

#

I don't get what's being discussed? Are we talking about automated testing for compatibility with the latest python versions nightly or something?

river pilot Oct 17, 2025, 11:41 PM

#

pulsar oracle I don't get what's being discussed? Are we talking about automated testing for c...

yes.

#

@molten hollow wherever this is going: do you see how a test failure requires interpretation?

molten hollow Oct 17, 2025, 11:42 PM

#

river pilot those issues are mostly, "this is what i see, whose fault is it?"

Yes. That "interpretation" is exactly why I wouldn't call anything you're doing testing.

molten hollow Oct 17, 2025, 11:42 PM

#

river pilot @molten hollow wherever this is going: do you see how a test failure requ...

I see how what you're doing requires interpretation, yes.

#

But I don't think what you're doing is testing.

#

That's just development of your coverage.py project.

river pilot Oct 17, 2025, 11:42 PM

#

molten hollow Yes. That "interpretation" is exactly why I wouldn't call anything you're doin...

so now failures in my test suite aren't "testing"? This is getting absurd.

molten hollow Oct 17, 2025, 11:42 PM

#

river pilot so now failures in my test suite aren't "testing"? This is getting absurd.

What you're doing is developing your project.

pulsar oracle Oct 17, 2025, 11:43 PM

#

What kind of test failure is it and why does it require interpretation? If it fails and isn't compatible with the latest version, isn't that a concrete test that says "we're invalid" or something, if that's our goals. Or is it like, it could break externally for some arbitrary reason, and it's flaky so it's not really a test?

molten hollow Oct 17, 2025, 11:43 PM

#

And the part, where you solve compatibility issues with python, I wouldn't call that testing. That's integrating with a new version.

#

It's the same thing as if one of the libraries in my application gets an update, and I want to update it.

#

And let's say it's got a breaking change that I need to integrate into my app. That's not testing, that's just an upgrade.

river pilot Oct 17, 2025, 11:44 PM

#

molten hollow And the part, where you solve compatibility issues with python, I wouldn't call ...

you started by saying that test failures shouldn't require interpretation. I showed you test failures that do. Now you say that isn't testing. I think we are done.

molten hollow Oct 17, 2025, 11:45 PM

#

Your example is integrating your project with a newer version of Python. And that definitely requires interpretation, yes!

pulsar oracle Oct 17, 2025, 11:45 PM

#

river pilot you started by saying that test failures shouldn't require interpretation. I sho...

I think what he's trying to say is that because it can't kill something, or falsify or like impact the release of anything it's not really a test???

molten hollow Oct 17, 2025, 11:45 PM

#

pulsar oracle I think what he's trying to say is that because it can't kill something, or fals...

I'm just saying a test isn't open for interpretation.

#

Like, if it fails.. then everyone involved will agree that it fails.

river pilot Oct 17, 2025, 11:46 PM

#

i agree my test has failed. now i need to determine why and what to do about it.

molten hollow Oct 17, 2025, 11:46 PM

#

river pilot i agree my test has failed. now i need to determine why and what to do about it...

Point taken. if it fails, then everyone involved will agree that it fails and why.

pulsar oracle Oct 17, 2025, 11:46 PM

#

feels like some sort of exploratory test, a test still. Are we compatible with the latest python? Fail = no. we've got our result, what do we do now?

molten hollow Oct 17, 2025, 11:50 PM

#

river pilot i agree my test has failed. now i need to determine why and what to do about it...

I think what you're doing is conceptually the same as upgrading a library in my application. Isn't it?

river pilot Oct 17, 2025, 11:50 PM

#

this is my only point: you said test failures shouldn't require interpretation. Sometimes they do.

molten hollow Oct 17, 2025, 11:51 PM

#

river pilot this is my only point: you said test failures shouldn't require interpretation. ...

But how is what you're doing testing?

river pilot Oct 17, 2025, 11:51 PM

#

molten hollow But how is what you're doing testing?

i wrote test_foo(), I ran it with pytest. it failed. What can it possibly mean to say it isn't testing?

#

this is a meaningless distinction.

molten hollow Oct 17, 2025, 11:51 PM

#

Just because you can run it in a testing library doesn't necessarily mean it's a test.

river pilot Oct 17, 2025, 11:52 PM

#

this is absurd. i'm done.

molten hollow Oct 17, 2025, 11:52 PM

#

@river pilot

def test_foo():
  print('Hello')

Is this a test in your opinion?

#

You can take any code and put it in a testing library. Does this mean any code is a test? 🤔

#

I can take any hello world app, any function, and wrap it in a pytest test. Does this mean it's now a test?

river pilot Oct 17, 2025, 11:55 PM

#

i hope you can assume that my tests are not like that.

molten hollow Oct 17, 2025, 11:55 PM

#

I don't know what they're like, but when you tell me they're open for interpretation, then I'm prepared to say that they're not really tests.

#

Tests should be definite, deterministic and not open for interpretation.

#

I can agree that your pytest "tests" are checking your integration with Python; I'm fine with that.

pulsar oracle Oct 17, 2025, 11:56 PM

#

I feel like we're being loose with "open for interpretation" in the example.

molten hollow Oct 17, 2025, 11:57 PM

#

But given that you have control over your coverage.py, and not over python; then it's essentially an app + 3rd party integration.

#

Let's say I'm creating a webapp, that needs to allow the user to pay for services, and we use stripe to do that. Of course, stripe may be down, and in that case the website displays information "sorry, stripe is down".

#

Is this function "sorry, stripe is down" a test? No, it's not. It's just information for the user that the service is unavailable. Yes, it tells you something that you can use to do something, but it's not a test.

#

Same as your pytest things. Python becomes incompatible with your app, you have something that measures it and lets you know about that, but it's not a test.

molten hollow Oct 18, 2025, 12:03 AM

#

river pilot i hope you can assume that my tests are not like that.

I think what you're doing, are measurements. And they can be open for interpretation.

PS: For them to become tests, you would need to narrow them down to a true/false result with an exact reason for failure. If they continue to be open for interpretation, then I'm afraid they're still measurements and not tests.

river pilot Oct 18, 2025, 12:04 AM

#

thank you for demonstrating my point.

proud nebula Oct 18, 2025, 12:38 AM

#

molten hollow I wouldn't say so.

You're now arguing against the common use of established terms.

You are also arguing that you know better what mutation testing is than the author of the most commonly used mutation testing tool for python.

You are extremely arrogant, and refuse to listen, and when you are corrected you argue minor semantic details that are themselves irrelevant until the other part gives up in frustration.

You haven't won any argument here. You have just demonstrated that you are impossible to have a meaningful discussion with, and that you will make every effort to not lose face instead of trying to learn. You have also demonstrated that you are willing to say absolutely idiotic things like "You can use hypothesis in all kinds of testing, not just property based testing, what's the deal? ". https://hypothesis.readthedocs.io/en/latest/ "Hypothesis is the property-based testing library for Python".

What will you argue next? That "python" isn't really a programming language?

At this point you are damaging this channel by your presence.

molten hollow Oct 18, 2025, 8:19 AM

#

> You are also arguing that you know better what mutation testing is than the author of the most commonly used mutation testing tool for python.
If you're talking about the author of mutmut, he created the tool, but not the practice. Mutation testing was first proposed by Richard Lipton in the early 1970s. Many tools were created for it later, only one of which is mutmut.
> You're now arguing against the common use of established terms.
From my perspective, that's what you're doing.
> You have just demonstrated that you are impossible to have a meaningful discussion with, and that you will make every effort to not lose face instead of trying to learn. You have also demonstrated that you are willing to say absolutely idiotic things like "You can use hypothesis in all kinds of testing, not just property based testing, what's the deal? ". https://hypothesis.readthedocs.io/en/latest/ "Hypothesis is the property-based testing library for Python".
Sorry, I didn't realise "hypothesis" is the name of the library. I thought you used it as a regular English word. I understand you meant "If you use the hypothesis library, then it's property based testing"?
> You haven't won any argument here.
I'm not here to win arguments.

#

> What will you argue next? That "python" isn't really a programming language?
Straw man fallacy.
> You're now arguing against the common use of established terms.
Fundamental attribution error.
> You are extremely arrogant, and refuse to listen,
Argumentum ad hominem.

proud nebula Oct 18, 2025, 8:35 AM

#

molten hollow > You are also arguing that you know better what mutation testing is than the au...

"He". The word you should have used is "you". And obviously I know that. I'm him 🤣 I have in fact found bugs using MT. So that falsifies your thesis above. It also does in fact help with better structure and it can show you places you need to refactor, again falsifying a previous statement you made in great confidence. It is pretty obvious you have never practiced MT.

hypothesis is a lib

Ok, but maybe the fact that the grammar doesn't make sense if you used the word in the normal sense should have made you confused enough to ask a question instead of being arrogant?

#

Also it's like the only PBT lib for python so if you had tried PBT at all you should know about it. Again: you have obviously read a lot of theory, and have much less practical understanding and experience.

molten hollow Oct 18, 2025, 9:04 AM

#

> "He". The word you should have used is "you". And obviously I know that. I'm him 🤣 I have in fact found bugs using MT. So that falsifies your thesis above. It also does in fact help with better structure and it can show you places you need to refactor, again falsifying a previous statement you made in great confidence. It is pretty obvious you have never practiced MT.
I have actually practiced mutation testing every week for the past couple of years, and everything I say in this channel is backed by practice.

I can agree that you found a bug while using mutation testing, but I doubt it was actually with mutation testing. Please, notice - mutation testing works by having a test suite, then you introduce a change in the software, and then you run the test again. The thing that mutation testing gives you, is it validates your test suite. I can agree that while doing that, you stumbled upon a bug and you fixed it? That works, but that's not due to mutation testing being used. That's due to having a test suite.
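
For readers unfamiliar with the loop being described (have a test suite, change the code, run the tests again), here is a minimal hand-rolled sketch. `is_adult`, the mutant, and both "suites" are hypothetical names invented for illustration; tools such as mutmut generate and run mutants automatically rather than by hand like this.

```python
def is_adult(age):
    """Original implementation."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: '>=' has been changed to '>'."""
    return age > 18


def weak_suite(func):
    """Checks only inputs far away from the boundary."""
    return func(30) is True and func(5) is False


def strong_suite(func):
    """Also checks the boundary value 18."""
    return weak_suite(func) and func(18) is True


# Both suites pass on the original code.
assert weak_suite(is_adult) and strong_suite(is_adult)

# The weak suite still passes on the mutant: the mutant "survives",
# which points at a gap in that suite.
mutant_survives_weak = weak_suite(is_adult_mutant)

# The strong suite fails on the mutant: the mutant is "killed".
mutant_killed_by_strong = not strong_suite(is_adult_mutant)
```

A surviving mutant is a prompt to look closer: the suite may be missing a case, or the mutated code may not matter. Which of those it is still takes a human to decide.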

molten hollow Oct 18, 2025, 9:06 AM

#

proud nebula "He". The word you should have used is "you". And obviously I know that. I'm him...

> "He". The word you should have used is "you". And obviously I know that. I'm him 🤣 I have in fact found bugs using MT. So that falsifies your thesis above. It also does in fact help with better structure and it can show you places you need to refactor, again falsifying a previous statement you made in great confidence. It is pretty obvious you have never practiced MT.
Good for you, but just because you created a library that can be used to exercise this idea doesn't really give you authority about its merit.

#

> Ok, but maybe the fact that the grammar doesn't make sense if you used the word in the normal sense should have made you confused enough to ask a question instead of being arrogant?
I'm sorry, but most of the things you mention in this channel are... calling for my concern.

#

You clearly have a bone to pick with me. I think so because you're using personal arguments all the time, instead of sticking to the subject matter. I don't have a problem with you as a person, but I don't agree with part of the things you say. I'm capable of having a reasonable debate, but not if someone uses argumentum ad hominem.

proud nebula Oct 18, 2025, 9:41 AM

#

You've made Ned visibly frustrated. That is extremely rare. You don't know his personality so you don't know what a red flag that is.

molten hollow Oct 18, 2025, 9:48 AM

#

proud nebula You've made Ned visibly frustrated. That is extremely rare. You don't know his p...

Why would that mean I'm wrong? I'm just gonna ignore any kind of non-substantive arguments from now on.

proud nebula Oct 18, 2025, 11:10 AM

#

molten hollow Why would that mean I'm wrong? I'm just gonna ignore any kind of non-meritoric a...

You ignore everything heh. It feels very Jordan Peterson to talk to you.

odd walrus Oct 18, 2025, 2:30 PM

#

I didn’t follow all that, is the assertion that bad tests can be written, therefore testing is not inherently valuable?

pulsar oracle Oct 18, 2025, 2:35 PM

#

odd walrus I didn’t follow all that, is the assertion that bad tests can be written, theref...

I believe the assertion that was made is that certain types of tests don't count as "tests" if you have to look at the result to determine what to do, i.e. if they're not definite. In this case, that's a test that runs nightly to check whether coverage.py is compatible with the latest Python version, or anything else that is not so definite in meaning.

odd walrus Oct 18, 2025, 2:35 PM

#

pulsar oracle I believe the assertion that was made is that certain types of tests don't count...

Oh I see, yeah. To me that is a “linter” check not an in-codebase unit test

#

In the sense that it’s pure validation that doesn’t really inform the shape of your codebase

pulsar oracle Oct 18, 2025, 2:38 PM

#

I'd have argued it's more like an exploratory test that is automated: send someone to go check if we're compatible with the latest external thing, and if we're not, go update our thing, do nothing, or go contact them to fix it. It's just that now the exploration and getting that result is automated. But I wasn't really in this argument so idk.

river pilot Oct 18, 2025, 2:43 PM

#

odd walrus Oh I see, yeah. To me that is a “linter” check not an in-codebase unit test

i had wanted to stay out of this to collect thoughts and let the heat die down, but: the test we're talking about checks if coverage.py produces correct results on Python 3.15 (let's say). I don't see how that's a linter.

odd walrus Oct 18, 2025, 2:44 PM

#

river pilot i had wanted to stay out of this to collect thoughts and let the heat die down, ...

Oh yes, in that context it’s actually a domain concern, my above take does not apply here.

#

I thought it was just a belt and suspenders thing

#

But no in that case, it’s exactly what your tests are for

#

For RubySpec we added a bunch of “guard” support so you could make tests not run on implementations that didn’t support that etc
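
A rough Python analogue of that kind of guard, sketched with the standard library's `unittest.skipIf` and `unittest.skipUnless`. The test names and version cutoffs are illustrative assumptions, not taken from RubySpec or from coverage.py's suite.

```python
import sys
import unittest


class GuardedTests(unittest.TestCase):
    # Guard on interpreter version: skipped where the feature does not exist.
    @unittest.skipIf(sys.version_info < (3, 9), "str.removeprefix needs Python 3.9+")
    def test_removeprefix(self):
        self.assertEqual("test_foo".removeprefix("test_"), "foo")

    # Guard on implementation: only meaningful on CPython.
    @unittest.skipUnless(sys.implementation.name == "cpython", "CPython-specific")
    def test_cpython_only(self):
        self.assertTrue(hasattr(sys, "getrefcount"))


# Run the guarded tests programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GuardedTests)
result = unittest.TestResult()
suite.run(result)
```

Skipped tests are reported separately in `result.skipped` and do not count as failures, so the same suite can run across interpreters that differ in what they support.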

pulsar oracle Oct 18, 2025, 2:45 PM

#

river pilot i had wanted to stay out of this to collect thoughts and let the heat die down, ...

But is it a check before you release the new version, if it fails you block it from releasing, or is it just a daily test on the latest version you don't officially have supported yet?

odd walrus Oct 18, 2025, 2:45 PM

#

If you don’t support 3.16 yet I don’t see why it would be tested in master branch CI

#

That should be on the 3.16 support branch

river pilot Oct 18, 2025, 2:46 PM

#

odd walrus If you don’t support 3.16 yet I don’t see why it would be tested in master branc...

I have a GitHub action that runs my test suite on the tip of main in the CPython repo, to get quick feedback about changes to Python.

odd walrus Oct 18, 2025, 2:46 PM

#

river pilot I have a GitHub action that runs my test suite on the tip of main in the CPython...

Hmm, ok, and is it set up to block a PR merge for example?

river pilot Oct 18, 2025, 2:48 PM

#

odd walrus Hmm, ok, and is it set up to block a PR merge for example?

no, it doesn't block PR merges. It's there to give me early warnings about changes to Python and a chance to discuss the change with CPython devs before it's cast in stone.

odd walrus Oct 18, 2025, 2:52 PM

#

river pilot no, it doesn't block PR merges. It's there to give me early warnings about chang...

Seems cool to me, yeah. And they thought that was not a “test”? To me it just sounds like an integration test or functional test, not a unit one, but certainly a test?

river pilot Oct 18, 2025, 2:52 PM

#

odd walrus Seems cool to me, yeah. And they thought that was not a “test”? To me it just so...

i agree.

odd walrus Oct 18, 2025, 2:53 PM

#

I wouldn’t actually be surprised to learn that Google has suites that take a probabilistic approach to deciding when to fail the whole “run”, given their scale

pulsar oracle Oct 18, 2025, 2:53 PM

#

In my view an integration test tests your compatibility against either a network-level mock of a system or a deployable/runnable thing that you want to fail the build if you're not compatible with, probably more specifically the actual thing.

odd walrus Oct 18, 2025, 2:54 PM

#

I don’t feel integration tests have any necessary thing to do with networks, I’ve written plenty that test CLI tool interactions etc

pulsar oracle Oct 18, 2025, 2:54 PM

#

Fair point. It could involve integration with other applications or mocks of them at the command line or anywhere else they communicate for real.

river pilot Oct 18, 2025, 2:55 PM

#

in my view (the view that started the whole discussion), categories of tests are talked about as hard-edged things, but they are often quite squishy. I'm fine calling these integration tests, or compatibility tests. You can also look at them as functional tests. But they are definitely some kind of test.

pulsar oracle Oct 18, 2025, 2:56 PM

#

river pilot in my view (the view that started the whole discussion), categories of tests are...

I personally agree with that. It very much strikes me as some type of test just outside of the normal stages of a pipeline.

river pilot Oct 18, 2025, 2:57 PM

#

people love to categorize things. It's useful sometimes to step back and ask, why are we categorizing them? How will the categories help us understand? Maybe we don't need to categorize as much as we do, or maybe we need different kinds of categories.

river pilot Oct 18, 2025, 2:57 PM

#

pulsar oracle I personally agree with that. It very much strikes me as some type of test just ...

in this case, it's the exact same tests as the ones that run on every PR, but using different builds of Python nightly.

pulsar oracle Oct 18, 2025, 3:07 PM

#

river pilot people love to categorize things. It's useful sometimes to step back and ask, wh...

I personally live by three and with more loose/pragmatic definitions. If I write a function that is simple enough in scope to just write something to a file or put something in a directory, is it a unit test if I bring in the filesystem? It's not just in memory so some would argue no, but pragmatically I consider it a unit test because it's always there. An integration test for me has several meanings because of what other people refer to it as, like an end to end test, a functional test, etc. And then there's acceptance tests, aka functional. But I do think categories help when pragmatism is applied and I pretty much just translate/infer what people mean when they say one or the other. And then there's testing ideas out manually or manual testing and feedback, though it doesn't fail anything.

river pilot Oct 18, 2025, 4:01 PM

#

pulsar oracle I personally live by three and with more loose/pragmatic definitions. If I write...

this does sound like a practical approach, but even here: i'm interested to know, what does it matter whether it's a unit test or an integration test? When does that question come up day-to-day? What do you do with that information?

#

and I'm not trying to say, don't categorize. I'm trying to explore why we do it.

odd walrus Oct 18, 2025, 4:05 PM

#

To me it’s always about coverage (real coverage not C0)

river pilot Oct 18, 2025, 4:06 PM

#

odd walrus To me it’s always about coverage (real coverage not C0)

what is C0?

odd walrus Oct 18, 2025, 4:06 PM

#

river pilot what is C0?

The coverage you get from 'code coverage' tools, where it only knows which lines executed, not which actual semantics took place.

#

C1 is per-statement, C2 is like per-side-effect or something? I can't remember the exact hierarchy

river pilot Oct 18, 2025, 4:07 PM

#

I don't know of a tool that does lines but not statements?

pulsar oracle Oct 18, 2025, 4:07 PM

#

I think I do it to set the context of what I'm doing. If I say integration test it sort of indicates in my mind we're testing compatibility of some sort of real software, maybe bringing in testcontainers for a database or other application. The lines can get blurred but to me I always think about it because it sets the scene, acceptance test comes up because they prove that my application works, and I can stretch the definition and use synonyms like end to end test, or even integration if that's the goal someone is going for if it's integration with the client to the server to the database, and so on. If I write code that is supposed to get today's weather I'd unit test with a mock that code using that, unit making me think mocks of that sort, it proves that thing works but it doesn't tell me getting the weather from weather.com as we know it will work so I'd have to take my actual implementation and see if it works against the API or website data as we know it, testing integration. If we say acceptance I'm thinking how do we test the application as a whole? In my mind it comes up every day for these three.
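
A minimal sketch of the unit-test half of that weather example, assuming the code takes its HTTP fetcher as a parameter so a mock can stand in for it. `todays_forecast`, the `weather.example` URL, and the response fields are all hypothetical.

```python
from unittest.mock import Mock


def todays_forecast(fetch_json, city):
    """Build a short forecast string from a weather API response.

    `fetch_json` is any callable that takes a URL and returns a dict.
    Injecting it lets a unit test replace the real network call.
    """
    data = fetch_json(f"https://weather.example/api/today?city={city}")
    return f"{city}: {data['temp_c']}C, {data['summary']}"


# Unit test: the network is replaced by a mock, so only our own logic runs.
fake_fetch = Mock(return_value={"temp_c": 21, "summary": "sunny"})
forecast = todays_forecast(fake_fetch, "Oslo")

# The mock also records how it was called, which we can verify.
fake_fetch.assert_called_once_with("https://weather.example/api/today?city=Oslo")
```

This shows the formatting logic works for the response shape we assumed. It says nothing about whether the real service still returns that shape, which is the gap an integration test against the actual API is meant to cover.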

river pilot Oct 18, 2025, 4:08 PM

#

pulsar oracle I think I do it to set the context of what I'm doing. If I say integration test ...

this seems useful to me: the category helps set the goal and the approach.

pulsar oracle Oct 18, 2025, 4:11 PM

#

river pilot this seems useful to me: the category helps set the goal and the approach.

It doesn't seem to me that everything can be categorized (at this point in time as we know it) and is black and white. In the scenario that you're not compatible with python 3.13 and it should fail the build, it's basically an integration test in purpose but running unit tests, and when it comes to writing it I can see how categorization might not really be helpful.

proud nebula Oct 18, 2025, 4:13 PM

#

odd walrus To me it’s always about coverage (real coverage not C0)

What is mutation coverage in that hierarchy? Are lower numbers more or less coverage? :)

river pilot Oct 18, 2025, 4:14 PM

#

pulsar oracle It doesn't seem to me that everything can be categorized (at this point in time ...

i really appreciate your pragmatic approach to this.

odd walrus Oct 18, 2025, 4:20 PM

#

proud nebula What is mutation coverage in that hierarchy? Is lower numbers more or less cover...

Haha great question, I guess that’s like fractal dimensions between the regular tiers? 🙂

proud nebula Oct 18, 2025, 4:30 PM

#

odd walrus Haha great question, I guess that’s like fractal dimensions between the regular ...

100% mutation coverage means all behavior of the code is tested.

odd walrus Oct 18, 2025, 4:31 PM

#

proud nebula 100% mutation coverage means all behavior of the code is tested.

Yeah, you mean mutation testing where every non-keyword gets eventually torqued? I agree

#

But I’ve also never seen a full passing suite like that without major exclusions

odd walrus Oct 18, 2025, 4:35 PM

#

proud nebula 100% mutation coverage means all behavior of the code is tested.

Like, this is conceptually true but you can’t run every possible mutated program in a CI suite

proud nebula Oct 18, 2025, 5:28 PM

#

odd walrus Yeah, you mean mutation testing where every non-keyword gets eventually torqued?...

We mutate some keywords in mutmut.

odd walrus Oct 18, 2025, 5:28 PM

#

proud nebula We mutate some keywords in mutmut.

Neat

proud nebula Oct 18, 2025, 5:28 PM

#

odd walrus Like, this is conceptually true but you can’t run every possible mutated program...

People keep asking to run mutmut in CI, which I always tell people to stop doing because it's stupid :P

#

It's a tool to fix your test suite, but running it in CI all the time is a huge waste of resources unless you are very careful how you do it and think about it deeply.

odd walrus Oct 18, 2025, 5:29 PM

#

proud nebula People keep asking to run mutmut in CI, which I always tell people to stop doing...

What do you feel the right way to integrate it into your team’s workflow is?

#

Just as needed to audit the test suite?

proud nebula Oct 18, 2025, 5:30 PM

#

odd walrus What do you feel the right way to integrate it into your team’s workflow is?

Interactive use. Mutmut3 has a super nice interactive mode now too, which makes this very enjoyable compared to before.

#

And highly selectively where it's critical only, or you care for some other reason.

thorny cave Oct 18, 2025, 5:31 PM

#

whats this channel for

proud nebula Oct 18, 2025, 5:31 PM

#

I run MT on iommi sometimes out of hobby level professional pride. But that's not extremely rational use of time :P

river pilot Oct 18, 2025, 5:31 PM

#

thorny cave whats this channel for

"Everything related to testing your Python applications and libraries, and discussion of testing as a whole."

proud nebula Oct 18, 2025, 5:32 PM

#

odd walrus What do you feel the right way to integrate it into your team’s workflow is?

For example, I wrote this about one scenario when I used MT in code that I wanted to run in production: https://kodare.net/2021/04/04/safe_number_parsing.html

thorny cave Oct 18, 2025, 5:35 PM

#

river pilot "Everything related to testing your Python applications and libraries, and discu...

ok

odd walrus Oct 18, 2025, 5:36 PM

#

proud nebula For example, I wrote this about one scenario when I used MT in code that I wante...

Oh this looks really good, I’m gonna post this a few places

#

Thanks

proud nebula Oct 18, 2025, 5:39 PM

#

odd walrus Oh this looks really good, I’m gonna post this a few places

I have a lot more good content on my blog imo. And it's all pretty short :P

odd walrus Oct 18, 2025, 5:40 PM

#

proud nebula I have a lot more good content on my blog imo. And it's all pretty short :P

Prepare for… promotion.

proud nebula Oct 18, 2025, 5:43 PM

#

odd walrus Prepare for… _promotion_.

I'd love some. I am horrible at marketing heh.

#

I mean.. just look at iommi, which imo would absolutely revolutionize web development if people embraced it. And I'm not seeing very many users at all :/

odd walrus Oct 18, 2025, 5:46 PM

#

proud nebula I'd love some. I am horrible at marketing heh.

https://www.linkedin.com/posts/wilson-bilkovich_safe-number-parsing-activity-7385370521061834752-BWd3

ember maple Oct 19, 2025, 5:17 AM

#

proud nebula I mean.. just look at iommi, which imo would absolutely revolutionize web develo...

That needs different marketing and docs for sure

proud nebula Oct 19, 2025, 6:55 AM

#

ember maple That needs different marketing and docs for sure

I think the docs are pretty good. Or at least there's a lot of it :) I have considered some kind of marketing landing page but I'm no good with design.

ember maple Oct 20, 2025, 1:45 PM

#

proud nebula I think the docs are pretty good. Or at least there's a lot of it :) I have cons...

Last time i took a look i found it a pain. I'm happy to do a brainstorm with you to make the docs better

proud nebula Oct 20, 2025, 2:38 PM

#

ember maple Last time i took a look i found it a pain im happy to do a brainstorm with you t...

I would very much like that. My focus so far has been mostly on correctness and volume. But I'm hitting hard diminishing returns on that. You can't go beyond all examples working :)

proud nebula Oct 20, 2025, 3:07 PM

#

I really need to figure out when and how to point people to the Equivalence page. That is really key to making things click.

river pilot Oct 20, 2025, 3:11 PM

#

proud nebula I really need to figure out when and how to point people read the Equivalence pa...

what page is that?

proud nebula Oct 20, 2025, 3:48 PM

#

river pilot what page is that?

https://docs.iommi.rocks/equivalency.html If this thing, that __ is just a general purpose short form for nesting hasn't clicked, you're going to have a tough time with iommi.

river pilot Oct 20, 2025, 3:49 PM

#

proud nebula https://docs.iommi.rocks/equivalency.html If this thing, that __ is just a g...

oh, i thought we were talking about docs for mutmut.

proud nebula Oct 20, 2025, 3:50 PM

#

river pilot oh, i thought we were talking about docs for mutmut.

ah. Well, those might need improvement too heh. MT in general needs more hype (than PBT :P)

proud nebula Oct 24, 2025, 9:16 PM

#

ember maple That needs different marketing and docs for sure

https://iommi.rocks new hero page online now at least. So a bit of a marketing push I hope.

proud nebula Oct 28, 2025, 5:50 PM

#

ember maple That needs different marketing and docs for sure

If you have any more feedback, I'd love to hear it. It's super hard to write docs when you are so deep in it...

ember maple Oct 28, 2025, 5:52 PM

#

proud nebula If you have any more feedback, I'd love to hear it. It's super hard to write doc...

I think I'll need to schedule that, I'm a bit stretched between too many things atm

proud nebula Oct 28, 2025, 5:53 PM

#

ember maple I think ill need to shedule that im a bit stretched between too many things atm

No worries. Just curious if you had something off the top of your head.

river pilot Oct 29, 2025, 12:02 PM

#

https://hachyderm.io/@nedbat/115457328805396573

Ned Batchelder (@nedbat@hachyderm.io)

Nice! My coverage dropped from 94.370% to 94.366%!
(I deleted code that had been covered, but was no longer needed)

github.com/nedbat/coveragepy/c…

github.com/nedbat/coverage-rep…

twin shale Oct 29, 2025, 10:29 PM

#

How I would like test decorators to work:

@test
@test.params(a=(1, 2, 4), b=(100, 150))
@test.params(a=(8,), b=(50, 100))
def test_add_two_numbers(a, b):
    assert myadd(a, b) == a + b

And this would expand into 3*2 + 1*2 = 8 tests

river pilot Oct 29, 2025, 11:20 PM

#

twin shale How I would like test decorators to work: ```py @test @test.params(a=(1, 2, 4),...

@pytest.mark.parametrize() does this

twin shale Oct 30, 2025, 5:58 AM

#

river pilot `@pytest.mark.parametrize()` does this

No, it does it in a different way, using strings and no automatic cross-product, as far as I've seen.

testdata = [
    (datetime(2001, 12, 12), datetime(2001, 12, 11), timedelta(1)),
    (datetime(2001, 12, 11), datetime(2001, 12, 12), timedelta(-1)),
]


@pytest.mark.parametrize("a,b,expected", testdata)
def test_timedistance_v0(a, b, expected):
    diff = a - b
    assert diff == expected

proud nebula Oct 30, 2025, 6:26 AM

#

twin shale No, it does it In different way, using strings and no automatic cross-product as...

Look at how parametrize is implemented. You can implement this yourself.

twin shale Oct 30, 2025, 6:27 AM

#

That's what I have, per above 😊

swift pewter Oct 30, 2025, 6:28 AM

#

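import itertools

import pytest

def params(**kwargs):
    def deco(fn):
        return pytest.mark.parametrize(
            tuple(kwargs),
            # unpack with * so product() crosses the value tuples with each other
            list(itertools.product(*kwargs.values())),
        )(fn)
    return deco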

?

twin shale Oct 30, 2025, 6:28 AM

#

I'm mostly against the separation of the test parameter name and its values. Having a comma separated string is also not very convenient.

swift pewter Oct 30, 2025, 6:29 AM

#

You don't have to comma-separate, you can also provide a tuple of strings

river pilot Oct 30, 2025, 8:59 AM

#

twin shale I'm mostly against the separation of the test parameter name and its values. Hav...

you can stack two parametrize decorators, and they cross-product.

twin shale Oct 30, 2025, 9:11 AM

#

river pilot you can stack two parametrize decorators, and they cross-product.

Is that what you want though?

river pilot Oct 30, 2025, 9:14 AM

#

twin shale Is that what you want though?

maybe I don't understand what you want. I thought it was cross-product.

river pilot Oct 30, 2025, 9:16 AM

#

twin shale Is that what you want though?

i guess your example was not cross-product

twin shale Oct 30, 2025, 9:19 AM

#

Right, cross within one decorator, "addition" between

#

But I didn't know that the parametrize decorator cross produced at all, that's good to know.

#

On another note, can someone explain hamcrest's logo? 😅

river pilot Oct 30, 2025, 9:23 AM

#

it looks like a person surfing down a pile of ham? Which makes sense for the name, but why the name?

spark thicket Oct 30, 2025, 1:53 PM

#

Any good documentation for writing tests(Integration, E2E, Unit) in a none TDD architecture?

molten hollow Oct 30, 2025, 2:08 PM

#

spark thicket Any good documentation for writing tests(Integration, E2E, Unit) in a none TDD a...

What is "none TDD architecture"? 🤔

molten hollow Oct 30, 2025, 2:09 PM

#

twin shale How I would like test decorators to work: ```py @test @test.params(a=(1, 2, 4),...

Wouldn't it be better to just explicitly define the tests?

twin shale Oct 30, 2025, 2:11 PM

#

molten hollow Wouldn't it be better to just explicitly define the tests?

What do you mean? Sweeping test parameters is a great way to get good test coverage (and avoid missing some case). If possible you can also randomly select test input (like the hypothesis package does)

tall brook Oct 30, 2025, 2:13 PM

#

spark thicket Any good documentation for writing tests(Integration, E2E, Unit) in a none TDD a...

https://www.obeythetestinggoat.com/pages/book.html#toc

river pilot Oct 30, 2025, 2:29 PM

#

@twin shale i don't know of a test decorator that works the way you showed, but it seems like it should be possible to write.
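
A minimal sketch of such a decorator, assuming the semantics described above: cross product within one `params` call, addition between stacked calls. Everything here is hypothetical and uses only the standard library; `fn.cases`, `run_cases`, and the `check_` prefix are made up for illustration, and a real version would more likely hand the collected cases to `pytest.mark.parametrize`.

```python
import itertools


def params(**axes):
    """Attach the cross product of the keyword axes to the decorated function.

    Stacked uses append their cases, so totals add up across decorators.
    """
    names = tuple(axes)
    combos = [dict(zip(names, values)) for values in itertools.product(*axes.values())]

    def deco(fn):
        # Decorators apply bottom-up, so prepend to keep source order.
        fn.cases = combos + getattr(fn, "cases", [])
        return fn

    return deco


def myadd(a, b):
    return a + b


@params(a=(1, 2, 4), b=(100, 150))  # 3 * 2 = 6 cases
@params(a=(8,), b=(50, 100))        # 1 * 2 = 2 cases
def check_add_two_numbers(a, b):
    assert myadd(a, b) == a + b


def run_cases(fn):
    """Call fn once per collected case and return how many cases ran."""
    for case in fn.cases:
        fn(**case)
    return len(fn.cases)
```

Calling `run_cases(check_add_two_numbers)` runs all 8 collected cases, matching the 3*2 + 1*2 count from the original message.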

molten hollow Oct 30, 2025, 5:19 PM

#

twin shale What do you mean? Sweeping test parameters is a great way to get good test cover...

Wouldn't you get just the same coverage by having explicit tests?

river pilot Oct 30, 2025, 5:20 PM

#

molten hollow Wouldn't you get just the same coverage by having explicit tests?

you don't use parameterized tests? They are very handy for reducing repetition.

molten hollow Oct 30, 2025, 5:23 PM

#

I did in the past, but I noticed that they can make your design weaker.

Imagine you have two cases, that appear similar at first - so you write them as one test, and parametrize it to "reduce duplication". But then, after working with the code a bit you discover they aren't really the same idea, so you should split it. Maybe you split them and have two tests anyway, or maybe you're lazy and leave the tests like that, but without the difference covered.

#

Real design is about organizing expected results, information flow, compartmentalization of the system, information hiding, separation of concern, and reduction of information. That's what gives your programs a real edge.

#

Just joining two test cases into one with parametrization is nothing but syntax sugar, not a very helpful one at that imo.

river pilot Oct 30, 2025, 5:24 PM

#

Sure, it might be misapplied, but it's very common to have a dozen data scenarios for the same test. I wouldn't want a dozen tests.

molten hollow Oct 30, 2025, 5:25 PM

#

river pilot Sure, it might be misapplied, but it's very common to have a dozen data scenario...

My take is that if you have "dozen data inputs" that really means it's just one test, and one data input might suffice.

#

You don't get any real benefit from including more; and if you do, that really means it's a different test case worthy of a dedicated method and a proper test name, because that's a different behaviour.

river pilot Oct 30, 2025, 5:29 PM

#

Here's an example where i've parametrized: https://github.com/nedbat/coveragepy/blob/master/tests/test_misc.py#L85-L107 . They check different behaviors of the one function, so i could have made separate tests with individual names, but the body of the test would be the same. This let me be more concise while covering all the behavior.

molten hollow Oct 30, 2025, 5:29 PM

#

Rule of thumb:

- if it's one behaviour, one data input will suffice, no need for parametrization
- if it's multiple behaviours, it's better to split them into multiple tests, no need for parametrization

river pilot Oct 30, 2025, 5:29 PM

#

i guess we'll have to disagree on this.

molten hollow Oct 30, 2025, 5:31 PM

#

VARS = {
    "FOO": "fooey",
    "BAR": "xyzzy",
}


@pytest.mark.parametrize(
    "before, after",
    [
        ("Nothing to do", "Nothing to do"),
        ("Dollar: $$", "Dollar: $"),
        ("Simple: $FOO is fooey", "Simple: fooey is fooey"),
        ("Braced: X${FOO}X.", "Braced: XfooeyX."),
        ("Missing: x${NOTHING}y is xy", "Missing: xy is xy"),
        ("Multiple: $$ $FOO $BAR ${FOO}", "Multiple: $ fooey xyzzy fooey"),
        ("Ill-formed: ${%5} ${{HI}} ${", "Ill-formed: ${%5} ${{HI}} ${"),
        ("Strict: ${FOO?} is there", "Strict: fooey is there"),
        ("Defaulted: ${WUT-missing}!", "Defaulted: missing!"),
        ("Defaulted empty: ${WUT-}!", "Defaulted empty: !"),
    ],
)
def test_substitute_variables(before: str, after: str) -> None:
    assert substitute_variables(before, VARS) == after

If I understand correctly, all of these cases are different behaviours.

river pilot Oct 30, 2025, 5:31 PM

#

molten hollow ``` VARS = { "FOO": "fooey", "BAR": "xyzzy", } @pytest.mark.parametriz...

yes, but it doesn't make things clearer to have ten functions with separate names but the same single-line body.

molten hollow Oct 30, 2025, 5:32 PM

#

river pilot yes, but it doesn't make things clearer to have ten functions with separate name...

Are you sure about that? 😄

river pilot Oct 30, 2025, 5:32 PM

#

molten hollow Are you sure about that? 😄

yes

molten hollow Oct 30, 2025, 5:32 PM

#

Let me show you how I would've created that method.

#

besides, you're missing some test cases, which I would've included.

river pilot Oct 30, 2025, 5:33 PM

#

I'd be happy to add the missing cases.

molten hollow Oct 30, 2025, 5:34 PM

#

river pilot I'd be happy to add the missing cases.

That's the point, in the current form of this test,you can't. Give me a minute, I'll show you

river pilot Oct 30, 2025, 5:35 PM

#

molten hollow That's the point, in the current form of this test,you can't. Give me a minute, ...

the error cases are handled in the next test

#

(well, one error case)

molten hollow Oct 30, 2025, 5:35 PM

#

Parametrized case for one input. Interesting.

#

Well, still. Let me show you how I would've written that, and what cases are missing for me, given I would test-drive that.

#

Actually, the more I read those examples, the less I understand what it's actually supposed to be doing 😐

river pilot Oct 30, 2025, 5:40 PM

#

molten hollow Actually, the more I read those examples, the less I understand what it's actual...

there's a docstring on the function

molten hollow Oct 30, 2025, 5:40 PM

#

For example, I read those parametrized tests, and have no idea what the ? is doing.

molten hollow Oct 30, 2025, 5:40 PM

#

river pilot there's a docstring on the function

I should be able to understand what the function does from the tests. Doc strings can lie, if you forget to update them.

river pilot Oct 30, 2025, 5:41 PM

#

molten hollow I should be able to understand what the function does from the tests. Doc string...

what would i put in the test to make it clear? A comment?

molten hollow Oct 30, 2025, 5:41 PM

#

I'm glad to show you an example, but first I need to understand what your function does.

#

So, if I understand correctly, if the string doesn't have a placeholder, it's returned as is, correct?

river pilot Oct 30, 2025, 5:42 PM

#

yes

molten hollow Oct 30, 2025, 5:42 PM

#

Also, if there's a superfluous variable in the dictionary, that's supposed to be ignored or throw error for missing placeholder?

river pilot Oct 30, 2025, 5:42 PM

#

ignored

molten hollow Oct 30, 2025, 5:42 PM

#

okay,

#

format of the placeholder are either $Foo or ${Foo}, and it doesn't change anything, that's just notation, correct?

river pilot Oct 30, 2025, 5:43 PM

#

yes

molten hollow Oct 30, 2025, 5:43 PM

#

okay, ? question mark inside the braces means... what exactly? I can't tell.

#

Also, format ${Name-default} means you either read Name from the vars, or if it's missing, then insert the default, correct?

river pilot Oct 30, 2025, 5:44 PM

#

yes

molten hollow Oct 30, 2025, 5:44 PM

#

molten hollow okay, `?` question mark inside the braces means... what exactly? I can't tell.

What does the question mark mean?

river pilot Oct 30, 2025, 5:45 PM

#

https://github.com/nedbat/coveragepy/blob/master/coverage/misc.py#L230-L244

molten hollow Oct 30, 2025, 5:59 PM

#

Something like that would be my tests:

river pilot Oct 30, 2025, 5:59 PM

#

Can you pastebin that as text?

#

(also, have to be afk for at least an hour)

molten hollow Oct 30, 2025, 6:00 PM

#

I sent a screenshot to say that the content of text isn't that big compared to your test, but you gain a lot of information and clarification.

#

from pytest import raises

def test_substitute_variables_in_text_with_their_values():
    text = substitute_variables('Hello $Name, ($Age)', {'Name': 'John', 'Age': '14'})
    assert text == 'Hello John, (14)'

def test_variable_has_shell_format__simple_placeholder():
    assert substitute_variables('$Foo', {'Foo': 'Bar'}) == 'Bar'

def test_variable_has_shell_format__braced_placeholder():
    assert substitute_variables('${Foo}', {'Foo': 'Bar'}) == 'Bar'

def test_simple_format__given_value_not_exists__returns_empty_string():
    assert substitute_variables('${Missing}', {}) == ''

def test_strict_format__given_value_exists__passes():
    assert substitute_variables('${Foo?}', {'Foo': 'Bar'}) == 'Bar'

def test_strict_format__given_value_not_exists__fails():
    with raises(Exception):
        substitute_variables('${Missing?}', {})

def test_default_format__given_value_exists__returns_value():
    assert substitute_variables('${Foo-default}', {'Foo': 'Bar'}) == 'Bar'

def test_default_format__given_value_not_exists__returns_default():
    assert substitute_variables('${Missing-default}', {}) == 'default'

def test_encode_dollar_sign__with_two_dollar_signs():
    assert substitute_variables('$$', {}) == '$'

def test_malformed_placeholder__double_braces__is_not_substituted():
    assert substitute_variables('${{Foo}}', {'Foo': 'Bar'}) == '${{Foo}}'

def test_malformed_placeholder__non_letter__percent_sign__is_not_substituted():
    assert substitute_variables('${%Foo}', {'Foo': 'Bar'}) == '${%Foo}'

def test_malformed_placeholder__non_letter__digit__is_not_substituted():
    assert substitute_variables('${5}', {'5': 'Bar'}) == '${5}'

def test_malformed_placeholder__not_closed_brace__is_not_substituted():
    assert substitute_variables('${Foo', {'Foo': 'Bar'}) == '${Foo'

river pilot Oct 30, 2025, 6:00 PM

#

We can agree to disagree on this point

molten hollow Oct 30, 2025, 6:01 PM

#

- First test shows an example of what the function is for. Reader knows exactly how to use that function and what it can be good for.
- Each case is described with given input, no need to include "Default", "Strict" into the test data. Test names should be test names, test data should be test data.
- Passed in values are specifically designed for each scenario. Also, minimal values are inserted into the function to illustrate the behaviour.

#

You can easily copy each of those methods, and adapt to your needs. With parametrized test, if you'd like to do something slightly different, then that's not so simple.

#

Case of superfluous arguments is now explicitly tested.

#

Test names specify what needs to happen: "is_not_substituted", "returns_default", "fails". Even if test data isn't descriptive enough, test names will tell you what the intent behind the test is.

#

Now, I propose to you @river pilot show any programmer your test, and my test, and try to guess which version will be easier to understand for him.

#

With tests like that, you don't really need a function doc, because the tests contain all the information about the function you need. Plus, tests are actually executed and asserted, while a function doc can become out of date.

#

BTW, this question mark, wouldn't it make more sense to be the other way around? Like ${Foo?} should return an empty string, and ${Foo} would throw for the missing value? 🤔 Just a suggestion.

twin shale Oct 30, 2025, 6:52 PM

#

molten hollow Wouldn't you get just the same coverage by having explicit tests?

I'm not sure I understand the difference, except that you would write a lot more boilerplate or duplicated code if you write them as separate tests. And it would be much harder to get an overview of what you test. For some scenarios, not using parametrization is really unfeasible.

Having 1 test expand to >100 is not uncommon.

twin shale Oct 30, 2025, 6:54 PM

#

molten hollow You don't get any real benefit from including more; and if you do, that really m...

This is just not true. But it all depends on what you are testing.

river pilot Oct 30, 2025, 7:09 PM

#

@molten hollow thanks for the examples. I find long test names like that hard to read. I'd rather have a comment than underscored sentences. The question mark behavior is borrowed from the shell: this function implements a subset of shell variable expansion behavior. That is mentioned in the docstring.
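
For readers who have not met the shell forms being discussed, here is a simplified stand-in that reproduces the behaviour listed in the parametrized table above. It is a sketch written for this thread, not coverage.py's actual code: the regex, the group names, and the choice of `ValueError` for the strict case are all assumptions.

```python
import re

# Recognised forms: $$, $WORD, ${WORD}, ${WORD?} (strict), ${WORD-default}
_DOLLAR = re.compile(
    r"""
    \$(?:
        (?P<dollar>\$)
      | (?P<plain>\w+)
      | \{(?P<braced>\w+)(?:(?P<strict>\?)|-(?P<default>[^}]*))?\}
    )
    """,
    re.VERBOSE,
)


def substitute_variables(text, variables):
    """Expand shell-style placeholders in text using the variables mapping."""

    def replace(match):
        if match["dollar"]:
            return "$"  # "$$" is an escaped dollar sign
        name = match["plain"] or match["braced"]
        if name in variables:
            return variables[name]
        if match["strict"]:
            # "${NAME?}" insists that NAME is defined (hypothetical error type)
            raise ValueError(f"Variable {name} is undefined: {text!r}")
        # "${NAME-default}" falls back to the default, otherwise to nothing
        return match["default"] or ""

    return _DOLLAR.sub(replace, text)


VARS = {"FOO": "fooey", "BAR": "xyzzy"}
```

Anything that does not match one of the recognised forms, such as `${%5}` or a trailing `${`, is left untouched because the pattern simply does not match there.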

twin shale Oct 30, 2025, 7:14 PM

#

Sometimes a test just tests that two implementations give the same result.

Sometimes a test has no documentation value but is just there to avoid someone accidentally making a mistake.

Sometimes you need test coverage, which can mean testing each bit in a 64-bit integer.

Sometimes test space is just so big that you can't feasibly write manual tests to cover it.
Sometimes sweeping or exhaustive testing is not an option either - it would take too much time to cover. Here random testing is the way to go.
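
A small sketch of the "two implementations" and "random testing" points combined, assuming a bit-counting function as the subject. Both function names are made up for illustration; the seed is fixed so that a failing input can be reproduced.

```python
import random


def popcount_reference(value: int) -> int:
    """Obvious reference: count the '1' characters in the binary text."""
    return bin(value).count("1")


def popcount_fast(value: int) -> int:
    """Kernighan's trick: each step clears the lowest set bit."""
    count = 0
    while value:
        value &= value - 1
        count += 1
    return count


def check_random_agreement(samples: int = 1000, seed: int = 0) -> int:
    """Compare both implementations on random 64-bit inputs."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(samples):
        value = rng.getrandbits(64)
        # Report the offending input in hex if the two ever disagree.
        assert popcount_fast(value) == popcount_reference(value), hex(value)
    return samples
```

The 64-bit input space is far too large to sweep, so the check samples it instead; a property-based library such as hypothesis automates the same idea.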

twin shale Oct 30, 2025, 7:17 PM

#

river pilot @molten hollow thanks for the examples. I find long test names like that ...

I agree, good docstrings with human language and ascii art if needed 🤓

molten hollow Oct 30, 2025, 7:35 PM

#

My point in all of that, is this:

- parametrized tests are just syntax sugar for multiple test cases. You don't lose much by splitting the parameters into multiple test cases.
- good tests are descriptive: they should tell you what you're after. Parametrized tests tend to hide that, while explicit tests tend to express that
- if it's hard for you to write multiple test cases, try setting up snippets/live templates in your IDE, so that you can type just "test" + Tab like in PyCharm, and that will insert a test snippet.
- it's easier to understand specific behaviours if they're explicitly stated,
- with parametrized tests, you need to conform all your test cases into one form with parameters; with explicit tests, you're free to express them in such a way that makes it clear for the reader.

#

> thanks for the examples. I find long test names like that hard to read. I'd rather have a comment than underscored sentences.

@river pilot No problem, you can shorten the names if you'd like:

from pytest import raises

def test_substitute_variables_in_text():
    text = substitute_variables('Hello $Name, ($Age)', {'Name': 'John', 'Age': '14'})
    assert text == 'Hello John, (14)'

def test_simple_placeholder():
    assert substitute_variables('$Foo', {'Foo': 'Bar'}) == 'Bar'

def test_braced_placeholder():
    assert substitute_variables('${Foo}', {'Foo': 'Bar'}) == 'Bar'

def test_value_not_exists_returns_empty():
    assert substitute_variables('${Missing}', {}) == ''

def test_strict_passes():
    assert substitute_variables('${Foo?}', {'Foo': 'Bar'}) == 'Bar'

def test_strict_fails():
    with raises(Exception):
        substitute_variables('${Missing?}', {})

def test_default_returns_value():
    assert substitute_variables('${Foo-default}', {'Foo': 'Bar'}) == 'Bar'

def test_default_returns_default():
    assert substitute_variables('${Missing-default}', {}) == 'default'

def test_encode_dollar():
    assert substitute_variables('$$', {}) == '$'

def test_malformed_double_braces():
    assert substitute_variables('${{Foo}}', {'Foo': 'Bar'}) == '${{Foo}}'

def test_malformed_percent_sign():
    assert substitute_variables('${%Foo}', {'Foo': 'Bar'}) == '${%Foo}'

def test_malformed_digit():
    assert substitute_variables('${5}', {'5': 'Bar'}) == '${5}'

def test_malformed_not_closed_brace():
    assert substitute_variables('${Foo', {'Foo': 'Bar'}) == '${Foo'

#

You can add comments to it if you'd like, but please note that these test cases aren't all the same structure.

#

The way I see it, turning that back into parametrized values would lose information that is otherwise helpful.

#

Some people might say that you can save 4 or 6 lines, by condensing them back into parametrized test cases, as if it's the lines of code that makes it hard to maintain. It's not. It's the thinking. The harder it is to think about the function, the harder it is to maintain.

#

And it's way easier to think of these behaviours if they're separate test cases like that.

#

But! There are things you can do to make them better:

- you can brainstorm the design with another programmer (pair programming)
- you can try to test-drive the implementation: loose coupling, and behaviour-driven (TDD)
- you can try to reimplement the same thing the next morning; chances are you're gonna come up with better functions
- try to reimplement the same functionality in another programming language, just to shift your mindset. It's not uncommon to come up with different solutions when changing perspective like that
- try to explain the function to non-programmers; sometimes non-programmers tend to ask questions which will change your point of view drastically, giving you a chance to redesign your approach
- try to only implement the features your program needs. For example, if your other functions only use a fraction of those features, try to keep them and remove the rest; chances are a simpler solution is waiting for you

As you can see, there are a lot of approaches which would improve the overall quality of the function and tests, and parametrizing inputs IMHO just isn't one of them.

molten hollow Oct 30, 2025, 7:46 PM

#

twin shale Sometimes a test just test that two implementations give the same result. Somet...

These sound like real cases, however I would solve each of them without parametrized tests. If you give me an example, I can show you how I would approach them.

pulsar oracle Oct 30, 2025, 7:49 PM

#

It's probably a preference thing in terms of this project. @river pilot said he personally finds them easier to read this way and I imagine the other people working on it do too.

I'd personally write them like yours and use the function names to communicate exactly what is being tested to make it all the clearer, and I'd probably use TDD to do it and organize by one clear thought or behavior at a time.

twin shale Oct 30, 2025, 7:53 PM

#

molten hollow My point in all of that, is this: - parametrized testing are just a syntax suga...

And I argue that this is not feasible if you need hundreds of test.

molten hollow Oct 30, 2025, 7:54 PM

#

twin shale And I argue that this is not feasible if you need hundreds of test.

Why hundreds? Maybe we can take a look at a specific example?

river pilot Oct 30, 2025, 7:54 PM

#

I can see why some people prefer the separate tests. I think it's a tradeoff, and we are choosing differently. To me, it's easier to see the behavior being tested in the compact parameterized form.

molten hollow Oct 30, 2025, 7:55 PM

#

river pilot I can see why some people prefer the separate tests. I think it's a tradeoff, an...

I supposed you might think that, because you wrote the thing. Someone who doesn't know the function, might have a different opinion. I'm wondering if you left the project for a year and came back to it, after forgetting what was there, would you still prefer the parametrized one or the split one. I would wager that you could get back to it quicker if it was split. Maybe we'll get to settle the wager one day in the future 😄 Who knows.

twin shale Oct 30, 2025, 7:56 PM

#

@params(x=range(64))
def test_count_set_bits(x):
    val = 1 << x
    assert count_set_bits(val) == 1

river pilot Oct 30, 2025, 7:56 PM

#

molten hollow I supposed you might think that, because you wrote the thing. Someone who doesn'...

did you read the docstring? You understood almost all of what it did.

molten hollow Oct 30, 2025, 7:57 PM

#

river pilot did you read the docstring? You understood almost all of what it did.

Yes, but as I mentioned, when code changes, the docstrings tend to become obsolete. There are cases where the comment says one thing, and the code says other. I actually have legacy code like that 😄 That's why I don't treat them as a valid "source of truth". The code will tell you the truth and the tests.

river pilot Oct 30, 2025, 7:58 PM

#

molten hollow Yes, but as I mentioned, when code changes, the docstrings tend to become obsole...

the names of tests are also comments that can be out of date.

molten hollow Oct 30, 2025, 7:58 PM

#

twin shale ``` @params(x=range(64)) def test count_set_bits(x): val = 1 << x assert...

@twin shale Sorry, just to be sure, it's count_set_bits(x), right? Not count_set_bits(val)?

molten hollow Oct 30, 2025, 7:59 PM

#

river pilot the names of tests are also comments that can be out of date.

That's true, but yet:

I see very often code and comment where they disagree
I don't see very often testnames that disagree with the test code

🤔 🙌

twin shale Oct 30, 2025, 7:59 PM

#

Oops, yes

molten hollow Oct 30, 2025, 8:01 PM

#

Okay, so this test.

@params(x=range(64))
def test_count_set_bits(x):
    val = 1 << x
    assert count_set_bits(val) == 1

@twin shale And please tell me, what is the intention behind this test? Are you trying to test-drive the count_set_bits() function?

twin shale Oct 30, 2025, 8:02 PM

#

I want to test the implementation. What do you mean by test drive?

molten hollow Oct 30, 2025, 8:02 PM

#

Because to me, it would seem that the only valid implementation of count_set_bits() is actually the code you have in your test, which is val = 1 << x. At which point, the test you have just checks that the code that you wrote is the code that you wrote.

pulsar oracle Oct 30, 2025, 8:03 PM

#

molten hollow That's true, but yet: - I see very often code and comment where they disagree -...

Both can suffer the same problem I think. But it's probably down to how detailed the names or descriptions are, and how likely behavior is to change. If you have an addition function and you start with a docstring it could be a pretty fine source of truth at least for how it should be. I read somewhere from Oracle for Java and documentation that the docstrings (or whatever they're called there) should lay out a testing plan sort of.

twin shale Oct 30, 2025, 8:03 PM

#

molten hollow Because to me, it would seam that the only valid implementation of `count_set_bi...

This is not true. The implementation is a physical chip.

#

Or fpga. Or at least not python code.

molten hollow Oct 30, 2025, 8:03 PM

#

Oh, so you're trying to test hardware?

twin shale Oct 30, 2025, 8:03 PM

#

Why not? 🙂

molten hollow Oct 30, 2025, 8:04 PM

#

So let me understand, you're creator of the hardware and trying to test it using pytest?

#

Or are you a user of a hardware and just need to check whether it works?

molten hollow Oct 30, 2025, 8:04 PM

#

twin shale Why not? 🙂

I thought we were talking about software development.

twin shale Oct 30, 2025, 8:05 PM

#

molten hollow Because to me, it would seam that the only valid implementation of `count_set_bi...

This also seems to misunderstand the function?

count_set_bits(0b1100_0000_0000_0000) would return 2

twin shale Oct 30, 2025, 8:06 PM

#

molten hollow Or are you a user of a hardware and just need to check whether it works?

Somewhat both, I'm a hw engineer

molten hollow Oct 30, 2025, 8:06 PM

#

I'm approaching this whole debate from the perspective of a software developer, who uses unit tests to improve the quality of the software 😄 I wasn't aware we're migrating from it into the hardware world.

#

To me, parametrizing tests, when it comes to application development, has the same flaws as using for-loops in tests.
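
The message does not say which flaws are meant. One commonly cited drawback of a plain for-loop in a test is that the first failing input hides the rest. A minimal sketch with the standard library's `unittest.subTest`, which reports each failing input separately; `is_even` is a deliberately buggy, hypothetical stand-in.

```python
import unittest


def is_even(n: int) -> bool:
    """Deliberately buggy stand-in: wrong for 4 and 6."""
    return n in (0, 2)


def run_demo() -> unittest.TestResult:
    # Defined inside the function so that test runners scanning this module
    # do not collect the deliberately failing tests.
    class LoopVersusSubTest(unittest.TestCase):
        def test_plain_loop(self):
            for n in (0, 2, 4, 6):
                self.assertTrue(is_even(n))  # stops at n=4; n=6 is never checked

        def test_with_subtest(self):
            for n in (0, 2, 4, 6):
                with self.subTest(n=n):
                    self.assertTrue(is_even(n))  # n=4 and n=6 are reported separately

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoopVersusSubTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The plain loop contributes one failure and the `subTest` version contributes two, so the result holds three failure entries for two test methods. Parametrized tests behave like the second form: every case is run and reported on its own.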

twin shale Oct 30, 2025, 8:07 PM

#

I don't see any reason software shouldn't be equally black box tested as something actually hardware

molten hollow Oct 30, 2025, 8:07 PM

#

twin shale I don't see any reason software shouldn't be equally black box tested as somethi...

Well, I guess that's right. I would treat that hardware same as any 3rd party software.

#

I have a feeling we're migrating away from the original topic, which was parametrized tests 😄

river pilot Oct 30, 2025, 8:09 PM

#

molten hollow Well, I guess that's right. I would treat that hardware same as any 3rd party so...

as I understand it, @twin shale's job is to test that hardware. It's not 3rd-party.

twin shale Oct 30, 2025, 8:09 PM

#

But I'm not sure where the domain of "unit" testing ends.

twin shale Oct 30, 2025, 8:09 PM

#

river pilot as I understand it, @twin shale's job is to test that hardware. It's n...

2nd party?

#

Imagine being an architect and then testing both the blueprint and the delivered physical house

river pilot Oct 30, 2025, 8:10 PM

#

twin shale 2nd party?

whatever party it is, do I have it right that you are testing the hardware?

molten hollow Oct 30, 2025, 8:10 PM

#

Well, I'm sure I entered the debate with parametrized tests when it regards software development (like applications, tools, functions). How you're supposed to develop hardware, is something I don't know much about.

molten hollow Oct 30, 2025, 8:10 PM

#

twin shale Imagine being an architect and then testing both the blueprint and the delivered...

I can't talk with you about that, because I don't know enough :/ I only have opinions about creating software, and testing software that uses hardware, but not testing hardware itself. Sorry.

river pilot Oct 30, 2025, 8:10 PM

#

molten hollow Well, I'm sure I entered the debate with parametrized tests when it regards soft...

does it matter if it's hardware or software? The job is to test this Python function that counts the number of bits in an int.

molten hollow Oct 30, 2025, 8:11 PM

#

river pilot does it matter if it's hardware or software? The job is to test this Python func...

but he said it's not a python function.

#

he said it's a hardware chip.

twin shale Oct 30, 2025, 8:12 PM

#

molten hollow I can't talk with you about that, because I don't know enough :/ I only have opi...

Software might be simulating/emulating hardware. I don't think it's much different testing-wise.

river pilot Oct 30, 2025, 8:12 PM

#

molten hollow but he said it's not a python function.

let's say it's a Python function.

twin shale Oct 30, 2025, 8:13 PM

#

molten hollow he said it's a hardware chip.

I mean, it could be. You don't know the implementation

#

Or the implementation might be super complicated, and you just want to assert some high level stuff

river pilot Oct 30, 2025, 8:14 PM

#

or the implementation is actually in hardware, but you access it through a Python wrapper so you can use pytest. It's black-box, we don't know.

molten hollow Oct 30, 2025, 8:16 PM

#

> let's say it's a Python function.

If it's a python function, and he wrote a test like that:

@params(x=range(64))
def test_count_set_bits(x):
    val = 1 << x
    assert count_set_bits(val) == 1

Then, to my eyes, the only feasible implementation of that function is the code val = 1 << x itself. So you have a test that just duplicates the implementation.

river pilot Oct 30, 2025, 8:16 PM

#

molten hollow > let's say it's a Python function. If it's a python function, and he wrote a te...

you are misunderstanding the function. It counts the number of 1 bits in the int.
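
To make the input/output roles explicit: `1 << x` only builds the input (a value with exactly one bit set, at position `x`), and the function under test returns how many bits are set, so all 64 cases expect 1. A small sketch with a stand-in implementation, since the thread treats the real one as a black box; `single_bit_cases` and `failing_positions` are hypothetical helpers.

```python
def count_set_bits(value: int) -> int:
    """Stand-in implementation; the real one may be hardware behind a wrapper."""
    return bin(value).count("1")


def single_bit_cases(width: int = 64):
    """One (position, value) pair per bit position: 1, 2, 4, 8, ..."""
    return [(x, 1 << x) for x in range(width)]


def failing_positions(width: int = 64):
    """Return the bit positions for which the single-bit check fails."""
    return [x for x, val in single_bit_cases(width) if count_set_bits(val) != 1]
```

An empty list from `failing_positions()` means every single-bit input was counted correctly; an input with two bits set, such as `0b1100_0000_0000_0000`, gives 2.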

molten hollow Oct 30, 2025, 8:17 PM

#

> I mean, it could be. You don't know the implementation
> Or the implementation might be super complicated, and you just want to assert some high level stuff

> or the implementation is actually in hardware, but you access it through a Python wrapper so you can use pytest. It's black-box, we don't know.

Well, from the perspective of a software developer, I will say it makes a difference whether it's you who's responsible for creating and maintaining that function, or whether it's something off-the-shelf that you don't maintain. Like, these require two different approaches to testing.

proud nebula Oct 30, 2025, 8:18 PM

#

Dnaron being dnaron

river pilot Oct 30, 2025, 8:19 PM

#

proud nebula Dnaron being dnaron

that's not productive

molten hollow Oct 30, 2025, 8:19 PM

#

river pilot you are misunderstanding the function. It counts the number of 1 bits in the int...

Ah, that's right. Sorry, I mistook the output for the input, you're right. I retract my previous statement. But still, I wouldn't use parametrized tests to test that.

pulsar oracle Oct 30, 2025, 8:20 PM

#

@twin shale I haven't fully been following this, as I've been fixated on the initial parameterization argument. I think that in the case of checking if specific bits are set (1-64), having one test with parameterization that covers that range is valid. I wouldn't write them all individually personally, but I would do a few specific behaviors with it. NGL, I probably don't understand what's being tested, but assuming we're testing a bit counting function, I'd do something like this (ignore missing implementation, I'm too dumb for this right now).

def count_bits_set(number: int) -> int:
    """
    Takes a number and counts the number of bits that are 1 in it,
    returns 0 if none are set, otherwise the number of them on.
    """
    # Count the "1" digits in the binary representation.
    return bin(number).count("1")

class TestCountBitsSetFunction:
    def test_should_report_zero_bits_set_for_number_zero(self):
        assert count_bits_set(0) == 0

    def test_should_report_one_bit_set_for_number_one(self):
        assert count_bits_set(1) == 1

    def test_should_report_right_number_of_bits_set_for_reasonable_range_of_single_bit_set(self):
        # implementation with for loop or parameterization here
        pass

    def test_should_report_i_do_not_know_how_many_for_negative_one(self):
        # This is python, I genuinely have no clue.
        pass

molten hollow Oct 30, 2025, 8:21 PM

#

I think we're very off topic, because the whole point started from a question, posted by someone who has no interest in hardware and bits, at all.

molten hollow Oct 30, 2025, 8:21 PM

#

twin shale How I would like test decorators to work: ```py @test @test.params(a=(1, 2, 4),...

Never mind, it was you 😄

#

Well, maybe I acted too roughly. I assumed you were creating a software system, and saw you tried to use parametrized testing, and I tried to advise you against that.

twin shale Oct 30, 2025, 8:22 PM

#

molten hollow I think we're very off topic, because the whole point started from a question, p...

I don't think it matters. You can find analogies in higher level system.

molten hollow Oct 30, 2025, 8:22 PM

#

But seeing you're creating hardware, that may be valid, I don't know.

river pilot Oct 30, 2025, 8:23 PM

#

molten hollow But seeing you're creating hardware, that may be valid, I don't know.

let's talk about the python function. how would you test it?

molten hollow Oct 30, 2025, 8:23 PM

#

twin shale I don't think it matters. You can find analogies in higher level system.

Yea, I don't think so. I did use parametrized tests in the past, as a software developer, and with enough development, I always regretted that decision, because it failed me somehow.

#

I guess it might be appealing only in narrow situations, when you think counting lines of code makes a difference that much.

#

But seeing how you're creating hardware, I don't know, you're trying to cover the whole range of inputs? I guess that makes sense.

#

I'm sure I would never test software like that, by "covering the whole space of inputs". To me such test would be redundant, and thus harder to maintain.

river pilot Oct 30, 2025, 8:24 PM

#

molten hollow But seeing how you're creating hardware, I don't know, you're trying to cover th...

it's a python function: how would you test it?

molten hollow Oct 30, 2025, 8:24 PM

#

river pilot it's a python function: how would you test it?

I would start from test, and drive my implementation from it. So I would need to know what my "soon-to-be-written-function" needs to do.

pulsar oracle Oct 30, 2025, 8:24 PM

#

molten hollow But seeing how you're creating hardware, I don't know, you're trying to cover th...

I think it makes sense to cover like a subset of them. Do bits 1-64 work correctly? In python they can go up infinitely so it would be impossible to cover it all. But specific behaviors, if it's 1 you get 1, if it's 0 you get zero, if you do 1-64 all at the same time, you get the full value of those ones, basic scenarios to prove that it works/isn't broken.
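The subset described here could look roughly like the sketch below, written with the standard library's unittest so it runs without extra packages. The function name and its one-line implementation are assumptions; subTest plays the role that parameterization plays in pytest.

```python
import unittest


def count_set_bits(value: int) -> int:
    # Assumed implementation for illustration; meant for non-negative ints.
    return bin(value).count("1")


class TestCountSetBits(unittest.TestCase):
    def test_zero_has_no_bits_set(self):
        self.assertEqual(count_set_bits(0), 0)

    def test_one_has_one_bit_set(self):
        self.assertEqual(count_set_bits(1), 1)

    def test_each_single_bit_in_first_64_positions(self):
        for position in range(64):
            # Each position is reported separately if it fails.
            with self.subTest(position=position):
                self.assertEqual(count_set_bits(1 << position), 1)

    def test_all_64_bits_set_at_once(self):
        self.assertEqual(count_set_bits((1 << 64) - 1), 64)
```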

twin shale Oct 30, 2025, 8:25 PM

#

I'm definitely not counting lines of code. In my mind I'm very pragmatic in this. It might be that the cases haven't appeared yet. But I do think it's needed in some cases. I agree not all tests should be lumped together as parametrizations. There is a sweet spot.

river pilot Oct 30, 2025, 8:26 PM

#

molten hollow I would start from test, and drive my implementation from it. So I would need to...

it seems like you're ignoring parts of the discussion: it counts the number of 1 bits in an int.

molten hollow Oct 30, 2025, 8:26 PM

#

twin shale I'm definitely not counting lines of code. In my mind I'm very pragmatic in this...

But you're talking about your hardware, where you want to cover a range/space of inputs, correct?

twin shale Oct 30, 2025, 8:26 PM

#

For example:

Write 10 tests with 1 test data input each.
Or write 10 tests with 100 test inputs each for 5% more time spent?
It might be extremely good value. Time is a scarce resource.

pulsar oracle Oct 30, 2025, 8:26 PM

#

molten hollow Yea, I don't think so. I did use parametrized tests in the past, as a software d...

Here's an example of where I used parameterization the other week where I don't think I'll regret it:


class TestDiscordSnowflakeFunctionResultBits:

    @pytest.mark.parametrize("test_value", [0, 1, 4095])
    def test_should_have_given_increment_value_in_bits_1_to_12(self, test_value: int):
        snowflake = make_discord_snowflake(increment=test_value, worker=9, process=2, timestamp=2)
        assert snowflake & 4095 == test_value

    @pytest.mark.parametrize("test_value", [0, 1, 31])
    def test_should_have_given_internal_worker_value_in_bits_13_to_17(self, test_value: int):
        snowflake = make_discord_snowflake(increment=5, worker=test_value, process=2, timestamp=2)
        internal_worker = (snowflake >> 12) & 31
        assert internal_worker == test_value

    @pytest.mark.parametrize("test_value", [0, 1, 31])
    def test_should_have_given_internal_process_value_in_bits_18_to_22(self, test_value: int):
        snowflake = make_discord_snowflake(increment=5, worker=5, process=test_value, timestamp=2)
        internal_process = (snowflake >> 17) & 31
        assert internal_process == test_value

    @pytest.mark.parametrize("test_value", [0, 1, 4398046511103])
    def test_should_have_given_timestamp_value_in_bits_23_to_64(self, test_value):
        timestamp = make_discord_snowflake(increment=5, worker=5, process=0, timestamp=test_value)
        assert timestamp >> 22 == test_value
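For readers who want to run the tests above, here is a hypothetical make_discord_snowflake that would satisfy them. The real function was not shown; the bit layout below is inferred only from the masks and shifts in those tests, not from Discord's documentation, and masking out-of-range values is an assumption.

```python
INCREMENT_BITS = 12   # bits 1 to 12
WORKER_BITS = 5       # bits 13 to 17
PROCESS_BITS = 5      # bits 18 to 22
TIMESTAMP_BITS = 42   # bits 23 to 64


def make_discord_snowflake(increment: int, worker: int, process: int, timestamp: int) -> int:
    """Pack the four fields into one 64-bit integer (layout inferred from the tests)."""
    increment &= (1 << INCREMENT_BITS) - 1
    worker &= (1 << WORKER_BITS) - 1
    process &= (1 << PROCESS_BITS) - 1
    timestamp &= (1 << TIMESTAMP_BITS) - 1
    return (timestamp << 22) | (process << 17) | (worker << 12) | increment
```

Each test pins one field at its minimum, a small value, and its maximum, which is where an off-by-one in a shift or mask would show up.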

twin shale Oct 30, 2025, 8:27 PM

#

molten hollow But you're talking about your hardware,where you want to cover a range/space of ...

I want to do that when I test software as well. How can I otherwise know that it works? Imagine count_set_bits actually being a python function.

molten hollow Oct 30, 2025, 8:28 PM

#

river pilot it seems like you're ignoring parts of the discussion: it counts the number of 1...

Okay, so the thing I'm cautious about is that there is this effect in programmers, where we often do stupid things 😄 me included:

we code stuff that's not needed
we forget stuff
we create the first thing that pops into our heads
we delay feedback in learning
we do complicated stuff, instead of simple
we sometimes do something complicated because we want to feel proud of it

We do all that because we're human. Good engineering practices allow us to overcome those issues. I try to do that.

So when I need to create some software, I try to test-drive it (practice TDD), to not allow myself to create more code than necessary, and make sure it's simple enough, and not overly complicated.

#

And in order to do that, I try to start from tests, to make sure I don't fall into those traps.

twin shale Oct 30, 2025, 8:29 PM

#

molten hollow I would start from test, and drive my implementation from it. So I would need to...

It should count how many bits are set. That's it.

river pilot Oct 30, 2025, 8:29 PM

#

molten hollow And in order to do that, I try to start from tests, to make sure I don't fall in...

ok, so how would you test it?

#

we don't have to talk about this if you don't want. I've asked four times, and you aren't answering.

molten hollow Oct 30, 2025, 8:29 PM

#

river pilot ok, so how would you test it?

What I would absolutely need to do, is to verify that's actually what I need to do.

#

Because it could be an x/y problem.

#

I would go from higher level tests, to lower level,

#

if, by test-driving it i would find myself needing that function - great.

#

if not, i would not write it at all, and thus not have a need to test it.

#

but!!

#

Let's say we verified it 😄

#

And I verified that I need it.

#

And I'm going to write it and test it.

#

And it's me who's creating that, not 3rd party.

#

Here's how I would do it:

twin shale Oct 30, 2025, 8:30 PM

#

Ok but you are missing a HUGE shortcut here. The task IS to count bits.

molten hollow Oct 30, 2025, 8:31 PM

#

twin shale Ok but you are missing a HUGE shortcut here. The task IS to count bits.

Okay, we've verified it. I just wanted to make sure I'm not falling into x/y, but if I'm not, let's go.

#

Here's how I would test-drive it (because I'm creating that function, correct)?

#

I would write my tests in such a way, that allow me to learn.

#

And also, I know I will make mistakes, so I need to write tests in a way that allows me to make progress in iterative steps, so that when I find out where I am wrong, I can correct it.

#

I will start simple

#

Count bits. Let's say there are no 1 bits:

#

def test_there_are_no_1_bits():
  assert count_bits(0) == 0

#

That's very simple, I can implement that very simply,

#

then, let's say there is 1 bit on the first position

#

def test_there_is_1_bit_at_first_position():
  assert count_bits(1) == 1

#

That's also very simple to implement,

#

then, what's the next simplest thing after that? 1 bit on the second position

#

def test_there_is_1_bit_at_second_position():
  assert count_bits(2) == 1

and also two bits

def test_there_are_2_bits():
  assert count_bits(3) == 2

#

Also, handle the other cases

#

def test_handles_signed_integers():
  assert count_bits()  # here, put in information about whether signed integers are handled

#

def test_fails_for_unsigned_integers():

twin shale Oct 30, 2025, 8:34 PM

#

I've already covered all 64 bits, what are you wasting your time with? 🤔

molten hollow Oct 30, 2025, 8:34 PM

#

Also, put in floats, strings and None, assert that the function behaves properly

#

from pytest import raises

def test_function_fails_for_none():
  with raises(Exception):
    count_bits(None)

#

def test_function_does_not_count_bits_for_floats():  # or does? i don't know, you tell me
  assert count_bits(0.00) == 0 # what's the expected outcome here?

#

You see, i'm not treating these tests just as a regression test suite,

#

I'm treating it as:

a tool to learn
to assert what I already know
what I want my function to do

twin shale Oct 30, 2025, 8:35 PM

#

Just assume input data is 64 bits 😊

molten hollow Oct 30, 2025, 8:35 PM

#

and also to:

gather feedback on my design

molten hollow Oct 30, 2025, 8:36 PM

#

twin shale Just assume input data is 64 bits 😊

Great, write a test for that!

def test_fails_for_128_bit_input():
  with raises(Exception):
    count_bits(here_insert_data_with_128_bits) # maybe 2^128 or smth like that

#

That's the whole point, don't do "just assume". Assert that as an executable specification.

#

If that behaviour is not met, the test suite should fail.
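As a sketch of what "assert it as an executable specification" could mean here: the 64-bit limit is taken from the message above, while raising ValueError and rejecting negative values are assumptions, since nobody in the thread said how the function should fail.

```python
MAX_BITS = 64  # assumed input width, taken from "just assume input data is 64 bits"


def count_bits(value: int) -> int:
    """Count the 1 bits of an unsigned value that fits in MAX_BITS bits."""
    if value < 0 or value.bit_length() > MAX_BITS:
        raise ValueError(f"expected an unsigned {MAX_BITS}-bit value")
    return bin(value).count("1")


def test_accepts_widest_64_bit_value():
    assert count_bits((1 << 64) - 1) == 64


def test_fails_for_65_bit_input():
    try:
        count_bits(1 << 64)  # smallest value that needs 65 bits
    except ValueError:
        return  # the documented limit is enforced
    raise AssertionError("65-bit input was accepted")
```

If the limit is later dropped or changed, the second test fails and forces the decision to be made explicitly.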

twin shale Oct 30, 2025, 8:38 PM

#

There is no other input except 64 bits. Don't worry about other inputs, they are not representable.

molten hollow Oct 30, 2025, 8:38 PM

#

twin shale There is no other input except 64 bits. Don't worry about other inputs, they are...

What do you mean "there are no other"? 😄 So what should the function do when someone passes one?

twin shale Oct 30, 2025, 8:39 PM

#

The compiler or testing framework would crash on type error

pulsar oracle Oct 30, 2025, 8:39 PM

#

molten hollow ```py def test_function_does_not_count_bits_for_floats() # or does? i don't know...

I personally rely on the type hint for these and assume any caller code is wrong/that's on them, so float and None would be eliminated by not having the signature say it takes anything but int

molten hollow Oct 30, 2025, 8:39 PM

#

twin shale The compiler or testing framework would crash on type error

My point being, you can very well test that without using parametrized tests.

#

You just specify the important cases that are enough to understand the function as a whole.

#

If we're talking about the software system, of course.

twin shale Oct 30, 2025, 8:40 PM

#

molten hollow You just specify the important cases that are enough to understand the function ...

And then?

molten hollow Oct 30, 2025, 8:40 PM

#

twin shale And then?

And that's enough to have good tests - as long as we're talking about the software system.

#

For hardware, as I said, I'm no expert, so I will not answer that.

twin shale Oct 30, 2025, 8:40 PM

#

The way you are doing it you get some examples. But some things need to be exhaustively tested.

twin shale Oct 30, 2025, 8:40 PM

#

molten hollow And that's enough to have good tests - as long as we're talking about the softwa...

That's definitely not true

molten hollow Oct 30, 2025, 8:41 PM

#

Let me tell you that:

parametrized tests are a way to explore an input space. In software development, I will argue you never need to do that. In hardware, maybe you do? 🤔 I don't know.

pulsar oracle Oct 30, 2025, 8:41 PM

#

twin shale The way you are doing it you get some examples. But some things need to be exha...

That's why I'd personally start with the specific examples, then cover a range of them exactly like you did with parameterization.

twin shale Oct 30, 2025, 8:41 PM

#

It's not about exploring or making examples of the behavior. It's about verifying.

molten hollow Oct 30, 2025, 8:41 PM

#

@twin shale We're talking about software development still? Or hardware?

twin shale Oct 30, 2025, 8:41 PM

#

Sure

molten hollow Oct 30, 2025, 8:42 PM

#

Then I will argue you never need to explore an input space like that.

twin shale Oct 30, 2025, 8:42 PM

#

pulsar oracle That's why I'd personally start with the specific examples, then cover a range o...

Indeed. If you make tests too high-level or too parametrized, you might miss some concrete examples, which is not good either.

molten hollow Oct 30, 2025, 8:43 PM

#

twin shale That's definitely not true

Okay, then maybe I'm wrong. Can you give me an example where you might use that for software development?

river pilot Oct 30, 2025, 8:43 PM

#

I don't understand how you can advocate for blackbox testing and then say that hardware and software would need different testing of the same algorithm?

molten hollow Oct 30, 2025, 8:44 PM

#

river pilot I don't understand how you can advocate for blackbox testing and then say that h...

Because for software, I will work with tests TDD-style. And for that, you don't need parametrized tests.

#

And for hardware, I don't know how they test and maintain that. I just don't know, so maybe they need to explore the input space? I just don't know how they do it.

pulsar oracle Oct 30, 2025, 8:44 PM

#

twin shale Indeed. If you make too high level or too parametrize tests, you might miss some...

If you think about it from a TDD perspective it becomes natural, and I say this and have to admit I don't always use TDD. But if there's a problem that is new to me, or annoying to write, sometimes with bit math, I always do. I'd take it incrementally for learning like @molten hollow said. Does it count 0 bits correctly? Does it count 1? Thinking of it now, what's negative one supposed to be, what should happen then? (In your case, if you don't expect anything besides 0-64, maybe that's fine.) So naturally I'd get those specific examples, then I'd wonder: does this actually work for all the ones it needs to, or a reasonable set to prove that it will? That is where parameterization or a for loop would come in.

twin shale Oct 30, 2025, 8:45 PM

#

molten hollow Then I will argue you never need to explore an input space like that.

Example: Test a sorting algorithm. A quick way is to use an existing sorting algorithm, exhaustively feed it with permutations of (1, 2, 3, 4, 5) and check expected output.
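A minimal sketch of that idea follows. my_sort is a hypothetical function under test (a plain insertion sort here); the built-in sorted() serves as the trusted reference.

```python
from itertools import permutations


def my_sort(items):
    """Hypothetical sort under test: a plain insertion sort."""
    result = []
    for item in items:
        index = len(result)
        # Walk left past every element larger than the new item.
        while index > 0 and result[index - 1] > item:
            index -= 1
        result.insert(index, item)
    return result


def test_my_sort_matches_builtin_for_all_permutations():
    # 5! = 120 inputs, every possible ordering of five distinct values.
    for perm in permutations((1, 2, 3, 4, 5)):
        assert my_sort(list(perm)) == sorted(perm)
```

This covers every ordering of five distinct values, but not duplicates, empty lists, or longer inputs, so a few targeted examples would still be needed alongside it.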

molten hollow Oct 30, 2025, 8:45 PM

#

twin shale Example: Test a sorting algorithm. A quick way is to use an existing sorting algo...

Okay, so your goal is to write a new sorting algorithm, correct?

river pilot Oct 30, 2025, 8:46 PM

#

twin shale Example: Test a sorting algorithm. A quick way is to use an existing sorting algo...

or property-based testing, which is like parameterization on steroids
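To illustrate the idea without extra dependencies, here is a hand-rolled sketch using only the standard library's random module. Libraries such as Hypothesis automate the input generation and also shrink failing cases; this sketch does neither. my_sort is a hypothetical placeholder for the sort under test.

```python
import random
from collections import Counter


def my_sort(items):
    """Hypothetical sort under test; delegates to sorted() so the sketch runs."""
    return sorted(items)


def check_sort_properties(rounds: int = 200, seed: int = 0) -> None:
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(rounds):
        data = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 30))]
        result = my_sort(list(data))
        # Property 1: the output is in non-decreasing order.
        assert all(a <= b for a, b in zip(result, result[1:]))
        # Property 2: the output holds exactly the same elements as the input.
        assert Counter(result) == Counter(data)
```

The assertions describe what any correct sort must do, rather than listing expected outputs one by one.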

twin shale Oct 30, 2025, 8:47 PM

#

pulsar oracle If you think about it from a TDD perspective it becomes natural, and I say this ...

Doesn't matter what the bits represent. Signed or unsigned integer or float or unums

pulsar oracle Oct 30, 2025, 8:48 PM

#

twin shale Doesn't matter what the bits represent. Signed or unsigned integer or float or u...

Ahhh. I see back to your specific example. You're taking a function to count how many bits are on in memory, and then testing if the bit shifting logic works on the hardware as expected, and doing parameterize to test a bunch of values at once. In an environment where at runtime or compile time it'll most likely be impossible to use anyway. We assume that function already works by the time that test code is executed.

molten hollow Oct 30, 2025, 8:49 PM

#

Well, here's the thing:

if you only care about achieving your goal, which is getting your data sorted, you could just write one test

def test_my_pokemons_are_sorted():
  assert pokemons(['Pikachu', 'Alakazam']) == ['Alakazam', 'Pikachu']

and you implement that using built-in or off-the shelf sorting in your programming language. You don't need to reimplement it, you can just use what's there. That's one case. But I know that's not what you're after.

if you really want to create your own sorting algorithm, here is how I would say you need to approach it: test-drive it. Drive the implementation of that sorting algorithm from tests. You're not going to invent it as one big idea in your head; you will need to design it iteratively, solve the edge cases, and work on it. So I think you should use tests as stepping stones toward it. How exactly those test cases look is up to the creator of the algorithm, because the tests you need depend on the nature of that algorithm. So in order to create good tests, you need to know how the algorithm works. If you want "black box", then just go with the first approach with one test.

There is even an example in "Clean Craftsmanship" by Robert Martin, where he uses tests like that to write quick-sort, if you're interested. And he doesn't use parametrized tests 😄
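For the first case above, the whole thing can be as small as this sketch. The name pokemons comes from the test in the message; its body is an assumption that simply delegates to the built-in sorted().

```python
def pokemons(names):
    """Return the names in alphabetical order using the built-in sorted()."""
    return sorted(names)


def test_my_pokemons_are_sorted():
    assert pokemons(['Pikachu', 'Alakazam']) == ['Alakazam', 'Pikachu']
```

The test checks that pokemons() returns its result in order; it does not try to re-verify sorted() itself.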

twin shale Oct 30, 2025, 8:50 PM

#

pulsar oracle Ahhh. I see back to your specific example. You're taking a function to count how...

Not sure I follow. And I don't have more time to lend to this discussion right now. Even though I think this is very interesting! I have some things to complete before bed.

river pilot Oct 30, 2025, 8:51 PM

#

molten hollow Well, here's the thing: - if you only care about achieving your goal, which is ...

I know you aren't saying that one test of a two-element list is sufficient to test a new sorting algorithm. And I know you like black-box testing, so "the nature of the algorithm" can't come into play. This seems like a non-answer to me.

molten hollow Oct 30, 2025, 8:51 PM

#

I will say, that's the reason I dislike topics in #unit-testing channel 😄

#

Because one person asks something (like @river pilot in this case), I answer, and now @twin shale doesn't agree, so I answer, and now @river pilot doesn't agree, and I'm constantly between two people 😄

twin shale Oct 30, 2025, 8:52 PM

#

molten hollow Well, here's the thing: - if you only care about achieving your goal, which is ...

One single test with one single test input to test a sorting algorithm?

molten hollow Oct 30, 2025, 8:53 PM

#

twin shale One single test with one single test input to test a sorting algorithm?

As I mentioned, it depends if you're using a function that already exists (like sorted() in python), or you're writing your own.

#

If you're writing your own, then you need more, obviously, as I described above.

twin shale Oct 30, 2025, 8:53 PM

#

I prefer "gray box testing" (I'm not sure that's an established term): Test it as a black box, even though I know the internal workings. And ALSO spend extra effort testing the parts I know are more complicated (and likely to contain bugs) more thoroughly.

molten hollow Oct 30, 2025, 8:54 PM

#

twin shale I prefer "gray box testing" (I'm not sure that's a used terminology): Test it as...

That sounds like a good strategy, if you're writing code first, and tests after.

#

But you get way better results, if you write tests first, and code after.

river pilot Oct 30, 2025, 8:54 PM

#

molten hollow As I mentioned, it depends if you're using a function that already exists (like ...

you aren't making sense. You need zero tests for the existing Python sorted() function

molten hollow Oct 30, 2025, 8:54 PM

#

river pilot you aren't making sense. You need zero tests for the existing Python sorted() fu...

If you mean, you don't need to test it, you're right. Why would you, someone else created it.

pulsar oracle Oct 30, 2025, 8:55 PM

#

twin shale One single test with one single test input to test a sorting algorithm?

I would start it with the simplest example to see if anything about it works, then add more behaviors/scenarios for it. So just a two element list would suffice, and the sorting algorithm would probably be wrong at that point, not do much, then I'd add more with different inputs.

molten hollow Oct 30, 2025, 8:55 PM

#

But if you want to test that your function (like the one with pokemons), returns values sorted, then you need a test to test-drive that usage of sorted() function.

twin shale Oct 30, 2025, 8:55 PM

#

river pilot you aren't making sense. You need zero tests for the existing Python sorted() fu...

Although you can find issues in python builtin stuff as well 😋

molten hollow Oct 30, 2025, 8:55 PM

#

twin shale Although you can find issues in python builtin stuff as well 😋

Sure.

#

But you see, it's not your job to test the tools you're using.

pulsar oracle Oct 30, 2025, 8:55 PM

#

Although, I'm not familiar with sorting algorithms tbh, don't most of them have the same behavior but just do it faster/slower?

molten hollow Oct 30, 2025, 8:55 PM

#

there can be bugs in ifs, fors, variables, compilers, all that.

twin shale Oct 30, 2025, 8:56 PM

#

molten hollow But you see, it's not your job to test the tools you're using.

No this is just a case of contributing to open source 🙂

molten hollow Oct 30, 2025, 8:56 PM

#

If you're working on that software, sure.

#

But I guess we're working on our own projects, mostly. And for that, we don't need to test those kinds of things.

twin shale Oct 30, 2025, 8:57 PM

#

Sure

molten hollow Oct 30, 2025, 8:57 PM

#

@twin shale are you writing your tests first?

#

Or code first, and then test that?

twin shale Oct 30, 2025, 9:01 PM

#

It varies. But mostly it's a team effort and we do both in parallel

molten hollow Oct 30, 2025, 9:01 PM

#

twin shale It varies. But mostly it's a team effort and we do both in parallel

If I can suggest something, try to test-first more, and test-after less.

#

Would be awesome if 100% was test-first.

#

Most of the problems with testing just disappear, if you test-first.

twin shale Oct 30, 2025, 9:02 PM

#

molten hollow Would be awesome if 100% was test-first.

I don't agree 🙂

molten hollow Oct 30, 2025, 9:03 PM

#

Okay, my point being, the resulting software often is better designed if you do test-first.

#

Hence, I advise it to people.

#

Unless I'm wrong, in which case I'd be happy to hear a counter-example.

pulsar oracle Oct 30, 2025, 9:07 PM

#

molten hollow Unless I'm wrong, in which case I'd be happy to hear a counter-example.

I have one counter example. Iirc In an interview with Dave Farley, Dave Thomas (one of the authors of the pragmatic programmer book) said that as sort of an experiment he went like six months without writing tests and didn't have any issues, on account of writing the code in the same testable way as when he did.

molten hollow Oct 30, 2025, 9:09 PM

#

Oh, yes! I saw that.

pulsar oracle Oct 30, 2025, 9:09 PM

#

I think it's possible to write testable well architected code without tests at all

#

But I personally wouldn't do no tests.

molten hollow Oct 30, 2025, 9:09 PM

#

Yes! Interesting observation.

twin shale Oct 30, 2025, 9:09 PM

#

On the topic of tdd we also have this discussion:
https://github.com/johnousterhout/aposd-vs-clean-code

molten hollow Oct 30, 2025, 9:09 PM

#

@pulsar oracle So because he internalized writing testable code, he got the benefits of testable code and loose coupling, without needing test-first.

#

Very interesting video, I agree.

twin shale Oct 30, 2025, 9:09 PM

#

Pragmatic programmer will be read hopefully starting this year

proud nebula Oct 30, 2025, 9:10 PM

#

There's also the situation where you don't know where you are going. TDD can be a massive waste of time then.

molten hollow Oct 30, 2025, 9:10 PM

#

However, my internal sceptic doubts this, because I bet he created that application in a familiar technology, and probably a familiar domain and ecosystem.

#

Would he have achieved the same results in a new programming language, new framework, new ecosystem, new domain? 🤔 Now that would be interesting to see!

#

Maybe he would, who knows.

pulsar oracle Oct 30, 2025, 9:13 PM

#

molten hollow However, my internal sceptic about this, because I bet he created that applicati...

Very early on I learned when in doubt, write testable code. I think that some people are less error prone than others, maybe off by one errors or paying attention is more likely, familiar with the language, etc. If we're taking just same language. For me, I try to avoid frameworks and if I'm using one I'd separate important logic and build it separately (can't imagine not having tests to get feedback but the design is always pretty solid).

molten hollow Oct 30, 2025, 9:13 PM

#

twin shale On the topic of tdd we also have this discussion: https://github.com/johnousterh...

Yes, I can take a look at that. What am I looking for there, specifically? 😄

Because I know a lot of people who tried TDD, and now actually prefer it. I don't know anyone who would try it, and then not like it.

I only hear that TDD sucks from people who didn't actually do that.

#

@pulsar oracle Are you in Dave Farley's Discord server?

pulsar oracle Oct 30, 2025, 9:14 PM

#

I'm not. I didn't pay for the patreon or anything, I didn't even know they had a discord server. I'm just a massive fan of the channel, and now lmax, and Martin Thompson, and the extended universe 😭 DaveFarley

molten hollow Oct 30, 2025, 9:14 PM

#

pulsar oracle I'm not. I didn't pay for the patreon or anything, I didn't even know they had a...

The fee isn't that much, and it's valuable content. I can send you some screenshots, if you'd like.

twin shale Oct 30, 2025, 9:15 PM

#

molten hollow Yes, I can take a look at that. What am I looking for there, specifically? 😄 B...

It's just tangential to this discussion, and interesting.

pulsar oracle Oct 30, 2025, 9:15 PM

#

molten hollow The fee isn't that much, and it's a valuable content. I can send you some screen...

I'll consider it

pulsar oracle Oct 30, 2025, 9:16 PM

#

proud nebula There's also the situation where you don't know where you are going. TDD can be ...

There's this. I can't even figure out what I want sometimes until I've written a lot of code to get something going. But in my case I'll probably just rewrite with more tests.

molten hollow Oct 30, 2025, 9:17 PM

#

pulsar oracle There's this. I can't even figure out what I want sometimes until I've written a...

But I bet it's not like you don't know anything.

#

You do have some starting points.

#

There are always ways to assert what you already know, and there are ways to take slices.

#

Heck, I did TDD in languages I didn't even know yet.

#

Some time ago, I started to learn Rust, never seen that thing in my life, and my first line was a test.

pulsar oracle Oct 30, 2025, 9:20 PM

#

I think it is exceptional wherever you know what you want and not how to get it. You get to specify the perfect thing then go make it happen and get feedback. So any new language it's the first thing I'll aim for If I need something done.

molten hollow Oct 30, 2025, 9:20 PM

#

you know what you want and not how to get it.
Isn't that always the case? I'm not trying to be argumentative, but what are cases where you don't know what you want?

pulsar oracle Oct 30, 2025, 9:20 PM

#

But If I start with zero clue how to approach or test something I just want to get my hands dirty and see where things will go.

molten hollow Oct 30, 2025, 9:21 PM

#

pulsar oracle But If I start with zero clue how to approach or test something I just want to g...

Yea! That's correct. And why not with a test?

#

I do that too, get my hands dirty with something new.

#

Last week, I started to create a platformer game with new game library.

#

I didn't know it, and didn't know what it could do.

#

I just wanted to learn that library.

#

I started with a test.

pulsar oracle Oct 30, 2025, 9:23 PM

#

Maybe it's a mindset problem. I was writing a harness for a discord bot the other week, I didn't know it was possible or how to approach it outside of an existing library doing it. But I had almost no clue how to test it or what the interface should look like, I just jumped in sort of making a domain model of guilds, users, in memory state, and just building up wayyy too high, figuring stuff out. I absolutely could have done this with TDD, it's not figured out and I am redoing it from scratch with it now and making better progress. But It wasn't clicking in my mind, I wasn't thinking how I'd test it, it was too much at the start, and required so much thinking.

#

Once I had an interface/design/sense of where it's going I could start driving development like this. There was also a ton of data i had to put in, stuff i had to look up, this time I started it minimal and got chatgpt to generate it and filled it in, testing event handling with a helper class.

twin shale Oct 31, 2025, 7:48 PM

#

Is the top half of this picture the result of TDD? 🤡

molten hollow Nov 1, 2025, 3:29 PM

#

twin shale Is the top half of this picture the result of TDD? 🤡

Wouldn't say so.

#

Not sure why someone would ridicule TDD like that

river pilot Nov 1, 2025, 3:57 PM

#

molten hollow Not sure why would someone ridicule tdd like that

i think it's because TDD is often explained in stark terms that don't match reality. TBH, it's something you've been doing here. If you like, explain how you would use TDD to test an IsEven function. Don't get distracted by whether we need that function, etc. Just: how would you test it?

limpid raft Nov 2, 2025, 7:53 AM

#

I am using pytest-ruff. Is there a way to disable E401 just for when Pytest runs Ruff, i.e. I don't want tests failing, just because the import list is not sorted. That will be enforced as part of pre-commit, but I don't want it checked each time pytest is triggered from inotify while I am making changes to the code

swift pewter Nov 2, 2025, 9:19 AM

#

limpid raft I am using `pytest-ruff`. Is there a way to disable `E401` just for when Pytest ...

You can pass --ruff-config to pytest to use a custom ruff config file (in which you can disable E401)

limpid raft Nov 2, 2025, 11:00 AM

#

swift pewter You can pass `--ruff-config` to pytest to use a custom ruff config file (in whic...

okay, cool. That'll do for my case I guess. Would love a pyproject.toml approach, but ok for now

karmic viper Nov 5, 2025, 7:54 AM

#

I would like to know if BDD (behave) scenario-based testing is used widely in the industry. Or is pytest-based unit testing sufficient?

proud nebula Nov 5, 2025, 8:05 AM

#

karmic viper I would like to know if BDDs (behave) scenario based testing is used widely in t...

From what I've seen, BDD is just a bunch of regex inserted in a middle layer of your tests that makes the output slightly prettier, and everything else worse.

#

Yea, from a casual look behave looks the same. A bunch of English that is never checked but is asserted as fact. How could you possibly trust that?
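
For readers who have not seen the "middle layer" being described, here is a toy sketch of the idea: English sentences are matched against regular expressions and dispatched to ordinary functions. The registry, the step functions, and the scenario text are all hypothetical, and this is not the actual implementation of behave or any other tool.

```python
import re

STEPS = []  # (compiled pattern, function) pairs: the "middle layer"


def step(pattern):
    """Register a function to run when a sentence matches the pattern."""
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register


@step(r"the number (-?\d+)")
def given_number(ctx, n):
    ctx["n"] = int(n)


@step(r"I ask whether it is even")
def when_asked(ctx):
    ctx["answer"] = ctx["n"] % 2 == 0


@step(r'the answer is "(yes|no)"')
def then_answer(ctx, expected):
    assert ctx["answer"] == (expected == "yes")


def run_scenario(text):
    """Run each line of the scenario through the first matching step."""
    ctx = {}
    for line in text.strip().splitlines():
        sentence = line.strip().split(" ", 1)[1]  # drop Given/When/Then
        for pattern, fn in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                fn(ctx, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {sentence!r}")
    return ctx


SCENARIO = """
Given the number 4
When I ask whether it is even
Then the answer is "yes"
"""
```

Calling `run_scenario(SCENARIO)` executes the three step functions in order. The sentence only means something if a pattern happens to match it, which is the indirection the messages above are objecting to.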

karmic viper Nov 5, 2025, 8:34 AM

#

proud nebula Yea, from a casual look behave looks the same. A bunch of English that is never ...

Yeah, I also feel it's a waste of time and effort just to keep things simple for the business perspective

proud nebula Nov 5, 2025, 8:41 AM

#

karmic viper Yeah i also feels it's a waste of time and effort just to keep things simple for...

And that business perspective is an illusion anyway. No manager will actually read that and if they do, it is a lie anyway.

river pilot Nov 5, 2025, 10:44 AM

#

karmic viper Yeah i also feels it's a waste of time and effort just to keep things simple for...

i have not seen BDD being used much in practice. One place I worked had some of it, and it ended up that the devs had to write the tests anyway, and had to spend time either trying to understand the middle layer of translation, or adding to it.

proud nebula Nov 5, 2025, 10:49 AM

#

I also worked at a place that had some. We removed it. There was another team at the same company that had more than we did. They also removed it for the same reason: it was a cost for no gain.

pulsar oracle Nov 6, 2025, 12:58 AM

#

karmic viper I would like to know if BDDs (behave) scenario based testing is used widely in t...

As a developer, and for the practical aspect, you'd probably be more interested in acceptance test driven development. That's where you write tests that ideally use the language of the problem domain to test the entire system. The level of abstraction can vary, and the tests don't have to be for the business: if you're building an HTTP server, for example, they're obviously largely not. You also get more feedback about whether your application is fit for release in continuous delivery terms. You can use pytest, unittest, or any testing framework. For most developers who use it, BDD is just end-to-end testing that happens to have each step written in Gherkin and wired up to code, and it's a pretty bad way to do it.

pulsar oracle Nov 6, 2025, 1:02 AM

#

river pilot i have not seen BDD being used much in practice. One place I worked had some of...

You're supposed to write it in a way that says what should happen while leaving how separate. The developers are supposed to write the tests, or even both, but anyone reading can see an example usage of the system and be like "yea, that's right". The same way a developer can see a good unit test with asserts and be like "yea, that's what that function is supposed to do". The idea was originally formed by Dan North as a way to describe TDD to developers without mentioning the word test, where test case classes are specifications and individual tests are scenarios, iirc. But you don't need Gherkin at all to do it, and Dan North just comments the given, when, and then parts among normal pytest code.

river pilot Nov 6, 2025, 1:12 AM

#

pulsar oracle You're supposed to write it in a way that says what should happen while leaving ...

It sounds great. It didn't work out in practice.

pulsar oracle Nov 6, 2025, 1:13 AM

#

river pilot It sounds great. It didn't work out in practice.

That's fair

muted lichen Nov 7, 2025, 9:12 PM

#

I've seen BDD be attempted before, with plenty of frameworks. Robot Framework was the worst, by and large.

#

You get to a point where you have to write so much custom code, you think to yourself, "what am I doing here?"

#

I'll also say I work in a place that's heavily requirements based. Lots of IBM Jazz and DOORS. It's god awful. We're supposed to be capturing tests into those requirement systems and it just doesn't happen. It's just there to look up the customer signed-off requirements, but nothing ever goes back in.

#

Part of the issue: the requirements aren't set up for programmatic access, and even if they were, the language used in the various processes has diverged so much from the requirements (e.g. dual use of a word) that it's almost impossible to keep aligned.

pulsar oracle Nov 7, 2025, 9:29 PM

#

muted lichen Ive seen BDD be attempted before, with plenty of frameworks. Robot Framework was...

It's also been attempted at the LMAX Exchange without any frameworks. Everyone wrote the tests and specifications using normal JUnit and an internal DSL. The business analysts would be sat down with an IDE to write the tests, and there would be massive reusability with methods like register, login, create an instrument, wait for something to happen, verify an email was sent, etc. And every developer, for every feature or bug fix, would create an acceptance test even independently of stuff like user stories. Maybe this BDD stuff is error prone in practice like agile, but there's a huge practical part of it being missed where it's a synonym for acceptance test driven development: testing that the entire application is fit for release, usually in terms of the business with the same terms and language (in any programming language).

https://github.com/LMAX-Exchange/Simple-DSL/wiki

GitHub

Home

Utilities to write a simple DSL in Java. Contribute to LMAX-Exchange/Simple-DSL development by creating an account on GitHub.

bronze quiver Nov 11, 2025, 7:20 PM

#

In main.py:

from voiceconversion.RVC.RVCr2 import RVCr2
...
def initialize():
...
    some_var = RVCr2(settings)
...

In test_mytest.py with pytest:

from myapp.main import initialize

I'd like it to use MockRVCr2 (that's implemented in mock_rvcr2.py) instead of the real RVCr2. How to do that?

It's possible to override before the import, but all the linters are unhappy. Is there a cleaner way?

import sys
import types

mock_module = types.ModuleType("voiceconversion.RVC.RVCr2")
mock_module.RVCr2 = MockRVCr2
sys.modules["voiceconversion.RVC.RVCr2"] = mock_module

river pilot Nov 11, 2025, 7:38 PM

#

Where do you call initialize? You should mock things where they are used, so you want to patch main.RVCr2

bronze quiver Nov 11, 2025, 7:44 PM

#

river pilot Where do you call `initialize`? You should mock things where they are used, so ...

I import initialize in test_mytest.py and call it in the tests.

river pilot Nov 11, 2025, 7:46 PM

#

bronze quiver I import initialize in test_mytest.py and call it in the tests.

you should try mock.patch("myapp.main.RVCr2", MockRVCr2)
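
A self-contained sketch of "mock things where they are used". To keep it runnable in one file, it simulates the library module and the application module in memory; `demo_rvc`, `demo_main`, and both classes are hypothetical stand-ins, not the real project layout.

```python
import sys
import types
from unittest import mock


class RVCr2:
    """Stand-in for the real, expensive class."""
    def __init__(self, settings):
        self.settings = settings


class MockRVCr2:
    """Test double with the same constructor signature."""
    def __init__(self, settings):
        self.settings = settings


# Simulate the library module that defines RVCr2.
lib = types.ModuleType("demo_rvc")
lib.RVCr2 = RVCr2
sys.modules["demo_rvc"] = lib

# Simulate main.py, which binds its own name RVCr2 at import time.
MAIN_SOURCE = """
from demo_rvc import RVCr2

def initialize(settings):
    return RVCr2(settings)
"""
main = types.ModuleType("demo_main")
exec(MAIN_SOURCE, main.__dict__)
sys.modules["demo_main"] = main


def test_patching_where_it_is_used():
    # initialize() looks up RVCr2 in demo_main, so that is the name to patch.
    with mock.patch("demo_main.RVCr2", MockRVCr2):
        obj = main.initialize({"rate": 48000})
    assert isinstance(obj, MockRVCr2)


def test_patching_where_it_is_defined_has_no_effect():
    # demo_main already holds its own reference to the original class.
    with mock.patch("demo_rvc.RVCr2", MockRVCr2):
        obj = main.initialize({})
    assert isinstance(obj, RVCr2)
```

In a real project the in-memory module setup disappears, and the patch target is the dotted path of the module that calls the class.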

bitter wadiBOT Nov 12, 2025, 9:16 PM

#

Your paste is too long, and couldn't be uploaded.

river pilot Nov 12, 2025, 10:00 PM

#

please delete this.

deft vigil Nov 12, 2025, 10:00 PM

#

for ?

river pilot Nov 12, 2025, 10:04 PM

#

deft vigil for ?

it's at the very least off-topic for this channel, and obfuscated code is usually suspicious. Please delete it.

gritty oracle Nov 12, 2025, 10:04 PM

#

aha

#

ok sure

deft vigil Nov 12, 2025, 10:05 PM

#

river pilot it's at the very least off-topic for this channel, and obfuscated code is usuall...

its all about fun training

river pilot Nov 12, 2025, 10:05 PM

#

deft vigil its all about fun training

this channel is about automated testing.

deft vigil Nov 12, 2025, 10:05 PM

#

and yea its test encoded script

river pilot Nov 12, 2025, 10:06 PM

#

deft vigil and yea its test encoded script

it's not about testing. Please delete it.

deft vigil Nov 12, 2025, 10:10 PM

#

i swear its testing for training

river pilot Nov 12, 2025, 10:25 PM

#

deft vigil i swear its testing for training

it's 99Mb of encrypted code. It's not about automated testing. This channel isn't about testing people, it's about testing code. Please delete it.

potent quest Nov 16, 2025, 1:44 AM

#

hi i just got into fuzzing and may have overfuzzed some functions
how do you not do that?

proud nebula Nov 16, 2025, 8:27 AM

#

potent quest hi i just got into fuzzing and may have overfuzzed some functions how do you not...

What does that mean? That you just wasted time on it?

potent quest Nov 16, 2025, 8:45 AM

#

That now my test suite of very basic methods takes about 3 minutes to complete

proud nebula Nov 16, 2025, 8:56 AM

#

potent quest That now my test suite of very basic methods takes about 3 minutes to complete

I don't think you should run fuzzing always. Mutation Testing, Fuzzing, Property Based Testing, these are all methods to find tests to add to your test suite, not something you run as part of the test suite itself.

potent quest Nov 16, 2025, 8:56 AM

#

Yeah, am working on lowering the fuzzing inside my test suite

#

every time i run it i find more bugs so it's actually so far been useful

proud nebula Nov 16, 2025, 8:57 AM

#

No, you missed my point. There should be literally zero fuzzing done in the test suite itself. You run that separately once in a while to find tests to add.
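
One way to read that advice, as a standard-library sketch: keep the randomized search in a function that is run by hand or on a schedule, and keep only fixed examples in the regular suite. `collapse_spaces`, the property being checked, and the pinned cases are all hypothetical.

```python
import random


def collapse_spaces(text):
    """Hypothetical function under test: squeeze runs of whitespace."""
    return " ".join(text.split())


def search_for_counterexample(trials=10_000, seed=None):
    """Exploratory and randomized: run separately, not on every test run."""
    rng = random.Random(seed)
    for _ in range(trials):
        text = "".join(rng.choice("ab ") for _ in range(rng.randint(0, 8)))
        out = collapse_spaces(text)
        # Property: no double spaces, and collapsing twice changes nothing.
        if "  " in out or collapse_spaces(out) != out:
            return text  # a failing input worth pinning as a regular test
    return None


# The regular suite holds only fixed, fast examples. Any input that an
# exploratory run breaks would be added to this list.
PINNED_CASES = [
    ("a  b", "a b"),
    ("  a", "a"),
    ("", ""),
]


def test_pinned_cases():
    for text, expected in PINNED_CASES:
        assert collapse_spaces(text) == expected
```

The suite stays fast because `test_pinned_cases` does a fixed amount of work, while `search_for_counterexample` can run for as long as you like elsewhere.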

potent quest Nov 16, 2025, 8:58 AM

#

Hmm. Fair....

#

probably need to refigure out how to use hypothesis

proud nebula Nov 16, 2025, 8:58 AM

#

Think of it as programming itself. You don't "do programming" while the function runs in prod. You do it before :P

proud nebula Nov 16, 2025, 8:59 AM

#

potent quest probably need to refigure out how to use hypothesis

and you should call it Property Based Testing, not fuzzing, so people don't get confused imo :P

potent quest Nov 16, 2025, 8:59 AM

#

what is the difference 🤔

#

I've calmed my testing somewhat but its not great yet

#

94% coverage though

#

which is excellent

proud nebula Nov 16, 2025, 9:01 AM

#

potent quest what is the difference 🤔

Fuzzing is a super broad concept. It could mean almost anything. PBT is much more specific.

#

But yea, PBT is commonly thought of as a form of fuzzing. But so is Mutation Testing.

#

And those are VERY different

potent quest Nov 16, 2025, 9:05 AM

#

interesting

#

what is mutation testing?

proud nebula Nov 16, 2025, 9:10 AM

#

It's a method to find what behavior your tests don't test.

potent quest Nov 16, 2025, 9:10 AM

#

Ahhh

proud nebula Nov 16, 2025, 9:10 AM

#

It can't find behaviors your code doesn't have but should have though. PBT can sometimes help with that.
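
A hand-rolled illustration of the idea, assuming a hypothetical `is_positive` function. Real mutation testing tools generate the mutants automatically; here one mutant is written by hand to show what "surviving" means.

```python
def is_positive(x):
    return x > 0


def mutant_is_positive(x):
    return x >= 0  # the mutation: ">" became ">="


def weak_suite(fn):
    """Checks one clearly positive and one clearly negative input."""
    return fn(5) is True and fn(-5) is False


def strong_suite(fn):
    """Adds the boundary case that the weak suite forgot."""
    return weak_suite(fn) and fn(0) is False


# The mutant passes the weak suite, so the behavior at zero is untested.
mutant_survives_weak = weak_suite(mutant_is_positive)
# The strong suite fails on the mutant, so the mutant is "killed".
mutant_survives_strong = strong_suite(mutant_is_positive)
```

A surviving mutant points at behavior the tests never pin down. It says nothing about behavior the code should have but lacks.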

potent quest Nov 16, 2025, 9:11 AM

#

Yeah I should probably use a tad of mutation testing

proud nebula Nov 16, 2025, 9:12 AM

#

I'm partial towards MT personally. PBT is hard and seldom applicable imo, while MT is a ton of work and always applicable.

potent quest Nov 16, 2025, 9:12 AM

#

.gh repo onerandomusername ghretos

#

https://gh.arielle.codes/ghretos/tree/main/tests this is what I have so far

GitHub

ghretos/tests at main · onerandomusername/ghretos

Parse GitHub html_urls into a machine readable form - onerandomusername/ghretos

#

I just shut my computer down otherwise I'd make some other changes

#

I plan to write tests for my discord bot soon https://gh.arielle.codes/Monty

GitHub

GitHub - onerandomusername/monty-python: A Discord bot for helping ...

A Discord bot for helping with development of Python projects. - onerandomusername/monty-python

river pilot Nov 16, 2025, 1:25 PM

#

it's hard to write a blog post about mocking without pulling in pages and pages of advice about how to write better tests. This is still a draft, so thoughts are welcome: https://nedbatchelder.com/blog/202511/why_your_mock_breaks_later.html

Why your mock breaks later (draft)

An overly aggressive mock can work fine, but then break much later. Why?

potent quest Nov 16, 2025, 7:48 PM

#

huh

river pilot Nov 16, 2025, 8:00 PM

#

potent quest huh

?

pulsar oracle Nov 17, 2025, 12:18 AM

#

river pilot it's hard to write a blog post about mocking without pulling in pages and pages ...

I could have tested this example without mocking and without it being finicky. If I have that function I'd want to know that my settings are loaded correctly from a settings JSON file in a directory, just not making it explicitly the home directory.

I don't get why we want to avoid opening a real file; it's basically exactly what you want to test, and you don't need to mock. Most people can afford it, and there's tempfile.TemporaryDirectory and the superb standard library for working with paths (os.path.join and so on). I personally prefer to put as much as I can in an area where it can be tested, so there's less of a chance for it to go wrong on user error.

river pilot Nov 17, 2025, 12:28 AM

#

pulsar oracle I could have tested this example without mocking and without it being finicky. I...

i dont understand. how would you test it without creating a file in the user's home directory?

pulsar oracle Nov 17, 2025, 12:30 AM

#

river pilot i dont understand. how would you test it without creating a file in the user's ...

I meant I'd change it to still search in a directory and load json from a file (that function specifically, the others I'd probably change to use a loaded version of the settings and not care where from), and just change the directory, then for testing, I'd put like /tmp/wherever and it would load from /tmp/wherever/settings.json, and I'd know when given the home directory it would load it pretty much as expected. No mocks.
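
A minimal sketch of that approach, assuming a hypothetical `load_settings(directory)` function and a file named `settings.json`. The directory is passed in, so the test can point it at a temporary directory and read a real file without mocks.

```python
import json
import tempfile
from pathlib import Path


def load_settings(directory):
    """Load settings.json from the given directory (hypothetical function)."""
    path = Path(directory) / "settings.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)


def test_load_settings_reads_json_from_directory():
    with tempfile.TemporaryDirectory() as tmp:
        # Write a real file, then load it through the code under test.
        (Path(tmp) / "settings.json").write_text(
            '{"theme": "dark"}', encoding="utf-8"
        )
        assert load_settings(tmp) == {"theme": "dark"}
```

Production code would call `load_settings(Path.home())`, or whatever directory applies; only the caller decides where the directory comes from.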

river pilot Nov 17, 2025, 12:31 AM

#

pulsar oracle I meant I'd change it to still search in a directory and load json from a file (...

right, a kind of dependency injection

pulsar oracle Nov 17, 2025, 12:32 AM

#

river pilot right, a kind of dependency injection

Exactly, though I'm personally hesitant to call it that with primitives, it doesn't bring up the right idea in my head (very arbitrary tbh). But yea inverting control of where the path for the directory containing the config comes from.

#

But usually people use dependency injection to get out of testing anything real and in my experience (in regards to anything I want to find out that stuff actually works) it just moves around where I have to test stuff at.

river pilot Nov 17, 2025, 12:35 AM

#

if your point is "why use mocks at all", then this is what I meant above when I said, "it's hard to write a blog post about mocking without pulling in pages and pages of advice about how to write better tests."

pulsar oracle Nov 17, 2025, 12:37 AM

#

river pilot if your point is "why use mocks at all", then this is what I meant above when I ...

Yea that's more or less what I was saying. And fair point. I'm not against mocks in general, just that specific example, which is fair given the context it's written in.

potent quest Nov 17, 2025, 3:16 AM

#

river pilot if your point is "why use mocks at all", then this is what I meant above when I ...

i'd read those, tbf

ember maple Nov 19, 2025, 7:53 PM

#

these days i avoid mocks+monkeypatches if i can - allowing for dependency injection and validated fakes is so much more joy

random sorrel Nov 20, 2025, 6:51 PM

#

I'm doing procedural generation, I'm using files as input and both random.seed and np.random.seed are fixed. Output keeps changing. Any obvious ideas I'm missing?

river pilot Nov 20, 2025, 6:54 PM

#

random sorrel I'm doing procedural generation, I'm using files as input and both random.seed a...

those are the typical causes. is the time a factor? Can you link us to the code?

random sorrel Nov 20, 2025, 6:54 PM

#

no, it's not public, thanks though, I'll try to go step by step and see where things start changing.

river pilot Nov 20, 2025, 7:06 PM

#

random sorrel no, it's not public, thanks though, I'll try to go step by step and see where th...

when you find out, let us know.

random sorrel Nov 20, 2025, 7:19 PM

#

I actually can share what the basis was, it's a pretty cool project but I rewrote a bunch. https://github.com/oargudo/orometry-terrains Doesn't help for debugging though...

random sorrel Nov 20, 2025, 7:29 PM

#

river pilot when you find out, let us know.

I was recording timings for functions for optimization purposes and put that into a dict and returned that. Obviously the timings are always slightly different and that changed my control hash.

The other thing I found before that that started the whole thing was that I didn't have the numpy seed set, so that's probably the first thing I fixed and then I got stuck on this other "problem".
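
A small sketch of how that kind of problem can be avoided, using made-up names (`generate_terrain`, `control_hash`): keep the timing measurements in a separate return value and hash only the seeded, deterministic part of the output.

```python
import hashlib
import json
import random
import time


def generate_terrain(seed, size=16):
    """Return (deterministic result, timings) as two separate values."""
    rng = random.Random(seed)  # local generator, no hidden global state
    start = time.perf_counter()
    heights = [rng.randint(0, 255) for _ in range(size)]
    timings = {"generate_s": time.perf_counter() - start}
    return {"heights": heights}, timings


def control_hash(data):
    """Stable hash of JSON-serializable data."""
    payload = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first, first_timings = generate_terrain(42)
second, second_timings = generate_terrain(42)
same_hash = control_hash(first) == control_hash(second)
```

Here `same_hash` is true because the timings never enter the hashed data. Hashing a dict that also contained `timings` would change from run to run.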

river pilot Nov 20, 2025, 8:02 PM

#

random sorrel I was recording timings for functions for optimization purposes and put that int...

got it. glad you found it.

thin imp Nov 20, 2025, 10:40 PM

#

Yoo who is active??

river pilot Nov 20, 2025, 10:50 PM

#

thin imp Yoo who is active??

there are lots of people here. most will wait for a question or topic to chime in.

thin imp Nov 21, 2025, 11:59 AM

#

river pilot there are lots of people here. most will wait for a question or topic to chime i...

Yoo thanks for that answer

timber anchor Nov 22, 2025, 4:35 AM

#

river pilot i think it's because TDD is often explained in stark terms that don't match real...

If you are testing if something is even, you likely are testing a programming language implementation detail.

For obvious reasons, this is bad. You are coupling to a detail. When you use a prog language, you trust that the language creators tested their own code.

#

And you do not need to use gherkin or even a BDD test framework to do BDD. You do not even need a unit test framework to do TDD.

If there is some confusion with the middle layer of making your test cases, this is not really a testing problem. More of an organization one now.

Meaning, the design was probably always bad. And it also exists in the prod code, not just test code.

river pilot Nov 22, 2025, 10:24 AM

#

timber anchor If you are testing if something is even, you likely are testing a programming la...

i don't see why isEven() means you are testing the language? People often have production code that wants to know if something is even.

timber anchor Nov 22, 2025, 11:45 AM

#

If you are doing input validation, that is fine to test, but you have to name and test the case for that input validation. But purely just testing for evenness, probably coupling to the lang now.

river pilot Nov 22, 2025, 11:46 AM

#

timber anchor If you are doing input validation, that is fine to test, but you have to name an...

it seems like you read too much into that joke image. it's not directly about testing.

timber anchor Nov 22, 2025, 11:48 AM

#

i think it's because TDD is often explained in stark terms that don't match reality. TBH, it's something you've been doing here. If you like, explain how you would use TDD to test an IsEven function. Don't get distracted by whether we need that function, etc. Just: how would you test it?

Ok. Joke is joke. You asked though.

river pilot Nov 22, 2025, 11:50 AM

#

I asked about testing isEven(). If you had that function in your code, why wouldn't you test it? Sure, it's easy to imagine it's a one-line function, but that line needs a test.

#

@timber anchor ^^

timber anchor Nov 22, 2025, 12:59 PM

#

Perhaps it would be better for you to answer why it "needs a test"

river pilot Nov 22, 2025, 1:05 PM

#

timber anchor Perhaps it would be better for you to answer why it "needs a test"

because i could have written the line incorrectly:

def isEven(x):
    return bool(x % 2)
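
A minimal sketch of the kind of test that catches this mistake. The corrected function and the `disagreements` helper are illustrative assumptions, not code from the discussion.

```python
def is_even_buggy(x):
    return bool(x % 2)  # the incorrect line: this is True for odd numbers


def is_even(x):
    return x % 2 == 0


EXPECTED = {0: True, 1: False, 2: True, -3: False, 10: True}


def disagreements(fn):
    """Return the inputs for which fn does not give the expected answer."""
    return [x for x, expected in EXPECTED.items() if fn(x) is not expected]


def test_is_even():
    assert disagreements(is_even) == []
```

Running the same check against `is_even_buggy` reports every input in the table, so even a single example would have exposed the inverted logic.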

river pilot Nov 22, 2025, 1:06 PM

#

timber anchor Perhaps it would be better for you to answer why it "needs a test"

how would you decide what parts of your code need tests and what parts don't?

pulsar oracle Nov 22, 2025, 2:28 PM

#

timber anchor And you do not need to use gherkin or even a BDD test framework to do BDD. You d...

Gherkin is just BDD at the functional testing level, meant for business analysts. I've been doing BDD exactly as Dan North explained in his original article, at the unit level, for two years now. Some people are rubbed the wrong way by the Gherkin part and miss the original approach entirely. If you use a unit test framework to do any type of test, you can format them the same and ideally have them say exactly what you want it to do while saying very little about how (the level of abstraction varying). I personally heard it from Dave Farley and got it immediately, then got confused by other material and wasn't sure I was doing BDD, because 99% of explanations are for functional testing. The article plus other explanations clarified it once again. I'm not perfect; some tests are definitely crummy and fail to read exactly as specifications or scenarios, but I'm definitely most of the way there, especially recently.

pulsar oracle Nov 22, 2025, 2:34 PM

#

river pilot because i could have written the line incorrectly: ```python def isEven(x): ...

If you're using BDD you don't test it. You say you want a function that will tell you if a number is even. Then you do a few scenarios using it. You name the test case after what is being tested or specified, then name methods like sentences that specify what it should do.

TestIsEvenOdd:

test_should_detect_uneven_numbers_as_odd

What should it do? So give it an odd number have it return false because that's what you want, see it fail because it doesn't meet the specification, go make it be true, or should it really be true? If it shouldn't maybe another developer or person can be like, nope, need a different behavior (as understandings change even in the code we think we want). And so on. You test any function you want, and you test any code that uses it for broader behaviors, maybe I'd test this twice as part of something broader that also has even odd functionality. It's in the first part of this article and I've been doing it for years and now do it for acceptance tests, my naming is just better.

https://dannorth.net/blog/introducing-bdd/

Dan North & Associates Limited

Introducing BDD

I had a problem. While using and teaching agile practices like test-driven development (TDD) on projects in different environments, I kept coming across the same confusion and misunderstandings. Programmers wanted to know where to start, what to test and what not to test, how much to test in one go, what to call their tests, and how to understan...
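
A sketch of what that naming style can look like with the standard `unittest` module. The class and first method name come from the message above; `is_even`, the second test, and the given/when/then comments are illustrative assumptions.

```python
import unittest


def is_even(number):
    return number % 2 == 0


class TestIsEvenOdd(unittest.TestCase):
    def test_should_detect_uneven_numbers_as_odd(self):
        # Given an odd number
        number = 7
        # When we ask whether it is even
        result = is_even(number)
        # Then the answer is no
        self.assertFalse(result)

    def test_should_detect_numbers_divisible_by_two_as_even(self):
        # Given a number divisible by two
        number = 10
        # When we ask whether it is even
        result = is_even(number)
        # Then the answer is yes
        self.assertTrue(result)


def run_specification():
    """Run the test case and report whether every scenario passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestIsEvenOdd)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

Mechanically these are ordinary unit tests. The difference is in the names, which state the expected behavior and say nothing about the modulo operation.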

proud nebula Nov 22, 2025, 2:35 PM

#

You sound almost religious when you defer to authority that much.

river pilot Nov 22, 2025, 2:35 PM

#

pulsar oracle If you're using BDD you don't test it. You say you want a function that will tel...

this is a lot of words, but i don't see how it's bdd. You said, "check that the function returns what you want." That's how all tests work.

pulsar oracle Nov 22, 2025, 2:37 PM

#

river pilot this is a lot of words, but i don't see how it's bdd. You said, "check that the ...

It's exactly the origin of BDD though, and part of it. It is testing but explained differently, specifically to help understand test driven development. It's the flavor of it and how you think about it, whether at the unit level or functional. They're basically identical in what's being done, but if you use BDD there's a heavy emphasis on specifying what it should do while leaving out how it does it, and making it very sentence-like.

river pilot Nov 22, 2025, 2:40 PM

#

i like the "specify what it should do". sentence-like doesn't really appeal to me.

#

maybe it's a bit unfair, but "BDD" now is associated with intermediate tooling that many people find unproductive.

#

type fewer words

pulsar oracle Nov 22, 2025, 2:43 PM

#

river pilot maybe it's a bit unfair, but "BDD" now is associated with intermediate tooling t...

I'm not sure how to comment on how fair it is or not tbh because I don't think I've seen it explained very well outside of the original article and no one has tried to clarify it until wayyy after the fact and when I started testing largely like this I was like "oh BDD" then , wait no??? Still TDD? Then got it again yesterday when I saw the article.

pulsar oracle Nov 22, 2025, 2:45 PM

#

river pilot maybe it's a bit unfair, but "BDD" now is associated with intermediate tooling t...

Yea, it's unfair though for sure because it's a nice way to think about it and I've been writing code in little examples basically in a very BDD/functional testing style and I like the approach. It's fair if anyone doesn't want to do it; what's really important is whether something has tests or not, be it before or after, English-like or not at all.

river pilot Nov 22, 2025, 2:45 PM

#

we definitely agree that the important thing is to have tests.

timber anchor Nov 22, 2025, 8:00 PM

#

pulsar oracle If you're using BDD you don't test it. You say you want a function that will tel...

I would go one further and say you don't absolutely have to test that either.

It would make sense to approach it from the end user first, and then you can explain why you needed to test if something is even based on the business/user needs.

Example: Equipment must be inspected on matching parity day.

If for some reason some business rule forces you to check for evenness in a unique way, then this is going to be a test, yes. But it's more of a contract test to assert that your types can do modulo arithmetic, which isn't about testing if something is even anymore.

If it isn't already in your prog lang, then ok, you can test drive it, but if you are just converting types with builtins and then doing the modulo, it's already tested. There would need to be a far better reason than "we just need coverage".

river pilot Nov 22, 2025, 8:10 PM

#

timber anchor I would go one further and say you dont absolutely have to test that either. It...

I think we agree then: if there's a function called is_even(), then we should have tests for it.

#

or maybe not: "converting types and doing the modulo" is code you can get wrong. You should test it.

#

@timber anchor #unit-testing message

timber anchor Nov 22, 2025, 8:20 PM

#

It's an internal detail: bool(x%2). We may as well check if it is even valid code (which in something like Java it is not).

If for some reason Python changed how it evaluated the truthiness of this, your test would fail despite not changing any of your code.

Closer to language paranoia.

Your tests can inadvertently cover scenarios for evenness/oddness and avoid explicitly testing the output of modulo and how Python interprets ints as bools (also known as trust and know the language).

It's not a bad sanity check to assert something is even or odd, but i would not formalize such a thing as an actual unit test.

#

Its more of just assert as sanity check, instead of unit test.

river pilot Nov 22, 2025, 8:21 PM

#

timber anchor Its more of just assert as sanity check, instead of unit test.

i'm trying to understand what you are saying. Let's say there's a python function:

def is_even(x):
    return bool(x % 1)

How would you "assert as sanity check"?

timber anchor Nov 22, 2025, 8:22 PM

#

Sanity check for your own programming language understanding. If you do not know what this does, and you need to use it, then go ahead and assert it. Or just read the docs.

river pilot Nov 22, 2025, 8:23 PM

#

timber anchor Sanity check for your own programming language understanding. If you do not know...

"go ahead and assert it": can you be very specific? What code would you write where to do that?

river pilot Nov 22, 2025, 8:23 PM

#

timber anchor Sanity check for your own programming language understanding. If you do not know...

does "assert it" mean write a unit test?

timber anchor Nov 22, 2025, 8:23 PM

#

No. it means use the assert keyword

#

You can try in a repl

river pilot Nov 22, 2025, 8:24 PM

#

timber anchor No. it means use the assert keyword

do you mean do a manual test of the function?

timber anchor Nov 22, 2025, 8:24 PM

#

No. Just validate your own learning. If you call learning manual testing, then sure

river pilot Nov 22, 2025, 8:25 PM

#

timber anchor No. Just validate your own learning. If you call learning manual testing, then s...

so you have no protection against future changes to that function? I would definitely write a test.

#

@timber anchor how do you decide what functions to write tests for?

pulsar oracle Nov 22, 2025, 8:26 PM

#

river pilot so you have no protection against future changes to that function? I would defi...

I think what they're saying is you're probably developing something where that function's usage is an implementation detail and not something worth testing directly, and that one probably isn't something you'd write in practice with an expression in the language and all anyway.

river pilot Nov 22, 2025, 8:27 PM

#

pulsar oracle I think what they're saying is you're probably developing something where that f...

can we just accept that this function exists? The question is how you would approach testing it.

pulsar oracle Nov 22, 2025, 8:27 PM

#

river pilot can we just accept that this function exists? The question is how you would appr...

Yea I accept that and would assume the end user is a consumer of like a helper library or the standard library.

river pilot Nov 22, 2025, 8:28 PM

#

pulsar oracle Yea I accept that and would assume the end user is a consumer of like a helper l...

you mean the caller of this function? How does that affect your approach to testing it?

pulsar oracle Nov 22, 2025, 8:28 PM

#

river pilot you mean the caller of this function? How does that affect your approach to test...

It doesn't. It's just that the context changes everything because i might write it as a private function as part of something broader and not test it directly.

timber anchor Nov 22, 2025, 8:28 PM

#

Like I said, indirectly. If you have a higher up business rule or scenario, and change this function those tests should fail. Example: testing that a particular customer support on-call rota strategy involving an every other day rotation behaves as expected.

#

But the test doesn't have to cascade that high up either. You can unit test the behaviors closer to isEven

river pilot Nov 22, 2025, 8:29 PM

#

timber anchor Like I said, indirectly. If you have a higher up business rule or scenario, and ...

true, but why not write a test specifically for this function? That's the "unit" in unit test.

timber anchor Nov 22, 2025, 8:29 PM

#

This is a common misconception. I recommend looking into this yourself for now

river pilot Nov 22, 2025, 8:30 PM

#

timber anchor This is a common misconception. I recommend looking into this yourself for now

can you explain your perspective to me?

#

you don't have to if you don't want to.

timber anchor Nov 22, 2025, 8:31 PM

#

I won't go into much more detail because there are people who speak on it far better than I do, and it has been done... but the unit is closer to behaviors (perhaps even so far as to say specifically end-user behavior) instead of functions.

#

Hence BDD

river pilot Nov 22, 2025, 8:32 PM

#

timber anchor Hence BDD

ok, this is where we started, I understand. It's a different approach to testing.

pulsar oracle Nov 22, 2025, 8:32 PM

#

river pilot can you explain your perspective to me?

I can give an actual example of this if you want, one that would explain why someone practicing TDD might skip testing is_even directly, or why it's confusing from a certain perspective.

river pilot Nov 22, 2025, 8:33 PM

#

pulsar oracle I can give an actual example of this if you want that would explain why someone ...

sure

timber anchor Nov 22, 2025, 8:34 PM

#

number of days covered in on-call with an every other day strategy.

test: leap year vs non leap year.

it will find problems with is even or odd quickly. 366 vs 365 days

river pilot Nov 22, 2025, 8:35 PM

#

timber anchor number of days covered in on-call with an every other day strategy. test: leap...

calculating leap years doesn't involve even/odd, but: why wouldn't you also want a unit test for is_even?

pulsar oracle Nov 22, 2025, 8:37 PM

#

I want to recover album photos from a game. There's a cache directory, and these photos are JPEGs of a certain resolution. To do this I need to check if a file is a JPEG (does it have the signature?); think of it as analogous to "is this number even?" My real logic is: take a directory and find photos that are JPEGs of a given resolution.

So I'd write find_album_photos_in_directory(directory_path: str), place actual photos in that directory, and ask: will it find a JPEG photo with the right resolution? Will it skip one of a different format, like a PNG with the same resolution? Each individual function, like is_even or the functions that read this information, is important, but I wouldn't test them directly for this problem; I'd hide them. So what I'm saying is, if you need to check whether something is even and you have that function, you just skip testing it directly, even if it makes the code easier and less error-prone. But I agree completely if we're talking about making a library of functions, a standard library, something where that exact code is what you want people to consume.
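
A minimal sketch of that idea: only find_album_photos_in_directory is public and tested, while the signature check stays a private helper. The helper name _is_jpeg and the fake file contents are assumptions for illustration, and the resolution check is left out to keep it short.

```python
import tempfile
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files start with these bytes


def _is_jpeg(path: Path) -> bool:
    """Private helper: exercised only through the public function."""
    with path.open("rb") as f:
        return f.read(len(JPEG_MAGIC)) == JPEG_MAGIC


def find_album_photos_in_directory(directory_path: str) -> list[str]:
    """Return the sorted names of the JPEG files in the directory."""
    directory = Path(directory_path)
    return sorted(p.name for p in directory.iterdir() if p.is_file() and _is_jpeg(p))


def test_finds_jpeg_and_skips_png() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        # Fake files: only the leading signature bytes matter for this sketch.
        (Path(tmp) / "photo.cache").write_bytes(JPEG_MAGIC + b"fake jpeg data")
        (Path(tmp) / "icon.cache").write_bytes(b"\x89PNG\r\n\x1a\n" + b"fake png data")
        assert find_album_photos_in_directory(tmp) == ["photo.cache"]
```

If _is_jpeg were wrong, this test would fail even though it never calls the helper by name.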

timber anchor Nov 22, 2025, 8:37 PM

#

It might not need to, but it can. One team can work 182 vs 183 days. Or on leap year, 183 vs 183. The importance is in asserting which team gets 183.

#

You are testing leap year vs non leap year though, not isEven
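
A rough sketch of such a scenario test, assuming two teams alternate days with team A taking day 0. The names days_per_team and is_even are hypothetical, not from any real rota code.

```python
import calendar


def is_even(n: int) -> bool:
    return n % 2 == 0


def days_per_team(year: int) -> tuple[int, int]:
    """Days covered by (team A, team B) when teams alternate daily, A first."""
    total_days = 366 if calendar.isleap(year) else 365
    # Team A covers the even day indexes (0, 2, 4, ...), team B the odd ones.
    team_a = sum(1 for day in range(total_days) if is_even(day))
    return team_a, total_days - team_a


def test_two_teams_divide_days_in_leap_year() -> None:
    assert days_per_team(2024) == (183, 183)


def test_two_teams_divide_days_in_non_leap_year() -> None:
    # This is the case that pins down which team gets the extra day.
    assert days_per_team(2023) == (183, 182)
```

Note that only the non-leap-year test would catch an inverted is_even: with 366 days both teams get 183 either way.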

river pilot Nov 22, 2025, 8:40 PM

#

I totally understand testing the user-visible functionality of the product. I agree that's a good thing. It's also good to have tests at smaller granularities. They can include cases that are harder to do at the higher level.

#

I'm not sure why you think it would be bad to test at the lower level also.

timber anchor Nov 22, 2025, 8:41 PM

#

That is the lower level. You can use fakes for all of that if you like

#

If you are referring to my example anyway

pulsar oracle Nov 22, 2025, 8:46 PM

#

river pilot I'm not sure why you think it would be bad to test at the lower level also.

I personally think that it does depend. If I have something I need to do all over the place it would be nice to have a tested function available that I can rely on to do it (even if it's going to be tested in other places indirectly, and I find that when I do this sometimes it makes it easier to put everything together later). But in the example I gave I avoid it because it's like planning ahead, not being so iterative. One time when I did this I started writing these different functions like for is_photo_a_jpeg(file_path) and it felt like I was making publicly available functions to be relied on that I don't need when I should have been starting with a function that solves the problem that I want and testing it. I personally think about the issue as public vs private code for it.

river pilot Nov 22, 2025, 8:48 PM

#

timber anchor That is the lower level. You can use fakes for all of that if you like

by lower level i meant a test specifically for is_even()

river pilot Nov 22, 2025, 8:48 PM

#

timber anchor If you are referring to my example anyway

maybe in your example there would be a test for is_leap_year(y)?

timber anchor Nov 22, 2025, 8:50 PM

#

the test would be something like two_people_divide_days_in_leap_year

river pilot Nov 22, 2025, 8:50 PM

#

timber anchor the test would be something like two_people_divide_days_in_leap_year

so you wouldn't test is_leap_year directly?

timber anchor Nov 22, 2025, 8:50 PM

#

it is an on call rotation, should be at least two people

river pilot Nov 22, 2025, 8:51 PM

#

timber anchor it is an on call rotation, should be at least two people

so you wouldn't test is_leap_year directly?

timber anchor Nov 22, 2025, 8:52 PM

#

I think calendar does this already for you. You are testing a dependency?

pulsar oracle Nov 22, 2025, 8:53 PM

#

timber anchor I think calendar does this already for you. You are testing a dependency?

You wouldn't test leap year checking logic?

timber anchor Nov 22, 2025, 8:53 PM

#

Maybe. Not entirely sure

pulsar oracle Nov 22, 2025, 8:54 PM

#

I would. If I have business logic that depends on a year being of a certain type and doing something if it is, I would extract it into its own interface, like YearTypeChecker with a method to check what type of year it is, then write an implementation and test it there. For the core business logic I'd use a mock and make it say a certain year is a leap year, or test it with my implementation of the checker injected as a dependency.
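
A sketch of what that could look like. YearTypeChecker comes from the message above; CalendarYearTypeChecker and days_to_cover are made-up names for illustration, and the "business logic" is reduced to a day count.

```python
import calendar
from typing import Protocol
from unittest.mock import Mock


class YearTypeChecker(Protocol):
    def is_leap_year(self, year: int) -> bool: ...


class CalendarYearTypeChecker:
    """Real implementation; tested on its own."""

    def is_leap_year(self, year: int) -> bool:
        return calendar.isleap(year)


def days_to_cover(year: int, checker: YearTypeChecker) -> int:
    """Stand-in for business logic that branches on the year type."""
    return 366 if checker.is_leap_year(year) else 365


def test_checker_implementation() -> None:
    checker = CalendarYearTypeChecker()
    assert checker.is_leap_year(2024)
    assert not checker.is_leap_year(1900)  # divisible by 100 but not by 400


def test_business_logic_with_mock() -> None:
    fake = Mock(spec=CalendarYearTypeChecker)
    fake.is_leap_year.return_value = True  # "lie" about the year type
    assert days_to_cover(2023, fake) == 366
    fake.is_leap_year.assert_called_once_with(2023)
```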

timber anchor Nov 22, 2025, 8:55 PM

#

If you are writing a date library, sure

#

What are you mocking?

pulsar oracle Nov 22, 2025, 8:57 PM

#

If I have business logic dependent on whether the year is a leap year, then I put that in its own class and make a mock to test it. The actual thing would need to use an implementation that checks that leap year logic is at least somewhat right, which is where integration tests come in, or at the very least push it off to some acceptance tests.

river pilot Nov 22, 2025, 8:57 PM

#

timber anchor If you are writing a date library, sure

i'm not writing a date library. I have a helper function that tells me if a year is leap or not.

pulsar oracle Nov 22, 2025, 8:57 PM

#

(I'm not talking about an individual function at this point, by the way, just business logic and testing that logic somewhere)

pulsar oracle Nov 22, 2025, 9:01 PM

#

river pilot so you wouldn't test `is_leap_year` directly?

I think if I'm writing code I want to know that it does something when it's a leap year and when it isn't and I don't care about the logic for it actually being one, so it could take a callable to check, or a class that serves that purpose and you could just lie with some given input to check what you want. And I think a leap year example is different because it's something probably directly related to your business logic (If I'm not imagining this wrong) and you would want to know what you pass in actually works somewhat.

river pilot Nov 22, 2025, 9:02 PM

#

pulsar oracle I think if I'm writing code I want to know that it does something when it's a le...

"you would want to know what you pass in actually works": it sounds like you would write a test for is_leap_year()

pulsar oracle Nov 23, 2025, 12:25 AM

#

river pilot "you would want to know what you pass in actually works": it sounds like you wou...

In this case yes, but where I test it, and whether I test it directly at all, varies with what I'm making.

amber fulcrum Nov 28, 2025, 9:00 PM

#

I have tests that need some setup, e.g. loading a JSON schema. I've set them up as fixtures in the module and that's fine. However, I need the same fixture functions for several test modules, but with different file input. To stay DRY I'd like to move these fixtures to conftest.py.

Is there a way to parameterize fixtures in conftest.py where the parameter is actually provided by a test file/module?

#

Can a test fixture with @pytest.fixture(scope="module") get access to a variable which a specific module sets? -- Because scope="module" in conftest.py means per test-file, not per-conftest, right?

#

Google suggests that I can create a common fixture that returns a function that does the heavy lifting. This way I can give a per-file input to the reused fixture.

#

It ended up something like this:

# ---- conftest.py ----
import json
from pathlib import Path

import pytest

# ROOT is assumed to be a Path defined elsewhere (e.g. the project root).

@pytest.fixture(scope="session")
def fn_schema():
    """Return a function that reads a schema file."""
    def schema(filename: str | Path) -> dict:
        with open(ROOT / "schemas" / filename, "r") as f:
            return json.load(f)
    return schema

# ---- test_something.py ----
DUT_NAME = "user_data"

@pytest.fixture(scope="module")
def schema(fn_schema):
    return fn_schema(DUT_NAME)

def test_example(schema):
    ...

Then I realized that fn_schema() doesn't add any value as a fixture, since the schema() fixture is required anyway. It could just as well be a regular imported utility function.
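
One more option for the original question, as a sketch: a module-scoped fixture in conftest.py can read a variable from the requesting test module through request.module. SCHEMA_DIR and load_schema are assumed names for illustration; DUT_NAME is the module-level variable each test file sets.

```python
# ---- conftest.py (sketch) ----
import json
from pathlib import Path

import pytest

SCHEMA_DIR = Path("schemas")  # assumption: adjust to the project's layout


def load_schema(name: str, schema_dir: Path = SCHEMA_DIR) -> dict:
    """Plain helper that reads and parses one schema file."""
    return json.loads((schema_dir / name).read_text())


@pytest.fixture(scope="module")
def schema(request) -> dict:
    # request.module is the test module that requested this fixture,
    # so each test file supplies its own DUT_NAME.
    return load_schema(request.module.DUT_NAME)
```

A test file then only needs DUT_NAME = "user_data" at module level and a schema argument on its test functions. A module without DUT_NAME would fail with an AttributeError when the fixture is set up.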

river pilot Nov 28, 2025, 9:41 PM

#

amber fulcrum It ended up something like this: ```py # ---- conftest.py ---- @pytest.fixture(s...

right, schema can just be a regular function. Even fn_schema might not be providing much value, since it doesn't cost much to read the file each time.

amber fulcrum Nov 28, 2025, 9:42 PM

#

river pilot right, `schema` can just be a regular function. Even `fn_schema` might not be p...

The reason schema() is a fixture is because it's used a lot in the test functions. So to avoid repeating schema = fn_schema(DUT_NAME) in every function.

#

Scope isn't terribly important for that

#

I found indirect= as an option to parametrize and am looking into whether that is a more elegant method

#

This works, although the repeated @pytest.mark.parametrize(...) quickly gets very tedious:

# ---- conftest.py ----
@pytest.fixture
def schema(request):
    with open(ROOT / "schemas" / request.param, "r") as f:
        return json.loads(f.read())

# ---- test_something.py ----
DUT_NAME = "user_data"

@pytest.mark.parametrize("schema", [DUT_NAME], indirect=True)
def test_user_data(schema):
    ...

@pytest.mark.parametrize("schema", [DUT_NAME], indirect=True)
def test_user_data_2(schema):
    ...

amber fulcrum Nov 28, 2025, 10:20 PM

#

# This works:
schema = pytest.mark.parametrize("schema", [DUT_NAME], indirect=True)

@schema
def test_fn(schema):
    ...  # This works

# This doesn't work
@pytest.fixture
@pytest.mark.parametrize("schema", [DUT_NAME], indirect=True)
def local_schema(schema):
    return schema

def test_fn2(local_schema):
    ...  # This doesn't work

molten hollow Dec 4, 2025, 8:48 AM

#

amber fulcrum I have tests that needs some setup, e.g. by loading a JSON schema. I've set them...

By exposing that JSON schema you're making your tests unnecessarily complex. Maybe you could write your tests from the perspective of the user of the code, calling natural entry-point methods?

velvet dirge Dec 4, 2025, 10:43 PM

#

amber fulcrum ```py # This works: schema = pytest.mark.parametrize("schema", [DUT_NAME], indir...

can decorate the fixture if needed

@pytest.fixture(params=["user_data"])
def user_schema(request):
    return json.loads((ROOT / "schemas" / request.param).read_text())

def test_user_data(user_schema):
    ...

Otherwise, I'd write a helper to determine the schema based on some info. i.e. for a web request, you can use the path + openapi spec to find the json schema

boreal tundra Dec 7, 2025, 6:06 PM

#

I wish there was an ability to pattern match the module name on the command python -m unittest discover -s "longmodulename.modulenameblah*" -p test.py

#

I can match on the filename, but not the module

proud nebula Dec 7, 2025, 6:25 PM

#

boreal tundra I wish there was an ability to pattern match the module name on the command `pyt...

afaik you can with pytest

#

with -k

wind zenith Dec 8, 2025, 3:41 AM

#

Hi, so I made a click app that when you run it, runs a function with some prints and inputs like:

# in cli.py
@click.command(name="start")
def start():
    main()

cli.add_command(start)

# in cli_app.py
def main():
    raw_to_parse = input(textwrap.dedent(
        """
        Welcome!
        Do you want to start?
            (Y)es   [default]
            (N)o

        """
    ))
    to_parse: bool = True

    if raw_to_parse.lower() not in ["", "y", "yes"]:
        to_parse = False

    if to_parse is True:
        get_pattern()

How would I test this? CliRunner.invoke() doesn't seem to handle inputs in the function itself.

river pilot Dec 8, 2025, 10:17 AM

#

wind zenith Hi, so I made a click app that when you run it, runs a function with some print...

i'm not sure, but perhaps if you use click's utilities for input, their test tools would work with it? They have click.prompt() and click.confirm()
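
Separately from click's own helpers, main() can also be tested directly by patching the built-in input(). A minimal sketch, with a simplified main() and a stub get_pattern() that only records calls (both are stand-ins for the real code, and run_main_with_answer is a made-up helper):

```python
from unittest.mock import patch

calls: list[str] = []


def get_pattern() -> None:
    """Stub standing in for the real get_pattern(); it only records the call."""
    calls.append("get_pattern")


def main() -> None:
    """Simplified version of the main() from the question."""
    raw_to_parse = input("Do you want to start? (Y)es [default] / (N)o ")
    if raw_to_parse.lower() in ["", "y", "yes"]:
        get_pattern()


def run_main_with_answer(answer: str) -> list[str]:
    """Run main() with input() patched to return `answer`; report stub calls."""
    calls.clear()
    with patch("builtins.input", return_value=answer):
        main()
    return list(calls)
```

Each test then becomes one line per answer, for example checking that "n" leaves the call list empty.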

fiery arrow Dec 8, 2025, 12:08 PM

#

Am I tripping, or does IPython specifically not run this test in CI? https://github.com/ipython/ipython/blob/9.8.0/tests/cve.py since its file name doesn't start with test_.
On a recent CI run: https://github.com/ipython/ipython/actions/runs/19890255242/job/57007060466 I don't see cve anywhere in the logs

#

This seems like a very good argument for including tests in coverage, but they do include it in coverage, it just doesn't error when it's not 100% 🤔

river pilot Dec 8, 2025, 12:25 PM

#

sounds like a good issue to write

fiery arrow Dec 8, 2025, 12:32 PM

#

i'm just making sure I'm seeing it right

#

would be really silly to make a PR fixing it and it's like "we're obviously running the cve test in super-cool-separate-cve-runner"

river pilot Dec 8, 2025, 12:36 PM

#

fiery arrow would be really silly to make a PR fixing it and it's like "we're obviously runn...

at a quick look, i don't see a thing that runs it, and the commit that added that file didn't change any test-running.

fiery arrow Dec 8, 2025, 12:42 PM

#

river pilot at a quick look, i don't see a thing that runs it, and the commit that added tha...

alright, thanks for the reassurance
the test thankfully still passes, otherwise I would've thought it was intentionally named this way to unfix the CVE later

#

(kinda sus that the only test this happened with is a security related one)

#

sorry, I'm a bit over-paranoid

river pilot Dec 8, 2025, 12:45 PM

#

fiery arrow sorry, I'm a bit over-paranoid

I'd say "diligent and detail-oriented"

fiery arrow Dec 8, 2025, 12:47 PM

#

ugh, codecov has geoblocking... and also doesn't like requests over Tor...

#

I have to spam "new circuit for this site" just to see the coverage. smh

weary quarry Dec 9, 2025, 2:49 AM

#

👀 A cool example, in the wild, of why I like coverage on my test files right there. 📸

radiant stirrup Dec 14, 2025, 2:48 PM

#

So, like I have a problem when trying to run hatch test and I am not sure what I can do to fix it.

#

📎 message.txt

bitter wadiBOT Dec 14, 2025, 2:48 PM

#

radiant stirrup

Click here to see this code in our pastebin.

summer surge Dec 16, 2025, 3:10 PM

#

Hi everyone! I have a question about the practice of writing tests. The F.I.R.S.T principles require writing tests at the same time as creating some function x, or even before, but what about reality? Sometimes you don't want to write a test simply to check whether a value or None is returned, or to write a test before a certain function exists.
So how do you correctly implement the T (Timely) part of the principles in real-world development?

river pilot Dec 16, 2025, 4:10 PM

#

To me, Timely doesn't mean the test should exist before the code. It means the tests should be added to the project when the code is added to the project. "Added" could mean a pull request, or a work item, or whatever. The new code or fix isn't done until there are tests to go along with it.

summer surge Dec 16, 2025, 6:01 PM

#

river pilot To me, Timely doesn't mean the test should exist before the code. It means the t...

ty for your answer, now it makes sense! btw I read your article about mocks today. I found it really useful for my situation, and thanks to it I understand the bigger picture. Appreciate your work!

river pilot Dec 18, 2025, 4:55 PM

#

https://nedbatchelder.com/blog/202512/a_testing_conundrum.html

A testing conundrum

A useful class that is hard to test thoroughly, and my failed attempt to use Hypothesis to do it.

marsh raft Dec 18, 2025, 5:09 PM

#

river pilot https://nedbatchelder.com/blog/202512/a_testing_conundrum.html

Beautifully written, as usual. Just out of curiosity, how long did it take you to write that? (Excluding the programming stuff; I'm just curious about looking at a sentence and saying "hm, could that be clearer")

river pilot Dec 18, 2025, 5:16 PM

#

marsh raft Beautifully written, as usual. Just out of curiosity, how long did it take you ...

that was probably an hour, then a long walk, then 20 min of editing? These days I ask claude for critiques of drafts, and take ~half its suggestions

marsh raft Dec 18, 2025, 6:01 PM

#

ooh sneaky 🙂

river pilot Dec 18, 2025, 6:14 PM

#

I think you mean, using all of the tools available to me 🙂

marsh raft Dec 18, 2025, 6:16 PM

#

YES THAT'S CHEATING

languid lance Dec 18, 2025, 6:55 PM

#

I'm really confused...
Why does a test with the name test_lorum_ipsum_update pass, but when I change it to test_update_lorum_ipsum it fails?

river pilot Dec 18, 2025, 7:02 PM

#

languid lance I'm really confused... Why does a test with the name `test_lorum_ipsum_update` p...

changing the name won't do it. something else is going on. Can you show us the passing and the failing code?

marsh raft Dec 18, 2025, 7:09 PM

#

perhaps there are two tests with the same name? test frameworks might ignore the second one

languid lance Dec 18, 2025, 7:11 PM

#

The assertion that's failing when I have the test named as test_update_lorum_ipsum is:

mock_mongo_document.find_one_and_update.assert_called_once_with(...)

The fail is:
AssertionError: Expected 'find_one_and_update' to be called once, Called 0 times

Then, when I change the test name to test_lorum_ipsum_update it passes.
There are no other tests with the failing test name. That was my first thought 😅

river pilot Dec 18, 2025, 7:14 PM

#

Renaming could change the order the tests are run. Perhaps your tests are not isolated from each other

languid lance Dec 18, 2025, 7:15 PM

#

river pilot Renaming could change the order the tests are run. Perhaps your tests are not i...

The test class that it lives in inherits from unittest.IsolatedAsyncioTestCase

marsh raft Dec 18, 2025, 7:15 PM

#

you're certain that when it passes, it actually runs (as opposed to being skipped)?

#

I guess your future looks like: simplify your tests bit by bit until you discover the bit that is breaking things

languid lance Dec 18, 2025, 7:16 PM

#

marsh raft you're *certain* that when it passes, it actually runs (as opposed to being skip...

I don't believe it's skipping since it says it passed. I could run the test through the debugger and see what all is happening

river pilot Dec 18, 2025, 7:17 PM

#

languid lance The test class that it lives in inherits from `unittest.IsolatedAsyncioTestCase`

I don't know what IsolatedAsyncioTestCase does, but there are lots of ways for tests to be accidentally coupled to each other.

marsh raft Dec 18, 2025, 7:22 PM

#

languid lance I don't believe it's skipping since it says it passed. I could run the test thro...

I'd put 5/0 or some other easy-to-type thing that is guaranteed to raise an exception

languid lance Dec 18, 2025, 7:23 PM

#

@river pilot @marsh raft - okay, so if I comment out the test above it which mocks out the function that I'm testing later, it passes with the name that I want (test_update_lorum_ipsum).

river pilot Dec 18, 2025, 7:23 PM

#

languid lance @river pilot @marsh raft - okay, so if I comment out the test...

can you share the code of that test you just commented out?

pulsar oracle Dec 18, 2025, 7:26 PM

#

river pilot I don't know what IsolatedAsyncioTestCase does, but there are lots of ways for t...

It runs your test method in a new event loop, same design as the original class but that

languid lance Dec 18, 2025, 7:35 PM

#

Here's the test class' setUp:

def setUp(self):
    self.service = AttendanceService()

The first test:

async def test_handle_absence(self):
    { ... }
    
    self.service.update_attendance = AsyncMock()
    self.service.update_attendance.return_value = AttendanceModel(...)

    { ... }

    self.service.handle_absence()
    self.service.update_attendance.assert_called_once_with(...)

This passes...

Second test:

async def test_update_attendance(self, mock_document):
    mock_document.find_one_and_update = AsyncMock()
    mock_document.find_one_and_update.return_value = AttendanceModel(...)

    test_result = await self.service.update_attendance(...)

    mock_document.find_one_and_update.assert_called_once_with(...)

river pilot Dec 18, 2025, 7:40 PM

#

What is cleaning up the mocks? Something needs to undo them at the end of the test.

#

or, what makes mock_document, and what uses it?

#

I hope AttendanceService isn't a singleton....

river pilot Dec 18, 2025, 7:43 PM

#

languid lance Here's the test class' setUp: ```python def setUp(self): self.service = Atte...

more ideas ^^

languid lance Dec 18, 2025, 7:56 PM

#

river pilot I hope AttendanceService isn't a singleton....

It is a singleton 😂

river pilot Dec 18, 2025, 7:59 PM

#

languid lance It is a singleton 😂

does that mean that all of your tests are using the same one? That will be a problem.

languid lance Dec 18, 2025, 8:04 PM

#

Yeah, which now makes sense with me not having a cleanup

river pilot Dec 18, 2025, 8:04 PM

#

I recommend not using singletons: https://nedbatchelder.com/blog/202204/singleton_is_a_bad_idea.html

Singleton is a bad idea

Design patterns are a great way to think about interactions among classes. But the classic Singleton pattern is bad: you shouldn’t use it and there are better options.

languid lance Dec 18, 2025, 8:07 PM

#

I agree with you. The client that I'm working for uses them, so my hands are tied.

river pilot Dec 18, 2025, 8:15 PM

#

languid lance I agree with you. The client that I'm working for uses them, so my hands are tie...

ok, at least the root cause has been found. Cleaning up the mocks should fix it.
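
A sketch of one way to do that cleanup: patch.object used as a context manager removes the mock when the block exits, so the shared instance is back to normal for the next test. The AttendanceService below is a tiny stand-in singleton, not the real class.

```python
import unittest
from unittest.mock import AsyncMock, patch


class AttendanceService:
    """Stand-in singleton: every call to AttendanceService() returns one object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    async def update_attendance(self) -> str:
        return "real"

    async def handle_absence(self) -> str:
        return await self.update_attendance()


class AttendanceServiceTests(unittest.IsolatedAsyncioTestCase):
    def setUp(self) -> None:
        self.service = AttendanceService()  # the same object in every test

    async def test_handle_absence(self) -> None:
        # The patch is undone when the with-block exits.
        with patch.object(
            self.service, "update_attendance", new_callable=AsyncMock
        ) as fake:
            fake.return_value = "mocked"
            result = await self.service.handle_absence()
        fake.assert_awaited_once_with()
        self.assertEqual(result, "mocked")

    async def test_update_attendance(self) -> None:
        # Runs after test_handle_absence and still sees the real method.
        self.assertEqual(await self.service.update_attendance(), "real")
```

Assigning self.service.update_attendance = AsyncMock() directly, as in the first test above, leaves the mock on the singleton for every later test.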

languid lance Dec 18, 2025, 8:16 PM

#

river pilot ok, at least the root cause has been found. Cleaning up the mocks should fix it...

Thank you and @marsh raft for being my rubber ducks

atomic thistle Dec 20, 2025, 8:02 PM

#

Ran into a bug I can't get to the bottom of, maybe someone smarter than me can figure it out. The following test produces this error in python 3.10 - 3.11, but not 3.12 or later:

NameError: name 'isclose' is not defined

def test_shear_wall():
    file = "/Users/villager/Projects/pynite/Examples/Shear Wall - Basic.py"
    exec(open(file).read())

The script makes use of the math.isclose function. from math import isclose
I cannot reproduce the error in a smaller example.
There is no error running the file directly, or via exec. So I expect pytest is somehow related.
More details: https://github.com/JWock82/Pynite/pull/301

swift pewter Dec 20, 2025, 8:23 PM

#

atomic thistle Ran into a bug I can't get to the bottom of, maybe someone smarter than me can f...

I cannot reproduce the error in a smaller example.
What does that mean? It doesn't happen if you remove everything past the last isclose call in that file?

#

Also: Do you have more traceback than just the NameError? Which line does it error on?

atomic thistle Dec 20, 2025, 8:26 PM

#

I don't think I can see more traceback because I'm running it through exec. GitHub action log here: https://github.com/JWock82/Pynite/actions/runs/20359819373/job/58618190836#step:6:115

#

I will try removing everything past the last isclose call.

swift pewter Dec 20, 2025, 8:28 PM

#

I would remove things until it no longer occurs. Does it happen when referring to isclose at all after the import? Does it even happen in the script file itself (or in something it calls)?

atomic thistle Dec 20, 2025, 8:46 PM

#

Thanks! That was helpful, it's happening at the list comprehension. This also throws an error:

n = len([node for node in model.nodes.values() if height])

NameError: name 'height' is not defined

I'll keep poking at it.

river pilot Dec 20, 2025, 9:06 PM

#

it sounds like some odd scoping thing
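
One possible explanation, offered as a guess rather than a diagnosis of the Pynite script: exec() called inside a function without an explicit namespace uses separate globals and locals, and before Python 3.12 a comprehension ran in its own nested scope, which cannot see names stored only in those locals. Passing a single dict as the namespace sidesteps that. run_script is a hypothetical helper:

```python
def run_script(source: str) -> dict:
    """Execute script source with one dict serving as both globals and locals."""
    namespace: dict = {"__name__": "__main__"}
    # With a single namespace, names imported or assigned by the script are
    # globals, so comprehensions inside the script can resolve them.
    exec(compile(source, "<script>", "exec"), namespace)
    return namespace


EXAMPLE_SOURCE = """
from math import isclose
height = 3
matches = [x for x in range(5) if isclose(x, height)]
"""
```

In a test this would replace exec(open(file).read()) with run_script(open(file).read()); runpy.run_path(file) from the standard library is another way to run a script file in a fresh namespace.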

sacred lintel Dec 21, 2025, 4:28 PM

#

anyone have an idea on how i can write unit tests for this?
https://github.com/CheetahDoesStuff/sleet
I fear that they will change/affect the project's environment (installing/deleting packages, writing commits, etc.), as that is what it's built for and those are the features I would need to test.


river pilot Dec 21, 2025, 4:41 PM

#

sacred lintel anyone have an idea on how i can write unit tests for this? https://github.com/C...

i haven't looked at the code, but it sounds like you could create a temporary directory and do everything there, checking the results.
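
A minimal sketch of the temporary-directory approach. add_dependency is a hypothetical stand-in for a command that modifies a project; it is not taken from the sleet code.

```python
import tempfile
from pathlib import Path


def add_dependency(project_dir: Path, package: str) -> None:
    """Hypothetical stand-in for a command that modifies a project directory."""
    requirements = project_dir / "requirements.txt"
    existing = requirements.read_text() if requirements.exists() else ""
    requirements.write_text(existing + package + "\n")


def test_add_dependency_only_touches_temp_dir() -> None:
    # Everything happens inside a throwaway directory that is deleted afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        project_dir = Path(tmp)
        add_dependency(project_dir, "requests")
        add_dependency(project_dir, "click")
        assert (project_dir / "requirements.txt").read_text() == "requests\nclick\n"
```

With pytest, the built-in tmp_path fixture provides a fresh directory per test, so the with-block is not needed there.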

sacred lintel Dec 21, 2025, 4:41 PM

#

river pilot i haven't looked at the code, but it sounds like you could create a temporary di...

hmm, i guess that would work

limpid raft Dec 23, 2025, 10:02 PM

#

I am currently taking over a pretty big code-base that isn't in Git, nor is it test-covered. What I would really like to do is import individual pieces of code into Git along with the tests I write. This is kinda hard to do with a big, convoluted code-base, and I am wondering if you know of a way to run pytest on the Git index (cached changes) whenever those change? I.e. I do git-add and pytest runs on whatever is HEAD + index at that time and gives me the output? Sort of like what pre-commit can do pre-commit…

river pilot Dec 23, 2025, 10:05 PM

#

why not put all of the code into git now? I don't understand how git-ness and tested-ness are connected.

limpid raft Dec 23, 2025, 10:06 PM

#

It's me trying to make sense of the big thing by carving out batches at a time.

#

maybe I just want pre-commit

river pilot Dec 23, 2025, 10:08 PM

#

i wouldn't run pytest in pre-commit, it could be much too slow.

limpid raft Dec 23, 2025, 10:08 PM

#

Well, I agree, but maybe this is precisely what I need right now?

swift pewter Dec 23, 2025, 10:10 PM

#

limpid raft I am currently taking over a pretty big code-base that isn't in Git, nor is it t...

I think doing this on the index is hard. Maybe you could settle for doing it based on commits? Then you could use some combination of git worktree to have a second, linked checkout of the same repo which is on that same branch (but doesn't have any uncommitted files), entr (e.g. to watch HEAD), and a little script that pulls and runs pytest in the second checkout whenever you make a commit.