Musing on the process of making messy code habitable. Yes, refactoring has risks. So has messy code. Also: Resist fascism, and take care of one another.

In yesterday’s article, we improved the drawing code in NormalFrameView and TenthFrameView. I think it’s worth reviewing how we did it. I think it’s important.

Create Duplication, then Remove It

We started with code that did similar things slightly differently, for example:

class TenthFrameView:
    def draw(self, game):
        ...
        self.white_lines([(x - mark_line, mark_line), (x - mark_line, 0)])
        self.white_lines([(x - 2 * mark_line, mark_line), (x - 2 * mark_line, 0)])
        self.white_lines([(x - game.FRAME_SIZE, game.GAME_Y), (x, game.GAME_Y)])

We renamed variables and moved lines around, and soon there was more duplication:

class NormalF
class TenthFrameView:(self, game):
        ...
        self.white_lines([(x_right , mark_line), (x_right - mark_line, mark_line), (x_right - mark_line, 0)])
        self.white_lines([(x_right, mark_line), (x_right - 3 * mark_line, mark_line)])
        self.white_lines([(x_right - 2 * mark_line, mark_line), (x_right - 2 * mark_line, 0)])

With a bit of adjusting, a few enters and multiplying by one, we got this:

        self.white_lines([
            (x_right , mark_line),
            (x_right - 1 * mark_line, mark_line),
            (x_right - 1 * mark_line, 0)])
        self.white_lines([
            (x_right , mark_line),
            (x_right - 2 * mark_line, mark_line),
            (x_right - 2 * mark_line, 0)])
        box_number = 3
        self.white_lines([
            (x_right , mark_line),
            (x_right - box_number * mark_line, mark_line),
            (x_right - box_number * mark_line, 0)])

Now those are completely so similar that even PyCharm notices it and we can extract method:

    def draw(self, game):
        ...
        self.small_box(x_right, mark_line, 1)
        self.small_box(x_right, mark_line, 2)
        self.small_box(x_right, mark_line, 3)

    def small_box(self, x_right, mark_line, box_number):
        self.white_lines([
            (x_right, mark_line),
            (x_right - box_number * mark_line, mark_line),
            (x_right - box_number * mark_line, 0)])

Then, with only a little work, we move small_box over to our ViewHelper object and we can also use it in the NormalFrameView to draw the little box that it needs. The code expresses its intention better, and it does all the small boxes the same way.

Almost by Rote

It would certainly be possible to come up with this result by thinking “Hey, there are all these little boxes, let’s devise a small box method on the helper and everyone can use it”. And we could probably come up with a small box method. It might even be as easy to use as the one we have. We’re probably smart enough to do that and wind up at a similar place to where we are in the code above.

My experience is that it takes hard thinking to look at code like the code at the top and realize that there is a lot of small box drawing going on. The code is opaque. It does not express its intention. It just draws lines, and it looks almost random. It’s hard to figure out which line is which. I find it difficult to move from confusing code like that to a better concept like “draw small boxes”.

I find it easy to see two lines that are similar and make them more similar. PyCharm makes it easy to move lines up and down, to arrange similar methods to look more similar. I can even do some nice multi-cursor editing to make similar changes in all the places where I want them to be more similar.

And after a little bit of easily changing things to look more similar, I get two good effects.

First, even if I don’t extract anything, I find the second of these examples far less troubling to read than the first:

        self.white_lines([(x_right , mark_line), (x_right - mark_line, mark_line), (x_right - mark_line, 0)])
        self.white_lines([(x_right, mark_line), (x_right - 3 * mark_line, mark_line)])
        self.white_lines([(x_right - 2 * mark_line, mark_line), (x_right - 2 * mark_line, 0)])

# -------------

        self.white_lines([
            (x_right , mark_line),
            (x_right - 1 * mark_line, mark_line),
            (x_right - 1 * mark_line, 0)])
        self.white_lines([
            (x_right , mark_line),
            (x_right - 2 * mark_line, mark_line),
            (x_right - 2 * mark_line, 0)])
        self.white_lines([
            (x_right , mark_line),
            (x_right - 3 * mark_line, mark_line),
            (x_right - 3 * mark_line, 0)])

It’s really clear in the second case that those three things are similar and, to my eyes, the similarity is not so clear in the first example. And it’s really clear that there’s something going on with that multiplier. So even if we leave the code like that, we can understand it better, and have a better chance of improving it later.

But the second effect is that removing the duplication always improves the code. Compare these two:

        self.white_lines([(x_right , mark_line), (x_right - mark_line, mark_line), (x_right - mark_line, 0)])
        self.white_lines([(x_right, mark_line), (x_right - 3 * mark_line, mark_line)])
        self.white_lines([(x_right - 2 * mark_line, mark_line), (x_right - 2 * mark_line, 0)])

# -------------

        self.small_box(x_right, mark_line, 1)
        self.small_box(x_right, mark_line, 2)
        self.small_box(x_right, mark_line, 3)

I work almost by rote, thinking mostly tactically, and the strategy just seems to emerge, because when we have code that truly looks alike, we are very likely to see the chance to make a function, a method, or an object.

Hey! That’s not right!

We might also better notice a bit of a defect, which I think I just did. Let’s try drawing just one of the three boxes in the tenth frame:

        # vh.small_box(x_right, mark_line, 1)
        vh.small_box(x_right, mark_line, 2)
        # vh.small_box(x_right, mark_line, 3)

box bottom is too long

The small_box method is drawing always from the right margin to the upright of the box. This is invisible in actual use but it’s not right. We can fix all the boxes in just one place, since we have the small_box method.

We change this:

# before

    def small_box(self, x_right, mark_line, box_number):
        self.white_lines([
            (x_right, mark_line),
            (x_right - box_number * mark_line, mark_line),
            (x_right - box_number * mark_line, 0)])

# after

    def small_box(self, x_right, mark_line, box_number):
        self.white_lines([
            (x_right - (box_number-1)*mark_line, mark_line),
            (x_right - box_number * mark_line, mark_line),
            (x_right - box_number * mark_line, 0)])

And it’s all better:

box correctly drawn

Full Disclosure

I’m not sure when that defect appeared, but it was somewhere along the way during this refactoring. The original TenthFrameView drew one big long line across the whole frame and then drew two verticals. When I observed that we could draw three L-shaped things, I introduced the “defect” of always drawing from the right margin. With a different picture, we’d have seen that sooner.

There is no doubt that refactoring can introduce errors. When we’re refactoring “model” code, our tests are there to tell us when we’ve broken something: that’s a big reason why we write them. But when we’re refactoring “view” code, we have to be careful because the changes will typically only show up on the screen.

Fear is the mind-killer.

Many teams and individuals are unwilling to refactor messy code when they find it, and there is good reason to be concerned: it is easy to break code that we do not understand. That’s why people like me and my friends are so careful not to create messy code, and to clean up any mess as soon as we notice it. That’s why I was so disappointed in myself when I created such a mess in the GUI code of this little exercise: I know that things go best when I don’t let the code deviate very far from “nearly good”.

Tiny steps are the fear-killer.

There are, however, safe ways to improve messy code. It can be tedious, of course, if there’s a lot of it, and there probably will be. We surround it with tests. The first tests will be large, but as we proceed things will become easier to test. We make changes only with machine refactoring, using our PyCharm or IDEA or Eclipse to rename things, pull out methods, inline variables, whatever changes we need. And, when we need to make a change manually, we double down on making a test first, and if we just can’t, we’re very careful and work with others, to give ourselves the best chance of seeing any problems.

And of course, we go in tiny tiny steps, checking after each step that everything is still OK.

Of course, we don’t absolutely need to fix up messy code. We can certainly leave it alone if we don’t need to change it, and if we do need to change it, we can just squidge in our changes, leaving most of the code alone, and then back away slowly in case it is going to explode.

The thing is, if we do not improve messy code, at least a little bit, every time we work on it, it gets worse. And that makes it harder to make the next change. We introduce more and more defects. We slow down. No one likes that. And we don’t enjoy our work, and one day we all scream This pile of effluviant has to be rewritten!

And that way lies big trouble. Trust me: refactoring when we can is better than rewriting. Of all the rewriting projects I personally know about, exactly zero worked out as well as hoped, and most of them failed entirely. Perhaps your experience has been better. Perhaps it was a dream.

Summary

I am not satisfied, yet, with this silly Bowling program. We’ll visit it at least one more time. But what I am observing today is that code gets messy little by little, and we can make it less messy little by little. It’s not without risk, but the risks are low. The LLL defect above did no harm. If it ever had, the defect was in one place and easy to fix in that place. Far better than if it had been in four places in two different classes.

I try to avoid making a mess. When I do make a mess, as you have seen me do in this exercise, I try to clean it up the next time I’m in the area. I’ve come to enjoy the small simple steps, especially when creating and removing duplication. It’s the most restful part of programming for me. Maybe give it a try: you might like it.

See you next time! And, resist fascism, and take care of one another.