Sudoku: Little Help?
As I think about additional Techniques, I believe I need more help from my objects. I have an idea to try out. Well, part of an idea.
Of the advanced strategies I’ve read about, Naked Pairs, still not as interesting as it sounds, is about as simple as they get, and it’s not that simple. I don’t want to go down the path that we found on chem.fsu.edu. The chem.fsu.edu code works, and is not terribly difficult to understand. But it is highly repetitive, and it’s not easy to tell the differences from the similarities among the strategies. And it’s not the kind of code I like to write and work with. In particular, though I may have used one or the other, I am not fond of loops that include continue and/or break statements.
I am too old to get into an argument with you about which is “better”, because certainly it depends on the context, but in my context, I prefer to have objects that help me more, and I prefer not to see code with complicated loops. So, this morning, as I was trying to avoid getting up at 0530, I was musing about how to develop these Techniques with as much help from the objects as possible. I have some ideas:
-
Given a Component (row, column, box1), process it in unique combinations of two, three, or four items. Provide the complement of those items in the Component in some easy fashion.
-
I think all the upcoming Techniques only manipulate the candidates lists. The typical behavior of a Technique seems to be to determine some values that can be removed from some subset of the candidates lists in the component … or perhaps another component. So the items in the component should probably be something more clever than a simple list. In particular, lists do not understand
-. Perhaps the candidates collections should be sets. Or, my usual belief is that they should be a simple object offering the protocol that I want, not some generic collection behavior. -
The Component, as it stands, offers methods like
candidates_atandset_candidates_atandremove_candidates_at, which invite us to think in terms of the positions in the Component, subscripts zero through eight. There ought to be, it seems to me, better objects for all this.
All this is vague speculation done while trying to go back to sleep. So I don’t believe anything specific from the above, but I do trust the basic notions enough to be willing to try.
Let’s begin with Naked Pairs, which is at least partly implemented.
The Naked Pairs rule is
If the same two values appear alone in exactly two cells of a Component, no other cells of the Component can possibly contain those values. Remove them from all the other cells.
Let’s see the code:
class NakedPairsTechnique:
def __init__(self, puzzle, component):
self.puzzle = puzzle
self.component = component
def apply(self):
naked_pairs = self.find_all_naked_pairs()
for naked_pair in naked_pairs:
self.puzzle = self.component.remove_naked_pair(naked_pair)
return self.puzzle
def find_all_naked_pairs(self) -> list:
result = []
component_candidates = self.component.all_candidates()
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
Now my initial reaction to this is a bit of dismay because as a chunk, it looks complicated. When I calm down enough to read apply, that, at least, makes some sense: find all the naked pairs and remove them from the component. I do kind of wonder about that remove operation: knowing what the rule is makes me think that the ‘remove’ had better only process some of the elements of the component.
Then attention goes to the find_all_naked_pairs. OK, that’s clever but maybe not too clever. We loop over the candidates and if we find a pair (it is therefore naked), we search the subsequent candidates and if the pair is there (again), we add it to our result. The result will contain lists of two candidates.
The Python itertools includes a function combinations that will return the unique combinations of the elements in the sequence it is applied to. If we ask for size 2, from a list [a, b, c, d], it will give us [a, b], [a, c], [a, d], [b, c], [b, d], and [c, d]. That might be convenient. I am slightly concerned because I think I’m going to want the complement of the list, that is, if I’m looking at [b, d] I may want the [a, c] list. I’m also a bit concerned about equality vs identity among the candidates lists. I think we should put off that concern, on the assumption that we can resolve it somehow.
Let’s explore whether combinations will help us write find_all_naked_pairs more readily.
def find_all_naked_pairs(self) -> list:
result = []
for possibles1, possibles2 in combinations(self.component.all_candidates(), 2):
if len(possibles1) == 2 and possibles1 == possibles2:
result.append(possibles1)
return result
That passes three of the four relevant tests. The one that does not pass is this one:
def test_find_result_with_three_illegal_configuration(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(5, ['7', '8'])
technique = NakedPairsTechnique(puzzle, row)
pairs = technique.find_all_naked_pairs()
assert len(pairs) == 2
assert pairs[0] == ['7', '8']
assert pairs[1] == ['7', '8']
This test checks what happens with a situation that “can never occur”, the same pair occurring more than twice in a Component. It seems that every time I change the algorithm, the result of this test changes. Let’s change it a bit, to allow for more cases but still test something:
def test_find_result_with_three_illegal_configuration(self):
"""may find two or three."""
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(5, ['7', '8'])
technique = NakedPairsTechnique(puzzle, row)
pairs = technique.find_all_naked_pairs()
assert len(pairs) == 2 or len(pairs) == 3
for pair in pairs:
assert pair == ['7', '8']
We are green. Could commit. Let’s see if we want to. Here are our new method and the old one for comparison:
def find_all_naked_pairs(self) -> list:
result = []
for possibles1, possibles2 in combinations(self.component.all_candidates(), 2):
if len(possibles1) == 2 and possibles1 == possibles2:
result.append(possibles1)
return result
def find_all_naked_pairsx(self) -> list:
result = []
component_candidates = self.component.all_candidates()
for pos, cand in enumerate(component_candidates):
if len(cand) == 2:
if cand in component_candidates[pos+1:]:
result.append(cand)
return result
New way is shorter. More explicit in that the two possibles are equal.
We could write that as a comprehension, I suppose:
def find_all_naked_pairs(self) -> list:
candidates_pairs = combinations(self.component.all_candidates(), 2)
return [p1 for p1, p2 in candidates_pairs if len(p1) == 2 and p1 == p2]
We could have done the other way as a comprehension also, I suppose. I almost like this and will commit: refactor to use combinations while considering stronger objects.
Reflection
We weren’t just here to see if we could make our Naked Pairs thing a bit nicer. We’re here to think about making all these techniques easier, by having smarter objects and better techniques.
Generalizing what I think I know about upcoming Techniques, they will tend to want to consider one, two, three, or four2 possible elements of a Component. They will want to limit the combinations they consider. In our Naked Pairs, we only consider combinations whose length is 2 and which are equal. That’s the if part of our comprehension.
What if we could pull that part out? Like this:
def find_all_naked_pairs(self) -> list:
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
candidates_pairs = combinations(self.component.all_candidates(), 2)
return [p1 for p1, p2 in candidates_pairs if cond(p1, p2)]
Commit: pull out condition into function.
We observe that this method isn’t concerned with itself so much as with the Component. What if we had a useful method on Component, called find_pairs, like this:
class NakedPairsTechnique:
def find_all_naked_pairs(self) -> list:
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
return self.component.find_candidates_pairs(cond)
All my tests for this fail, of course, no such method. But …
class Component:
def find_candidates_pairs(self, condition):
groups = combinations(self.all_candidates(), 2)
return [p1 for p1, p2 in groups if condition(p1, p2)]
Green. Commit: push finding pair candidates behavior to Component.
But now … check this out:
class NakedPairsTechnique:
def apply(self):
naked_pairs = self.find_all_naked_pairs()
for naked_pair in naked_pairs:
self.puzzle = self.component.remove_naked_pair(naked_pair)
return self.puzzle
def find_all_naked_pairs(self) -> list:
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
return self.component.find_candidates_pairs(cond)
That can be simplified somewhat, I think. Unfortunately we have tests calling `find_all_naked_pairs’, which I would otherwise inline.
I think I want to do it anyway. Let’s see what we can do.
def apply(self):
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
for naked_pair in self.component.find_candidates_pairs(cond):
self.puzzle = self.component.remove_naked_pair(naked_pair)
return self.puzzle
I tried the cond as a lambda, but it makes the for line too complicated in my opinion.
Now if I remove the old find_all_naked_pairs method here I get two tests failing. I change them to test the find in the row component, which is where the action is:
def test_find_etc(self):
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(0, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(2, ['5', '6'])
row.set_candidates_at(7, ['5', '6'])
pairs = row.find_candidates_pairs(cond)
assert len(pairs) == 2
assert pairs[0] == ['7', '8']
assert pairs[1] == ['5', '6']
def test_find_result_with_three_illegal_configuration(self):
"""may find two or three."""
def cond(p1, p2):
return len(p1) == 2 and p1 == p2
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(1, ['7', '8'])
row.set_candidates_at(3, ['7', '8'])
row.set_candidates_at(5, ['7', '8'])
pairs = row.find_candidates_pairs(cond)
assert len(pairs) == 2 or len(pairs) == 3
for pair in pairs:
assert pair == ['7', '8']
We still have two tests for the NakedPairsTechnique itself:
def test_naked_pairs(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
row.set_candidates_at(0, ['1', '2'])
row.set_candidates_at(4, ['1', '2'])
assert row.candidates_at(0) == ['1', '2']
technique = NakedPairsTechnique(puzzle, row)
new_puzzle = technique.apply()
assert new_puzzle is not puzzle
new_row = Component.row(new_puzzle, 0)
for cell in [1, 2, 3, 5, 6, 7, 8]:
candidates = new_row.candidates_at(cell)
assert candidates == ['3', '4', '5', '6', '7', '8', '9']
def test_naked_pairs_2(self):
empty = [
'000000000', '000000000', '000000000', '000000000',
'000000000', '000000000', '000000000', '000000000',
'000000000',
]
puzzle = Puzzle.from_list(empty)
row = Component.row(puzzle, 0)
for cell in range(9):
assert row.candidates_at(cell) == ['1', '2', '3', '4', '5', '6','7', '8', '9']
row.set_candidates_at(1, ['8', '9'])
row.set_candidates_at(3, ['8', '9'])
technique = NakedPairsTechnique(puzzle, row)
new_puzzle = technique.apply()
new_row = Component.row(new_puzzle, 0)
for cell in [0, 2, 4, 5, 6, 7, 8]:
candidates = new_row.candidates_at(cell)
assert candidates == ['1', '2', '3', '4', '5', '6', '7']
Those seem to me to be pretty decent checks. If they run, the cond must be good enough. I would prefer, I think, to have a way of promoting the cond functions to enough prominence that we could test them on their own. As things stand, they are tested in direct tests but inline in the tests, and also in conjunction with something like the tests for NakedPairs. If a NakedPairs test fails, it could be due to the cond, it could be due to something wrong with the find, and it could be due to a flawed adjustment of the candidates list based on good values from the find.
We would much prefer tests that can only fail for one reason. Makes our job easier when, inevitably, something breaks. Er, is broken. Er, when we break something.
We’ll leave that for now. Let’s sum up.
Summary
We set out this morning to explore whether our objects could be more helpful in dealing with searches like what we need for Naked Pairs. As I review my starting remarks, there was a lot of focus on the Component, and, sure enough, as we evolved the code for Naked Pairs, we came to a point where it made sense to move the action over to Component. A win for early-morning musing, perhaps.
That said, so far we only have one cond that is useful, the one for Naked Pairs, and it’s not entirely clear that pushing one thing into Component is entirely brilliant. It does make Naked Pairs simpler, but we’ve added some powerful generality with only one application of that generality. Might be more than we need.
That could be. How can we know how far to go unless we know how far is too far? If, later on, this looks too complicated, we’ll improve it. That’s what we do, improve code.
But I think it’s more likely to work out, though I can’t really foresee how. I know that there are Naked Triples and Naked Quads. Those won’t work with the current find_candidates_pairs, but clearly there can be analogous methods for threes and fours.
I think we might still want direct access to the “complement”, the candidates lists that are not in the pair, triple, quad we’ve found. If we do want that, I anticipate that we’ll want to code up our own version of combinations, unless there’s something hiding in itertools. Something to read about in my copious free time.
And we’re seeing signs that our tests don’t always have quite the ideal granularity, sometimes manipulating something larger than we care about, sometimes something smaller but not the actual production code, more of a sample of how to do it.
So, bottom line, I think we’ve improved the Naked Pairs, added a somewhat useful capability to Component, and that we’re beginning to see the shape of some useful capability coming out of the fog.
We’ll see, as time goes on. For today, I’m pleased, and hope that you are, too. See you next time!
-
I’ve been calling these sub-grid. Box is perhaps better. It seems to be more common in the “literature”, and it makes for a convenient yet meaningful set of three-letter names,
row,col,box, which would be nice. ↩ -
Two is probably all a normal human would ever use, but there are techniques that consider triples and even quads. Yeesh. ↩