FAFO 12: Oops, Arrgh
In which, I realize that I have made a mistake.
Immediately upon signing off yesterday, I realized that there is a mistake in my tests and code. Possibly two mistakes.
def test_xset_restrict_again(self):
ron = XSet([Atom("jeffries", "last"), Atom("ron", "first"), Atom("boss", "job")])
chet = XSet([Atom("chet", "first"), Atom("hendrickson", "last"), Atom("boss", "job")])
hill = XSet([Atom("hill", "last"), Atom("geepaw", "first"), Atom("serf", "job")])
personnel = XSet([ron, chet, hill])
serf_record = XSet([Atom("serf", "job")])
serf_set = XSet([serf_record])
serfs = personnel.restrict(serf_set)
assert ron not in serfs
assert chet not in serfs
assert hill in serfs
My “design” is that every XSet contains a frozen set of Atom instances. In the code above, the set personnel contains a frozen set of XSet instances, not a set of Atoms. The code works, but only because it compensates accidentally on the way out for the accident on the way in.
In Extended Set Theory (XST), a set consists of elements with scopes. A classical set, in XST, has the null set as the scope of each element. An ordered set (n-tuple) has contiguous integer scopes, starting at 1. (I think it would be OK to start at zero, but Childs disagrees. I think it’s just because he used one, not for any deep reason.)
So in the code above, we need to decide what we intend personnel to be, and then make it so. If it is a simple set, we should populate it with Atoms that have the null set as scopes. (We have not as yet defined the null set but it’s empty, so it won’t be difficult.) If we intend personnel to be a tuple, so that we could speak of record number 1 or 15, then we should give it increasing integer scopes. We should set up serfs similarly.
That part is easy. What is not easy is the assertions:
assert ron not in serfs
assert chet not in serfs
assert hill in serfs
Suppose we choose the tuple formulation. Then the hill record will be in personnel as number 3:
hill = XSet([Atom("hill", "last"), Atom("geepaw", "first"), Atom("serf", "job")])
personnel = XSet([Atom(ron, 1), Atom(chet, 2), Atom(hill, 3)])
Then, we have a choice as to how restrict works, but Childs’ convention would have the result set be
serfs == XSet([Atom(hill, 3)])
The set hill isn’t in the XSet … it’s inside an Atom in the XSet. I think that Python’s in will not find it. I am not certain of that, because I am not certain how in actually works, but I think a basic rule is that if a thing hashes unequal it must be unequal.
This discussion is confusing, even for me. I’m sorry. The code will be less so, I hope. But it is making me question this frozen set approach.
Let’s bite the bullet, and recast the set and operation and see if we can figure out how to finish the test. That will move things from hypothesis and imagination to something concrete.
Let’s use some class methods. I can imagine a few. Let’s do classical_set:
def test_xset_restrict_again(self):
ron = XSet([Atom("jeffries", "last"), Atom("ron", "first"), Atom("boss", "job")])
chet = XSet([Atom("chet", "first"), Atom("hendrickson", "last"), Atom("boss", "job")])
hill = XSet([Atom("hill", "last"), Atom("geepaw", "first"), Atom("serf", "job")])
personnel = XSet.classical_set([ron, chet, hill])
serf_record = XSet([Atom("serf", "job")])
serf_set = XSet.classical_set([serf_record])
serfs = personnel.restrict(serf_set)
assert ron not in serfs
assert chet not in serfs
assert hill in serfs
And …
class XSet:
@classmethod
def classical_set(cls, a_list):
null = cls([])
wrapped = [Atom(item, null) for item in a_list]
return cls(wrapped)
We should test that. I’ll write a new test and see what I can figure out.
def test_classical_set(self):
things = ["a", "b", "c"]
classical = XSet.classical_set(things)
null = XSet([])
b_atom = Atom("b", null)
assert b_atom in classical
Let’s make that harder:
def test_classical_set(self):
things = ["a", "b", "c"]
classical = XSet.classical_set(things)
null = XSet([])
b_atom = Atom("b", null)
assert b_atom in classical
wrong_atom = Atom("b", 1)
assert wrong_atom not in classical
That passes. And now I think I can make my restrict set pass, but I’m not going to like it. First update the test:
def test_xset_restrict_again(self):
ron = XSet([Atom("jeffries", "last"), Atom("ron", "first"), Atom("boss", "job")])
chet = XSet([Atom("chet", "first"), Atom("hendrickson", "last"), Atom("boss", "job")])
hill = XSet([Atom("hill", "last"), Atom("geepaw", "first"), Atom("serf", "job")])
personnel = XSet.classical_set([ron, chet, hill])
serf_record = XSet([Atom("serf", "job")])
serf_set = XSet.classical_set([serf_record])
serfs = personnel.restrict(serf_set)
null = XSet([])
assert Atom(ron, null) not in serfs
assert Atom(chet, null) not in serfs
assert Atom(hill, null) in serfs
I think that’s right. Now to fix restrict. Here’s the existing version:
def restrict(self, other):
if not isinstance(other, self.__class__):
return NotImplemented
return frozenset(
[candidate for candidate in self.contents
if any([check.is_subset(candidate) for check in other.contents])])
I think I have to write it out longhand again.
def restrict(self, other):
if not isinstance(other, self.__class__):
return NotImplemented
result_atoms = []
for candidate_atom in self.contents:
for check_atom in other.contents:
if check_atom.element.is_subset(candidate_atom.element):
result_atoms.append(candidate_atom)
return frozenset(result_atoms)
This passes the test, and once I change the other one to use the same form, it passes as well.
What I like about this method is that if the original set is not classical, the result set will retain the scope of matched records. Let’s do a new test to show that. We’ll need a new class method to make a tuple. I’m not sure we can safely name it tuple as that is a type in Python. Hmm. Let’s just bull forward for now.
def test_xset_tuple_restrict(self):
ron = XSet([Atom("jeffries", "last"), Atom("ron", "first"), Atom("boss", "job")])
chet = XSet([Atom("chet", "first"), Atom("hendrickson", "last"), Atom("boss", "job")])
hill = XSet([Atom("hill", "last"), Atom("geepaw", "first"), Atom("serf", "job")])
personnel = XSet.tuple_set([ron, chet, hill])
boss_record = XSet([Atom("boss", "job")])
boss_set = XSet.tuple_set([boss_record])
bosses = personnel.restrict(boss_set)
assert Atom(ron, 1) in bosses
assert Atom(chet, 2) in bosses
assert Atom(hill, 3) not in bosses
And:
class XSet:
@classmethod
def tuple_set(cls, a_list):
wrapped = [Atom(item, index+1) for index, item in enumerate(a_list)]
return cls(wrapped)
Green. I think we have fixed restrict. Let me commit: Fixed test and code mistake making restrict not dig deep enough.
Reflection
It’s convenient to use thing in set notation, but it just became a lot less convenient. In some previous experiments with XST, I’ve used a method has_at(element, scope), which is a bit more convenient. We could also implement has_at_any or make has_at(element) default not to care, or to default to classical.
I think it would be a good design technique if we were not to use the Atom explicitly anywhere. Why? Because formally, no such thing exists, and if we make it visible outside our XSet, we have the details of the implementation leaking out, and that’s not good. However, we would like some kind of convenient syntax to replace this:
hill = XSet([
Atom("hill", "last"),
Atom("geepaw", "first"), A
tom("serf", "job")])
How about this:
hill = XSet.record(
("hill", "last"),
("geepaw", "first"),
("serf", "job"))
Or even this:
hill = XSet.record({"last":"hill", "first":"geepaw", "job":"serf"})
That last is a legitimate dictionary syntax, but would require us to put the scope first. How about:
hill = XSet.record(
elements="hill, geepaw, serf",
scopes="last, first, job")
That one seems error-prone.
I’m going to have to think about that. This article is long enough. Let’s sum up.
Summary
I recognized a mistake in restrict, namely that it was written to expect elements without scopes, and the test had the same design error. We corrected the tests and the code without too much difficulty.
We’re leaving the code in need of a little improvement, although it works just fine. Or does it? Let’s take another look:
def restrict(self, other):
if not isinstance(other, self.__class__):
return NotImplemented
result_atoms = []
for candidate_atom in self.contents:
for check_atom in other.contents:
if check_atom.element.is_subset(candidate_atom.element):
result_atoms.append(candidate_atom)
return frozenset(result_atoms)
Arrgh. I can make that code do something that we don’t want. What if there were two matching records in the restricting set? We would put two copies of the matched record in the output. In principle, that might be OK, since sets are allowed to have duplicates: formally speaking { a, a } == { a }. In practice, no.
So I think we could write a failing test … if we had the ability to get the cardinality (length) of a set. If we change the code to use any, that issue will go away.
I have four items on my notecard:
- we need a null set constant
- Atom should be invisible to XSet users
- restrict code needs improvement
- duplicates can occur in restrict output
We’ll address those next time. And I had an interesting insight that I meant to tell you about this morning, but now it’s too late. Next time for that as well.
One more thing: I am beginning to question my use of frozenset as the basis for XSets. Just a little bit, though. they seem to be helping … but there’s a bit more under the covers than I feel comfortable with.
Still, its early days for this most recent XST experiment … we’ll see what happens. For now, we’ve made a bit of progress and I’m beginning to build up some improved intuition about all this.
See you next time!
P.S. Random thought. What if there was XSet.add_at(element, scope), and what if add_at returned the set? Then we could do
hill = XSet().add_at("hill", "last").add_at("geepaw", "first").add_at("serf", "job")
That might be interesting … and maybe add_next(ron).add_next(chet) could auto-increment integer scope … hmm.