XST: Mapping Considerations
The Big Picture
Chet and I were chatting at lunch today, about what’s going on here, and the difficulty of building a “framework” without specific stories. Certainly what I usually recommend with respect to a framework is that we write a real application, and factor out everything that looks like a framework.
We’re looking at a different situation here. We represent some company with a technical invention (Extended Set Theory, in this case), and we’re trying to figure out two things simultaneously: whether we can use it effectively, and what product to build with it. Right now, though these articles have gone on for days, there’s still probably less than two days’ real programming in the code, so it’s not like we’re over-investing in up-front work, even before you take into account that we actually have running code.
This whole project can be thought of as “Research and Development”. An important aspect of that is that while the research is going on … so is the development. R&D more commonly means Research … and then after a long time … Development.
But enough philosophy. I’m here to kill some alligators.
Mapping Considerations
Yesterday (and Saturday night), we did that little ShiftedRecord object, to explore how bytes might be slud[1] over to line up with bytes in other parts of other records. It wasn’t hard to make the test work, and we might not even be far away from being able to make this one work:
    # def test_firstname_restrict
    #   name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
    #   input = XSet.new(16, name_data)
    #   select_data = "Ron Lee "
    #   select = XSet.new(4, select_data)
    #   expected = "Jeffries    Ron Johnson     Lee "
    #   result = input.restrict(select)
    #   assert_equal(expected, result.contents)
    # end
We can’t quite make it work, because the test as written has no way to indicate that we intend the “Ron” and “Lee” to line up with byte 12 rather than 0. But we have a decent technical start on the underlying implementation. No hurry on that, and we’re getting closer.
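One way the missing information might be supplied is to hand the offset to the restrict operation along with the selector. The sketch below is a standalone illustration, not the XSet implementation: restrict_at_offset and its parameters are hypothetical names, and it assumes the 16-byte record layout with the first name starting at byte 12.

```ruby
# Hypothetical helper, not part of the XSet code in this article.
# Cuts a flat string into fixed-length records and keeps the records whose
# bytes at [offset, field_length] equal one of the selector's fields.
def restrict_at_offset(data, record_length, select_data, field_length, offset)
  records = data.chars.each_slice(record_length).map(&:join)
  wanted  = select_data.chars.each_slice(field_length).map(&:join)
  records.select { |record| wanted.include?(record[offset, field_length]) }.join
end

name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
puts restrict_at_offset(name_data, 16, "Ron Lee ", 4, 12).inspect
```

The offset travels as an argument here, which is exactly the piece the commented-out test has nowhere to put.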
I was thinking today about the ShiftedRecord object and some issues with it. The ShiftedRecord has at least one serious drawback. It seems to imply that the offset and length are right there as part of the data. (That’s not required by the object, but it is certainly the way I was thinking and the way I described it.) As I discussed in the preceding article, including all those offsets and lengths would be redundant. It would also constitute duplication and it would be repetitive[2]. In particular, our plain flat string implementation has an implied ScopeTransform of “identity”, i.e. [0, 1, … ] or { 0^0, 1^1, … }. We wouldn’t want to have to put an indicator of that in every record: it would be wasteful[3].
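To make the “implied identity” point concrete, here is a small sketch comparing two stored forms of the identity transform with a computed one. The constant and method names are made up for illustration, and the 4-byte record size is an assumption.

```ruby
# Stored forms of the identity Scope Transform for a 4-byte record.
IDENTITY_AS_ARRAY = [0, 1, 2, 3]                     # position i maps to scope i
IDENTITY_AS_HASH  = { 0 => 0, 1 => 1, 2 => 2, 3 => 3 }

# Implied form (hypothetical helper): computed on demand, nothing stored.
def implied_identity(position)
  position
end

agree = (0...4).all? do |position|
  IDENTITY_AS_ARRAY[position] == position &&
    IDENTITY_AS_HASH[position] == position &&
    implied_identity(position) == position
end
puts agree  # prints true
```

All three give the same answer, but only the stored forms would cost space if repeated in every record.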
So I was thinking. There are some sets where every record has the same identity Scope Transform. There are others where every record has the same non-identity transform. And there are some where each record has its own transform … and surely some in between. Therefore …
What we might want is a single kind of set that could support all these notions. It would include two separate parts: a data part (a string or a slice of one) and a map. The string slice might change as we step forward record by record, as in each. The map would change never, seldom, or all the time, depending on the needs of the set.
Speculating just a bit further, the Scope Transform map might change based on a map-changing strategy:
- Flat Unmapped Set: always identity Scope Transform;
- Flat Mapped Set: always some constant Scope Transform;
- Each Record Unique: reset Scope Transform on every record.
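The three strategies above can be sketched as interchangeable objects plugged into one set class. This is speculation in code form, not something in the current implementation: MappedSet, the three strategy classes, and map_for are all hypothetical names, and a map is assumed to be an array giving the scope for each byte position in a record.

```ruby
# Hypothetical sketch: none of these classes exist in the article's code.

# Flat Unmapped Set: every record uses the identity Scope Transform.
class IdentityStrategy
  def map_for(_record_index, length)
    (0...length).to_a
  end
end

# Flat Mapped Set: every record shares one constant Scope Transform.
class ConstantStrategy
  def initialize(map)
    @map = map
  end

  def map_for(_record_index, _length)
    @map
  end
end

# Each Record Unique: a separate Scope Transform for every record.
class PerRecordStrategy
  def initialize(maps)
    @maps = maps
  end

  def map_for(record_index, _length)
    @maps[record_index]
  end
end

# One kind of set: a data part plus a strategy that supplies the map.
class MappedSet
  def initialize(record_length, data, strategy)
    @record_length = record_length
    @data = data
    @strategy = strategy
  end

  def cardinality
    @data.length / @record_length
  end

  # Yields each record as a hash of scope => byte, using whatever map
  # the strategy supplies for that record.
  def each_record
    (0...cardinality).each do |index|
      slice = @data[index * @record_length, @record_length]
      map = @strategy.map_for(index, @record_length)
      yield map.zip(slice.chars).to_h
    end
  end
end

first_names = MappedSet.new(4, "Ron Lee ", ConstantStrategy.new([12, 13, 14, 15]))
first_names.each_record { |record| p record }
```

With the constant strategy, the bytes of “Ron ” and “Lee ” land at scopes 12 through 15 without any offset being stored in the data itself.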
Hey! Isn’t this YAGNI?? Well, no. As we’ve discussed, YAGNI was created to keep us from building things before their time, not to keep us from thinking. Thinking is good. We’re just chatting here. In fact, there’s value to a limited amount of speculation about what we might do – it gives us confidence. There is a big difference between knowing no way to do something and knowing one way. There is a lesser difference, but an important one, in knowing a few good ways to do something. When we know how we might do something, we’ve moved from “might be impossible” to “might be ugly”. That’s a big step.
In this case, it’s a bigger step. The thinking has helped me to resolve a concern that was growing in my mind, about whether there need to be overhead bytes packed into all the records. The sketch of an idea described here tells me that we can probably have no overhead at all, in most sets, and have descriptive overhead only where we need it, in sets of complex structure.
Now, the footnotes, then the code for reference. See you next time!
[1] Slud: past tense of slide, according to Dizzy Dean.
[2] This sort of thing is what I use in lieu of humor. My apologies.
[3] See 2.
Appendix: Current Code
    require 'test/unit'

    class TC_MyTest < Test::Unit::TestCase
      def setup
        # Four 16-byte records: a 12-byte last name followed by a 4-byte first name.
        name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
        @name_set = XSet.new(16, name_data)
        @five_element_set = XSet.new(4, "123 234 132 342 abc ")
      end

      def test_cardinality
        assert_equal(5, @five_element_set.cardinality)
      end

      def test_one_byte_record
        input = XSet.new(1, "abcdef")
        assert_equal("b", input.element(1).element(0))
      end

      def test_record_bytes
        johnson = @name_set.element(3)
        assert_equal(2, @name_set.rank)
        assert_equal(1, johnson.rank)
        assert_equal("J", johnson.element(0))
      end

      def test_element_range
        assert_equal(0...5, @five_element_set.element_range)
      end

      def test_element_extraction
        assert_equal("132 ", @five_element_set.element(2).contents)
      end

      def test_restrict
        select = XSet.new(1, "1")
        expected = "123 132 "
        result = @five_element_set.restrict(select)
        assert_equal(expected, result.contents)
      end

      def test_name_restrict
        # Selector records are 11 bytes each and line up with byte 0.
        select_data = "HendricksonJeffries   "
        select = XSet.new(11, select_data)
        expected = "Jeffries    Ron Hendrickson Chet"
        result = @name_set.restrict(select)
        assert_equal(expected, result.contents)
      end

      def test_single_selection
        select_data = "Jeffries   Jeffries   "
        select = XSet.new(11, select_data)
        expected = "Jeffries    Ron "
        result = @name_set.restrict(select)
        assert_equal(expected, result.contents)
      end

      def test_each_using_scope
        ann = ""
        @name_set.each do |scope_element|
          if scope_element.scope == 2
            ann = scope_element.element.contents
          end
        end
        assert_equal("Anderson    Ann ", ann)
      end

      def test_detect
        chet_scope_element = @name_set.detect { |scope_element|
          scope_element.element.contents.include?("Chet")
        }
        assert_equal("Hendrickson Chet", chet_scope_element.element.contents)
      end

      def test_rank
        assert_equal(2, @name_set.rank)
      end

      def test_element_rank
        element = @name_set.element(2)
        assert_equal(1, element.rank)
      end

      def test_shifted_record
        r = XSet.new(1, "Hendrickson Chet", 1)
        chet = ShiftedRecord.new(12, 4, "Chet")
        ron = ShiftedRecord.new(12, 4, "Ron ")
        assert(chet.subset?(r), "Chet sought but not found")
        assert(!ron.subset?(r), "Ron incorrectly found")
      end

      # def test_firstname_restrict
      #   name_data = "Jeffries    Ron Hendrickson ChetAnderson    Ann Johnson     Lee "
      #   input = XSet.new(16, name_data)
      #   select_data = "Ron Lee "
      #   select = XSet.new(4, select_data)
      #   expected = "Jeffries    Ron Johnson     Lee "
      #   result = input.restrict(select)
      #   assert_equal(expected, result.contents)
      # end
    end

    # A run of bytes whose scopes start at offset rather than at 0.
    class ShiftedRecord
      def initialize(offset, length, string)
        @offset = offset
        @length = length
        @string = string
      end

      def subset?(set)
        each do |se|
          return false if set.element(se.scope) != se.element
        end
        true
      end

      def each
        (0...@length).each do |index|
          yield ScopedElement.new(@string[index, 1], index + @offset)
        end
      end
    end

    # A set of fixed-length elements stored in one flat string.
    class XSet
      include Enumerable
      attr_reader :contents, :rank

      def initialize(element_length, contents, rank = 2)
        @element_length = element_length
        @contents = contents
        @rank = rank
      end

      def each
        element_range.each do |scope|
          yield ScopedElement.new(element(scope), scope)
        end
      end

      def restrict(selector)
        matching_scopes = []
        each do |scoped_element|
          if selector.matches(scoped_element)
            matching_scopes << scoped_element.scope
          end
        end
        ScopeTransform.new(self, matching_scopes)
      end

      def matches(a_scoped_element)
        any? { |scoped_element| match(a_scoped_element, scoped_element) }
      end

      def match(my_scoped_element, selector_scoped_element)
        selector_scoped_element.element.subset?(my_scoped_element.element)
      end

      def subset?(larger_set)
        element_range.all? { |scope| larger_set.contains?(element(scope), scope) }
      end

      # def subset? set
      #   each do | se |
      #     if ( set.element(se.scope) != se.element )
      #       return false
      #     end
      #   end
      #   return true
      # end

      # Rank 2 sets return each element as a rank 1 XSet of single bytes;
      # rank 1 sets return the raw string.
      def element(scope)
        element_contents = @contents[scope * @element_length, @element_length]
        if @rank > 1
          XSet.new(1, element_contents, rank - 1)
        else
          element_contents
        end
      end

      def contains?(an_element, scope)
        element(scope) == an_element
      end

      def element_range
        0...cardinality
      end

      def cardinality
        @contents.length / @element_length
      end
    end

    class ScopedElement
      attr_reader :element, :scope

      def initialize(element, scope)
        @element = element
        @scope = scope
      end

      def to_s
        "SE#{@scope}=>#{@element}"
      end
    end

    class ScopeTransformTest < Test::Unit::TestCase
      def test_select_two_records
        input = XSet.new(4, "1111222233334444")
        trans = ScopeTransform.new(input, [1, 2])
        assert_equal(input.element(1).contents, trans.element(0).contents)
        assert_equal(input.element(2).contents, trans.element(1).contents)
      end

      def test_reverse_two_records
        input = XSet.new(4, "1111222233334444")
        trans = ScopeTransform.new(input, [3, 1])
        assert_equal(input.element(3).contents, trans.element(0).contents)
        assert_equal(input.element(1).contents, trans.element(1).contents)
      end
    end

    # Presents selected elements of a base set, in the order given by the map.
    class ScopeTransform
      def initialize(set, array)
        @base_set = set
        @map = array
      end

      def element(scope)
        @base_set.element(@map[scope])
      end

      def contents
        @map.map { |scope| @base_set.element(scope).contents }.join
      end
    end

    class HashExperiment < Test::Unit::TestCase
      def test_hash
        h = { :LastName => "Jeffries", :FirstName => "Ron" }
        assert_equal("Jeffries", h[:LastName])
        s = [
          { :LastName => "Jeffries", :FirstName => "Ron" },
          { :LastName => "Hendrickson", :FirstName => "Chet" }
        ]
        assert_equal("Chet", s[1][:FirstName])
      end

      def test_mixed_set
        s = [
          { :LastName => "Jeffries", :FirstName => "Ron" },
          { :Age => 35 }
        ]
        assert_equal(35, s[1][:Age])
      end
    end