We need to look around a bit, to decide where to take this project. To do so, we need to get a bit more clarity on the big picture, and the small one.

Looking Further Out

It’s time to look at a slightly larger picture, in the vain hope of figuring out what we’re up to with this Extended Set Theory thing. Let’s start with a list of potential objectives. Then we’ll pick some of them that seem interesting and kick them around a bit more, then set a direction.

  • Compete with Oracle. This one seems right out, as does creation of any large-scale product.
  • Build a small product. Possible; I've done it before. Could be an open source thing, or not.
  • Enable something interesting. If XST is any good, this series of papers might help it be adopted.
  • Demonstrate "R&D" XP style. Or at least show how one fat old guy approaches building an unspecified framework kind of software while working, more or less, in the XP way.
  • Have fun. This one is definitely on. I am having fun, and will continue as long as that's true.

General Project Objectives

But those, while interesting perhaps, aren’t specific enough. Let’s drill in to things we could perhaps accomplish with the program:

  • Build at least two physical data representations, including a file-based representation, and make them fully mutually operable -- all setops working on all kinds of sets. This would, if nothing else, give a chance for some very nice evolutionary design discussion.
  • Demonstrate some high-performance set operations. That would be both interesting and potentially useful if the library goes forward as a product.
  • Demonstrate some "intelligent" optimizations, perhaps selecting optimal ways of executing some commands, or caching useful temp values.
  • Experiment with and demonstrate Symmetric Difference updating, which allows for updating without changing existing records.

Near Term Goals

Still not specific enough to decide what to do next. Here are some potential next steps:

  • Push forward with the flat relational structure, adding set operations.
  • Deal with that commented-out test that allows for a restrict that matches other than at the beginning of the record.
  • Produce a file-based version of the system, perhaps paging, perhaps with more explicit I/O.
  • Push forward with Scope Transform, letting it drive out a superclass or mixin approach to supporting multiple physical structures.
  • Provide a higher-level interface to XSets, so that practical programs can be written.
  • Begin work on Symmetric Difference. This is an operation of surprising simplicity and power. It could lead into interesting terrain.
  • Provide for "curly" sets, in particular sets that are substantially more complex than relations.

That’s more like it. Let’s discuss the near term a bit.

Adding more set operations to the flat structure should, I think, be mostly straightforward. I would expect to learn a bit, and might well discover something surprising, but most of the interesting operations are a lot like :restrict, differing only in detail.

The commented-out test actually calls for Scope Transformation: if we put the sets through appropriate Scope Transformations, we could line up the bytes for comparison.
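As a rough illustration of what "lining up the bytes" might look like, here is a minimal sketch. It is not the XSet code from this series: it assumes fixed-width string records, and the scope maps (PERSON_SCOPE, CITY_SCOPE) and this standalone restrict method are hypothetical names invented for the example.

```ruby
# Hypothetical sketch: not the XSet code from this series.
# Records are fixed-width strings; a "scope map" names the byte range of each field.
PERSON_SCOPE = { name: 0...6, city: 6...12 }  # field => byte range in a person record
CITY_SCOPE   = { city: 0...6 }                # field => byte range in a restricting record

# Keep each record whose bytes, at the ranges the restrictor's fields map to,
# equal the restrictor's own bytes for those same fields.
def restrict(records, record_scope, restrictors, restrictor_scope)
  records.select do |rec|
    restrictors.any? do |res|
      restrictor_scope.all? do |field, res_range|
        rec[record_scope.fetch(field)] == res[res_range]
      end
    end
  end
end

people = ["Jones Pinck ", "Smith Dexter", "Brown Pinck "]
towns  = ["Pinck "]

# The city field sits at bytes 6..11 of each person record, not at the beginning.
matches = restrict(people, PERSON_SCOPE, towns, CITY_SCOPE)
```

The point of the sketch is only that the comparison is driven by a mapping from field to position, so the match no longer has to occur at the start of the record.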

File-based operations, though interesting from a performance viewpoint, are not likely to provide any deep learning. There will be some interesting mapping of file to memory, but I believe – perhaps wrongly – that I know how to do that.

Scope Transformation is fascinating to me. As I mentioned in the previous article, when I first read about it, I perceived it as possibly being the key to a lot I didn’t understand about how to implement XST effectively. I think there would be a lot to learn. It might lead to some better understanding of “curly” sets, but I’m not sure of that.

The higher-level interface is certainly needed. My current guess as to the best way to do it includes use of the hashmap building notation in Ruby. We’ll probably wind up with another set representation of that form, with some convenient way to flow into the flat form as needed.
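A minimal sketch of that idea, assuming the flat form is a fixed-width string per record. The names FIELD_WIDTHS and to_flat are hypothetical, made up for this illustration, and are not part of the code in this series.

```ruby
# Hypothetical sketch: FIELD_WIDTHS and to_flat are illustrations only.
FIELD_WIDTHS = { name: 6, city: 6 }  # assumed fixed width of each field, in bytes

# Pack one hash-notation record into a fixed-width string,
# padding or truncating each field, in FIELD_WIDTHS order.
def to_flat(record)
  FIELD_WIDTHS.map { |field, width| record.fetch(field).ljust(width)[0, width] }.join
end

# Records written with Ruby's ordinary hash literal notation.
people = [
  { name: "Jones", city: "Pinck" },
  { name: "Smith", city: "Dexter" }
]

flat = people.map { |person| to_flat(person) }
```

Here the hash form is what a practical program would write, and the flat form is produced only when an operation needs it.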

Symmetric Difference is indeed interesting. It could be started now, under the “more set operations” banner. My intuition is that it interacts with, and could take advantage of, Scope Transformation.
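To make "updating without changing existing records" concrete, here is a minimal sketch using Ruby's standard Set class rather than the XSet code from this series. The record contents are made up for the example.

```ruby
require 'set'

# Illustration only: Ruby's standard Set, not the XSet class from this series.
base = Set[[1, "Jones"], [2, "Smith"]]  # stored records, never modified

# To change record 2, the delta holds both the old record and its replacement.
delta = Set[[2, "Smith"], [2, "Smythe"]]

# Symmetric difference keeps whatever is in exactly one of the two sets:
# the old record appears in both and drops out; the new one appears once and stays.
current = base ^ delta
```

The base set is untouched, and applying the same delta a second time (current ^ delta) gives back the original contents, so the delta doubles as an undo record.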

Curly sets are a dilemma. The whole point of XST, as I understand it, was to allow for the independent mathematical handling of order and nesting. Yet Dave Childs told me that his efforts do not process “curly” sets. Yet again, he says that XST is good for processing XML, which appears to me to have a “curly” set model. One more thing that he has said that I don’t yet understand. There are many.

What ... and Why?

I am inclined to move forward with Scope Transform. The next steps would be to change :restrict to produce a ScopeTransform class result, to promote the ScopeTransform class to include all the necessary set ops (:restrict, :each, probably more), and then to normalize the code, removing duplication.

This feels like a very good direction to me, and I’m the customer. But it’s easy, particularly as a technical customer, to get caught up in some techie chase and lose sight of the big picture. I owe it to us all to at least figure out what the big picture is, and how this fits into it.

What has gone before?

Prior to now I have worked on a few XST-related projects. The one at Comshare was very large scale, and though I instituted it, drove it, and contributed to it, most of the development of significance was done by other developers. I feel that it is “my” project, but the gut-level work, and much of the creativity, belongs to others.

I worked on the product for a set-theory company formed to capitalize on Dave Childs’s work. For a variety of reasons, that came to no good end. I wrote much of the code and owned most of the design for the rest. I don’t remember many of the details, though, and it would be wrong to apply them if I did.

Lee Johnson and I built a small product called “Venn” that processed flat data sets with setops much like :restrict. The few people who bought it really liked it, but it was not financially successful. Venn included very sophisticated memory management on early PCs, but the need for that is obviated by today’s larger memories and huge virtual addressing space. Venn did not, however, include any sophisticated internal set “reasoning”. You could build interesting structures with it, but the programmer had to do all the decision-making.

None of these products went in the direction I felt Extended Set Theory could in principle go, though they were all successful in some ways, the Comshare one most of all. I feel closer to being able to demonstrate what XST can do with this version than I’ve been before.

Potential Benefits

These articles are interesting to me (at least), similarly to how Adventures in C# was interesting. They allow me to explore and express how I approach problems, in a way that can, I hope, give readers insight into the mind of an experienced, thoughtful, but still very human developer. The “warts and all” approach of Adventures and of these papers gives a better picture of the real design process of an actual human being, compared to the “cleaned up” examples that we typically read in conventional design and development texts.

The XST tool itself might catch people’s interest and allow the technology to go forward. I think the XST ideas deserve that. Alternatively, I suppose, this example might demonstrate once and for all that the ideas aren’t worth anything. I don’t expect that to be the case, but either way it would be interesting and possibly of value.

A useful library or toolset might emerge, that people could actually use. Whether open sourced or a little product, it would be a fun addition to my “legacy”. (But it wouldn’t be Legacy Code, since I plan to keep it under test!)

Most of all, I’m having a good time, and at least a few people are enjoying the articles. (If you are, please drop me an email to keep me going. If you aren’t … well, keep it to yourself unless you have a better idea for what I should work on. ;->)


For now, because I’m working in the areas that are most unknown (use of multiple data structures, optimization, and set transformation), I think I’ll continue to follow my nose with Scope Transformation. That direction is necessary to demonstrating how XST can be of benefit; it addresses high performance options; it would be a valuable part of a library; and it’s a lot of fun. Therefore …

I’ll proceed to make the XSet implementation of :restrict produce a ScopeTransform class set, which will imply implementing some of the operations such as :restrict and :each. That will probably lead to some refactoring that will give us a better look at an overall emerging design.

I expect this to get interesting, and I hope you do as well. I’ve got a gig next week, so I’m not sure if that means more articles or fewer. Either way, stay tuned!