The current inner-join operator joins on \t to form the comparison key, which is wrong: it needs to join on \0 to be consistent with the way sort looks at it. (And I'm not even sure that will work; need to research it more.) There's some possibility it would be faster, and it preemptively avoids issues around LC_ALL.
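To make the separator concern concrete, here is a minimal Python sketch of how the choice of separator changes byte-wise ordering of a joined key. The rows and the joined_key helper are made up for illustration, and it assumes fields never contain a NUL byte. It says nothing about what sort actually does with its keys; that part still needs the research mentioned above.

```python
# Hypothetical two-field join keys; the values are invented for illustration.
rows = [("a", "c"), ("a\x01", "b")]

def joined_key(fields, sep):
    """Join fields into one byte string used as a single comparison key."""
    return sep.join(fields).encode("utf-8")

# Field-by-field order (tuple comparison): "a" < "a\x01", so ("a", "c") is first.
by_fields = sorted(rows)

# Byte-wise order of the joined keys for each candidate separator.
by_tab = sorted(rows, key=lambda r: joined_key(r, "\t"))
by_nul = sorted(rows, key=lambda r: joined_key(r, "\0"))

# NUL is the smallest byte, so a shorter field always sorts before a longer
# field sharing its prefix. Tab (0x09) is larger than bytes such as 0x01,
# so the tab-joined key can invert that order.
print(by_fields == by_nul)
print(by_fields == by_tab)
```

Any byte below \t inside a field is enough to make the tab-joined key disagree with field-wise comparison, which is the argument for \0 as long as fields cannot contain it.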
For example, suppose we have a scale-by-single-row or scale-by-key operator later on. We might want to know the details of how each of those is working. This makes me think that fixed space allocation is a non-starter; we might want some kind of expand/collapse interface, or we might want aggregation by axis.
I think this requires us to write a pager (or at least defer its instantiation) because less captures terminal input immediately.
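A rough sketch of the "defer its instantiation" option, in Python: the pager process is only spawned when the first chunk of output is written, so nothing touches the terminal before then. The DeferredPager class and its method names are hypothetical, and whether deferral alone is enough (versus writing our own pager) is still open.

```python
import subprocess

class DeferredPager:
    """Spawn the pager only when the first chunk of output is written."""

    def __init__(self, command=("less",)):
        self.command = list(command)
        self.proc = None  # no pager process exists until write() is first called

    def write(self, data: bytes):
        if self.proc is None:
            # Deferred instantiation: the pager starts here, not in __init__.
            self.proc = subprocess.Popen(self.command, stdin=subprocess.PIPE)
        self.proc.stdin.write(data)

    def close(self):
        """Close the pager's stdin and wait for it; None if it never started."""
        if self.proc is None:
            return None
        self.proc.stdin.close()
        return self.proc.wait()
```

Anything that needs terminal input first can run before the first write() call; after that the pager owns the terminal as usual.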
Double-layer namespacing is way too confusing.
We need more detailed process tracking, and ideally some more structured interface to pipelines. Streams should be objects since we aren't creating/destroying them inside any loops.
Like S, but auto-configure buffer sizes and #children to maximize throughput.
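One possible shape for the auto-configuration, sketched in Python under heavy assumptions: keep doubling the child count while measured throughput improves, then stop. The tune_children function, the measure callback, and the synthetic fake_measure model are all hypothetical; nothing here reflects how S works today, and the same loop could in principle be run over buffer sizes.

```python
def tune_children(measure, max_children=64):
    """Double the child count while measured throughput keeps improving.

    measure(n) is a hypothetical callback returning throughput (for example
    rows per second) when running with n children.
    """
    best_n, best_rate = 1, measure(1)
    n = 2
    while n <= max_children:
        rate = measure(n)
        if rate <= best_rate:
            break  # more children no longer helps, so stop searching
        best_n, best_rate = n, rate
        n *= 2
    return best_n

# Synthetic throughput model for illustration only: gains flatten at 8 workers,
# and every extra child adds a small coordination cost.
def fake_measure(n):
    return min(n, 8) * 100 - n

print(tune_children(fake_measure))
```

A real version would have to measure on live data and deal with noisy readings, which this sketch ignores.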
Should apply to JSON, XML, headed CSV/TSV, SQL-as-text, possibly other formats too. Also should optimize for the consistent-schema case and predict field positions. Support assertions (?)
Must be client-side; this way it can happen after autoranging and during zooms.