I’m chasing down an issue in the CoffeeScript execution engine, but it’s led me to question some things regarding sort.
The Problem
In the R1.1 spec, we introduce sort with this example:
[Encounter: "Inpatient"] E sort by start of period
I think this is the way sort by is intended to be used – directly referencing a property in each of the returned items.
That said, we also have some examples further on that do things like this:
First([Condition: "Acute Pharyngitis"] C sort by C.onsetDateTime desc)
Note the use of the C alias in the sort expression. I’m not sure this is actually valid, and it raises the question: can a query alias be referenced in a sort clause?
At first glance, this seems OK – but when you really think about it, C is an alias used during the query execution. The sort is defined to operate on the results of the query – after the execution has occurred – at which point C doesn’t really have meaning.
Looking at the CQL 1.1 spec, section 5.3.3 states:
After the iterative clauses are executed for each element of the query source, the sort clause, if present, specifies a sort order for the final output. This step simply involves sorting the output of the iterative steps…
Since the alias only holds meaning during the iterative clauses (and changes during each iteration), the alias has no meaning if sort comes after iteration. Section 5.3.4 makes this ordering even clearer when it lists the steps for implementing a query: ForEach, Times, Filter, Distinct, Sort. It seems that by the time you get to Sort, the alias shouldn't be in scope anymore.
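To make the scoping argument concrete, here is a minimal Python sketch of the step ordering described in section 5.3.4. It is only a model under my own assumptions, not the translator's or the engine's actual code: run_query, where, returns, and sort_key are hypothetical names, and the Times step (multiple sources) is left out for brevity.

```python
from typing import Any, Callable, Iterable, List, Optional

def run_query(
    source: Iterable[Any],
    where: Callable[[Any], bool] = lambda alias: True,
    returns: Callable[[Any], Any] = lambda alias: alias,
    sort_key: Optional[Callable[[Any], Any]] = None,
) -> List[Any]:
    """Hypothetical model of the query steps: ForEach, Filter, Distinct, Sort."""
    output = []
    # ForEach / Filter: the alias is bound once per element, only inside this loop.
    for alias in source:
        if where(alias):
            output.append(returns(alias))
    # Distinct: drop duplicates, preserving first-seen order.
    distinct = []
    for item in output:
        if item not in distinct:
            distinct.append(item)
    # Sort: operates on the finished output; no alias binding is passed in.
    if sort_key is not None:
        distinct = sorted(distinct, key=sort_key)
    return distinct

# Stand-in data for something like "E sort by start of period".
encounters = [
    {"id": "a", "start": 3},
    {"id": "b", "start": 1},
    {"id": "c", "start": 2},
]
result = run_query(encounters, sort_key=lambda item: item["start"])
ids = [e["id"] for e in result]
```

In this model the sort key only ever sees an element of the output list. The loop variable that plays the role of the alias no longer exists by the time the Sort step runs, which is the point of the argument above.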
Another reason it likely doesn’t make sense to reference an alias in the sort clause is that it introduces a possibility for ambiguity when the thing you’re sorting has a property with the same name as your alias. Consider this:
({Tuple{E: 1}, Tuple{E: 3}, Tuple{E: 2}}) N sort by E
Like the first example above, this shows how sort is intended to work -- by referencing a property in the returned items. This is fine, but what happens when we do this instead:
({Tuple{E: 1}, Tuple{E: 3}, Tuple{E: 2}}) E sort by E
If we allow aliases in the sort expression, what does the E in the sort mean? Is it the property E or is it the item E? It’s unclear.
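The ambiguity can be illustrated with a small Python sketch. It assumes two hypothetical resolution rules (alias first vs. property first); neither is taken from the spec or the translator, and resolve is a made-up helper used only to show that the two rules disagree for the second query.

```python
from typing import Any, Dict, List

def resolve(name: str, item: Dict[str, Any], alias: str, alias_first: bool) -> Any:
    """Resolve an identifier in a sort expression under one of two hypothetical rules."""
    if alias_first and name == alias:
        return item            # the identifier means the whole item
    if name in item:
        return item[name]      # the identifier means a property of the item
    if name == alias:
        return item
    raise NameError(name)

items: List[Dict[str, int]] = [{"E": 1}, {"E": 3}, {"E": 2}]

# Alias E, sort by E: the two rules give different sort keys.
as_property = [resolve("E", it, alias="E", alias_first=False) for it in items]
as_alias = [resolve("E", it, alias="E", alias_first=True) for it in items]

# Alias N, sort by E: both rules agree, because E can only be the property.
unambiguous = [resolve("E", it, alias="N", alias_first=True) for it in items]
```

With alias N both rules produce the numeric keys, but with alias E one rule yields numbers and the other yields the tuples themselves, so the sort order (or even whether the keys are comparable) depends on a rule the spec would have to define.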
Where It's Currently Allowed
Despite the above, there are a few things that hint that using aliases in a sort clause is allowed:
- the spec has a number of examples where an alias is used in a sort clause
- the cql-to-elm translator allows aliases in sort clauses (they become AliasRef instances)
If it's not really allowed, these will have to be fixed.
A Further Issue / Limitation
That said, there are a few cases where we do want to reference the returned item in the sort clause rather than a property of the item. The use case that led me here is that I want to sort items by the result of a function that takes the item as a parameter. A trivial example is:
({1,-3,2,-5,4}) N sort by Abs(N)
Now... if referencing an alias is not allowed, then what do I pass into the Abs function? It seems we would need to have some sort of inferred variable, such as:
({1,-3,2,-5,4}) N sort by Abs($result)
Right now, I can use a workaround, but it is ugly:
(({1,-3,2,-5,4}) N return Tuple{num: N} sort by Abs(num)) N2 return N2.num
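For comparison, here is a rough Python analogue of the two forms. It assumes that sort by Abs(N) (or the hypothetical Abs($result)) would mean "sort the results by the absolute value of each item", and that the outer query in the workaround preserves the order produced by the inner sort; the variable names are mine.

```python
values = [1, -3, 2, -5, 4]

# Direct form: what sorting by Abs of the item itself would presumably mean.
direct = sorted(values, key=abs)

# Workaround form: wrap each value in a tuple, sort by a property, then unwrap.
wrapped = [{"num": n} for n in values]
wrapped_sorted = sorted(wrapped, key=lambda t: abs(t["num"]))
unwrapped = [t["num"] for t in wrapped_sorted]
```

Both forms produce the same order under these assumptions, which is why the wrapping step in the workaround feels like pure overhead.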
So...
I guess that leaves the following questions:
- Should aliases be allowed in sort expressions?
- If yes...
  - How do we reconcile that with the implementation guidance in the spec?
  - How do we specify the expected behavior in the ambiguous example I provided?
- If no...
  - How do we support sort expressions that want to operate directly on the result?