Giter VIP home page Giter VIP logo

axisarrays.jl's Introduction

AxisArrays.jl

This package for the Julia language provides an array type (the AxisArray) that knows about its dimension names and axis values. This allows for indexing by name without incurring any runtime overhead. This permits one to implement algorithms that are oblivious to the storage order of the underlying arrays. AxisArrays can also be indexed by the values along their axes, allowing column names or interval selections.

In contrast to similar approaches in Images.jl and NamedArrays.jl, this allows for type-stable selection of dimensions and compile-time axis lookup. It is also better suited for regularly sampled axes, like samples over time.

Collaboration is welcome! This is still a work-in-progress. See the roadmap for the project's current direction.

Note about Axis{} and keywords

An AxisArray stores an object of type Axis{:name} for each dimension, containing both the name (a Symbol) and the "axis values" (an AbstractVector). These types are what made compile-time lookup possible. Instead of providing them explicitly, it is now possible to use keyword arguments for both construction and indexing:

V = AxisArray(rand(10); row='a':'j')  # AxisArray(rand(10), Axis{:row}('a':'j'))
V[row='c'] == V[Axis{:row}('c')] == V[row=3] == V[3] 

Note about axes() and indices()

The function AxisArrays.axes returns the tuple of such Axis objects. Since Julia version 0.7, Base.axes(V) == (1:10,) gives instead the range of possible ordinary integer indices. (This was called Base.indices.) Since both names are exported, this collision results in a warning if you try to use axes without qualification:

julia> axes([1,2])
WARNING: both AxisArrays and Base export "axes"; uses of it in module Main must be qualified
ERROR: UndefVarError: axes not defined

Packages that are upgrading to support Julia 0.7+ should:

  • Replace all uses of the axes function with the fully-qualified AxisArrays.axes
  • Replace all uses of the deprecated indices function with the un-qualified axes
  • Immediately after using AxisArrays, add const axes = Base.axes

In the future, AxisArrays will be looking for a new name for its functionality. This will allow you to use the idiomatic Base name and offers an easy upgrade path to whatever the new name will be.

Example of currently-implemented behavior:

julia> using Pkg; pkg"add AxisArrays Unitful"
julia> using AxisArrays, Unitful, Random

julia> fs = 40000; # Generate a 40kHz noisy signal, with spike-like stuff added for testing
julia> import Unitful: s, ms, µs
julia> rng = Random.MersenneTwister(123); # Seed a random number generator for repeatable examples
julia> y = randn(rng, 60*fs+1)*3;
julia> for spk = (sin.(0.8:0.2:8.6) .* [0:0.01:.1; .15:.1:.95; 1:-.05:.05] .* 50,
                  sin.(0.8:0.4:8.6) .* [0:0.02:.1; .15:.1:1; 1:-.2:.1] .* 50)
           i = rand(rng, round(Int,.001fs):1fs)
           while i+length(spk)-1 < length(y)
               y[i:i+length(spk)-1] += spk
               i += rand(rng, round(Int,.001fs):1fs)
           end
       end

julia> A = AxisArray(hcat(y, 2 .* y); time = (0s:1s/fs:60s), chan = ([:c1, :c2]))
2-dimensional AxisArray{Float64,2,...} with axes:
    :time, 0.0 s:2.5e-5 s:60.0 s
    :chan, Symbol[:c1, :c2]
And data, a 2400001×2 Array{Float64,2}:
  3.5708     7.14161
  6.14454   12.2891  
  3.42795    6.85591
  1.37825    2.75649
 -1.19004   -2.38007
 -1.99414   -3.98828
  2.9429     5.88581
 -0.226449  -0.452898
  0.821446   1.64289
 -0.582687  -1.16537
                    
 -3.50593   -7.01187
  2.26783    4.53565
 -0.16902   -0.33804
 -3.84852   -7.69703
  0.226457   0.452914
  0.560809   1.12162
  4.67663    9.35326
 -2.41005   -4.8201  
 -3.71612   -7.43224

AxisArrays behave like regular arrays, but they additionally use the axis information to enable all sorts of fancy behaviors. For example, we can specify indices in any order, just so long as we annotate them with the axis name:

julia> A[time=4] # or A[Axis{:time}(4)]
1-dimensional AxisArray{Float64,1,...} with axes:
    :chan, Symbol[:c1, :c2]
And data, a 2-element Array{Float64,1}:
 1.37825
 2.75649

julia> A[chan = :c2, time = 1:5] # or A[Axis{:chan}(:c2), Axis{:time}(1:5)]
1-dimensional AxisArray{Float64,1,...} with axes:
    :time, 0.0 s:2.5e-5 s:0.0001 s
And data, a 5-element Array{Float64,1}:
  7.14161
 12.2891
  6.85591
  2.75649
 -2.38007

We can also index by the values of each axis using an Interval type that selects all values between two endpoints a .. b or the axis values directly. Notice that the returned AxisArray still has axis information itself... and it still has the correct time information for those datapoints!

julia> A[40µs .. 220µs, :c1]
1-dimensional AxisArray{Float64,1,...} with axes:
    :time, 5.0e-5 s:2.5e-5 s:0.0002 s
And data, a 7-element Array{Float64,1}:
  3.42795
  1.37825
 -1.19004
 -1.99414
  2.9429  
 -0.226449
  0.821446

julia> AxisArrays.axes(ans, 1)
AxisArrays.Axis{:time,StepRangeLen{Quantity{Float64, Dimensions:{𝐓}, Units:{s}},Base.TwicePrecision{Quantity{Float64, Dimensions:{𝐓}, Units:{s}}},Base.TwicePrecision{Quantity{Float64, Dimensions:{𝐓}, Units:{s}}}}}(5.0e-5 s:2.5e-5 s:0.0002 s)

You can also index by a single value using atvalue(t). This function is not needed for categorical axes like :chan here, as :c1 is a Symbol which can't be confused with an integer index.

Using atvalue() will drop a dimension (like using a single integer). Indexing with an Interval(lo, hi) type retains dimensions, even when the ends of the interval are equal (like using a range 1:1):

julia> A[atvalue(2.5e-5s), :c1]
6.14453912336772

julia> A[2.5e-5s..2.5e-5s, :c1]
1-dimensional AxisArray{Float64,1,...} with axes:
    :time, 2.5e-5 s:2.5e-5 s:2.5e-5 s
And data, a 1-element Array{Float64,1}:
 6.14454

You can even index by multiple values by broadcasting atvalue over an array:

julia> A[atvalue.([2.5e-5s, 75.0µs])]
2-dimensional AxisArray{Float64,2,...} with axes:
    :time, Quantity{Float64, Dimensions:{𝐓}, Units:{s}}[2.5e-5 s, 7.5e-5 s]
    :chan, Symbol[:c1, :c2]
And data, a 2×2 Array{Float64,2}:
 6.14454  12.2891
 1.37825   2.75649

Sometimes, though, what we're really interested in is a window of time about a specific index. One of the operations above (looking for values in the window from 40µs to 220µs) might be more clearly expressed as a symmetrical window about a specific index where we know something interesting happened. To represent this, we use the atindex function:

julia> A[atindex(-90µs .. 90µs, 5), :c2]
1-dimensional AxisArray{Float64,1,...} with axes:
    :time_sub, -7.5e-5 s:2.5e-5 s:7.500000000000002e-5 s
And data, a 7-element Array{Float64,1}:
  6.85591
  2.75649
 -2.38007
 -3.98828
  5.88581
 -0.452898
  1.64289

Note that the returned AxisArray has its time axis shifted to represent the interval about the given index! This simple concept can be extended to some very powerful behaviors. For example, let's threshold our data and find windows about those threshold crossings.

julia> idxs = findall(diff(A[:,:c1] .< -15) .> 0);

julia> spks = A[atindex(-200µs .. 800µs, idxs), :c1]
2-dimensional AxisArray{Float64,2,...} with axes:
    :time_sub, -0.0002 s:2.5e-5 s:0.0008 s
    :time_rep, Quantity{Float64, Dimensions:{𝐓}, Units:{s}}[0.162 s, 0.20045 s, 0.28495 s, 0.530325 s, 0.821725 s, 1.0453 s, 1.11967 s, 1.1523 s, 1.22085 s, 1.6253 s    57.0094 s, 57.5818 s, 57.8716 s, 57.8806 s, 58.4353 s, 58.7041 s, 59.1015 s, 59.1783 s, 59.425 s, 59.5657 s]
And data, a 41×247 Array{Float64,2}:
   0.672063    7.25649      0.633375      1.54583     5.81194    -4.706
  -1.65182     2.57487      0.477408       3.09505     3.52478     4.13037
   4.46035     2.11313      4.78372        1.23385     7.2525      3.57485
   5.25651    -2.19785      3.05933        0.965021    6.78414     5.94854
   7.8537      0.345008     0.960533       0.812989    0.336715    0.303909
   0.466816    0.643649    -3.67087       3.92978    -3.1242      0.789722
  -6.0445    -13.2441      -4.60716        0.265144   -4.50987    -8.84897
  -9.21703   -13.2254     -14.4409        -8.6664    -13.3457    -11.6213
 -16.1809    -22.7037     -25.023        -15.9376    -28.0817    -16.996
 -23.2671    -31.2021     -25.3787       -24.4914    -32.2599    -26.1118
                                                     
  -0.301629    0.0683982   -4.36574        1.92362    -5.12333    -3.4431
   4.7182      1.18615      4.40717       -4.51757    -8.64314     0.0800021
  -2.43775    -0.151882    -1.40817       -3.38555    -2.23418     0.728549
   3.2482     -0.60967      0.471288      2.53395     0.468817   -3.65905
  -4.26967     2.24747     -3.13758        1.74967     4.5052     -0.145357
  -0.752487    1.69446     -1.20491        1.71429     1.81936     0.290158
   4.64348    -3.94187     -1.59213        7.15428    -0.539748    4.82309
   1.09652    -2.66999      0.521931      -3.80528     1.70421     3.40583
  -0.94341     2.60785     -3.34291       1.10584     4.31118     3.6404

By indexing with a repeated interval, we have added a dimension to the output! The returned AxisArray's columns specify each repetition of the interval, and each datapoint in the column represents a timepoint within that interval, adjusted by the time of the theshold crossing. The best part here is that the returned matrix knows precisely where its data came from, and has labeled its dimensions appropriately. Not only is there the proper time base for each waveform, but we also have recorded the event times as the axis across the columns.

Indexing

Two main types of Axes supported by default include:

  • Categorical axis -- These are vectors of labels, normally symbols or strings. Elements or slices can be selected by elements or vectors of elements.

  • Dimensional axis -- These are sorted vectors or iterators that can be selected by Intervals. These are commonly used for sequences of times or date-times. For regular sample rates, ranges can be used.

Here is an example with a Dimensional axis representing a time sequence along rows and a Categorical axis of symbols for column headers.

B = AxisArray(reshape(1:15, 5, 3), .1:.1:0.5, [:a, :b, :c])
B[row = (0.2..0.4)] # restrict the AxisArray along the time axis
B[0.0..0.3, [:a, :c]]   # select an interval and two of the columns

User-defined axis types can be added along with custom indexing behaviors.

Example: compute the intensity-weighted mean along the z axis

B = AxisArray(randn(100,100,100), :x, :y, :z)
Itotal = sumz = 0.0
for iter in CartesianIndices(Base.axes(B))  # traverses in storage order for cache efficiency
    global Itotal, sumz
    I = B[iter]  # intensity in a single voxel
    Itotal += I
    sumz += I * iter[axisdim(B, Axis{:z})]  # axisdim "looks up" the z dimension
end
meanz = sumz/Itotal

The intention is that all of these operations are just as efficient as they would be if you used traditional position-based indexing with all the inherent assumptions about the storage order of B.

axisarrays.jl's People

Contributors

ajkeller34 avatar balinus avatar c42f avatar cmichelenstrofer avatar coroa avatar dependabot[bot] avatar devmotion avatar dilumaluthge avatar evizero avatar femtocleaner[bot] avatar github-actions[bot] avatar goretkin avatar iamed2 avatar jakebolewski avatar jefffessler avatar johnnychen94 avatar kshyatt avatar lantiga avatar mbauman avatar mcabbott avatar mortenpi avatar pabloferz avatar ranocha avatar richardreeve avatar rofinn avatar sbromberger avatar timholy avatar tokazama avatar tshort avatar yakir12 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

axisarrays.jl's Issues

Register and tag?

I'd love to make use of this package in the upcoming Images rewrite, and for that to work nicely we'll need to register & tag it. Is there a list of things that need to be addressed before that happens? It will be some time (week or two?) before the need is urgent, so there's time to address issues. I'm happy to do the work myself.

Also, if @mbauman would like to ditch implicit responsibility for the package, perhaps we should consider moving this to JuliaArrays. I just moved a bunch of my packages there.

Type piracy with cat ?

It looks like AA redefines cat on basic Vector types:

julia> using AxisArrays

julia> ll = Array{Array{Symbol,1},1}()
0-element Array{Array{Symbol,1},1}

julia> cat(ll..., dims=1)
ERROR: BoundsError: attempt to access ()
  at index [1]
Stacktrace:
 [1] getindex(::Tuple{}, ::Int64) at ./tuple.jl:24
 [2] _cat at /home/pablo/.julia/packages/AxisArrays/G6pZY/src/combine.jl:20 [inlined]
 [3] #cat#44(::Int64, ::Function) at /home/pablo/.julia/packages/AxisArrays/G6pZY/src/combine.jl:16
 [4] (::getfield(Base, Symbol("#kw##cat")))(::NamedTuple{(:dims,),Tuple{Int64}}, ::typeof(cat)) at ./none:0
 [5] top-level scope at none:0

Weighted mean

When indexing into an AxisArray, how can I get the value of the axis at the indexed value?

This becomes relevant for instance in the last example (btw, I PR:ed a version that works after following what you concluded in #127), it would make extra sense (to me at least) when the intensity-weighted mean is performed on the actual axis values, not on the integer indices. So in this specific example the axis are just integers, but if B was say AxisArray(randn(100,100,100), Axis{:x}(1:100), Axis{:y}(1:100), Axis{:z}(linspace(1mm, 3mm, 100))), how would you perform the weighted mean then?

Significant regression when indexing by value in 0.7

Apologies if this is a known issue and/or if the transition to 0.7 is still incomplete, but just in case I thought I should report this.

julia> a = AxisArray(rand(3,3), [:a,:b,:c], [:x,:y,:z])
2-dimensional AxisArray{Float64,2,...} with axes:
    :row, Symbol[:a, :b, :c]
    :col, Symbol[:x, :y, :z]
And data, a 3×3 Array{Float64,2}:
 0.991885  0.000544059  0.511753
 0.242214  0.0773739    0.131205
 0.42773   0.532756     0.565765

julia> test(a) = a[:a,:x]
test (generic function with 1 method)

Version 0.6.2:

julia> @btime test(a);
  53.651 ns (1 allocation: 16 bytes)

Version 0.7.0-DEV.4958:

julia> @btime test(a);
  571.501 μs (223 allocations: 25.45 KiB)

ataxis should return valid indices

I find the ataxis function very convenient. I'd like the function to be always safe to use:

A = AxisArray(collect(1:11), Axis{:time}(0:0.1:1));
A[-Inf .. 0.5] #works and indices are clamped correctly
A[atindex(-0.2..0.2, 5)] #works but one has to be careful not to set the index too low or high to hit the boundary
A[atindex(-0.2..0.2, 1)] #BoundsError: attempt to access 11-element Array{Int64,1} at index [[-1, 0, 1, 2, 3]] 

It would be great if the indexing with atindex would clamp the indices the same way as for A[-Inf .. 0.5]

Also, i'd like to have a function that would be used like this:

A[atrange(-0.2..0.2, 0.4)] # would return the same subarray as A[atindex(-0.2..0.2, 5)]
#now I go with the "naive implementation" 
srange(a, width, value; kw...) = atindex(-width..width, AxisArrays.to_index(a, atvalue(value,kw...))[1])

Unfortunately, one ends up inside @generated function to_index(A::AxisArray{T,N,D,Ax}, I...) where {T,N,D,Ax} that is too meta for me to get a clue what it does, nor fix the issue myself.
Petr

Should indexing return copies?

When I started development of this package, it was somewhat of a testbed for new indexing behaviors. I tried to make the package rather opinionated on how I believed indexing should work. One of those opinions was that indexing should return views by default.

Base Julia has caught up (and surpassed) AxisArrays in terms of APL indexing, but it hasn't caught the view train yet. Should AxisArrays take a step back and match behaviors?

Accessing last index

Hello, first of all: nice package! It's doing what I was trying to build in my own package. Now, I'm more into integrating this package into my own. :)

I do have one issue so far though: I'm building some AxisArrays with a time, longitude and latitude axis. Using the following syntax, I can't access the last element of the time dimension using end.

edit - using:
Julia 0.5
AxisArrays 0.0.4

julia> typeof(test)
AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}

julia> T = test[AxisArrays.Axis{:time}]
AxisArrays.Axis{:time,Array{Date,1}}(Date[1950-01-01,1950-01-02,1950-01-03,1950-01-04,1950-01-05,1950-01-06,1950-01-07,1950-01-08,1950-01-09,1950-01-10    2100-12-22,2100-12-23,2100-12-24,2100-12-25,2100-12-26,2100-12-27,2100-12-28,2100-12-29,2100-12-30,2100-12-31])

julia> typeof(T)
AxisArrays.Axis{:time,Array{Date,1}}

julia> T[1]
1950-01-01

julia> T[end]
MethodError: no method matching endof(::AxisArrays.Axis{:time,Array{Date,1}})
Closest candidates are:
  endof(::SimpleVector) at essentials.jl:169
  endof(::Char) at char.jl:18
  endof(::String) at strings/string.jl:37
  ...

julia> T[55115]
2100-12-31

Another example:

julia> test[Axis{:time}(1)]
2-dimensional AxisArray{Float64,2,...} with axes:
    :lon, [0.0,2.8125,5.625,8.4375,11.25,14.0625,16.875,19.6875,22.5,25.3125    331.875,334.688,337.5,340.313,343.125,345.938,348.75,351.563,354.375,357.188]
    :lat, [-87.8638,-85.0965,-82.3129,-79.5256,-76.7369,-73.9475,-71.1578,-68.3678,-65.5776,-62.7874    62.7874,65.5776,68.3678,71.1578,73.9475,76.7369,79.5256,82.3129,85.0965,87.8638]
And data, a 128×64 Array{Float64,2}:
 -13.1781   -9.32368   -7.82642   -8.93107   -9.87894     -1.83304   -4.26288  -10.9854   -22.7088  -32.7322  -37.4845
 -13.1757   -9.49341   -8.18998   -9.56293  -10.5652       -2.10532   -3.36542   -8.54114  -20.7078  -31.7471  -37.1273
 -13.2207   -9.75245   -8.60462  -10.1924   -11.126        -2.91254   -3.39817   -7.39728  -18.8907  -30.91    -36.9068
 -13.4053  -10.0192    -9.02738  -10.6354   -11.8759       -4.08307   -4.07075   -6.77033  -17.479   -30.0259  -36.6732
 -13.533   -10.9317    -9.48826  -11.1507   -12.4858       -5.39445   -5.13422   -7.10557  -16.6137  -29.0839  -36.4522
 -13.6584  -10.7656    -9.93928  -11.7239   -12.8303      -6.65769   -7.51227   -8.27405  -16.1455  -28.2062  -36.322 
 -13.6874  -10.9521   -10.4106   -12.1195   -13.2295       -8.16553  -10.9555    -8.8057   -15.9809  -27.546   -36.2891
 -13.7967  -11.1882   -10.8646   -12.5337   -13.5852       -9.87189  -12.7761    -9.81238  -16.2154  -27.0488  -36.325 
 -13.8111  -11.4431   -11.3259   -13.0253   -13.9069      -11.5795   -15.156    -11.1282   -16.5414  -26.7977  -36.3877
 -13.9304  -11.7004   -11.7421   -13.4728   -14.2188      -13.0959   -16.7264   -15.3921   -16.9609  -26.8751  -36.26  
 -14.0054  -11.9974   -12.1023   -13.8859   -14.5353     -15.4631   -20.6427   -17.7955   -17.3621  -27.1115  -36.0593
 -14.0489  -12.3577   -12.4554   -14.2304   -14.8702      -16.9788   -23.8476   -19.3232   -18.0571  -27.1679  -35.8542
 -14.1056  -12.4714   -12.8453   -14.4906   -15.1919      -20.4497   -26.3586   -20.5891   -18.5888  -27.1167  -35.6224
 -14.2416  -12.7158   -13.2491   -14.738    -15.3265      -22.8977   -28.4267   -21.5647   -19.1263  -27.1368  -35.4454
 -14.1789  -12.9751   -13.5664   -14.798    -15.3577      -24.4087   -29.7131   -22.5045   -19.7391  -27.3171  -35.3752
 -14.3764  -13.238    -13.8794   -14.9066   -15.2137     -25.6306   -30.6848   -23.4341   -20.2359  -27.5961  -35.4666
                                                                                                                    
 -12.6153   -7.49509   -4.35917   -3.55839   -2.69089     -30.2057   -30.5568   -32.3227   -37.9244  -38.7628  -38.3909
 -12.5765   -7.47901   -4.42795   -3.67417   -3.07596     -31.7261   -34.4672   -33.0697   -38.7802  -39.4308  -38.6722
 -12.5693   -7.47721   -4.52579   -3.77119   -2.93836     -33.7065   -35.7553   -33.3518   -39.5584  -40.0128  -38.9905
 -12.6683   -7.48649   -4.58289   -3.85673   -2.86304    -35.142    -35.927    -33.8506   -40.2003  -40.5507  -39.2897
 -12.7379   -7.51856   -4.69724   -3.96305   -2.78657     -32.0892   -37.2641   -34.1958   -40.5407  -41.0779  -39.5395
 -12.6542   -7.57889   -4.86066   -4.07657   -2.7201      -32.0626   -36.9486   -34.3959   -40.6646  -41.5107  -39.7349
 -12.643    -7.64567   -4.9675    -4.24183   -2.78983     -27.3247   -35.6354   -34.0476   -40.3867  -41.3766  -39.8855
 -12.7144   -7.73249   -5.1489    -4.43427   -2.97227     -25.3403   -33.3752   -33.4608   -39.6832  -40.7994  -40.0125
 -12.7472   -7.83204   -5.366     -4.72288   -3.25089    -21.6124   -29.8843   -33.0288   -39.0629  -40.2828  -39.9988
 -12.7618   -7.94227   -5.5405    -5.03351   -3.53205     -16.5938   -23.5874   -31.3431   -37.6958  -39.6077  -39.7354
 -12.772    -8.08826   -5.77714   -5.37885   -4.11234     -13.8241   -20.7749   -26.1037   -36.3062  -38.7978  -39.4562
 -12.8576   -8.25623   -6.13535   -5.85761   -4.90565     -10.706    -18.2744   -24.2676   -33.6903  -37.878   -39.1897
 -12.9316   -8.43967   -6.46885   -6.31055   -5.83805      -8.03733  -15.184    -22.3596   -30.9707  -36.882   -39.0329
 -12.9815   -8.64146   -6.78547   -6.95096   -6.82795     -4.94059  -11.9601   -19.3953   -29.2095  -35.781   -38.3263
 -13.0144   -8.85602   -7.04481   -7.52024   -7.87766      -3.31864   -9.8346   -16.7155   -27.2396  -34.8338  -38.1462
 -13.0999   -9.158     -7.44782   -8.31867   -8.89557      -2.30311   -6.4213   -13.8705   -24.8114  -33.7981  -37.8232
julia> test[Axis{:time}(end)]
------ BoundsError --------------------- Stacktrace (most recent call last)

 [1] — getindex(::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}, ::AxisArrays.Axis{:time,Int64}) at indexing.jl:84

 [2] — getindex at indexing.jl:60 [inlined]

 [3] — getindex at abstractarray.jl:752 [inlined]

 [4] — _getindex at multidimensional.jl:270 [inlined]

 [5] — checkbounds at abstractarray.jl:284 [inlined]

 [6] — throw_boundserror(::Array{Float64,3}, ::Tuple{Int64,Colon,Colon}) at abstractarray.jl:355

BoundsError: attempt to access 55115×128×64 Array{Float64,3} at index [451502080,Colon(),Colon()]

Constructing an AxisArray of an AxisArray

The array underlying the AxisArray of an AxisArray is an AxisArray:

julia> a = AxisArray(zeros(2, 4), Axis{:a}((3, 4)))
2-dimensional AxisArray{Float64,2,...} with axes:
    :a, (3, 4)
    :col, Base.OneTo(4)
And data, a 2x4 Array{Float64,2}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

julia> b = AxisArray(a)
2-dimensional AxisArray{Float64,2,...} with axes:
    :row, Base.OneTo(2)
    :col, Base.OneTo(4)
And data, a 2-dimensional AxisArray{Float64,2,...} with axes:
    :a, (3, 4)
    :col, Base.OneTo(4)
And data, a 2x4 Array{Float64,2}:
 0.0  0.0  0.0  0.0
 0.0  0.0  0.0  0.0

Array, however, is a no-op:

julia> a = Array{Int64}((1, 2))
1x2 Array{Int64,2}:
 4520822288  4520821504

julia> Array{Int64}(a) === a
true

I'm not sure what the default behaviour should be:

  1. nesting as now (though, personally, I find it a bit surprising)
  2. no-op, disallow specifying axes
  3. no-op unless new axes are specified, in which case a new wrapper over the original underlying array is supplied. It the number of supplied axis is < than the number of dimensions, use the existing axes, rather than default_axes. But that behaviour might all too special.

I'm happy to do a PR for 2 or 3

Indexing with unsorted vectors

I'm not sure what to do here, but this is a problem:

julia> A = AxisArray(1:10)
10-element AxisArrays.AxisArray{Int64,1,UnitRange{Int64},(AxisArrays.Axis{:row,UnitRange{Int64}},)}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10

julia> A[[3,2,1]]
ERROR: Dimensional axes must be monotonically increasing
 in checkaxis at /Users/mbauman/.julia/v0.4/AxisArrays/src/core.jl:290
 in anonymous at /Users/mbauman/.julia/v0.4/AxisArrays/src/core.jl:155
 in anonymous at /Users/mbauman/.julia/v0.4/AxisArrays/src/indexing.jl:51

Combining AxisArrays - beyond cat

Thinking more about high dimensional concatenation and some typical use cases for combining AxisArrays, there may be a few ways to make better use of axis data for "smarter" alternatives to cat:

merge{T<:AxisArray}(As::T...): given AxisArrays with identical axes but disjoint / non-colliding axis values, automatically combine the inputs into a single AxisArray spanning at least the discrete ranges of all of the input AxisArrays. Unlike a basic cat, dimensional axes would be automatically sorted as needed. Any elements at a newly-created axis coordinate would assume some default value (in Base a comparable multidimensional cat fills undefined elements with zeros, but maybe this value could be user-definable?).

join{T<:AxisArray}(method::Symbol, As:T...): given AxisArrays with identical axes and possibly overlapping axis values, automatically concatenate the inputs on a new, higher dimension/axis. Dimensional axes would be sorted appropriately. Inner/left/right/outer join methods would determine whether all elements are included in the returned AxisArray or just those at axis values found in all/first/last input AxisArrays.

Thoughts? Are these actions generally common enough to merit special functions? Like the cat work I was going to implement something similar for a downstream package but figure it may be more generally useful?

Document indexing semantics

I'm curious why the API has this indexing syntax and why it's different from standard Julia array indexing.

julia> A = AxisArray(reshape(1:60, 12, 5), (.1:.1:1.2, .1:.1:.5));

julia>  A[Axis{:col}(2)] == A[:,2]
true

julia>  A[Axis{:col}(2)] === A[:,2]
false

Sparse Arrays

@mbauman have you given any thought as to how sparse arrays would fit into this framework? I have a project that I've been working on that overlaps with this concept and I would like to help move it here so that people will actually use it :-)

Making AxisArray Abstract

Not sure if this has been discussed before (a quick search didn't turn up much) but I'm wondering if it is possible to make AxisArray abstract (or have an AbstractAxisArray that AxisArray inherits from) and defining an interface that a new AxisArray type must implement to get the indexing behavior of an AxisArray? I ask because for variables in NetCDF files it seems natural for them to be AxisArrays since all arrays in NetCDF files have the dimension information included in the file. But they have other information (e.g., attributes) that it is natural for a NCVar type to include as well. Thus one could have NCVar <: AbstractAxisArray. This would also allow NCVar to have the option of being lazily read when indexed instead of being completely in memory (NetCDF files can be large).

Roadmap

We're getting to the point where the core API is stabilizing. We can now start making things fancier and building out interfaces to base methods and other packages (like #5). Here's a current brain-dump of some of my thoughts. Additions, critiques and comments are very welcome.

Remaining core infrastructure

  • Vector and multi-dimensional setindex! (mirroring getindex's capabilities). (#11)
  • Type-stable constructors. We should allow passing tuples of Axis types to specify the dimension names. It would be an interesting experiment to store the dimension names as Axis types instead of symbols. I'm not sure if that'd make things simpler or more convoluted. It may be a mixed bag.
  • Display. We should display axis names and values in a sensible way. In some regards, I think they're more important to see than a small window into the data, particularly with more than 2 dimensions. This isn't perfect, but 4b62efe is a pretty big improvement.
  • Online documentation and a README revamp. I think that Lexicon.jl/MkDocs is currently the easiest and best solution to make the inline documentation accessible online. Basic implementation from #29.

Possible additions to the core infrastructure

  • Add a third flavor of axis trait for Dimensional axes with elements of a discrete step-like type. The key defining characteristic of this element type is that their StepRanges must enumerate all values between the endpoints, allowing us to provide sensible indexing directly with a StepRange. This also means that there's no issues along the lines of floating point instability, so we can also allow indexing directly with single-elements of this type (and don't need to force the use of Intervals). The main use-case I see here is for Date. Are there other types that satisfy these criteria? What is a sensible name for this trait?
  • Allow (or maybe even encourage) the use of an ordered hashmap for categorical axes to enable O(1) index lookup. Some experimentation is required here: I'm sure the linear search will outperform hashing for small N (particularly with symbols), but what's the cutoff? 10? 100? What about strings? Chances are that folks won't be using categorical vectors to enumerate more than 100 elements.
  • Allow an unsafe constructor that doesn't check axis invariants. Or maybe check upon wrapping with an Axis type (#15). Ensuring that large, non-Range Dimensional axes are monotonically increasing can take a long time (we may be able to speed this up some with a special Ordering type, but it's still O(n)). Similarly, ensuring that elements in categorical vectors are unique requires hashing all elements (which could be used for the above hashmap).
  • Custom iterators. An eachslice iterator would be nice and useful in and of itself, and if we allow the same sort of syntax and semantics as mapslices it can serve as the building block for augmenting that Base function.
  • Windowed repetitions, with a more generic implementation of Signals.window. I was thinking of allowing windowing as an indexing operation (mbauman/Signals.jl#10), but constructing vectors of interval types with deferred promotion (to allow, e.g., windows specified in time about integer indices) has been very challenging.

Extensions to Base

  • We should specialize all Base functions that allow selecting specific dimensions, like sum, mean, maximum, mapslices, etc. We can also return AxisArrays with the properly reduced axis set, dropping a dimension and eliminating the type-unstable squeezes.
  • Permutations and transposes could keep track of and preserve axis information
  • NamedArrays.jl and TimeSeries.jl go even farther to specialize matrix arithmetic to try to preserve names through operations. I'm not convinced this is worth the effort, but it is appealing.

Extensions to other packages

  • It'd be nice to allow re-interpolating along any axis with Interpolations.jl. This will require some work on Interpolations.jl, too.
  • DSP.jl could provide filtering (mbauman/Signals.jl#1) with sensible units for filter design and spectrograms that return a properly annotated AxisArray with an extra frequency (or inverse-whatever) axis.

Dimensional Axes with TimeTypes

I've been thinking about building a range-based n-dimensional time-series data structure (similar in functionality to TimeSeries.jl, with the first axis dimensional and any others categorical) on top of AxisArrays.jl - TimeTypes like Date, DateTime, and ZonedDateTime don't have a defined zero value, however (see JuliaLang/julia#10450), which prevents them from being used as dimensional variables (because of the current implementation of the monitonically-increasing / positive step check).

Would any of these solutions make sense?

  1. A more general check that supports TimeTypes
  2. A special case of the check for dimensional axes of type Range{T} with T <: TimeType
  3. A custom axis type for holding TimeTypes

While I'm still getting my bearings with the package, I'd be happy to either contribute here or add extensions via the downstream package as appropriate.

Add ability to iterate on axis elements

It would be great to have the ability to do the following:

Main> a = AxisArray(randn(10,10,5), :x, :y, :time);

Main> timeaxis(a)
AxisArrays.Axis{:time,Base.OneTo{Int64}}(Base.OneTo(5))

Main> for timepoint in timeaxis(a)
          println(timepoint)
       end
ERROR: MethodError: no method matching start(::AxisArrays.Axis{:time,Base.OneTo{Int64}})
Closest candidates are:
  start(::SimpleVector) at essentials.jl:258
  start(::Base.MethodList) at reflection.jl:560
  start(::ExponentialBackOff) at error.jl:107
  ...
Stacktrace:
 [1] anonymous at ./<missing>:?
 [2] eval(::Module, ::Any) at ./boot.jl:235

That would be the most intuitive method for me to iterate over slices along an axis.

My current solution is way uglier and more fragile

for timepoint in 1:size(exp_img)[axisdim(exp_img, Axis{:time}())]
    println(timepoint)
end

Code in README no longer works

The last example in the readme:

B = AxisArray(randn(100,100,100), :x, :y, :z)
Itotal = sumz = 0.0
for iter in eachindex(B)  # traverses in storage order for cache efficiency
    I = B[iter]  # intensity in a single voxel
    Itotal += I
    sumz += I * iter[axisdim(B, Axis{:z})]  # axisdim "looks up" the z dimension
end
meanz = sumz/Itotal

doesn't work under Julia 0.6. It gives a bounds error:

ERROR: BoundsError
Stacktrace:
 [1] getindex(::Int64, ::Int64) at ./number.jl:38
 [2] macro expansion at ./REPL[4]:4 [inlined]
 [3] anonymous at ./<missing>:?

I believe this is due to eachindex returning a single value instead of a tuple. This is likely because it is using fast linear indexing instead of Cartesian indexing. What would the updated version of this code look like? @timholy

Uninitialized AxisArray constructor

Is there any reason there's currently no constructor to create an uninitialized AxisArray? I.e. something like:

AxisArray{T,N,D}(axes::Axis...)

This seems pretty straightforward, I'm happy to submit a PR - just wanted to check that it wasn't intentionally omitted for some reason.

There's a potential conflict between N and length(axes) - presumably N would be the canonical dimensionality with axes either truncated or extended (with the standard default dimension names and zero-length?) as needed.

Edit: To avoid the over/under-specification, an alternative would be to deviate from the AxisArray type parameter order specification and use AxisArray{T,D} or just AxisArray{T} (where T is a concrete child of AbstractArray). I assume that's frowned upon though - if it's even possible (I'm not clear on how exactly type parametrization works on constructor methods or methods in general, really - any recommended resources for that?).

Faster indexing with Categorical axes

With named columns, indexing based on a symbol slows down tight loops. DataFrames has the same problem. It's tough to get fast, type-stable code for this. Here is an example:

using AxisArrays

N = 10_000_000
A = AxisArray(rand(10000000,2), (1:N,[:a,:b]));

function f(x, j)
    res = 0.0
    for i in 1:size(x,1)
        res += x[i,j]
    end
    res
end

@time f(A, 2)  #  elapsed time: 0.028395385 seconds (117 kB allocated)

@time f(A, :b) #  elapsed time: 0.147722792 seconds (364 kB allocated)

I tried using types as column names to speed things up, but the resulting code isn't type stable, and it's even slower. Here's my attempt:

## Bunch of kludgy stuff
AxisArrays.checkaxis{T}(::Type{Val{T}}) = true
stagedfunction AxisArrays.axisindexes{T,S}(::Type{Val{T}}, ::Type{Val{S}})
    i = findfirst(T, S)
    :($i)
end
Base.length{T}(::Type{Val{T}}) = length(T)
Base.size{T}(::Type{Val{T}}) = size(T)
Base.size{T}(::Type{Val{T}},i) = size(T,i)

B = AxisArray{Float64,2,Array{Float64,2},(:row,:col),(UnitRange{Int64},DataType)}(rand(N,2), (1:N,Val{(:a,:b)}));

@time f(B, Val{:b}) # elapsed time: 0.982083383 seconds (461 MB allocated, 3.26% gc time in 20 pauses with 0 full sweep)
@time f(B, 2)       # elapsed time: 0.019993164 seconds (88 kB allocated)

AxisArray broadcast container type

Currently AxisArrays do not use the container type information from the parent when working with broadcast. Typically this is fine but in some cases this can raise an exception:

julia> using DataArrays, AxisArrays

julia> a = AxisArray(@data([1, NA, 3]), 'a':'c')
1-dimensional AxisArray{Int64,1,...} with axes:
    :row, 'a':1:'c'
And data, a 3-element DataArrays.DataArray{Int64,1}:
 1
  NA
 3

julia> parent(a) .== 1
3-element DataArrays.DataArray{Bool,1}:
  true
      NA
 false

julia> a .== 1
ERROR: MethodError: Cannot `convert` an object of type DataArrays.NAtype to an object of type Bool
This may have arisen from a call to the constructor Bool(...),
since type constructors fall back to convert methods.
Stacktrace:
 [1] setindex!(::Array{Bool,1}, ::DataArrays.NAtype, ::Int64) at ./array.jl:549
 [2] macro expansion at ./broadcast.jl:180 [inlined]
 [3] macro expansion at ./simdloop.jl:73 [inlined]
 [4] macro expansion at ./broadcast.jl:174 [inlined]
 [5] _broadcast!(::##1#2, ::BitArray{1}, ::Tuple{Tuple{Bool}}, ::Tuple{Tuple{Int64}}, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}, ::Tuple{}, ::Type{Val{0}}, ::CartesianRange{CartesianIndex{1}}) at ./broadcast.jl:162
 [6] broadcast_t(::Function, ::Type{Bool}, ::Tuple{Base.OneTo{Int64}}, ::CartesianRange{CartesianIndex{1}}, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}) at ./broadcast.jl:279
 [7] broadcast_c at ./broadcast.jl:314 [inlined]
 [8] broadcast(::Function, ::AxisArrays.AxisArray{Int64,1,DataArrays.DataArray{Int64,1},Tuple{AxisArrays.Axis{:row,StepRange{Char,Int64}}}}) at ./broadcast.jl:434

Easier way to slice along a chosen axis?

I'm thinking about mapslices-like behavior, where I have an algorithm where I might want to pass in a particular axis: foo(A, ax) where ax is one of the axes of A. I might write foo something like this:

function foo(A, ax)
    for i = 1:length(ax)
        Asl = A[ax(i)]
        # do something with Asl
    end
    return blah
end

The key line is that A[ax(i)], which doesn't work. I find myself doing this kind of experimentation:

julia> using AxisArrays

julia> A = AxisArray(rand(3,5), :x, :y)
2-dimensional AxisArray{Float64,2,...} with axes:
    :x, 1:3
    :y, 1:5
And data, a 3×5 Array{Float64,2}:
 0.351439  0.92995   0.0618315  0.9546    0.298008
 0.721548  0.149403  0.455972   0.999368  0.592674
 0.175467  0.123669  0.137399   0.66286   0.828891

julia> ax = axes(A)[2]
AxisArrays.Axis{:y,UnitRange{Int64}}(1:5)

julia> A[ax[2]]
0.7215482478540596

julia> ax(2)
ERROR: MethodError: objects of type AxisArrays.Axis{:y,UnitRange{Int64}} are not callable

julia> A[Axis{:y}(2)]
1-dimensional AxisArray{Float64,1,...} with axes:
    :x, 1:3
And data, a 3-element SubArray{Float64,1,Array{Float64,2},Tuple{Colon,Int64},true}:
 0.92995 
 0.149403
 0.123669

julia> axisnames(ax)
(:y,)

The last one works, but you seemingly have to know the name of ax to make it work. If you've passed ax as an argument, there isn't a way to know this short of dispatching on name in Axis{name}. EDIT: of course, one is likely to write foo with signature foo(A, ax::Axis), so maybe it's not so bad to use dispatch. It still feels like ax objects should be indexable-for-indexing, though.

Am I missing a simple way to do this? If not, I'd volunteer to implement ax(2) if you think that's reasonable.

Preserve omitted dimensions when indexing by axis values?

I'm really liking how the API is coming together. Normal indexing works like a charm, and it makes a lot of sense that the special A[Axis{:time}(idxs)] indexing form would preserve all the other (unnamed) dimensions. But what about when you index by axis values with missing trailing dimensions? A good example here would be the following:

A = AxisArray(reshape(1:15, 5,3), (.1:.1:0.5, [:a, :b, :c]))
A[Interval(.22, .44)]

As it is currently designed, this will lower to a call to A.data[3:4], which is linearly indexing across both dimensions. As such, you'll only get back two elements, when you probably wanted those two rows from all three columns (A.data[3:4, :]).

I think there's a case to be made here to preserve the omitted dimensions. Since we're indexing by axis value, we cannot select any indices that are out-of-bounds for that dimension — we'll only ever get values from the first column. That doesn't seem very useful to me. Any objections?

Inference failures on 0.7

Indexing: Error During Test
  Test threw an exception of type ErrorException
  Expression: @inferred(D[1, 1, 1, :]) == @inferred(D[1, 1, 1, 1:1]) == @inferred(D[1, 1, 1, [1]]) == AxisArray([10], Axis{:dim_4}(Base.OneTo(1)))
  return type AxisArrays.AxisArray{Int64,1,Array{Int64,1},Tuple{AxisArrays.Axis{:dim_4,Base.OneTo{Int64}}}} does not match inferred return type AxisArrays.AxisArray{Int64,1,Array{Int64,1},_} where _

There are others but this one is the simplest (found by running AxisArrays master tests on JuliaLang master).

confusing behavior with integer indices

Using AxisArrays master:

julia> x = AxisArray(Array{Int}(3), Axis{:t}(-3:-1))
1-dimensional AxisArray{Int64,1,...} with axes:
    :t, -3:-1
And data, a 3-element Array{Int64,1}:
 0
 0
 0

julia> x[-3]
ERROR: BoundsError: attempt to access 3-element Array{Int64,1} at index [-3]
Stacktrace:
 [1] getindex(::AxisArrays.AxisArray{Int64,1,Array{Int64,1},Tuple{AxisArrays.Axis{:t,UnitRange{Int64}}}}, ::Int64) at /home/mlubin/.julia/v0.6/AxisArrays/src/indexing.jl:41

The corresponding definition of getindex is:

# Simple scalar indexing where we just set or return scalars
@propagate_inbounds Base.getindex(A::AxisArray, idxs::Int...) = A.data[idxs...]

Is this a bug or a feature? Are integer-indexed axes like -3:-1 not really supported?

TimeArray meets AxisArray

julia> using MarketData

julia> Apple = AxisArray(ohlc[1:12].values, (ohlc[1:12].timestamp, colnames(ohlc)))
12x4 AxisArrays.AxisArray{Float64,2,Array{Float64,2},(:row,:col),(Array{Base.Dates.Date,1},Array{UTF8String,1})}:
 104.88  112.5   101.69  111.94
 108.25  110.62  101.19  102.5 
 103.75  110.56  103.0   104.0 
 106.12  107.0    95.0    95.0 
  96.5   101.0    95.5    99.5 
 102.0   102.25   94.75   97.75
  95.94   99.38   90.5    92.75
  95.0    95.5    86.5    87.19
  94.48   98.75   92.5    96.75
 100.0   102.25   99.38  100.44
 101.0   106.0   100.44  103.94
 105.62  108.75  103.38  106.56

julia> Apple.axes[1]
12-element Array{Base.Dates.Date,1}:
 2000-01-03
 2000-01-04
 2000-01-05
 2000-01-06
 2000-01-07
 2000-01-10
 2000-01-11
 2000-01-12
 2000-01-13
 2000-01-14
 2000-01-18
 2000-01-19

julia> Apple.axes[2]
4-element Array{UTF8String,1}:
 "Open" 
 "High" 
 "Low"  
 "Close"

Move more logic into Axis type?

Now that the axis information is stored as proper Axis types, one possible way to both allow an unchecked constructor and fix #14 would be to move the sort check into the Axis constructor. We could even go further, and make Axis an abstract super type of all these traits, with the Axis constructor returning the appropriate type. This could include, for example, a SortedDimensionalAxis type.

This would, of course, make the abstract Axis constructor type-unstable. It also moves more of the axis trait complexity into the foreground.

Question: How to change axis?

Hi,
I am wondering if there is a way to change axes.
For example,

using Unitful
using AxisArrays
const μm = u"μm"
const s = u"s"

tmp = AxisArray(rand(1000,1000,2,2), (:x, :y, :z, :t), (0.577μm, 0.577μm, 5μm, 2s))

I would like to change 0.577μm of :x to some other values. Is there easy with to change it (without generating a new array)?
I tried several things. All didn't work.

# All below do not work.
tmp[Axis{:x}] = 1:1000 
tmp = AxisArray(tmp, (:x), (1))
tmp = AxisArray(tmp, (:x, :y, :z, :t), (1, 0.577μm, 5μm, 2s)) # This generates additional axis.

Best,

Allow AxisArrays that are missing dimensions and/or names?

I'm still just reading the code (and getting a better feel for staged functions!). On @mbauman's question in the code on allowing missing names, I think we should make axes names mandatory or give defaults. However, we should allow the user to leave out the axes (default to nothing for each). A user might only want the Axis() indexing capability. If folks agree, it would be good to flip the order on the following constructor as follows:

stagedfunction AxisArray{T,N}(A::AbstractArray{T,N}, names::(Symbol...), axes::(AbstractVector...))

Then we need defaults of nothing for each axes. Can you have a staged function with default arguments?

Basic operators on AxisArrays

Hello,

I'd like to know if there is a way to do basic operations on AxisArrays, such as addition of two AxisArray that would have the same axes.

Thanks for any info!

Trying "Base.mapreduce!" on an AxisArray

Hello, I'm trying the following code (which simply calculate the number of element that are over a threshold 1). Probably a problem on my side (e.g. not using the correct command for what I'm trying to do).

Code
julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(A.data, idx, :, :))

Some Outputs

julia> typeof(A.data)
AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}

julia> typeof(FD)
Array{Int64,3}

julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(A.data, 1:365, :, :))
------ MethodError --------------------- Stacktrace (most recent call last)

 [1] — mapreducedim!(::Function, ::Function, ::SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}, ::AxisArrays.AxisArray{Float64,3,SubArray{Float64,3,Array{Float64,3},Tuple{UnitRange{Int64},Colon,Colon},false},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}) at reducedim.jl:222

 [2] — _mapreducedim!(::##39#40, ::Base.#+, ::SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}, ::AxisArrays.AxisArray{Float64,3,SubArray{Float64,3,Array{Float64,3},Tuple{UnitRange{Int64},Colon,Colon},false},Tuple{AxisArrays.Axis{:time,Array{Date,1}},AxisArrays.Axis{:lon,Array{Float64,1}},AxisArrays.Axis{:lat,Array{Float64,1}}}}) at reducedim.jl:206

 [3] — macro expansion at simdloop.jl:73 [inlined]

 [4] — macro expansion at reducedim.jl:207 [inlined]

MethodError: no method matching isless(::AxisArrays.AxisArray{Float64,1,Array{Float64,1},Tuple{AxisArrays.Axis{:dim_4,Base.OneTo{Int64}}}}, ::Int64)
Closest candidates are:
  isless(::Char, ::Integer) at deprecated.jl:49
  isless(::AbstractFloat, ::Real) at operators.jl:42
  isless(::Real, ::Real) at operators.jl:75
  ...

Using a regular Array{Float64,3}, the command works:

julia> R = randn(20440,128,64)
julia> Base.mapreducedim!(t -> t >= 1, +, view(FD, 1:1, :, :), view(R, 1:365, :, :))
1×128×64 SubArray{Int64,3,Array{Int64,3},Tuple{UnitRange{Int64},Colon,Colon},false}:
[:, :, 1] =
 58  59  61  67  63  46  64  54  52  54  50  51  59  55  73  53    55  56  51  56  64  62  68  46  61  66  57  55  61  56  48

...

[:, :, 64] =
 66  48  55  49  70  67  56  59  53  70  40  52  60  49  52  59    45  64  59  56  59  63  71  72  67  58  48  46  65  67  48

julia> typeof(R)
Array{Float64,3}

Reductions fail with Unitful axes

julia> using Unitful, AxisArrays

julia> C = AxisArray(collect(reshape(1:15,3,5)), Axis{:y}([2.0,3.0,4.5]u"m"), Axis{:x}([2.0,2.1,2.3,2.5,2.6]u"cm"))
2-dimensional AxisArray{Int64,2,...} with axes:
    :y, Quantity{Float64, Dimensions:{𝐋}, Units:{m}}[2.0 m, 3.0 m, 4.5 m]
    :x, Quantity{Float64, Dimensions:{𝐋}, Units:{cm}}[2.0 cm, 2.1 cm, 2.3 cm, 2.5 cm, 2.6 cm]
And data, a 3×5 Array{Int64,2}:
 1  4  7  10  13
 2  5  8  11  14
 3  6  9  12  15

julia> sum(C, 1)
ERROR: DimensionError: m and 1 are not dimensionally compatible.
Stacktrace:
 [1] copy!(::IndexLinear, ::Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}, ::IndexLinear, ::Base.OneTo{Int64}) at ./abstractarray.jl:655
 [2] reduced_axis at /Users/ajkeller/.julia/v0.6/AxisArrays/src/core.jl:345 [inlined]
 [3] (::AxisArrays.##5#6{Tuple{Int64}})(::AxisArrays.Axis{:y,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}}, ::Int64) at /Users/ajkeller/.julia/v0.6/AxisArrays/src/core.jl:306
 [4] mapreducedim at ./reducedim.jl:241 [inlined]
 [5] sum at ./reducedim.jl:570 [inlined]
 [6] sum(::AxisArrays.AxisArray{Int64,2,Array{Int64,2},Tuple{AxisArrays.Axis{:y,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{m}},1}},AxisArrays.Axis{:x,Array{Quantity{Float64, Dimensions:{𝐋}, Units:{cm}},1}}}}, ::Int64) at ./reducedim.jl:572

Would be fixed by redefining reduced_axis and reduced_axis0 in src/core.jl:

reduced_axis( ax) = ax(Base.OneTo(1))
reduced_axis0(ax) = ax(length(ax.val) == 0 ? Base.OneTo(0) : Base.OneTo(1))

However, this change would be at the expense of inferrability in certain cases, resulting in some regressions.

I noticed this issue when looking over #112. That PR improves the situation for other types but does not fix this issue, because Unitful.Quantity <: Number. One possible fix based on that PR would be to have an inferrable reduced_axis for Union{Real,Complex} rather than Number but I'm not sure what the broader consequences would be. It seems like a lot of Number subtypes defined in packages are actually <: Real (e.g. Measurements.Measurement, ForwardDiff.Dual) so maybe this would be acceptable.

Reductions do not preserve type

julia> AxisArray(randn(3,3), (:x,:y),(0.1,0.1),(-0.1,-0.2))
2-dimensional AxisArray{Float64,2,...} with axes:
    :x, -0.1:0.1:0.1
    :y, -0.2:0.1:0.0
And data, a 3×3 Array{Float64,2}:
 0.763468   0.587158  0.520983 
 0.0610593  0.801456  0.721546 
 0.755276   0.834483  0.0795272

julia> maximum(B,2)
3×1 Array{Float64,2}:
 0.763468
 0.801456
 0.834483

julia> B[:,2]
1-dimensional AxisArray{Float64,1,...} with axes:
    :x, -0.1:0.1:0.1
And data, a 3-element Array{Float64,1}:
 0.587158
 0.801456
 0.834483

I would have expected that maximum also returns an AxisArray. This was the case with the old Image object.

ping @timholy

Convenience function for setting axis units

Is there any interest in having a convenience function for setting units for an existing Axis or AxisArray that is currently unitless? I found this useful when loading someone else's NRRD image using NRRD.jl. With this format it seems common to save images with meaningful spacings but without units, and units are basically communicated by word of mouth since the format doesn't standardize them anyway. I've been manually re-creating the axes after I load the images and am looking for something more convenient. Am I missing something in the API? What if we overload arithmetic like this?

using AxisArrays, Unitful
ax = Axis{:x}(linspace(0,10,11))
ax *= Unitful.μm

Enforce axis constraint in merge function

Hello!

I'm wondering if there is a way to easily keep one of the axis as a "sort" axis constraint in merge operations. Here's an example of what is the current behavior:

A = AxisArray(collect(1:11), Axis{:time}(0:0.1:1))
B = AxisArray(collect(12:22), Axis{:time}(1.1:0.1:2.1))
M = merge(B, A) # ideally, I would like to force the time axis to an ascending order

1-dimensional AxisArray{Int64,1,...} with axes:
    :time, [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0    0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
And data, a 22-element Array{Int64,1}:
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
  
  3
  4
  5
  6
  7
  8
  9
 10
 11

Now, trying sort(M) returns

sort(M)
1-dimensional AxisArray{Int64,1,...} with axes:
    :time, [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0    0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
And data, a 22-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
  
 14
 15
 16
 17
 18
 19
 20
 21
 22

Hence, the values are sorted, but not the time axis definition.

Ideally, I would like to enforce the :time axis to be the sorting constraint in merge function.

I coded a workaround in a custom merge function which consist of reverting the order of the argument (i.e. merge(B, A) -> merge(A, B)). But I suspect that it only works for 2-input call to merge without a more complex workaround.

Thanks for any help or hint on how it can be done!

cat operations

It would be great to have AxisArray-specific concatenation operations - it would seem fairly straightforward to implement for concatenating on existing dimensions, but operating on higher (not-yet-existing) dimensions might need a bit more consideration regarding how to create the new axes. Or maybe one would just be required to supply an appropriate Axis?

axistrait works fine

On both 0.5 and 0.6:

julia> using AxisArrays

julia> AxisArrays.axistrait(Axis{:myaxis}([1,2,3]))
AxisArrays.Unsupported

julia> AxisArrays.axistrait(Axis{:myaxis}([:A, :B]))
AxisArrays.Unsupported

Right now the methods are:

axistrait(::Any) = Unsupported
axistrait{T<:Union{Number, Dates.AbstractTime}}(::AbstractVector{T}) = Dimensional
axistrait{T<:Union{Symbol, AbstractString}}(::AbstractVector{T}) = Categorical

A working dispatch-based solution should be possible in 0.6, not sure about 0.5? Alternatively this could be switched over to a single method using eltype?

Indexing by Reals

I've been incorporating AxisArrays into my measurement code and I have got to say, it is really awesome. Great work. One small thing that tripped me up a bit was the following:

julia> A = AxisArray(rand(10,10), Axis{:test1}(0.1:0.1:1.0), Axis{:test2}(0.1:0.1:1.0))
2-dimensional AxisArray{Float64,2,...} with axes:
    :test1, 0.1:0.1:1.0
    :test2, 0.1:0.1:1.0
And data, a 10×10 Array{Float64,2}:
 0.956542   0.269435   0.968573  0.749255  0.931304   0.0717015  0.881473   0.396549  0.23055    0.0553351
 0.645476   0.745046   0.127696  0.939494  0.624804   0.209243   0.390447   0.335097  0.0910745  0.258535 
 0.0884433  0.342941   0.163901  0.521877  0.876765   0.0489701  0.796682   0.365296  0.254204   0.568327 
 0.205803   0.540466   0.891773  0.704059  0.109656   0.91213    0.0746333  0.802257  0.220095   0.332555 
 0.0624195  0.0513494  0.863017  0.920406  0.443187   0.566314   0.57402    0.601021  0.0832811  0.355787 
 0.615837   0.780253   0.697182  0.587102  0.571548   0.698586   0.249986   0.714737  0.253774   0.876721 
 0.50435    0.790261   0.464834  0.90049   0.0769789  0.395214   0.262674   0.485041  0.645104   0.470171 
 0.976307   0.477756   0.576011  0.975836  0.664999   0.54624    0.28258    0.970559  0.654953   0.589304 
 0.102939   0.790315   0.141168  0.208602  0.990242   0.752532   0.0452516  0.687015  0.740944   0.4406   
 0.700986   0.795746   0.645764  0.979793  0.287919   0.398423   0.0822585  0.895504  0.446748   0.534195 

julia> A[Axis{:test2}(0.1)]
ERROR: BoundsError: attempt to access 10×10 Array{Float64,2} at index [Colon(),0.1]
 in throw_boundserror(::Array{Float64,2}, ::Tuple{Colon,Float64}) at ./abstractarray.jl:355
 in checkbounds at ./abstractarray.jl:284 [inlined]
 in _getindex at ./multidimensional.jl:270 [inlined]
 in getindex at ./abstractarray.jl:752 [inlined]
 in getindex at /Users/ajkeller/.julia/v0.5/AxisArrays/src/indexing.jl:52 [inlined]
 in getindex(::AxisArrays.AxisArray{Float64,2,Array{Float64,2},Tuple{AxisArrays.Axis{:test1,FloatRange{Float64}},AxisArrays.Axis{:test2,FloatRange{Float64}}}}, ::AxisArrays.Axis{:test2,Float64}) at /Users/ajkeller/.julia/v0.5/AxisArrays/src/indexing.jl:76

I soon realized of course that dimensional axes are not allowed to be indexed by Real elements, but it seems like the intended error message is not being delivered.

In my use case it would be really great to be able to index by a real value. (Let's say you're measuring a 2D sweep, where your axis values are probably coming from ranges with nonzero step size, and you want to take a slice at a particular value on that axis to make a plot.) Just a suggestion, but something like the following seems to work:

julia> immutable Value{T}
         val::T
       end

julia> atvalue(x) = Value(x)
atvalue (generic function with 1 method)

julia> AxisArrays.axisindexes(::Type{AxisArrays.Dimensional}, ax::AbstractVector, idx::Value) = findfirst(ax.==idx.val)

julia> A[Axis{:test1}(atvalue(0.2)), Axis{:test2}(atvalue(0.3))]
0.12769615238125231

As an aside, it seems like indexing-by-value just works out of the box with Unitful numbers, because Unitful.Quantity is not a subtype of Real. It'd be nice if there were an option to make indexing by value work with Real numbers though.

Deprecate cat?

As discussed in #25 and illustrated in #64, it's not clear whether concatenating arbitrary AxisArrays together makes sense. As far as I can tell merge and join provide the functionality that you would want from cat, provided your axes have meaningful names and values (and if they don't, why are you using an AxisArray?).

I'd be interested to know if:

  1. anyone is actually using cat
  2. anyone is actually using it for something merge/join can't do.

For some historical context, I implemented cat first and realised soon after that I needed something a bit smarter - that's when I added merge/join. Had I had a better understanding of my needs I likely wouldn't have added it to begin with. Now it just seems like a source of potential confusion and frustration: the same axis constraints that govern the other functions apply to cat as well, it just isn't smart enough to do the right thing when those constraints would be violated, and throws an error instead.

If cat is indeed redundant, it might make sense to either:

  • deprecate it then remove it outright: AxisArrays will have no cat method
  • repurpose it as a smart wrapper around merge or join, depending on the arguments (e.g. does the user supply a new axis for joining on)?

One issue with the second option is that cat implies a lower-level control over dimensional ordering that merge and join abstract away, so users may expect it to do things it won't. Of course, this is true of the current cat method as well.

Use value indexing by default

In the single use case I have for AxisArrays right now, it would be very convenient if the default indexing of an AxisArray used the values of the indices instead of the indices of the indices. E.g. if I do

x = AxisArray(Axis{:x}(-3:3))
x[[-2,1]]

it would be handy if it returned the second and fifth element of the AxisVector. You can always refer to the indices of the indices by using the Axis type. Could this work?

Reductions do not work anymore

Not sure what happened after #55 but reductions now do not work anymore.
Two examples

julia> Y = AxisArray(randn(4,4,4,4),:x,:y,:z,:time)
julia> maximum(parent(Y),2)
ERROR: BoundsError: attempt to access (:x,:y,:z,:time)
  at index [5]
 in getindex(::Tuple{Symbol,Symbol,Symbol,Symbol}, ::Int64) at ./tuple.jl:8
 in reaxis(...) at /home/knopp/.julia/v0.5/AxisArrays/src/indexing.jl:42
 in macro expansion at ./reducedim.jl:230 [inlined]
 in macro expansion at ./simdloop.jl:73 [inlined]
 in _mapreducedim!(::Base.#identity, ::Base.#scalarmax, ::Array{Float64,4}, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}) at ./reducedim.jl:229
 in mapreducedim!(::Function, ::Function, ::Array{Float64,4}, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}) at ./reducedim.jl:237
 in mapreducedim(::Function, ::Function, ::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:268
 in maximum(::AxisArrays.AxisArray{Float64,4,Array{Float64,4},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}},AxisArrays.Axis{:time,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:320

and

julia> Y = AxisArray(randn(4,4,4),:x,:y,:z)
julia> maximum(parent(Y),2)
ERROR: ArgumentError: ordering is not well-defined for arrays
 in macro expansion at ./reducedim.jl:230 [inlined]
 in macro expansion at ./simdloop.jl:73 [inlined]
 in _mapreducedim!(::Base.#identity, ::Base.#scalarmax, ::Array{Float64,3}, ::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}}}}) at ./reducedim.jl:229
 in mapreducedim! at ./reducedim.jl:237 [inlined]
 in mapreducedim at ./reducedim.jl:268 [inlined]
 in maximum at ./reducedim.jl:318 [inlined]
 in maximum(::AxisArrays.AxisArray{Float64,3,Array{Float64,3},Tuple{AxisArrays.Axis{:x,Base.OneTo{Int64}},AxisArrays.Axis{:y,Base.OneTo{Int64}},AxisArrays.Axis{:z,Base.OneTo{Int64}}}}, ::Int64) at ./reducedim.jl:320

Interestingly its two different errors. I don't know what the parent is supposed to do but it was proposed in #56 by @timholy. It seems that parent is the identity for AxisArrays

Support logical linear indexing

Currently:

julia> input = AxisArray(collect(reshape(1:60, 12, 5)), .1:.1:1.2, [:a, :b, :c, :d, :e])
2-dimensional AxisArray{Int64,2,...} with axes:
    :row, 0.1:0.1:1.2
    :col, Symbol[:a, :b, :c, :d, :e]
And data, a 12×5 Array{Int64,2}:
  1  13  25  37  49
  2  14  26  38  50
  3  15  27  39  51
  4  16  28  40  52
  5  17  29  41  53
  6  18  30  42  54
  7  19  31  43  55
  8  20  32  44  56
  9  21  33  45  57
 10  22  34  46  58
 11  23  35  47  59
 12  24  36  48  60

julia> input[find(isfinite.(input))]
1-dimensional AxisArray{Int64,1,...} with axes:
    :col, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  51, 52, 53, 54, 55, 56, 57, 58, 59, 60]
And data, a 60-element Array{Int64,1}:
  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
  ⋮
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60

julia> input[isfinite.(input)]
ERROR: BoundsError: attempt to access 12×5 Array{Int64,2} at index [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10  …  51, 52, 53, 54, 55, 56, 57, 58, 59, 60], Base.Slice(Base.OneTo(5))]
Stacktrace:
 [1] throw_boundserror(::Array{Int64,2}, ::Tuple{Array{Int64,1},Base.Slice{Base.OneTo{Int64}}}) at ./abstractarray.jl:433
 [2] checkbounds at ./abstractarray.jl:362 [inlined]
 [3] macro expansion at ./multidimensional.jl:441 [inlined]
 [4] _getindex at ./multidimensional.jl:438 [inlined]
 [5] getindex at ./abstractarray.jl:882 [inlined]
 [6] getindex at /Users/ericdavies/.julia/v0.6/AxisArrays/src/indexing.jl:88 [inlined]
 [7] getindex(::AxisArrays.AxisArray{Int64,2,Array{Int64,2},Tuple{AxisArrays.Axis{:row,StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}},AxisArrays.Axis{:col,Array{Symbol,1}}}}, ::BitArray{2}) at /Users/ericdavies/.julia/v0.6/AxisArrays/src/indexing.jl:99

IMO these should have identical results. I'll work on this unless there are any concerns.

Split Interval into separate package?

As you mention in JuliaLang/julia#12624, several people have implemented interval types in their packages. I know at least of @dcjones's IntervalTree package. I'm also looking at intervals for a package that I'm working on. Would you be interested in splitting out the interval type into a separate package? It seems like your implementation is useful pretty generally.

Implement squeeze for multiple axes

We can't restrict the type accepted to Tuple{Vararg{Type{<:Axis}}} because typeof((Axis{:time}, Axis{:label})) == Tuple{UnionAll, UnionAll} but we might be able to assume they are axis types when they're not Base.Dims.

This would be especially useful after JuliaLang/julia#23500

Axes are copied when taking a view

Having looked at the source code it is clear that when taking a view of an AxisArray, the indexed axes might be copied if _new_axes is called (https://github.com/JuliaArrays/AxisArrays.jl/blob/master/src/indexing.jl#L80). Is this the intended behaviour? It would make more sense to me for views to be taken of the original axes as well, to aid performance.

Example below (there is 16KB of allocation - i.e. the index copy)

using Dates
using AxisArrays
using BenchmarkTools

index = collect(DateTime(2018, 1, 1):Minute(1):DateTime(2018, 1, 10))
B = AxisArray(randn(Float64, (length(index), 10)), index)

@btime view(B, 1000:3000, :)
# 994.800 ns (6 allocations: 15.98 KiB)

If this should be fixed / changed, I'm happy to give a PR a go.

Update to Julia v0.7

Here is package test output from 0.7-beta (omitting lots of deprecations):

Test Summary: Pass Fail Error Total
AxisArrays 140 12 14 166
Core 1 1
Intervals 60 60
Indexing 19 5 24
SortedVector 3 7 10
CategoricalVector 4 4 3 11
Search 26 26
Combine 28 4 32
README 1 1

Also, are users now expected to module-qualify all calls of axes (since Base exports an incompatible function)?

(Sorry to ping you, @mbauman, but this repo doesn't seem to get much attention and the last question is important for the image-processing ecosystem.)

Combine AxisArrays with cat?

a=AxisArray(rand(3,3,3,10),:x,:y,:z,:time);
b=AxisArray(rand(3,3,3,20),:x,:y,:z,:time);
c=cat(4,a,b) does not work, am I missing something? Is there something like c=cat(Axis{:time},a,b)?

c=cat(4,a.data,b.data) works

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.