Hi,
I am thinking of utilising the SortMedian algorithm (looks great!), but would need it shoe-horned into a different interface. I can either do it myself, or I can do it in collaboration, if you are interested.
For me, my requirements are:
- C++, templated
- percentile-orientated rather than pure median (50%ile)
- ability to push and pop items on, rather than feed the whole vector
- able to use indexes rather than numbers (or numbers directly - use template to achieve this)
- return the percentile as an index
- return the percentile as a pair of weighted indexes
for the weighted result, in the case of the median,
its for the situation where the median is half way between two items. so each item would be weighted 50%.
For a different percentile, the weighting would be something else, eg 20% and 80%
(0.2 and 0.8) for example.
I imagine an interface like this:
template <typename T, class Compare>
class RollingPercentile {
Compare compare;
public:
RollingPercentile( double percentile, size_t estimated_window_size, Compare compare );
void push( T new_value );
void pop( T old_value );
T result() const;
struct Weighted {
T a_value; double a_weight;
T b_value; double b_weight;
};
Weighted weighted_result() const;
};
Compare could either be a lessthan function,
or a compare --> -1, 0,+1 kind of function,
depending on what is required for the internal algorithm.
So T can be (eg) an integer index, and compare can do the comparision as required,
meaning the result will be the index of the percentile item, so can be used to identify the actual item in question, rather than just the percentile value.
estimated_window_size helps for reserve()ing space for vectors, but window size may grow to be bigger.
Could instead be max_window_size in my particular case.
So, would anyone else have a need for such a class?
Would be cool if it could be added to boost as well.
Oh and I'd probably also try and port it to Javascript.
cheers,
Paul