## What's this?

This is an outlier detection algorithm for categorical data. It calculates the frequency of occurrence of each attribute value of a data point within the entire dataset. Based on these frequencies, a score is assigned to each data point; data points with the lowest scores are designated as outliers.
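The scoring idea (known as Attribute Value Frequency, AVF) can be sketched in plain Scala on a small in-memory dataset; this is an illustrative sketch, not the project's Spark implementation:

```scala
object AvfSketch {
  // Each row is a sequence of categorical attribute values.
  // Returns one AVF score per row: the mean frequency of the row's values.
  def avfScores(data: Seq[Seq[String]]): Seq[Double] = {
    val m = data.head.length
    // Count how often each (column index, value) pair occurs in the dataset.
    val freq: Map[(Int, String), Int] =
      data.flatMap(_.zipWithIndex.map { case (v, i) => (i, v) })
          .groupBy(identity)
          .map { case (k, vs) => (k, vs.size) }
    // A row made of rare values gets a low score, marking it as an outlier.
    data.map { row =>
      row.zipWithIndex.map { case (v, i) => freq((i, v)).toDouble }.sum / m
    }
  }
}
```

For example, on `Seq(Seq("a","x"), Seq("a","x"), Seq("b","y"))` the third row scores lowest, since both of its values occur only once.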
## How to Run

You should already have Spark built as a jar file on your build library path. The project contains a Scala file with the class `OutlierWithAVFModel`.
From your main method, call the `outliers` function of this class with the following parameters:

```scala
val model = OutlierWithAVFModel.outliers(master: String, inputDir: String, percentage: Double)
```
The returned model has two attributes, `score` and `trimmed_data`:

- `model.score: RDD[(String, Int)]` — the hash-key representation of each data point and its AVF score.
- `model.trimmed_data: RDD[String]` — the dataset with the given percentage of lowest-scoring points (the outliers) removed.
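A hedged usage sketch is shown below; the `"local"` master, input path, and output path are placeholders, and running it requires a Spark build on the classpath:

```scala
// Hypothetical invocation: trim the lowest-scoring 5% of points.
val model = OutlierWithAVFModel.outliers("local", "data/categorical.csv", 5.0)

// Inspect the ten most anomalous points (lowest AVF scores).
model.score.takeOrdered(10)(Ordering.by(_._2)).foreach(println)

// Persist the dataset with the outliers removed.
model.trimmed_data.saveAsTextFile("data/trimmed")
```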