
autoquant's Introduction

autoquant's People

Contributors

adrianantico, ammubharatram, dougvegas, justinsavage49, solomondaner

autoquant's Issues

Invalid uid value and malformed maintainer field upon install dependency Catboost

Hi @AdrianAntico

I'm trying to install the latest version of the RemixAutoML package but ran into the following errors.
I'm still on R 3.6, if that helps. Does your package require a certain version of R?

remotes::install_github('catboost/catboost', subdir = 'catboost/R-package')
Downloading GitHub repo catboost/catboost@master
✓  checking for file ‘/tmp/Rtmpse3O8p/remotesca3839e02421/catboost-catboost-4aed7fe/catboost/R-package/DESCRIPTION’ (496ms)
─  preparing ‘catboost’:
E  checking DESCRIPTION meta-information ...
   Malformed maintainer field.
   See section 'The DESCRIPTION file' in the 'Writing R Extensions' manual.
Error: Failed to install 'catboost' from GitHub:
  System command 'R' failed, exit status: 1, stdout + stderr:
E> * checking for file ‘/tmp/Rtmpse3O8p/remotesca3839e02421/catboost-catboost-4aed7fe/catboost/R-package/DESCRIPTION’ ... OK
E> * preparing ‘catboost’:
E> * checking DESCRIPTION meta-information ... ERROR
E> Malformed maintainer field.
E>
E> See section 'The DESCRIPTION file' in the 'Writing R Extensions'
E> manual.
E>

remotes::install_github('AdrianAntico/RemixAutoML', upgrade = FALSE, dependencies = FALSE, force = TRUE) 
Downloading GitHub repo AdrianAntico/RemixAutoML@master
✓  checking for file ‘/tmp/Rtmpse3O8p/remotesca387ebc3f88/AdrianAntico-RemixAutoML-c7e8030/DESCRIPTION’ (493ms)
─  preparing ‘RemixAutoML’:
✓  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘RemixAutoML_0.2.3.tar.gz’
Warning: invalid uid value replaced by that for user 'nobody'
Installing package into ‘/datascience/R/x86_64-redhat-linux-gnu-library/3.6’
(as ‘lib’ is unspecified)
ERROR: dependency ‘catboost’ is not available for package ‘RemixAutoML’
* removing ‘/datascience/R/x86_64-redhat-linux-gnu-library/3.6/RemixAutoML’
Error: Failed to install 'RemixAutoML' from GitHub:
  (converted from warning) installation of package ‘/tmp/Rtmpse3O8p/fileca3870f854a8/RemixAutoML_0.2.3.tar.gz’ had non-zero exit status

Suggested Changes to AutoTS

Runtime Using Walmart Data Set

user system elapsed
20.23 1.44 21.76

Output Plots

The output plot is the base plot from the forecast package; RemixTheme needs to be added to it.
The x-axis should display dates, not decimals.

Title of chart should be:
paste(FCPeriods, TimeUnit, "forecast for", TargetName, sep = " ")

Subtitle of chart should be:
paste("Champion Model:", ChampionModel, "Mean Absolute Percent Error:", paste(round(min(EvaluationMetrics$MAPE),2) * 100, "%", sep = ""), sep = " ")

Caption of chart should be: "Forecast generated by Remix Institute's RemixAutoML R package"

Color of Line should be: #00AA9D

Remix Theme code:

remix_theme1 = function(){
  theme(
    axis.title = element_text(family = "Helvetica", size = 11),
    axis.text = element_text(family = "Helvetica", size = 11),
    legend.background = element_blank(),
    legend.key = element_blank(),
    legend.text = element_text(family = "Helvetica", color = "#1c1c1c", size = 11),
    legend.title = element_blank(),
    legend.justification = 0,
    legend.position = "top",
    #plot.background = element_rect(fill = "#d1d1d1"),
    #panel.background = element_rect(fill= "#d1d1d1"),
    plot.background = element_rect(fill = "#E7E7E7"),
    panel.background = element_rect(fill= "#E7E7E7"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.y = element_line(color = "white"),
    panel.grid.minor.y = element_line(color = "white"),
    plot.title = element_text(family = "Helvetica", color = "#1c1c1c", size = 28, hjust = 0, face = "bold"),
    plot.subtitle = element_text(family = "Helvetica", color = "#1c1c1c", size = 16, hjust = 0),
    plot.caption = element_text(family = "Helvetica", size = 9, hjust = 0, face = "italic")
  )
}
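To make the requested labels concrete, here is a minimal base R sketch that builds the title, subtitle, and caption strings from the expressions above. The input values and the ModelName column are hypothetical stand-ins; in AutoTS they would come from the function arguments and the returned EvaluationMetrics table.

```r
# Hypothetical example values (assumptions for this sketch only).
FCPeriods <- 18
TimeUnit <- "month"
TargetName <- "Monthly_Sales"
ChampionModel <- "ARIMA"
EvaluationMetrics <- data.frame(
  ModelName = c("ARIMA", "ETS"),
  MAPE = c(0.1234, 0.2051)
)

# Title and subtitle built exactly as proposed in the issue.
plot_title <- paste(FCPeriods, TimeUnit, "forecast for", TargetName, sep = " ")
plot_subtitle <- paste(
  "Champion Model:", ChampionModel,
  "Mean Absolute Percent Error:",
  paste(round(min(EvaluationMetrics$MAPE), 2) * 100, "%", sep = ""),
  sep = " "
)
plot_caption <- "Forecast generated by Remix Institute's RemixAutoML R package"
line_color <- "#00AA9D"

print(plot_title)
print(plot_subtitle)
```

With ggplot2 loaded, these strings could be passed to labs(title = plot_title, subtitle = plot_subtitle, caption = plot_caption) and combined with remix_theme1().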

Issue 1

EvaluationMetrics only includes metrics for the following models:

  • ARIMA
  • NN
  • ARFIMA
  • TBATS
  • ETS

No evaluation metrics were output for TSLM or PROPHET, even though SkipModels is set to NULL.

Feature Request 1

TimeSeriesModel output says:
Series: dataTSTrain[ ,TargetName]

Is there any way to make this dynamic, so that it shows the actual data set and TargetName the user supplied?
For example, using the Walmart data set, can it say:
Series: walmart_train[, Weekly_Sales]

Feature Request 2

Can we add another element to the output list called "ChampionModel", which is just a character string naming the winning model?
(i.e., ARIMA, NN, ARFIMA, TBATS, ETS, TSLM, PROPHET)
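As a rough illustration of this request, the sketch below picks the model with the lowest MAPE from an evaluation table and stores its name in the output list. The table contents and the column names ModelName and MAPE are assumptions made for the example, not the package's confirmed structure.

```r
# Hypothetical evaluation table (values invented for illustration).
EvaluationMetrics <- data.frame(
  ModelName = c("ARIMA", "NN", "ARFIMA", "TBATS", "ETS"),
  MAPE = c(0.08, 0.11, 0.10, 0.09, 0.12),
  stringsAsFactors = FALSE
)

# The champion is the model with the smallest MAPE.
ChampionModel <- EvaluationMetrics$ModelName[which.min(EvaluationMetrics$MAPE)]

# Return it alongside the existing outputs as a plain character string.
output <- list(
  EvaluationMetrics = EvaluationMetrics,
  ChampionModel = ChampionModel
)
print(output$ChampionModel)
```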

AutoCatBoostCARMA problems with t + 2 predictions

Hi again,

I tried your AutoCatBoostCARMA function. It seems there is something wrong with the t + 2 and later predictions. Here is a sample of my data:

structure(list(index = structure(c(13880, 13881, 13882, 13885, 
13886, 13887, 13888, 13889, 13892, 13893, 13894, 13895, 13896, 
13899, 13900, 13901, 13902, 13903, 13906, 13907), class = "Date"), 
    zadnja = c(351.75, 347, 348, 342, 339, 339.86, 342.61, 345, 
    340, 336.11, 331, 333.94, 330.01, 317, 313, 313.98, 315, 
    319.45, 313, 316)), row.names = c(NA, -20L), ticker = "HT", index_quo = ~index, index_time_zone = "UTC", class = c("tbl_time", 
"tbl_df", "tbl", "data.frame"))

And here is your function:

AutoCatBoostCARMA_forecast <-  RemixAutoML::AutoCatBoostCARMA(
  data = sample,
  TargetColumnName = "zadnja",
  DateColumnName = "index",
  FC_Periods = 5,
  TimeUnit = "day",
  TargetTransformation = TRUE,
  Lags = c(1:5)
)
AutoCatBoostCARMA_forecast$Forecast

Results are:

           index Predictions
   1: 2008-01-02          NA
   2: 2008-01-03          NA
   3: 2008-01-04          NA
   4: 2008-01-07          NA
   5: 2008-01-08          NA
  ---                       
2836: 2019-07-05    159.5785
2837: 2019-07-06          NA
2838: 2019-07-06     -1.0000
2839: 2019-07-07          NA
2840: 2019-07-07     -1.0000

For t + 2 and onward, the results are NA and -1.

The same thing happens on a bigger sample.
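One thing that may be worth checking, although it is not established here as the cause of the NA and -1 rows: the sample dates skip weekends, while TimeUnit = "day" implies a calendar-day step. The base R sketch below lists the gaps in the first ten dates of the sample.

```r
# First ten dates from the sample above, stored as days since 1970-01-01.
dates <- as.Date(
  c(13880, 13881, 13882, 13885, 13886, 13887, 13888, 13889, 13892, 13893),
  origin = "1970-01-01"
)

# Differences between consecutive dates, in days.
step_days <- as.numeric(diff(dates))

# Rows where more than one calendar day passes between observations.
is_gap <- step_days > 1
gap_table <- data.frame(
  from = dates[-length(dates)][is_gap],
  to = dates[-1][is_gap],
  days = step_days[is_gap]
)
print(gap_table)
```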

P.S. I would like to add LSTM time series prediction code to your arsenal. Would you agree to that, and do you have a process for incorporating new models into your code?

Prediction interval with XGBoost

Currently, I think AutoH2OModeler does not have an option for quantile regression. H2O so far offers quantile regression for GBM only, and the option is not available for XGBoost. Do you plan to add prediction intervals for XGBoost with H2O? Or is there another way we can do it?

I tried to find the lower and upper interval using the function from this article (https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b).
I then attempted to add an interval to the predictions of H2O's XGBoost. However, with that quantile XGBoost function the intervals are flat, and the predictions fall outside the range between the lower and upper bounds. I would appreciate your suggestions.
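For reference, quantile regression minimizes the pinball (quantile) loss, and an interval can be sanity-checked by its empirical coverage. The base R sketch below shows both; the function names and the example numbers are hypothetical and are not part of RemixAutoML, H2O, or XGBoost.

```r
# Pinball (quantile) loss for quantile level q in (0, 1).
pinball_loss <- function(actual, predicted, q) {
  err <- actual - predicted
  mean(pmax(q * err, (q - 1) * err))
}

# Share of actual values that fall inside [lower, upper].
interval_coverage <- function(actual, lower, upper) {
  mean(actual >= lower & actual <= upper)
}

# Invented example values.
actual <- c(10, 12, 9, 15)
lower <- c(8, 9, 8, 11)
upper <- c(11, 13, 12, 14)

print(pinball_loss(actual, upper, q = 0.95))
print(interval_coverage(actual, lower, upper))
```

A nominal 90% interval whose empirical coverage is far from 0.9 on held-out data would suggest that the lower and upper models are not fitting the intended quantiles.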

Error in paste0("Calibration Evaluation Plot: ", toupper(eval_metric), : object 'BaseModelEval' not found

Hi, I am trying to compare AutoTS(), AutoCatBoostCARMA(), AutoXGBoostCARMA(), AutoH2oDRFCARMA(), and AutoH2oGBMCARMA() on a single time series.
AutoTS(), AutoCatBoostCARMA(), and AutoXGBoostCARMA() work perfectly fine, but when I tried to run AutoH2oDRFCARMA and AutoH2oGBMCARMA, the following error message was shown:

Error in paste0("Calibration Evaluation Plot: ", toupper(eval_metric), : object 'BaseModelEval' not found

My code is as follows:

result=AutoH2oGBMCARMA(
x[,1:2],
TargetColumnName = "POSITIVE_DEMAND",
DateColumnName = "FULL_DATE",
FC_Periods = 2,
TimeUnit = "month",
TargetTransformation = TRUE,
Lags =12,
MA_Periods = 3,
CalendarVariables = TRUE,
HolidayVariable = TRUE,
TimeTrendVariable = TRUE,
DataTruncate = FALSE,
#SplitRatios = c(1 - (30+z)/nrow(x), 30/nrow(x), z/nrow(x)),
EvalMetric = "MAPE",
GridTune = FALSE,
ModelCount = 1,
NTrees = 2000,
PartitionType = "timeseries",
MaxMem = "28G",
NThreads = 8,
Timer = TRUE)

Can AutoH2oDRFCARMA and AutoH2oGBMCARMA be applied to a single time series? If those two methods can be applied to a single time series, what is wrong with my code? I hope you can help, thanks.

AutoXGBoostCARMA fails

I am using the AutoXGBoostCARMA to forecast a time series. Yet when I do I get the following failure:

Error in if (rng.nch[1] != rng.nch[2]) stop("'charvec' has non-NA entries of different number of characters") : 
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In `[.data.table`(TestDataEval, , `:=`(Target, NULL)) :
  Column 'Target' does not exist to remove
2: In `[.data.table`(Preds, , `:=`(eval(DateColumnName), NULL)) :
  Column 'Date_Column' does not exist to remove

I run the following command:

AutoXGBoostCARMA(
  data = tidy_model_tbl
  , TargetColumnName = "Value"
  , DateColumnName = "Date_Column"
  , GroupVariables = "data_type"
  , FC_Periods = ifelse(time_param == "weekly", 52, 12)
  , TimeUnit = ifelse(time_param == "weekly", "week", "month")
)

My data is attached.
tidy_model_tbl.xlsx

Certain functions change working directory and do not change it back

In 3 places in the code base, there is the following code:

if (SaveModelObjects) {    
    setwd(model_path)  
    catboost::catboost.save_model(model = model, model_path = paste0(ModelID))  
}

The problem with this code is that the working directory is changed, and not changed back. This is of particular concern because if "model_path" is a relative directory, this code can only be run once before it starts failing. E.g. if my current working directory is "C:/", and model_path = "some_folder", after this code is run once the current working directory is "C:/some_folder". The next time this code runs, it will try to set the working directory to a non-existent folder, "C:/some_folder/some_folder".

I think there are two ways to fix this:

if (SaveModelObjects) {
    oldwd <- getwd()
    setwd(model_path)
    catboost::catboost.save_model(model = model, model_path = paste0(ModelID))
    setwd(oldwd)
}

Or

if (SaveModelObjects) {
    model_path = paste0(model_path, '/', ModelID)
    catboost::catboost.save_model(model = model, model_path = model_path)
}

If this works correctly, I think the second approach has less chance of side effects.
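Both options can be sketched in base R as follows. Here saveRDS() stands in for catboost::catboost.save_model() so the example runs without catboost, and the helper names are hypothetical. The setwd() variant uses on.exit(), so the working directory is restored even if saving fails.

```r
# Option 1: build the full path and never touch the working directory.
save_with_full_path <- function(object, model_path, model_id) {
  target <- file.path(model_path, model_id)
  saveRDS(object, target)  # stand-in for catboost::catboost.save_model()
  target
}

# Option 2: change directory, but always restore it on exit.
save_with_setwd <- function(object, model_path, model_id) {
  oldwd <- setwd(model_path)         # setwd() returns the previous directory
  on.exit(setwd(oldwd), add = TRUE)  # runs even if saveRDS() throws an error
  saveRDS(object, model_id)
  invisible(NULL)
}

# Demonstration in a temporary folder.
tmp <- tempfile("models_")
dir.create(tmp)
before <- getwd()
path_a <- save_with_full_path(list(a = 1), tmp, "model_a.rds")
save_with_setwd(list(b = 2), tmp, "model_b.rds")
after <- getwd()
print(identical(before, after))
```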

Microsoft R Open: Error in utils::download.file(url, path, method = download_method(), quiet = quiet, : cannot open URL

Downloading GitHub repo AdrianAntico/RemixAutoML@master Error in utils::download.file(url, path, method = download_method(), quiet = quiet, : cannot open URL 'https://api.github.com/repos/AdrianAntico/RemixAutoML/tarball/master'

I want to try Microsoft R Open (MRAN) as an alternative to speed up R, but I run into this installation issue when I use Microsoft R.

Is this a known issue with MRAN?

Error in read.dcf(path) : Found continuation line starting ' c(person(given = ...' at begin of record.

Error
The command
devtools::install_github('AdrianAntico/RemixAutoML', upgrade = FALSE, dependencies = FALSE, force = TRUE)
throws the error:
Error in read.dcf(path) : Found continuation line starting ' c(person(given = ...' at begin of record.

Similar Issues
I think it might be due to an issue with the DESCRIPTION file, as was the case here.

Sys Info:
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

devtools 2.0.2


Thank you!

Installation Issues

I'm getting the following error when I try to run the install:

looking for catboost
Loading required namespace: catboost
Failed with error: ‘there is no package called ‘catboost’’
Downloading GitHub repo catboost/catboost@master
Error: Failed to install 'catboost' from GitHub:
cannot open the connection
In addition: Warning message:
In file(name, "wb") :
cannot open file 'catboost-catboost-adab52f/catboost/pytest/canondata/test.test_ctr_target_quantization_boosting_type=Plain-border_count=1-border_type=UniformAndQuantiles_/': Invalid argument

Any ideas on how to fix this? I'm running R 3.6 which may be the issue?

Using best model for future predictions

Hi @AdrianAntico,

I have just tried the AutoBanditSarima function on hourly data. Everything works fine. This is my best model:

      DataSetName BoxCox IncludeDrift SeasonalDifferences SeasonalMovingAverages SeasonalLags MaxFourierTerms Differences MovingAverages Lags BiasAdj
1: ModelFrequency   skip        FALSE                   0                      0            1               3           1              4    0   FALSE
                    GridName Train_MSE Train_MAE  Train_MAPE Validate_MSE Validate_MAE Validate_MAPE Blended_MSE Blended_MAE Blended_MAPE
1: StratifyParsimonousGrid_4 0.4708038 0.2543165 0.002108957    0.3804573    0.4921519   0.002360596   0.4256306   0.3732342  0.002234776
   BanditProbs_ParsimonousGrid BanditProbs_RandomGrid BanditProbs_StratifyParsimonousGrid_1 BanditProbs_StratifyParsimonousGrid_2
1:                        0.08                   0.01                                  0.08                                  0.15
   BanditProbs_StratifyParsimonousGrid_3 BanditProbs_StratifyParsimonousGrid_4 BanditProbs_StratifyParsimonousGrid_5 BanditProbs_StratifyParsimonousGrid_6
1:                                  0.15                                  0.08                                  0.08                                  0.08
   BanditProbs_StratifyParsimonousGrid_7 BanditProbs_StratifyParsimonousGrid_8 BanditProbs_StratifyParsimonousGrid_9 BanditProbs_StratifyParsimonousGrid_10
1:                                  0.08                                  0.08                                  0.08                                   0.08
         RunTime ModelRankByDataType ModelRank ModelRunNumber
1: 2.083744 mins

The question is: how can I use this model for future predictions? I have the parameters here, but which package does the main function come from?
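As a rough sketch only: assuming the non-seasonal columns Lags, Differences, and MovingAverages correspond to the ARIMA orders (p, d, q), a comparable model can be refit with base R's stats::arima(). This is not necessarily the function AutoBanditSarima uses internally, the series below is synthetic, and the seasonal lag and Fourier terms from the table are left out for brevity.

```r
set.seed(42)

# Synthetic stand-in for an hourly series (assumption: 24 observations per day).
y <- ts(cumsum(rnorm(240)), frequency = 24)

# Lags = 0, Differences = 1, MovingAverages = 4 read as order (p, d, q).
fit <- stats::arima(y, order = c(0, 1, 4))

# Forecast the next 24 hours.
fc <- predict(fit, n.ahead = 24)
print(head(fc$pred))
```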

Catboost carma

Hello Adrian,

Is there a particular reason why you are encoding categorical variables in catboost carma?

Best regards

Res

threshOptim

`...

Plot of results

Plot <- ggplot2::ggplot(results, ggplot2::aes(x = Thresholds, y = Utilities)) +
ggplot2::geom_line(color = "blue") +
RemixAutoAI::ChartTheme(AngleX = 0) +
ggplot2::ggtitle(paste0("Threshold Optimization: best cutoff at ",thresh)) +
ggplot2::geom_vline(xintercept = thresh, linetype="dotted", color = "red", size=1.5)
return(list(Thresholds = thresh, EvaluationTable = results, Plot = Plot))
}`
RemixAutoAI::ChartTheme should be RemixAutoML::ChartTheme.

No Formats Found

Hi Adrian,
When I run AutoCatboostCarma, I receive this error:
Error in if (min(as.ITime(data[[eval(DateCols[i])]])) - max(as.ITime(data[[eval(DateCols[i])]])) == :
missing value where TRUE/FALSE needed
In addition: Warning message:
All formats failed to parse. No formats found.
Can you help me to identify this error?
Thank you!
Quoc
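The warning "All formats failed to parse" suggests the date column could not be read as dates. A minimal base R sketch for finding the offending values is shown below; the helper name and the list of candidate formats are assumptions for illustration.

```r
# Try several formats in turn and report the values that none of them can parse.
find_unparsed_dates <- function(x, formats = c("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")) {
  x <- as.character(x)
  parsed <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(parsed)
    if (!any(todo)) break
    parsed[todo] <- as.Date(x[todo], format = fmt)
  }
  list(parsed = parsed, unparsed = x[is.na(parsed)])
}

# Invented example values mixing three formats with two bad entries.
res <- find_unparsed_dates(c("2012-10-26", "10/26/2012", "26.10.2012", "not a date", ""))
print(res$parsed)
print(res$unparsed)
```

Converting the column to Date with an explicit format before calling the modeling function removes the need for any format guessing.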

Suggested Changes to R Documentation for Easier UX

  1. Can we change "is the source time series data.table" to "is the source time series data as a data.table (use package data.table to convert data.frame to data.table)"?

  2. In the Lags argument, can we change "same with moving averages" to "same as moving average lags"?

  3. In the SLags argument, can we change "same with moving averages" to "same as moving average lags"?

How to change the caption on the ggplot?

Hi;
I found your post on Rbloggers and also on your website. I was able to install all of the packages although catboost was a challenge. It is a great demo and going through it right now. I have a naive question though. I looked at the str of the model and see all the captions you are using for the plot. How do I change or customize the captions you have on the title area and by the X axis area? I want to customize this plot to my own data. Thanks beforehand.

Error in as.POSIXlt.numeric(x, tz = tz(x)) : 'origin' must be supplied

Data:

data <- structure(list(date = structure(c(885394380, 885394440, 885394500, 
                                          885394560, 885394620, 885394680, 885394740, 885394800, 885394860, 
                                          885394920, 885394980, 885395040, 885395100, 885395220, 885395280, 
                                          885395400, 885395520, 885395640, 885395700, 885395760, 885395820, 
                                          885398400, 885457980, 885458040, 885458100, 885458160, 885458220, 
                                          885458280, 885458340, 885458400), class = c("POSIXct", "POSIXt"
                                          ), tzone = ""), close = c(96.96875, 96.875, 96.9375, 97.03125, 
                                                                    96.9375, 97, 97.15625, 97.0625, 97.15625, 97.0625, 97.1875, 97.09375, 
                                                                    97.125, 97.125, 97, 97.0625, 97.03125, 97, 96.9375, 96.9375, 
                                                                    97, 96.9375, 96.15625, 96.15625, 96.25, 96.15625, 96.15625, 96.1875, 
                                                                    96.25, 96.40625)), row.names = c(NA, 30L), class = "data.frame")

If I try the AutoTS function:

output <- AutoTS(
  data = data,
  TargetName = 'close',
  DateName = 'date',
  FCPeriods = 14,
  HoldOutPeriods = 1,
  EvaluationMetric = 'MAPE',
  TimeUnit = '1Min',
  Lags = 10,
  NumCores = 16
)

I get the error:
Error in as.POSIXlt.numeric(x, tz = tz(x)) : 'origin' must be supplied

I tried to set the origin, but it doesn't help:
data$date <- as.POSIXct(data$date, format='%Y-%m-%d %H:%M:%S', origin='1970-01-01 00:00.00 UTC', tzone='GMT')
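For context, base R uses the origin argument when converting plain numbers to date-times, and the time zone argument of as.POSIXct() is named tz rather than tzone. Because data$date in the sample is already POSIXct, the call above likely returns it unchanged. The sketch below shows the numeric conversion on the first three values from the sample; whether AutoTS converts dates this way internally is not confirmed here.

```r
# First three timestamps from the sample, as seconds since 1970-01-01 UTC.
secs <- c(885394380, 885394440, 885394500)

# Numeric input: supply origin and tz explicitly.
ts_utc <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
print(format(ts_utc, "%Y-%m-%d %H:%M:%S", tz = "UTC"))

# Input that is already POSIXct keeps its class when passed through as.POSIXct().
same <- as.POSIXct(ts_utc)
print(identical(as.numeric(same), secs))
```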

Error [object 'TransformationResults' not found] when try code in Article

When I try to reproduce the code from the article https://www.remixinstitute.com/blog/automated-demand-forecasts-using-autocatboostcarma-in-r/#.XRNZaExuLvU
I get an error:

> Results <- RemixAutoML::AutoCatBoostCARMA(
+   data,
+   TargetColumnName = "Weekly_Sales",
+   DateColumnName = "Date",
+   GroupVariables = c("Store","Dept"),
+   FC_Periods = 52,
+   TimeUnit = "week",
+   TargetTransformation = TRUE,
+   Lags = c(1:25, 51, 52, 53),
+   MA_Periods = c(1:25, 51, 52, 53),
+   CalendarVariables = TRUE,
+   TimeTrendVariable = TRUE,
+   DataTruncate = FALSE,
+   SplitRatios = c(1 - 2*30/143, 30/143, 30/143),
+   TaskType = "GPU",
+   EvalMetric = "MAE",
+   GridTune = FALSE,
+   GridEvalMetric = "mae",
+   ModelCount = 1,
+   NTrees = 200,
+   PartitionType = "timeseries",
+   Timer = TRUE)
...
bestTest = 104.8901737
bestIteration = 199
Shrink model to first 200 iterations.
Error in AutoCatBoostRegression(data = train, ValidationData = valid,  : 
  object 'TransformationResults' not found

exampl

Error in AutoXGBoostClassifier(data, ValidationData = NULL, TestData = NULL, : object 'CatFeatures' not found
What is it?

Can data argument be multivariate?

I have just tried your package. I am not sure whether the data argument in AutoTS must be a univariate time series or whether it can contain multiple variables.

I tried with more than one variable, but the final graph showed two time series instead of one (the target variable).

EDIT: One more issue

I have following data:

data <- structure(list(zadnja = c(421, 425, 432, 415, 414, 409.99, 407, 
415, 424.99, 432, 425, 433, 428, 428.99, 425, 425, 420, 420, 
420, 419.98, 415, 410, 407, 407.5, 399.98, 400.05, 380, 400, 
394.99, 389.98, 395.05, 381.5, 385, 395.9, 383, 376, 390, 385.01, 
385, 379, 375.1, 380, 378.99, 368.99, 355.75, 367.97, 370, 376, 
386.98, 392), index = structure(c(13917, 13920, 13921, 13922, 
13923, 13924, 13927, 13928, 13929, 13930, 13931, 13934, 13935, 
13936, 13937, 13938, 13941, 13942, 13943, 13944, 13945, 13948, 
13949, 13950, 13951, 13952, 13955, 13956, 13957, 13958, 13963, 
13964, 13965, 13966, 13969, 13970, 13971, 13972, 13973, 13976, 
13977, 13978, 13979, 13980, 13983, 13984, 13985, 13986, 13987, 
13990), class = "Date")), row.names = c(NA, -50L), index_quo = ~index, index_time_zone = "UTC", class = c("tbl_time", 
"tbl_df", "tbl", "data.frame"))

When I tried to estimate model using AutoTS:

stock_forecast = RemixAutoML::AutoTS(
  data = data,
  TargetName = "zadnja",
  DateName = "index",
  FCPeriods = 7,
  HoldOutPeriods = 5,
  TimeUnit = "day"
)

I got an error:

 Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'y' 

P.S. Why do you keep all the R code in one file?

AutoXGBoostClassifier sometimes rejects "f" metric

I am running the following:

xgboost_results <- AutoXGBoostClassifier(data = cbind(train_x_data, act = train_y_data[, 1]), ValidationData = cbind(val_x_data, act = val_y_data[, 1])[1:(ceiling(nrow(val_x_data) / 2)), ], TestData = cbind(val_x_data, act = val_y_data[, 1])[(ceiling(nrow(val_x_data) / 2) + 1):nrow(val_x_data), ], TargetColumnName = "act", FeatureColNames = seq(1, ncol(train_x_data)), Trees = 25, GridTune = TRUE, MaxModelsInGrid = 15, grid_eval_metric = "f", TreeMethod = "hist", ModelID = "xgboost_rev6", NThreads = 256)

and get this error:
Error in metric %chin% c("auc", "tpr", "tnr", "prbe", "f", "odds") :
object 'metric' not found

Changing the metric to "auc" it runs. However, since "f" is in the allowed list, why is this occurring?

Also, I'm "sure" I ran it with "f" before, but now I cannot reproduce how.

AutoTS creates daily forecasts even though TimeUnit is set to "month"

Below is R code using the Walmart store sales data set where AutoTS is creating daily forecasts even though TimeUnit="month"

# read in walmart data
walmart_data = data.table::fread("https://remixinstitute.box.com/shared/static/9kzyttje3kd7l41y1e14to0akwl9vuje.csv")

# add month
walmart_data$Month = lubridate::month(walmart_data$Date)
walmart_data$Year = lubridate::year(walmart_data$Date)
walmart_data$MonthAsDate = as.Date(
  paste(
    walmart_data$Year,
    ifelse(nchar(walmart_data$Month) == 1, paste("0", walmart_data$Month, sep = ""), walmart_data$Month),
    "01",
    sep = "-"
  )
)


# sum up sales by month
sales_by_month = walmart_data %>% dplyr::group_by(., MonthAsDate) %>%
  dplyr::summarize(., Monthly_Sales = sum(Weekly_Sales, na.rm = TRUE))


# forecast 18 months
Data_forecast = RemixAutoML::AutoTS(
data = sales_by_month,
TargetName = "Monthly_Sales",
DateName = "MonthAsDate",
FCPeriods = 18,
HoldOutPeriods = 12,
TimeUnit = "month")
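For comparison, this is what a monthly forecast horizon looks like in base R: the month-start date can be derived with format(), and seq() with by = "month" steps one calendar month at a time. The input dates are invented stand-ins for the Walmart Date column, and this is only a sketch of the expected output dates, not of AutoTS internals.

```r
# Invented stand-ins for weekly dates in the data.
dates <- as.Date(c("2012-08-03", "2012-09-14", "2012-10-26"))

# First day of each month, without padding the month number by hand.
month_as_date <- as.Date(format(dates, "%Y-%m-01"))

# Eighteen month-start dates following the last observed month.
last_month <- max(month_as_date)
future_months <- seq(last_month, by = "month", length.out = 18 + 1)[-1]
print(future_months)
```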

shared object ‘RemixAutoML.so’ not found

Hello,

I am probably doing something wrong, but when trying to install RemixAutoML following these instructions, the installation fails with:

Error: package or namespace load failed for ‘RemixAutoML’ in library.dynam(lib, package, package.lib):
shared object ‘RemixAutoML.so’ not found
Error: loading failed
Execution halted
ERROR: loading failed

This might be related to #32, although it is slightly different.

Here is the detailed information:

R> library(devtools)
Loading required package: usethis
R> to_install <- c("arules","catboost","caTools","data.table","doParallel","xgboost",
+   "foreach","forecast","fpp","ggplot2","gridExtra","h2o","itertools","lubridate",
+   "magick","Matrix", "MLmetrics","monreg","nortest","RColorBrewer","recommenderlab","ROCR","zoo",
+   "pROC","scatterplot3d","stringr","sde","timeDate","tm","tsoutliers","wordcloud","Rcpp")
R> for (i in to_install) {
+   message(paste("looking for ", i))
+   if(i == "catboost" & !requireNamespace(i)) {
+     # CURRENT VERSIONS ARE FAILING WITH MultiClass: devtools::install_github('catboost/catboost', subdir = 'catboost/R-package')
+     # Use the below instead as it is the latest release that doesn't fail
+     remotes::install_url('https://github.com/catboost/catboost/releases/download/v0.17.5/catboost-R-Windows-0.17.5.tgz', build_opts = c("--no-multiarch"))
+   } else if(i == "h2o" & !requireNamespace(i)) {
+     if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
+     if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }
+     pkgs <- c("RCurl","jsonlite")
+     for (pkg in pkgs) {
+       if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
+     }
+     install.packages("h2o")
+   } else if (!requireNamespace(i)) {
+     message(paste("     installing", i))
+     install.packages(i)
+   }
+ }
looking for  arules
Loading required namespace: arules
looking for  catboost
Loading required namespace: catboost
looking for  caTools
Loading required namespace: caTools
looking for  data.table
Loading required namespace: data.table
looking for  doParallel
Loading required namespace: doParallel
looking for  xgboost
Loading required namespace: xgboost
looking for  foreach
looking for  forecast
Loading required namespace: forecast
Registered S3 method overwritten by 'xts':
  method     from
  as.zoo.xts zoo 
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 
Registered S3 methods overwritten by 'forecast':
  method             from    
  fitted.fracdiff    fracdiff
  residuals.fracdiff fracdiff
looking for  fpp
Loading required namespace: fpp
looking for  ggplot2
looking for  gridExtra
Loading required namespace: gridExtra
looking for  h2o
Loading required namespace: h2o
looking for  itertools
Loading required namespace: itertools
looking for  lubridate
Loading required namespace: lubridate
looking for  magick
Loading required namespace: magick
looking for  Matrix
looking for  MLmetrics
Loading required namespace: MLmetrics
looking for  monreg
Loading required namespace: monreg
looking for  nortest
Loading required namespace: nortest
looking for  RColorBrewer
Loading required namespace: RColorBrewer
looking for  recommenderlab
Loading required namespace: recommenderlab
Registered S3 methods overwritten by 'registry':
  method               from 
  print.registry_field proxy
  print.registry_entry proxy
looking for  ROCR
Loading required namespace: ROCR
looking for  zoo
looking for  pROC
Loading required namespace: pROC
looking for  scatterplot3d
Loading required namespace: scatterplot3d
looking for  stringr
looking for  sde
Loading required namespace: sde
looking for  timeDate
looking for  tm
Loading required namespace: tm
looking for  tsoutliers
Loading required namespace: tsoutliers
looking for  wordcloud
Loading required namespace: wordcloud
looking for  Rcpp
R> # Install RemixAutoML:
R> devtools::install_github('AdrianAntico/RemixAutoML', upgrade = FALSE, dependencies = FALSE, force = TRUE)
Downloading GitHub repo AdrianAntico/RemixAutoML@master
checking for file ‘/tmp/RtmpcPITXk/remotes1c3047cb91eb/AdrianAntico-RemixAutoML-9b3d3e0/DESCRIPTION’ ...
preparing ‘RemixAutoML’: (404ms)
✔  checking DESCRIPTION meta-information
checking for LF line-endings in source and make files and shell scripts
checking for empty or unneeded directories
building ‘RemixAutoML_0.11.0.tar.gz’
Installing package into ‘/home/oettli/R/library’
(as ‘lib’ is unspecified)
* installing *source* package ‘RemixAutoML’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘RemixAutoML’ in library.dynam(lib, package, package.lib):
 shared object ‘RemixAutoML.so’ not found
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/home/oettli/R/library/RemixAutoML’
Error: Failed to install 'RemixAutoML' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpcPITXk/file1c304a0cc39e/RemixAutoML_0.11.0.tar.gz’ had non-zero exit status

Session info

R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS/LAPACK: /opt/OpenBLAS/lib/libopenblasp-r0.3.7.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sos_2.0-0  brew_1.0-6

loaded via a namespace (and not attached):
[1] compiler_3.6.2

Error message while running code for AutoCatboostCARMA

Hi Adrian,
When I run the code, I get an error message:

Error in if (min(as.ITime(data[[eval(DateCols[i])]])) - max(as.ITime(data[[eval(DateCols[i])]])) == :
missing value where TRUE/FALSE needed
In addition: Warning message:
All formats failed to parse. No formats found.

Any suggestion?


My code is:
library(RemixAutoML)
library(data.table)
data <- data.table::fread("x://Check.csv")
data <- data[, Counts := .N, by = c("Store","Dept")][
Counts == 138][, Counts := NULL]
keep <- c("Store","Dept","Date","Weekly_Sales")
data <- data[, ..keep]
Results <- RemixAutoML::AutoCatBoostCARMA(
data,
TargetColumnName = "Weekly_Sales",
DateColumnName = "Date",
GroupVariables = c("Store","Dept"),
FC_Periods = 30,
TimeUnit = "day",
TargetTransformation = TRUE,
Lags = c(1, 7),
MA_Periods = c(1, 7),
CalendarVariables = TRUE,
TimeTrendVariable = TRUE,
DataTruncate = FALSE,
SplitRatios = c(0.7, 0.2, 0.1),
TaskType = "CPU",
EvalMetric = "MAPE",
GridTune = FALSE,
GridEvalMetric = "mape",
ModelCount = 1,
NTrees = 20000,
PartitionType = "timeseries",
Timer = TRUE)
CatBoost_Eval <- CatBoostResults$ModelInformation$EvaluationMetricsByGroup
CatBoost_Forecast <- CatBoostResults$Forecast
data.table::fwrite(CatBoost_Eval, paste0(getwd(),"/CatBoost_Eval.csv"))
data.table::fwrite(CatBoost_Forecast, paste0(getwd(),"/CatBoost_Forecast.csv"))
rm(CatBoost_Eval,CatBoostResults)
rm(CatBoost_Forecast,CatBoostResults)

AutoKMeans produces 0 clusters

When using the following function with the following parameters:

AutoK_obj <- RemixAutoML::AutoKMeans(
    data = customer_product_tbl %>% select(-bikeshop_name)
    , KMeansK = 15
    , KMeansMetric = "tot_withinss"
    , GridTuneGLRM = TRUE
    , GridTuneKMeans = TRUE
    )

I get only 0 returned in the cluster column. Yet when I make a scree plot, I can see that 3 or 4 clusters would be a good cutoff.

kmeans_mapper <- function(centers = 3) {
    
    # Body
    customer_product_tbl %>%
        select(-bikeshop_name) %>%
        kmeans(
            centers = centers
            , nstart = 100
        )
    
}
kmeans_mapper(3) %>% glance()

# Mapping the function to many elements
kmeans_mapped_tbl <- tibble(centers = 1:15) %>%
    mutate(k_means = centers %>% map(kmeans_mapper)) %>%
    mutate(glance = k_means %>% map(glance))

# Scree Plot ----
kmeans_mapped_tbl %>%
    unnest(glance) %>%
    select(centers, tot.withinss) %>%
    ggplot(
        mapping = aes(
            x = centers
            , y = tot.withinss
        )
    ) +
    geom_point() +
    geom_line() +
    ggrepel::geom_label_repel(mapping = aes(label = centers)) +
    theme_tq()

The data is in user-item matrix form.

customer_trends_tbl.xlsx

undefined exports: AutoXGBoostClassifier

Install RemixAutoML:

devtools::install_github('AdrianAntico/RemixAutoML', upgrade = FALSE, dependencies = FALSE, force = TRUE)
Downloading GitHub repo AdrianAntico/RemixAutoML@master
√ checking for file 'C:\Users\User1\AppData\Local\Temp\RtmpKc3aCz\remotes5c4cb56ae4\AdrianAntico-RemixAutoML-7563aa0/DESCRIPTION' (452ms)

  • preparing 'RemixAutoML': (351ms)
    √ checking DESCRIPTION meta-information ...
  • checking for LF line-endings in source and make files and shell scripts
  • checking for empty or unneeded directories
  • building 'RemixAutoML_0.11.0.tar.gz'
  • installing source package 'RemixAutoML' ...
    ** using staged installation
    ** R
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    converting help for package 'RemixAutoML'
    finding HTML links ... done
    AutoCatBoostCARMA html
    AutoCatBoostClassifier html
    AutoCatBoostHurdleModel html
    AutoCatBoostMultiClass html
    AutoCatBoostRegression html
    AutoCatBoostScoring html
    AutoDataPartition html
    AutoH2OMLScoring html
    AutoH2OModeler html
    AutoH2OScoring html
    AutoH2OTextPrepScoring html
    AutoH2oDRFCARMA html
    AutoH2oDRFClassifier html
    AutoH2oDRFHurdleModel html
    AutoH2oDRFMultiClass html
    AutoH2oDRFRegression html
    AutoH2oGBMCARMA html
    AutoH2oGBMClassifier html
    AutoH2oGBMHurdleModel html
    AutoH2oGBMMultiClass html
    AutoH2oGBMRegression html
    AutoKMeans html
    AutoMarketBasketModel html
    AutoNLS html
    AutoRecomDataCreate html
    AutoRecommender html
    AutoRecommenderScoring html
    AutoTS html
    AutoTransformationCreate html
    AutoTransformationScore html
    AutoWord2VecModeler html
    AutoWordFreq html
    AutoXGBoostCARMA html
    AutoXGBoostClassifier html
    AutoXGBoostHurdleModel html
    AutoXGBoostMultiClass html
    AutoXGBoostRegression html
    AutoXGBoostScoring html
    ChartTheme html
    CreateCalendarVariables html
    CreateHolidayVariables html
    DT_GDL_Feature_Engineering html
    DummifyDT html
    EvalPlot html
    GDL_Feature_Engineering html
    GenTSAnomVars html
    ModelDataPrep html
    ParDepCalPlots html
    Partial_DT_GDL_Feature_Engineering html
    PrintObjectsSize html
    ProblematicFeatures html
    ProblematicRecords html
    RedYellowGreen html
    RemixAutoML-package html
    RemixTheme html
    ResidualOutliers html
    Scoring_GDL_Feature_Engineering html
    SimpleCap html
    TimeSeriesFill html
    multiplot html
    percRank html
    tempDatesFun html
    threshOptim html
    tokenizeH2O html
    ** building package indices
    ** installing vignettes
    ** testing if installed package can be loaded from temporary location
    *** arch - i386
    Error: package or namespace load failed for 'RemixAutoML' in namespaceExport(ns, exports):
    undefined exports: AutoXGBoostClassifier
    Error: loading failed
    Execution halted
    *** arch - x64
    Error: package or namespace load failed for 'RemixAutoML' in namespaceExport(ns, exports):
    undefined exports: AutoXGBoostClassifier
    Error: loading failed
    Execution halted
    ERROR: loading failed for 'i386', 'x64'
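
One way to narrow down an `undefined exports` failure is to compare the names listed in the `export()` directives of the NAMESPACE file with the objects the R sources actually define. The sketch below does this for a throwaway package directory built in a temp folder. The package name `demoPkg`, its functions, and the helper `find_undefined_exports()` are made up for illustration and are not part of RemixAutoML. It handles only plain `export(name)` lines, and sourcing the files of a real package may require its dependencies to be loaded first.

```r
# Hypothetical helper: list names exported in NAMESPACE but not defined in R/.
find_undefined_exports <- function(pkg_dir) {
  ns_lines <- readLines(file.path(pkg_dir, "NAMESPACE"))
  # Keep only simple export(name) directives and strip the wrapper.
  export_lines <- grep("^export\\(", ns_lines, value = TRUE)
  exported <- gsub("^export\\(|\\)\\s*$", "", export_lines)
  exported <- trimws(unlist(strsplit(exported, ",")))
  exported <- gsub("^[\"']|[\"']$", "", exported)

  # Source every R file into a scratch environment to see what gets defined.
  env <- new.env()
  r_files <- list.files(file.path(pkg_dir, "R"), pattern = "\\.[Rr]$", full.names = TRUE)
  for (f in r_files) sys.source(f, envir = env)

  setdiff(exported, ls(env, all.names = TRUE))
}

# Build a toy package where NAMESPACE exports a function that is never defined.
pkg_dir <- file.path(tempdir(), "demoPkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("export(DefinedFun)", "export(MissingFun)"), file.path(pkg_dir, "NAMESPACE"))
writeLines("DefinedFun <- function() 1", file.path(pkg_dir, "R", "funs.R"))

undefined <- find_undefined_exports(pkg_dir)
print(undefined)
```

Any name this reports is a candidate for the function that `namespaceExport()` complains about, typically because its source file failed to parse or was not included in the build.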

AWS sagemaker instance, fedora: cannot find -lMagick++, cannot find -lMagickCore

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/../lib/gcc/x86_64-conda_cos6-linux-gnu/7.3.0/../../../../x86_64-conda_cos6-linux-gnu/bin/ld: cannot find -lMagick++
/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/../lib/gcc/x86_64-conda_cos6-linux-gnu/7.3.0/../../../../x86_64-conda_cos6-linux-gnu/bin/ld: cannot find -lMagickCore

I'm trying to install the R package on an AWS SageMaker instance, which uses a Fedora distribution.
I've made sure to install magick by running
sudo yum install ImageMagick-c++-devel

However, I keep getting "cannot find -lMagick++" and "cannot find -lMagickCore". magick is a prerequisite for installing RemixAutoML.

Has anyone encountered the same issue and found a solution?

Thanks

Error Install

** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for 'RemixAutoML' in namespaceExport(ns, exports):
undefined exports: AutoXGBoostMultiClass

Is this an error in the package or on my side?

AutoCatBoostCARMA Error

Error in data.table::rbindlist(list(UpdateData[ID != 1], Temporary), use.names = TRUE) :
Item 2 has 62 columns, inconsistent with item 1 which has 63 columns. To fill missing columns use fill=TRUE.

Error installing package

Thanks for your work in creating this library.

I tried to install the package using the instructions in README.md but I am getting the following error:

* installing *source* package 'RemixAutoML' ...
** R
Error in parse(outFile) : 
  C:/Users/Ajay/AppData/Local/Temp/Rtmp4mKcfB/R.INSTALLb390631b7c86/RemixAutoML/R/EconometricsFunctions.R:1024:89: unexpected ']'
1023:   if(ModelFreq) {
1024:     ModelFreqFrequency <- forecast::findfrequency(data_train[, get(names(data_train)[2L]]

There is a missing ')' in forecast::findfrequency(data_train[, get(names(data_train)[2L]] in EconometricsFunctions.R.
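
The diagnosis can be checked without installing anything by passing both versions of the expression to `parse()`, which checks syntax only and does not evaluate the code. This is a small sketch: the corrected line is the fix implied by the error message, not a confirmed copy of the maintainer's patch, and because the report does not show the rest of the source line, the final closing of `findfrequency()` is an assumption.

```r
# The line as reported in the error (missing the ')' that closes get()).
broken <- "forecast::findfrequency(data_train[, get(names(data_train)[2L]]"
# Same line with get(...) and findfrequency(...) both closed (assumed fix).
fixed  <- "forecast::findfrequency(data_train[, get(names(data_train)[2L])])"

# parse() checks syntax only; nothing is evaluated, so forecast is not needed.
parses_ok <- function(code) {
  tryCatch({ parse(text = code); TRUE }, error = function(e) FALSE)
}

broken_ok <- parses_ok(broken)
fixed_ok  <- parses_ok(fixed)
print(c(broken = broken_ok, fixed = fixed_ok))
```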

AutoKMeans unused argument

Hi,
I'm working through your Package Overview course and ran into problems running the AutoKMeans() example on the Iris data set.

It gives me the following:
Error in AutoKMeans(data, nthreads = 8, MaxMem = "28G", SaveModels = NULL, : object 'FilePath' not found

Under the hood:
if (!is.null(FilePath)) {
  if (!is.character(FilePath)) {
    warning("FilePath needs to resolve to a character value. E.g. getwd()")
  }
}

If I comment this out I can get it to run.

Should this have been PathFile from your function params?

Cheers
Bart
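
The error pattern can be reproduced in isolation: when a function body refers to a name that is neither an argument nor defined anywhere else, R raises `object ... not found` only when that line is evaluated. The sketch below uses two made-up functions, `check_path_buggy()` and `check_path_fixed()`, with a `PathFile` argument as guessed in the report. It is not the package's actual code.

```r
# Hypothetical reproduction: the argument is PathFile, but the body checks FilePath.
check_path_buggy <- function(PathFile = NULL) {
  if (!is.null(FilePath)) {
    if (!is.character(FilePath)) warning("FilePath needs to resolve to a character value. E.g. getwd()")
  }
  invisible(TRUE)
}

# Same check with the name matching the argument.
check_path_fixed <- function(PathFile = NULL) {
  if (!is.null(PathFile)) {
    if (!is.character(PathFile)) warning("PathFile needs to resolve to a character value. E.g. getwd()")
  }
  invisible(TRUE)
}

# Capture the error message from the buggy version instead of stopping the script.
buggy_msg <- tryCatch(
  { check_path_buggy(); "no error" },
  error = function(e) conditionMessage(e)
)
fixed_result <- check_path_fixed(PathFile = getwd())
print(buggy_msg)
```

Renaming the variable inside the check, rather than commenting the check out, keeps the input validation in place.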

NumCores parameter

Hi @AdrianAntico,

I'm running a machine with 36 cores and 64 GB of RAM.

However, I notice that the runtime doesn't seem to be any faster than on my laptop with 8 cores and 8 GB of RAM.

I've made sure to update the NumCores parameter though.

Is this a known issue? Thanks
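
A quick sanity check before digging into the package is to confirm how many cores the R session itself can see, since requesting more than that is unlikely to help. This is a minimal sketch using the base `parallel` package. The "leave two cores free" rule mirrors the `parallel::detectCores()-2` pattern used elsewhere on this page, and the variable names are illustrative. It does not by itself explain why a given model fails to scale.

```r
library(parallel)

# Logical cores visible to this R session (may be NA on some platforms).
visible_cores <- detectCores()
if (is.na(visible_cores)) visible_cores <- 1L

# Leave a couple of cores for the OS, but never go below 1.
cores_to_use <- max(1L, visible_cores - 2L)
cat("Visible cores:", visible_cores, "- cores to request:", cores_to_use, "\n")
```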

Error in AutoXGBoostRegression function

Hi Adrian,
there is an error with this code:

Correl <- 0.85
N <- 10000
data <- data.table::data.table(Target = runif(N))
data[, x1 := qnorm(Target)]
data[, x2 := runif(N)]
data[, Independent_Variable1 := log(pnorm(Correl * x1 +
                                            sqrt(1-Correl^2) * qnorm(x2)))]
data[, Independent_Variable2 := (pnorm(Correl * x1 +
                                         sqrt(1-Correl^2) * qnorm(x2)))]
data[, Independent_Variable3 := exp(pnorm(Correl * x1 +
                                            sqrt(1-Correl^2) * qnorm(x2)))]
data[, Independent_Variable4 := exp(exp(pnorm(Correl * x1 +
                                                sqrt(1-Correl^2) * qnorm(x2))))]
data[, Independent_Variable5 := sqrt(pnorm(Correl * x1 +
                                             sqrt(1-Correl^2) * qnorm(x2)))]
data[, Independent_Variable6 := (pnorm(Correl * x1 +
                                         sqrt(1-Correl^2) * qnorm(x2)))^0.10]
data[, Independent_Variable7 := (pnorm(Correl * x1 +
                                         sqrt(1-Correl^2) * qnorm(x2)))^0.25]
data[, Independent_Variable8 := (pnorm(Correl * x1 +
                                         sqrt(1-Correl^2) * qnorm(x2)))^0.75]
data[, Independent_Variable9 := (pnorm(Correl * x1 +
                                         sqrt(1-Correl^2) * qnorm(x2)))^2]
data[, Independent_Variable10 := (pnorm(Correl * x1 +
                                          sqrt(1-Correl^2) * qnorm(x2)))^4]
data[, Independent_Variable11 := as.factor(
  ifelse(Independent_Variable2 < 0.20, "A",
         ifelse(Independent_Variable2 < 0.40, "B",
                ifelse(Independent_Variable2 < 0.6,  "C",
                       ifelse(Independent_Variable2 < 0.8,  "D", "E")))))]
data[, ':=' (x1 = NULL, x2 = NULL)]

TestModel <- AutoXGBoostRegression(data,
                                   TrainOnFull = FALSE,
                                   ValidationData = NULL,
                                   TestData = NULL,
                                   TargetColumnName = "Target",
                                   FeatureColNames = 2:12,
                                   IDcols = NULL,
                                   ReturnFactorLevels = FALSE,
                                   TransformNumericColumns = NULL,
                                   eval_metric = "RMSE",
                                   Trees = 50,
                                   GridTune = TRUE,
                                   grid_eval_metric = "mae",
                                   MaxModelsInGrid = 10,
                                   NThreads = max(1, parallel::detectCores()-2),
                                   TreeMethod = "hist",
                                   model_path = NULL,
                                   metadata_path = NULL,
                                   ModelID = "FirstModel",
                                   NumOfParDepPlots = 3,
                                   ReturnModelObjects = TRUE,
                                   SaveModelObjects = FALSE,
                                   PassInGrid = NULL)

[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
[1] 11
Error in get(Target) : invalid first argument

Best Regards

Mauricio

AutoTS Function ERROR

Hey,

I am trying to run the following code:

output <- AutoTS(kpi,
                 TargetName = 'Reject Qty',
                 DateName = 'month_year',
                 FCPeriods = 3,
                 HoldOutPeriods = 4,
                 EvaluationMetric = 'MAPE',
                 InnerEval = 'AICc',
                 TimeUnit = 'month',
                 Lags = 1,
                 SLags = 1,
                 SkipModels = c("NNET","TBATS","ETS","TSLM","ARFIMA","DSHW"),
                 StepWise = TRUE,
                 TSClean = FALSE,
                 ModelFreq = TRUE,
                 PlotPredictionIntervals = TRUE,
                 PrintUpdates = FALSE)

and getting this error:

Error in ncol(Final_metrics) : object 'Final_metrics' not found

Does anyone know what mistake I am making?

Error: package or namespace load failed for 'RemixAutoML' in library.dynam(lib, package, package.lib): DLL 'RemixAutoML' not found: maybe not installed for this architecture?

Hello, I tried installing the package, but it always fails with this error:
Error: package or namespace load failed for 'RemixAutoML' in library.dynam(lib, package, package.lib):
DLL 'RemixAutoML' not found: maybe not installed for this architecture?

I already installed all the dependencies and have no idea where things went wrong. My R env:
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Platform: x86_64-w64-mingw32/x64 (64-bit)

Hope you can solve my problem, thanks

Undefined exports: AutoCatBoostCARMA

Hello,

There is an error during installation:
"Error: package or namespace load failed for ‘RemixAutoML’ in namespaceExport(ns, exports):
undefined exports: AutoCatBoostCARMA"

There may be an error in the NAMESPACE file's export() directives.

Thank you!
