distributedr's People

Contributors

bjnoe, etduwx, fun-indra, jorgemarsal, pratikdhandharia, shreya2k7, vinxu123, vishrutg, winstongitvertica

distributedr's Issues

Shiny App Crash

The Shiny app crashes when creating an ODBC connection using vODBC:

library(vertica.dplyr)

vrt <- src_vertica(
  dsn = 'VerticaDSN'
)

shinyServer(function(input, output) {

  output$distPlot <- renderPlot({

    # generate bins based on input$bins from ui.R
    x    <- faithful[, 2]
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    # draw the histogram with the specified number of bins
    hist(x, breaks = bins, col = 'darkgray', border = 'white')

  })

})

I have discovered that the problem is in the odbcConnect call. Interestingly, if I execute odbcConnect in a console before launching my Shiny app, everything works.

Source code installation failing with error during make: 'Rcpp_eval' is not a member of 'Rcpp'

Hello, I am following the installation document provided with the source code and trying to compile it on CentOS, but I get the error mentioned in the subject. Is there a fix for it?

[root@quickstart DistributedR-master]# cat /etc/redhat-release
CentOS release 6.4 (Final)

[root@quickstart DistributedR-master]# uname -ar
Linux quickstart.cloudera 2.6.32-358.el6.x86_64

Creating a generic function for 'ncol' from package 'base' in package 'distributedR'
Creating a generic function for 'NCOL' from package 'base' in package 'distributedR'
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
During startup - Warning message:
Setting LC_CTYPE failed, using "C"

* DONE (distributedR)
    g++ platform/executor/src/Rtools.cpp -c -g -O2 -fopenmp -finline-limit=10000 -DNDEBUG -DBOOST_LOG_DYN_LINK -DCSTACK_DEFNS -DHAVE_NETINET_IN_H -DHAVE_INTTYPES_H -I /home/cloudera/Downloads/DistributedR-master/third_party/install/include -I /home/cloudera/Downloads/DistributedR-master/platform/messaging/gen-cpp -I /home/cloudera/Downloads/DistributedR-master/third_party/boost_threadpool/threadpool -I /home/cloudera/Downloads/DistributedR-master/third_party/atomicio -I platform/common /usr/lib64/R/bin/R CMD config --cppflags Rscript -e "Rcpp:::CxxFlags()" Rscript -e "RInside:::CxxFlags()" -lm -rdynamic -L /home/cloudera/Downloads/DistributedR-master/lib -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/lib /usr/lib64/R/bin/R CMD config --ldflags -lpthread -L/home/cloudera/Downloads/DistributedR-master/third_party/install/lib -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/third_party/install/lib -lboost_thread -lboost_system -lboost_log -lboost_log_setup -lboost_chrono -L /home/cloudera/Downloads/DistributedR-master/third_party/atomicio -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/third_party/atomicio -latomicio Rscript -e "Rcpp:::LdFlags()" Rscript -e "RInside:::LdFlags()" -laio -lrt -Wno-deprecated-declarations -DSTRICT_R_HEADERS -fPIC -o platform/executor/src/Rtools.o
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    g++ platform/executor/src/executor.cpp -c -g -O2 -fopenmp -finline-limit=10000 -DNDEBUG -DBOOST_LOG_DYN_LINK -DCSTACK_DEFNS -DHAVE_NETINET_IN_H -DHAVE_INTTYPES_H -I /home/cloudera/Downloads/DistributedR-master/third_party/install/include -I /home/cloudera/Downloads/DistributedR-master/platform/messaging/gen-cpp -I /home/cloudera/Downloads/DistributedR-master/third_party/boost_threadpool/threadpool -I /home/cloudera/Downloads/DistributedR-master/third_party/atomicio -I platform/common /usr/lib64/R/bin/R CMD config --cppflags Rscript -e "Rcpp:::CxxFlags()" Rscript -e "RInside:::CxxFlags()" -lm -rdynamic -L /home/cloudera/Downloads/DistributedR-master/lib -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/lib /usr/lib64/R/bin/R CMD config --ldflags -lpthread -L/home/cloudera/Downloads/DistributedR-master/third_party/install/lib -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/third_party/install/lib -lboost_thread -lboost_system -lboost_log -lboost_log_setup -lboost_chrono -L /home/cloudera/Downloads/DistributedR-master/third_party/atomicio -Wl,-rpath,/home/cloudera/Downloads/DistributedR-master/third_party/atomicio -latomicio Rscript -e "Rcpp:::LdFlags()" Rscript -e "RInside:::LdFlags()" -laio -lrt -Wno-deprecated-declarations -DSTRICT_R_HEADERS -fPIC -o platform/executor/src/executor.o
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    During startup - Warning message:
    Setting LC_CTYPE failed, using "C"
    In file included from /usr/lib64/R/library/RInside/include/RInside.h:26,
    from platform/executor/src/executor.cpp:49:
    /usr/lib64/R/library/RInside/include/RInsideCommon.h:49:1: warning: "CSTACK_DEFNS" redefined
    <command-line>: warning: this is the location of the previous definition
    platform/executor/src/executor.cpp: In function 'int main(int, char*)':
    platform/executor/src/executor.cpp:661: error: 'Rcpp_eval' is not a member of 'Rcpp'
    make: *** [platform/executor/src/executor.o] Error 1

Cannot build with g++ 4.8

I installed all the dependencies, but the build still fails with errors. I suspect a gcc version issue.

$ make
make -C third_party -j8 all
make[1]: Entering directory `/root/src/DistributedR/third_party'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/root/src/DistributedR/third_party'
g++ platform/worker/src/ExecutorPool.cpp -c -std=c++0x -g -O2 -fopenmp -finline-limit=10000 -DNDEBUG -DBOOST_LOG_DYN_LINK -DCSTACK_DEFNS -DHAVE_NETINET_IN_H -DHAVE_INTTYPES_H -I /root/src/DistributedR/third_party/install/include -I /root/src/DistributedR/platform/messaging/gen-cpp -I /root/src/DistributedR/third_party/boost_threadpool/threadpool -I /root/src/DistributedR/third_party/atomicio -I platform/common `/usr/lib/R/bin/R CMD config --cppflags` `Rscript -e "Rcpp:::CxxFlags()"` `Rscript -e "RInside:::CxxFlags()"` -lm -rdynamic -L /root/src/DistributedR/lib -Wl,-rpath,/root/src/DistributedR/lib `/usr/lib/R/bin/R CMD config --ldflags` -lpthread -L/root/src/DistributedR/third_party/install/lib -Wl,-rpath,/root/src/DistributedR/third_party/install/lib -lboost_thread -lboost_system -lboost_log -lboost_log_setup -lboost_chrono -lboost_filesystem -lboost_date_time -L /root/src/DistributedR/third_party/atomicio -Wl,-rpath,/root/src/DistributedR/third_party/atomicio -latomicio `Rscript -e "Rcpp:::LdFlags()"` `Rscript -e "RInside:::LdFlags()"` -lrt -Wno-deprecated-declarations -DSTRICT_R_HEADERS -fPIC -o platform/worker/src/ExecutorPool.o
In file included from /usr/local/lib/R/site-library/RInside/include/RInside.h:26:0,
from platform/common/ArrayData.h:38,
from platform/worker/src/ExecutorPool.h:45,
from platform/worker/src/ExecutorPool.cpp:31:
/usr/local/lib/R/site-library/RInside/include/RInsideCommon.h:49:0: warning: "CSTACK_DEFNS" redefined [enabled by default]
#define CSTACK_DEFNS
^
<command-line>:0:0: note: this is the location of the previous definition
platform/worker/src/ExecutorPool.cpp:52:28: error: reference to ‘unordered_set’ is ambiguous
unordered_set shmem_arrays,
^
In file included from /root/src/DistributedR/third_party/install/include/boost/unordered/unordered_set.hpp:16:0,
from /root/src/DistributedR/third_party/install/include/boost/unordered_set.hpp:16,
from platform/worker/src/ExecutorPool.h:34,
from platform/worker/src/ExecutorPool.cpp:31:
/root/src/DistributedR/third_party/install/include/boost/unordered/unordered_set_fwd.hpp:27:15: note: candidates are: template<class T, class H, class P, class A> class boost::unordered::unordered_set
class unordered_set;
^
In file included from /usr/include/c++/4.8/unordered_set:48:0,
from /usr/local/lib/R/site-library/Rcpp/include/Rcpp/platform/compiler.h:157,
from /usr/local/lib/R/site-library/Rcpp/include/RcppCommon.h:29,
from /usr/local/lib/R/site-library/Rcpp/include/Rcpp.h:27,
from /usr/local/lib/R/site-library/RInside/include/RInsideCommon.h:38,
from /usr/local/lib/R/site-library/RInside/include/RInside.h:26,
from platform/common/ArrayData.h:38,
from platform/worker/src/ExecutorPool.h:45,
from platform/worker/src/ExecutorPool.cpp:31:
/usr/include/c++/4.8/bits/unordered_set.h:93:11: note: template<class Value, class Hash, class Pred, class Alloc> class std::unordered_set
class unordered_set : check_copy_constructible<Alloc>
^
platform/worker/src/ExecutorPool.cpp:52:28: error: ‘unordered_set’ has not been declared
unordered_set *shmem_arrays,
^
platform/worker/src/ExecutorPool.cpp:52:41: error: expected ‘,’ or ‘...’ before ‘<’ token
unordered_set *shmem_arrays,
^
platform/worker/src/ExecutorPool.cpp:49:1: error: prototype for ‘presto::ExecutorPool::ExecutorPool(int, presto::ServerInfo*, presto::MasterClient*, boost::timed_mutex*, int)’ does not match any in class ‘presto::ExecutorPool’
ExecutorPool::ExecutorPool(int n_, ServerInfo *my_location_,
^
In file included from platform/worker/src/ExecutorPool.cpp:31:0:
platform/worker/src/ExecutorPool.h:49:7: error: candidates are: presto::ExecutorPool::ExecutorPool(const presto::ExecutorPool&)
class ExecutorPool {
^
platform/worker/src/ExecutorPool.h:51:3: error: presto::ExecutorPool::ExecutorPool(int, presto::ServerInfo*, presto::MasterClient*, boost::timed_mutex*, boost::unordered::unordered_set<std::basic_string<char> >*, const string&, int, std::string, int)
ExecutorPool(int n_, ServerInfo *my_location_, MasterClient* master_,
^
In file included from platform/common/ArrayData.h:28:0,
from platform/worker/src/ExecutorPool.h:45,
from platform/worker/src/ExecutorPool.cpp:31:
platform/common/SharedMemory.h: In member function ‘virtual void presto::BoostSharedMemoryObject::truncate(size_t)’:
platform/common/SharedMemory.h:117:45: warning: ignoring return value of ‘int lockf(int, int, off_t)’, declared with attribute warn_unused_result [-Wunused-result]
lockf( shared_memory_sem, F_ULOCK, 0 );
^
platform/common/SharedMemory.h:123:43: warning: ignoring return value of ‘int lockf(int, int, off_t)’, declared with attribute warn_unused_result [-Wunused-result]
lockf( shared_memory_sem, F_ULOCK, 0 );
^
In file included from platform/worker/src/ExecutorPool.cpp:29:0:
platform/common/UpdateUtils.h: In function ‘int32_t presto::ParseUpdateLine(FILE*, char*, size_t*, int*, size_t*, size_t*, char*)’:
platform/common/UpdateUtils.h:72:36: warning: ignoring return value of ‘int fscanf(FILE*, const char*, ...)’, declared with attribute warn_unused_result [-Wunused-result]
fscanf(in, "\n%[^\n]", message);
^
make: *** [platform/worker/src/ExecutorPool.o] Error 1
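
The two candidates named by the compiler, boost::unordered::unordered_set and std::unordered_set, suggest that an unqualified `unordered_set` becomes visible through two using-directives once Rcpp pulls in the standard `<unordered_set>` header under -std=c++0x. The sketch below reproduces that situation without Boost and shows the usual remedy of qualifying the name. Whether ExecutorPool.cpp actually relies on using-directives is an assumption, and the `vendor` namespace and `CountDistinct` function are hypothetical stand-ins, not DistributedR code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for boost::unordered_set (not the real Boost type).
namespace vendor {
template <class T>
class unordered_set {
 public:
  void insert(const T& value) { items_.insert(value); }
  std::size_t size() const { return items_.size(); }

 private:
  std::unordered_set<T> items_;
};
}  // namespace vendor

using namespace std;     // makes std::unordered_set visible unqualified
using namespace vendor;  // makes vendor::unordered_set visible unqualified

// Writing plain `unordered_set<string>` here would fail with
// "reference to 'unordered_set' is ambiguous", because both
// using-directives expose a template with that name.
// Qualifying the name selects one candidate explicitly.
size_t CountDistinct(const vector<string>& names) {
  vendor::unordered_set<string> seen;
  for (const string& name : names) {
    seen.insert(name);
  }
  return seen.size();
}
```

In the real build the equivalent change would be to spell out the namespace (for example `boost::unordered_set`) wherever the parameter is declared, so that the declaration matches the one in ExecutorPool.h.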

I cannot build it.

Here is the error message:

xuzhan@markTwo:~/lab/distributedR/DistributedR-master$ make
make -C third_party -j8 all
make[1]: Entering directory '/home/xuzhan/lab/distributedR/DistributedR-master/third_party'
make[1]: Nothing to be done for 'all'.
make[1]: Leaving directory '/home/xuzhan/lab/distributedR/DistributedR-master/third_party'
mkdir -p /home/xuzhan/lab/distributedR/DistributedR-master/lib
g++ platform/common/ArrayData.o platform/common/WorkerInfo.o platform/common/DistDataFrame.o platform/common/TransferServer.o platform/common/common.o platform/common/MasterClient.o platform/common/DistList.o -std=c++0x -g -O2 -fopenmp -finline-limit=10000 -DNDEBUG -DBOOST_LOG_DYN_LINK -DCSTACK_DEFNS -DHAVE_NETINET_IN_H -DHAVE_INTTYPES_H -I /home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/include -I /home/xuzhan/lab/distributedR/DistributedR-master/platform/messaging/gen-cpp -I /home/xuzhan/lab/distributedR/DistributedR-master/third_party/boost_threadpool/threadpool -I /home/xuzhan/lab/distributedR/DistributedR-master/third_party/atomicio -I platform/common /usr/lib/R/bin/R CMD config --cppflags Rscript -e "Rcpp:::CxxFlags()" Rscript -e "RInside:::CxxFlags()" -lm -rdynamic -L /home/xuzhan/lab/distributedR/DistributedR-master/lib -Wl,-rpath,/home/xuzhan/lab/distributedR/DistributedR-master/lib /usr/lib/R/bin/R CMD config --ldflags -lpthread -L/home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/lib -Wl,-rpath,/home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/lib -lboost_thread -lboost_system -lboost_log -lboost_log_setup -lboost_chrono -lboost_filesystem -lboost_date_time -L /home/xuzhan/lab/distributedR/DistributedR-master/third_party/atomicio -Wl,-rpath,/home/xuzhan/lab/distributedR/DistributedR-master/third_party/atomicio -latomicio Rscript -e "Rcpp:::LdFlags()" Rscript -e "RInside:::LdFlags()" -lrt -Wno-deprecated-declarations -DSTRICT_R_HEADERS -fPIC -shared -o /home/xuzhan/lab/distributedR/DistributedR-master/lib/libR-common.so /home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/lib/libprotobuf.a -lR-proto /home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/lib/libzmq.a /home/xuzhan/lab/distributedR/DistributedR-master/third_party/install/lib/libuuid.a
platform/common/DistDataFrame.o:(.bss+0x0): multiple definition of `R_running_as_main_program'
platform/common/ArrayData.o:(.bss+0x10): first defined here
platform/common/DistList.o:(.bss+0x0): multiple definition of `R_running_as_main_program'
platform/common/ArrayData.o:/home/xuzhan/lab/distributedR/DistributedR-master/platform/common/ArrayData.cpp:452: first defined here
collect2: error: ld returned 1 exit status
build.mk:41: recipe for target '/home/xuzhan/lab/distributedR/DistributedR-master/lib/libR-common.so' failed
make: *** [/home/xuzhan/lab/distributedR/DistributedR-master/lib/libR-common.so] Error 1

A <- darray(dim=c(9,9), blocks=c(3,3), sparse=FALSE, data=10) stuck R

2015-Feb-05 05:10:54.290356 [DEBUG] New Update pointer to maintain list of updated split/composite variables in Function execution created.
2015-Feb-05 05:10:54.290439 [INFO] *** No Task under execution. Waiting from Task from Worker **
2015-Feb-05 05:11:11.942543 [DEBUG] Number of Split variables: 1
2015-Feb-05 05:11:11.942623 [DEBUG] Read Split Variable R-shm-50437-2011658264_0_0 (dhs in R)
2015-Feb-05 05:11:11.942818 [DEBUG] ParseShm: Uninited array R-shm-50437-2011658264_0_0 (this should only happen at array creation)
2015-Feb-05 05:11:11.942905 [INFO] New Task received from Worker. Reading Function Arguments and Body
2015-Feb-05 05:11:11.942940 [DEBUG] Reading Raw Arguments of Function from Worker and load it in R-session
2015-Feb-05 05:11:11.942971 [DEBUG] Number of Raw variables: 5
2015-Feb-05 05:11:11.942999 [DEBUG] Read Raw variable val

It seems the following code causes this issue:

int ReadRawArgs(RInside& R) {  // NOLINT
  LOG_DEBUG("Reading Raw Arguments of Function from Worker and load it in R-session");
  int raw_vars;  // number of non-split variables
  Timer timer;
  timer.start();
  int res = scanf(" %d ", &raw_vars);
  if (res != 1) {
    LOG_WARN("ReadRawArgs => Bad file format for Executor. Executor cannot recognize commands from Worker.");
    throw PrestoWarningException
      ("ReadRawArgs::Executor cannot recognize commands from a worker");
  }

  LOG_DEBUG("Number of Raw variables: %d", raw_vars);

  for (int i = 0; i < raw_vars; i++) {
    char name[256];
    int size;
    res = scanf(" %s %d:", name, &size);  // stuck here, but I don't know why
    LOG_DEBUG("Read Raw variable %s", name);
    if (res != 2) {
      LOG_WARN("ReadRawArgs => Variable %s - Bad file format for Executor. Executor cannot recognize commands from Worker.", name);
      throw PrestoWarningException
        ("ReadRawArgs::Executor cannot recognize commands from a worker");
    }
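
For reference, `scanf` on stdin blocks until enough input arrives or the stream reaches end-of-file, so a hang at that call usually means the executor is still waiting for bytes from the worker rather than failing to parse. The sketch below does not diagnose that. It only shows the same `name size:` header parsed from an in-memory line with a bounded `%255s` width, so that an over-long name cannot overrun the 256-byte buffer, and it touches the name only after the conversion count has been checked. `ParseRawHeader` is a hypothetical helper, not part of the DistributedR source.

```cpp
#include <cstdio>
#include <string>

// Parses one "<name> <size>:" header, the shape the scanf call above expects.
// Returns true only when both fields were converted.
bool ParseRawHeader(const char* line, std::string* name, int* size) {
  char buffer[256];
  int parsed_size = 0;
  // %255s caps the copy at 255 characters plus the terminating NUL,
  // so a long name cannot overflow the 256-byte buffer.
  int converted = std::sscanf(line, " %255s %d:", buffer, &parsed_size);
  if (converted != 2) {
    return false;  // also covers EOF, which is reported as a negative value
  }
  // The buffer is read only after both conversions succeeded.
  *name = buffer;
  *size = parsed_size;
  return true;
}
```

A header such as `val 12:` yields the name `val` and the size 12, while a line that stops after the name is rejected instead of being logged with an unset buffer.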

DistributedR :: Issue HPdata with kerberos based hadoop environment.

Hi,

I am trying to load a CSV file from Hadoop HDFS (Cloudera 5.8.0 + Kerberos) but receive the error shown below.

Rscript:
library(HPdata)
library(distributedR)
distributedR_start()
Sys.setenv(DEBUG_DDC=TRUE)
system("kinit -kt ")
mydframe <- csv2dframe(url='hdfs:///user//Sample.csv',schema='A1:character,A2:character,A3:character', hdfsConfigurationFile='/home//hdfsconfig.json')

Error:
response-parse: lexical error: invalid char in json text.
<meta http-equiv
(right here) ------^

Error: basic_string::_S_construct null not valid

hdfsconfig.json file content:

{
"webhdfsPort": 50070,
"hdfsPort": 8020,
"hdfsHost": "",
"hdfsUser": ""
}

I am not sure whether Kerberos authentication is supported in the HPdata package.

I ran wget against the WebHDFS URL and was able to get the JSON response after Kerberos authentication, so the issue seems to be with Kerberos authentication in the HPdata package. Is there any configuration or workaround for this?

Thanks
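
The `<meta http-equiv` fragment in the parser error suggests that the WebHDFS endpoint answered with an HTML page where JSON was expected, which would be consistent with a request that was not authenticated. A minimal sketch of a guard that would surface this before the JSON parser runs, assuming the response body is available as a string. `LooksLikeJson` is a hypothetical helper and not part of HPdata.

```cpp
#include <cctype>
#include <string>

// Returns true when the first non-whitespace character opens a JSON
// object or array. An HTML page starts with '<' and is rejected.
bool LooksLikeJson(const std::string& body) {
  for (unsigned char ch : body) {
    if (std::isspace(ch)) {
      continue;  // skip leading whitespace
    }
    return ch == '{' || ch == '[';
  }
  return false;  // empty or whitespace-only body
}
```

A check like this only distinguishes the two cases; it does not add Kerberos support, which would still have to come from the HTTP client used for the WebHDFS request.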

Support sparse darrays in hpdglm

The main goal should be to make the package more memory efficient. The biggest darray is the one that contains the predictors. In some applications the matrix of predictors is sparse, so using sparse darrays may have a big impact on both memory usage and total computation time (because of reduced communication overhead). Efficient support of sparse darrays for predictors will touch almost all the foreach loops of the package.
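
To give a rough sense of the potential saving, the sketch below compares the storage of a dense block of doubles with compressed sparse column (CSC) storage. That sparse partitions are held in CSC form with 32-bit indices is an assumption here, and the function names are illustrative only.

```cpp
#include <cstddef>

// Bytes needed for a dense matrix of doubles.
std::size_t DenseBytes(std::size_t rows, std::size_t cols) {
  return rows * cols * sizeof(double);
}

// Bytes needed for compressed sparse column storage: one double and one
// row index per nonzero, plus cols + 1 column pointers.
std::size_t CscBytes(std::size_t nonzeros, std::size_t cols) {
  const std::size_t kIndexBytes = 4;  // assumed 32-bit indices
  return nonzeros * (sizeof(double) + kIndexBytes) + (cols + 1) * kIndexBytes;
}
```

Under these assumptions, and with 8-byte doubles, a 1000 x 100 block with 1% nonzeros (1,000 values) takes 12,404 bytes instead of 800,000, and the same ratio applies to whatever is sent between workers.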

Integration with accumulo

How can Accumulo be made a data source for distributedR so that analytics can run over that data in parallel?
