
plcontainer's People

Contributors

0x0fff, baishaoqi, beeender, davecramer, dotyjim-work, gfphoenix78, gp-releng, haozhouwang, higuoxing, kmacoskey, krait007, liuxueyang, markwort, paul-guo-, rmtt, sasasu, stanlyxiang, violet2016, wengyanqing, xuebinsu, yihong0618, yv5125, zhangh43, zhrt123, zxuejing

plcontainer's Issues

Remove unused test ans files.

e.g. plcontainer_populate.out

I did not check carefully, but I assume they are useless now. We should double-check them and remove them accordingly.

Remove _DEBUG_CLIENT

This code no longer appears to be needed. Once it is removed we could also close(listen_fd), although that is a minor cleanup.

Test Issue

Test if this issue will be synced to tracker.

Support extension in plcontainer

Now that PostgreSQL extensions are supported in gpdb, we need to add extension support to plcontainer, since an extension has advantages over a bare language.

Should use fixed-width variables for [de]serialization

I've seen a lot of code like this:

send_int32(conn, call->hasChanged);

hasChanged is defined as int.

It would be better to declare such fields with fixed-width types, e.g.

s/int hasChanged/int32 hasChanged/

This would avoid potential bugs when the width of int differs from the width written on the wire.
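As a rough illustration of the suggestion above, the sketch below declares the flag as int32_t and serializes it through helpers that always read and write exactly four bytes. The struct and helper names (call_header, put_int32, get_int32) are hypothetical and are not plcontainer's actual API, and the use of network byte order is an assumption made only for this example.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h> /* htonl, ntohl */

/* Hypothetical message header: the flag has an explicit 32-bit width. */
typedef struct {
    int32_t has_changed;
} call_header;

/* Write a 32-bit value into buf in network byte order (an assumption of
 * this sketch). Returns the number of bytes written, always 4. */
static size_t put_int32(unsigned char *buf, int32_t value) {
    uint32_t wire = htonl((uint32_t) value);
    memcpy(buf, &wire, sizeof wire);
    return sizeof wire;
}

/* Read back a 32-bit value written by put_int32. */
static int32_t get_int32(const unsigned char *buf) {
    uint32_t wire;
    memcpy(&wire, buf, sizeof wire);
    return (int32_t) ntohl(wire);
}
```

Because the field type and the wire width match, the sender and receiver agree on the size regardless of how wide a plain int is on either platform.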

Use a separate GUC to control the log level on the client side.

Currently we use log_min_messages to control the log level on the client side. We had better use a separate GUC. Since plcontainer is dynamically loaded, setting a custom GUC may not work on segments. Perhaps we can revisit this problem after gpdb supports create extension.

"plcontainer image-add" failed when using image filename contains relative path

[gpadmin@jwu-vm ~]$ pwd
/home/gpadmin
[gpadmin@jwu-vm ~]$ plcontainer image-add -f plcontainer/daily/plcontainer-python-images-devel.tar.gz
20171211:12:52:24:020741 plcontainer:jwu-vm:gpadmin-[INFO]:-Checking whether docker is installed on all hosts...
20171211:12:52:24:020741 plcontainer:jwu-vm:gpadmin-[INFO]:-Distributing image file plcontainer/daily/plcontainer-python-images-devel.tar.gz to all hosts...
20171211:12:52:25:020741 plcontainer:jwu-vm:gpadmin-[CRITICAL]:-plcontainer failed. (Reason='ExecutionError: 'Error Executing Command: ' occured.  Details: '/bin/scp plcontainer/daily/plcontainer-python-images-devel.tar.gz jwu-vm:/usr/local/greenplum-db/./share/postgresql/plcontainer/plcontainer/daily/plcontainer-python-images-devel.tar.gz'  cmd had rc=1 completed=True halted=False
  stdout=''
  stderr='scp: /usr/local/greenplum-db/./share/postgresql/plcontainer/plcontainer/daily/plcontainer-python-images-devel.tar.gz: No such file or directory
'') exiting...

Residual containers are randomly left behind in the fault-injection test.

Detailed error message:
--- /home/gpadmin/plcontainer_src/tests/expected/faultinject_python.out 2017-12-27 23:26:52.654279211 +0000
+++ /home/gpadmin/plcontainer_src/tests/results/faultinject_python.out 2017-12-27 23:26:52.657279189 +0000
@@ -72,7 +72,7 @@
GP_IGNORE:
GP_IGNORE:-- end_ignore
! ssh psql -d ${PL_TESTDB} -c 'select address from gp_segment_configuration where dbid=2' -t -A docker ps -a </dev/null | wc -l
-1
+2

Remove docker non-curl api

We have both curl and non-curl docker API code. However, the non-curl code has bugs and limitations: it does not handle TCP partial reads well, it has no timeout mechanism, and it does not support chunked encoding (so the inspect API code actually works around this). We had better remove this code and leave the work to a more mature package, i.e. libcurl.

We require libcurl >= 7.40 to use the curl code. I assume that is because unix domain socket support in libcurl starts from 7.40. We could document this requirement in the README and Makefile. In the long run we might switch to TCP so that older libcurl versions can be used.
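To make the partial-read limitation concrete, here is a minimal sketch of what a hand-written HTTP client has to do on every receive: loop until the requested number of bytes has arrived, because a single read() may return fewer. The function name read_exact is hypothetical and this is not the project's code; it only shows the kind of detail that libcurl already handles.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Read exactly len bytes from fd into buf.
 * Returns 0 on success, -1 on a read error or if the peer closes early. */
static int read_exact(int fd, void *buf, size_t len) {
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1; /* end of stream before the full message arrived */
        p += n;
        len -= (size_t) n;
    }
    return 0;
}
```

Even with this loop, timeouts and chunked transfer encoding would still need separate handling, which supports the case for relying on libcurl.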

Support DO in plcontainer.

See below.

postgres=# DO LANGUAGE plpythonu $$
# container: plc_python_shared
print 1;
$$;
DO
postgres=#

postgres=# DO LANGUAGE plcontainer $$
# container: plc_python_shared
print 1;
$$;
ERROR:  language "plcontainer" does not support inline code execution


Clients should have a way to set the log level

Currently clients have no way to filter log messages by level.

We should allow the level to be set. Typical solutions include setting the level via a client argument or via a GUC (passed through an environment variable and/or a message).
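One of the options mentioned above, passing the level through an environment variable, could look roughly like the following. Everything here is an assumption made for illustration: the variable name PLC_CLIENT_LOG_LEVEL, the level names, and the function names are hypothetical and do not exist in plcontainer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical client-side log levels, ordered from least to most severe. */
typedef enum {
    PLC_LOG_DEBUG = 0,
    PLC_LOG_INFO,
    PLC_LOG_WARNING,
    PLC_LOG_ERROR
} plc_log_level;

/* Map a level name to its enum value; unknown or missing names fall back. */
static plc_log_level parse_log_level(const char *name, plc_log_level fallback) {
    static const char *const names[] = { "debug", "info", "warning", "error" };

    if (name == NULL)
        return fallback;
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (strcmp(name, names[i]) == 0)
            return (plc_log_level) i;
    }
    return fallback;
}

/* Read the threshold from the (hypothetical) environment variable. */
static plc_log_level client_log_level(void) {
    return parse_log_level(getenv("PLC_CLIENT_LOG_LEVEL"), PLC_LOG_WARNING);
}

/* A message is emitted only if it is at least as severe as the threshold. */
static int should_log(plc_log_level threshold, plc_log_level level) {
    return level >= threshold;
}
```

The backend could set the variable when it creates the container, so the client would pick up the level at startup without any protocol change.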

spi free plan and spi execute with plan should sanity-check the plan.

Before spi execute with plan, we should validate the plan pointer, which comes from client code. A typical solution is to save previously prepared spi plans and check the pointer against them. See the FIXME:

               /* FIXME: Sanity-check is needed!
+                * Maybe hash-store plan pointers for quick search?
+                * Or use array since we need to free all plans when backend quits.
+                * Or both?
+                */
+               plc_plan = (plcPlan *) ((char *) msg->pplan - offsetof(plcPlan, plan));
+               if (plc_plan->nargs != msg->nargs) {
+                   elog(ERROR, "argument number wrong for execute with plan: "
+                       "Saved number (%d) vs transferred number (%d)",
+                       plc_plan->nargs, msg->nargs);

The same check is needed for plan free.
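The "save plans and check" idea from the FIXME might be sketched as a small registry of pointers that the backend handed out, consulted before any pointer received from the client is dereferenced. The names (plan_registry, plan_register, plan_is_known, plan_forget) and the fixed capacity are hypothetical. A linear array is used here for brevity; a hash table, as the FIXME suggests, would scale better.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PLANS 64 /* arbitrary capacity for this sketch */

/* Pointers to plans that the backend itself created and handed out. */
typedef struct {
    const void *plans[MAX_PLANS];
    size_t count;
} plan_registry;

/* Remember a newly prepared plan. Fails if the registry is full. */
static bool plan_register(plan_registry *reg, const void *plan) {
    if (plan == NULL || reg->count == MAX_PLANS)
        return false;
    reg->plans[reg->count++] = plan;
    return true;
}

/* Check a pointer received from the client before using it. */
static bool plan_is_known(const plan_registry *reg, const void *plan) {
    for (size_t i = 0; i < reg->count; i++) {
        if (reg->plans[i] == plan)
            return true;
    }
    return false;
}

/* Drop a plan on free; returns false if the pointer was never registered. */
static bool plan_forget(plan_registry *reg, const void *plan) {
    for (size_t i = 0; i < reg->count; i++) {
        if (reg->plans[i] == plan) {
            reg->plans[i] = reg->plans[--reg->count]; /* swap-remove */
            return true;
        }
    }
    return false;
}
```

Keeping the pointers in an array also makes it straightforward to free every remaining plan when the backend quits.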

Need to make sure only one client runs in a container.

I ran the client program manually after entering the container and saw this:
[root@16bb7a989b05 share]# ./pyclient
plcontainer log: pythonclient, gpadmin, postgres, 11275, LOG: Client has started execution at Fri Feb 9 02:11:44 2018
plcontainer log: pythonclient, gpadmin, postgres, 11275, ERROR: Socket timeout - no client connected within 20 seconds

Typically a file (for example, a lock file) would be an easy way to implement this.
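A minimal sketch of the file-based approach, assuming an advisory lock taken with flock() on a well-known path inside the container. The function name and the idea of keeping the descriptor open for the lifetime of the client are assumptions of this example, not existing plcontainer behavior.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Try to become the only client holding the lock at path.
 * Returns an open descriptor that holds the lock (keep it open while the
 * client runs), or -1 if another client already holds it or on error. */
static int acquire_single_instance_lock(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* LOCK_NB makes the call fail immediately instead of waiting. */
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The kernel releases the lock when the holding process exits, so a crashed client does not leave a stale lock behind.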

Enhance logging on the QE side.

e.g. add plcontainer meta information to the log, remove or replace debug_print(), and maybe add a GUC to control the plcontainer log level.

Need to test memory leak.

Clients do not have memory contexts, so we should watch for memory leaks in the client code. It would be better to have software infrastructure that makes leak testing easy.

Should use fixed-size types for python/r conversion functions.

e.g. in the code below, the store should be written as *((int16 *) out) = ... rather than through a short pointer.

static int plc_pyobject_as_int2(PyObject *input, char **output, plcPyType *type UNUSED) {
    int res = 0;
    char *out = (char *) malloc(2);
    *output = out;
    if (PyInt_Check(input))
        *((short *) out) = (short) PyInt_AsLong(input);
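As an illustration of the fixed-size idea without the Python headers, the sketch below narrows a long (the type that PyInt_AsLong returns) to int16_t with an explicit range check, and sizes the buffer from the type instead of a literal 2. The function name store_int16 is hypothetical, and the range check is an addition of this sketch rather than something the issue asks for.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Store value as a 16-bit integer in a newly allocated buffer.
 * Returns 0 on success; -1 if value does not fit or allocation fails,
 * in which case *output is left untouched. */
static int store_int16(long value, char **output) {
    if (value < INT16_MIN || value > INT16_MAX)
        return -1;

    int16_t narrowed = (int16_t) value;
    char *out = malloc(sizeof narrowed); /* size follows the type */
    if (out == NULL)
        return -1;

    /* memcpy avoids casting the char buffer to another pointer type. */
    memcpy(out, &narrowed, sizeof narrowed);
    *output = out;
    return 0;
}
```

The caller owns the returned buffer and is expected to free it.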

Update README.md

plcontainer has changed a lot recently, so the documentation is badly out of date and README.md needs an update. At a minimum it should cover:

  1. Docker API version and OS version (we can run on both centos/rhel 6 and 7).
  2. There is no "plcontainer configure" command.
  3. Whether vagrant is needed. At the least, the doc should be friendly to users who already have a Linux environment.
  4. Detailed step-by-step setup/build/test/run documentation.

Wrong path when plcontainer install R image

When running

plcontainer install -n plc_r_shared -i /usr/local/greenplum-db-devel/share/postgresql/plcontainer/plcontainer-r-images.tar.gz -c pivotaldata/plcontainer_r_shared:devel;

the host path is set to $GPHOME/bin/pyclient, but it should be $GPHOME/bin/rclient.

Harden the shared path/file access.

Currently we create a shared path for the unix domain socket when creating a container. This path is writable, so in theory client code could write under it, which is a security concern. Potential solutions include: call setuid to switch to a less-privileged user after the client initialization code runs, so that client code has no permission to write under the path; or set a quota/limit on the path/directory, if there is a feasible and simple way to do so.

We also keep log files in the new shared path set in the default configuration, but this should be easy to resolve, since there appear to be sophisticated solutions for container logging.

Allow a spi execute function with limit in R

Currently R spi execute does not accept a limit. Upstream does not seem to have this functionality, but we could provide it with very little effort, since the shared code (for both python and r) already supports it.

Need to retry on backend API errors.

We've seen cases where docker fails to start a container during "medium-concurrency" testing (200-300 docker create/start). From the user's perspective, we should retry instead of simply reporting a failure.
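A retry wrapper along these lines could be as small as the sketch below, which retries a failing call a bounded number of times and doubles the delay between attempts. The callback type, the function names, and the choice of exponential backoff are all assumptions for illustration. fail_until_zero is only a stand-in for a real docker API request.

```c
#include <stddef.h>
#include <time.h>

/* A backend request: returns 0 on success, non-zero on failure. */
typedef int (*backend_call)(void *ctx);

static void sleep_ms(unsigned ms) {
    struct timespec ts = { ms / 1000, (long) (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Call fn up to max_attempts times, doubling the delay after each failure.
 * Returns 0 as soon as one attempt succeeds, else the last error code. */
static int retry_backend_call(backend_call fn, void *ctx,
                              int max_attempts, unsigned base_delay_ms) {
    int rc = -1;
    unsigned delay_ms = base_delay_ms;

    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        rc = fn(ctx);
        if (rc == 0)
            return 0;
        if (attempt < max_attempts) {
            sleep_ms(delay_ms);
            delay_ms *= 2;
        }
    }
    return rc;
}

/* Stand-in for a flaky request: fails until the counter reaches zero. */
static int fail_until_zero(void *ctx) {
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -1;
    }
    return 0;
}
```

With a counter of 2 and max_attempts of 5, the wrapper succeeds on the third attempt. Only errors that are plausibly transient should be retried this way; a missing image, for example, would fail again regardless.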

Add support for SPI_OK_INSERT, SPI_OK_DELETE and SPI_OK_UPDATE

For insert/delete/update, we currently support only the "returning" variants, although gpdb does not currently seem to support them. We should also support the "non-returning" variants in this function.

            switch (retval) {
            case SPI_OK_SELECT:
            case SPI_OK_INSERT_RETURNING:
            case SPI_OK_DELETE_RETURNING:
            case SPI_OK_UPDATE_RETURNING:
                /* some data was returned back */
                result = (plcMessage*)create_sql_result();
                break;
            default:
                elog(ERROR, "Cannot handle sql ('%s') with fn_readonly (%d) "
                     "and limit (%lld). Returns %d", msg->statement,
                     pinfo->fn_readonly, msg->limit, retval);
                break;
            }
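One way to picture the requested change is to split the status codes into three groups: statements that return rows, statements that only report an affected-row count, and everything else. The sketch below uses stand-in enum values (DEMO_OK_*) rather than the real SPI_OK_* constants so that it compiles without the PostgreSQL headers. The result_kind type and classify_status are hypothetical, and how plcontainer would actually report the count back to the client is left open.

```c
/* Stand-ins for the SPI_OK_* status codes (not the real values). */
typedef enum {
    DEMO_OK_SELECT,
    DEMO_OK_INSERT,
    DEMO_OK_DELETE,
    DEMO_OK_UPDATE,
    DEMO_OK_INSERT_RETURNING,
    DEMO_OK_DELETE_RETURNING,
    DEMO_OK_UPDATE_RETURNING,
    DEMO_OK_UTILITY
} demo_status;

/* What the handler should send back to the client. */
typedef enum {
    RESULT_ROWS,       /* a row set, as in the existing code path */
    RESULT_ROW_COUNT,  /* only the number of affected rows */
    RESULT_UNSUPPORTED /* keep raising an error */
} result_kind;

static result_kind classify_status(demo_status status) {
    switch (status) {
    case DEMO_OK_SELECT:
    case DEMO_OK_INSERT_RETURNING:
    case DEMO_OK_DELETE_RETURNING:
    case DEMO_OK_UPDATE_RETURNING:
        return RESULT_ROWS;
    case DEMO_OK_INSERT:
    case DEMO_OK_DELETE:
    case DEMO_OK_UPDATE:
        /* New branch: no rows come back, only a count. */
        return RESULT_ROW_COUNT;
    default:
        return RESULT_UNSUPPORTED;
    }
}
```

In the real handler the count for the new branch would presumably come from SPI_processed, with the default branch continuing to raise the existing error.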
