fats's People

Contributors

dependabot-preview[bot], dependabot[bot], dprotaso, ekcasey, ericbottard, fbiville, freynca, glyn, jchesterpivotal, scothis, trisberg

fats's Issues

Please configure GITBOT

Pivotal provides the Gitbot service to synchronize issues and pull requests made against public GitHub repos with Pivotal Tracker projects.

If you are a Pivotal employee, you can configure Gitbot to sync your GitHub repo to your Pivotal Tracker project with a pull request.

Steps:

  • Fork this repo: cfgitbot-config (an ask+cf@ ticket is the fastest way to get read access if you get a 404)
  • Add the Toolsmiths-Bots team to have admin access to your repo
  • Add the cf-gitbot ([email protected]) user to have owner access to your Pivotal Tracker project
  • Add your new repo and/or project to the config-production.yml file
  • Submit a PR, which will get auto-merged if you've done it right

If you are not a Pivotal employee, you can request that [email protected] set up the integration for you.

You might also be interested in configuring GitHub's Service Hook for Tracker on your repo so you can link your commits to Tracker stories. You can do this yourself by following the directions at:

https://www.pivotaltracker.com/blog/guide-githubs-service-hook-tracker/

If you do not want to use Pivotal Tracker to manage this GitHub repo, please add this repo to the Ignored repositories list.

If there are any questions, please reach out to [email protected].

Wait for all knative deployments to be running

The knative serving webhook was registered but not yet running, which caused resource creation to fail. We should wait for all deployments to be ready before proceeding.

$ kubectl get deployments --all-namespaces
NAMESPACE          NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
istio-system       istio-citadel                1         1         1            1           1m
istio-system       istio-egressgateway          1         1         1            1           1m
istio-system       istio-galley                 1         1         1            1           1m
istio-system       istio-ingressgateway         1         1         1            1           1m
istio-system       istio-pilot                  1         1         1            1           1m
istio-system       istio-policy                 1         1         1            1           1m
istio-system       istio-sidecar-injector       1         1         1            1           1m
istio-system       istio-statsd-prom-bridge     1         1         1            1           1m
istio-system       istio-telemetry              1         1         1            1           1m
istio-system       knative-ingressgateway       1         1         1            1           13s
knative-build      build-controller             1         1         1            1           14s
knative-build      build-webhook                1         1         1            1           14s
knative-eventing   eventing-controller          1         1         1            1           12s
knative-eventing   stub-clusterbus-dispatcher   1         1         1            1           6s
knative-eventing   webhook                      1         1         1            0           12s
knative-serving    activator                    1         1         1            1           13s
knative-serving    autoscaler                   1         1         1            0           13s
knative-serving    controller                   1         1         1            1           13s
knative-serving    webhook                      1         1         1            0           13s
kube-system        event-exporter-v0.2.1        1         1         1            1           3m
kube-system        fluentd-gcp-scaler           1         1         1            1           3m
kube-system        heapster-v1.5.3              1         1         1            1           3m
kube-system        kube-dns                     2         2         2            2           3m
kube-system        kube-dns-autoscaler          1         1         1            1           3m
kube-system        l7-default-backend           1         1         1            1           3m
kube-system        metrics-server-v0.2.1        1         1         1            1           3m

$ ./run.sh
Current function scenario: uppercase
~/gopath/src/github.com/projectriff/fats/functions/uppercase/command ~/gopath/src/github.com/projectriff/fats
[2018-10-04T21:33:27Z] Creating fats-uppercase-command as command:
Error: Internal error occurred: failed calling admission webhook "webhook.serving.knative.dev": Post https://webhook.knative-serving.svc:443/?timeout=30s: no endpoints available for service "webhook"
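One way to implement the wait is a small polling loop. The sketch below is not taken from FATS: the function names `unavailable_deployments` and `wait_for_deployments` are hypothetical, and the parsing assumes the column layout shown in the listing above (other kubectl versions may print a different table).

```shell
# Print "namespace/name" for every deployment whose AVAILABLE count is below
# DESIRED. Reads `kubectl get deployments --all-namespaces` output on stdin and
# assumes the column order shown above (DESIRED is column 3, AVAILABLE column 6).
unavailable_deployments() {
  awk 'NR > 1 && ($6 + 0) < ($3 + 0) { print $1 "/" $2 }'
}

# Poll until every deployment is available, or give up after a number of tries.
# Usage: wait_for_deployments [attempts] [delay_seconds]
wait_for_deployments() {
  local attempts="${1:-60}" delay="${2:-5}" table="" pending="" i=0
  while [ "$i" -lt "$attempts" ]; do
    # A failed kubectl call counts as "not ready yet" rather than as success.
    if table="$(kubectl get deployments --all-namespaces)"; then
      pending="$(printf '%s\n' "$table" | unavailable_deployments)"
      [ -z "$pending" ] && return 0
      echo "Waiting for deployments:" $pending >&2
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for deployments:" $pending >&2
  return 1
}
```

Calling `wait_for_deployments` before `./run.sh` would have held the run until the two `webhook` deployments and `autoscaler` reported an available replica.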

Remove "user" from registry prefix

The "user" portion of the registry prefix is a holdover from when riff namespace init had a --registry-user flag. This was never part of the registry API and has been correctly removed from riff. We should also remove it from FATS.

Cleanup Travis specific idioms

We're now using Azure Pipelines for our builds instead of Travis, but we still use a number of Travis idioms like travis_retry, travis_fold and some TRAVIS_* environment variables.

We should convert retry and fold into generic bash functions and eliminate assumptions about TRAVIS env vars existing. We should be careful not to couple to any Azure-specific functionality, as we may still want to run builds on Travis (or another environment) in the future. For example, the fold function should continue to work as-is on Travis, but not pollute the log output when running on AZP.
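A minimal sketch of what such generic helpers could look like follows. The function names (`retry`, `fold_start`, `fold_end`) and the `RETRY_ATTEMPTS` and `RETRY_DELAY` variables are assumptions for illustration. The only Travis-specific parts are the `TRAVIS=true` check and the `travis_fold` markers, which are emitted only when running on Travis.

```shell
# Run a command up to RETRY_ATTEMPTS times (default 3), pausing RETRY_DELAY
# seconds (default 1) between tries. Both variable names are hypothetical.
retry() {
  local max="${RETRY_ATTEMPTS:-3}" delay="${RETRY_DELAY:-1}" attempt=1
  while true; do
    "$@" && return 0
    if [ "$attempt" -ge "$max" ]; then
      echo "Command failed after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Open a named log section. On Travis a fold marker is written first so the
# section collapses in the log viewer; elsewhere only a plain heading is printed.
fold_start() {
  if [ "${TRAVIS:-}" = "true" ]; then
    printf 'travis_fold:start:%s\r' "$1"
  fi
  printf '%s\n' "--- $1"
}

# Close a named log section. Prints nothing outside Travis.
fold_end() {
  if [ "${TRAVIS:-}" = "true" ]; then
    printf 'travis_fold:end:%s\r' "$1"
  fi
}
```

A script would then wrap a flaky step as `fold_start install; retry ./install.sh; fold_end install`, where `./install.sh` stands in for any real step.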

Cleanup gke resources

After deleting a GKE cluster, there are a number of orphaned resources including health checks and persistent volumes. We either need to figure out how to get these resources to be cleaned up automatically or purge them after the cluster is destroyed.

Pick a random region and zone for GKE clusters

We have been hit by STOCKOUT errors a couple of times when trying to create a GKE cluster. FATS is currently hard-coded to us-central1-a. At a minimum, we should pick a zone at random within the region, and probably pick the region dynamically as well. This will make retries much more likely to succeed. In the future, we could detect the STOCKOUT error and try a new zone within the same job.

STOCKOUT errors will also affect PKS on GCP. However, PKS, rather than the client, chooses where to place the new cluster...
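For the GKE case, random selection could be as simple as the sketch below. The `pick_random` helper and the `CANDIDATE_ZONES` list are illustrative assumptions; a real list would be limited to regions where the project has quota.

```shell
# Print one line chosen at random from stdin (prints nothing for empty input).
# awk's srand() seeds from the clock, so two calls within the same second
# may return the same line.
pick_random() {
  awk 'BEGIN { srand() }
       { lines[NR] = $0 }
       END { if (NR > 0) print lines[int(rand() * NR) + 1] }'
}

# Illustrative candidates only, not a list used by FATS.
CANDIDATE_ZONES="us-central1-a us-central1-b us-central1-c us-central1-f us-east1-b us-east1-c us-east1-d us-west1-a us-west1-b us-west1-c"

# Word splitting on the unquoted variable is intentional: one zone per line.
zone="$(printf '%s\n' $CANDIDATE_ZONES | pick_random)"
region="${zone%-*}"   # strip the trailing "-a", "-b", ... to get the region
echo "Creating cluster in zone ${zone} (region ${region})"
```

The chosen `zone` would then be passed to whatever command creates the cluster, in place of the hard-coded value.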

Consider a go-based test harness

FATS is a pile of bash scripts. Bash is useful for quick and dirty scripts, but it does not scale well as more people need to interact with, consume, and author the system, especially as bash is not a common language across the team and no one is particularly strong in it (or admits to being strong).

There are a few key aspects of FATS that I'd like to preserve:

  • library of capabilities that are composed by the component under test to fit its specific needs
  • platform/os agnostic (runs on Linux and Windows)
  • extensible to add new tools, cluster, registries, workloads (functions, applications) and macros
  • versioned so that consumers are not surprised by changes

Some things we can do better:

  • explicit interfaces: many of the current contracts are loose; a typed language can have typed interfaces
  • extensibility: end-users should be able to extend any extensible element of the system as if it were first-party (like adding a custom tool or cluster)
  • configurability: many components offer some configurability, but it's non-uniform (like tool versions). Other aspects should be configurable, but are not (like k8s version)
  • local development: most tests will run in CI, but while developing and debugging, local support is crucial #249

Delete a GKE cluster if it exists before creating a new GKE cluster

Stalled jobs can orphan a GKE cluster. Retriggering the job will fail when it attempts to create a new cluster, but the name already exists. Before we create a new GKE cluster we should check if there is an existing cluster and if so, delete it.

This behavior only affects Travis because it reuses build ids when retriggering.
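A sketch of the check, assuming the cluster name and zone are passed in by the caller (the function name `delete_cluster_if_exists` is hypothetical):

```shell
# Delete a GKE cluster if one with this name already exists in the zone.
# `describe` exits non-zero when the cluster cannot be found, so it doubles as
# an existence check; `--quiet` skips the interactive confirmation prompt.
# Caveat: `describe` also fails on auth or network errors, which this sketch
# treats the same as "no such cluster".
delete_cluster_if_exists() {
  local name="$1" zone="$2"
  if gcloud container clusters describe "$name" --zone "$zone" >/dev/null 2>&1; then
    echo "Cluster ${name} already exists in ${zone}; deleting it first"
    gcloud container clusters delete "$name" --zone "$zone" --quiet
  fi
}
```

It would run immediately before cluster creation, for example `delete_cluster_if_exists "$CLUSTER_NAME" "$ZONE"`, where both variable names stand in for whatever the job already uses.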

unpin gcloud version

217.0.0-0 was a bag of hurt, so we're currently pinned to 216.0.0-0. At some point we should get back in sync with the latest.

Run FATS after successful build from master branch of dependency repos

Currently, FATS runs once per day via cron. It would be nice to detect issues more quickly by running FATS after every successful master branch build of a dependency repo. We can use the Travis API to trigger builds from each repo:

  • projectriff/riff
  • projectriff/command-function-invoker
  • projectriff/go-function-invoker
  • projectriff/java-function-invoker
  • projectriff/node-function-invoker
  • projectriff/python2-function-invoker
  • projectriff/python3-function-invoker
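Each of the repos above could run something like the following after a successful master build. This is a sketch against the Travis API v3 "create request" endpoint. `TRAVIS_API_URL` and `TRAVIS_API_TOKEN` are hypothetical variable names, the default host is an assumption, and the token would need to be stored as a secret in each repo.

```shell
# Ask Travis to start a build of projectriff/fats on its master branch.
# $1 is an optional label for the triggering repo, used only in the message.
trigger_fats_build() {
  local api="${TRAVIS_API_URL:-https://api.travis-ci.com}"
  local slug="projectriff%2Ffats"   # the "/" in the repo slug must be URL-encoded
  local source_repo="${1:-upstream}"
  local body='{"request": {"branch": "master", "message": "Triggered by '"$source_repo"'"}}'
  curl -sS --fail -X POST \
    -H "Content-Type: application/json" \
    -H "Accept: application/json" \
    -H "Travis-API-Version: 3" \
    -H "Authorization: token ${TRAVIS_API_TOKEN}" \
    -d "$body" \
    "${api}/repo/${slug}/requests"
}
```

For example, the riff repo would call `trigger_fats_build projectriff/riff` as its last build step on master.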

Create a PKS cluster per run

Currently we share a single PKS cluster for all jobs. Because riff uses a number of cluster scoped resources, we cannot run multiple jobs against a single cluster concurrently. Moreover, there is a risk of leaking state between runs causing false test results.

We can create a new cluster for each run so long as we are willing to wait for the cluster to be provisioned (about an hour right now 🤢) and set up a load balancer to target the master node.

Note: Travis builds will time out after 10 minutes of inactivity. We can prefix long-running tasks with travis_wait 70, where 70 is the number of minutes to wait.

Restore full job concurrency

We limited the job concurrency to 1 because we are sharing a single PKS cluster for builds. Once we are creating a PKS cluster per job, we can remove the concurrency limits.

Needs #52

Gracefully run tests locally

FATS tests assume they are running on a clean CI machine. They make liberal use of sudo and system directories. While shared, external resources are cleaned up, there is minimal attempt made to clean up local resources, or to create resources in a way that won't impact the broader environment.

Specific, actionable issues should be created and worked based on the pains faced by riff developers testing riff with FATS.

Decouple test runner from gcloud

We run FATS on GKE via Travis CI, but it should also be easy for someone to configure a local cluster and run the core of the test suite against that cluster. Decoupling functions/run.sh will make it easier to smoke test various Kubernetes distributions, even if we don't run them automatically (minikube on Travis is a bit rough). We could also decide to run on other managed k8s environments via Travis.

Add tests for streaming functions

The java and node invokers currently support streaming functions in addition to request-reply. FATS should test these flavors of functions.

Decouple image registries from k8s runtime

Any k8s runtime should be able to use any image registry. We should be able to mix and match the registry with the k8s runtime more easily.

Right now we assume:

  • Minikube -> Docker Hub
  • GKE -> GCR
  • PKS -> GCR
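One possible shape for the decoupling is to resolve the image prefix from a registry setting alone, so the cluster choice never enters into it. Everything named here (`image_prefix`, `REGISTRY`, `GCP_PROJECT`, `DOCKER_USERNAME`) is hypothetical and only illustrates the idea.

```shell
# Resolve the image prefix from the registry name alone, independent of which
# k8s runtime the tests target. All names here are hypothetical.
image_prefix() {
  case "$1" in
    gcr)       echo "gcr.io/${GCP_PROJECT:-}" ;;
    dockerhub) echo "docker.io/${DOCKER_USERNAME:-}" ;;
    *)         echo "Unknown registry: $1" >&2; return 1 ;;
  esac
}

# The registry is selected by its own setting, so any pairing is possible,
# for example minikube with GCR or GKE with Docker Hub.
REGISTRY="${REGISTRY:-gcr}"
```

A run script could then set `IMAGE_PREFIX="$(image_prefix "$REGISTRY")"` regardless of whether the cluster is Minikube, GKE or PKS.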

Add support for multiple functions

There's a lot of useful logic in the uppercase/run.sh script that should also be used for other functions. It should be abstracted out to support other functions without a lot of duplication.
