The Panoptes Image Analysis Algorithm (PIAA) repository contains the data processing pipeline and algorithms for the images taken by PANOPTES observatories. PIAA currently handles storing the raw data, processing intermediate data and storing final data products. Eventually it will also analyze final data products and store results.
For a detailed explanation of the system, design choices and alternatives, see the Design Doc.
#### Data Simulator ####
Produces simulated light curves and Postage Stamp Cubes (PSCs) and uploads them to a Google Cloud Storage (GCS) bucket.

#### App Engine Notification Proxy ####
Receives notifications when new objects are added to the GCS bucket and pushes them to the Kubernetes cluster.

#### Kubernetes Cluster ####
Cluster of nodes managed using Kubernetes on Google Container Engine (GKE) and Docker. A Flask server on each pod receives notifications from App Engine and spawns subprocesses to combine the stored light curves into master light curves for each Panoptes Input Catalog (PIC) star.

#### Google Cloud Storage ####
All simulated data inputs and products (light curves, PSCs and master light curves) are currently stored in a GCS bucket.
GCS has a feature called Object Change Notifications. When an object in a watched bucket changes, GCS sends an HTTP POST request containing that object's metadata to a given URL. To set up this notification channel, run
gsutil notification watchbucket -i <channel-name> <app-url> gs://<bucket-name>
Currently, a channel named panoptes-simulated-data-channel sends notifications to the App Engine proxy at https://notification-proxy-dot-panoptes-survey.appspot.com/ when changes occur in the bucket panoptes-simulated-data.
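
To illustrate what a receiver of these notifications has to handle, here is a minimal Python sketch. It is not the actual proxy code. It assumes the documented Object Change Notification format, in which the event type arrives in the `X-Goog-Resource-State` header (`sync`, `exists` or `not_exists`) and the object metadata arrives as a JSON body. The function name `changed_object` is hypothetical.

```python
import json


def changed_object(headers, body):
    """Return (bucket, name) for a new or updated object, else None.

    Hypothetical helper: `headers` is a dict of HTTP request headers and
    `body` is the raw request body as a string.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    state = lowered.get("x-goog-resource-state")

    # "sync" is sent once when the channel is created and "not_exists"
    # signals a deletion; only "exists" describes a new or updated object.
    if state != "exists":
        return None

    metadata = json.loads(body)
    return metadata["bucket"], metadata["name"]
```

A receiver would typically still answer the `sync` request with a success status, since that request only confirms that the channel was created.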
Clone the source code from the Google Cloud repository into a local directory named notification-proxy. Make sure the App Engine SDK is installed. Deploy using
appcfg.py update -A panoptes-survey .
In your local PIAA directory, make desired changes. Make sure kubectl, Docker and the Cloud SDK are installed. Then run
docker build -t gcr.io/panoptes-survey/piaa:<version> .
gcloud docker push gcr.io/panoptes-survey/piaa:<version>
kubectl set image deployment/combiner piaa=gcr.io/panoptes-survey/piaa:<version>
where <version> is the new version number of the image. The existing versions can be found on Google Container Registry.
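
Because the same image tag has to appear in all three commands, it can help to generate them from a single version string. The following is a sketch under that assumption, not part of the repository. The helper name `deploy_commands` is hypothetical, and the image path and deployment name are copied from the commands above.

```python
IMAGE_REPO = "gcr.io/panoptes-survey/piaa"  # image path used in the commands above


def deploy_commands(version):
    """Return the build, push and rollout commands for one image version.

    Hypothetical helper: each command is an argument list that could be
    passed to subprocess.run(cmd, check=True) from the PIAA directory.
    """
    image = f"{IMAGE_REPO}:{version}"
    return [
        ["docker", "build", "-t", image, "."],
        ["gcloud", "docker", "push", image],
        ["kubectl", "set", "image", "deployment/combiner", f"piaa={image}"],
    ]
```

Returning argument lists instead of running the commands directly keeps the helper usable as a dry run: the commands can be printed and reviewed before anything is pushed.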
To change the number of nodes in the cluster, run
gcloud container clusters resize <cluster_name> --size <num_nodes>
where <cluster_name> is the name of the GKE cluster, currently image-analysis-cluster.
To change the number of replicated pods that run on the nodes, run
kubectl scale deployment/<deployment_name> --replicas=<num_pods>
where <deployment_name> is the name of the Deployment, currently combiner.
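
The two scaling commands can also be assembled programmatically, which makes it easy to reject an invalid size before anything is sent to the cluster. This is a sketch only. The helper name `scale_commands` is hypothetical, and its defaults are the cluster and Deployment names mentioned above.

```python
def scale_commands(num_nodes=None, num_pods=None,
                   cluster_name="image-analysis-cluster",
                   deployment_name="combiner"):
    """Return the commands needed to resize the cluster and/or the Deployment.

    Hypothetical helper: pass only the value you want to change. Each
    command is an argument list suitable for subprocess.run(cmd, check=True).
    """
    # Validate both sizes up front so that no command is built from bad input.
    for label, value in (("num_nodes", num_nodes), ("num_pods", num_pods)):
        if value is not None and (not isinstance(value, int) or value < 0):
            raise ValueError(f"{label} must be a non-negative integer")

    commands = []
    if num_nodes is not None:
        commands.append(["gcloud", "container", "clusters", "resize",
                         cluster_name, "--size", str(num_nodes)])
    if num_pods is not None:
        commands.append(["kubectl", "scale", f"deployment/{deployment_name}",
                         f"--replicas={num_pods}"])
    return commands
```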
Clone this repository to a directory named PIAA on your local machine and run
pip install -r requirements.txt
to install the dependencies. Set up everything as in the Coding in PANOPTES wiki, and add the environment variable
PIAA=$PANDIR/PIAA
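
Code that depends on this variable can resolve it defensively. The sketch below is an illustration, not code from the repository. The function name `piaa_dir` is hypothetical, and the fallback to `$PANDIR/PIAA` simply mirrors the assignment above.

```python
import os


def piaa_dir(environ=os.environ):
    """Return the PIAA directory, preferring $PIAA over $PANDIR/PIAA.

    Hypothetical helper: `environ` defaults to the process environment
    but can be any mapping, which makes the lookup easy to test.
    """
    if environ.get("PIAA"):
        return environ["PIAA"]

    pandir = environ.get("PANDIR")
    if not pandir:
        raise EnvironmentError("Set PIAA, or set PANDIR so that PIAA=$PANDIR/PIAA")
    return os.path.join(pandir, "PIAA")
```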
For development purposes, a Google Compute Engine (GCE) instance can be used. The instance piaa-instance has been used for development and contains the PIAA and POCS repos, though they likely need to be updated. The URL that App Engine sends notifications to must be changed to the external IP of the instance (set listener='GCE').
The logs can be monitored to see how system components are running. Here is where to find the logs for the App Engine notification proxy and the logs for the Container Engine cluster.