This is a demo application for Cloud Native Batch presentation highlighting use of spring technologies and Pivotal Application Services to deploy batch apps in a cloud native way.
Inspired by: Need 24x7 ETL? Then Move to Cloud Native File Ingest with Spring Cloud Data Flow
This was first demoed at Spring One Tour Philadelphia and the slides can be accessed here.
Common commands while working with the app.
If you want to get all the spring cloud data flow start apps, then use this
app import https://dataflow.spring.io/rabbitmq-maven-latest
cf create-service p-dataflow standard data-flow -c '{"concurrent-task-limit": 2, "scheduler": {"name": "scheduler-for-pcf", "plan": "standard"},"maven.remote-repositories.bintray.url": "https://dl.bintray.com/dpfeffer/maven-repo"}'
cf create-service p-service-registry standard discovery-server
cf create-service p-config-server standard config-server -c '{"git":{"uri":"https://github.com/doddatpivotal/dragonstone-finance.git","searchPaths":"dragonstone-finance-config","label":"master"}}'
cf create-service nfs Existing volume-service -c '{"share":"nfs-pcf.pez.pivotal.io/pcfone/dpfeffer","uid":"$EMPID","gid":"$EMPID", "mount":"/var/scdf"}'
cf create-service p.mysql db-small app-db
cf dataflow-shell data-flow
app register trades-loader --type task --uri maven://io.pivotal.dragonstone-finance:trades-loader:0.0.24
app default --id task:trades-loader --version 0.0.24
app info trades-loader --type task
task create trades-loader-task --definition "trades-loader --spring.cloud.task.batch.jobNames=tradesLoaderJob --io.pivotal.dataflow-db-service-name=relational-e918cdd7-4b79-49aa-945c-ecace0a007b7"
task create a-trades-extractor-task --definition "trades-loader --spring.cloud.task.batch.jobNames=aRatedTradesExtractorJob --io.pivotal.dataflow-db-service-name=relational-e918cdd7-4b79-49aa-945c-ecace0a007b7"
task validate trades-loader-task
task launch trades-loader-task --arguments "localFilePath=classpath:data.csv --spring.cloud.task.batch.jobNames=tradesLoaderJob" --properties "deployer.trades-loader.cloudfoundry.services=app-db,config-server,discovery-server,volume-service deployer.trades-loader.memory=768"
app register ratings-loader --type task --uri maven://io.pivotal.dragonstone-finance:ratings-loader:0.0.26
app default --id task:ratings-loader --version 0.0.26
app info ratings-loader --type task
task create ratings-loader-task --definition "ratings-loader --io.pivotal.dataflow-db-service-name=relational-e918cdd7-4b79-49aa-945c-ecace0a007b7"
task validate ratings-loader-task
task launch ratings-loader-task --arguments "localFilePath=classpath:data.csv --spring.cloud.task.batch.failOnJobFailure=true" --properties "deployer.ratings-loader.cloudfoundry.services=app-db,config-server,volume-service deployer.ratings-loader.memory=768"
app register task-launcher-dataflow --type sink --uri maven://org.springframework.cloud.stream.app:task-launcher-dataflow-sink-rabbit:1.0.1.RELEASE
app register sftp-dataflow-persistent-metadata --type source --uri maven://io.pivotal.dragonstone-finance:sftp-dataflow-source-rabbit:2.1.7
app default --id source:sftp-dataflow-persistent-metadata --version 2.1.7
stream create inbound-sftp-trades --definition "sftp-dataflow-persistent-metadata --password=<PASSWORD> >:task-launcher-dataflow-destination"
stream create inbound-sftp-ratings --definition "sftp-dataflow-persistent-metadata --password=<PASSWORD> > :task-launcher-dataflow-destination"
stream create process-task-launch-requests --definition ":task-launcher-dataflow-destination > task-launcher-dataflow --spring.cloud.dataflow.client.server-uri=https://dataflow-server.apps.pcfone.io"
stream deploy inbound-sftp-trades --properties "deployer.sftp-dataflow-persistent-metadata.memory=768,deployer.sftp-dataflow-persistent-metadata.cloudfoundry.services=app-db,config-server,volume-service"
stream deploy inbound-sftp-ratings --properties "deployer.sftp-dataflow-persistent-metadata.memory=768,deployer.sftp-dataflow-persistent-metadata.cloudfoundry.services=app-db,config-server,volume-service"
stream deploy process-task-launch-requests --properties "deployer.task-launcher-dataflow.memory=768"
For trades-loader
mvn clean deploy -Ddistribution.management.release.id=bintray -Ddistribution.management.release.url=https://api.bintray.com/maven/dpfeffer/maven-repo/trades-loader
remember to publish
For ratings-loader
mvn clean deploy -Ddistribution.management.release.id=bintray -Ddistribution.management.release.url=https://api.bintray.com/maven/dpfeffer/maven-repo/ratings-loader
remember to publish
For sftp-dataflow-source-rabbit
mvn deploy -Ddistribution.management.release.id=bintray -Ddistribution.management.release.url=https://api.bintray.com/maven/dpfeffer/maven-repo/sftp-dataflow-source-rabbit
remember to publish
What we have
- Multiple data sources
- Batch Apps use Spring Cloud Config Server and Spring Cloud Service Registry
- One of the batch apps has two different batch jobs defined in the same app
- ETL File to DB and DB to file
- Use of scheduler
- Bintray for custom maven repo
- I used bintray to publish custom Spring Cloud Task and Spring Cloud Stream apps and then added reference to this maven repository to data flow server
Used this blog to create sftp server on GCP for the demo. Remember to create firewall rule allowing inbound access on 22.
- Currently the dataflow-tasklauncher does not support accessing oauth protected data flow server. I could have updated the client defined in UAA to accept password grant types and then provided the basic auth creds and it would have worked, but I don't have requisite access on my demo environment to do so. So as a work around, I deployed an OSS data flow server in my space and bound it to the servers created by the SCDF for PCF. Then pointed dataflow-tasklauncher to this unprotected instance. Working with spring r&d to consider feature enhancement.
- Could not get task to show exit code of 1 when batch fails. There stops working when spring cloud services config dependency was added. Working with spring team to investigate.