
google-cloud-datastore-php's Introduction

A PHP web application implementing autocomplete using google cloud datastore

This is a quick and dirty PHP web application that accesses Google Cloud Datastore. The intent of this code is to demonstrate the technique only. To save time and effort, this code purposefully does not adhere to even minimum security standards. For example, in order to quickly run the PHP code, it asks you to "chmod 777 /var/www/html". It is your responsibility to understand the gravity of this and remove the VM as soon as you are done running the demo (or tighten things up yourself). Similarly, the code does not handle errors, does not log, and is poorly optimized for performance (e.g., it does not connection-pool the Datastore connection; instead, it connects to Datastore every time a key is pressed).

Needless to say, this code can be vastly improved. I welcome pull requests, thank you. I will, of course, put that on my backburner as well!

How to run and test this app

Let us see how you can, step by step, set up your environment and run this app as a demo.

On the google cloud side

  1. Create a GCE VM in your desired region
  • Use Ubuntu 14.04
  • 1 vCPU and 3.75 GB of memory will suffice. I used 16 GB of persistent Standard Disk as I was dealing with large datasets; you can use less if you restrict yourself to the BestBuy dataset that this demo uses.
  • You will need access to the Cloud Datastore API, so turn on "Allow full access to all Cloud APIs" right now; it will save some time later (see number 2 in the prerequisite list here: https://cloud.google.com/datastore/docs/datastore-api-tutorial)
  • Allow HTTP and HTTPS traffic
  • Use an SSH key if you have one
  2. Create a project on the GCP cloud console
  3. If you have completed the above steps successfully, you should be all set to start tinkering with the VM you just created

On the VM you created

  1. Log in (use your SSH key if needed)
  2. Run "sudo apt-get update", and then "sudo apt-get install apache2 php5 php5-memcached memcached"

Note: We are installing memcached and PHP 5's memcached extension (the php5-memcached package) so that we can use memcache for RAM-based caching

  1. Test if apache is working - browse to the external IP address of the VM from your favorite browser
  2. Run "sudo apt-get install git", you will need to clone this repo
  3. Run "git clone https://github.com/kbxkb/google-cloud-datastore-php.git" from anywhere meaningful, cd into google-cloud-datastore-php
  4. Run "sudo chmod -R 777 /var/www" - see the security warning in the initial paragraph of this README
  5. Copy calldatastore.php, form.html, loaddatastore.php, loaddatastore.sh, products.json into /var/www/html, use sudo as needed
  6. Test access - browse to {your IP address}/form.html. You will see the rudimentary front end of my PHP application. Go ahead and type something in the textbox. The output area will show an error, because it tries to access Datastore, which we have not set up yet
  7. Now, you need to add an environment variable to this VM. A good, permanent way to do this is to edit /etc/environment and add a line to it. The line should be (without the quotes): "GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json" (remember you downloaded this file in step 2 of the previous section?). If the path has spaces, put the value of the variable (i.e., the path) in double quotes. Once you save this file, you will have to log out and back in for it to take effect
  8. cd into /var/www/html
  9. Now, we will install composer so that we can install other PHP libraries and dependencies. Inside /var/www/html, run: "sudo curl -sS https://getcomposer.org/installer | php"
  10. Run "php composer.phar require google/auth" -> this installs the auth library needed to authenticate against the Google Cloud
  11. Run "php composer.phar require google/cloud" -> this installs the general GCP cloud library needed to access datastore
  12. If you list the contents of /var/www/html now, you will see a vendor directory created with an autoload.php file in it. If you are wondering what PHP auto-loading is, read this: http://ditio.net/2008/11/13/php-autoload-best-practices/
  13. Change the PHP code to point at your project instead of mine:
  • In loaddatastore.php, change the line "$projectId = 'triple-cab-162115';" - update the value to whatever your Google Cloud project ID is (the cloud console is a good place to grab this from)
  • Same for calldatastore.php
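
Before moving on, it can help to confirm that the pieces from the steps above are in place. The following is a minimal sketch, not part of this repository: the function name findSetupProblems is hypothetical, and it assumes you run it from the command line after logging back in, so that the variable from /etc/environment is visible to the PHP process.

```php
<?php
// Hypothetical helper (not in the repository): returns a list of setup
// problems, or an empty array if the two prerequisites look fine.
function findSetupProblems($credentialsPath, $webRoot)
{
    $problems = [];

    // getenv() returns false when the variable is not set at all.
    if ($credentialsPath === false || $credentialsPath === '') {
        $problems[] = 'GOOGLE_APPLICATION_CREDENTIALS is not set';
    } elseif (!is_readable($credentialsPath)) {
        $problems[] = 'credentials file is not readable: ' . $credentialsPath;
    }

    // Composer creates vendor/autoload.php when it installs a package.
    if (!is_file($webRoot . '/vendor/autoload.php')) {
        $problems[] = 'vendor/autoload.php is missing; run the composer steps';
    }

    return $problems;
}

$problems = findSetupProblems(getenv('GOOGLE_APPLICATION_CREDENTIALS'), '/var/www/html');
echo $problems ? implode("\n", $problems) . "\n" : "Setup looks complete\n";
```

This only checks that the files exist and are readable. It does not verify that the credentials are valid or that the project ID in the PHP files is correct.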

Load test data into Cloud Datastore

  1. Obviously, you only have to do this once. I have used the publicly available BestBuy dataset here: https://github.com/BestBuyAPIs/open-data-set. In fact, the only file I have used from this dataset is products.json, and I have included that file in my repository, so you have it copied inside your /var/www/html folder right now (if you have followed the above steps)
  2. You will have to run this command to load the data into Cloud Datastore. The script and the file are already in /var/www/html by now, so just run this command from that directory: "./loaddatastore.sh products.json". Before you attempt this, here is something you should note:
  • Make sure that the sh file is an executable, set +x on it if needed
  • There are almost 52K records in products.json. This command will take around 4-5 hours to complete
  • The script that loads it (both the sh file and the php file that it calls repeatedly for each line) is poorly optimized. It is horrible, to be precise. It just connects to Datastore for every line in the JSON file, and runs a tight loop writing the entities into Datastore. I have used sed to extract the SKU and the Product Name fields only; that is all I write. The intent is to only demo auto-complete on the Product name, hence...
  • Run this command and take a break, it will take a while
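
The extraction described above (keep only the SKU and the product name, stored in lowercase) can also be sketched in PHP. This is an illustration under stated assumptions, not the repository's loader: the function name extractSkuAndName is hypothetical, the field names 'sku' and 'name' are assumed from the description above, the sample records are made up, and nothing is written to Datastore here.

```php
<?php
// Hypothetical helper: reduce one decoded product record to the two
// fields the demo needs. Returns null if either field is missing.
// The keys 'sku' and 'name' are assumptions about products.json.
function extractSkuAndName(array $product)
{
    if (!isset($product['sku'], $product['name'])) {
        return null;
    }
    return [
        'sku'  => (string) $product['sku'],
        // Lowercase at load time, since queries are case-sensitive.
        'name' => strtolower($product['name']),
    ];
}

// Made-up sample data standing in for a small slice of the dataset.
$sampleJson = '[{"sku": 1, "name": "Example AA Batteries", "price": 5.49},'
            . ' {"sku": 2, "price": 1.00}]';

$entities = [];
foreach (json_decode($sampleJson, true) as $product) {
    $entity = extractSkuAndName($product);
    if ($entity !== null) {
        $entities[] = $entity;   // the second record is skipped: no name
    }
}
print_r($entities);
```

Doing the extraction inside a single PHP process, rather than calling PHP once per line, would also make it possible to create the Datastore client once and reuse it for every record, which may shorten the load time considerably.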

Demo auto-complete!

After loading is complete, go back to the form.html on the browser, and start typing something in the text box, see what happens! If everything was successful, it should auto-complete.

Caching is working!

You might notice that the first time you type in a letter, it takes a little while to show the results. But if you type the same sequence of letters again, the results show up a lot quicker. This is because we are using memcache to cache the results for every unique letter-sequence in the code
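
The caching behaviour described above follows the usual cache-aside pattern: look in the cache first, and only go to the backing store on a miss. Here is a minimal sketch of that pattern. It is an illustration, not the code in calldatastore.php: a plain PHP array stands in for memcached, the $fetch callback stands in for the Datastore query, and the name cachedLookup is hypothetical.

```php
<?php
// Cache-aside lookup. $cache is a plain array standing in for memcached;
// $fetch is a callback standing in for the round trip to Datastore.
function cachedLookup(array &$cache, $key, $fetch)
{
    if (isset($cache[$key])) {
        return $cache[$key];            // cache hit: no round trip needed
    }
    $value = call_user_func($fetch, $key);
    $cache[$key] = $value;              // remember the result for next time
    return $value;
}

$cache = [];
$fetchCount = 0;
$fetch = function ($prefix) use (&$fetchCount) {
    $fetchCount++;                      // count how often the "slow" path runs
    return 'results for ' . $prefix;
};

$first  = cachedLookup($cache, 'bat', $fetch);
$second = cachedLookup($cache, 'bat', $fetch);
echo $fetchCount, "\n";                 // prints 1: the second call was a hit
```

One design choice to note: this sketch tests for the presence of the key, so an empty result is cached as well. A check on the truthiness of the cached value would instead treat an empty result string as a miss and query again each time.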

Still feeling a bit sluggish? No wonder! Though we have used caching, performance optimization is still quite poor in this demo as of now. As you type, every key-press results in a call to Datastore, but instead of connection-pooling, the code creates a new connection every time. That is not good, especially if you care about the end user's experience for auto-complete

A small case in point

GQL queries used against datastore are case-sensitive. That is why I use strtolower(...) in the file loaddatastore.php. That means all those 52K records are stored in datastore in lowercase. So if there is a product called "Battery", it will match if you start typing "battery". However, it will not match if you start typing "Battery" unless we convert everything you type to lowercase before issuing the query.
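
To make the lowercase matching concrete, here is a small sketch of how a typed prefix can be turned into a lower and an upper bound for a range filter on the name. It is an illustration under assumptions: the function names prefixBounds and matchesPrefix are hypothetical, and a local byte-wise strcmp stands in for the comparison that Datastore would perform on the stored names. The upper bound appends the character U+FFFD to the prefix, which sorts after the letters, digits and punctuation you would expect in a product name.

```php
<?php
// Hypothetical helper: turn what the user typed into [lower, upper)
// bounds for a prefix range query on lowercased names.
function prefixBounds($typed)
{
    $lower = strtolower($typed);
    // Appending U+FFFD gives a string that sorts after every name
    // starting with $lower (for typical product-name characters).
    $upper = $lower . json_decode('"\ufffd"');
    return [$lower, $upper];
}

// Local stand-in for the range filter: name >= lower AND name < upper.
function matchesPrefix($storedName, $typed)
{
    list($lower, $upper) = prefixBounds($typed);
    return strcmp($storedName, $lower) >= 0 && strcmp($storedName, $upper) < 0;
}

var_dump(matchesPrefix('battery pack', 'Battery'));  // typed text is lowercased first
var_dump(matchesPrefix('cable', 'Battery'));
```

With the stored name "battery pack", typing "Battery" matches only because the typed text is lowercased before the bounds are built; without that step the lower bound would start with an uppercase "B" and the range would miss the lowercase names.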

Clean-up

Do not forget to stop the VM, remove the GAE App and clean up Datastore. This is a metered platform, treat it like your own electricity bill, even if you are using an account with credits!

google-cloud-datastore-php's People

Contributors

kbxkb, kbxkb-games


google-cloud-datastore-php's Issues

AJAX Error

I keep getting this error and I'm not sure where it's coming from. I've checked that my AppEngine app is installed. I killed the http AppEngine process to stop the service. However, I still have the web services enabled on my Compute Instance image. And I have my Datastore loaded with the BestBuy data.

Thank you!

Error message
    addServer(127.0.0.1, 11211);
    $projectId = 'default-180700';
    $datastore = new DatastoreClient([
        'projectId' => $projectId
    ]);
    echo 'Matches:
    ';
    $matches_string = "";
    //Get the string typed in by user for autocompletion...
    $queryval = strtolower($_GET['searchtext']);
    //If it is not blank...
    //if ($queryval[0])
    {
        //we first have to check our local cache if we have the results for this query...
        $cache_hit = $mem->get($queryval);
        if ($cache_hit) {
            //yes! we have it it cache, no need to go back to datastore...
            $matches_string = (string) $cache_hit;
        } else {
            //cache miss, let us go to datastore and fetch the result...
            $upperlimit = $queryval . json_decode('"\ufffd"');
            $query = $datastore->query()
                ->kind('SKU')
                ->filter('name', '>=', $queryval)
                ->filter('name', '<', $upperlimit)
                ->order('name');
            $result = $datastore->runQuery($query);
            foreach ($result as $SKU) {
                $matches_string = $matches_string . $SKU['name'] . "
    ";
            }
            //finally, let us insert this query result in our local cache so that next time,
            //we do not have to make a round-trip to datastore - note that we are caching for 7 days
            $mem->set($queryval, $matches_string, 604800);
        }
        echo $matches_string;
    }
    } catch (Exception $e) {
        echo 'Caught exception: ', $e->getMessage(), "\n";
    }
    ?>

Why not using App Engine?

Hi, thank you for sharing.
I wonder what the considerations were for choosing a VM instance instead of App Engine to host the web server?

besides the fact that php7.0 is not supported by app engine

thank you!
