Voice Powered Analytics

Introduction

In this workshop, you will build an Alexa skill that queries metrics from a data lake that you define. By the end of the workshop, you should understand how to uncover Key Performance Indicators (KPIs) from a data set, build and automate queries for measuring those KPIs, and access them via Alexa voice-enabled devices. Startups can make voice-powered analytics available to query at any time, and enterprises can deliver these solutions to stakeholders so they have easy access to the business KPIs that are top of mind.
This workshop requires fundamental knowledge of AWS services, but is designed for first-time users of QuickSight, Athena, and Alexa. We have broken the workshop into three sections, or focus topics:

  • Data Discovery using QuickSight
  • Building Data Lake analytics in Athena (based on objects in S3) to generate answers for Alexa
  • Building a custom Alexa skill to access the analytics queries from Athena

We expect most attendees to be able to complete the full workshop in 2 hours.

To help you keep moving through the sections in case you get stuck anywhere, we have provided CloudFormation templates and sample code.

For those feeling creative, many sections also have bonus sections where you can build additional capabilities on top of the workshop.
Feel free to engage your workshop facilitator(s)/lab assistant(s) if you'd like additional assistance with these areas.

You can also contact @WestrichAdam or @chadneal on Twitter if you have additional questions or feedback.

Prerequisites

Please make sure you have the following available prior to the workshop.

Lab Setup

We have provided a CloudFormation template to create baseline resources needed by this lab that are not the focus of the workshop. These include IAM roles, IAM policies, a DynamoDB table, and a CloudWatch Event rule. They are listed as outputs in the CloudFormation template in case you want to inspect them.
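Once the stack is launched, those outputs can also be read programmatically; a minimal sketch, assuming boto3 and configured AWS credentials (the helper names are my own, not part of the workshop):

```python
def outputs_to_dict(outputs):
    # outputs: the "Outputs" list from describe_stacks, e.g.
    # [{"OutputKey": "TableName", "OutputValue": "vpa-table"}, ...]
    return {o["OutputKey"]: o["OutputValue"] for o in outputs}

def stack_outputs(stack_name="VPA-Setup"):
    """Fetch the CloudFormation outputs for the setup stack."""
    import boto3  # requires AWS credentials to be configured
    resp = boto3.client("cloudformation").describe_stacks(StackName=stack_name)
    return outputs_to_dict(resp["Stacks"][0].get("Outputs", []))
```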

Please launch the template below so that the resources created will be ready by the time you get to those sections in the lab guides.

Pick the region closest to your location for optimal performance.

When you launch the template you will be asked for a few inputs. Use the following table for reference.

Input Name             Value
Stack Name             VPA-Setup
DDBReadCapacityUnits   5
DDBWriteCapacityUnits  5
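The same launch can be scripted; a hedged sketch using boto3 (the helper names and the `template_url` argument are my own; the parameter keys and stack name come from the table above, and the template URL is the region-specific link below):

```python
def stack_parameters(read_cu=5, write_cu=5):
    # Parameter keys match the input table above.
    return [
        {"ParameterKey": "DDBReadCapacityUnits", "ParameterValue": str(read_cu)},
        {"ParameterKey": "DDBWriteCapacityUnits", "ParameterValue": str(write_cu)},
    ]

def launch_stack(template_url):
    """Launch the VPA-Setup stack from a region-specific template link.

    Requires boto3 and configured AWS credentials.
    """
    import boto3
    cfn = boto3.client("cloudformation")
    return cfn.create_stack(
        StackName="VPA-Setup",
        TemplateURL=template_url,
        Parameters=stack_parameters(),
        Capabilities=["CAPABILITY_IAM"],  # the template creates IAM roles/policies
    )
```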

Region Launch Template
EU-WEST-1
US-EAST-1

Modules

By default, you can access Twitter data that exists in a public S3 bucket, filtered on #reinvent, #aws, or @AWSCloud. If you'd like to use this pre-existing data, you can skip to Module 1. If you'd like to build your own data lake with your own filters instead, follow the steps below:

Optional Module 0 (Build Your Own Data Lake)

Step 1: Generate Twitter Keys

  1. Go to http://twitter.com/oauth_clients/new
  2. Apply for a Twitter Developer Account. This takes ~15 minutes and requires a detailed justification, Twitter approval, and email verification.
  3. Under Name, enter something descriptive, e.g., awstwitterdatalake (the name can only contain alphanumeric characters)
  4. Enter a description
  5. Under Website, you can enter the website of your choosing
  6. Leave Callback URL blank
  7. Read and agree to the Twitter Developer Agreement
  8. Click "Create your Twitter application"

Step 2: Deploy App In Repo

  1. Navigate to Twitter-Poller to-Kinesis-Firehose in the Serverless Application Repository.
  2. Click the Deploy button (top right-hand corner).
  3. You may be prompted to log in to your AWS account. After doing so, scroll down to the Application Settings section, where you can enter:
     a. The 4 tokens received from Twitter
     b. The Kinesis Firehose resource name (keep the default or change it to a preferred name)
     c. The search text that Twitter will filter on
  4. After deploying the serverless application, it will begin polling automatically within the next 5 minutes.

Lastly, note the name of the S3 bucket so you can use it to create the Athena schema.
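To confirm the poller has started writing data, you can list the bucket and pick out the newest object; a sketch assuming boto3 (the function names are my own, and `bucket` is the bucket name from your deployment):

```python
def newest_key(objects):
    # objects: the "Contents" list from s3.list_objects_v2(), where each
    # entry carries "Key" and "LastModified" fields.
    return max(objects, key=lambda o: o["LastModified"])["Key"]

def latest_tweet_file(bucket, prefix=""):
    """Return the most recently written object key, or None if the bucket
    is still empty (the poller only runs every few minutes)."""
    import boto3  # requires AWS credentials to be configured
    resp = boto3.client("s3").list_objects_v2(Bucket=bucket, Prefix=prefix)
    contents = resp.get("Contents")
    return newest_key(contents) if contents else None
```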

Main Modules:

  1. Amazon QuickSight Section
  2. Amazon Athena Section
  3. Amazon Alexa Section

If you'd like to make your skill private to your organization, you can deploy it privately through Alexa for Business.

After you have completed the workshop, you can disable the CloudWatch Event rule to stop the Athena poller. This stops the automated Athena scans of S3 and prevents any further Athena costs. If you want to completely remove all resources, please follow the cleanup guide.

Contributors

ahwestrich, asimov4, chadneal, collinforrester, hyandell

Issues

Query Error in Athena

Hi,
I tried to run the query to create a table in Athena using this code:
CREATE EXTERNAL TABLE IF NOT EXISTS default.tweets (
  id bigint COMMENT 'Tweet ID',
  text string COMMENT 'Tweet text',
  created timestamp COMMENT 'Tweet create timestamp',
  screen_name string COMMENT 'Tweet screen_name',
  screen_name_followers_count int COMMENT 'Tweet screen_name follower count',
  place string COMMENT 'Location full name',
  country string COMMENT 'Location country',
  retweet_count int COMMENT 'Retweet count',
  favorite_count int COMMENT 'Favorite count')
ROW FORMAT SERDE
  'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  'paths'='id,text,created,screen_name,screen_name_followers_count,place_fullname,country,retweet_count,favorite_count')
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://aws-vpa-tweets/tweets/'

I changed the location for the Ireland region, but when I run the query in the Athena query editor, I get the following error:

Your query has the following error(s):

Unable to verify/create output bucket aws-vpa-tweets-euw1 (Service: AmazonAthena; Status Code: 400; Error Code: InvalidRequestException; Request ID: 890390cf-1db9-40d3-9259-ba7798be20dd)
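This error typically means Athena could not verify or create its query results bucket in the region; pointing queries at a results bucket you own and have already created avoids it. A hedged sketch, assuming boto3 (the function names are mine, and `output_bucket` is a bucket of your own):

```python
def result_configuration(output_bucket, prefix="athena-results"):
    # Athena writes query results to this S3 location; the bucket must
    # already exist in the same region and be writable by your credentials.
    return {"OutputLocation": f"s3://{output_bucket}/{prefix}/"}

def run_query(sql, output_bucket):
    """Submit a query with an explicit results location; returns the
    query execution id."""
    import boto3
    athena = boto3.client("athena")
    resp = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration=result_configuration(output_bucket),
    )
    return resp["QueryExecutionId"]
```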

Lambda not writing to s3 and therefore 404 error

Hey, I am going through this workshop. When I copy in the Lambda, fill out the environment variables provided, and get the bucket from the stack, I get:


"errorMessage": "An error occurred (404) when calling the HeadObject operation: Not Found",
"errorType": "ClientError"

When it is trying to 's3_client.download_file(s3_bucket, s3_key[1:], "/tmp/results.csv")'

I looked in the bucket and there is no data in there, which is correct.

It seems the Athena output location isn't working. Can you help, please?
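One likely cause: the results CSV only appears in S3 once the Athena query has reached the SUCCEEDED state, so downloading immediately after submitting can 404. A sketch of polling before downloading, assuming boto3 (the helper names are mine; Athena names the results file `<QueryExecutionId>.csv` under the configured output prefix):

```python
import time

def results_key(prefix, query_execution_id):
    # Athena writes results as <QueryExecutionId>.csv under the output prefix.
    return f"{prefix.strip('/')}/{query_execution_id}.csv"

def wait_and_download(query_execution_id, bucket, prefix,
                      local_path="/tmp/results.csv", poll_seconds=1):
    """Wait for the Athena query to finish, then download its results CSV."""
    import boto3  # requires AWS credentials to be configured
    athena = boto3.client("athena")
    while True:
        state = athena.get_query_execution(
            QueryExecutionId=query_execution_id
        )["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    boto3.client("s3").download_file(
        bucket, results_key(prefix, query_execution_id), local_path)
```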

Tried doing this for my own data but can't seem to pick up the 2nd question

Hi again,

So after the success of the last one, I wanted to run this against my own data sets with my own questions.
However, when I get to the last stage of asking Alexa, she can find my skill but not answer the question 'whats my ' .
The 1st Lambda works and Alexa can find my skill, but it's the specific question it doesn't like.
Would you guys be able to help? Not sure what specifics you need to solve this. I have changed the name and she responds with
'There was a problem with the requested skill's response'

Can't download the tweet file

Hi,
I've tried us-east-1 and Ireland, but I keep getting this error:

$ aws s3 cp s3://aws-vpa-tweets/tweets/2017/11/06/04/aws-vpa-tweets-1-2017-11-06-04-23-28-2020b61e-ac18-4c9e-b446-6a49f8cced21.gz .

fatal error: An error occurred (404) when calling the HeadObject operation: Key "tweets/2017/11/06/04/aws-vpa-tweets-1-2017-11-06-04-23-28-2020b61e-ac18-4c9e-b446-6a49f8cced21.gz" does not exist

Let me know how I can access this file please :)

I've tried the AWS CLI from Windows, Windows bash, and an AWS instance in us-east-1; all give the same error!
