An app that streams tweets matching a hashtag from Twitter through Kafka, runs sentiment analysis on them, and saves the results to a Hive table.
- Compatible with Java 7+.
- Basma Ashour
- Saloni Vora
- Shristi Maharjan
- Start ZooKeeper, then the Kafka server (each command runs in the foreground, so use a separate terminal for each)
~ ❯❯❯ cd kafka-<VERSION>
~/kafka-<VERSION> ❯❯❯ bin/zookeeper-server-start.sh config/zookeeper.properties
~/kafka-<VERSION> ❯❯❯ bin/kafka-server-start.sh config/server.properties
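Before moving on, it can help to confirm that both services accept connections. The following is a minimal sketch, not part of the project: the class name `PortCheck` is hypothetical, and the ports 2181 (ZooKeeper) and 9092 (Kafka) are assumed from the stock `config/zookeeper.properties` and `config/server.properties`; adjust them if your configuration differs.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the repository.
final class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, host unreachable, or timed out.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default ports; change them to match your config files.
        System.out.println("ZooKeeper (2181): " + isListening("localhost", 2181, 1000));
        System.out.println("Kafka     (9092): " + isListening("localhost", 9092, 1000));
    }
}
```

A successful TCP connection only shows that something is listening on the port; it does not prove the broker is fully initialized.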
- Create the Hive tables
~ ❯❯❯ hive
~ hive> drop table TwitterData;
~ hive> CREATE EXTERNAL TABLE IF NOT EXISTS TwitterData (createdAt STRING, Id STRING, userId STRING, location STRING, followersCount STRING, isVerified STRING, UserCreatedAt STRING, timezone STRING, sentiment STRING, tweetHours STRING, tweetMinutes STRING, tweetSeconds STRING, userCreatedMonth STRING, userCreatedYear STRING, hashtags STRING, userName STRING, Text STRING) COMMENT 'Twitter Live Data' ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ("separatorChar" = ",", "quoteChar" = "\"") LOCATION '/user/cloudera/twitterTweets';
~ hive> CREATE EXTERNAL TABLE IF NOT EXISTS HashTags (tweetHours STRING, hashtag STRING) COMMENT 'Twitter Live hashtags' ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' WITH SERDEPROPERTIES ("separatorChar" = ",", "quoteChar" = "\"") LOCATION '/user/cloudera/twitterHashtags';
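Both tables read comma-separated files through OpenCSVSerde with `"` as the quote character, so any field containing a comma, a quote, or a line break (tweet text in particular) has to be quoted and escaped before it is written under the table's `LOCATION`. The sketch below shows one way to build such a row. It is not the repository's code: the class `CsvRow` and its methods are hypothetical, and it assumes the SerDe's escape character is left at its default, the backslash, since the table definitions do not override it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the repository.
final class CsvRow {

    /** Quotes one field: escapes backslashes and double quotes, flattens line breaks. */
    static String quote(String value) {
        String v = (value == null) ? "" : value;
        v = v.replace("\\", "\\\\")          // escape the escape character first
             .replace("\"", "\\\"")          // then escape embedded double quotes
             .replaceAll("[\\r\\n]+", " ");  // keep each row on a single line
        return "\"" + v + "\"";
    }

    /** Joins the quoted fields with commas, in column order. */
    static String toLine(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(fields.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up values for the two-column HashTags table (tweetHours, hashtag).
        System.out.println(toLine(Arrays.asList("14", "#kafka"))); // prints "14","#kafka"
    }
}
```

Quoting every field, rather than only the ones that need it, keeps the writer simple, and it is harmless here because every column in both tables is declared as STRING.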
- Download CS523-Twitter-Kafka-Streaming by cloning the Git repository
- Import the project into Eclipse
- Run KafkaProducer, KafkaStreamSQL, and SentimentAnalyzer