Cassandra.Lunch

Resources from weekly Zoom lunches revolving around Apache Cassandra and Apache Cassandra-related topics. Hosted by Anant Corporation.

Join Cassandra Lunch Weekly at 12 PM EST Every Wednesday

If you would like to be a guest speaker, you can reach us at [email protected]. If you would like to sponsor Cassandra Lunch, please reach us at the email listed above.

Check out the Cassandra.Lunch playlist on YouTube


Table of Contents

Jump To Topic YouTube SlideShare
Cassandra 4.0 YouTube
Different Cassandra Distributions and Variants YouTube SlideShare
Cassandra & Kubernetes YouTube SlideShare
Jump Start Projects for Cassandra YouTube SlideShare
Basic Cassandra Log Diagnostics with ELK/FEK/BEK YouTube SlideShare
Cassandra Backup and Restoration YouTube SlideShare
Cassandra Anti-Entropy, Repair, and Synchronization YouTube SlideShare
Cassandra Tombstones YouTube SlideShare
Connecting Cassandra to Kafka YouTube
Combined Use of Relational Databases and Apache Cassandra YouTube SlideShare
Cassandra Read / Write Path YouTube SlideShare
Cassandra Stages / Thread Pools YouTube SlideShare
Cassandra Deployment and Admin Tools YouTube SlideShare
Lucene Based Indexes on Cassandra YouTube SlideShare
Cassandra Use Cases YouTube SlideShare
Cassandra Use Cases - Reference Architectures YouTube SlideShare
Cassandra Troubleshooting with Logs YouTube SlideShare
Cassandra on Baremetal / Virtual Machines / Containers YouTube SlideShare
Cassandra Backup / Restore Scenarios YouTube SlideShare
Cassandra & Kubernetes Update YouTube SlideShare
Cassandra and Spark Foundations YouTube SlideShare
Business Intelligence with Cassandra YouTube SlideShare
Cassandra Data Operations – Common Ways to Move Data in Cassandra YouTube SlideShare
Cassandra Deployment – Ansible and Terraform with Cassandra YouTube SlideShare
Liquibase and Cassandra YouTube SlideShare
Data Operations with Spark and Cassandra YouTube SlideShare
Specialized Databases on Cassandra YouTube SlideShare
CQL Copy for Data Operations YouTube SlideShare
Cassandra SSTables and Spark YouTube SlideShare
General Apache Cassandra Updates YouTube SlideShare
Scylla Migrator for Cassandra Data Operations YouTube SlideShare
Cassandra on Kubernetes - Docker/Kubernetes/Helm - Part 1 YouTube SlideShare
SSTable Files with SSTableloader YouTube SlideShare
DSBulk with Sed & Awk YouTube SlideShare
Docker/Kubernetes/Helm Fundamentals Part 2 YouTube SlideShare
Alpakka Cassandra and Twitter YouTube SlideShare
Apache Spark Jobs in Scala for Cassandra Data Operations YouTube SlideShare
Airflow and Cassandra YouTube SlideShare
Spark SQL for Cassandra Data Operations YouTube SlideShare
Machine Learning with Spark + Cassandra YouTube SlideShare
Cassandra Cluster Design & Architecture YouTube SlideShare

Apache Cassandra Lunch Online Meetup #10: Cassandra 4.0

  • We discuss and take an in-depth look at the improvements and new features that come with Cassandra 4.0.

  • We discuss various Cassandra distributions: Cassandra and Cassandra-compliant databases on the JVM, Cassandra-compliant databases in C++, managed Cassandra services based on open-source Cassandra, and managed Cassandra services based on proprietary technology.

  • We cover Kubernetes, discussing what it is and how it works with Docker and Cassandra. We also look at some of Kubernetes' competitors and a variety of open-source tools for Kubernetes, which gives insight into why we consider Kubernetes a worthwhile investment when working with databases.

  • We discuss a number of projects and platforms that you can use to jump-start your Cassandra projects. They make useful educational resources as well as good starting codebases for new projects. We also discuss a recent article on the Yugabyte blog about Cassandra.

  • We discuss methods for finding and diagnosing issues in Cassandra clusters with ELK/FEK/BEK.

  • We discuss Cassandra backup and restoration, including disaster avoidance, disaster recovery, and different tools that can be used to back up and restore your Cassandra data. We also walk through an example scenario of how someone has set up multi-node clusters and how they go about data backup and restoration.

  • We discuss Cassandra anti-entropy, which is the process of comparing the data of all replicas and updating each replica to the newest version. We also look at repair and synchronization in Cassandra and how you can prepare for the unexpected.


  • Guest speaker, Ryan Quey, a full stack data engineer, discusses a personal project he has been working on called java-podcast-processor, which is a tool to find podcast metadata over an external API, store them, get their RSS feeds, and run ETL using Airflow, Kafka, Spark, and Cassandra. The particular Cassandra distribution used is Elassandra, which allows seamless integration with Elasticsearch. The data is also displayed using a Gatsby app and served using Flask.

  • We discuss the combined use of relational databases and Cassandra. We cover the advantages of using relational databases and Cassandra separately, as well as the advantages of and methods for using both concurrently.

  • We discuss Cassandra read and write paths, which are how Cassandra stores and retrieves data at high speeds. We do not cover how Cassandra replicates data because that is its own subject, but we take a look at these four sub-topics: Write Path, Update / Delete, Maintenance Path, and Read Path.

  • We discuss Cassandra and Staged Event-Driven Architecture with an emphasis on Cassandra stages / thread pools. We also discuss a few different tools that you can use to monitor these stages and thread pools in order to keep your Cassandra cluster running as smoothly as possible.

  • We discuss deployment and administration tools for Cassandra. We also discuss a number of tools for the installation, configuration, monitoring, and administration of Cassandra clusters.

  • We discuss packaged and DIY methods for Lucene-based indexes on Cassandra, and give some pros and cons of using Lucene-based indexes on Cassandra.

  • We discuss a number of use cases for Cassandra, focusing on Cassandra's place in running a digital business technology platform.

  • We discuss how Cassandra is used for real-time data platforms, and cover different reference architectures in which Cassandra is and can be used.

  • We discuss different ways to deploy Cassandra, whether on bare metal, virtual machines, or containers, along with the pros, cons, and deployment tools for each.

  • We discuss specific scenarios for Cassandra backup and restore and some methods for restoring data to a Cassandra cluster, and cover how factors like the topology of a cluster or the need for constant uptime can affect the backup/restore process.

  • We discuss updates regarding Cassandra and Kubernetes after the recent KubeCon event.

  • We discuss the basics of using Spark and Cassandra together, the advantages of each, and the advantages of using them together. We also discuss the potential drawbacks and configuration methods for avoiding those drawbacks.

  • We discuss open-source tools that can be used for BI with Cassandra, including a live demo using DSE, Presto, and Metabase.

  • We discuss the various ways of moving data into and out of Cassandra clusters.

  • We discuss using Terraform and Ansible to set up the infrastructure for, and handle the provisioning of, a new Cassandra cluster.

  • We discuss how to use Liquibase with Cassandra and DataStax Astra.

  • We discuss some basic data operations that you can do with Apache Spark and Cassandra.


  • We discuss CQL Copy and how we can use it for Cassandra data operations.

  • We discuss Apache Spark projects that interact with Cassandra specifically through Cassandra's SSTables.

  • We discuss general updates to Apache Cassandra and relevant articles of interest.

  • We discuss Scylla’s Spark Migrator and walk through how we can use the Scylla Migrator for Cassandra Data Operations.

  • We discuss Cassandra on Kubernetes and give an introduction to Docker, Kubernetes, and Helm.

  • We cover SSTable files and their relation to SSTableLoader, and we walk through an example using SSTableLoader to load data taken from one cluster into a new, empty cluster.

  • We will introduce DSBulk (the DataStax Bulk Loader) and show how we can use it with tools like sed and awk to do ETL on Cassandra data.

  • In Apache Cassandra Lunch #45, we will discuss how you can stream tweets using Twitter4S (Scala Twitter client) and save them to Cassandra using Alpakka Cassandra.

  • In Apache Cassandra Lunch #46, we will discuss how we can use Apache Spark jobs written in Scala to do Cassandra data operations, which will include a live walkthrough!

  • In Cassandra Lunch #48, we will discuss using Airflow and Cassandra together. Airflow provides a Cassandra connection type and a Cassandra operator. We will explore what we can do to manage a Cassandra cluster via Airflow.

  • We will discuss how to use Spark SQL to do Cassandra data operations such as moving data in Apache Cassandra tables.

  • In Apache Cassandra Lunch #50, we will discuss how you can use Apache Spark and Apache Cassandra to perform basic Machine Learning tasks.

  • In Cassandra Lunch #51, we will give an overview of Cassandra cluster architecture, not to be confused with the Cassandra database architecture. Specifically, we look at using Cassandra datacenters to isolate workloads.
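
The anti-entropy description above (comparing the data of all replicas and updating each replica to the newest version) can be illustrated with a toy sketch. This is not Cassandra's implementation: real repair compares token ranges using Merkle trees rather than individual cells. The `Cell` class and the `reconcile` function are hypothetical names used only for illustration, and replicas are modeled as plain dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored value and its write timestamp."""
    value: str
    timestamp: int  # higher means newer


def reconcile(replicas):
    """Update every replica dict to the newest version of each key.

    Returns the number of cells that had to be written to bring the
    replicas in sync (last write wins, decided by timestamp).
    """
    # Pass 1: find the newest cell for each key across all replicas.
    newest = {}
    for replica in replicas:
        for key, cell in replica.items():
            if key not in newest or cell.timestamp > newest[key].timestamp:
                newest[key] = cell

    # Pass 2: overwrite stale or missing cells on each replica.
    repaired = 0
    for replica in replicas:
        for key, cell in newest.items():
            if replica.get(key) != cell:
                replica[key] = cell
                repaired += 1
    return repaired


# Three replicas that have drifted apart.
replica_a = {"user:1": Cell("alice", 10), "user:2": Cell("bob", 5)}
replica_b = {"user:1": Cell("alicia", 12)}
replica_c = {"user:2": Cell("bob", 5)}

repaired_count = reconcile([replica_a, replica_b, replica_c])
```

After the call, all three dictionaries hold the same cells, with the timestamp-12 write winning for `user:1`.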
