Modeling your data for analytics with Apache Cassandra and Spark SQL

Date: 10 June, 9:00 – 13:30
Trainers: Felix Crisan, Valentina Crisan
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15
Price: 150 RON (including VAT)

Description:

This session is for those who want to better understand how to model data for queries in Apache Cassandra, both on its own and combined with Spark SQL. It covers secondary indexes and materialized views in Cassandra, and shows how Spark SQL can be used together with Cassandra to run complex analytical queries. Familiarity with Cassandra and Spark SQL helps but is not mandatory, since we will explain the basic concepts behind data modeling in Cassandra and Spark SQL. The whole workshop will be run in Cassandra Query Language (CQL) and SQL, and we will use Zeppelin as the interface to Cassandra and Spark SQL.

Agenda:
1. Cassandra secondary indexes
         – how they are implemented, best use cases, worst use cases
         – SASI: SSTable Attached Secondary Index
2. User-side versus server-side denormalization
         – user-side denormalization
         – materialized views
         – differences between the two
3. Adding Spark SQL capabilities to Cassandra
         – understand Spark SQL:  a short overview of its capabilities
         – when to use it (in conjunction with Cassandra)
         – what kind of queries can be run from Spark SQL
4. Modeling your data per query
         – user-side or server-side denormalization, secondary indexes, Spark SQL
         – we take several queries and see which implementation is best for each

 

The price for the workshop is 150 RON (including VAT).

1. Complete the registration form:

 

2. Payment (150 RON – incl. VAT) – please complete the form above first: http://mpy.ro/3sus3vev