Analytics with Cassandra and Spark
Date: 22 April, 9:30 – 14:00
Trainers: Felix Crisan, Valentina Crisan
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15 (5 places left)
Description:
We continue the Spark SQL and Cassandra series with more hands-on exercises on the integration between the two solutions, working with the open MovieLens data. This workshop is aimed at those who know the basics of Cassandra and CQL and have SQL knowledge. Spark knowledge is not mandatory, although it would be good to know its basic concepts (RDDs, transformations, actions): we will not cover these concepts in the workshop, but we will mention them on several occasions. Without the basic Spark concepts you will still understand the aggregations that can be done at the Spark SQL level, but you will not fully understand how Spark SQL fits into the whole Spark system.
In this workshop you will learn the optimal way to run queries in a solution composed of Apache Cassandra and Apache Spark.
Prerequisites: knowledge of Cassandra concepts, SQL knowledge
Agenda:
Data modelling in Cassandra – principles
Spark SQL – basic concepts
Spark SQL – Cassandra queries (full Spark aggregation of C* data)
– create a DataFrame from a C* table, write to a Cassandra table
– SQL queries to Cassandra tables: SELECT, GROUP BY, JOIN
– window functions: rank
Spark SQL – Cassandra queries (pre-filtering of data in C*, final aggregation in Spark)
– denormalized data modeling in C* + secondary indexes
– SQL queries to C* tables
– predicate pushdown
The price for the workshop is 125 RON (including VAT).
1. Complete the registration form:
2. Payment (125 RON – incl. VAT) – please complete the form above first: http://mpy.ro/3sus3vev