Modeling your data for analytics with Apache Cassandra and Spark SQL

This session is intended for those who want to better understand how to model data for queries in Apache Cassandra, both on its own and combined with Spark SQL. It will help you understand secondary indexes and materialized views in Cassandra, and how Spark SQL can be used together with Cassandra to run complex analytical queries. Familiarity with Cassandra and Spark SQL helps but is not mandatory, since we will explain the basic concepts behind data modeling in both. The whole workshop will be run in Cassandra Query Language (CQL) and SQL, and we will use Zeppelin as the interface to Cassandra + Spark SQL.
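To give a flavour of why query-driven modeling matters, here is a minimal plain-Python sketch (not Cassandra code). It assumes a hypothetical table of movie ratings and shows the idea behind a materialized view: the same rows kept under a second key so that a different query can be served directly. The data, the `build_view` helper and the column names are illustrative assumptions, not part of the workshop material.

```python
from collections import defaultdict

# Hypothetical rows of a base table whose partition key is user_id
ratings = [
    {"user_id": 1, "movie": "Alien", "rating": 5},
    {"user_id": 2, "movie": "Alien", "rating": 4},
    {"user_id": 1, "movie": "Brazil", "rating": 3},
]

def build_view(rows, key):
    """Group the same rows under a different key, the way a view re-keys a table."""
    view = defaultdict(list)
    for row in rows:
        view[row[key]].append(row)
    return dict(view)

by_user = build_view(ratings, "user_id")  # serves "all ratings of a user"
by_movie = build_view(ratings, "movie")   # serves "all ratings of a movie"

# Answer a per-movie question without scanning every user's rows
alien_ratings = [row["rating"] for row in by_movie["Alien"]]
```

In Cassandra the server maintains such a view for you; the sketch only illustrates why one copy of the data per query pattern is a common modeling choice.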

Date: 10 June, 9:00 – 13:30 – this workshop will be rescheduled
Trainers: Felix Crisan, Valentina Crisan
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places:  15
Price: 150 RON (including VAT)

Check out the agenda and register for a future session here.

Let’s talk BigQuery

Not all big data projects need a complex architecture and an engineering team in order to start making sense of the data. So what should you do if you need to do some good old analysis and just want to get started right away? Assume, for example, that you’re part of a small company starting up a project, and you need to analyze lots of data without spending additional time planning and building an architecture, hiring an architect or engineer, or managing an infrastructure; you just need to see through your data and make sense of it. This is where Google’s BigQuery comes into play (of course there are many other potential uses, but let’s stick with this one for the moment). It is called (a bit pretentiously, maybe) an Enterprise Cloud Data Warehouse solution, which in my opinion scares off many potential users upfront, but in fact BigQuery is helping many people to at least quick-start their path in the Big Data world.

As part of the preparation for our next workshop, Data Analytics with BigQuery, we interviewed Gabriel Preda – trainer for the workshop but, most importantly, an enthusiastic user of the solution for the last couple of years – to give us a glimpse of what we should expect from this solution.

Why BigQuery, why did it make sense to you?

Usually in a startup each person wears more than one hat. You put on the sysadmin hat… and you’re the sysadmin. Later you might need to wear the hat that says “innovation”… and start collecting GBs of data daily and, of course, processing them in a timely fashion. Being short on people, it was clear that we needed a SaaS solution.

In which use cases should we use BigQuery (analytical, data migration, cloud requirements)?

BigQuery is designed for OLAP (Online Analytical Processing) or BI. You should not use BigQuery for OLTP (Online Transaction Processing). The best use cases for BigQuery are ad hoc, trial-and-error interactive queries on large datasets for quick analysis and troubleshooting.

Can you list the best fit scenarios for it?

I have used it successfully for in-house analytics solutions. But I think it’s one of the best candidates on the market for data fishing because of its ability to perform ad hoc queries on large amounts of data…

Is it more feasible to be used in projects where the data has been already natively stored in the cloud (e.g. Google Cloud Storage)?

Data transfer towards BigQuery is free. You might have some costs in transforming the data, as there are some requirements on the data BigQuery can ingest. If you already have data in CSV or Avro (and soon Parquet), you can import it directly.

What are the BigQuery alternatives/competitors?

I don’t know what to say about this… as it is quite a unique beast of a product!

Can you control where your data is, in case you have some requirements regarding location of your data?

You can choose between US and EU. But that is where it ends. Though there is some awesome news… there is an experimental extension to the BigQuery client that offers client-side encryption (homomorphic encryption) for a subset of query types… that is: you can encrypt your data, upload the encrypted data to BigQuery, run queries, fetch the results and decrypt them locally. It’s magic!

How do you visualize the results of the analysis or the correlations of the data in BigQuery?

In the worst-case scenario, when you can’t use the existing integrations, you can retrieve the results and use any visualization tool you are accustomed to. There are now a lot of integrations available, such as Tableau, Qlik, Talend, Informatica and SnapLogic, newcomers like Chartio, or even free & open source BI tools like Metabase. There is also a Google solution (for now in beta) called Data Studio, which covers more than BigQuery. I’ll do my best to add details about Data Studio during the workshop.

Interview by: Valentina Crisan

Data Analytics with BigQuery

BigQuery is generally seen as a “fast and fully-managed enterprise data warehouse for large-scale data analytics”. The workshop is designed to go through all the concepts of BigQuery and to provide a seamless start into using it. After this workshop you will be able to start a real project with BigQuery.

Date: 13 May, 9:30 – 14:00
Trainer: Gabriel Preda 
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15, no more places left
Price: 150 RON (including VAT)

Check out the agenda and register here

Analytics with Cassandra and Spark SQL Workshop

We continue the Spark SQL and Cassandra series with more hands-on exercises on the integration between the two solutions, working on the open Movielens data. This workshop is aimed at those who know the basics of Cassandra & CQL and have SQL knowledge. Spark is not mandatory, although it would be good to know its basic concepts (RDD, transformations, actions), since we will not cover them in the workshop but will mention them on several occasions. Without the basic Spark concepts you will still understand the aggregations that can be done at Spark SQL level, but you will not fully understand how Spark SQL fits into the whole Spark system.
In this workshop you will understand the optimal way of making queries in a solution composed of Apache Cassandra and Apache Spark.
Prerequisites: Cassandra Concepts knowledge, SQL knowledge

Trainers: Felix Crisan, Valentina Crisan
When: 22 April
Time: 9:30-14:00
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15 (5 places left)
Price: 125 RON (including VAT)

Check out the agenda and register here.

Analytics with Cassandra and Spark SQL Workshop

If you have learned about Apache Cassandra, you have realized by now that Cassandra is a storage and pre-aggregation layer, so a computational layer is needed to complete the queries we would like to run on our data. In this workshop we will look at the analytics that can be done on top of Cassandra with Spark SQL: we will start with similar examples in CQL and Spark SQL and then move on to examples that can only be run with Spark SQL.

Trainers: Felix Crisan, Valentina Crisan
When: 18 March
Time: 9:30-14:30
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15 (no more places left)
Price: 125 RON (including VAT)

There are no more places left for this session, but you can check out the agenda and register here and we’ll keep you informed if places become available.

SQL & noSQL: Intro in Cassandra

In this 4-hour session we will learn about Cassandra concepts, its data model and what analytics can be done with it. We will discuss several noSQL solutions out there and how Cassandra differs from them, and we will work on real data: importing data into Cassandra, learning CQL and data modeling rules. We will use Docker for local installations of Cassandra.

Trainers: Felix Crisan, Valentina Crisan
When: 30 July
Time: 9:30-13:30
Location: Impact Hub
Number of places: 15 2
Price: 100 RON without VAT

Check out the agenda and register here

Using Elasticsearch for Logs – hands on session by Radu Gheorghe

This workshop gives an overview of what Elasticsearch can do and how you would use it for searching and analyzing logs and other time-series data (metrics, social media, etc).

Trainer: Radu Gheorghe
When: 25 June
Location: Impact Hub
Time: 9:30 – 13:30
Price: 100 RON without VAT
Number of places: 15 (no more seats left)

There are no more seats left for this session. If you want to sign up for a future session, check out the following link: Sign up for Future Elasticsearch hands on session


Spark Intro and Machine Learning workshops

Spark and Machine Learning workshops day on March 12, at TechHub:

1. 9:00 – 12:30 – Getting started with Spark: intro and hands on session (20 places)

2. 13:00 – 16:30 ML & Spark: MLlib intro and exercises (15 places)

Registration should be made separately for each workshop.

1. Getting started with Spark: intro and hands on session, limited to 20 places

Spark is the new trend in big data technologies, offering an easy API and multiple processing models to work with, such as batch, SQL, graph, machine learning and streaming.

The workshop will start with an introduction to Spark and will continue with many Spark examples, including the well-known Wordcount example.
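For readers who have not met Wordcount before, here is a minimal sketch of the idea in plain Python, without Spark. It mirrors the usual three steps (split lines into words, pair each word with 1, sum per word); the function name and sample sentences are assumptions made for illustration, and the actual workshop examples use Spark's own API.

```python
from collections import defaultdict

def word_count(lines):
    """Count words following the classic Wordcount steps, in plain Python."""
    # "flatMap" step: split every line into lowercase words
    words = [word for line in lines for word in line.lower().split()]
    # "map" step: pair each word with an initial count of 1
    pairs = [(word, 1) for word in words]
    # "reduceByKey" step: sum the counts of identical words
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

sample = ["to be or not to be", "that is the question"]
result = word_count(sample)
```

In Spark the same three steps run in parallel across partitions of the input, which is what makes the example a useful first exercise.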

You have the option to choose the programming language you are most familiar with, so the examples will be written and explained in Java, Scala and Python (the Java version is to be confirmed).

We will try all Spark examples in local mode, so all you need is your own laptop. The major benefit of this is that you can continue learning and try new examples even after the workshop.

Trainer: Tudor Lapusan

Agenda and Sign up for Intro to Spark

2. ML & Spark: MLlib intro and exercises, limited to 15 places

Description: Theoretical understanding of various ML algorithms: RandomForest, Clustering.

What you will learn: How to solve ML problems using Spark and MLlib (the ML library on top of Spark).


Agenda and Sign up for ML & Spark

Join the CEE Innovators Community this November at How to Web Conference 2015


Hello everybody, this year we are supporting How To Web, a wonderful event created by an incredible team. See below their pitch to convince our readers to join their event. October 21 is the last day of Very Early Bird tickets – choose to be where interesting things happen.

Brace yourselves! How to Web is coming and it’s going to be… LEGENDARY! Meet some of the best & brightest minds in the CEE & beyond, listen to hands-on talks & practical case studies, get to feel the vibe of the CEE startup ecosystem! All these & more on November 26 & 27 at the 6th edition of How to Web, the most important conference on tech innovation & entrepreneurship in South Eastern Europe. Are you the founder of an early-stage tech startup with disruptive potential at global scale? Then apply now for Startup Spotlight, the competition & mentoring program with total cash prizes of USD 20.000!

Why attend?

How to Web Conference 2015 brings you to the same table as 1000+ startup founders, product managers, developers, online marketers & community leaders from all around the CEE. Beyond high-quality content, including case studies and hands-on talks on different topics, you have great networking opportunities by connecting with the who’s who of the regional tech industry.

Learn best practices from remarkable entrepreneurs & top-notch professionals

Successful entrepreneurs and experienced professionals from all around the world will take the main stage of How to Web Conference to share best practices & practical case studies, and deliver insightful talks on product launches, product metrics, product marketing, content marketing, conversion rate optimization through growth hacking, or team building and management, among many others.

Among the speakers that you’ll get to meet and learn from this year, there are:

  • Jan Reichelt, Co-Founder and President, Mendeley, a company that disrupted information sharing in scientific research, grew to 50 employees and a few million users, and was acquired by Reed Elsevier in April 2013;

  • Larry Gadea, Founder & CEO, Envoy, startup that allows businesses to check-in people and keep track of their visitors and that recently raised a USD 15 million series A round led by Andreessen Horowitz;

  • Martin Eriksson, Co-Founder, Mind the Product and Product Tank, entrepreneurial, driven and innovative product management professional with 18+ years experience in building leading online products;

  • Kalman Kemenczy, Director of Product,, a cloud-based (SaaS) presentation software and storytelling tool for presenting and sharing ideas on a virtual canvas;

  • Edial Dekker, Co-Founder, Gidsy & Hack de Overheid and ex-Senior Product Manager, Eventbrite;

  • Bram Kanstein, Founder, Startup Stash, a curated directory of 400 resources and tools for startups, that went viral and got 230.000 pageviews in 48 hours, and ex-European Community Manager, Product Hunt;

  • Sujan Patel, Renowned growth hacker, marketer, and serial entrepreneur, Co-Founder of and author of ”100 Days of Growth”, ebook sold in over 10,000 copies.

And many, many more! Check out the list of confirmed speakers on the conference website.

The secondary stage of the conference is dedicated to the startup community! It’s here where you’ll find out more about raising angel / VC money, startup valuation, managing the relationship with investors, or the accelerator experience. These add up to a series of community panels with regional experts moderated by the leaders of the local communities.

High level networking, dedicated activities, amazing parties

Get in touch with others, set meetings in advance and receive interests-based recommendations using How to Web Meet, the mobile app developed by mReady for the conference. Besides, you can always choose to join roundtables, “Ask the Expert” sessions or open discussions that will be organized in the two networking lounges throughout the event. And networking doesn’t end when the conference does: discover Bucharest’s vibrant nightlife, get social and connect with the innovators community by joining one of the networking cocktails, special meetups and exclusive parties set up for you!

Discover the latest trends & how tech is reshaping the world

Passionate about hardware & gadgets? Visit the gadget expo area to check out what’s new in the IoT world and test devices that use Artificial Intelligence to make your life better. Don’t miss the “Gadget Showcase” on stage to see live product demos & interactive presentations. From drones to exoskeletons, smartwatches & VR, we’ve got everything set for you to have an amazing experience!

Discover the startup opportunities offered by the Startup Spotlight program

Early stage startup looking for opportunities? You’re in the right place! Apply now for Startup Spotlight and get ready to close deals and compete for the USD 20.000 cash prizes! Join the program to get access to mentoring sessions customized to fit your needs, a curated deal-making pipeline, valuable connections, visibility & more. And all for free! All you have to do is submit your application by Friday, Oct. 30!

Very Early Bird tickets now available

See you on November 26 & 27 at How to Web Conference 2015, the meeting place for the innovators community! The conference will take place at Grand Cinema & More, Baneasa Shopping City, and Very Early Bird tickets are now available!

How to Web Conference 2015 is an event organized in collaboration with Telekom Romania, Bitdefender, and IXIA, with the support of Microsoft, Avangate, hub:raum, the Canadian Embassy in Romania, Mozilla and Okapi Studio.

See you at How to Web!