Let’s talk BigQuery

Not every big data project needs a complex architecture and an engineering team before you can start making sense of the data. So what should you do if you need some good old analysis and want to get started right away? Suppose, for example, that you’re part of a small company starting up a project and you need to analyze lots of data without spending extra time planning and building an architecture, hiring an architect or engineer, or managing infrastructure; you just need to see through your data and make sense of it. This is where Google’s BigQuery comes into play (of course there are many other potential uses, but let’s stick with this one for the moment). It is called, a bit pretentiously maybe, an Enterprise Cloud Data Warehouse solution, which in my opinion scares off many potential users upfront. In fact, BigQuery is helping many people to at least quick-start their path in the Big Data world.

As part of the preparation for our next workshop, Data Analytics with BigQuery, we interviewed Gabriel Preda (the trainer for the workshop but, most importantly, an enthusiastic user of the solution for the last couple of years) to give us a glimpse of what we should expect from this solution.

Why BigQuery? Why did it make sense to you?

Usually in a startup each person wears more than one hat. You put on the sysadmin hat… and you’re the sysadmin. Later you might need to wear the hat that says “innovation”… and start collecting GBs of data daily and, of course, processing them in a timely fashion. Being short on people, it was clear that we needed a SaaS solution.

In which use cases should we use BigQuery (analytical, data migration, cloud requirements)?

BigQuery is designed for OLAP (Online Analytical Processing) or BI. You should not use BigQuery for OLTP. The best use cases for BigQuery are ad hoc, trial-and-error interactive queries over large datasets for quick analysis and troubleshooting.
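
To make the OLAP versus OLTP distinction a bit more concrete, here is a minimal sketch of the kind of ad hoc analytical query described above. It is an illustration only: it uses Python’s built-in sqlite3 module as a local stand-in rather than BigQuery itself, and the events table, its columns and the top_pages_by_views helper are hypothetical names invented for this example.

```python
import sqlite3


def top_pages_by_views(rows, limit=3):
    """Aggregate raw page-view events into per-page totals (an OLAP-style query).

    NOTE: sqlite3 is only a local stand-in here; the table and column names
    are assumptions made for illustration, not part of any real dataset.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (page TEXT, country TEXT, views INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

    # Scan every row and aggregate, instead of fetching or updating one record.
    query = """
        SELECT page, SUM(views) AS total_views
        FROM events
        GROUP BY page
        ORDER BY total_views DESC, page ASC
        LIMIT ?
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (page, country, views)
sample_events = [
    ("/home", "RO", 120),
    ("/home", "DE", 80),
    ("/pricing", "RO", 45),
    ("/blog", "US", 200),
    ("/blog", "RO", 10),
]

print(top_pages_by_views(sample_events))
```

The point of the sketch is the shape of the workload: the query reads and aggregates many rows at once, whereas an OLTP workload looks up or updates individual records.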

Can you list the best fit scenarios for it?

I have used it successfully for in-house analytics solutions. But I think it’s one of the best candidates on the market for data fishing because of its ability to perform ad hoc queries on large amounts of data…

Is it more feasible to use it in projects where the data is already stored natively in the cloud (e.g. Google Cloud Storage)?

Data transfer towards BigQuery is free. You might have some costs in transforming the data, as there are some requirements on the data BigQuery can ingest. If you already have data in CSV or Avro (and soon Parquet), you can import it directly.

Which are the BigQuery alternatives/competitors?

I don’t know what to say about this… as it is quite a unique beast of a product!

Can you control where your data is, in case you have some requirements regarding location of your data?

You can choose between the US and the EU, but that is where it ends. There is some awesome news, though… there is an experimental extension to the BigQuery client that offers client-side encryption (homomorphic encryption) for a subset of query types. That is, you can encrypt your data, upload the encrypted data to BigQuery, run queries, fetch the results and decrypt them locally. It’s magic!

How do you visualize the results of the analysis or the correlations of the data in BigQuery?

In the worst-case scenario, when you can’t use the existing integrations, you can retrieve the results and use any visualization tool you are accustomed to. There are now a lot of integrations available, such as Tableau, Qlik, Talend, Informatica and SnapLogic, newcomers like Chartio, and even free & open source BI tools like Metabase. There is also a Google solution (for now in beta) called Data Studio, which covers more than BigQuery. I’ll do my best to add details about Data Studio during the workshop.

Interview by: Valentina Crisan – bigdata.ro

Data Analytics with BigQuery

BigQuery is generally seen as a “fast and fully-managed enterprise data warehouse for large-scale data analytics”. The workshop is designed to go through all the concepts of BigQuery and to provide a seamless start into using it. After this workshop you will be able to start a real project with BigQuery.

Date: 13 May, 9:30 – 14:00
Trainer: Gabriel Preda 
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15, no more places left
Price: 150 RON including VAT

Check out the agenda and register here

Analytics with Cassandra and Spark SQL Workshop

We continue the Spark SQL and Cassandra series with more hands-on exercises on the integration between the two solutions, working on the open MovieLens data. This workshop addresses those who know the basics of Cassandra & CQL and have SQL knowledge. Spark knowledge is not mandatory, although it would be good to know its basic concepts (RDDs, transformations, actions), since we will not cover these concepts in the workshop but will mention them on several occasions. Without the basic Spark concepts you will still understand the aggregations that can be done at the Spark SQL level, but you will not fully understand how Spark SQL integrates into the whole Spark system.
In this workshop you will understand the optimal way of making queries in a solution composed of Apache Cassandra and Apache Spark.
Prerequisites: Cassandra Concepts knowledge, SQL knowledge

Trainers: Felix Crisan, Valentina Crisan
When: 22 April
Time: 9:30-14:00
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15 (5 places left)
Price: 125 RON including VAT

Check out the agenda and register here.

Analytics with Cassandra and Spark SQL Workshop

If you have learned about Apache Cassandra, you have realized by now that Cassandra is a storage and pre-aggregation layer, so a computational layer has to exist in order to complete the queries we would like to run on our data. In this workshop we will look at the analytics that can be done on top of Cassandra with Spark SQL. We will start with similar examples in CQL and Spark SQL and then evolve into examples that can only be run with Spark SQL.

Trainers: Felix Crisan, Valentina Crisan
When: 18 March
Time: 9:30-14:30
Location: eSolutions Academy, Budişteanu Office Building, strada General Constantin Budişteanu Nr. 28C, etaj 1, Sector 1, Bucureşti.
Number of places: 15 (no more places left)
Price: 125 RON including VAT

There are no more places left for this session, but you can check out the agenda and register here and we’ll keep you informed if places become available.

SQL & noSQL: Intro in Cassandra

In this 4-hour session we will learn about Cassandra concepts and its data model, and what analytics can be done with it. We will discuss several of the noSQL solutions out there and how Cassandra differs from them, and we will work on real data: importing data into Cassandra, learning CQL and the data modeling rules. We will use Docker for local installations of Cassandra.

Trainers: Felix Crisan, Valentina Crisan
When: 30 July
Time: 9:30-13:30
Location: Impact Hub ( http://www.impacthub.ro/ )
Number of places: 15 2
Price: 100 RON without VAT

Check out the agenda and register here

Using Elasticsearch for Logs – hands on session by Radu Gheorghe

This workshop gives an overview of what Elasticsearch can do and how you would use it for searching and analyzing logs and other time-series data (metrics, social media, etc).

Trainer: Radu Gheorghe
When: 25 June
Location: Impact Hub ( http://www.impacthub.ro/ )
Time: 9:30 – 13:30
Price: 100 RON without VAT
Number of places: 15 NO MORE SEATS LEFT.

There are no more seats left for this session. If you want to sign up for a future session, check out the following link: Sign up for Future Elasticsearch hands on session


Join the CEE Innovators Community this November at How to Web Conference 2015


Hello everybody, this year we are supporting How To Web, a wonderful event created by an incredible team. Below is their pitch to convince our readers to join the event. October 21 is the last day of Very Early Bird tickets, so choose to be where interesting things happen.

Brace yourselves! How to Web is coming and it’s going to be… LEGENDARY! Meet some of the best & brightest minds in the CEE & beyond, listen to hands-on talks & practical case studies, and get to feel the vibe of the CEE startup ecosystem! All these & more on November 26 & 27 at the 6th edition of How to Web, the most important conference on tech innovation & entrepreneurship in South Eastern Europe. Are you the founder of an early-stage tech startup with disruptive potential at global scale? Then apply now for Startup Spotlight, a competition & mentoring program with total cash prizes of USD 20.000!

Why attend?

How to Web Conference 2015 brings you to the same table as 1000+ startup founders, product managers, developers, online marketers & community leaders from all around the CEE. Beyond high-quality content, including case studies and hands-on talks on different topics, you have great networking opportunities by connecting with the who’s who of the regional tech industry.

Learn best practices from remarkable entrepreneurs & top-notch professionals

Successful entrepreneurs and experienced professionals from all around the world will take the main stage of How to Web Conference to share best practices & practical case studies, and deliver insightful talks on product launches, product metrics, product marketing, content marketing, conversion rate optimization through growth hacking, or team building and management, among many others.

Among the speakers that you’ll get to meet and learn from this year, there are:

  • Jan Reichelt, Co-Founder and President, Mendeley, a company that disrupted information sharing in scientific research, grew to 50 employees and a few million users, and was acquired by Reed Elsevier in April 2013;

  • Larry Gadea, Founder & CEO, Envoy, a startup that allows businesses to check in people and keep track of their visitors and that recently raised a USD 15 million series A round led by Andreessen Horowitz;

  • Martin Eriksson, Co-Founder, Mind the Product and Product Tank, an entrepreneurial, driven and innovative product management professional with 18+ years of experience in building leading online products;

  • Kalman Kemenczy, Director of Product, Prezi.com, a cloud-based (SaaS) presentation software and storytelling tool for presenting and sharing ideas on a virtual canvas;

  • Edial Dekker, Co-Founder, Gidsy & Hack de Overheid and ex-Senior Product Manager, Eventbrite;

  • Bram Kanstein, Founder, Startup Stash, a curated directory of 400 resources and tools for startups, that went viral and got 230.000 pageviews in 48 hours, and ex-European Community Manager, Product Hunt;

  • Sujan Patel, renowned growth hacker, marketer, and serial entrepreneur, Co-Founder of ContentMarketer.io and author of “100 Days of Growth”, an ebook that has sold over 10,000 copies.

And many many more! Check out the list of confirmed speakers on the conference website.

The secondary stage of the conference is dedicated to the startup community! It’s here where you’ll find out more about raising angel / VC money, startup valuation, managing the relationship with investors, or the accelerator experience. These add up to a series of community panels with regional experts moderated by the leaders of the local communities.

High level networking, dedicated activities, amazing parties

Get in touch with others, set meetings in advance and receive interest-based recommendations using How to Web Meet, the mobile app developed by mReady for the conference. Besides, you can always choose to join roundtables, “Ask the Expert” sessions or open discussions that will be organized in the two networking lounges throughout the event. And networking doesn’t end when the conference does: discover Bucharest’s vibrant nightlife, get social and connect with the innovators community by joining one of the networking cocktails, special meetups and exclusive parties set up for you!

Discover the latest trends & how tech is reshaping the world

Passionate about hardware & gadgets? Visit the gadget expo area to check out what’s new in the IoT world and test devices that use Artificial Intelligence to make your life better. Don’t miss the “Gadget Showcase” on stage to see live product demos & interactive presentations. From drones to exoskeletons, smartwatches & VR, we’ve got everything set for you to have an amazing experience!

Discover the startup opportunities offered by the Startup Spotlight program

Early stage startup looking for opportunities? You’re in the right place! Apply now for Startup Spotlight and get ready to close deals and compete for the USD 20.000 cash prizes! Join the program to get access to mentoring sessions customized to fit your needs, a curated deal-making pipeline, valuable connections, visibility & more. And all for free! All you have to do is submit your application by Friday, Oct. 30!

Very Early Bird tickets now available

See you on November 26 & 27 at How to Web Conference 2015, the meeting place for the innovators community! The conference will take place at Grand Cinema & More, Baneasa Shopping City, and Very Early Bird tickets are now available!

How to Web Conference 2015 is an event organized in collaboration with Telekom Romania, Bitdefender, and IXIA, with the support of Microsoft, Avangate, hub:raum, the Canadian Embassy in Romania, Mozilla and Okapi Studio.

See you at How to Web!

Big Data in Tg Mures

It looks like Big Data is getting more and more attention, and more cities are starting events that include this topic. We have therefore decided to add a new bigdata.ro page where we will present these events. The first event we will feature takes place in Tg Mures on June 2nd, where Kodok Marton will talk about “Complex Realtime Event Analytics Using BigQuery”.

Join the event if you are in the Tg Mures area. You can see the event details here.

Looking forward to seeing a Big Data community forming in Tg Mures as well.

Hadoop MapReduce and Spark training

On the occasion of completing the Big Data Romanian Tour in Bucharest, we have planned a light intro and hands-on session for MapReduce and Spark. Together with the trainers, Tudor Lapusan and Andrei Avramescu, we are planning a dynamic and interactive session that goes from theory to practice. We have only 4 hours, so we will need to be pretty fast, which is why we will send some materials in advance for you to get familiar with. The good thing is that at the end of this session you should be able to start your own Hadoop or Spark cluster, pump data into it and start the analysis. Tudor and Andrei will also talk about the differences between Spark and Hadoop MapReduce. We will have a limited number of places available for this training, since we will need to work with everybody on the hands-on part, so if you are interested, don’t wait. Register here for the training and tell us your expectations so that we can prepare better in advance. See you all soon & if you have any questions, contact us at contact at bigdata.ro or comment on this post.

Romanian Big Data Roadshow: Cluj, Timisoara, Bucharest

3 Romanian communities, more than 700 members, incredible energy and subjects: these are just a few of the ingredients of the Romanian Big Data Roadshow, which brings together speakers and members of three dynamic communities that have in common an extremely high interest in Big Data & Data Science.

A bit of background for the event

I met (virtually) Alex Sisu and Tudor Lapusan, the organizers of the Timisoara and Cluj Big Data meetups, in early 2014, and it was not long after that we decided we needed to organize a Big Data event, given:

– the amount of knowledge, but also of interest, that exists in Romania in this particular area

– as well as our similar thoughts on keeping such events as technology- and company-independent as possible, in order to keep the content genuine and not turn presentations into sales pitches.

To get started and see how our collaboration goes, we began with this roadshow. Follow us on this page (or on the respective pages of the local communities for the detailed agendas, presentations and pictures) to see what the roadshow looks like, and stay tuned for what we’re planning next.