Big Data Bucharest Meetup, June 12

Meetup Registration

  • 17:00 – 17:30 Welcome
  • 17:30 – 18:10 Machine Learning: From (Big) Data to (Big) Decisions, Marius Fersigan
    • Data is everywhere. You are data – or at least an imperfect representation of your DNA. You are also a data generator: your blood pressure today, your heart rate tomorrow and the sustained electrical signals in your brain can all be represented as data. And there is more data from all kinds of fields: business and industry, healthcare systems, the environment, our solar system – and the list can continue with every existing microscopic and macroscopic system.
      On the other hand, data – big and small – is useless on its own: we do not care about data unless we can learn something valuable from it: information, knowledge and the ability to make good decisions. We need to make decisions in order to prevent things from happening (who likes cancer?), and some of us want to make things happen (profit sounds great) – and where disaster is unavoidable, a good prediction can lower the harm.
      So here we are, in the realm of machine learning – the “god” who gave our computers the gift of learning. This presentation is a short (and general) review of machine learning history, algorithms and applications in the current Big Data ecosystem.
    • Marius Fersigan: Machine Learning aficionado (focus on genetic programming), dreaming of an artificial intelligence system that will speed up the scientific research process. In recent years I have been involved in artificial intelligence R&D for electronic trading. Currently I am involved in cancer genomics research (together with SAIA – Solutions of Artificial Intelligence Applications).

 

  • 18:10 – 18:50 Neural Networks in Machine Learning, Alexandru Sisu
    • Neural networks represent a powerful tool for approaching different types of machine learning problems. In this talk we will explore what neural networks are, starting from the simple perceptron model and moving up to deep learning models. We will demo code that performs handwritten digit classification, and we will explain different training strategies for neural networks. It will help in understanding if you have some:
      – basic mathematical skills
      – basic python knowledge
    • Alexandru Sisu – Timisoara Big Data Meetup Organizer. Big data engineer working at Atigeo. Interested in machine learning and neural networks, and currently researching optimization techniques in training deep neural networks.
  • 18:50 – 19:00 Break

 

  • 19:00 – 20:00 Analyzing logs with Elasticsearch, Radu Gheorghe
    • An Elasticsearch demo session in which we will show you how to analyze Apache logs:
      parse with Logstash, import into Elasticsearch, visualize with Kibana. We will also go through theoretical aspects like indexes, types of indexes, aggregations, searches, refresh intervals, time-based indices, store throttling, hot & cold nodes and many others.
    • Radu Gheorghe: Radu works for Sematext Group, mainly as a consultant and trainer on Elasticsearch. He is also the author of the book “Elasticsearch in Action”.
  • 20:00 – 20:40 In-memory data pipeline and warehouse at scale using Spark, Spark SQL, Tachyon & Parquet, Ema Orhian & Radu Chilom 
    • Live demo of building an in-memory data pipeline and data warehouse from a web console, with architectural guidelines and lessons learned. The tools and APIs behind it are built on top of Spark, Tachyon, Mesos/YARN and Spark SQL, and use our own open-sourced Spark Job Server and HTTP Spark SQL REST Service. In this talk we will showcase the strengths of all these open source technologies and share the lessons learned while using them to build an in-memory data pipeline (data ingestion and transformation) and a data warehouse for interactive querying of the in-memory datasets (Spark JVM and Tachyon) through a familiar SQL interface.
    • Ema Orhian: Senior software engineer at Atigeo and main committer on GitHub for a highly scalable and resilient RESTful (HTTP) interface on top of a managed Spark SQL/Shark session (https://github.com/Atigeo/jaws-spark-sql-rest). Enthusiastic engineer, interested in scaling algorithms and implementing statistical models.
    • Radu Chilom: Big data engineer at Atigeo, working on scalable applications in order to make big data matter. As a developer, I love to work with new technologies, learning new things on a daily basis. Contributor to ambitious open source projects and main committer on spark-job-rest (https://github.com/Atigeo/spark-job-rest).

 

  • 20:40 – 21:30 Food & Socializing

 
