Check for Predicate Pushdown in BigQuery with Apache Spark on Databricks

When I tested the features of the recently released Databricks on Google Cloud Platform, I checked out the BigQuery integration. Databricks uses a fork of the open-source Google Spark connector for BigQuery. So I wondered how to check whether a given predicate of a query is actually pushed down to BigQuery. It turns out it is easy!
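As a rough illustration (not necessarily the method used in the linked article), one general way to see pushdown in Spark is to look at the physical plan text, where data source scans typically list a `PushedFilters: [...]` entry. The sketch below only parses such text with the Python standard library. The helper names `split_top_level` and `pushed_filters` are made up for this example, the sample plan is hand-written rather than real output, and whether the Databricks fork of the connector prints its plan in exactly this shape is an assumption.

```python
import re


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def pushed_filters(plan_text):
    """Collect the filters listed in every 'PushedFilters: [...]' entry.

    Limitation of this sketch: a filter that itself contains square
    brackets would cut the match short.
    """
    filters = []
    for match in re.finditer(r"PushedFilters: \[([^\]]*)\]", plan_text):
        body = match.group(1).strip()
        if body:
            filters.extend(split_top_level(body))
    return filters


# Hand-written sample in the style of a Spark physical plan (illustrative only).
SAMPLE_PLAN = (
    "*(1) Filter (isnotnull(year#1) AND (year#1 > 2015))\n"
    "+- *(1) Scan SomeRelation [name#0,year#1] "
    "PushedFilters: [IsNotNull(year), GreaterThan(year,2015)], "
    "ReadSchema: struct<name:string,year:bigint>"
)

if __name__ == "__main__":
    for f in pushed_filters(SAMPLE_PLAN):
        print(f)
```

In a notebook, the plan text would come from the output of `df.explain()` on the DataFrame that reads from BigQuery. An empty result from `pushed_filters` would suggest that the predicate is evaluated in Spark rather than in BigQuery.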

https://medium.com/geekculture/predicate-pushdown-for-apache-spark-with-google-bigquery-2ad4f9e81e6

JavaOne 2017: Open Source Big Data in the Cloud (Hadoop, Hive, Spark, Kafka)

It’s true. I have always said, “Presenting at JavaOne is like playing in the Champions League”. Last month I had the great pleasure of presenting at the JavaOne 2017 conference in San Francisco, together with Edelweiss Kammermann, about open source big data in the cloud. The presentation included 4 live demos of Apache Hadoop with MapReduce, Apache Hive, Apache Spark and Kafka, all using Oracle Big Data Cloud Service – Compute Edition (aka BDCS-CE) and the Oracle Event Hub Service. The presentation was recorded, so you can enjoy it from anywhere in the world.

For your convenience, the slides are available on SlideShare:

Feedback: Big Data Training: Hadoop, Spark, Kafka, Cassandra, Oracle and the Cloud

Guys, thanks for attending my DOAG training day about Cloudera/Hadoop and Oracle Big Data. I am delighted with your amazing feedback!

Statistics about Big Data Training Day

About the course

Attendees gave the following opinions about the course when asked right after the training:

  • 100% of those who answered would recommend the course
  • 100% of all found it interesting
  • 81% found that content matched their experience
  • 86% were happy or very happy with the course (these are the 2 highest grades possible)
  • 0% were unhappy
  • Everyone except one person (that is, 20 people) found the level of difficulty okay.
  • Everyone except one person found that they were engaged enough

About myself

The following is the feedback attendees gave about me (there were no explicit questions about Edelweiss, so unfortunately she is not included). Multiple answers were possible, and answers weren’t mandatory. This means that someone who ticked a box, e.g. “informative”, sincerely meant it.

  • 86% found me interesting
  • 40% entertaining 🙂
  • 90% informative
  • 71% demonstrative and clear

It’s a wrap: Oracle and Cloudera Big Data Training – On Premise and Cloud

Wow – we have done it! Weeks of reading, trying out tools, and hacking went into the preparation of this training course. Even during the OTN APAC tour I took some days off and worked from my Bangkok home office to prepare for the DOAG training day.

In the end it was totally worth it. I had 21 top-notch DBAs and developers on the attendee list, and some 10 students attended as well. DOAG runs a good student program: in return for helping out a little, students are allowed to attend sessions and the DOAG training day. Quite often I get a lot of them. Two years ago I ran a full-day multi-cloud training, and every single student, including their professor, decided to attend my session, although they had the choice among 6 different trainings. Anyway, it is nice to be popular with the young people. Next time I will come in sneakers and wear that Cloudera T-shirt. Also, it has become a bit of a challenge for Oracle to attract students, so I am glad to help 🙂

The fabulous news was that Edelweiss took over the Oracle part, so I was lucky enough to talk about what I love: the open source and Cloudera part.

Edelweiss did her session via Skype. I was a bit sceptical and expected technical problems because of the network latency, but it went swimmingly. The conference room had good speakers, so I could play some music during the break, and Edelweiss almost seemed to be present in the room, just invisible.


Well now you know it. This is what is cool.


A really great surprise was that everyone enjoyed the idea of working with VirtualBox and accepted the labs that I had prepared. Most people took them home to play further with the Big Data Lite instance.

In the end we covered a whole lot of content in a long day:

From Open Source / Cloudera Stack 

From Oracle Big Data Products

  • Oracle Big Data Appliance
  • OBIEE
  • Oracle Data Integrator (ODI)
  • Oracle Big Data Discovery

Training Day: Cloudera Hadoop Stack with Kafka and Cassandra and Oracle Big Data / BI and the Cloud

Right after the DOAG conference 2016 in Nürnberg / Germany we will be running a big data training day. Meet the big 4: Hadoop, Spark, Kafka (all 3 from the Cloudera distribution), and Cassandra. Plus Oracle Big Data on top. For details in German see here.

Topics of the workshop are the following (subject to change):
Open Source / Cloudera Stack 

Oracle Big Data Products

  • Oracle Big Data Appliance
  • OBIEE
  • Oracle Data Integrator (ODI)
  • Oracle Big Data Discovery

The training day will include live demos, some of which will run in the cloud. So stay tuned!

This event will be co-hosted by Edelweiss Kammermann, Oracle ACE Director from Uruguay and BI expert. Edelweiss will mostly present Oracle Big Data / Business Intelligence, whereas I will try to cover the open source Hadoop / Cloudera part.
