Check for Predicate Pushdown in BigQuery with Apache Spark on Databricks

When I tested the features of the recently released Databricks on Google Cloud Platform, I checked out the BigQuery integration. Databricks uses a fork of the open-source Google Spark Connector for BigQuery, so I wondered how to check whether a certain predicate of a query is actually pushed down to BigQuery. It turns out it is easy!

https://medium.com/geekculture/predicate-pushdown-for-apache-spark-with-google-bigquery-2ad4f9e81e6

Drone Delivery of your Amazon Orders with Apache Kafka

At re:Invent 2019 I claimed the stage with a presentation about Apache Kafka and drone delivery of Amazon orders. Note, there are no slides or recordings available for this presentation.

For slides about Apache Kafka on AWS, please check my speakerdeck account or my YouTube channel.

Apache Kafka on AWS

I recently collected some resources about running Apache Kafka on AWS (with or without using Amazon Managed Streaming for Apache Kafka) after recording a webcast about it.

You can find the Medium article with all the links and resources here.

If you are only interested in the slide set about Apache Kafka on AWS, it’s on speakerdeck:

KubeCon 2019 Barcelona: Service Mesh Presentation (AWS App Mesh)

Earlier this year in May, I presented at KubeCon 2019 in Barcelona to almost 600 attendees about what a service mesh is and how AWS does service mesh. It was the best-rated presentation at the AWS Container Day (4.4 of 5.0).

For more resources regarding Istio and Envoy on Amazon EKS, have a look at this article that I wrote before App Mesh was released.

Serverless Days Zurich: 2019 Update – Serverless Beyond Lambda

This Thursday I will present a new topic at Serverless Days Conference 2019 in Zürich. Once done, I will publish a link to the slides here.

2019 Update: Serverless Beyond Lambda

https://zurich.serverlessdays.io/speakers/frank-munz.html

Keynote: Culture of Innovation at Amazon

https://techcamp.hamburg/events/keynote-culture-of-innovation-at-amazon/

Next week I will give the keynote at the TechCamp conference 2019 in Hamburg.

The topic of the first evening keynote is “Culture of Innovation at Amazon”.

Istio & Envoy: Is a Service Mesh the New Service Bus?

Check out the details about my presentation at CODE One in San Francisco.

Frank Munz: Istio & Envoy from Service Bus to Service Mesh ?

… read the full story on my new medium account. Or just watch the video of the “Istio and Envoy on Amazon EKS” presentation.

New Episode.

It’s a bit late, I know, and I guess most of you have read it on LinkedIn or Twitter already: I accepted a job offer and now work as Senior Technical Evangelist for AWS. So more cloudy things, yeiii.

This is a great time to check out my speakerdeck account.

Now Certified AWS Developer, SysOps, and Solutions Architect Associate 2018

April was a good month. I have now completed all three Amazon Web Services (AWS) associate certifications: AWS Certified Solutions Architect, AWS Certified Developer (2018), and AWS Certified SysOps Administrator (2018).

json-server

A quick reminder about how to install json-server, which I use in many microservices workshops to quickly expose a REST API:

sudo yum install epel-release 
sudo yum install nodejs 
sudo yum install npm 
sudo npm install -g json-server

Kubernetes from the kubectl Command Line

Oracle Container Engine (OCE)

You have most likely read the news that Oracle joined the CNCF and now offers a Kubernetes service named Oracle Container Engine. You can use OCE either nicely integrated with the CI/CD tool Wercker or from the command line.

OCE with Wercker

I will present about using Kubernetes together with Wercker at the CODE conference in New York City. So stay tuned for slides and possibly a recording.

OCE with standard kubectl CLI

OCE is standard upstream Kubernetes. So with an existing kubectl client that correctly points to your OCE instance, you can try your first K8s steps from the CLI. Here is a quick primer.

The first thing to note is that you should set your namespace if you are using the OCE trial. The reason is that the trial uses a shared cluster and different users are assigned different namespaces. Don’t worry: if you are following the Wercker example given to the trial participants, the namespace will be set correctly.

Set your namespace, replacing fmunz with your own namespace in the command below:

$ kubectl config set-context $(kubectl config current-context) --namespace=fmunz

Create a deployment and scale it to 3 replicas:

$ kubectl run microg --image=fmunz/microg --port 5555
$ kubectl scale --replicas=3 deployment/microg

Note that so far you only have pods running, but no service, so your containers are not reachable from the outside. Now expose the deployment as a service of type NodePort:

$ kubectl expose deployment microg --type=NodePort

Maybe the most difficult question is how to access the service. You can find the node IPs of the pods using the wide output flag when retrieving information about the pods:

$ kubectl get pods -o wide

NAME                                  READY     STATUS    RESTARTS   AGE       IP              NODE
microg-858154966-bfhg9                1/1       Running   0          2h        10.244.56.146   129.213.30.58
microg-858154966-k4v07                1/1       Running   0          2h        10.244.93.146   129.213.58.116
microg-858154966-p1tn9                1/1       Running   0          2h        10.244.99.40    129.213.36.50

Pick one of the NODE IPs, e.g. 129.213.30.58. Next, retrieve the NodePort that was created.

$ kubectl describe service microg | grep NodePort
Type:                     NodePort
NodePort:                 <unset>  32279/TCP

Now you can simply combine port and IP for the URL to access the service:

$ curl -s 129.213.30.58:32279

This shows the following service response:

{"date":"Tuesday, 27-Feb-18 15:12:47 UTC","ip":"10.244.99.40","rel":"v1.0","cnt":174}

Hit it a couple more times to investigate the load balancing.
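The manual steps above (look up the NodePort, combine it with a node IP, then call the service repeatedly) can be sketched as a few small shell functions. This is only a minimal sketch: the function names node_port_from_describe, service_url and tally_pod_ips are made up for illustration, and the parsing assumes the kubectl describe output and the JSON response format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl describe service` output on stdin and
# print the port number from the "NodePort:" line (e.g. "32279/TCP" -> 32279).
node_port_from_describe() {
  awk '$1 == "NodePort:" { split($NF, parts, "/"); print parts[1]; exit }'
}

# Hypothetical helper: combine a node IP (first argument) with the NodePort
# parsed from stdin into the URL of the service.
service_url() {
  node_ip="$1"
  port="$(node_port_from_describe)"
  [ -n "$port" ] || return 1
  printf 'http://%s:%s\n' "$node_ip" "$port"
}

# Hypothetical helper: read JSON responses (one per line) on stdin and count
# how often each pod IP (the "ip" field) answered.
tally_pod_ips() {
  sed -n 's/.*"ip":"\([^"]*\)".*/\1/p' | sort | uniq -c
}

# Usage against a real cluster (assumes kubectl is configured as above):
#   url="$(kubectl describe service microg | service_url 129.213.30.58)"
#   for i in 1 2 3 4 5 6; do curl -s "$url"; echo; done | tally_pod_ips
```

If the load balancing works, the tally should list more than one pod IP after a handful of requests.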

Serverless Docker Containers with AWS Fargate / ECS

I took Fargate on ECS for a quick spin the other day. My idea was to use reveal.js for serving slides about Docker in a Docker container, something I showed last year at OUGN 2017 in Norway. To make this Docker container run serverless on AWS Fargate, I created a new task and specified the container as shown in the webcast.

I guess you can just follow my webcast to create an example yourself. The only edit I made was removing the parts where you wait for the provisioning. A few things to point out:

  • I recommend getting started with the ECS quick start wizard.
  • It helps if you understand ECS tasks and services.
  • The Docker image I used: fmunz/slidesougn17
  • Surprisingly, the hardest part is finding the correct IP to connect to the running container. Note that there is obviously no EC2 instance; the ENI is attached to the ECS task. So you can find the IP under Task -> public IPv4.

Have (serverless) fun!

I will be speaking at CODE in New York!

Stay tuned for more details, but my presentation about Kubernetes was accepted for the CODE conference 2018 in New York City, March 8th. That is of course fantastic news 🙂

In more detail: I will present the evolution of containers, from Docker, to Swarm, to container orchestration systems, Kubernetes and managed Kubernetes (such as Oracle Container Engine or others). At the end I guess you will agree that Kubernetes is great and getting better every day, but that you won’t want to manage your own Kubernetes cluster. Interestingly enough, Bob Quillin summarised my CODE presentation really well as the new Oracle strategy.

Oracle CODE New York

Of course we will have a lot of fun live coding with Mini, the Raspi cluster, again. I plan to demo the setup of the cluster, service deployment, load balancing, failover etc. All this live on stage with hopefully a really big screen for the projection.

New DZone publication: Serverless with Fn Project on Kubernetes

Today I realised that my Serverless with Fn on Kubernetes article was published on DZone. That is great news. Not sure why, but I hadn’t paid much attention to DZone until I realised lately how much good content is published there. E.g. check out the refcards!

Serverless with Fn Project on Kubernetes for Docker (Mac)

Docker for Mac

Last week I deployed Fn Project on Kubernetes as a quick smoke test. Fn is the new serverless platform that was open sourced at Java One 2017. Running it on Kubernetes is easier than ever because Docker directly supports Kubernetes now, as announced at the last DockerCon. In the end it just worked without any issues.

To reproduce the steps, first make sure the latest version of Docker with Kubernetes support is installed properly and Kubernetes is enabled (in my case this is 17.12.0-ce-mac45 from the edge channel).

Prerequisites and Checks

List the images of running Docker containers. This should show you the containers required for K8s if you enabled it in the Docker console under preferences:

$ docker container ls --format "table {{.Image}}"

Next, check if there are existing contexts. For example, I have minikube and GKE configured as well. Make sure the * (asterisk) is set to docker-for-desktop:

$ kubectl config get-contexts
CURRENT   NAME                                         CLUSTER                                      AUTHINFO                                     NAMESPACE
*         docker-for-desktop                           docker-for-desktop-cluster                   docker-for-desktop                           
          gke_fmproject-194414_us-west2-a_fm-cluster   gke_fmproject-194414_us-west2-a_fm-cluster   gke_fmproject-194414_us-west2-a_fm-cluster   
          minikube                                     minikube                                     minikube                                  

If it is not set correctly, you can point kubectl to the correct Kubernetes cluster with the following command:

$ kubectl config use-context docker-for-desktop

Also you can see the running nodes:

$ kubectl get nodes
NAME                 STATUS    ROLES     AGE       VERSION
docker-for-desktop   Ready     master    9d        v1.8.2

Check out the cluster, it just consists of a single node:

$ kubectl cluster-info
Kubernetes master is running at https://localhost:6443
KubeDNS is running at https://localhost:6443/api/v1/namespaces/kube-system/services/kube-dns/proxy

Setup

To get better visibility into K8s I recommend installing the Kubernetes Dashboard:

$ kubectl create -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml

The dashboard is running in the kube-system namespace and you can check this with the following command:

$ kubectl get pods --namespace=kube-system

Enable Port Forwarding for the dashboard

Enable port forwarding to port 8443 with the following command and make sure to use the correct pod name:

$ kubectl port-forward kubernetes-dashboard-7798c48646-ctrtl 8443:8443 --namespace=kube-system

With a web browser connect to https://localhost:8443. When asked, allow access for the untrusted site and click on “Skip”.
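The suffix of the dashboard pod name differs on every cluster, so copying it by hand is error-prone. As a minimal sketch, the pod name could be looked up from the pod listing instead. The function name dashboard_pod_name is made up for illustration, and the parsing assumes the default column layout of kubectl get pods with the pod name in the first column.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` output on stdin and print the
# name of the first pod whose name starts with "kubernetes-dashboard".
dashboard_pod_name() {
  awk '$1 ~ /^kubernetes-dashboard/ { print $1; exit }'
}

# Usage against a real cluster (assumes kubectl points to docker-for-desktop):
#   pod="$(kubectl get pods --namespace=kube-system | dashboard_pod_name)"
#   kubectl port-forward "$pod" 8443:8443 --namespace=kube-system
```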

Alternative to Port Forward: Proxy

Alternatively you could access it via the proxy service:

$ kubectl proxy

Then open the following URL in the browser:

http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/

Microservice smoke test

The following steps are not necessary to run Fn Project. However, I first deployed a small microservice to see if Kubernetes was running fine on my Mac. Feel free to skip that entirely. To copy what I did, you could follow the steps for load balancing a microservice with K8s.

Fn on Kubernetes

Helm

Make sure your Kubernetes cluster is up and running and working correctly. We will use the K8s package manager Helm to install Fn.

Install Helm

Follow the instructions to install Helm (https://docs.helm.sh/using_helm/#installing-helm) on your system, e.g. on a Mac it can be done with brew. Helm will talk to Tiller, a deployment on the K8s cluster.

Init Helm and provision Tiller

$ helm init
$HELM_HOME has been configured at /Users/frank/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.
Happy Helming!

Install Fn

You can simply follow the instructions about installing Fn on Kubernetes. I put the steps here for completeness. First, let’s clone the fn-helm repo from GitHub:

$ git clone https://github.com/fnproject/fn-helm.git && cd fn-helm

Install chart dependencies (from requirements.yaml):

$ helm dep build fn

Then install the chart. I chose the release name fm-release:

$ helm install --name fm-release fn

Then make sure to set the FN_API_URL as described in the output of the command above.

This should be it! You should see the following deployment from the K8s console.

Try to run a function. For more details check the Fn Helm instructions on GitHub.

Summary

Installing Fn on K8s with Helm should work on any Kubernetes cluster. Give it a try yourself, code some functions and run them on Fn / Kubernetes. Feel free to check out my Serverless slides.

Basic Load Balancing with Kubernetes

Howto: Getting Started with Kubernetes and Basic Load Balancing

This posting describes a way to deploy a service, expose it via the NodePort, scale it to 3 and observe basic load balancing. Enjoy!

Run a microservice on Kubernetes

First create a deployment, which also creates a pod. This image was used at several conferences to demonstrate Docker features and has proven to be a suitable container for exploring load balancing with Kubernetes.

$ kubectl run micro --image=fmunz/micro --port=80

deployment "micro" created


$ kubectl get pods

NAME                           READY     STATUS    RESTARTS   AGE
micro-7b99d94476-9tqx5         1/1       Running   0          5m

Expose the micro service

Before you can access the service from the outside, it has to be exposed:

$ kubectl get deployments

NAME          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
micro         1         1         1            1           7m


$ kubectl expose deployment micro --type=NodePort

service "micro" exposed

Find out its port number

$ kubectl describe service micro | grep NodePort

Scale service to 3

$ kubectl scale --replicas=3 deployment/micro

deployment "micro" scaled

$ kubectl get deployments
NAME          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
micro         3         3         3            3           1d

Explore the load balancing

Now you will have 3 pods running the microservice. Access the service via the browser with the following URL:

http://localhost:NODE_PORT

Refresh the page 3 times. You will see that different pods will serve your request because a different IP is returned each time.

ReadyApp Framework for WebLogic: Good fit for Kubernetes

Oracle and CNCF

Oracle joined the CNCF in 2017. So it doesn’t come as a big surprise that at least since Open World 2017 you can observe an overall trend towards cloud native applications in the Oracle application space. The move towards integrating Oracle Fusion Middleware and Kubernetes is for sure an important part of this movement.

WebLogic / Deployments and ReadyApp Framework

There is ongoing work to make WebLogic ready for Kubernetes. I will write about some more details later this year, however an interesting step is the ReadyApp Framework for WebLogic.

The ReadyApp framework helps load balancers to detect server readiness by providing a reliable health-check URL. In addition, EAR-based or WAR-based applications can register with the framework by adding a single line to the application’s WebLogic deployment descriptor.

WebLogic and Kubernetes Readiness Probe

When running WebLogic on K8s, the ReadyApp Framework becomes useful if, for example, you integrate it with the Kubernetes readiness probe. This way K8s is able to know that WebLogic and its applications are indeed functional (and not only that the Docker container was started).

ReadyApp Framework and JCS

If your use case isn’t WebLogic on K8s, obviously ReadyApp Framework makes sense as well. It can be used with JCS or on-premises.

Analytics and Data Summit 2018: Serverless and Machine Learning + Open Source Big Data in the Cloud

The year has just started and here is the first good news already: my presentation about “Serverless Architectures and Machine Learning” was accepted for the Analytics and Data Summit 2018 (formerly the BIWA conference). The presentation will include a live demo with Fn Project.

In addition to that I will give another presentation together with Edelweiss Kammermann about Open Source Big Data (with Hadoop, Hive, Spark and Kafka live demos) in the Cloud. IMHO, two fabulous topics. I am looking forward to seeing you there!

A Serverless / FaaS Classification

At the time of writing there are more than a dozen FaaS frameworks or platforms available. These frameworks or platforms can be classified into three different categories based on their objective and reach.

The three categories are as follows:

  1. Complexity:
    Reduce the complexity of a particular vendor’s cloud based FaaS implementation, e.g. the configuration of the API gateway and access management that is required for a REST based serverless function. A typical example for this category: AWS Chalice.
  2. Portability:
    Provide an abstraction framework for portability and ease of use on top of the FaaS implementation of various public cloud providers. A popular example is the serverless.com framework.
  3. Standards:
    Provide a standards-based serverless platform or framework to abstract running functions from the operation of servers. These frameworks are typically developed without a particular cloud provider in mind. When running such a framework on top of IaaS, servers are abstracted away and automated scaling is possible, but no true pay-per-invocation pricing is achieved due to the IaaS pricing model. Examples for this category are OpenFaaS and Fn Project.

 

Fn Project in Public Clouds (aka Serverless on IaaS ?)

Fn in Public Clouds (IaaS)

Fn Project is a cloud-agnostic FaaS platform, and a common question is how to use Fn in public clouds. Similar to the local installation that we used in the Oracle blog posting (link soon), it can also be installed on any public cloud IaaS. For most IaaS offerings it is enough to pass the installation command as so-called user data when creating a compute instance. User data consists of commands that are executed when the instance is provisioned. Also, when running Fn in a public cloud, don’t forget to enable access rules for the Fn server that allow port 8080, depending on your requirements either from your own IP only or from all public IP addresses.
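To make the user data idea more concrete, here is a minimal sketch of a helper that prints such a script. It is not the exact command from the blog posting: the function name print_fn_user_data and the log file path are made up for illustration, and it assumes that the image already has Docker and curl installed and that user data runs as root at first boot.

```shell
#!/bin/sh
# Hypothetical helper: print a user-data script to stdout.
# Assumptions: the image already has Docker and curl installed, and user data
# runs as root at first boot. The log file path is made up for illustration.
print_fn_user_data() {
  cat <<'EOF'
#!/bin/sh
# Install the Fn CLI with the install script from the fnproject/cli repository.
curl -LSs https://raw.githubusercontent.com/fnproject/cli/master/install | sh
# Start the Fn server in the background; it listens on port 8080.
nohup fn start > /var/log/fn-server.log 2>&1 &
EOF
}

# Usage: write the script to a file and pass it as user data when creating
# the compute instance (the exact option depends on the cloud provider):
#   print_fn_user_data > user-data.sh
```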

Once Fn server is running on your favourite cloud provider, you could deploy the recommendation engine mock example mentioned in the posting above in two different ways.

Deploy Your Fn Function in the Cloud

# example 1 (for teaching purposes only; in production use the approach below)
# note: run these commands on the cloud instance

$ fn apps create advtravel
$ fn routes create advtravel /fn-recommend DOCKER_ID/recommend:0.0.2

 

Another, probably even more useful way to deploy the function is to set the FN_API_URL environment variable locally, point it to the remote cloud instance, and run the local Fn deploy command against the remote cloud instance.

# example 2 (easier, what you'd do in real life)
$ export FN_API_URL=URLofRemoteCloudInstance
$ fn deploy --app advtravel 

Note that with the commands above you never had to copy the function or the container image over to the cloud instance. When the function is invoked for the first time, Fn will pull the Docker image from the registry, store it locally, and then simply run the function.

Test Fn in the Cloud

Once Fn is running in the cloud and your application is deployed, you can access the application from a local machine using the command line or Postman. The invocation is the same as in the local example, just replace localhost with the public IP address of your cloud instance:

$ curl -X POST --data @testdata/syd.json PUBLIC_IP:8080/r/advtravel/fn-recommend 

Real FaaS?

Obviously, when running Fn Project on IaaS you do not get the true pay-per-invocation benefit that you get from a FaaS offered by the cloud provider as a managed service. You will get automated scalability to some degree, since it is built into the Fn load balancer (Fn LB). In the end, running Fn on IaaS is only serverless from a user perspective.

It will be interesting to see if a cloud platform (most likely Oracle, since Fn Project is driven a lot by Oracle) will provide a proper FaaS service with pay per invocation and automated scalability that is compatible with open source Fn Project.

Fn Cloud Demo

A recorded live demo from Devoxx conference about deploying Fn on IaaS can be seen here.

Get the demo app used in the webcast from here.

What’s hot? Tech Trends That Made a Real Difference in 2017

At Java One 2017 I had the pleasure to be interviewed in a podcast with industry legends such as Chris Richardson, Lucas Jellema and others:

“In order to get a sense of what’s happening on the street, we gathered a group of highly respected software developers, recognized leaders in the community, crammed them into a tiny hotel room in San Francisco (they were in town to present sessions at JavaOne and Oracle OpenWorld), tossed in a couple of microphones, and asked them to talk about the technologies that actually had an impact on their work over the past year. The resulting conversation is lively, wide-ranging, often funny, and insightful from start to finish. Listen for yourself.”

For those who care, this was last year’s interview about “The Role of the Cloud Architect”.

Devoxx 2017 Presentation: Serverless Architectures (AWS Lambda + Fn Project)

Devoxx 2017 in Casablanca has come to an end. I enjoyed this fabulous conference a lot. Highlights were the Serverless presentations, the Istio presentation by Google, the Spring people live coding, and the Kotlin session that showed that my own Kotlin in the cloud HelloWorld was not so bad after all. Tech content is one important part, but the great atmosphere and the focused but relaxed environment with superb lunches were also something to write home about.

And here is my presentation:

The video recording doesn’t seem to be uploaded to YouTube yet. Fun fact: We almost bought a flying carpet after Devoxx in Casablanca!

With the same slideset I gave my Serverless presentation at DOAG 2017 conference. Thanks again folks for the marvellous feedback 🙂

With Kotlin and Spring Boot to the Clouds

0. Overview

This posting is based on a quick customer demo that I did the other week. It demonstrates running Kotlin code in different public clouds. To be more precise: We use Kotlin together with Spring Boot to create a somewhat minimalistic REST-like application that runs on multiple load-balanced instances on AWS Beanstalk or Oracle ACCS.

1. Kotlin

Kotlin is one of the upcoming, trendy JVM-based languages. Its syntax is close to Java, just sometimes more convenient. The language is backed by two industry giants: Pivotal announced Kotlin support for Spring, and Google officially supports Android development with Kotlin.

The Kotlin language follows the Java coding conventions, but makes the semicolon mostly optional (great!) and adds null-pointer safety (you never had a problem with this in Java, did you?) and string interpolation on top of that. Kotlin packages use the same reverse-domain naming as in Java, but in Kotlin package names don’t have to match the folder structure (great!). Interestingly enough, most Kotlin control structures are expressions.

2. Spring Boot

We will use Kotlin together with Spring Boot in this posting.

Project Creation

For the purpose of this demo we will create the smallest possible application that is still helpful to learn about the synergy of Kotlin, Spring Boot and cloud. To get started, we create a Maven project for Spring Boot with the Spring Initializr (start.spring.io).

The project will contain an empty Spring Boot application and the maven pom file amongst others.

Add a Controller

To get some meaningful output we add another Kotlin class: a controller with three functions. Did you know, the “fun” in Kotlin is for function?

  1. the hello(World) function that simply echoes a name.
  2. a function that shows how to access java.util.Date from Kotlin and returns the time.
  3. an info function that shows a bit of load balancing magic when deployed on ACCS.

Build the Project

We can build the project with mvn package from the command-line (or from the IDE, as shown in the web cast).

IDE Support for Kotlin

Being developed by JetBrains, IntelliJ IDEA offers great support for Kotlin. In this webcast I simply use NetBeans.

 

3. Kotlin in the Clouds

This little demo is closer to real life than you might think. I will show you how to run Kotlin in two different clouds. These days I usually see multiple clouds at my customers. Simply choose the one you like most.

3.a ACCS

ACCS is an Oracle Public Cloud PaaS service that provides several language runtimes. ACCS is based on Docker containers, but don’t worry, you won’t see any of them. With ACCS you can easily deploy Java, Java EE, as well as Python, Ruby, Node.js and even Go code. It also supports spinning up multiple instances with load balancing.

Deploying a .jar file on ACCS is as straightforward as it gets:

  1. zip it.
  2. upload it.
  3. run it.

However, there is one more thing: to successfully run the .jar file we need to add a manifest.json file that contains the exact command used to start the jar file in the Java container. It should all be obvious from watching the webcast. Check my other posting to learn about ACCS and Java EE.
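For illustration, a minimal manifest.json could be generated as sketched below. The function name print_accs_manifest is made up, the Java 8 runtime is an assumption, and the field names runtime, majorVersion and command are given as I recall them from the ACCS documentation, so please verify them against the current docs before relying on them.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal manifest.json for the given jar file.
# Assumptions: Java 8 runtime; the field names "runtime", "majorVersion" and
# "command" should be verified against the ACCS documentation.
print_accs_manifest() {
  jar_name="$1"
  cat <<EOF
{
  "runtime": { "majorVersion": "8" },
  "command": "java -jar ${jar_name}"
}
EOF
}

# Usage: create the manifest next to the jar, then zip both for the upload:
#   print_accs_manifest demo.jar > manifest.json
#   zip demo.zip manifest.json demo.jar
```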

3.b Beanstalk

The AWS Beanstalk deployment isn’t more complicated than the ACCS one. Actually, even the manifest.json file can be omitted, because Beanstalk implicitly understands how to run a .jar file (which of course is easy in most cases: java -jar demo.jar). So all you need to do is upload the jar to Beanstalk. However, with Beanstalk we have to make sure to add a SERVER_PORT environment variable and set its value to 5000. This is required because the Beanstalk proxy forwards requests to port 5000 by default, while Spring Boot listens on 8080.

4. Web cast

I created a web cast for you with all the details.

5. Resources

Some additional resources that you might find useful:

  1. Kotlin language
  2. Try Kotlin online
  3. Spring Initializr
  4. Get started with ACCS
  5. Access the demo code from github. TO DO.
  6. AWS Beanstalk

 

 

Java One 2017: Open Source Big Data in the Cloud (Hadoop, Hive, Spark, Kafka)

It’s true. I always said “presenting at Java One is like playing in the Champions League”. Last month I had the great pleasure to present at the Java One 2017 conference in San Francisco together with Edelweiss Kammermann about Open Source Big Data used in the cloud. The presentation included 4 live demos about Apache Hadoop with MapReduce, Apache Hive, Apache Spark and Kafka, all using Oracle Big Data Cloud Service – Compute Edition (aka BDCS-CE) and the Oracle Event Hub Service. The presentation was recorded, so you can enjoy it from anywhere in the world.

For your convenience the slides are available on slideshare:

Oracle CODE San Francisco Review: From Docker Swarm on a Raspi to Oracle Container Cloud Service (OCCS) with Wercker

Last month I presented at the Oracle CODE event in San Francisco. The presentation included almost 30 minutes of live hacking with Docker Swarm on a Raspberry Pi running Hypriot Linux. I was scheduled to speak at 8.30h in the morning and still an amazing number of 90 people showed up – thanks guys! The presentation was recorded – so you can enjoy it from the comfort of your living room.

The presentation is online here:

You can get the slides from slideshare.net: