Trip Report: Intro to Google Cloud Platform

October 26, 2016 · Cloud Platform, Google

Chicago GCP Training

Intro

I recently had the opportunity to attend a day of training on the Google Cloud Platform (GCP) at the Google offices in Chicago. Being an overview, the content didn’t dip too deeply into any one topic, but gave a decent survey of the available services. Here’s a condensed summary of what we learned.

Infrastructure and Application Services

There are three main offerings in this category of services. First of all, there is the straight-up infrastructure of Google Compute Engine, which is simply a virtual machine in the cloud. You can choose from a variety of Linux and Windows operating systems and specify CPU, memory, and disk size. With all Microsoft products such as Windows Server, Google obtains its own licensing and rolls this into the price of the VM. You need not (and cannot) use your own Windows licenses. Google Compute Engine VMs are billed primarily based on the time they are on. As with other cloud providers, you must manage every aspect of these VMs yourself.

At the other end of the spectrum is Google App Engine. With App Engine, you do not configure or manage any aspect of a VM. It’s just your code, running in the cloud. In the training, several possible runtimes including Java, Go, Python, and PHP were named. I know I’ve also seen options for Node.js and Ruby in their portal, and there is official documentation on getting ASP.NET running in GCP as well. I intend to dive deeper into this topic in the near future. App Engine also bundles together several relevant services that most apps will need, such as data storage, caching, and authentication. With App Engine, you are billed based on the time your code spends actually running, in contrast to a VM, which is billed based on uptime regardless of utilization. Furthermore, App Engine frees you from maintaining any VMs and scales automatically.
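To make the billing difference concrete, here is a rough sketch of the arithmetic. The rate is a made-up number (not a real GCP price), and using the same rate for both models is a simplifying assumption; the only point is that uptime billing charges for idle hours while usage billing does not.

```python
def vm_cost(hours_powered_on: int, cents_per_hour: int) -> int:
    """Uptime billing: pay for every hour the VM is on, busy or idle."""
    return hours_powered_on * cents_per_hour


def usage_cost(hours_running_code: int, cents_per_hour: int) -> int:
    """Usage billing: pay only for the hours code is actually running."""
    return hours_running_code * cents_per_hour


HOURS_IN_MONTH = 30 * 24   # a 30-day month
RATE_CENTS_PER_HOUR = 5    # hypothetical rate, NOT a real GCP price

# An app that is only busy 10% of the time.
busy_hours = HOURS_IN_MONTH // 10

always_on_vm = vm_cost(HOURS_IN_MONTH, RATE_CENTS_PER_HOUR)
pay_per_use = usage_cost(busy_hours, RATE_CENTS_PER_HOUR)

print(f"VM left on all month: {always_on_vm / 100:.2f} dollars")
print(f"Billed only while running: {pay_per_use / 100:.2f} dollars")
```

Real pricing has more moving parts (instance classes, free quotas, discounts), so treat this as an illustration of the model rather than an estimate.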

In the middle ground between these two offerings is Google Container Engine. Our training glossed over this service, so I don’t have much to say about it right now except that it provides managed Kubernetes clusters in the cloud (Kubernetes being an open-source container manager). It is capable of scaling automatically based on resource utilization.

Storage Services

As with other cloud providers, GCP offers multiple options for storing data persistently. The first, most basic option is Cloud Storage, which is best likened to a file system, but in the cloud, so you get options for things like geo-redundancy and caching.

Next, GCP offers a relational database service, which is based on MySQL. Personally, I wish it were based on PostgreSQL because of PostgreSQL’s stricter enforcement of data integrity, but given the choice between managed MySQL and DIY PostgreSQL, it might be tempting to take the easy path. Do you think that relational databases are old, stuffy, and not web scale enough? GCP also offers a NoSQL document database that lets you just throw schemaless documents into a store and let the application worry about handling the internals. Our training suggested that this should be the default first choice for any new web application, but I disagree. I would really only recommend a document database when relationships between entities are purely hierarchical. For anything else, you probably want a relational database.
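Here is a small sketch of what I mean by hierarchical versus relational data. It uses a plain Python dict as a stand-in for a document and the standard library’s SQLite as a stand-in for a relational database, so none of this is GCP-specific; the table and field names are invented for illustration.

```python
import sqlite3

# Document style: a post owns its comments outright (purely hierarchical),
# so the whole thing nests naturally in one schemaless document.
post_document = {
    "title": "Trip Report",
    "comments": [
        {"author": "ann", "text": "Nice summary"},
        {"author": "bob", "text": "Thanks"},
    ],
}

# Relational style: authors exist independently of posts and are shared
# between them, so they get their own table and are linked by a key.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
    """
)
conn.execute("INSERT INTO authors VALUES (1, 'ann')")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(1, "Trip Report", 1), (2, "Follow-up", 1)],
)

# A join answers a cross-entity question the nested document cannot.
rows = conn.execute(
    "SELECT posts.title FROM posts "
    "JOIN authors ON authors.id = posts.author_id "
    "WHERE authors.name = ? ORDER BY posts.id",
    ("ann",),
).fetchall()
titles = [row[0] for row in rows]

# The database itself refuses a post that points at a missing author.
try:
    conn.execute("INSERT INTO posts VALUES (3, 'Orphan', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(titles, rejected)
```

With the document approach, that last integrity check would be the application’s job rather than the database’s.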

Finally, when you have a lot of data, measured in gigabytes or more, relational databases start to break down and you want a different kind of database: a data warehouse or columnar database. For this kind of data, you need a Big Data solution, and GCP provides that with Bigtable and BigQuery. It can be difficult to understand the differences between Bigtable and BigQuery, but for me the important distinctions are in terms of mutability and queryability. Bigtable is mutable but not queryable. BigQuery is immutable but queryable with SQL that follows the ANSI SQL 2011 standard.

Data Motion

There are two interesting offerings here. The first one is Dataflow, which offers a programming model for performing real-time analysis on a stream of data, currently only fully supported in Java, with Python in beta. The second offering is Pub/Sub, which is exactly what it sounds like: a managed publish/subscribe broker. There are two common use cases for the pub/sub model that I find very useful: broadcasting and message queues. In the case of broadcasting, you want to be able to publish a message and have all subscribers receive the message, like a chat room. Conversely, in the case of a message queue, you want to publish some kind of actionable request and have many workers monitoring the queue, but have the next available worker consume it from the queue for processing, rather than having all of them receive it. Both of these scenarios are supported by GCP Pub/Sub.
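In Pub/Sub, the difference between the two scenarios comes down to how subscriptions are arranged: every subscription on a topic gets its own copy of each message, while several workers pulling from one shared subscription split the messages between them. The following is a toy in-memory model of that idea. It is not the Pub/Sub client library, the `Topic` and `Subscription` classes are invented for illustration, and it ignores acknowledgements and redelivery.

```python
from collections import deque


class Subscription:
    """A named mailbox that holds messages until someone pulls them."""

    def __init__(self, name):
        self.name = name
        self._queue = deque()

    def deliver(self, message):
        self._queue.append(message)

    def pull(self):
        """Give the next message to whoever asks first; None if empty."""
        return self._queue.popleft() if self._queue else None


class Topic:
    """Copies every published message into each attached subscription."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, name):
        subscription = Subscription(name)
        self._subscriptions.append(subscription)
        return subscription

    def publish(self, message):
        for subscription in self._subscriptions:
            subscription.deliver(message)


# Broadcast: one subscription per listener, so everyone sees the message.
chat = Topic()
alice = chat.subscribe("alice")
bob = chat.subscribe("bob")
chat.publish("hello")
broadcast_result = (alice.pull(), bob.pull())

# Message queue: workers share one subscription, so each job is taken once.
jobs = Topic()
shared = jobs.subscribe("workers")
jobs.publish("job-1")
jobs.publish("job-2")
worker_a_got = shared.pull()
worker_b_got = shared.pull()
nothing_left = shared.pull()

print(broadcast_result, worker_a_got, worker_b_got, nothing_left)
```

The real service adds acknowledgements and will redeliver unacknowledged messages, so workers should be prepared to see the same job more than once.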

Machine Learning

GCP offers multiple Machine Learning services. There is the fundamental Machine Learning service, which is programmable with TensorFlow, an open-source machine learning library created by Google, and then there are several services which represent the result of using Machine Learning to perform certain universally useful tasks. While they are the result of Machine Learning, using them really boils down to simply consuming an API. Examples of this include the Vision API, which recognizes objects, faces, and words in pictures; the Speech API, which is capable of transcribing audio to text; and the Translate API, which can not only translate from one language to another, but can try to guess the source language without being told.
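To give a feel for what “simply consuming an API” looks like, here is a sketch that builds the JSON body for a Vision API label-detection request. It only constructs the payload and does not send anything; authentication (an API key or OAuth token) is deliberately left out, the helper name `build_label_request` is my own, and the image bytes are a placeholder rather than a real picture.

```python
import base64
import json

# REST endpoint for the Vision API's annotate call.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes: bytes, max_results: int = 5) -> str:
    """Return the JSON body asking the Vision API to label one image."""
    body = {
        "requests": [
            {
                # Image content travels as base64-encoded text.
                "image": {
                    "content": base64.b64encode(image_bytes).decode("ascii")
                },
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }
    return json.dumps(body)


# Placeholder bytes; in practice you would read an actual image file.
payload = build_label_request(b"placeholder image bytes", max_results=3)
print(f"POST {VISION_ENDPOINT}")
print(payload)
```

Sending it is a plain HTTPS POST with credentials attached, and the response comes back as JSON too. No model training is involved on your side.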

Wrap-up

This concludes the whirlwind tour we took through the Google Cloud Platform earlier this week. One remaining thing I wanted to comment on is the management portal and tooling. I’ve worked with Azure, AWS, and now GCP, and it’s interesting to compare not only their services but also their portals and other tooling. The new Azure portal stands at one extreme: it’s a Single-Page Application with heavy client-side code. At the other end of the spectrum, the AWS portal is relatively spartan static webpages, with a little bit of AJAX thrown in here and there. The GCP portal feels like it’s somewhere in between, but closer to the Azure portal than AWS. My initial impression is that the GCP portal feels more approachable and snappier than the Azure portal.

Another impressive thing is how Google has gone all out to make their tooling accessible directly within the browser. For example, you can launch a terminal emulator directly within the browser and use the GCP CLI to administer your projects, and you can also use this browser-based terminal emulator to SSH into your cloud VMs, all without needing to open your local terminal. While not strictly necessary, it feels considerate towards you as a developer, compared to the AWS experience of clicking to open a dialog, selecting and copying the EC2 instance IP address, and then pasting it into a local terminal. Overall, my initial impression of the portal and other tooling in terms of ergonomics is very positive.

In the near future, I intend to delve deeper into deploying some .NET-based projects to the Google Cloud and report my experiences. Stay tuned!
