“Hands On Kubernetes” – first thoughts

So these are my first thoughts on the Hands on Kubernetes book by Nills Franssens et al…

It’s a solid book! I especially enjoyed the first chapter, which sets the stage for the demos / hands-on work in the following chapters. Obviously Azure Kubernetes Service (AKS) has added some new capabilities since the book came out, which I’ll try to explain.

In chapter 2, the book asks you to create a new AKS cluster using the Azure Portal. This could be done with the CLI, ARM, or Terraform of course… The new thing here is the ability to create a cluster with Azure Arc. That allows you to use things like Azure Policy and Azure Monitor to control / observe your containers when you’re running them on-prem (say using VMware vSphere or Azure Stack HCI), or on Google Cloud or AWS. For now though we’ll just create a straight up cluster:

… which, once you’re done with all the options, should take about 5 minutes or so in the West US 3 region. A few notes here: you do NOT want to set up Availability Zones (of course you would do this for a prod workload), and you want a STANDARD setup of 2 nodes (not Dev/Test, which would normally be the best pick), because we’re going to want to experiment with Azure Monitor to check our observability. The AKS pricing tier you want is “free”, and the node count range is a new option: select “2-5”.
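If you’d rather skip the portal, roughly the same setup can be sketched with the Azure CLI. This is a minimal sketch under a few assumptions, not the book’s method: the resource group and cluster names are the ones from the get-credentials command in this post, `westus3` is my assumption for the region name, the monitoring add-on is turned on so Insights has data, and `create_cluster` is just a hypothetical wrapper function so nothing runs until you call it.

```shell
# Names match the az aks get-credentials command used in this post.
RESOURCE_GROUP="rg-handsonaks"
CLUSTER_NAME="handsonaks"
# Assumption: westus3 is the CLI name for the West US 3 region.
LOCATION="westus3"

# Hypothetical helper: creates the resource group, then the cluster.
create_cluster() {
  az group create --name "$RESOURCE_GROUP" --location "$LOCATION" || return 1

  # Start with 2 nodes and let the autoscaler move between 2 and 5.
  # No availability zones are requested, matching the portal choices above.
  az aks create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --location "$LOCATION" \
    --node-count 2 \
    --enable-cluster-autoscaler \
    --min-count 2 \
    --max-count 5 \
    --enable-addons monitoring \
    --generate-ssh-keys
}

# Uncomment to run against your own subscription:
# create_cluster
```

The pricing tier is left at the CLI default here; check `az aks create --help` on your CLI version if you want to set it explicitly.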

A quick snapshot of the bare bones setup we’re using here:

When that AKS cluster is finished being spun up, you’ll see something like the following:

What happened here exactly though? Select “Go to resource” – and you’ll be able to inspect what was created. For example, the Resources section shows any running deployments / pods, and it’s where you create new resources. In the Node pools, you can scale up/down by adding nodes – and add a new node pool, potentially even with a different (beefed up) VM size. In the Cluster config page, you can upgrade the control plane – and then the individual node pools in a followup step. This is also where you’d enable RBAC or integrate with Azure AD.
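The node pool and upgrade steps described above also have CLI equivalents. Here’s a hedged sketch: the pool name `bigpool` and the VM size `Standard_D4s_v3` are illustrative picks of mine, the function names are hypothetical, and the target version and pool name are passed in as arguments because they depend on your cluster.

```shell
RESOURCE_GROUP="rg-handsonaks"
CLUSTER_NAME="handsonaks"

# Add a second node pool with a beefier VM size.
# "bigpool" and Standard_D4s_v3 are illustrative, not from the book.
add_node_pool() {
  az aks nodepool add \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name bigpool \
    --node-count 1 \
    --node-vm-size Standard_D4s_v3
}

# List the Kubernetes versions this cluster can upgrade to.
list_upgrades() {
  az aks get-upgrades \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --output table
}

# Upgrade the control plane first, then one node pool as a followup step.
# $1 = target Kubernetes version, $2 = node pool name.
upgrade_cluster() {
  az aks upgrade \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --kubernetes-version "$1" \
    --control-plane-only \
    --yes || return 1

  az aks nodepool upgrade \
    --resource-group "$RESOURCE_GROUP" \
    --cluster-name "$CLUSTER_NAME" \
    --name "$2" \
    --kubernetes-version "$1"
}
```

You’d run `list_upgrades` first, pick a version from the table, and pass it along with your node pool name to `upgrade_cluster`.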

Insights though is where we can actually view the cluster’s utilization and how it performs under load:

From here it’s a walk in the park. You COULD download the source materials for the book from GitHub. I found that selecting the quick start application gave me a very nice starting point… a simple Voting app that’s easy to create / destroy:

The next option to play with is the last one – “Connect to cluster”. From here you’re given all the information you need to connect via Bash or Azure Cloud Shell to your newly created resources. This is where you can run some of the commands noted in the book as we’re starting to play around with cmd line explorations –

az aks get-credentials --resource-group rg-handsonaks --name handsonaks

kubectl get node

… And that’s it for now. You can easily go into the control panel again and remove the entire resource group to bring yourself back to a clean start state.
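The same cleanup can be done from the command line. A minimal sketch, assuming the resource group name used earlier in this post; `teardown` is a hypothetical wrapper so the delete only happens when you call it.

```shell
RESOURCE_GROUP="rg-handsonaks"

# Hypothetical helper: deletes the whole resource group, cluster included.
teardown() {
  # --yes skips the confirmation prompt.
  # --no-wait returns right away while the delete continues in Azure.
  az group delete --name "$RESOURCE_GROUP" --yes --no-wait
}

# Uncomment when you are sure nothing else lives in this resource group:
# teardown
```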

Closing Thoughts and Next Up

I feel like this is a good point though to talk about the WHY of things… Software development seems to be making these leaps forward about every decade. In the early 2000s, the big change took the form of a pattern and a practice – the pattern being Scrum and Agile, the technology / process taking the form of source control. Skip forward another 10 years, to say about 2014, and the leap forward was DevOps. Again this change became a pattern (Infra as Code), and a practice (CI/CD, and config mgmt)… The upcoming change seems to still be taking shape, but the 2020s definitely seem to be the era when the pattern of microservices is achieving dominance, with the tech behind this being Kubernetes and Docker. The stuff we used to hear about “microservices only being for the large enterprises” or “Kubernetes isn’t meant for production workloads” just doesn’t hold any water; it’s FUD.

Other things I want to play with down the road:

Monitoring What Matters – John-Daniel Trask of Raygun

I had a great talk recently with one of my favorite peeps – John-Daniel Trask, CEO of Raygun. We talk about the importance of monitoring and making that connection with the customer experience, and what he’s seen go right – and wrong – in working with companies large and small. We’re huge fans of Raygun and see this company’s growth as a natural byproduct of producing the right product that reinforces all the behaviors we want out of the DevOps movement. Enjoy!

I always enjoy talking with John-Daniel, and he was a big factor in monitoring and metrics taking up so much room in my book. Here are some of the topics we cover:

  • How to aggregate errors so you don’t feel like you’re putting out a tire fire with a water pistol
  • Best practices around real user monitoring, crash monitoring and APM
  • First things first; why crash reporting should be the first thing you port out
  • How John-Daniel is adjusting to life as a new father (welcome Henry!) and what it was like growing a global business as a young entrepreneur
  • How Google changed the game around how responsive and user-centric websites and services are
  • A fact we often forget: software is written ultimately for humans. “Software gives us the power to amplify human ability.”
  • Another great all-time quote, from his mentor: “It’s not the big that eats the small, it’s the fast that eats the slow.”

One last great quote to end on: “DevOps is about making engineering teams as reliably fast as possible.”

A link to the interview is here – and it’s on the podcast platform of your choice. Apple, Google, Spotify, blah blah…. We’re on all the major platforms now, including Anchor, Apple, Google, Spotify, PocketCasts, and RadioPublic. Please support the podcast, and we’d love to hear your feedback about the book!

Enjoy the podcast!

LaunchDarkly and feature flags

Had a friend ask me for some videos around Feature Flags. There’s no shame in admitting that I’m a huge fan of feature flags; it seems like one of those no-brainers when it comes to making releases faster and safer. Without them, I’m not sure how close we can possibly get to true “continuous delivery” even for smaller sized projects.

As I think some of you might be interested as well, here are some videos and web references below. I hope to expand on this with some more in-depth demos down the road. This is going to come across like I’m shilling for LaunchDarkly. (In all fairness, I’m not the only person at MSFT that loves them.) But when it comes to feature flags I’m not sure if there’s another vendor in that space that offers what they do.

And some more references from my book:

  • [harris] – “Using feature flags in your app release management strategy”, Richard Harris. App Developer Magazine, 4/19/2018. 

Four questions around testing and Microsoft’s progress with DevOps

I was at a conference earlier this week and we got some outstanding questions about how Microsoft went about their transformation – especially with the Azure DevOps team. I want to build on this with a followup post going into more depth on our use of culture and automation – but here’s a good place to start with some great links.

Question #1 – How do we handle planning on a strategic level with the more tactical focus of Agile?

Question #2 – How did Microsoft go about their transformation to DevOps from a shared services model?

Question #3 – What about testing? (This is usually one of our biggest blockers to improve release reliability and velocity – an unreliable, flaky test layer)

Question #4 – Production Support. Let’s say we have an Agile team, 8-12 people. How the heck are we supposed to do global support across multiple regions, 24x7x365 in production?

  • Short answer – the only way this will work is if 1) you make sure you’re only supporting a small sliver of functionality, 2) you gate the support demands on your devs so it’s <50% of their time, i.e. the SRE model (more than likely you’re going to have some operational support, even offshore or 3rd party, handled externally to the team), and 3) alerts are tuned so that only truly important things make it through. I talk about this extensively in my book; the books “The Art of Monitoring” and “Practical Monitoring” also elaborate on this.

Betsy Beyer and Stephen Thorne

Just published a great podcast interview with Betsy Beyer and Stephen Thorne of Google, coauthors of the incredible “Site Reliability Engineering Workbook”. We cover a lot of ground in this interview, including how Google learns from failures, what toil is and how good organizations try to fight it, and when LESS reliability can be a good thing in software development.

Betsy and Stephen’s work around publicizing SRE and breaking down the myths surrounding it has been a huge boon to our industry. Their writing (along with that of a few other Googlers) had a big impact on the Achieving DevOps book; we find their work and thinking incredibly helpful and influential. You’ll love it too!

A link to the interview is here – and it’s on the podcast platform of your choice. Apple, Google, Spotify, blah blah…. We’re on all the major platforms now, including Anchor, Apple, Google, Spotify, PocketCasts, and RadioPublic. Please support the podcast, and we’d love to hear your feedback about the book!

Enjoy the podcast!

Some link goodness: