<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Posts on danielfm.me</title>
    <link>https://danielfm.me/posts/</link>
    <description>Recent content in Posts on danielfm.me</description>
    <generator>Hugo -- 0.154.5</generator>
    <language>en</language>
    <lastBuildDate>Sat, 24 Jan 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://danielfm.me/posts/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Handling Secrets in the Terminal</title>
      <link>https://danielfm.me/posts/handling-secrets-in-the-terminal/</link>
      <pubDate>Sat, 24 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/handling-secrets-in-the-terminal/</guid>
      <description>Suggestions on how to (somewhat) securely handle secrets in the terminal.</description>
      <content:encoded><![CDATA[<p>We have to be ever more careful with the stuff we store on our disks.</p>
<p>In 2025 alone, there were at least half a dozen campaigns exploiting various pieces of the modern software supply chain to exfiltrate sensitive data at scale. Some of these attacks even leveraged AI agents installed on people&rsquo;s machines to help locate and extract sensitive information such as tokens, crypto wallets and service account keys.</p>
<p>Even if you are careful with these things, it&rsquo;s possible &ndash; likely, even &ndash; that you have at least a couple of tokens or secrets lying around in plain text on your system.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>Unfortunately, many CLI tools still rely on &ndash; or at least encourage in &ldquo;Getting Started&rdquo;-style guides &ndash; insecure practices, such as storing tokens and secrets in plain text configuration files and passing them as command line arguments.</p>
<h2 id="is-the-os-keyring-a-viable-option">Is the OS Keyring a Viable Option?</h2>
<p>If you use Linux with a fully-featured Desktop Environment such as <a href="https://www.gnome.org/">GNOME</a>, chances are you already have a Secret Service implementation installed, which you can interact with via the <code>secret-tool</code> utility:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Storing a secret in the system keyring:</span>
</span></span><span class="line"><span class="cl">secret-tool store --label<span class="o">=</span><span class="s2">&#34;Secret Description&#34;</span> <span class="o">[[</span>attribute<span class="o">]</span> <span class="o">[</span>value<span class="o">]</span> ...<span class="o">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Retrieving the secret:</span>
</span></span><span class="line"><span class="cl">secret-tool lookup <span class="o">[</span>attribute<span class="o">]</span> <span class="o">[</span>value<span class="o">]</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Deleting the secret:</span>
</span></span><span class="line"><span class="cl">secret-tool clear <span class="o">[</span>attribute<span class="o">]</span> <span class="o">[</span>value<span class="o">]</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>While this can fulfill some basic use cases, the data persistence options for the OS keyring are limited, so you should only use it for non-critical, short-lived secrets.</p>
<p>Keyrings also usually stay unsealed for the duration of the user session, making them unsuitable for protecting more sensitive information.</p>
<h2 id="password-managers">Password Managers</h2>
<p>For critical and/or long-lived secrets, I recommend using a proper password manager.</p>
<p>Solutions such as <a href="https://1password.com/">1Password</a> or <a href="https://bitwarden.com/">Bitwarden</a> would work fine (they both provide CLI tools for interacting with their vaults), but I&rsquo;ve grown to prefer more lightweight options such as <a href="https://www.passwordstore.org/">GNU Pass</a>, at least for tokens and secrets that are mainly accessed via the terminal.</p>
<h2 id="dealing-with-environment-variables">Dealing With Environment Variables</h2>
<p>Some people just <code>export</code> their tokens in their <code>~/.bashrc</code> and get on with it. This approach is not recommended for sensitive information, as these variables are exposed indiscriminately to all processes in your user session.</p>
<p>The same applies to other secrets stored in plain text files on your hard drive. Any process you start could scan your files, discover sensitive information, and exfiltrate it. This is an increasing concern these days with the advent of AI tools that still lack proper sandboxing and security controls.</p>
<h3 id="the-gopass-way">The GoPass Way</h3>
<p><a href="https://www.gopass.pw/">GoPass</a> is a <a href="https://www.passwordstore.org/">GNU Pass</a> reimplementation in Go with a few extra goodies.</p>
<p>One of those goodies is the <code>gopass env</code> subcommand. Suppose you have the following secrets:</p>
<table>
  <thead>
      <tr>
          <th>Secret path</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>keys/aws/AWS_ACCESS_KEY_ID</code></td>
          <td><code>AKIA..</code></td>
      </tr>
      <tr>
          <td><code>keys/aws/AWS_SECRET_ACCESS_KEY</code></td>
          <td><code>...</code></td>
      </tr>
  </tbody>
</table>
<p>You can run a command with these vars set with <code>gopass env keys/aws -- aws s3 ls</code>.</p>
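<p>To make what happens under the hood concrete, here&rsquo;s a rough bash sketch of the same behavior. Note that <code>decrypt_prefix</code> is a made-up stand-in for the decrypted store &ndash; in reality the values would come out of GoPass, never hardcoded like this:</p>

```shell
#!/usr/bin/env bash

# Stand-in for decrypting all secrets under a prefix; a real
# implementation would call the password store instead.
decrypt_prefix() {
  printf 'AWS_ACCESS_KEY_ID=AKIAEXAMPLE\nAWS_SECRET_ACCESS_KEY=example\n'
}

# Run a command with those KEY=VALUE pairs set only for the child
# process, never exported to the current shell. (Naive word splitting;
# fine for this sketch, but values with spaces would break it.)
env $(decrypt_prefix) sh -c 'echo "key: $AWS_ACCESS_KEY_ID"'
```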
<p>This feature alone greatly reduces the need for secrets stored in plain-text configuration files, since most tools allow you to pass credentials via environment variables as an alternative.</p>
<p>However, this implementation doesn&rsquo;t cover all my needs for consuming secrets in the terminal. For instance, it&rsquo;s not possible to run a command with secrets from multiple different prefixes (say, a GitHub token + AWS credentials) unless you lay out your passwords in a specific way, possibly by duplicating or symlinking files around.</p>
<h3 id="my-new-way-with-gnu-pass">My New Way (With GNU Pass)</h3>
<p>While reading the GNU Pass home page recently, I came across the <a href="https://www.passwordstore.org/#extensions">Extensions for pass</a> section. I&rsquo;ve used GNU Pass for many years and didn&rsquo;t know it could be extended like that!</p>
<p>So I decided to ask my new best friend Claude Code to come up with an extension that worked just like the <code>gopass env</code> subcommand, but with more options for composing secrets.</p>
<p>The full, up-to-date version of this script can be found <a href="https://codeberg.org/danielfm/ansible-collections-workstation">here</a>, but the API looks like this:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="cp">#!/usr/bin/env bash
</span></span></span><span class="line"><span class="cl">cmd_env_usage<span class="o">()</span> <span class="o">{</span>
</span></span><span class="line"><span class="cl">    cat <span class="s">&lt;&lt;-_EOF
</span></span></span><span class="line"><span class="cl"><span class="s">Usage:
</span></span></span><span class="line"><span class="cl"><span class="s">    $PROGRAM env secret-path [secret-path2 ...] -- [command] [args...]
</span></span></span><span class="line"><span class="cl"><span class="s">        Decrypt the secrets at the specified paths, export their contents as
</span></span></span><span class="line"><span class="cl"><span class="s">        environment variables, and execute the command + args with those variables set.
</span></span></span><span class="line"><span class="cl"><span class="s">        If multiple secrets define the same variable, later secrets take precedence.
</span></span></span><span class="line"><span class="cl"><span class="s">
</span></span></span><span class="line"><span class="cl"><span class="s">        Secret paths can be either:
</span></span></span><span class="line"><span class="cl"><span class="s">        - Individual secret files (containing KEY=VALUE pairs)
</span></span></span><span class="line"><span class="cl"><span class="s">        - Directories ending with &#39;/&#39; (will recursively decrypt all secrets within,
</span></span></span><span class="line"><span class="cl"><span class="s">          using the secret filename as the variable name)
</span></span></span><span class="line"><span class="cl"><span class="s">
</span></span></span><span class="line"><span class="cl"><span class="s">Examples:
</span></span></span><span class="line"><span class="cl"><span class="s">    $PROGRAM env /env/infra-keys -- terraform init
</span></span></span><span class="line"><span class="cl"><span class="s">    $PROGRAM env /env/common /env/production -- aws s3 ls
</span></span></span><span class="line"><span class="cl"><span class="s">    $PROGRAM env secret/app/ -- node server.js
</span></span></span><span class="line"><span class="cl"><span class="s">    $PROGRAM env secret/app/ /env/common -- ./run.sh
</span></span></span><span class="line"><span class="cl"><span class="s">_EOF</span>
</span></span><span class="line"><span class="cl"><span class="o">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>You can use this by placing the file in <code>/usr/lib/password-store/extensions</code>, or wherever that directory lives in your distro.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<h3 id="security-considerations">Security Considerations</h3>
<p>This approach may have some security considerations of its own. In other words, do not blindly follow me; I may not know what I&rsquo;m doing. &#x1f604;</p>
<p>A couple of things I can think of:</p>
<ol>
<li>As with any solution based on environment variables, the env vars for the launched processes can still be accessed via <code>/proc/&lt;pid&gt;/environ</code> and the <code>ps e</code> command.</li>
<li>Exported variables are available to child processes, not just the original process launched by the script.</li>
</ol>
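<p>Point 1 is easy to verify yourself. The snippet below (Linux-only) launches a long-running child with a dummy secret in its environment and reads it straight back out of <code>/proc</code>:</p>

```shell
#!/usr/bin/env bash

# Start a child process with a (dummy) secret in its environment:
MY_SECRET=hunter2 sleep 30 &
pid=$!
sleep 0.2  # give the child a moment to start

# Any process running as the same user can now read it back; the
# environ file is NUL-separated, hence the tr:
tr '\0' '\n' < "/proc/$pid/environ" | grep '^MY_SECRET='

kill "$pid"
```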
<h2 id="alternative-approaches">Alternative Approaches</h2>
<p><a href="https://linus.schreibt.jetzt/posts/shell-secrets.html">Linus Heckemann&rsquo;s blog</a> describes an alternative, more secure approach that avoids environment variables entirely, which of course may be more or less applicable depending on your workflow and the tools involved.</p>
<p>One particularly nice trick is the use of the <a href="https://tldp.org/LDP/abs/html/process-sub.html">process substitution</a> capabilities supported by many modern shells, including <code>bash</code>, which suits tools like <code>curl</code> and <code>vault</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Fetch my-secret from GNU Pass and store it into Hashicorp Vault:</span>
</span></span><span class="line"><span class="cl">vault write secret/my-secret <span class="nv">value</span><span class="o">=</span>@&lt;<span class="o">(</span>pass my-secret<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Fetch key from GNU Pass and use it as a Bearer token in a POST request:</span>
</span></span><span class="line"><span class="cl">curl -X POST -H @&lt;<span class="o">(</span><span class="nb">echo</span> <span class="s2">&#34;Authorization: Bearer </span><span class="k">$(</span>pass api-keys/my-app<span class="k">)</span><span class="s2">&#34;</span><span class="o">)</span> <span class="se">\
</span></span></span><span class="line"><span class="cl">    https://api.my-app.xyz/data
</span></span></code></pre></td></tr></table>
</div>
</div><p>The secrets extracted from GNU Pass in the previous commands would not show up in the usual places, such as <code>/proc/&lt;PID&gt;/environ</code> and <code>/proc/&lt;PID&gt;/cmdline</code>, making this a very strong option for more sensitive tasks.</p>
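<p>The trick works because the consuming command only ever sees a file descriptor path like <code>/dev/fd/63</code>, not the secret itself. A quick self-contained demonstration, with a shell function standing in for <code>pass</code>:</p>

```shell
#!/usr/bin/env bash

# Stand-in for `pass my-secret`:
fake_pass() { printf 's3cret\n'; }

# The secret travels through an anonymous pipe and never hits the
# command line or the environment:
cat <(fake_pass)    # prints the secret, read from the pipe

# The argument the command actually receives is just a path:
echo <(fake_pass)   # prints something like /dev/fd/63
```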
<p>Of course, you can also leverage <a href="https://www.digitalocean.com/community/tutorials/an-introduction-to-useful-bash-aliases-and-functions">Bash Aliases or Functions</a> to abstract away the details of how secrets are injected into your most frequent commands.</p>
<h2 id="ssh-keys">SSH Keys</h2>
<p>Another common type of secret lying around is SSH keys.</p>
<p>My recommendation is to use a hardware token such as the <a href="https://www.yubico.com/">YubiKey</a> for this: the private key material never leaves the token, so you no longer need to keep SSH keys in your filesystem at all. You can also configure the token to require &ldquo;touch&rdquo; confirmation for different operations, such as signing, encryption, or authentication, depending on your security and convenience requirements.</p>
<p>After configuring <code>gpg-agent</code> to also act as the SSH agent, you could use <code>ssh-add -L</code> to display the public key (which you can then add to <code>~/.ssh/authorized_keys</code> on your servers), and <code>ssh</code> should forward authentication requests to the GPG agent.</p>
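<p>For reference, the setup looks roughly like this (a sketch assuming GnuPG 2.1+; file locations and option names may differ on your distro):</p>

```bash
# ~/.gnupg/gpg-agent.conf
enable-ssh-support

# ~/.bashrc (or equivalent): point SSH at the GPG agent's socket
export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"
gpgconf --launch gpg-agent
```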
<p>If you choose this path, I&rsquo;d strongly recommend checking out <a href="https://github.com/drduh/YubiKey-Guide">drduh&rsquo;s Yubikey Guide</a>.</p>
<p>Another option is to leverage your password manager&rsquo;s own features. Bitwarden (<a href="https://bitwarden.com/help/ssh-agent/">here</a>) and 1Password (<a href="https://developer.1password.com/docs/ssh/agent/">here</a>), for instance, allow you to store your public and private keys in the password manager itself and then access them via custom SSH agent implementations that you can plug into your system.</p>
<h2 id="closing-note">Closing Note</h2>
<p>The idea for this post is just to show you how I do it. It may or may not work for you, and it&rsquo;s definitely not the only way to do it, so feel free to experiment and choose whatever works best for you considering your security requirements!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>You can use tools like <a href="https://github.com/trufflesecurity/trufflehog">TruffleHog</a> for scanning your files.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>While possible, I advise against storing GNU Pass extensions in your <code>$HOME</code>, because a malicious tool could inject undesired behavior into your extensions right after your secrets are decrypted.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>So, I Asked Claude Code to Fix my Pi</title>
      <link>https://danielfm.me/posts/claude-code-code-fixed-my-pi/</link>
      <pubDate>Sun, 18 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/claude-code-code-fixed-my-pi/</guid>
      <description>How I used Claude Code to vibe-fix my broken Raspbian installation for streaming media.</description>
      <content:encoded><![CDATA[<p>I once owned an <a href="https://www.nvidia.com/en-us/shield/">Nvidia Shield</a> that I used for running <a href="https://kodi.tv/">Kodi</a> to play back media files across the local network. That device was a beast: it handled 4K media without breaking a sweat, had nice upscaling features, and worked with all the Dolby tech that I still don&rsquo;t know the purpose of.</p>
<p>Recently, my house was hit by lightning, which fried the Shield, and all of a sudden I had no way of streaming my stuff.</p>
<p>At some point I may buy a new box, or try some new thing for the same purpose, but I don&rsquo;t want to spend money right now, especially because these boxes are very expensive here in Brazil due to high shipping costs and taxes.</p>
<p>Like many of you, I have a few Raspberry Pi boards sitting idle in my junk drawer. The last one I bought, many years ago, is a <a href="https://www.raspberrypi.com/products/raspberry-pi-3-model-b/">Raspberry Pi 3 Model B</a>, which I once used as a cron job runner (i.e. for running <a href="https://codeberg.org/danielfm/ansible-collections-workstation">Ansible playbooks</a>), and I decided to use it to replace the Shield. It can&rsquo;t stream 4K media or hardware-decode HEVC and other more modern formats, but it&rsquo;s better than nothing.</p>
<h2 id="was-it-still-working">Was it Still Working?</h2>
<p>After plugging it in, it booted, but it was running an old <a href="https://www.raspberrypi.com/software/operating-systems/">Raspbian</a> version (based on Buster). So I manually upgraded the system through Bullseye and Bookworm, and finally to Trixie.</p>
<p>I should have installed a fresh image, but I lost my SD card adapter, so I wasn&rsquo;t able to plug the SD card into my computer for a clean install.</p>
<p>Everything went seemingly okay, but after booting into Trixie, I got an error regarding a missing signing key. For the most part, everything seemed to work despite the error, but when I tried to install Kodi with <code>sudo apt install kodi</code>, the installation failed due to unmet dependencies, so I had to fix it if I wanted to watch my stuff.</p>
<p>I tried Googling it, but the fix wasn&rsquo;t easy to find (which key is missing? Where do I find it? How do I configure it?), so I decided to fire up Claude Code and let it fix things for me. I provided it with my SSH key and the SSH command to sign into the Raspberry Pi, and asked it to fix my shit.</p>
<h2 id="the-experiment">The Experiment</h2>
<p>First, I asked Claude Code to resolve the signing key issue in <code>apt</code> and install Kodi, which it managed to do in a single shot in a couple of minutes:</p>
<blockquote>
<p>I have a raspberry pi hosted in my local network, at the address pi.local. You can ssh into it with the following command:</p>
<p><code>ssh pi@pi.local</code></p>
<p>This pi was updated from Debian 11 to 12, and then from 12 to 13, but I think something went wrong with the the last upgrade, as commands such as <code>sudo apt update</code> show missing key errors. Not sure how to fix those.</p>
<p>What I ultimately want is to install Kodi, but the command <code>sudo apt install kodi</code> fails due to unmet dependencies.</p>
<p>I want you to fix this installation so we could install kodi.</p>
</blockquote>
<p>Then, I asked it to auto-start Kodi when the Pi boots, which it, again, implemented correctly on the first try. It even rebooted the Pi and checked the systemd service status to ensure the service started after boot. Neat.</p>
<blockquote>
<p>Now update the pi installation so that kodi starts up automatically once the system is powered up.</p>
</blockquote>
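<p>For the curious, the usual way to do this is a systemd unit along these lines (a sketch of the general approach, not necessarily the exact unit Claude generated):</p>

```ini
# /etc/systemd/system/kodi.service
[Unit]
Description=Kodi Media Center
After=network-online.target

[Service]
User=pi
ExecStart=/usr/bin/kodi-standalone
Restart=on-abort

[Install]
WantedBy=multi-user.target
```

<p>Enabled with <code>sudo systemctl enable --now kodi</code>.</p>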
<p>After Kodi started, the resolution was all messed up instead of the expected Full HD. Again, it made a few changes to <code>/boot/config.txt</code>, and after a few retries, the issue was resolved.</p>
<blockquote>
<p>The next issue is that I cannot seem to configure kodi to run at full HD resolution (1920x1080); the maximum I could get was to 1024x768. How can I configure kodi to run at the maximum output resolution?</p>
</blockquote>
<p>This is what it added (don&rsquo;t ask what those mean, after all I was just vibin&rsquo;):</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-ini" data-lang="ini"><span class="line"><span class="cl"><span class="c1"># Force HDMI output at 1080p</span>
</span></span><span class="line"><span class="cl"><span class="na">hdmi_force_hotplug</span><span class="o">=</span><span class="s">1</span>
</span></span><span class="line"><span class="cl"><span class="na">hdmi_group</span><span class="o">=</span><span class="s">1</span>
</span></span><span class="line"><span class="cl"><span class="na">hdmi_mode</span><span class="o">=</span><span class="s">31</span>
</span></span><span class="line"><span class="cl"><span class="na">hdmi_drive</span><span class="o">=</span><span class="s">2</span>
</span></span><span class="line"><span class="cl"><span class="na">disable_overscan</span><span class="o">=</span><span class="s">1</span>
</span></span><span class="line"><span class="cl"><span class="na">config_hdmi_boost</span><span class="o">=</span><span class="s">4</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Then I noticed that the <code>dphys-swapfile.service</code> was failing, so I asked Claude to fix that too; it gladly reconfigured the swapfile, restarted the service, and everything was good.</p>
<blockquote>
<p>Okay, got image back. Let&rsquo;s leave it like this for now.</p>
<p>Now I found one of the systemd services to report as failed during boot, the dphys-swapfile.service. Can you fix it as well?</p>
</blockquote>
<p>Next, the time zone was incorrect. How do you configure that, anyway? I didn&rsquo;t remember, but apparently Claude did: it checked that the NTP daemon was running, asked which time zone to set, and configured it once I provided the information it needed.</p>
<blockquote>
<p>The clock is incorrect on my raspberry pi. Configure it to fetch correct date and time from the internet.</p>
</blockquote>
<h2 id="what-didnt-work">What Didn&rsquo;t Work</h2>
<p>I could not use the TV remote control for navigating through the Kodi menus. I remember that it worked at some point, so I asked Claude Code to fix this as well.</p>
<p>Man, this is where things got crazy. It went into a reboot loop, changing random stuff in <code>/boot/config.txt</code> to see if it would work, and eventually messed up all the things it had fixed previously. Sometimes it booted to a black screen, sometimes with the wrong resolution, sometimes the sound stopped working.</p>
<p>I eventually asked it to stop and revert to the previous working configuration, which, to my surprise, it did, so I didn&rsquo;t have to start from scratch.</p>
<p>For now, I ended up using an iPhone app as the remote control. I may try to fix this at some point.</p>
<h2 id="wrapping-up">Wrapping Up</h2>
<p>I wouldn&rsquo;t recommend vibe-patching servers like this without a proper backup first, unless you are just doing it for fun like me.</p>
<p>Or if you just don&rsquo;t care.</p>
]]></content:encoded>
    </item>
    <item>
      <title>OPA Policies Without Breaking the Bank</title>
      <link>https://danielfm.me/posts/opa-policies-without-breaking-the-bank/</link>
      <pubDate>Sun, 11 Jan 2026 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/opa-policies-without-breaking-the-bank/</guid>
      <description>Using OPA and Atlantis to enforce Terraform best practices on a budget.</description>
      <content:encoded><![CDATA[<blockquote>
<p><em>The widely accepted &ldquo;healthy&rdquo; ratio, particularly at companies with mature SRE practices like Google, is
one SRE for every 10 Software Engineers (SWEs) (a 1:10 ratio). However, this ratio can vary significantly based on an organization&rsquo;s specific needs, size, and the maturity of its automation and tooling.</em></p>
<p>&ndash; Gemini</p>
</blockquote>
<p>In my 10+ years working as an SRE, most of them in short-staffed squads at late-stage start-ups, the SRE:SWE ratio was <strong>nowhere near</strong> the 1:10 &ldquo;gold standard&rdquo;. In my experience, 1:30&ndash;1:40 is a more accurate ratio, and in this setting, saying things can get pretty busy is a bit of an understatement.</p>
<p>Working in highly constrained environments such as these does not come without its challenges, both cultural and technical. For every task you choose to work on, there are like ten others you need to put on hold, all while doing your best <em>not</em> to become a bottleneck for the entire organization.</p>
<p>But this is not the only constraint I&rsquo;m used to. The other one is <em>cost efficiency</em>; here, offloading everything to expensive SaaS offerings is not an option, so we usually resort to self-hosting, tuning and stitching open source tools for our needs. Deploying a new tool is also something we do not take lightly, as each new tool could drag the team down into an operational hell, so we tend to bet on boring/flexible tools as opposed to over-specialized ones.</p>
<h2 id="the-problem">The Problem</h2>
<p>While working on improving the operational and security maturity for the organization, we wanted to review and reduce direct permissions to the production environment, and one possibility was to use Terraform, which the SRE team already used and was familiar with. But how could we provide other teams the same tools we used<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> in a way that was safe, within the established best practices, and most importantly, did not require the SRE team to review each and every Terraform PR?</p>
<p>Here&rsquo;s a non-exhaustive list of things we were after:</p>
<ul>
<li>Enforcing consistent naming conventions<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
<li>Ensuring resources had the correct labels for proper ownership and cost tracking</li>
<li>Blocking dangerous operations (i.e. database/bucket deletion)</li>
<li>Ensuring durability best practices (i.e. enforcing backups for production databases)</li>
<li>Forcing security best practices (i.e. requiring TLS for DB instances)</li>
</ul>
<p>At the same time, we didn&rsquo;t want to <em>always</em> block things that went off the standard path. Some of these requirements were created after critical infrastructure was already in place, so those existing resources needed to be supported.</p>
<p>Another reason is that a team might sometimes have a good reason to create a publicly exposed bucket, disable authentication for a Redis instance, and so on, so a workflow for approving these exceptions was also needed.</p>
<blockquote>
<p>The simplest (and cheapest) solution at the time was to codify these practices as OPA policies and <a href="https://www.runatlantis.io/docs/policy-checking">enable conftest</a> in our Atlantis pipelines. The workflow is not perfect, but it gets the job done.</p>
</blockquote>
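<p>Wiring this up is mostly configuration. The server-side repo config looks roughly like this (a sketch based on the Atlantis policy checking docs; the policy set name, user, and paths are illustrative):</p>

```yaml
# Atlantis server-side repos.yaml (sketch)
repos:
  - id: /.*/
    policy_check: true

policies:
  # Users allowed to approve a failed policy check (break glass):
  owners:
    users:
      - some-sre-team-member
  policy_sets:
    - name: terraform-best-practices
      path: /home/atlantis/policies
      source: local
```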
<p>With these policies in place, teams could own their Terraform workflows and apply changes to production without waiting on the SRE team for changes considered harmless, greatly reducing the PR review load on the team.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>Other options we considered:</p>
<ul>
<li><a href="https://docs.cloud.google.com/resource-manager/docs/organization-policy/overview">GCP Organization Policies</a>: We use Google Cloud for most things, but use specific services from other public clouds as well (AWS, Azure), so a cloud-specific solution was not an option; higher than desired friction in case of policy violations / break glass scenarios.</li>
<li><a href="https://www.hashicorp.com/en/sentinel">Terraform Cloud + Sentinel</a>: Recent history of hostile decisions against the community; pricing model based on resources under management.</li>
<li><a href="https://spacelift.io">Spacelift</a> and <a href="https://envzero.com">Env0</a>: Too expensive for the number of users and features (single sign-on, RBAC, etc) we required.</li>
</ul>
<h2 id="a-simple-example">A Simple Example</h2>
<p>We had great success in using GenAI for generating both policies and tests for various cases, but to get you started, here&rsquo;s a policy I got from our library:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego"><span class="line"><span class="cl"><span class="kd">package</span><span class="w"> </span><span class="nx">gcp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">import</span><span class="w"> </span><span class="nx">rego</span><span class="o">.</span><span class="nx">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">import</span><span class="w"> </span><span class="nx">data</span><span class="o">.</span><span class="nx">utils</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">delete_buckets_with_force_destroy</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="p">[</span><span class="nx">res</span><span class="w"> </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="kd">some</span><span class="w"> </span><span class="nx">res</span><span class="w"> </span><span class="kd">in</span><span class="w"> </span><span class="nx">utils</span><span class="o">.</span><span class="nf">resource_op</span><span class="p">({</span><span class="s2">&#34;delete&#34;</span><span class="p">}</span><span class="o">,</span><span class="w"> </span><span class="p">[</span><span class="err">`^</span><span class="nx">google_storage_bucket$</span><span class="err">`</span><span class="p">])</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">res</span><span class="o">.</span><span class="nx">change</span><span class="o">.</span><span class="nx">after</span><span class="o">.</span><span class="nx">force_destroy</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">deny</span><span class="w"> </span><span class="kd">contains</span><span class="w"> </span><span class="nx">msg</span><span class="w"> </span><span class="kd">if</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="kd">some</span><span class="w"> </span><span class="nx">res</span><span class="w"> </span><span class="kd">in</span><span class="w"> </span><span class="nx">delete_buckets_with_force_destroy</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">msg</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nf">sprintf</span><span class="p">(</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;The resource &#39;%s&#39; will be force-deleted, which might cause data loss&#34;</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="p">[</span><span class="nx">res</span><span class="o">.</span><span class="nx">address</span><span class="p">]</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>As our collection of policies grew, we ended up with a few extra packages for reusing some common behavior, such as:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego"><span class="line"><span class="cl"><span class="kd">package</span><span class="w"> </span><span class="nx">utils</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">import</span><span class="w"> </span><span class="nx">rego</span><span class="o">.</span><span class="nx">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">resource_op</span><span class="p">(</span><span class="nx">operations</span><span class="o">,</span><span class="w"> </span><span class="nx">resource_types</span><span class="p">)</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="p">[</span><span class="nx">res</span><span class="w"> </span><span class="o">|</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="kd">some</span><span class="w"> </span><span class="nx">res</span><span class="w"> </span><span class="kd">in</span><span class="w"> </span><span class="nx">input</span><span class="o">.</span><span class="nx">resource_changes</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="kd">some</span><span class="w"> </span><span class="nx">action</span><span class="w"> </span><span class="kd">in</span><span class="w"> </span><span class="nx">res</span><span class="o">.</span><span class="nx">change</span><span class="o">.</span><span class="nx">actions</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">operations</span><span class="p">[</span><span class="nx">action</span><span class="p">]</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="kd">some</span><span class="w"> </span><span class="nx">type</span><span class="w"> </span><span class="kd">in</span><span class="w"> </span><span class="nx">resource_types</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">regex</span><span class="o">.</span><span class="nf">match</span><span class="p">(</span><span class="nx">type</span><span class="o">,</span><span class="w"> </span><span class="nx">res</span><span class="o">.</span><span class="nx">type</span><span class="p">)</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">]</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><h2 id="linting-and-testing">Linting and Testing</h2>
<p>I recommend using <a href="https://www.openpolicyagent.org/ecosystem/entry/regal">Regal</a> to avoid common mistakes and help you write more idiomatic Rego code:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">% regal lint .
</span></span><span class="line"><span class="cl"><span class="m">3</span> files linted. No violations found.</span></span></code></pre></div>
<p>It&rsquo;s also important to write tests for your policies, both to accelerate the feedback loop when introducing new policies and to catch regressions early.</p>
<p>Here&rsquo;s an example test suite for that policy:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego"><span class="line"><span class="cl"><span class="kd">package</span><span class="w"> </span><span class="nx">gcp_test</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">import</span><span class="w"> </span><span class="nx">rego</span><span class="o">.</span><span class="nx">v1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="kd">import</span><span class="w"> </span><span class="nx">data</span><span class="o">.</span><span class="nx">gcp</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">test_deny_delete_bucket_with_force_destroy</span><span class="w"> </span><span class="kd">if</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">result</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">gcp</span><span class="o">.</span><span class="nx">deny</span><span class="w"> </span><span class="kd">with</span><span class="w"> </span><span class="nx">input</span><span class="w"> </span><span class="kd">as</span><span class="w"> </span><span class="p">{</span><span class="s2">&#34;resource_changes&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;address&#34;</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;google_storage_bucket.my_bucket&#34;</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;type&#34;</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;google_storage_bucket&#34;</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;change&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">			</span><span class="s2">&#34;actions&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;delete&#34;</span><span class="p">]</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">			</span><span class="s2">&#34;after&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s2">&#34;force_destroy&#34;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">}</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="p">}</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="p">}]}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">result</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="p">{</span><span class="s2">&#34;The resource &#39;google_storage_bucket.my_bucket&#39; will be force-deleted, which might cause data loss&#34;</span><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="n">test_allow_delete_bucket_without_force_destroy</span><span class="w"> </span><span class="kd">if</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nx">result</span><span class="w"> </span><span class="o">:=</span><span class="w"> </span><span class="nx">gcp</span><span class="o">.</span><span class="nx">deny</span><span class="w"> </span><span class="kd">with</span><span class="w"> </span><span class="nx">input</span><span class="w"> </span><span class="kd">as</span><span class="w"> </span><span class="p">{</span><span class="s2">&#34;resource_changes&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">[{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;address&#34;</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;google_storage_bucket.my_bucket&#34;</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;type&#34;</span><span class="p">:</span><span class="w"> </span><span class="s2">&#34;google_storage_bucket&#34;</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="s2">&#34;change&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">			</span><span class="s2">&#34;actions&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">&#34;delete&#34;</span><span class="p">]</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">			</span><span class="s2">&#34;after&#34;</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="s2">&#34;force_destroy&#34;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">}</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">		</span><span class="p">}</span><span class="o">,</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="p">}]}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">	</span><span class="nf">count</span><span class="p">(</span><span class="nx">result</span><span class="p">)</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="mi">0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="p">}</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># TODO: Add tests for other corner cases...</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Running the tests:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">% conftest verify --report full
</span></span><span class="line"><span class="cl">policy/gcp/bucket_force_destroy_tests.rego:
</span></span><span class="line"><span class="cl">data.gcp_test.test_allow_create_bucket_with_force_destroy: PASS <span class="o">(</span>926.401µs<span class="o">)</span>
</span></span><span class="line"><span class="cl">data.gcp_test.test_allow_empty_resource_changes: PASS <span class="o">(</span>1.026078ms<span class="o">)</span>
</span></span><span class="line"><span class="cl">data.gcp_test.test_allow_update_bucket_with_force_destroy: PASS <span class="o">(</span>857.508µs<span class="o">)</span>
</span></span><span class="line"><span class="cl">data.gcp_test.test_deny_delete_bucket_with_force_destroy: PASS <span class="o">(</span>1.506513ms<span class="o">)</span>
</span></span><span class="line"><span class="cl">data.gcp_test.test_allow_delete_bucket_without_force_destroy: PASS <span class="o">(</span>1.319212ms<span class="o">)</span>
</span></span><span class="line"><span class="cl">data.gcp_test.test_allow_delete_other_resource_type: PASS <span class="o">(</span>1.329778ms<span class="o">)</span>
</span></span><span class="line"><span class="cl">--------------------------------------------------------------------------------
</span></span><span class="line"><span class="cl">PASS: 6/6</span></span></code></pre></div>
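<p>As an example of one of the corner cases hinted at by the <code>TODO</code> above: updating a bucket that merely has <code>force_destroy</code> enabled should not be denied, since the policy only targets deletions. A sketch of that test could look like this:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego">package gcp_test

import rego.v1

import data.gcp

# Updating (not deleting) a bucket with force_destroy enabled
# should produce no violations, since the policy matches the
# "delete" action only.
test_allow_update_bucket_with_force_destroy if {
	result := gcp.deny with input as {"resource_changes": [{
		"address": "google_storage_bucket.my_bucket",
		"type": "google_storage_bucket",
		"change": {
			"actions": ["update"],
			"after": {"force_destroy": true},
		},
	}]}

	count(result) == 0
}</code></pre></div>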
<h2 id="what-doesnt-work-so-well">What Doesn&rsquo;t Work so Well</h2>
<p>The following points are specific to the built-in OPA integration for Atlantis, which is what we currently use.</p>
<h3 id="ux-for-policy--violations">UX for Policy Violations</h3>
<p>If you have a single policy set with a couple dozen policies and a single approver, the UX is good enough. But as your library of policies grows and you start separating them into different policy sets with different owners/approvers, the policy violation messages can get long and confusing.</p>
<figure><img src="/posts/opa-policies-without-breaking-the-bank/notification.webp"
    alt="Atlantis message showing policy violations" width="1849" height="1587"><figcaption>
      <p>Policy violation notification as an Atlantis comment.</p>
    </figcaption>
</figure>

<p>By default, this notification is also sent even when no policy violations are detected, increasing the comment noise in your PRs. To mitigate this, you can experiment with the server flag <code>--quiet-policy-checks</code>.</p>
<h3 id="workflow-for-managing-policy-sets">Workflow for Managing Policy Sets</h3>
<p>Policy sets are configured via the Atlantis server-side repo configuration, which is a static configuration file:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">policies</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">owners</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">teams</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="l">sre</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">policy_sets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">gcp.common</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">&lt;CODE_DIRECTORY&gt;/policies/gcp/common/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">source</span><span class="p">:</span><span class="w"> </span><span class="l">local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">gcp.iam</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">path</span><span class="p">:</span><span class="w"> </span><span class="l">&lt;CODE_DIRECTORY&gt;/policies/gcp/iam/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">source</span><span class="p">:</span><span class="w"> </span><span class="l">local</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">approve_count</span><span class="p">:</span><span class="w"> </span><span class="m">2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">owners</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">teams</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">sre</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">sec</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">...</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>It&rsquo;s okay to start creating policies in a single set, but as you grow, you&rsquo;ll likely need to assign different owners for different policy sets.</p>
<p>Also, depending on how frequently you update your policies, you might want to decouple the policy update workflow from the Atlantis server lifecycle. Conftest (and thus Atlantis) can <a href="https://www.conftest.dev/sharing/">fetch policies</a> from a GitHub repository, OCI repository, etc., so use this to your advantage &ndash; but you need to be careful to sync disruptive changes (e.g. directory structure changes) to the Atlantis server-side repo configuration.</p>
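<p>For instance, policies can be pulled with Conftest from remote sources along these lines (the repository locations below are hypothetical; see the Conftest docs for the full list of supported sources):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"># Pull policies from a git repository (hypothetical repo)
% conftest pull github.com/my-org/terraform-policies//policy

# Pull policies pushed to an OCI registry (hypothetical registry/ref)
% conftest pull oci://ghcr.io/my-org/terraform-policies:latest</code></pre></div>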
<h3 id="no-built-in-support-for-policy-notifications">No Built-In Support for Policy Notifications</h3>
<p>When a PR triggers one or more policy violations, all Atlantis does is create a comment in the PR. In our case, when that happens, the users themselves escalate to the SRE team for review or approval, but this might not work for you.</p>
<p>If you need other ways to be notified about policy violations (e.g. via Slack), you&rsquo;ll need to implement it yourself.</p>
<blockquote>
<p>One way could be writing a small GitHub App that receives hooks for all comments in a PR, detecting the policy violation comments, parsing/forwarding them to the owners via Slack, and maybe also providing actions directly in the notification for approval.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
</blockquote>
<h3 id="policy-approvals-vs-branch-syncs">Policy Approvals vs Branch Syncs</h3>
<p>If you approve a policy for a branch that&rsquo;s out of sync with the main branch, the policy approval will be &ldquo;erased&rdquo; after the rebase and you&rsquo;ll have to approve it again, even when none of the code in the PR has changed. This is particularly annoying for changes that require lots of back and forth due to apply errors.</p>
<p>If you want to resolve this, you&rsquo;ll have to cook something yourself.</p>
<blockquote>
<p>Again, this could be some sort of GitHub App that automatically re-approves a PR that was once approved by a human, as long as the set of violated policies and resources involved does not change.</p>
</blockquote>
<h3 id="atlantis-api">Atlantis API</h3>
<p>I once tried to enable the <a href="https://www.runatlantis.io/docs/api-endpoints">Atlantis API</a> to implement basic drift detection and correction, but I could not make it work. Since the Atlantis primitives are built on top of pull requests, even when using the API you need to provide a valid PR number when asking for a plan or apply &ndash; and even then, things didn&rsquo;t quite work as I expected.</p>
<h2 id="recommendations">Recommendations</h2>
<ul>
<li>Let me repeat myself: <strong>lint and write tests for your policies!</strong></li>
<li>Add some unique identifier to each policy error (e.g. <code>GCP001</code>); this allows you to track how frequently each policy is triggered, and maybe even provide a nice wiki for your users that explains each policy in detail, how to resolve the issue, etc.</li>
<li>Avoid putting too much stuff in the same file; keep each policy in a separate file.</li>
<li>Periodically review your policies; check whether they are blocking stuff that needs to be blocked, or just introducing noise without real benefits; remove bad/useless policies.</li>
<li>Be cautious when adopting third-party policy packs unless you have a way to granularly enable or disable the specific policies that do not make sense for your own situation.</li>
</ul>
<blockquote>
<p>About the <em>periodically reviewing your policies</em> part: In one of the reviews, we identified that a single policy targeting a single resource accounted for 46% of all policy violation errors we had in the previous 90 days!</p>
</blockquote>
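<p>As a sketch of the unique-identifier recommendation, the force-destroy deny rule shown earlier could prefix its message with a hypothetical policy ID (here, <code>GCP001</code>):</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-rego" data-lang="rego">package gcp

import rego.v1

# "GCP001" is a hypothetical policy ID; a stable identifier like
# this lets you count how often each policy triggers and link each
# violation message to a wiki page explaining it.
deny contains msg if {
	some res in delete_buckets_with_force_destroy

	msg := sprintf(
		"[GCP001] The resource '%s' will be force-deleted, which might cause data loss",
		[res.address],
	)
}</code></pre></div>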
<h2 id="conclusion">Conclusion</h2>
<p>If you ask me whether this is the dream workflow for Terraform code, I&rsquo;d be lying if I said it is. But hey, it&rsquo;s open source, works mostly fine as long as you don&rsquo;t try to do anything too crazy, and <strong>helped us a lot.</strong> I even made a <a href="/open-source-contributions/">contribution</a> to the project a couple of years ago.</p>
<p>If my employer had deeper pockets, we might have gone in other directions, but I&rsquo;m glad Atlantis exists. Now that the project has been <a href="https://www.cncf.io/projects/atlantis/">accepted into the CNCF</a>, my hopes are up!</p>
<p>For those who use Atlantis and have stumbled across some of these issues, I&rsquo;d be happy to learn how you addressed them!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Terraform is a great tool, but not ideal as the go-to tool for development teams as it operates at a lower level. In the future we might explore tools like <a href="https://www.crossplane.io/">Crossplane</a>, where we could build APIs based on abstractions that make sense for our own reality.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Terraform modules are important tools for tasks like these, allowing you to &lsquo;codify&rsquo; your organization&rsquo;s standards directly as modules and reuse them where needed. In our case, we already had code managing cloud resources directly, so migrating everything to modules up front was too much work. With OPA, we could stop the bleeding by at least preventing the creation of new non-conformant resources and start getting value early in the process.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>This can work better or worse for your organization depending on a number of factors, such as the team maturity and the restrictiveness (or looseness) of your policies. You need periodic reviews to identify policies that are constantly violated and fine-tune them for a better balance between noise and safety.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>That&rsquo;s one of the things that we didn&rsquo;t have the time to polish, being a small team with too much stuff on our plates. Maybe I&rsquo;ll take a stab at this and publish as an open source project.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Prometheus for Developers</title>
      <link>https://danielfm.me/posts/prometheus-for-developers/</link>
      <pubDate>Sun, 01 Jul 2018 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/prometheus-for-developers/</guid>
      <description>An introductory tutorial covering Prometheus fundamentals, including metrics collection, querying, alerting, and instrumenting applications with practical examples.</description>
<content:encoded><![CDATA[<p>This is an introductory tutorial I created to teach the software developers
at <a href="https://descomplica.com.br">Descomplica</a> (where I worked at the time) the
basics of <a href="https://prometheus.io">Prometheus</a>.</p>
<h2 id="the-project">The Project</h2>
<p>This tutorial follows a more practical approach (with hopefully just the
right amount of theory!), so we provide a simple Docker Compose configuration
to simplify the project bootstrap.</p>
<p>You can download the code from <a href="https://github.com/danielfm/prometheus-for-developers">here</a>.</p>
<h3 id="pre-requisites">Pre-Requisites</h3>
<ul>
<li>Docker + Docker Compose</li>
</ul>
<h3 id="running-the-code">Running the Code</h3>
<p>Run the following command to start everything up:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker-compose up -d
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Or, if you use podman:</span>
</span></span><span class="line"><span class="cl">$ podman-compose up -d
</span></span></code></pre></td></tr></table>
</div>
</div><ul>
<li>Alertmanager: <a href="http://localhost:9093">http://localhost:9093</a></li>
<li>Grafana: <a href="http://localhost:3000">http://localhost:3000</a> (user/password: <code>admin</code>)</li>
<li>Prometheus: <a href="http://localhost:9090">http://localhost:9090</a></li>
<li>Sample Node.js Application: <a href="http://localhost:4000">http://localhost:4000</a></li>
</ul>
<h3 id="cleaning-up">Cleaning Up</h3>
<p>Run the following command to stop all running containers from this project:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker-compose rm -fs
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="prometheus-overview">Prometheus Overview</h2>
<p>Prometheus is an open source monitoring system and time-series database
(TSDB) designed after
<a href="https://landing.google.com/sre/book/chapters/practical-alerting.html">Borgmon</a>,
the monitoring tool created internally at Google for collecting metrics
from jobs running in their cluster orchestration platform,
<a href="https://ai.google/research/pubs/pub43438">Borg</a>.</p>
<p>The following image shows an overview of the Prometheus architecture.</p>
<figure><a href="https://prometheus.io/docs/introduction/overview/"><img src="/posts/prometheus-for-developers/prometheus-architecture.webp"
    alt="Prometheus architecture" width="1760" height="1232"></a><figcaption>
      <p>Prometheus architecture.</p>
    </figcaption>
</figure>

<p>In the center we have a <strong>Prometheus server</strong>, which is the component
responsible for periodically collecting and storing metrics from various
<strong>targets</strong> (e.g. the services you want to collect metrics from).</p>
<p>The list of <strong>targets</strong> can be statically defined in the Prometheus
configuration file, or we can use other means to automatically discover
those targets via <strong>Service discovery</strong>. For instance, if you want to monitor
a service deployed on EC2 instances in AWS, you can configure Prometheus
to use the AWS EC2 API to discover which instances are running that
service and then <em>scrape</em> metrics from them. This is preferable to
statically listing the instances&rsquo; IP addresses in the Prometheus
configuration file, since a static list will eventually get out of sync,
especially in a dynamic environment such as a public cloud provider.</p>
<p>Prometheus also provides a basic <strong>Web UI</strong> for running queries on the stored
data, as well as integrations with popular visualization tools, such as
<a href="https://grafana.net">Grafana</a>.</p>
<h3 id="push-vs-pull">Push vs Pull</h3>
<p>Previously, we mentioned that the <strong>Prometheus server</strong> <em>scrapes</em> (or pulls)
metrics from our <strong>target</strong> applications.</p>
<p>This means Prometheus took a different approach than other &ldquo;traditional&rdquo;
monitoring tools, such as <a href="https://github.com/etsy/statsd">StatsD</a>, in
which applications <em>push</em> metrics to the metrics server or aggregator,
instead of having the metrics server <em>pulling</em> metrics from applications.</p>
<p>One consequence of this design is a better separation of concerns: when
an application pushes metrics to a metrics server or aggregator, it has
to decide where to push the metrics, how often to push them, and whether
to aggregate or consolidate any metrics before pushing, among other
things.</p>
<p>In <em>pull-based</em> monitoring systems like Prometheus, these decisions go
away; for instance, we no longer have to redeploy our applications to
change the metrics resolution (how many data points are collected per minute) or
the monitoring server endpoint (we can architect the monitoring system in a
way that is completely transparent to application developers).</p>
<blockquote>
<p><strong>Want to know more?</strong> The Prometheus documentation provides a
<a href="https://prometheus.io/docs/introduction/comparison/">comparison</a> with
other tools in the monitoring space regarding scope, data model, and storage.</p>
</blockquote>
<p>Now, if the application doesn&rsquo;t push metrics to the metrics server, how do
the application&rsquo;s metrics end up in Prometheus?</p>
<h3 id="metrics-endpoint">Metrics Endpoint</h3>
<p>Applications expose metrics to Prometheus via a <em>metrics endpoint</em>. To see how
this works, let&rsquo;s start everything by running <code>docker-compose up -d</code> if you
haven&rsquo;t already.</p>
<p>Visit <a href="http://localhost:3000">http://localhost:3000</a> to open Grafana and log in with the default
<code>admin</code> user and password. Then, click on the top link &ldquo;Home&rdquo; and select the
&ldquo;Prometheus 2.0 Stats&rdquo; dashboard.</p>
<figure><img src="/posts/prometheus-for-developers/dashboard-prometheus.webp"
    alt="Prometheus 2.0 Stats Dashboard" width="1680" height="953"><figcaption>
      <p>Prometheus 2.0 Stats Dashboard in Grafana.</p>
    </figcaption>
</figure>

<p>Yes, Prometheus is <em>scraping</em> metrics from itself!</p>
<p>Let&rsquo;s pause for a moment to understand what happened. First, Grafana is already
configured with a
<a href="http://docs.grafana.org/features/datasources/prometheus/">Prometheus data source</a>
that points to the local Prometheus server. This is how Grafana is able to
display data from Prometheus. Also, if you look at the Prometheus configuration
file, you can see that we listed Prometheus itself as a target.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># config/prometheus/prometheus.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># Simple scrape configuration for each service</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">scrape_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">job_name</span><span class="p">:</span><span class="w"> </span><span class="l">prometheus</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">static_configs</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">targets</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span>- <span class="l">localhost:9090</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>By default, Prometheus gets metrics via the <code>/metrics</code> endpoint in each target,
so if you hit <a href="http://localhost:9090/metrics">http://localhost:9090/metrics</a>, you should see something like
this:</p>
<pre tabindex="0"><code># HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile=&#34;0&#34;} 5.95e-05
go_gc_duration_seconds{quantile=&#34;0.25&#34;} 0.0001589
go_gc_duration_seconds{quantile=&#34;0.5&#34;} 0.0002188
go_gc_duration_seconds{quantile=&#34;0.75&#34;} 0.0004158
go_gc_duration_seconds{quantile=&#34;1&#34;} 0.0090565
go_gc_duration_seconds_sum 0.0331214
go_gc_duration_seconds_count 47
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 39
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version=&#34;go1.10.3&#34;} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 3.7429992e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 1.37005104e+08
...
</code></pre><p>In this snippet alone we can notice a few interesting things:</p>
<ol>
<li>Each metric has a user friendly description that explains its purpose</li>
<li>Each metric may define additional dimensions, also known as <strong>labels</strong>. For
instance, the metric <code>go_info</code> has a <code>version</code> label
<ul>
<li>Every time series is uniquely identified by its metric name and the set of
label-value pairs</li>
</ul>
</li>
<li>Each metric is defined as a specific type, such as <code>summary</code>, <code>gauge</code>,
<code>counter</code>, and <code>histogram</code>. More information on each data type can be found
<a href="https://prometheus.io/docs/concepts/metric_types/">here</a></li>
</ol>
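<p>To make the exposition format more concrete, here is a rough sketch of how a
single sample line could be parsed into a metric name, labels, and a value.
This is only an illustration, not the actual Prometheus parser, and it ignores
comments, label-value escaping, and optional timestamps:</p>

```javascript
// Minimal, illustrative parser for one line of the Prometheus text
// exposition format. Real parsers also handle escaping, timestamps,
// and label values containing commas; this sketch does not.
function parseSampleLine(line) {
  const match = line.match(/^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$/);
  if (!match) return null; // e.g. "# HELP" / "# TYPE" comment lines
  const [, name, labelsPart, value] = match;
  const labels = {};
  if (labelsPart) {
    for (const pair of labelsPart.split(',')) {
      const [key, val] = pair.split('=');
      labels[key] = val.replace(/^"|"$/g, ''); // strip surrounding quotes
    }
  }
  return { name, labels, value: parseFloat(value) };
}

console.log(parseSampleLine('go_goroutines 39'));
// → { name: 'go_goroutines', labels: {}, value: 39 }
console.log(parseSampleLine('go_info{version="go1.10.3"} 1'));
// → { name: 'go_info', labels: { version: 'go1.10.3' }, value: 1 }
```

<p>Each parsed sample then becomes one data point in the time series identified
by the metric name and its label-value pairs.</p>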
<p>But how does this text-based response turn into data points in a time-series
database?</p>
<p>The best way to understand this is by running a few simple queries.</p>
<p>Open the Prometheus UI at <a href="http://localhost:9090/graph">http://localhost:9090/graph</a>, type
<code>process_resident_memory_bytes</code> in the text field and hit <em>Execute</em>.</p>
<figure><img src="/posts/prometheus-for-developers/prometheus-query.webp"
    alt="Prometheus Query Example" width="1699" height="846"><figcaption>
      <p>Example of querying process_resident_memory_bytes in the Prometheus UI.</p>
    </figcaption>
</figure>

<p>You can use the graph controls to zoom into a specific region.</p>
<p>This first query is very simple as it only plots the value of the
<code>process_resident_memory_bytes</code> gauge as time passes, and as you might
have guessed, that query displays the resident memory usage for each target,
in bytes.</p>
<p>Since our setup uses a 5-second scrape interval, Prometheus will hit the
<code>/metrics</code> endpoint of our targets every 5 seconds to fetch the current
metrics and store those data points sequentially, indexed by timestamp.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># In prometheus.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">global</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">scrape_interval</span><span class="p">:</span><span class="w"> </span><span class="l">5s</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>You can see all the samples for that metric from the past minute by querying
<code>process_resident_memory_bytes{job=&quot;grafana&quot;}[1m]</code> (select <em>Console</em> in the
Prometheus UI):</p>
<table>
  <thead>
      <tr>
          <th>Element</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>process_resident_memory_bytes{instance=&quot;grafana:3000&quot;,job=&quot;grafana&quot;}</code></td>
          <td><code>40861696@1530461477.446</code><br/><code>43298816@1530461482.447</code><br/><code>43778048@1530461487.451</code><br/><code>44785664@1530461492.447</code><br/><code>44785664@1530461497.447</code><br/><code>45043712@1530461502.448</code><br/><code>45043712@1530461507.44</code><br/><code>45301760@1530461512.451</code><br/><code>45301760@1530461517.448</code><br/><code>45301760@1530461522.448</code><br/><code>45895680@1530461527.448</code><br/><code>45895680@1530461532.447</code></td>
      </tr>
  </tbody>
</table>
<p>Queries with a range duration appended in square brackets after
the metric name (i.e. <code>&lt;metric&gt;[&lt;duration&gt;]</code>) return what is called a
<em>range vector</em>, in which <code>&lt;duration&gt;</code> specifies how far back in time
values should be fetched for each resulting range vector element.</p>
<p>In this example, the value for the <code>process_resident_memory_bytes</code>
metric at the timestamp <code>1530461477.446</code> was <code>40861696</code>, and so on.</p>
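<p>Conceptually, a range vector is just that list of value/timestamp pairs, and
PromQL functions operate over it. As an illustrative sketch (not the real
PromQL implementation, which also extrapolates to the range boundaries), a
function like <code>delta()</code> essentially compares the first and last
samples in the range. The values below are taken from the table above:</p>

```javascript
// A range vector is a list of (value, timestamp) samples; values here
// come from the process_resident_memory_bytes table above.
const rangeVector = [
  { value: 40861696, timestamp: 1530461477.446 },
  { value: 43298816, timestamp: 1530461482.447 },
  // ... intermediate samples elided ...
  { value: 45895680, timestamp: 1530461532.447 },
];

// Simplified sketch of delta(): difference between the last and first
// sample in the range (the real function also extrapolates).
function delta(samples) {
  return samples[samples.length - 1].value - samples[0].value;
}

console.log(delta(rangeVector)); // → 5033984 bytes of memory growth
```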
<h3 id="duplicate-metrics-names">Duplicate Metrics Names?</h3>
<p>If you inspect the contents of the <code>/metrics</code> endpoint at all our targets,
you&rsquo;ll see that multiple targets export metrics under the same name.</p>
<p>But isn&rsquo;t this a problem? If we are exporting metrics under the same name,
how can we be sure we are not mixing metrics between different applications
into the same time series data?</p>
<p>Consider the previous metric, <code>process_resident_memory_bytes</code>: Grafana,
Prometheus, and our sample application all export a gauge metric under the
same name. However, did you notice in the previous plot that somehow we were
able to get a separate time series from each application?</p>
<p>Quoting the
<a href="https://prometheus.io/docs/concepts/jobs_instances/">documentation</a>:</p>
<blockquote>
<p>In Prometheus terms, an endpoint you can scrape is called an <strong>instance</strong>,
usually corresponding to a single process. A collection of instances with
the same purpose, a process replicated for scalability or reliability for
example, is called a <strong>job</strong>.</p>
<p>When Prometheus scrapes a target, it attaches some labels automatically to
the scraped time series which serve to identify the scraped target:</p>
<ul>
<li><code>job</code> - The configured job name that the target belongs to.</li>
<li><code>instance</code> - The <code>&lt;host&gt;:&lt;port&gt;</code> part of the target&rsquo;s URL that was scraped.</li>
</ul>
</blockquote>
<p>Since our configuration has three different targets (with one instance each)
exposing this metric, we can see three lines in that plot.</p>
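<p>One way to picture this, as a rough sketch rather than the actual Prometheus
implementation: each time series is identified by a key built from the metric
name plus all of its label-value pairs, including the <code>job</code> and
<code>instance</code> labels attached at scrape time, so equal metric names with
different labels remain separate series:</p>

```javascript
// Illustrative sketch: build a canonical identity key for a time series
// from its metric name and sorted label-value pairs.
function seriesKey(name, labels) {
  const sorted = Object.keys(labels)
    .sort()
    .map((k) => `${k}="${labels[k]}"`)
    .join(',');
  return `${name}{${sorted}}`;
}

const a = seriesKey('process_resident_memory_bytes', {
  job: 'grafana', instance: 'grafana:3000',
});
const b = seriesKey('process_resident_memory_bytes', {
  job: 'prometheus', instance: 'localhost:9090',
});

console.log(a);
// → process_resident_memory_bytes{instance="grafana:3000",job="grafana"}
console.log(a !== b); // → true: same metric name, two distinct series
```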
<h3 id="monitoring-uptime">Monitoring Uptime</h3>
<p>For each instance scrape, Prometheus stores an <code>up</code> metric with the value <code>1</code>
when the instance is healthy, i.e. reachable, or <code>0</code> if the scrape failed.</p>
<p>Try plotting the query <code>up</code> in the Prometheus UI.</p>
<p>If you followed every instruction up until this point, you&rsquo;ll notice that
so far all targets were reachable at all times.</p>
<p>Let&rsquo;s change that. Run <code>docker-compose stop sample-app</code> and after a few
seconds you should see the <code>up</code> metric reporting our sample application
is down.</p>
<p>Now run <code>docker-compose restart sample-app</code> and the <code>up</code> metric should
report the application is back up again.</p>
<figure><img src="/posts/prometheus-for-developers/sample-app-downtime.webp"
    alt="Sample application downtime" width="1699" height="846"><figcaption>
      <p>The up metric showing the sample application going down and coming back up.</p>
    </figcaption>
</figure>

<blockquote>
<p><strong>Want to know more?</strong> The Prometheus query UI provides a combo box with all
available metric names registered in its database. Do some exploring, try
querying different ones. For instance, can you plot the file descriptor
handles usage (in %) for all targets? <strong>Tip:</strong> the metric names end with
<code>_fds</code>.</p>
</blockquote>
<h4 id="a-basic-uptime-alert">A Basic Uptime Alert</h4>
<p>We don&rsquo;t want to stare at dashboards on a big TV screen all day just to
detect issues in our applications quickly; after all, we have better things
to do with our time, right?</p>
<p>Luckily, Prometheus provides a facility for defining alerting rules that,
when triggered, notify
<a href="https://prometheus.io/docs/alerting/alertmanager/">Alertmanager</a>, the
component that takes care of deduplicating, grouping, and routing alerts
to the correct receiver integration (e.g. email, Slack, PagerDuty,
OpsGenie). It also takes care of silencing and inhibiting alerts.</p>
<p>Configuring Alertmanager to send notifications to PagerDuty, Slack, or another
receiver is out of the scope of this tutorial, but we can still play around with alerts.</p>
<p>We already have the following alerting rule defined in
<code>config/prometheus/prometheus.rules.yml</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="nt">groups</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span>- <span class="nt">name</span><span class="p">:</span><span class="w"> </span><span class="l">uptime</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">rules</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c"># Uptime alerting rule</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="c"># Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span>- <span class="nt">alert</span><span class="p">:</span><span class="w"> </span><span class="l">ServerDown</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">expr</span><span class="p">:</span><span class="w"> </span><span class="l">up == 0</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">for</span><span class="p">:</span><span class="w"> </span><span class="l">1m</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">labels</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">severity</span><span class="p">:</span><span class="w"> </span><span class="l">page</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">summary</span><span class="p">:</span><span class="w"> </span><span class="l">One or more targets are down</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">          </span><span class="nt">description</span><span class="p">:</span><span class="w"> </span><span class="l">Instance {{ $labels.instance }} of {{ $labels.job }} is down</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><figure><img src="/posts/prometheus-for-developers/prometheus-alerts-1.webp"
    alt="Prometheus alerts" width="1920" height="409"><figcaption>
      <p>Prometheus alerts page showing the ServerDown alert.</p>
    </figcaption>
</figure>

<p>Each alerting rule in Prometheus is also a time series, so in this case you can
query <code>ALERTS{alertname=&quot;ServerDown&quot;}</code> to see the state of that alert at any
point in time; this metric will not return any data points now because so far
no alerts have been triggered.</p>
<p>Let&rsquo;s change that. Run <code>docker-compose stop grafana</code> to kill Grafana. After a
few seconds you should see the <code>ServerDown</code> alert transition to a yellow state,
or <code>PENDING</code>.</p>
<figure><img src="/posts/prometheus-for-developers/prometheus-alerts-2.webp"
    alt="Pending alert" width="1920" height="493"><figcaption>
      <p>The ServerDown alert in PENDING state.</p>
    </figcaption>
</figure>

<p>The alert will stay as <code>PENDING</code> for one minute, which is the threshold we
configured in our alerting rule. After that, the alert will transition to a red
state, or <code>FIRING</code>.</p>
<figure><img src="/posts/prometheus-for-developers/prometheus-alerts-3.webp"
    alt="Firing alert" width="1920" height="494"><figcaption>
      <p>The ServerDown alert in FIRING state.</p>
    </figcaption>
</figure>
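<p>Assuming the rule above (<code>up == 0</code> held for <code>1m</code>), the
inactive/PENDING/FIRING transitions can be sketched as a tiny state machine.
This is only an illustration of the logic, not how Prometheus actually
implements rule evaluation:</p>

```javascript
// Illustrative sketch of the inactive -> PENDING -> FIRING logic for an
// alerting rule with a `for:` duration (60 seconds, as in the rule above).
// Each call represents one rule evaluation at time `now` (in seconds).
const FOR_SECONDS = 60;

function evaluateAlert(state, conditionIsTrue, now) {
  if (!conditionIsTrue) {
    return { phase: 'inactive', activeSince: null };
  }
  // Remember when the condition first became true...
  const activeSince = state.activeSince ?? now;
  // ...and fire only once it has held for the full `for:` duration.
  const phase = now - activeSince >= FOR_SECONDS ? 'FIRING' : 'PENDING';
  return { phase, activeSince };
}

let state = { phase: 'inactive', activeSince: null };
state = evaluateAlert(state, true, 0);   // condition just became true
console.log(state.phase);                // → PENDING
state = evaluateAlert(state, true, 30);  // still within the 1m window
console.log(state.phase);                // → PENDING
state = evaluateAlert(state, true, 60);  // condition held for a full minute
console.log(state.phase);                // → FIRING
state = evaluateAlert(state, false, 90); // target is back up
console.log(state.phase);                // → inactive
```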

<p>After that point, the alert will show up in Alertmanager. Visit
<a href="http://localhost:9093">http://localhost:9093</a> to open the Alertmanager UI.</p>
<figure><img src="/posts/prometheus-for-developers/prometheus-alerts-4.webp"
    alt="Alert in Alertmanager" width="1920" height="592"><figcaption>
      <p>The ServerDown alert displayed in the Alertmanager UI.</p>
    </figcaption>
</figure>

<p>Let&rsquo;s restore Grafana. Run <code>docker-compose restart grafana</code> and the alert
should go back to a green state after a few seconds.</p>
<blockquote>
<p><strong>Want to know more?</strong> There are several alerting rule examples in the
<a href="https://github.com/samber/awesome-prometheus-alerts">awesome-prometheus-alerts</a>
repository for common scenarios and popular systems.</p>
</blockquote>
<h2 id="instrumenting-your-applications">Instrumenting Your Applications</h2>
<p>Let&rsquo;s examine a sample Node.js application we created for this tutorial.</p>
<p>Open the <code>./sample-app/index.js</code> file in your favorite text editor. The
code is fully commented, so you should not have a hard time understanding
it.</p>
<h3 id="measuring-request-durations">Measuring Request Durations</h3>
<p>We can measure request durations with
<a href="https://en.wikipedia.org/wiki/Quantile">percentiles</a> or
<a href="https://en.wikipedia.org/wiki/Arithmetic_mean">averages</a>. Relying
on averages to track request durations is not recommended because averages
can be very misleading (see the <a href="/posts/prometheus-for-developers/#references">References</a> for a few posts on
the pitfalls of averages and how percentiles can help). Percentiles are a
better way to measure durations because they track the user experience
more closely:</p>
<figure><img src="/posts/prometheus-for-developers/percentiles.webp"
    alt="Percentiles as a way to measure user satisfaction" width="2048" height="934"><figcaption>
      <p>Percentiles as a way to measure user satisfaction. Source: <a href="https://twitter.com/rakyll/status/1045075510538035200">Twitter</a></p>
    </figcaption>
</figure>
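<p>To see why averages mislead, consider a hypothetical service where roughly 5%
of requests are slow (made-up numbers, chosen for illustration): the average
looks healthy while the 99th percentile exposes the slow tail.</p>

```javascript
// 95 requests at 10ms and 5 requests at 1010ms: a service where ~5% of
// requests are delayed by about one second (hypothetical numbers).
const durations = [
  ...Array(95).fill(10),
  ...Array(5).fill(1010),
];

const average = durations.reduce((sum, d) => sum + d, 0) / durations.length;

// Simple nearest-rank percentile; real monitoring systems estimate
// percentiles differently (e.g. from histogram buckets).
function percentile(values, p) {
  const sorted = [...values].sort((x, y) => x - y);
  return sorted[Math.ceil(p * sorted.length) - 1];
}

console.log(average);                     // → 60: looks fine...
console.log(percentile(durations, 0.5));  // → 10: the median is fast
console.log(percentile(durations, 0.99)); // → 1010: 1 in 100 users waits >1s
```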

<p>In Prometheus, we can generate percentiles with summaries or histograms.</p>
<p>To show the differences between the two, our sample application exposes
two custom metrics that measure request durations, one via a summary
and the other via a histogram:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="c1">// Summary metric for measuring request durations
</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requestDurationSummary</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">prometheusClient</span><span class="p">.</span><span class="nx">Summary</span><span class="p">({</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric name
</span></span></span><span class="line"><span class="cl">  <span class="nx">name</span><span class="o">:</span> <span class="s1">&#39;sample_app_summary_request_duration_seconds&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric description
</span></span></span><span class="line"><span class="cl">  <span class="nx">help</span><span class="o">:</span> <span class="s1">&#39;Summary of request durations&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Extra dimensions, or labels
</span></span></span><span class="line"><span class="cl">  <span class="c1">// HTTP method (GET, POST, etc), and status code (200, 500, etc)
</span></span></span><span class="line"><span class="cl">  <span class="nx">labelNames</span><span class="o">:</span> <span class="p">[</span><span class="s1">&#39;method&#39;</span><span class="p">,</span> <span class="s1">&#39;status&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// 50th (median), 75th, 90th, 95th, and 99th percentiles
</span></span></span><span class="line"><span class="cl">  <span class="nx">percentiles</span><span class="o">:</span> <span class="p">[</span><span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.75</span><span class="p">,</span> <span class="mf">0.9</span><span class="p">,</span> <span class="mf">0.95</span><span class="p">,</span> <span class="mf">0.99</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">// Histogram metric for measuring request durations
</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requestDurationHistogram</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">prometheusClient</span><span class="p">.</span><span class="nx">Histogram</span><span class="p">({</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric name
</span></span></span><span class="line"><span class="cl">  <span class="nx">name</span><span class="o">:</span> <span class="s1">&#39;sample_app_histogram_request_duration_seconds&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric description
</span></span></span><span class="line"><span class="cl">  <span class="nx">help</span><span class="o">:</span> <span class="s1">&#39;Histogram of request durations&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Extra dimensions, or labels
</span></span></span><span class="line"><span class="cl">  <span class="c1">// HTTP method (GET, POST, etc), and status code (200, 500, etc)
</span></span></span><span class="line"><span class="cl">  <span class="nx">labelNames</span><span class="o">:</span> <span class="p">[</span><span class="s1">&#39;method&#39;</span><span class="p">,</span> <span class="s1">&#39;status&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Duration buckets, in seconds
</span></span></span><span class="line"><span class="cl">  <span class="c1">// 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s
</span></span></span><span class="line"><span class="cl">  <span class="nx">buckets</span><span class="o">:</span> <span class="p">[</span><span class="mf">0.005</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.025</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">10</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>As you can see, with a summary we specify the percentiles we want the
Prometheus client to calculate and report, while with a histogram
we specify the duration buckets in which the observed durations are counted
(e.g. a 300ms observation is recorded by incrementing the
counter corresponding to the 250ms&ndash;500ms bucket).</p>
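<p>As a rough sketch of the histogram side (not the prom-client internals): each
observation lands in the first bucket whose upper bound is greater than or
equal to it, and the buckets are exposed cumulatively as <code>le</code>
(less-than-or-equal) counters:</p>

```javascript
// Illustrative sketch of histogram bucketing, using the same bucket
// boundaries as the sample app. Not the prom-client implementation.
const bounds = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];
const counts = new Array(bounds.length).fill(0);

function observe(seconds) {
  const i = bounds.findIndex((le) => seconds <= le);
  if (i >= 0) counts[i] += 1; // observations above 10s would go to +Inf only
}

observe(0.3); // a 300ms request lands in the 250ms-500ms bucket

// At exposition time, buckets are cumulative: each `le` bucket reports
// all observations less than or equal to its upper bound.
let running = 0;
const cumulative = bounds.map((le, i) => {
  running += counts[i];
  return { le, count: running };
});

console.log(cumulative.filter((b) => b.count > 0));
// → the le=0.5 bucket and every larger bucket each report a count of 1
```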
<p>Our sample application introduces a one-second delay in approximately 5%
of requests, just so we can compare the average response time with the
99th percentile:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="c1">// Main route
</span></span></span><span class="line"><span class="cl"><span class="nx">app</span><span class="p">.</span><span class="nx">get</span><span class="p">(</span><span class="s1">&#39;/&#39;</span><span class="p">,</span> <span class="kr">async</span> <span class="p">(</span><span class="nx">req</span><span class="p">,</span> <span class="nx">res</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// Simulate a 1s delay in ~5% of all requests
</span></span></span><span class="line"><span class="cl">  <span class="k">if</span> <span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">random</span><span class="p">()</span> <span class="o">&lt;=</span> <span class="mf">0.05</span><span class="p">)</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kr">const</span> <span class="nx">sleep</span> <span class="o">=</span> <span class="p">(</span><span class="nx">ms</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="k">return</span> <span class="k">new</span> <span class="nb">Promise</span><span class="p">((</span><span class="nx">resolve</span><span class="p">)</span> <span class="p">=&gt;</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">        <span class="nx">setTimeout</span><span class="p">(</span><span class="nx">resolve</span><span class="p">,</span> <span class="nx">ms</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">      <span class="p">});</span>
</span></span><span class="line"><span class="cl">    <span class="p">};</span>
</span></span><span class="line"><span class="cl">    <span class="kr">await</span> <span class="nx">sleep</span><span class="p">(</span><span class="mi">1000</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="nx">res</span><span class="p">.</span><span class="nx">set</span><span class="p">(</span><span class="s1">&#39;Content-Type&#39;</span><span class="p">,</span> <span class="s1">&#39;text/plain&#39;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl">  <span class="nx">res</span><span class="p">.</span><span class="nx">send</span><span class="p">(</span><span class="s1">&#39;Hello, world!&#39;</span><span class="p">);</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Let&rsquo;s put some load on this server to generate some metrics for us to play
with:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span><span class="lnt">9
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker run --rm -it --net host williamyeh/wrk -c <span class="m">4</span> -t <span class="m">2</span> -d <span class="m">900</span>  http://localhost:4000
</span></span><span class="line"><span class="cl">Running 15m <span class="nb">test</span> @ http://localhost:4000
</span></span><span class="line"><span class="cl">  <span class="m">2</span> threads and <span class="m">4</span> connections
</span></span><span class="line"><span class="cl">  Thread Stats   Avg      Stdev     Max   +/- Stdev
</span></span><span class="line"><span class="cl">    Latency   269.03ms  334.46ms   1.20s    78.31%
</span></span><span class="line"><span class="cl">    Req/Sec    85.61    135.58     1.28k    89.33%
</span></span><span class="line"><span class="cl">  <span class="m">72170</span> requests in 15.00m, 14.94MB <span class="nb">read</span>
</span></span><span class="line"><span class="cl">Requests/sec:     80.18
</span></span><span class="line"><span class="cl">Transfer/sec:     16.99KB
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now run the following queries in the Prometheus UI:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span><span class="lnt">8
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Average response time</span>
</span></span><span class="line"><span class="cl">rate<span class="o">(</span>sample_app_summary_request_duration_seconds_sum<span class="o">[</span>15s<span class="o">])</span> / rate<span class="o">(</span>sample_app_summary_request_duration_seconds_count<span class="o">[</span>15s<span class="o">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># 99th percentile (via summary)</span>
</span></span><span class="line"><span class="cl">sample_app_summary_request_duration_seconds<span class="o">{</span><span class="nv">quantile</span><span class="o">=</span><span class="s2">&#34;0.99&#34;</span><span class="o">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># 99th percentile (via histogram)</span>
</span></span><span class="line"><span class="cl">histogram_quantile<span class="o">(</span>0.99, sum<span class="o">(</span>rate<span class="o">(</span>sample_app_histogram_request_duration_seconds_bucket<span class="o">[</span>15s<span class="o">]))</span> by <span class="o">(</span>le, method, status<span class="o">))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The result of these queries may seem surprising.</p>
<figure><img src="/posts/prometheus-for-developers/sample-app-response-times-1.webp"
    alt="Sample application response times" width="1699" height="2160"><figcaption>
      <p>Comparison between average response time and 99th percentile, showing significant difference.</p>
    </figcaption>
</figure>

<p>The first thing to notice is how the average response time fails to
communicate the actual behavior of the response duration distribution
(avg: 50ms; p99: 1s); the second is how the 99th percentile reported by
the summary (1s) is quite different from the one estimated by the
<code>histogram_quantile()</code> function (~2.2s). How can this be?</p>
<h4 id="quantile-estimation-errors">Quantile Estimation Errors</h4>
<p>Quoting the <a href="https://prometheus.io/docs/practices/histograms/#quantiles">documentation</a>:</p>
<blockquote>
<p>You can use both summaries and histograms to calculate so-called
φ-quantiles, where 0 ≤ φ ≤ 1. The φ-quantile is the observation value
that ranks at number φ*N among the N observations. Examples for
φ-quantiles: The 0.5-quantile is known as the median. The
0.95-quantile is the 95th percentile.</p>
<p>The essential difference between summaries and histograms is that
summaries calculate streaming φ-quantiles on the client side and
expose them directly, while histograms expose bucketed observation
counts and the calculation of quantiles from the buckets of a
histogram happens on the server side using the
<code>histogram_quantile()</code> function.</p>
</blockquote>
<p>In other words, for the quantile estimation from the buckets of a
histogram to be accurate, we need to choose the bucket layout carefully;
if it doesn&rsquo;t match the range and distribution of the actual
observed durations, the estimated quantiles will be inaccurate.</p>
<p>Remembering our current histogram configuration:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="c1">// Histogram metric for measuring request durations
</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requestDurationHistogram</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">prometheusClient</span><span class="p">.</span><span class="nx">Histogram</span><span class="p">({</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric name
</span></span></span><span class="line"><span class="cl">  <span class="nx">name</span><span class="o">:</span> <span class="s1">&#39;sample_app_histogram_request_duration_seconds&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Metric description
</span></span></span><span class="line"><span class="cl">  <span class="nx">help</span><span class="o">:</span> <span class="s1">&#39;Histogram of request durations&#39;</span><span class="p">,</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Extra dimensions, or labels
</span></span></span><span class="line"><span class="cl">  <span class="c1">// HTTP method (GET, POST, etc), and status code (200, 500, etc)
</span></span></span><span class="line"><span class="cl">  <span class="nx">labelNames</span><span class="o">:</span> <span class="p">[</span><span class="s1">&#39;method&#39;</span><span class="p">,</span> <span class="s1">&#39;status&#39;</span><span class="p">],</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Duration buckets, in seconds
</span></span></span><span class="line"><span class="cl">  <span class="c1">// 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s
</span></span></span><span class="line"><span class="cl">  <span class="nx">buckets</span><span class="o">:</span> <span class="p">[</span><span class="mf">0.005</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.025</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">2.5</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">10</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here we are using an <em>exponential</em> bucket configuration, in which the
buckets roughly double in size at every step. This is a widely used pattern:
since we expect our services to respond quickly (i.e. with response times
between 0 and 300ms), we specify more buckets for that range, and fewer
buckets for request durations we consider less likely to occur.</p>
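<p>Bucket layouts like this are easy to generate programmatically; prom-client
ships <code>exponentialBuckets()</code> and <code>linearBuckets()</code> helpers
for exactly this. A plain-JavaScript sketch of the exponential variant:</p>

```javascript
// Plain-JavaScript sketch of an exponential bucket generator, similar
// in spirit to prom-client's exponentialBuckets(start, factor, count)
function exponentialBuckets(start, factor, count) {
  return Array.from({ length: count }, (_, i) => start * factor ** i);
}

// Doubling buckets starting at 5ms
console.log(exponentialBuckets(0.005, 2, 5)); // [0.005, 0.01, 0.02, 0.04, 0.08]
```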
<p>According to the previous plot, all slow requests from our application
are falling into the 1s-2.5s bucket, resulting in this loss of precision
when calculating the 99th percentile.</p>
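<p>To see where the ~2.2s figure comes from, here is a minimal sketch of the
linear interpolation <code>histogram_quantile()</code> performs; the bucket
counts below are hypothetical, chosen to mirror our traffic (95% fast requests,
5% landing just above 1s):</p>

```javascript
// Minimal sketch of histogram_quantile()'s estimation: find the bucket
// containing the target rank, then interpolate linearly within it,
// assuming observations are uniformly distributed inside the bucket.
function histogramQuantile(phi, buckets) {
  // buckets: cumulative counts sorted by upper bound (le)
  const total = buckets[buckets.length - 1].count;
  const rank = phi * total;

  let prevLe = 0;
  let prevCount = 0;
  for (const { le, count } of buckets) {
    if (count >= rank) {
      return prevLe + (le - prevLe) * ((rank - prevCount) / (count - prevCount));
    }
    prevLe = le;
    prevCount = count;
  }
  return prevLe;
}

// Hypothetical counts: 1000 requests, 950 fast ones, and 50 slow ones
// that fell into the (1, 2.5] bucket because they took just over 1s
const buckets = [
  { le: 0.1, count: 950 },
  { le: 1,   count: 950 },
  { le: 2.5, count: 1000 },
];

// The true p99 is just above 1s, but the estimate lands at ~2.2s
console.log(histogramQuantile(0.99, buckets));
```

The 50 slow observations are smeared uniformly across the 1s-2.5s bucket, so
the estimated p99 becomes 1 + 1.5 &times; (990 &minus; 950)/(1000 &minus; 950) = 2.2s,
even though no request actually took that long.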
<p>Since we know our application will take at most ~1s to respond, we can
choose a more appropriate bucket layout:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-js" data-lang="js"><span class="line"><span class="cl"><span class="c1">// Histogram metric for measuring request durations
</span></span></span><span class="line"><span class="cl"><span class="kr">const</span> <span class="nx">requestDurationHistogram</span> <span class="o">=</span> <span class="k">new</span> <span class="nx">prometheusClient</span><span class="p">.</span><span class="nx">Histogram</span><span class="p">({</span>
</span></span><span class="line"><span class="cl">  <span class="c1">// ...
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">// Experimenting a different bucket layout
</span></span></span><span class="line"><span class="cl">  <span class="nx">buckets</span><span class="o">:</span> <span class="p">[</span><span class="mf">0.005</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">,</span> <span class="mf">0.02</span><span class="p">,</span> <span class="mf">0.05</span><span class="p">,</span> <span class="mf">0.1</span><span class="p">,</span> <span class="mf">0.25</span><span class="p">,</span> <span class="mf">0.5</span><span class="p">,</span> <span class="mf">0.8</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">1.2</span><span class="p">,</span> <span class="mf">1.5</span><span class="p">]</span>
</span></span><span class="line"><span class="cl"><span class="p">});</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Let&rsquo;s start a clean Prometheus server with the modified bucket configuration
to see if the quantile estimation improves:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker-compose rm -fs
</span></span><span class="line"><span class="cl">$ docker-compose up -d
</span></span></code></pre></td></tr></table>
</div>
</div><p>If you re-run the load test, now you should get something like this:</p>
<figure><img src="/posts/prometheus-for-developers/sample-app-response-times-2.webp"
    alt="Sample application response times with improved bucket layout" width="1699" height="2160"><figcaption>
      <p>Response times with improved histogram bucket configuration.</p>
    </figcaption>
</figure>

<p>Not quite there, but it&rsquo;s an improvement!</p>
<blockquote>
<p><strong>Want to know more?</strong> If all it takes to achieve highly
accurate histogram data is more buckets, why not use a large number of
small buckets?</p>
</blockquote>
<p>The reason is efficiency. Remember:</p>
<p><strong>more buckets == more time series == more space == slower queries</strong></p>
<p>Let&rsquo;s say you have an SLO (more details on SLOs later) to serve 99% of
requests within 300ms. If all you want to know is whether you are
honoring your SLO or not, it doesn&rsquo;t really matter if the quantile
estimation is not accurate for requests slower than 300ms.</p>
<p>You might also be wondering: if summaries are more precise, why not use
summaries instead of histograms?</p>
<p>Quoting the
<a href="https://prometheus.io/docs/practices/histograms/#errors-of-quantile-estimation">documentation</a>:</p>
<blockquote>
<p>A summary would have had no problem calculating the correct percentile
value in most cases. Unfortunately, we cannot use a summary if we need
to aggregate the observations from a number of instances.</p>
</blockquote>
<p>Histograms are more versatile in this regard. If you have an application
with multiple replicas, you can safely use the <code>histogram_quantile()</code>
function to calculate the 99th percentile across all requests to all
replicas. You cannot do this with summaries. Sure, you can <code>avg()</code> the
99th percentiles of all replicas, or take the <code>max()</code>, but the value you
get is statistically meaningless and cannot be used as a proxy for the
99th percentile.</p>
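<p>A toy example makes the problem concrete (the numbers are made up): suppose
replica A serves 990 fast requests while replica B serves 10 slow ones.</p>

```javascript
// Toy demonstration of why averaging per-replica percentiles is wrong
function percentile(values, phi) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil(phi * sorted.length) - 1];
}

const replicaA = Array(990).fill(0.05); // 990 requests at 50ms
const replicaB = Array(10).fill(2.0);   // 10 requests at 2s

// Averaging the per-replica p99s suggests things are terrible...
const avgOfP99s = (percentile(replicaA, 0.99) + percentile(replicaB, 0.99)) / 2;
console.log(avgOfP99s); // 1.025

// ...but the real p99 across all traffic is 50ms, because 99% of the
// 1000 requests (all of replica A's) completed within that time
const pooledP99 = percentile([...replicaA, ...replicaB], 0.99);
console.log(pooledP99); // 0.05
```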
<h3 id="measuring-throughput">Measuring Throughput</h3>
<p>If you are using a histogram to measure request duration, you can use
the <code>&lt;basename&gt;_count</code> timeseries to measure throughput without having to
introduce another metric.</p>
<p>For instance, if your histogram metric name is
<code>sample_app_histogram_request_duration_seconds</code>, then you can use the
<code>sample_app_histogram_request_duration_seconds_count</code> metric to measure
throughput:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Number of requests per second (data from the past 30s)</span>
</span></span><span class="line"><span class="cl">rate<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>30s<span class="o">])</span>
</span></span></code></pre></td></tr></table>
</div>
</div><figure><img src="/posts/prometheus-for-developers/sample-app-throughput.webp"
    alt="Sample app throughput" width="1699" height="810"><figcaption>
      <p>Requests per second measured using the histogram count metric.</p>
    </figcaption>
</figure>

<h3 id="measuring-memorycpu-usage">Measuring Memory/CPU Usage</h3>
<p>Most Prometheus clients already provide a default set of metrics;
<a href="https://github.com/siimon/prom-client">prom-client</a>, the Prometheus
client for Node.js, does this as well.</p>
<p>Try these queries in the Prometheus UI:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl"><span class="c1"># Gauge that provides the current memory usage, in bytes</span>
</span></span><span class="line"><span class="cl">process_resident_memory_bytes
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Gauge that provides the usage in CPU seconds per second</span>
</span></span><span class="line"><span class="cl">rate<span class="o">(</span>process_cpu_seconds_total<span class="o">[</span>30s<span class="o">])</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If you use <code>wrk</code> to put some load on our sample application, you might
see something like this:</p>
<figure><img src="/posts/prometheus-for-developers/sample-app-memory-cpu-usage.webp"
    alt="Sample app memory/CPU usage" width="1699" height="1559"><figcaption>
      <p>Memory and CPU usage metrics from the sample application.</p>
    </figcaption>
</figure>

<p>You can compare these metrics with the data given by <code>docker stats</code> to see if
they agree with each other.</p>
<blockquote>
<p><strong>Want to know more?</strong> Our sample application exports additional
metrics that expose some internal Node.js information, such as GC runs, heap
usage by type, event loop lag, and currently active handles/requests. Plot
those metrics in the Prometheus UI, and see how they behave when you put some
load on the application.</p>
</blockquote>
<p>A sample dashboard containing all those metrics is also available in our
Grafana server at <a href="http://localhost:3000">http://localhost:3000</a>.</p>
<h3 id="measuring-slos-and-error-budgets">Measuring SLOs and Error Budgets</h3>
<blockquote>
<p>Managing service reliability is largely about managing risk, and managing risk
can be costly.</p>
<p>100% is probably never the right reliability target: not only is it impossible
to achieve, it&rsquo;s typically more reliability than a service&rsquo;s users want or
notice.</p>
</blockquote>
<p>SLOs, or <em>Service Level Objectives</em>, are one of the main tools employed by
<a href="https://landing.google.com/sre/books/">Site Reliability Engineers (SREs)</a> for
making data-driven decisions about reliability.</p>
<p>SLOs are based on SLIs, or <em>Service Level Indicators</em>, which are the key metrics
that define how well (or how poorly) a given service is operating. Common SLIs
would be the number of failed requests, the number of requests slower than some
threshold, etc. Although different types of SLOs can be useful for different
types of systems, most HTTP-based services will have SLOs that can be
classified into two categories: <strong>availability</strong> and <strong>latency</strong>.</p>
<p>For instance, let&rsquo;s say these are the SLOs for our sample application:</p>
<table>
  <thead>
      <tr>
          <th>Category</th>
          <th>SLI</th>
          <th>SLO</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Availability</td>
          <td>The proportion of successful requests; any HTTP status other than 500-599 is considered successful</td>
          <td>95% successful requests</td>
      </tr>
      <tr>
          <td>Latency</td>
          <td>The proportion of requests with duration less than or equal to 100ms</td>
          <td>95% requests under 100ms</td>
      </tr>
  </tbody>
</table>
<p>The difference between 100% and the SLO is what we call the <em>Error Budget</em>.
In this example, the error budget for both SLOs is 5%; if the application
receives 1,000 requests during the SLO window (let&rsquo;s say one minute for the
purposes of this tutorial), it means that 50 requests can fail and we&rsquo;ll
still meet our SLO.</p>
<p>But do we need additional metrics for keeping track of these SLOs? Probably
not. If you are tracking request durations with a histogram (as we are here),
chances are you don&rsquo;t need to do anything else. You already have all the
metrics you need!</p>
<p>Let&rsquo;s send a few requests to the server so we can play around with the metrics:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl">$ <span class="k">while</span> true<span class="p">;</span> <span class="k">do</span> curl -s http://localhost:4000 &gt; /dev/null <span class="p">;</span> <span class="k">done</span>
</span></span></code></pre></td></tr></table>
</div>
</div><div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-sh" data-lang="sh"><span class="line"><span class="cl"><span class="c1"># Number of requests served in the SLO window</span>
</span></span><span class="line"><span class="cl">sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Number of requests that violated the latency SLO (all requests that took more than 100ms to be served)</span>
</span></span><span class="line"><span class="cl">sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span> - sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_bucket<span class="o">{</span><span class="nv">le</span><span class="o">=</span><span class="s2">&#34;0.1&#34;</span><span class="o">}[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Number of requests in the error budget: (100% - [slo threshold]) * [number of requests served]</span>
</span></span><span class="line"><span class="cl"><span class="o">(</span><span class="m">1</span> - 0.95<span class="o">)</span> * sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Remaining requests in the error budget: [number of requests in the error budget] - [number of requests that violated the latency SLO]</span>
</span></span><span class="line"><span class="cl"><span class="o">(</span><span class="m">1</span> - 0.95<span class="o">)</span> * sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span> - <span class="o">(</span>sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span> - sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_bucket<span class="o">{</span><span class="nv">le</span><span class="o">=</span><span class="s2">&#34;0.1&#34;</span><span class="o">}[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Remaining requests in the error budget as a ratio: ([number of requests in the error budget] - [number of requests that violated the SLO]) / [number of requests in the error budget]</span>
</span></span><span class="line"><span class="cl"><span class="o">((</span><span class="m">1</span> - 0.95<span class="o">)</span> * sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span> - <span class="o">(</span>sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)</span> - sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_bucket<span class="o">{</span><span class="nv">le</span><span class="o">=</span><span class="s2">&#34;0.1&#34;</span><span class="o">}[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">)))</span> / <span class="o">((</span><span class="m">1</span> - 0.95<span class="o">)</span> * sum<span class="o">(</span>increase<span class="o">(</span>sample_app_histogram_request_duration_seconds_count<span class="o">[</span>1m<span class="o">]))</span> by <span class="o">(</span>job<span class="o">))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Due to the simulated scenario in which ~5% of requests take 1s to complete,
if you try the last query you should see that the available budget hovers
around 0%; that is, we have no budget left to spend and will inevitably
break the latency SLO if requests start taking any longer to be served.
This is not a good place to be.</p>
<figure><img src="/posts/prometheus-for-developers/slo-1.webp"
    alt="Error Budget Burn Rate of 1x" width="2702" height="1006"><figcaption>
      <p>Error budget with 95% SLO showing burn rate of 1x.</p>
    </figcaption>
</figure>

<p>But what if we had a more strict SLO, say, 99% instead of 95%? What would be
the impact of these slow requests on the error budget?</p>
<p>Just replace all <code>0.95</code> with <code>0.99</code> in that query to see what would happen:</p>
<figure><img src="/posts/prometheus-for-developers/slo-2.webp"
    alt="Error Budget Burn Rate of 3x" width="2670" height="1006"><figcaption>
      <p>Error budget with 99% SLO showing burn rate of 3x.</p>
    </figcaption>
</figure>

<p>In the previous scenario with the 95% SLO, the SLO <em>burn rate</em> was ~1x, which
means the whole error budget was being consumed during the SLO window, that is,
in 60 seconds. Now, with the 99% SLO, the burn rate was ~3x, which means that
instead of taking one minute for the error budget to be exhausted, it now takes
only ~20 seconds!</p>
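<p>The arithmetic behind these numbers is simple: the burn rate is the observed
bad-event ratio divided by the error budget. A quick sketch (the 3% bad-request
ratio in the second call is hypothetical, back-solved from the ~3x burn rate
observed above):</p>

```javascript
// Sketch: burn rate = observed bad-event ratio / error budget, where
// the error budget is (1 - SLO target). A burn rate of 1x consumes the
// whole budget in exactly one SLO window; 3x consumes it in a third.
function burnRate(badRatio, sloTarget) {
  return badRatio / (1 - sloTarget);
}

// With a 5% budget, 5% bad requests burn at 1x: a 60s window lasts 60s
console.log(burnRate(0.05, 0.95)); // ≈ 1

// Against a 1% budget, a hypothetical ~3% bad-request ratio burns at
// ~3x, exhausting the budget in ~20s of the 60s window
console.log(burnRate(0.03, 0.99)); // ≈ 3
```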
<p>Now change <code>curl</code> to point to the <code>/metrics</code> endpoint, which does not have
the simulated long latency for 5% of the requests, and you should see the error
budget go back up to 100%:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ <span class="k">while</span> true<span class="p">;</span> <span class="k">do</span> curl -s http://localhost:4000/metrics &gt; /dev/null <span class="p">;</span> <span class="k">done</span>
</span></span></code></pre></td></tr></table>
</div>
</div><figure><img src="/posts/prometheus-for-developers/slo-3.webp"
    alt="Error Budget Replenished" width="2688" height="1004"><figcaption>
      <p>Error budget returning to 100% after switching to fast endpoint.</p>
    </figcaption>
</figure>

<blockquote>
<p><strong>Want to know more?</strong> These queries are for calculating the error budget for
the <strong>latency</strong> SLO by measuring the number of requests slower than 100ms. Now
try to modify those queries to calculate the error budget for the
<strong>availability</strong> SLO (requests with <code>status=~&quot;5..&quot;</code>), and modify the sample
application to return an HTTP 5xx error for some requests so you can validate
the queries.</p>
</blockquote>
<p>The
<a href="https://landing.google.com/sre/books/">Site Reliability Workbook</a> is a great
resource on this topic and includes more advanced concepts such as how to alert
based on SLO burn rate as a way to improve alert precision/recall and
detection/reset times.</p>
<h3 id="monitoring-applications-without-a-metrics-endpoint">Monitoring Applications Without a Metrics Endpoint</h3>
<p>We learned that Prometheus needs all applications to expose a <code>/metrics</code>
HTTP endpoint for it to scrape metrics. But what if you want to monitor
a MySQL instance, which does not provide a Prometheus metrics endpoint?
What can we do?</p>
<p>That&rsquo;s where <em>exporters</em> come in. The
<a href="https://prometheus.io/docs/instrumenting/exporters/">documentation</a> provides a
comprehensive list of official and third-party exporters for a variety of
systems, such as databases, messaging systems, cloud providers, and so forth.</p>
<p>For a very simplistic example, check out the
<a href="https://github.com/danielfm/aws-limits-exporter">aws-limits-exporter</a>
project, which is about 200 lines of Go code.</p>
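<p>At its core, an exporter is just a small HTTP server that translates the
target system&rsquo;s state into the Prometheus text exposition format on every
scrape. A sketch of the payload an exporter must produce (the metric name and
value here are made up):</p>

```javascript
// Sketch of the Prometheus text exposition format. A real exporter
// would serve this payload over HTTP at /metrics, fetching the value
// from the target system (e.g. from SHOW GLOBAL STATUS, for MySQL)
// on each scrape. Metric name and value here are made up.
function renderMetrics(threadsConnected) {
  return [
    '# HELP mysql_threads_connected Number of currently open connections',
    '# TYPE mysql_threads_connected gauge',
    `mysql_threads_connected ${threadsConnected}`,
  ].join('\n');
}

console.log(renderMetrics(42));
```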
<h3 id="final-gotchas">Final Gotchas</h3>
<p>The Prometheus documentation page on
<a href="https://prometheus.io/docs/practices/instrumentation/">instrumentation</a>
does a pretty good job of laying out some of the things to watch out
for when instrumenting your applications.</p>
<p>Also, beware that there are
<a href="https://prometheus.io/docs/practices/naming/">conventions</a> on what makes
a good metric name; poorly (or wrongly) named metrics will give you a
hard time when writing queries later.</p>
<h2 id="references">References</h2>
<ul>
<li><a href="https://prometheus.io/docs/">Prometheus documentation</a></li>
<li><a href="https://github.com/infinityworks/prometheus-example-queries">Prometheus example queries</a></li>
<li><a href="https://github.com/siimon/prom-client">Prometheus client for Node.js</a></li>
<li><a href="https://www.youtube.com/watch?v=PDxcEzu62jk">Keynote: Monitoring, the Prometheus Way (DockerCon 2017)</a></li>
<li><a href="https://www.robustperception.io/understanding-machine-cpu-usage/">Blog Post: Understanding Machine CPU usage</a></li>
<li><a href="http://latencytipoftheday.blogspot.com/2014/06/latencytipoftheday-you-cant-average.html">Blog Post: #LatencyTipOfTheDay: You can&rsquo;t average percentiles. Period.</a></li>
<li><a href="https://www.dynatrace.com/news/blog/why-averages-suck-and-percentiles-are-great/">Blog Post: Why Averages Suck and Percentiles are Great</a></li>
<li><a href="https://landing.google.com/sre/books/">Site Reliability Engineering books</a></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>Pain(less?) NGINX Ingress</title>
      <link>https://danielfm.me/posts/painless-nginx-ingress/</link>
      <pubDate>Wed, 13 Sep 2017 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/painless-nginx-ingress/</guid>
      <description>Hard-earned lessons from production NGINX ingress outages in Kubernetes.</description>
      <content:encoded><![CDATA[<blockquote>
<p>As of March 2026, ingress-nginx will no longer receive new releases,
bugfixes, or updates to resolve any security vulnerabilities that
may be discovered.</p>
</blockquote>
<p>So you have a <a href="https://kubernetes.io">Kubernetes</a> cluster and are using (or
considering using) the
<a href="https://github.com/kubernetes/ingress-nginx">NGINX ingress controller</a>
to forward outside traffic to in-cluster services. That&rsquo;s awesome!</p>
<p>The first time I looked at it, everything looked so easy; installing the NGINX
ingress controller was one <code>helm install</code> away, so I did it. Then, after hooking
up the DNS to the load balancer and creating a few
<a href="https://kubernetes.io/docs/concepts/services-networking/ingress/#the-ingress-resource">Ingress resources</a>,
I was in business.</p>
<p>Fast-forward a few months, all external traffic for all environments
(dev, staging, production) was going through the ingress servers. Everything was
good. Until it wasn&rsquo;t.</p>
<p>We all know how it happens. First, you get excited about that shiny new thing.
You start using it. Then, eventually, some shit happens.</p>
<h2 id="my-first-ingress-outage">My First Ingress Outage</h2>
<p>Let me start by saying that if you are not alerting on
<a href="http://veithen.github.io/2014/01/01/how-tcp-backlog-works-in-linux.html">accept queue overflows</a>,
well, you should.</p>
<figure><img src="/posts/painless-nginx-ingress/tcp-diagram.webp"
    alt="TCP connection flow diagram" width="860" height="487"><figcaption>
      <p>TCP connection flow diagram.</p>
    </figcaption>
</figure>
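<p>If you run Prometheus with
<a href="https://github.com/prometheus/node_exporter">node_exporter</a>, an alert
along these lines is a reasonable starting point (the metric name comes from
node_exporter&rsquo;s netstat collector; the rule name and threshold are just
illustrative):</p>
<pre tabindex="0"><code>groups:
  - name: tcp-backlog
    rules:
      - alert: TCPListenOverflows
        expr: rate(node_netstat_TcpExt_ListenOverflows[5m]) &gt; 0
        for: 5m
        annotations:
          summary: Accept queue overflowing on {{ $labels.instance }}
</code></pre>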

<p>What happened was that one of the applications being proxied through NGINX
started taking too long to respond, causing connections to completely fill the
<a href="http://nginx.org/en/docs/http/ngx_http_core_module.html#listen">NGINX listen backlog</a>,
which caused NGINX to quickly start dropping connections, including the ones
being made by Kubernetes'
<a href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/">liveness/readiness probes</a>.</p>
<p>What happens when a pod fails to respond to its liveness probes? Kubernetes
thinks there&rsquo;s something wrong with the pod and restarts it. The problem is that
this is one of those situations where restarting a pod does more
harm than good: the accept queue will overflow, again and again, causing Kubernetes
to keep restarting the NGINX pods until they all start to crash-loop.</p>
<figure><img src="/posts/painless-nginx-ingress/listen-overflows.webp"
    alt="Graph showing surges of TCP listen overflow errors" width="1491" height="734"><figcaption>
      <p>Surges of TCP listen overflow errors.</p>
    </figcaption>
</figure>

<p>What are the lessons learned from this incident?</p>
<ul>
<li>Know every bit of your NGINX configuration. Look for anything that should
(or should not) be there, and don&rsquo;t blindly trust any default values.</li>
<li>Most Linux distributions do not provide an out-of-the-box configuration suitable
for running high-load web servers; double-check the values for each kernel
param via <code>sysctl -a</code>.</li>
<li>Make sure to measure the latency across your services and set the various
timeouts based on the expected upper bound + some slack to accommodate slight
variations.</li>
<li>Change your applications to drop requests or degrade gracefully when
overloaded. For instance, in NodeJS applications,
<a href="https://medium.com/springworks-engineering/node-js-profiling-event-loop-lag-flame-charts-539e04723e84">latency increases</a>
in the event loop might indicate the server is having trouble keeping up with the
current traffic.</li>
<li>Do not use just one NGINX ingress controller deployment for balancing across
all types of workloads/environments.</li>
</ul>
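<p>On the second point, a quick way to audit the usual suspects is to read them
straight from <code>/proc</code> (the values below are common distribution defaults;
yours will differ):</p>
<pre tabindex="0"><code>$ cat /proc/sys/net/core/somaxconn
128

$ cat /proc/sys/net/ipv4/tcp_max_syn_backlog
1024
</code></pre>
<p>A <code>somaxconn</code> this low means no listening socket can have an accept queue
deeper than 128 connections, no matter what the application asks for.</p>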
<h3 id="the-importance-of-observability">The Importance of Observability</h3>
<p>Before detailing each of the previous points, my 0th piece of advice is to <em>never</em> run a
production Kubernetes cluster (or anything else, for that matter) without proper
monitoring; by itself, monitoring won&rsquo;t prevent bad things from happening, but
the telemetry collected during such incidents will give you the means to root-cause
and fix most issues you&rsquo;ll find along the way.</p>
<figure><img src="/posts/painless-nginx-ingress/netstat-metrics.webp"
    alt="Netstat metrics in Grafana" width="3316" height="1560"><figcaption>
      <p>Netstat metrics in Grafana.</p>
    </figcaption>
</figure>

<p>If you choose to jump on the <a href="https://prometheus.io">Prometheus</a> bandwagon, you can
leverage <a href="https://github.com/prometheus/node_exporter">node_exporter</a> to
collect node-level metrics that could help you detect situations like the one I&rsquo;ve
just described.</p>
<figure><img src="/posts/painless-nginx-ingress/ingress-metrics.webp"
    alt="NGINX ingress controller metrics in Grafana" width="3314" height="1638"><figcaption>
      <p>NGINX ingress controller metrics in Grafana.</p>
    </figcaption>
</figure>

<p>Also, the NGINX ingress controller itself exposes Prometheus metrics; make
sure to collect those as well.</p>
<h2 id="know-your-config">Know Your Config</h2>
<p>The beauty of ingress controllers is that you delegate the task of generating and
reloading the proxy configuration to this fine piece of software and never worry
about it; you don&rsquo;t even have to be familiar with the underlying technology
(NGINX in this case). Right? <strong>Wrong!</strong></p>
<p>If you haven&rsquo;t done that already, I urge you to take a look at the configuration
your ingress controller generated for you. For the NGINX ingress controller,
all you need to do is grab the contents of <code>/etc/nginx/nginx.conf</code> via <code>kubectl</code>.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">kubectl -n &lt;namespace&gt; <span class="nb">exec</span> &lt;nginx-ingress-controller-pod-name&gt; -- \
</span></span><span class="line"><span class="cl">   cat /etc/nginx/nginx.conf &gt; ./nginx.conf
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now look for anything that&rsquo;s not compatible with your setup. Want an example? Let&rsquo;s start with <a href="http://nginx.org/en/docs/ngx_core_module.html#worker_processes"><code>worker_processes auto;</code></a></p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-nginx" data-lang="nginx"><span class="line"><span class="cl"><span class="c1"># $ cat ./nginx.conf
</span></span></span><span class="line"><span class="cl"><span class="k">daemon</span> <span class="no">off</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">worker_processes</span> <span class="s">auto</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">pid</span> <span class="s">/run/nginx.pid</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">worker_rlimit_nofile</span> <span class="mi">1047552</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="k">worker_shutdown_timeout</span> <span class="s">10s</span> <span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">events</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kn">multi_accept</span>        <span class="no">on</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kn">worker_connections</span>  <span class="mi">16384</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kn">use</span>                 <span class="s">epoll</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">http</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kn">real_ip_header</span>      <span class="s">X-Forwarded-For</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="c1"># ...
</span></span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># ...
</span></span></span></code></pre></td></tr></table>
</div>
</div><blockquote>
<p>The optimal value depends on many factors including (but not limited to) the
number of CPU cores, the number of hard disk drives that store data, and load
pattern. When one is in doubt, setting it to the number of available CPU cores
would be a good start (the value “<code>auto</code>” will try to autodetect it).</p>
</blockquote>
<p>Here&rsquo;s the first gotcha: as of this writing (will it ever change?), NGINX is not
<a href="https://en.wikipedia.org/wiki/Cgroups">Cgroups</a>-aware, which means the <code>auto</code>
value will use the number of <em>CPU cores on the host machine</em>, not the
number of &ldquo;virtual&rdquo; CPUs as defined by the Kubernetes
<a href="https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/">resource requests/limits</a>.</p>
<p>Let&rsquo;s run a little experiment. What happens when you try to load the following
NGINX configuration file from a container limited to only one CPU in a dual-core
server? Will it spawn one or two worker processes?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-nginx" data-lang="nginx"><span class="line"><span class="cl"><span class="c1"># $ cat ./minimal-nginx.conf
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">worker_processes</span> <span class="s">auto</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">events</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="kn">worker_connections</span> <span class="mi">1024</span><span class="p">;</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="k">http</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">  <span class="kn">server</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">    <span class="kn">listen</span> <span class="mi">80</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="kn">server_name</span> <span class="s">localhost</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="kn">location</span> <span class="s">/</span> <span class="p">{</span>
</span></span><span class="line"><span class="cl">      <span class="kn">root</span>  <span class="s">html</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">      <span class="kn">index</span> <span class="s">index.html</span> <span class="s">index.htm</span><span class="p">;</span>
</span></span><span class="line"><span class="cl">    <span class="p">}</span>
</span></span><span class="line"><span class="cl">  <span class="p">}</span>
</span></span><span class="line"><span class="cl"><span class="p">}</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>As the output below shows, the answer is two; the CPU limit is invisible to
NGINX. So, if you intend to restrict the NGINX ingress CPU share, it might not make
sense to spawn a large number of workers per container. If that&rsquo;s the case, make
sure to explicitly set the desired number in the <code>worker_processes</code> directive.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">$ docker run --rm --cpus=&#34;1&#34; -v `pwd`/minimal-nginx.conf:/etc/nginx/nginx.conf:ro -d nginx
</span></span><span class="line"><span class="cl">fc7d98c412a9b90a217388a094de4c4810241be62c4f7501e59cc1c968434d4c
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">$ docker exec fc7 ps -ef | grep nginx
</span></span><span class="line"><span class="cl">root         1     0  0 21:49 pts/0    00:00:00 nginx: master process nginx -g daemon off;
</span></span><span class="line"><span class="cl">nginx        6     1  0 21:49 pts/0    00:00:00 nginx: worker process
</span></span><span class="line"><span class="cl">nginx        7     1  0 21:49 pts/0    00:00:00 nginx: worker process
</span></span></code></pre></td></tr></table>
</div>
</div><p>Now take the <code>listen</code> directive; it does not specify the <code>backlog</code> parameter
(which is <code>511</code> by default on Linux). If your kernel&rsquo;s <code>net.core.somaxconn</code> is
set to, say, <code>1024</code>, you should also specify the <code>backlog=X</code> parameter
accordingly. In other words, make sure your config is in tune with your kernel.</p>
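<p>Concretely, keeping the two sides in tune looks something like this (a sketch,
not a drop-in config; pick numbers that match your traffic):</p>
<pre tabindex="0"><code># /etc/sysctl.d/99-nginx.conf (on the node)
net.core.somaxconn = 1024

# nginx.conf: ask for a matching accept queue; the kernel silently
# caps the backlog at somaxconn, so raising only one side does nothing
listen 80 backlog=1024;
</code></pre>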
<p>And please, don&rsquo;t stop there. Do this thought exercise for every line of the
generated config. Hell, take a look at
<a href="https://github.com/kubernetes/ingress-nginx/blob/master/rootfs/etc/nginx/template/nginx.tmpl">all the things</a>
the ingress controller will let you change, and don&rsquo;t hesitate to
change anything that does not fit your use case. Most NGINX directives can be
<a href="https://github.com/kubernetes/ingress-nginx/blob/master/docs/user-guide/configmap.md">customized</a>
via <code>ConfigMap</code> entries and/or annotations.</p>
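<p>For instance, a <code>ConfigMap</code> entry applies a setting globally, while an
annotation overrides it for a single <code>Ingress</code> (the key and annotation names
below match the ingress-nginx docs at the time of writing; double-check them
against your controller version):</p>
<pre tabindex="0"><code># Global: ConfigMap consumed by the ingress controller
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-ingress-conf
data:
  proxy-connect-timeout: "5"
  proxy-read-timeout: "60"

# Per-Ingress override via annotation
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
</code></pre>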
<h3 id="kernel-params">Kernel Params</h3>
<p>Using ingress or not, make sure to always review and tune the kernel params
of your nodes according to the expected workloads.</p>
<p>This is a rather complex subject on its own, so I have no intention of covering
everything in this post; take a look at the <a href="/posts/painless-nginx-ingress/#references">References</a> section
for more pointers in this area.</p>
<h4 id="kube-proxy-conntrack-table">Kube-Proxy: Conntrack Table</h4>
<p>If you are using Kubernetes, I don&rsquo;t need to explain to you what
<a href="https://kubernetes.io/docs/concepts/services-networking/service/">Services</a>
are and what they are used for. However, I think it&rsquo;s important to understand
in more detail how they work.</p>
<blockquote>
<p>Every node in a Kubernetes cluster runs a kube-proxy, which is
responsible for implementing a form of virtual IP for <code>Services</code> of type
other than <code>ExternalName</code>. In Kubernetes v1.0 the proxy was purely in
userspace. In Kubernetes v1.1 an iptables proxy was added, but was not
the default operating mode. Since Kubernetes v1.2, the iptables proxy is
the default.</p>
</blockquote>
<p>In other words, all packets sent to a Service IP are forwarded/load-balanced to
the corresponding <code>Endpoint</code>s (<code>address:port</code> tuples for all pods that match the
<code>Service</code>
<a href="https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/">label selector</a>)
via iptables rules managed by <a href="https://kubernetes.io/docs/admin/kube-proxy/">kube-proxy</a>;
connections to <code>Service</code> IPs are tracked by the kernel via the <code>nf_conntrack</code>
module, and, as you might have guessed, this connection tracking information is
stored in RAM.</p>
<p>As the values of different conntrack params need to be set in accordance with
each other (e.g. <code>nf_conntrack_max</code> and <code>nf_conntrack_buckets</code>), kube-proxy
configures sane defaults for those as part of its bootstrapping procedure.</p>
<pre tabindex="0"><code>$ kubectl -n kube-system logs &lt;some-kube-proxy-pod&gt;
I0829 22:23:43.455969       1 server.go:478] Using iptables Proxier.
I0829 22:23:43.473356       1 server.go:513] Tearing down userspace rules.
I0829 22:23:43.498529       1 conntrack.go:98] Set sysctl &#39;net/netfilter/nf_conntrack_max&#39; to 524288
I0829 22:23:43.498696       1 conntrack.go:52] Setting nf_conntrack_max to 524288
I0829 22:23:43.499167       1 conntrack.go:83] Setting conntrack hashsize to 131072
I0829 22:23:43.503607       1 conntrack.go:98] Set sysctl &#39;net/netfilter/nf_conntrack_tcp_timeout_established&#39; to 86400
I0829 22:23:43.503718       1 conntrack.go:98] Set sysctl &#39;net/netfilter/nf_conntrack_tcp_timeout_close_wait&#39; to 3600
I0829 22:23:43.504052       1 config.go:102] Starting endpoints config controller
...
</code></pre><p>These are good defaults, but you might want to <a href="https://kubernetes.io/docs/admin/kube-proxy/">increase those</a>
if your monitoring data shows you&rsquo;re running out of conntrack space. However,
bear in mind that increasing these params will result in
<a href="https://johnleach.co.uk/words/372/netfilter-conntrack-memory-usage">increased memory usage</a>,
so be gentle.</p>
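<p>With node_exporter, conntrack saturation is straightforward to watch for
(metric names come from its conntrack collector; the 80% threshold is an
arbitrary example):</p>
<pre tabindex="0"><code>node_nf_conntrack_entries / node_nf_conntrack_entries_limit &gt; 0.8
</code></pre>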
<figure><img src="/posts/painless-nginx-ingress/conntrack-usage.webp"
    alt="Grafana dashboard showing the conntrack usage" width="1024" height="322"><figcaption>
      <p>Conntrack usage.</p>
    </figcaption>
</figure>

<h2 id="sharing-is-not-caring">Sharing Is (Not) Caring</h2>
<p>Until recently, we had just a single NGINX ingress deployment responsible for
proxying requests to all applications in all environments (dev, staging,
production). I can say from experience this is <strong>bad</strong> practice;
<em>don&rsquo;t put all your eggs in one basket.</em></p>
<p>I guess the same could be said about sharing one cluster for all environments,
but we found that, by doing this, we get better resource utilization by allowing
dev/staging pods to run on a best-effort QoS tier, taking up resources not
used by production applications.</p>
<p>The trade-off is that this limits the things we can do to our cluster. For
instance, if we decide to run a load test on a staging service, we need to be
really careful or we risk affecting production services running in the same
cluster.</p>
<p>Even though the level of isolation provided by containers is generally good, they
still <a href="https://sysdig.com/blog/container-isolation-gone-wrong/">rely on shared kernel resources</a>
that are subject to abuse.</p>
<h3 id="split-ingress-deployments-per-environment">Split Ingress Deployments Per Environment</h3>
<p>That being said, there&rsquo;s no reason not to use dedicated ingresses per
environment. This will give you an extra layer of protection in case your
dev/staging services get misused.</p>
<p>Some other benefits of doing so:</p>
<ul>
<li>You get the chance to use different settings for each environment if needed</li>
<li>Allow testing ingress upgrades in a more forgiving environment before rolling
out to production</li>
<li>Avoid bloating the NGINX configuration with lots of upstreams and servers
associated with ephemeral and/or unstable environments</li>
<li>As a consequence, your configuration reloads will be faster, and you&rsquo;ll have
fewer configuration reload events during the day (we&rsquo;ll discuss later why you
should strive to keep the number of reloads to a minimum)</li>
</ul>
<h4 id="ingress-classes-to-the-rescue">Ingress Classes To The Rescue</h4>
<p>One way to make different ingress controllers manage different <code>Ingress</code>
resources in the same cluster is to use a different <strong>ingress class name</strong> per
ingress deployment, and then annotate your <code>Ingress</code> resources to specify which
controller is responsible for each one.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span><span class="lnt">37
</span><span class="lnt">38
</span><span class="lnt">39
</span><span class="lnt">40
</span><span class="lnt">41
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># Ingress controller 1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">extensions/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">args</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="l">/nginx-ingress-controller</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- --<span class="l">ingress-class=class-1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="l">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># Ingress controller 2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">extensions/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Deployment</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">template</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">      </span><span class="nt">containers</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">        </span>- <span class="nt">args</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="l">/nginx-ingress-controller</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- --<span class="l">ingress-class=class-2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">            </span>- <span class="l">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># This Ingress resource will be managed by controller 1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">extensions/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kubernetes.io/ingress.class</span><span class="p">:</span><span class="w"> </span><span class="l">class-1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w"> </span><span class="l">...</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="c"># This Ingress resource will be managed by controller 2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">apiVersion</span><span class="p">:</span><span class="w"> </span><span class="l">extensions/v1beta1</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">kind</span><span class="p">:</span><span class="w"> </span><span class="l">Ingress</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">metadata</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">annotations</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span><span class="nt">kubernetes.io/ingress.class</span><span class="p">:</span><span class="w"> </span><span class="l">class-2</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">spec</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">rules</span><span class="p">:</span><span class="w"> </span><span class="l">...</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><h2 id="ingress-reloads-gone-wrong">Ingress Reloads Gone Wrong</h2>
<p>At this point, we were already running a dedicated ingress controller for the
production environment. Everything was running pretty smoothly until we
decided to migrate a WebSocket application to Kubernetes + ingress.</p>
<p>Shortly after the migration, I started noticing a strange trend in memory usage
for the production ingress pods.</p>
<figure><img src="/posts/painless-nginx-ingress/ingress-memory-issue.webp"
    alt="Grafana dashboard showing nginx-ingress containers leaking memory" width="829" height="386"><figcaption>
      <p>Nginx-ingress containers leaking memory.</p>
    </figcaption>
</figure>

<p>Why was the memory consumption skyrocketing like this? After I <code>kubectl exec</code>’d
into one of the ingress containers, I found a bunch of worker processes
stuck in the shutting-down state for several minutes.</p>
<pre tabindex="0"><code>root     17755 17739  0 19:47 ?        00:00:00 /usr/bin/dumb-init /nginx-ingress-controller --default-backend-service=kube-system/broken-bronco-nginx-ingress-be --configmap=kube-system/broken-bronco-nginx-ingress-conf --ingress-class=nginx-ingress-prd
root     17765 17755  0 19:47 ?        00:00:08 /nginx-ingress-controller --default-backend-service=kube-system/broken-bronco-nginx-ingress-be --configmap=kube-system/broken-bronco-nginx-ingress-conf --ingress-class=nginx-ingress-prd
root     17776 17765  0 19:47 ?        00:00:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nobody   18866 17776  0 19:49 ?        00:00:05 nginx: worker process is shutting down
nobody   19466 17776  0 19:51 ?        00:00:01 nginx: worker process is shutting down
nobody   19698 17776  0 19:51 ?        00:00:05 nginx: worker process is shutting down
nobody   20331 17776  0 19:53 ?        00:00:05 nginx: worker process is shutting down
nobody   20947 17776  0 19:54 ?        00:00:03 nginx: worker process is shutting down
nobody   21390 17776  1 19:55 ?        00:00:05 nginx: worker process is shutting down
nobody   22139 17776  0 19:57 ?        00:00:00 nginx: worker process is shutting down
nobody   22251 17776  0 19:57 ?        00:00:01 nginx: worker process is shutting down
nobody   22510 17776  0 19:58 ?        00:00:01 nginx: worker process is shutting down
nobody   22759 17776  0 19:58 ?        00:00:01 nginx: worker process is shutting down
nobody   23038 17776  1 19:59 ?        00:00:03 nginx: worker process is shutting down
nobody   23476 17776  1 20:00 ?        00:00:01 nginx: worker process is shutting down
nobody   23738 17776  1 20:00 ?        00:00:01 nginx: worker process is shutting down
nobody   24026 17776  2 20:01 ?        00:00:02 nginx: worker process is shutting down
nobody   24408 17776  4 20:01 ?        00:00:01 nginx: worker process
</code></pre><p>To understand why this happened, we must take a step back and look at how
configuration reloading is implemented in NGINX.</p>
<blockquote>
<p>Once the master process receives the signal to reload configuration, it checks
the syntax validity of the new configuration file and tries to apply the
configuration provided in it. If this is a success, the master process starts
new worker processes and sends messages to old worker processes, requesting
them to shut down. Otherwise, the master process rolls back the changes and
continues to work with the old configuration. Old worker processes, receiving
a command to shut down, stop accepting new connections <strong>and continue to service
current requests until all such requests are serviced. After that, the old
worker processes exit.</strong></p>
</blockquote>
<p>Remember we are proxying WebSocket connections, which are long-running by nature;
a WebSocket connection might take hours, or even days, to close depending on the
application. The NGINX server cannot know whether it&rsquo;s okay to break a connection
during a reload, so it&rsquo;s up to you to make things easier for it. (One thing you
can do is have a strategy in place to actively close connections that have been
idle for too long, both on the client and the server side; don&rsquo;t leave this as
an afterthought.)</p>
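<p>As a sketch of the server-side half of that strategy (the values here are
illustrative, not recommendations): NGINX closes a proxied connection when no data
arrives from the upstream within <code>proxy_read_timeout</code>, which effectively
caps how long an idle WebSocket can linger through a reload.</p>
<pre tabindex="0"><code># Illustrative idle timeouts for a WebSocket location block;
# tune these for how long your application keeps connections quiet.
proxy_read_timeout 3600s;
proxy_send_timeout 3600s;
</code></pre>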
<p>Now back to our problem. Having that many workers in that state means the
ingress configuration was reloaded many times, and the old workers were unable
to terminate due to the long-running connections.</p>
<p>That&rsquo;s indeed what happened. After some debugging, we found that the NGINX
ingress controller was repeatedly generating a different configuration file due
to changes in the ordering of upstreams and server IPs.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">  1
</span><span class="lnt">  2
</span><span class="lnt">  3
</span><span class="lnt">  4
</span><span class="lnt">  5
</span><span class="lnt">  6
</span><span class="lnt">  7
</span><span class="lnt">  8
</span><span class="lnt">  9
</span><span class="lnt"> 10
</span><span class="lnt"> 11
</span><span class="lnt"> 12
</span><span class="lnt"> 13
</span><span class="lnt"> 14
</span><span class="lnt"> 15
</span><span class="lnt"> 16
</span><span class="lnt"> 17
</span><span class="lnt"> 18
</span><span class="lnt"> 19
</span><span class="lnt"> 20
</span><span class="lnt"> 21
</span><span class="lnt"> 22
</span><span class="lnt"> 23
</span><span class="lnt"> 24
</span><span class="lnt"> 25
</span><span class="lnt"> 26
</span><span class="lnt"> 27
</span><span class="lnt"> 28
</span><span class="lnt"> 29
</span><span class="lnt"> 30
</span><span class="lnt"> 31
</span><span class="lnt"> 32
</span><span class="lnt"> 33
</span><span class="lnt"> 34
</span><span class="lnt"> 35
</span><span class="lnt"> 36
</span><span class="lnt"> 37
</span><span class="lnt"> 38
</span><span class="lnt"> 39
</span><span class="lnt"> 40
</span><span class="lnt"> 41
</span><span class="lnt"> 42
</span><span class="lnt"> 43
</span><span class="lnt"> 44
</span><span class="lnt"> 45
</span><span class="lnt"> 46
</span><span class="lnt"> 47
</span><span class="lnt"> 48
</span><span class="lnt"> 49
</span><span class="lnt"> 50
</span><span class="lnt"> 51
</span><span class="lnt"> 52
</span><span class="lnt"> 53
</span><span class="lnt"> 54
</span><span class="lnt"> 55
</span><span class="lnt"> 56
</span><span class="lnt"> 57
</span><span class="lnt"> 58
</span><span class="lnt"> 59
</span><span class="lnt"> 60
</span><span class="lnt"> 61
</span><span class="lnt"> 62
</span><span class="lnt"> 63
</span><span class="lnt"> 64
</span><span class="lnt"> 65
</span><span class="lnt"> 66
</span><span class="lnt"> 67
</span><span class="lnt"> 68
</span><span class="lnt"> 69
</span><span class="lnt"> 70
</span><span class="lnt"> 71
</span><span class="lnt"> 72
</span><span class="lnt"> 73
</span><span class="lnt"> 74
</span><span class="lnt"> 75
</span><span class="lnt"> 76
</span><span class="lnt"> 77
</span><span class="lnt"> 78
</span><span class="lnt"> 79
</span><span class="lnt"> 80
</span><span class="lnt"> 81
</span><span class="lnt"> 82
</span><span class="lnt"> 83
</span><span class="lnt"> 84
</span><span class="lnt"> 85
</span><span class="lnt"> 86
</span><span class="lnt"> 87
</span><span class="lnt"> 88
</span><span class="lnt"> 89
</span><span class="lnt"> 90
</span><span class="lnt"> 91
</span><span class="lnt"> 92
</span><span class="lnt"> 93
</span><span class="lnt"> 94
</span><span class="lnt"> 95
</span><span class="lnt"> 96
</span><span class="lnt"> 97
</span><span class="lnt"> 98
</span><span class="lnt"> 99
</span><span class="lnt">100
</span><span class="lnt">101
</span><span class="lnt">102
</span><span class="lnt">103
</span><span class="lnt">104
</span><span class="lnt">105
</span><span class="lnt">106
</span><span class="lnt">107
</span><span class="lnt">108
</span><span class="lnt">109
</span><span class="lnt">110
</span><span class="lnt">111
</span><span class="lnt">112
</span><span class="lnt">113
</span><span class="lnt">114
</span><span class="lnt">115
</span><span class="lnt">116
</span><span class="lnt">117
</span><span class="lnt">118
</span><span class="lnt">119
</span><span class="lnt">120
</span><span class="lnt">121
</span><span class="lnt">122
</span><span class="lnt">123
</span><span class="lnt">124
</span><span class="lnt">125
</span><span class="lnt">126
</span><span class="lnt">127
</span><span class="lnt">128
</span><span class="lnt">129
</span><span class="lnt">130
</span><span class="lnt">131
</span><span class="lnt">132
</span><span class="lnt">133
</span><span class="lnt">134
</span><span class="lnt">135
</span><span class="lnt">136
</span><span class="lnt">137
</span><span class="lnt">138
</span><span class="lnt">139
</span><span class="lnt">140
</span><span class="lnt">141
</span><span class="lnt">142
</span><span class="lnt">143
</span><span class="lnt">144
</span><span class="lnt">145
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-diff" data-lang="diff"><span class="line"><span class="cl">I0810 23:14:47.866939       5 nginx.go:300] NGINX configuration diff
</span></span><span class="line"><span class="cl">I0810 23:14:47.866963       5 nginx.go:301] --- /tmp/a072836772	2017-08-10 23:14:47.000000000 +0000
</span></span><span class="line"><span class="cl"><span class="gi">+++ /tmp/b304986035	2017-08-10 23:14:47.000000000 +0000
</span></span></span><span class="line"><span class="cl"><span class="gu">@@ -163,32 +163,26 @@
</span></span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">     proxy_ssl_session_reuse on;
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-1-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream upstream-default-backend {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.71.14:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.32.22:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.157.13:8080 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-2-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-3-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.110.13:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.109.195:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.82.66:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.79.124:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.59.21:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.45.219:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">     upstream production-app-4-80 {
</span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.109.177:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">         server 10.2.12.161:3000 max_fails=0 fail_timeout=0;
</span></span><span class="line"><span class="cl"><span class="gd">-    }
</span></span></span><span class="line"><span class="cl"><span class="gd">-
</span></span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-5-80 {
</span></span></span><span class="line"><span class="cl"><span class="gd">-        # Load balance algorithm; empty for round robin, which is the default
</span></span></span><span class="line"><span class="cl"><span class="gd">-        least_conn;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.21.37:9292 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.65.105:9292 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.109.177:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">     upstream production-app-6-80 {
</span></span><span class="line"><span class="cl"><span class="gu">@@ -201,61 +195,67 @@
</span></span></span><span class="line"><span class="cl">     upstream production-lap-production-80 {
</span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.45.223:8000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.21.36:8000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">         server 10.2.78.36:8000 max_fails=0 fail_timeout=0;
</span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.45.223:8000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">         server 10.2.99.151:8000 max_fails=0 fail_timeout=0;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.21.36:8000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-7-80{
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-1-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.79.126:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.35.105:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.114.143:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.50.44:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.149.135:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.45.155:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.71.14:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.32.22:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-8-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-2-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.53.23:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.110.22:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.35.91:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.45.221:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.110.13:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.109.195:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream upstream-default-backend {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-9-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.157.13:8080 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.78.26:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.59.22:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.96.249:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.32.21:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.114.177:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.83.20:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.118.111:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.26.23:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.35.150:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.79.125:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.157.165:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-3-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-5-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.79.124:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.82.66:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.45.219:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.59.21:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.21.37:9292 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.65.105:9292 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl"><span class="gd">-    upstream production-app-9-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-7-80 {
</span></span></span><span class="line"><span class="cl">         # Load balance algorithm; empty for round robin, which is the default
</span></span><span class="line"><span class="cl">         least_conn;
</span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.96.249:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.157.165:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.114.177:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.118.111:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.79.125:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.78.26:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.59.22:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.35.150:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.32.21:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.83.20:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gd">-        server 10.2.26.23:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.114.143:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.79.126:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.45.155:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.35.105:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.50.44:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.149.135:3000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+    }
</span></span></span><span class="line"><span class="cl"><span class="gi">+
</span></span></span><span class="line"><span class="cl"><span class="gi">+    upstream production-app-8-80 {
</span></span></span><span class="line"><span class="cl"><span class="gi">+        # Load balance algorithm; empty for round robin, which is the default
</span></span></span><span class="line"><span class="cl"><span class="gi">+        least_conn;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.53.23:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.45.221:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.35.91:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl"><span class="gi">+        server 10.2.110.22:5000 max_fails=0 fail_timeout=0;
</span></span></span><span class="line"><span class="cl">     }
</span></span><span class="line"><span class="cl"> 
</span></span><span class="line"><span class="cl">     server {
</span></span></code></pre></td></tr></table>
</div>
</div><p>This caused the NGINX ingress controller to reload its configuration several
times per minute, making these shutting-down workers pile up until the pod got
<code>OOMKilled</code>.</p>
<p>Things got a lot better once I upgraded the NGINX ingress controller to a
fixed version and specified the <code>--sort-backends=true</code> command line flag.</p>
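<p>For reference, this is roughly where that flag lands in the controller&rsquo;s
container args (the surrounding names mirror the process listing above and are
otherwise illustrative):</p>
<pre tabindex="0"><code>args:
- /nginx-ingress-controller
- --default-backend-service=kube-system/broken-bronco-nginx-ingress-be
- --configmap=kube-system/broken-bronco-nginx-ingress-conf
- --ingress-class=nginx-ingress-prd
- --sort-backends=true   # emit upstreams and servers in a stable order
</code></pre>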
<figure><img src="/posts/painless-nginx-ingress/ingress-reloads.webp"
    alt="Grafana dashboard showing number of nginx-ingress configuration reloads" width="1658" height="522"><figcaption>
      <p>Number of nginx-ingress configuration reloads.</p>
    </figcaption>
</figure>

<p>Thanks to <a href="https://github.com/aledbf">@aledbf</a> for his assistance in finding and
fixing this bug!</p>
<h3 id="further-minizing-config-reloads">Further Minimizing Config Reloads</h3>
<p>The lesson here is to keep in mind that <em>configuration reloads are expensive
operations</em>, and it&rsquo;s a good idea to avoid them, especially when proxying
WebSocket connections. This is why we decided to create a dedicated ingress
controller deployment just for proxying these long-running connections.</p>
<p>In our case, changes to WebSocket applications happen much less frequently
than other applications; by using a separate ingress controller, we avoid
reloading the configuration for the WebSocket ingress whenever there are
changes (or scaling events/restarts) to other applications.</p>
<p>Separating the deployment also gave us the ability to use a different ingress
configuration that&rsquo;s more suited to long-running connections.</p>
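<p>A minimal sketch of that split (the class names are illustrative): run a second
controller deployment watching a dedicated ingress class, and have WebSocket
ingresses opt into it via annotation.</p>
<pre tabindex="0"><code># Second controller deployment, dedicated to long-running connections
- /nginx-ingress-controller
- --ingress-class=nginx-ingress-ws

# WebSocket Ingress objects select it
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx-ingress-ws
</code></pre>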
<h4 id="fine-tune-pod-autoscalers">Fine-Tune Pod Autoscalers</h4>
<p>Since NGINX ingress uses pod IPs as upstream servers, every time the list of
endpoints for a given <code>Service</code> changes, the ingress configuration must be
regenerated and reloaded. Thus, if you are observing frequent autoscaling events
for your applications during normal load, it might be a sign that your
<code>HorizontalPodAutoscalers</code> need adjustment.</p>
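<p>As an illustrative example (not a recommendation), an HPA whose replica floor is
high enough to absorb normal load will scale, and therefore trigger reloads, far
less often:</p>
<pre tabindex="0"><code>apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: app            # hypothetical application name
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: app
  minReplicas: 8       # a high floor absorbs normal load without flapping
  maxReplicas: 20
  targetCPUUtilizationPercentage: 60
</code></pre>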
<figure><img src="/posts/painless-nginx-ingress/hpa.webp"
    alt="Grafana dashboard showing the Kubernetes autoscaler in action" width="1656" height="418"><figcaption>
      <p>Kubernetes autoscaler in action.</p>
    </figcaption>
</figure>

<p>Another thing that most people don&rsquo;t realize is that the horizontal pod
autoscaler has a back-off timer that prevents the same target from being
scaled several times in a short period.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">Name:                                                   &lt;app&gt;
</span></span><span class="line"><span class="cl">Namespace:                                              production
</span></span><span class="line"><span class="cl">Labels:                                                 &lt;none&gt;
</span></span><span class="line"><span class="cl">Annotations:                                            &lt;none&gt;
</span></span><span class="line"><span class="cl">CreationTimestamp:                                      Fri, 23 Jun 2017 11:41:59 -0300
</span></span><span class="line"><span class="cl">Reference:                                              Deployment/&lt;app&gt;
</span></span><span class="line"><span class="cl">Metrics:                                                ( current / target )
</span></span><span class="line"><span class="cl">  resource cpu on pods  (as a percentage of request):   46% (369m) / 60%
</span></span><span class="line"><span class="cl">Min replicas:                                           8
</span></span><span class="line"><span class="cl">Max replicas:                                           20
</span></span><span class="line"><span class="cl">Conditions:
</span></span><span class="line"><span class="cl">  Type                  Status  Reason                  Message
</span></span><span class="line"><span class="cl">  ----                  ------  ------                  -------
</span></span><span class="line"><span class="cl">  AbleToScale           False   BackoffBoth             the time since the previous scale is still within both the downscale and upscale forbidden windows
</span></span><span class="line"><span class="cl">  ScalingActive         True    ValidMetricFound        the HPA was able to succesfully calculate a replica count from cpu resource utilization (percentage of request)
</span></span><span class="line"><span class="cl">  ScalingLimited        True    TooFewReplicas          the desired replica count was less than the minimum replica count
</span></span><span class="line"><span class="cl">Events:
</span></span><span class="line"><span class="cl">  FirstSeen     LastSeen        Count   From                            SubObjectPath   Type            Reason                  Message
</span></span><span class="line"><span class="cl">  ---------     --------        -----   ----                            -------------   --------        ------                  -------
</span></span><span class="line"><span class="cl">  14d           10m             39      horizontal-pod-autoscaler                       Normal          SuccessfulRescale       New size: 10; reason: cpu resource utilization (percentage of request) above target
</span></span><span class="line"><span class="cl">  14d           3m              69      horizontal-pod-autoscaler                       Normal          SuccessfulRescale       New size: 8; reason: All metrics below target
</span></span></code></pre></td></tr></table>
</div>
</div><p>Given the default value of the <code>--horizontal-pod-autoscaler-upscale-delay</code>
flag in
<a href="https://kubernetes.io/docs/admin/kube-controller-manager/">kube-controller-manager</a>
(3 minutes), once your application scales up, it won&rsquo;t be able to scale up again for another 3 minutes.</p>
<p>Thus, if your application <strong>really</strong> does experience increased load, it
might take ~4 minutes (3m from the autoscaler back-off + ~1m from the metrics
sync) for the autoscaler to react, which might be just enough time for your
service to degrade.</p>
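<p>If that window doesn&rsquo;t match your traffic pattern, both delays can be tuned
on kube-controller-manager (the values below are illustrative, not recommendations):</p>
<pre tabindex="0"><code>kube-controller-manager \
  --horizontal-pod-autoscaler-upscale-delay=1m0s \
  --horizontal-pod-autoscaler-downscale-delay=5m0s
</code></pre>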
<h2 id="references">References</h2>
<ul>
<li><a href="https://www.nginx.com/blog/tuning-nginx/">Tuning NGINX for Performance</a></li>
<li><a href="http://veithen.github.io/2014/01/01/how-tcp-backlog-works-in-linux.html">How TCP backlog works in Linux</a></li>
<li><a href="https://eklitzke.org/how-tcp-sockets-work">How TCP Sockets Work</a></li>
<li><a href="https://johnleach.co.uk/words/372/netfilter-conntrack-memory-usage">Netfilter Conntrack Memory Usage</a></li>
<li><a href="https://blogs.dropbox.com/tech/2017/09/optimizing-web-servers-for-high-throughput-and-low-latency/">Optimizing web servers for high throughput and low latency</a></li>
<li><a href="https://sysdig.com/blog/container-isolation-gone-wrong/">Container isolation gone wrong</a></li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>Five Months of Kubernetes</title>
      <link>https://danielfm.me/posts/five-months-of-kubernetes/</link>
      <pubDate>Wed, 14 Sep 2016 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/five-months-of-kubernetes/</guid>
      <description>How migrating to Kubernetes helped us achieve cost reduction and support dynamic development environments.</description>
      <content:encoded><![CDATA[<p>For the past year, <a href="https://descomplica.com.br">Descomplica</a> moved towards a more
service-oriented architecture for its core components (auth, search, etc) and
we&rsquo;ve been using <a href="http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/Welcome.html">Elastic Beanstalk</a>
from the start to orchestrate the deployment of those services to AWS.</p>
<p>It was a good decision at the time. In general, Elastic Beanstalk works fine
and has a very gentle learning curve; it didn&rsquo;t take long for all teams to start
using it for their projects.</p>
<p>Fast-forward a few months, everything was nice and good. Our old problems were
solved, but - as you might have guessed - we had new ones to worry about.</p>
<h2 id="cost-issues">Cost Issues</h2>
<p>In Elastic Beanstalk, each EC2 instance runs exactly one application container.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>
This means that, if you follow reliability best practices, you&rsquo;ll have two or more
instances (spread across multiple <a href="http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html">availability zones</a>) for each application.
You might need even more instances if you have other environments besides the
production one, e.g. staging.</p>
<p>Anyway, you&rsquo;ll end up having multiple dedicated instances per service which,
depending on your workload, will sit there doing nothing most of the time.</p>
<p>We needed to find a way to use our available compute resources more wisely.</p>
<h2 id="the-winner">The Winner</h2>
<p>After looking around for alternatives to ECS, <a href="http://kubernetes.io">Kubernetes</a>
seemed to be the right one for us.</p>
<blockquote>
<p>Kubernetes is a container orchestration tool that builds upon 15 years of
experience of running production workloads at Google, combined with
best-of-breed ideas and practices from the community.</p>
</blockquote>
<p>Although Kubernetes is a feature-rich project, a few key features caught our
attention: <a href="http://kubernetes.io/docs/user-guide/namespaces/">namespaces</a>, <a href="http://kubernetes.io/docs/user-guide/deployments/">automated rollouts and rollbacks</a>,
<a href="http://kubernetes.io/docs/user-guide/services/">service discovery via DNS</a>,
<a href="http://kubernetes.io/docs/user-guide/horizontal-pod-autoscaling/">automated container scaling based on resource usage</a>,
and of course, the promise of a <a href="http://kubernetes.io/docs/user-guide/pod-states/#container-probes">self-healing system</a>.</p>
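<p>To make the automated-rollout point concrete, here&rsquo;s a minimal deployment manifest in the API version current at the time; the resource names and image are made up:</p>

```yaml
# Hypothetical example; names and image are placeholders.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: auth
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: auth
    spec:
      containers:
      - name: auth
        image: registry.example.com/auth:v42
        ports:
        - containerPort: 3000
```

<p>Updating the <code>image</code> field triggers a rolling update, and <code>kubectl rollout undo deployment/auth</code> rolls it back.</p>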
<p>Kubernetes is somewhat opinionated around how containers are supposed to be
organized and networked, but this should not be a problem if your service
follows the <a href="https://12factor.net/">Twelve-Factor</a> practices.</p>
<h2 id="our-path-to-production">Our Path to Production</h2>
<figure><img src="five-months.png"
    alt="Project activity graph"><figcaption>
      <p>Project activity graph.</p>
    </figcaption>
</figure>

<p>In order to ensure Kubernetes was a viable option for us, the first thing we
did was perform some reliability tests to make sure it could handle failure
modes such as dying nodes, killed Kubelet/Proxy/Docker daemons, and availability
zone outages.</p>
<p>It&rsquo;s impossible to anticipate all the ways things can go wrong, but in the end,
we were very impressed by how Kubernetes managed to handle these failures.</p>
<p>At that time, we used <a href="http://kubernetes.io/docs/getting-started-guides/binary_release/">kube-up</a>
to bootstrap our test clusters. Although it served its purpose, this tool
didn&rsquo;t always work as expected; it suffered from a number of issues, such as
poorly chosen defaults, random timeouts that left the stack only half-created,
and inconsistent behavior when destroying the cluster that left orphaned
resources behind.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>Once we agreed that Kubernetes was the way to go, we needed a more reliable
way to create and destroy our Kubernetes clusters.</p>
<h3 id="enter-kube-aws">Enter kube-aws</h3>
<p><a href="https://github.com/coreos/coreos-kubernetes/tree/master/multi-node/aws">kube-aws</a> is a tool created by the folks at CoreOS. The cool
thing about it is that it uses <a href="https://aws.amazon.com/cloudformation/">CloudFormation</a> under the hood,
which gives us some neat advantages.</p>
<p>The first obvious advantage is that it&rsquo;s very easy to create and destroy
clusters without leaving anything silently hanging around.</p>
<p>Another feature is that, unlike kube-up, you can create a cluster in an existing
VPC so all services running in Kubernetes have access to your existing
AWS resources - such as <a href="https://aws.amazon.com/rds/">relational databases</a> - right off the bat.</p>
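<p>In kube-aws, reusing an existing VPC comes down to a few lines in the cluster configuration. The fragment below is just a sketch &ndash; key names may differ between kube-aws versions, and every value here is a placeholder:</p>

```yaml
# Sketch of a kube-aws cluster.yaml fragment (all values are
# placeholders; key names may vary across kube-aws versions).
clusterName: production-v2
externalDNSName: kubernetes.example.com
region: sa-east-1
keyName: my-key-pair
vpcId: vpc-0123abcd        # create the cluster inside an existing VPC
routeTableId: rtb-0123abcd # so it can reach existing AWS resources
```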
<p>In fact, you can run multiple clusters at the same time in the same VPC. This
has a nice side effect: you can treat each cluster as an immutable piece of
infrastructure; instead of modifying a running cluster - and risking breaking
something - you simply create a new cluster and gradually shift traffic from
the old one to the new so that any incident has limited impact.</p>
<p>The final and probably the most useful feature is that you can easily customize
nearly every aspect of the cluster provisioning configuration to make it fit
your own needs. In our case, we added
<a href="https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/fluentd-elasticsearch/">cluster level logging</a> that ingests application logs to
<a href="https://sumologic.com">Sumologic</a>, <a href="https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/cluster-monitoring">cluster monitoring</a> with
<a href="https://www.influxdata.com">InfluxDB</a> and <a href="http://grafana.org">Grafana</a>, <a href="http://kubernetes.io/docs/admin/authorization/#abac-mode">ABAC-based authorization</a>,
among other things.</p>
<h3 id="the-first-environment">The First Environment</h3>
<p>After solving the problem of reliably creating and destroying clusters, we felt
confident to start migrating our staging environment over to Kubernetes.</p>
<p>It was easy enough to manually create the yaml manifests for the first
<a href="http://kubernetes.io/docs/user-guide/deployments/">deployments</a>, but we needed an automated way to deploy new
application images as soon as they were built by our continuous integration
system.</p>
<p>Just as a proof of concept, we quickly hacked together a small function in
<a href="https://aws.amazon.com/documentation/lambda/">AWS Lambda</a> (based on <a href="https://aws.amazon.com/blogs/compute/dynamic-github-actions-with-aws-lambda/">this article</a>) that
automatically updated the corresponding deployment object whenever it
received a merge notification for a revision whose tests passed.</p>
<blockquote>
<p>This small Lambda function has now evolved into a major component in our
delivery pipeline, orchestrating deployments to other environments as well,
including production.</p>
</blockquote>
<p>With this done, migrating staging services from Beanstalk to Kubernetes was
pretty straightforward. First, we created one DNS record for each service (each
initially pointing to the legacy deployment in Elastic Beanstalk) and made sure
that all services referenced each other via this DNS. Then, it was just a matter
of changing those DNS records to point to the corresponding
<a href="http://kubernetes.io/docs/user-guide/services/">Kubernetes-managed load balancers</a>.</p>
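<p>Each of those Kubernetes-managed load balancers comes from a service of type <code>LoadBalancer</code>; a minimal sketch, with hypothetical names:</p>

```yaml
# Hypothetical service; Kubernetes provisions an AWS ELB for it, and
# the service's DNS record can then point to the ELB's address.
apiVersion: v1
kind: Service
metadata:
  name: auth
spec:
  type: LoadBalancer
  selector:
    app: auth       # route to pods carrying this label
  ports:
  - port: 80        # port exposed by the ELB
    targetPort: 3000 # port the application pods listen on
```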
<p>To ensure every part of the pipeline was working as expected, we spent some time
monitoring all staging deployments looking for bugs and polishing things up
as best we could.</p>
<h3 id="more-tests-more-learning">More Tests, More Learning</h3>
<p>Before deploying our first production service to Kubernetes, we did some load
testing to find out the optimal configuration for the
<a href="http://kubernetes.io/docs/user-guide/compute-resources/">resource requirements</a> needed by each service and to find out how many pods
we needed to handle the current traffic.</p>
<figure><img src="grafana.png"
    alt="Grafana dashboard showing CPU and memory usage for a container"><figcaption>
      <p>CPU and memory usage for a container.</p>
    </figcaption>
</figure>

<p>Observing how your services behave under load and how much compute they need
is <em>essential</em>.</p>
<p>Also take some time to understand how
<a href="https://github.com/kubernetes/kubernetes/blob/master/docs/design/resource-qos.md#qos-classes">QoS classes</a> work in Kubernetes so you have finer control over
which pods get killed in the case of memory pressure. This is particularly
important if you, like us, share the same cluster for all environments.</p>
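<p>A pod&rsquo;s QoS class is derived from the resource requests and limits you declare for its containers; the values below are illustrative:</p>

```yaml
# Illustrative container spec fragment. Requests lower than limits
# yield the Burstable class; requests equal to limits yield
# Guaranteed; declaring neither yields BestEffort, which is the
# first to be killed under memory pressure.
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 200m
    memory: 256Mi
```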
<h4 id="tip-enable-cross-zone-load-balancing-aws">Tip: Enable Cross-Zone Load Balancing (AWS)</h4>
<p>This is <a href="https://github.com/kubernetes/kubernetes/pull/30695">already fixed</a> in Kubernetes 1.4, but for now, if
you expose your services via the <a href="http://kubernetes.io/docs/user-guide/services/#type-loadbalancer">LoadBalancer type</a>, don&rsquo;t forget to
manually enable <a href="https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/enable-disable-crosszone-lb.html">cross-zone load balancing</a> for the corresponding
ELB; if you don&rsquo;t, you might notice uneven balancing across your application
pods if they are spread in nodes from different availability zones.</p>
<h4 id="tip-give-the-kube-system-namespace-some-love">Tip: Give the kube-system Namespace Some Love</h4>
<p>If you ever tried Kubernetes, you probably noticed there&rsquo;s a <code>kube-system</code>
namespace there with a bunch of stuff in it; do yourself a favor and take some
time to understand the role of each of those things.</p>
<p>For instance, take the <a href="https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns">DNS add-on</a>; it&rsquo;s rather common to see people having
<a href="https://github.com/coreos/coreos-kubernetes/issues/533">issues</a> because they forgot to add more DNS pods to handle their
ever-increasing workload.</p>
<h3 id="going-live">Going Live</h3>
<p>Instead of shifting all traffic at once, like we did in staging, we took a
more careful approach and used a <a href="http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html">weighted routing policy</a>
to gradually shift traffic to the Kubernetes cluster.</p>
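<p>In CloudFormation terms, a weighted setup looks roughly like this: two records for the same name whose <code>Weight</code> values you adjust over time (all names and targets below are placeholders):</p>

```yaml
# Hypothetical CloudFormation fragment: shifting traffic is a matter
# of changing the Weight values and updating the stack.
LegacyRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: example.com.
    Name: auth.example.com.
    Type: CNAME
    TTL: "60"
    SetIdentifier: beanstalk
    Weight: 90
    ResourceRecords:
      - legacy-elb.sa-east-1.elb.amazonaws.com
KubernetesRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: example.com.
    Name: auth.example.com.
    Type: CNAME
    TTL: "60"
    SetIdentifier: kubernetes
    Weight: 10
    ResourceRecords:
      - k8s-elb.sa-east-1.elb.amazonaws.com
```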
<figure><img src="phaseout.png"
    alt="Graph showing incoming request count to an application in Elastic Beanstalk"><figcaption>
      <p>Incoming request count to an application in Elastic Beanstalk.</p>
    </figcaption>
</figure>

<p>Once we noticed no more requests were reaching the legacy Beanstalk environments,
we went ahead and killed them.</p>
<p><strong>Update (Sep 21, 2016)</strong>: All major services were migrated to our new platform!
These are the final numbers:<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<ul>
<li>~53-63% decrease in monthly costs</li>
<li>~72-82% decrease in # of instances</li>
</ul>
<h2 id="beyond-production">Beyond Production</h2>
<p>Kubernetes gave us the power to almost effortlessly mold our delivery pipeline
in a way we never thought possible. One example of such improvement is what we
call here <em>development environments</em>.</p>
<figure><img src="deploy-pending.png"
    alt="GitHub status displaying the deployment status for the development environment of one application"><figcaption>
      <p>Deployment status for the development environment of one application.</p>
    </figcaption>
</figure>

<p>Whenever someone opens a Pull Request to one of our projects, the AWS Lambda
function I mentioned earlier creates a temporary environment running the
modifications introduced by the PR.</p>
<p>Also, whenever new code is pushed, this environment gets automatically updated
as long as they pass the tests. Finally, when the PR is merged (or closed), the
environment is deleted.</p>
<figure><img src="deploy-success.png"
    alt="GitHub status displaying the deployment was finished"><figcaption>
      <p>GitHub status displaying the deployment was finished.</p>
    </figcaption>
</figure>

<p>This feature made our code reviews more thorough because the developers can
actually see the changes running. This is even more useful for UX changes in
front-end services; artists and product owners get the chance to validate the
changes and share their input before the PR is merged.</p>
<p>To send the <a href="https://developer.github.com/v3/repos/statuses/">GitHub Status</a> notifications you see in these pictures,
we implemented a small daemon in Go that monitors deployments to our
<code>development</code> namespace and reconciles the deployment status for each revision.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Kubernetes is a very complex piece of software that aims to solve a very complex
problem, so expect to spend some time learning how its many pieces fit together
before using it in your projects.</p>
<p>Kubernetes is production-ready, but avoid the temptation of trying to run
<em>everything</em> on it. In our experience, Kubernetes does not offer a clean
solution for a number of problems you might face, such as
<a href="http://kubernetes.io/docs/user-guide/petset/">stateful applications</a>.</p>
<p>The documentation is not great either, but initiatives like the
<a href="https://kubernetesbootcamp.github.io/kubernetes-bootcamp/index.html">Kubernetes Bootcamp</a> and <a href="https://twitter.com/kelseyhightower">Kelsey Hightower</a>&rsquo;s
<a href="https://github.com/kelseyhightower/kubernetes-the-hard-way">Kubernetes The Hard Way</a> give me hope that this will no longer be a
problem in the near future.</p>
<p>Without Kubernetes, I don&rsquo;t know how - or if - we could have accomplished
all the things we did in such a short period of time with such a small
engineering team.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
<p>We hope to continue building on Kubernetes to make our delivery platform even
more dynamic and awesome!</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Each AWS region seems to evolve at a different pace. At the time of this writing, multi-container Beanstalk applications and <a href="https://aws.amazon.com/ecs/">ECS</a> were not available for the <code>sa-east-1</code> region. Almost all of our users live in Brazil, so moving out to a different region wasn&rsquo;t really an option.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>There are a number of initiatives to come up with a better tool to create and manage Kubernetes clusters, such as <a href="https://github.com/kubernetes/kops">kops</a>.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Range depends on the workload.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>The ops/delivery team is actually a one-engineer team: me!&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>A Week Of Docker</title>
      <link>https://danielfm.me/posts/a-week-of-docker/</link>
      <pubDate>Fri, 15 Aug 2014 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/a-week-of-docker/</guid>
      <description>Lessons learned from Dockerizing a Rails application with best practices.</description>
      <content:encoded><![CDATA[<p>If you got here, the chances are you heard the fuss around
<a href="http://docker.com">Docker</a> and how it&rsquo;s supposed to change the way we deploy
applications.</p>
<p>According to the <a href="https://docs.docker.com/">official website</a>, Docker is&hellip;</p>
<blockquote>
<p>&hellip;a platform for developers and sysadmins to develop, ship, and run
applications. Docker lets you quickly assemble applications from components
and eliminates the friction that can come when shipping code. Docker lets
you get your code tested and deployed into production as fast as possible.</p>
</blockquote>
<p>I&rsquo;m not here to sell you anything; apparently there are too many people doing
that already. Instead, I&rsquo;m going to document my experiences trying to
&ldquo;Dockerize&rdquo; a simple <a href="http://rubyonrails.org/">Rails</a> application and show you
some things I learned along the way.</p>
<h2 id="the-application">The Application</h2>
<p>A few months ago I built <a href="https://codeberg.org/danielfm/texbin">TeXBin</a>, a
simple Rails application where you can post a <code>.tex</code> file and get a URL for its
PDF version. The code was sitting in my laptop without being used, so why not
use it as guinea pig in my first attempt to use Docker? :-)</p>
<p>The proposed stack is composed of three components: the application itself, a
<a href="http://mongodb.org">MongoDB</a> instance, and an <a href="http://nginx.org">Nginx</a> server
to both serve the static content and act as a reverse proxy to the application.</p>
<pre class="mermaid">graph LR
    Users([Users])  <-->|Listen 0.0.0.0:80| Nginx

    subgraph HostMachine [Host machine]
        Nginx[Nginx]
        App[App]
        MongoDB[(MongoDB)]

        %% Relationships
        Nginx -- proxypass --> App
        App --> MongoDB
    end
</pre>
<h2 id="wtf-is-a-container">WTF is a container?</h2>
<p>Docker is built on top of Linux kernel facilities, like <code>cgroups</code> and
<code>namespaces</code>, and provides a way to create lightweight workspaces &ndash; or
containers &ndash; that <em>run processes in isolation</em>.</p>
<blockquote>
<p>By using containers, resources can be isolated, services restricted, and
processes provisioned to have a private view of the operating system with
their own process ID space, file system structure, and network interfaces.
Multiple containers can share the same kernel, but each container can be
constrained to only use a defined amount of resources such as CPU, memory and
I/O.</p>
<p>&ndash; <a href="http://en.wikipedia.org/wiki/Docker_%28software%29">Wikipedia</a></p>
</blockquote>
<p>So, in short, you get nearly all the benefits of virtualization with almost
none of the execution overhead that comes with it.</p>
<h2 id="why-not-put-everything-within-the-same-container">Why not put everything within the same container?</h2>
<p>You get several benefits by exposing the different components of your
application as different containers. Just to be clear, by <strong>component</strong> I mean
some service that binds to a TCP port.</p>
<p>In particular, having different containers for different components gives us
freedom to <em>move the pieces around or add new pieces</em> as we see fit, like:</p>
<ul>
<li>impose different usage limits (CPU shares and memory limits) for the database,
the application, and the webserver</li>
<li>change from a simple MongoDB instance to a
<a href="http://docs.mongodb.org/manual/replication/">replica set</a> composed of
several containers across multiple hosts</li>
<li>spin up two or more application containers so you can perform
<a href="http://martinfowler.com/bliki/BlueGreenDeployment.html">blue-green deployments</a>,
improve concurrency and resource usage, etc</li>
</ul>
<p>In other words: it&rsquo;s a good idea to keep the moving parts, well, moving.</p>
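<p>As a sketch of the first point, a tool like Fig (covered later in this post) can impose different caps per component; this assumes your Fig version supports the <code>mem_limit</code> and <code>cpu_shares</code> options, and the values here are made up:</p>

```yaml
# Hypothetical fig.yml fragment: different resource caps per
# component, passed through to the corresponding docker run flags.
mongodb:
  image: mongo
  mem_limit: 512m
  cpu_shares: 512

app:
  build: .
  mem_limit: 256m
  cpu_shares: 1024
```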
<h2 id="the-dockerfile">The Dockerfile</h2>
<p>Containers are created from images, so first we need to create an image
with the application code and all the required software packages.</p>
<p>Instead of doing things manually, Docker can build images automatically by
reading the instructions from a <code>Dockerfile</code>, which is a text file that contains
all the commands you would normally execute manually in order to build a Docker
image.</p>
<p>This is the application&rsquo;s <code>Dockerfile</code>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-dockerfile" data-lang="dockerfile"><span class="line"><span class="cl"><span class="c"># Base image (https://registry.hub.docker.com/_/ubuntu/)</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">FROM</span><span class="w"> </span><span class="s">ubuntu</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Install required packages</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">RUN</span> apt-get update<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">RUN</span> apt-get install -y ruby2.0 ruby2.0-dev bundler texlive-full<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Create directory from where the code will run</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">RUN</span> mkdir -p /texbin/app<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">WORKDIR</span><span class="w"> </span><span class="s">/texbin/app</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Make unicorn reachable to other containers</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">EXPOSE</span><span class="w"> </span><span class="s">3000</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Container should behave like a standalone executable</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">CMD</span> <span class="p">[</span><span class="s2">&#34;start&#34;</span><span class="p">]</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ENTRYPOINT</span> <span class="p">[</span><span class="s2">&#34;foreman&#34;</span><span class="p">]</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Install the necessary gems</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ADD</span> Gemfile /texbin/app/Gemfile<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ADD</span> Gemfile.lock /texbin/app/Gemfile.lock<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">RUN</span> bundle install<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Copy application code to container</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ADD</span> . /texbin/app/<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Try not to add steps after the last ADD so we can use the</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Docker build cache more efficiently</span><span class="err">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Information about the individual commands can be obtained
<a href="https://docs.docker.com/reference/builder/">here</a>.</p>
<h3 id="tip-kiss-rvm-goodbye">Tip: Kiss RVM Goodbye</h3>
<p>Why do you need RVM if the application will live inside a controlled and
isolated environment?</p>
<p>The only reason you might want to do that is because you need to install a
particular version of Ruby that you can&rsquo;t find via traditional OS package
managers. If that&rsquo;s the case, you&rsquo;ll be better off installing the Ruby version
you want from the source code.</p>
<p>Using RVM from within a Docker container is not a pleasant experience; every
command must run inside a login shell session and you&rsquo;ll have problems using
<code>CMD</code> together with <code>ENTRYPOINT</code>.</p>
<h3 id="tip-optimize-for-the-build-cache">Tip: Optimize for the Build Cache</h3>
<p>Docker stores intermediate images after successfully executing each command in
the <code>Dockerfile</code>. This is a great feature; if any step fails along the way, you
can fix the problem and the next build will reuse the cache built up until that
last successful command.</p>
<p>Instructions like <code>ADD</code> are not cache friendly though. That&rsquo;s why it&rsquo;s a good
practice to only <code>ADD</code> stuff as late as possible in the <code>Dockerfile</code> since any
changes in the files &ndash; or their metadata &ndash; will invalidate the build cache for
all subsequent instructions.</p>
<p>Which leads us to&hellip;</p>
<h3 id="tip-do-not-forget-the-dockerignore">Tip: Do Not Forget the .dockerignore</h3>
<p>A really important step is to avoid <code>ADD</code>ing irrelevant files to the
container, like <code>README</code>, <code>fig.yml</code>, <code>.git/</code>, <code>logs/</code>, <code>tmp/</code>, and others.</p>
<p>If you are familiar with <code>.gitignore</code>, the idea is the same: just create a
<code>.dockerignore</code> file and put there the patterns you want to ignore. This will
help keep the image small and the build fast by decreasing the chance of cache
busting.</p>
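<p>For this project, a <code>.dockerignore</code> covering the files mentioned above could be as simple as:</p>

```
# Keep these out of the build context
.git
logs
tmp
README.md
fig.yml
```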
<h2 id="testing-the-images">Testing the Images</h2>
<p>To run the application, first we&rsquo;ll need a container that exposes a single
MongoDB server:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker run --name texbin_mongodb_1 -d mongo
</span></span></code></pre></td></tr></table>
</div>
</div><p>Then you have to build the application image and start a new container:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-bash" data-lang="bash"><span class="line"><span class="cl">$ docker build -t texbin:dev .
</span></span><span class="line"><span class="cl">$ docker run --name texbin_app_1 -d --link texbin_mongodb_1:mongodb -p 3000:3000 texbin:dev
</span></span></code></pre></td></tr></table>
</div>
</div><p>Learning how <a href="https://docs.docker.com/userguide/dockerlinks/">container linking</a>
and <a href="https://docs.docker.com/userguide/dockervolumes/">volumes</a> work is
essential if you want to understand how to &ldquo;plug&rdquo; containers together.</p>
<p><strong>Note:</strong> The project also includes a <code>Dockerfile</code> for the
<a href="https://codeberg.org/danielfm/texbin/src/branch/master/config/docker/nginx">Nginx container</a>
which I won&rsquo;t show here because it doesn&rsquo;t bring anything new to the table.</p>
<p>Now <code>docker ps</code> should display two running containers. If everything&rsquo;s
working, you should be able to access the application at
<a href="http://localhost:3000">http://localhost:3000</a>. To see the logs, run <code>docker logs texbin_app_1</code>.</p>
<h2 id="docker-in-development">Docker in Development</h2>
<p>It turns out it&rsquo;s quite easy to automate these last steps with
<a href="http://www.fig.sh/">Fig</a>:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-yaml" data-lang="yaml"><span class="line"><span class="cl"><span class="c"># fig.yml</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">mongodb</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">image</span><span class="p">:</span><span class="w"> </span><span class="l">mongo</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="nt">app</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">build</span><span class="p">:</span><span class="w"> </span><span class="l">.</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">ports</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="m">3000</span><span class="p">:</span><span class="m">3000</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">links</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">mongodb:mongodb</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">  </span><span class="nt">volumes</span><span class="p">:</span><span class="w">
</span></span></span><span class="line"><span class="cl"><span class="w">    </span>- <span class="l">.:/texbin/app</span><span class="w">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Then, run <code>fig up</code> in the terminal in order to build the images, start the
containers, and link them.</p>
<p>The only difference between this and the commands we ran manually before is
that now we&rsquo;re mounting the host&rsquo;s current directory to the container&rsquo;s
<code>/texbin/app</code> so that we can view our changes to the application in real time.</p>
<p>Try changing some <code>.html.erb</code> template and refreshing the browser.</p>
<h2 id="defining-new-environments">Defining New Environments</h2>
<p>The goal is to run the same application in production, but with a different
configuration, right? A simple way to &ndash; sort of &ndash; solve this is to create
another image, based on the previous one, that overrides the required
configuration:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-dockerfile" data-lang="dockerfile"><span class="line"><span class="cl"><span class="c"># Uses our previous image as base</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">FROM</span><span class="w"> </span><span class="s">texbin</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Set the proper environment</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ENV</span> RAILS_ENV production<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Custom settings for that environment</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ADD</span> production_env /texbin/app/.env<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">ADD</span> production_mongoid.yml /texbin/app/config/mongoid.yml<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Precompile the assets</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">RUN</span> rake assets:precompile<span class="err">
</span></span></span><span class="line"><span class="cl"><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="c"># Exposes the public directory as a volume</span><span class="err">
</span></span></span><span class="line"><span class="cl"><span class="k">VOLUME</span><span class="w"> </span><span class="s">/texbin/app/public</span><span class="err">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>If you know a better way to do this, please let me know in the comments.</p>
<h2 id="going-live">Going Live</h2>
<p>The first thing to do is to push your images to the server. There are plenty
of ways to do that: <a href="https://registry.hub.docker.com/">the public registry</a>, a
<a href="https://github.com/docker/docker-registry">private-hosted registry</a>, git, etc.
Once the images are built, just repeat the procedure we did earlier and you&rsquo;re
done.</p>
<p>But that&rsquo;s not everything. As you probably know, deploying an application
involves <a href="http://www.oscon.com/oscon2014/public/schedule/detail/34136">a lot more</a>
than just moving stuff to some remote servers. This means you&rsquo;ll still have to
worry about things like deployment automation, monitoring (at host and container
levels), logging, data migrations, and backups.</p>
<h2 id="conclusion">Conclusion</h2>
<p>I&rsquo;m glad I took the time to look at Docker. Despite its young age, it&rsquo;s a very
impressive, rapidly evolving piece of technology with a lot of potential to
radically change the DevOps landscape in the next couple of years.</p>
<p>However, Docker solves only one variable of a huge equation. You&rsquo;ll still have
to take care of boring things like monitoring, and I imagine it&rsquo;s rather
difficult &ndash; not to say impossible &ndash; to use Docker in production without
<a href="https://github.com/newrelic/centurion">some layer of automation</a> on top of it.</p>
<p>Also, features like container linking are somewhat limited, and we&rsquo;ll probably
see substantial improvements in
<a href="https://github.com/docker/docker/milestones">future releases</a>. So stay tuned!</p>
]]></content:encoded>
    </item>
    <item>
      <title>Why Are Continuations so Darn Cool?</title>
      <link>https://danielfm.me/posts/why-are-continuations-cool/</link>
      <pubDate>Thu, 05 Jun 2014 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/why-are-continuations-cool/</guid>
      <description>Practical introduction to continuations and call/cc in Scheme with examples.</description>
      <content:encoded><![CDATA[<blockquote>
<p><em>Continuations are the least understood of all control-flow constructs. This
lack of understanding (or awareness) is unfortunate, given that continuations
permit the programmer to implement powerful language features and algorithms.</em></p>
<p>&ndash; Matt Might, in <a href="http://matt.might.net/articles/programming-with-continuations--exceptions-backtracking-search-threads-generators-coroutines/">Continuations By Example</a></p>
</blockquote>
<p>The usual way to control the flow of execution of a computer program is via
procedure calls and returns; a <a href="http://en.wikipedia.org/wiki/Call_stack">stack</a>
data structure is how high-level programming languages keep track of the point
to which each active subroutine should return control when it finishes
executing.</p>
<p>Unfortunately, you&rsquo;ll need more than that if you intend to write useful
programs to solve real-world problems. That&rsquo;s why most high-level programming
languages also provide other control-flow primitives, like the <code>goto</code>
statement, loops, and exception handling.</p>
<p>I&rsquo;m not saying that implementing a programming language is an easy task, but
putting that aside for a moment, it&rsquo;s as if programming languages in general
fight as hard as they can to keep the call stack as hidden and
intangible as possible - something no one but the language itself is allowed to control.</p>
<p>What would happen if some programming languages, instead of keeping the call
stack inside a 2&quot; solid steel safe, actually gave programmers the ability
to &ldquo;capture&rdquo; it as a function that can be invoked, stored, and passed around
as a value?</p>
<p>In this post, I hope to show you what <em>continuations</em> are and how they can be
used in practical situations. So grab <a href="http://racket-lang.org">Racket</a> and
let&rsquo;s go!</p>
<p><strong>Update (Jun 20, 2014):</strong>  I&rsquo;ve changed some things in this post in response
to some great comments in this
<a href="http://www.reddit.com/r/scheme/comments/27gn0j/why_are_continuations_so_darn_cool/">Reddit discussion</a> and this
<a href="http://jecxjo.motd.org/code/blosxom.cgi/coding/explain_continuations">blog post</a>
by jecxjo.</p>
<h2 id="first-example">First Example</h2>
<p><strong>Note:</strong> The following problem is solvable without continuations, but I&rsquo;d like
to start with something simple enough.</p>
<p>Suppose you are writing code that interfaces with some API over HTTP. Also
suppose this API requires a <code>SessionId</code> header to be sent over with the request
in order to avoid
<a href="http://en.wikipedia.org/wiki/Cross-site_request_forgery">CSRF</a> attacks.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="kn">#lang </span><span class="nn">racket/base</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; Object to keep the session-id across requests</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">struct</span> <span class="n">session</span> <span class="p">[</span><span class="n">id</span> <span class="kd">#:mutable</span><span class="p">])</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">perform-request!</span> <span class="n">session</span> <span class="n">method</span> <span class="n">params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">;; Performs the request</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">define</span> <span class="n">headers</span>  <span class="p">(</span><span class="n">session-id-headers</span> <span class="n">session</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">define</span> <span class="n">response</span> <span class="p">(</span><span class="n">http-request</span> <span class="n">API-URL</span> <span class="n">headers</span> <span class="n">method</span> <span class="n">params</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">;; Retries the request with the given Session-Id</span>
</span></span><span class="line"><span class="cl">  <span class="c1">;; if necessary</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">request-denied?</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">update-session-id!</span> <span class="n">session</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">perform-request!</span> <span class="n">session</span> <span class="n">method</span> <span class="n">params</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="n">parse-json</span> <span class="n">response</span><span class="p">))</span><span class="err">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>When the first request is sent - without the <code>SessionId</code> header - the server
responds with an error, e.g. HTTP 409, in which case the procedure updates
<code>session</code> with the session id given by the server and retries the request.</p>
<p>The code makes sense, but it&rsquo;s <strong>broken.</strong></p>
<p>The recursive call is not made in tail position, so its return value is simply
discarded: another stack frame is pushed onto the call stack, and even though
the retried request succeeds, what gets returned to the caller is the response
to that first unauthorized request.</p>
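<p>To see the bug in isolation, here&rsquo;s a minimal Python sketch (hypothetical names,
not the original client&rsquo;s code) of a retry whose recursive call sits in non-tail
position, so its result is thrown away:</p>

```python
def perform_request(attempt=0):
    # The first attempt is denied; the retry succeeds
    response = "denied" if attempt == 0 else "ok"
    if response == "denied":
        # The retry works, but its return value is discarded
        # because nothing returns it to the caller
        perform_request(attempt + 1)
    # Control falls through here, still holding the *first* response
    return response

print(perform_request())  # -> "denied", even though the retry returned "ok"
```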
<p>If only we had the chance to <strong>return</strong> that second response right to the caller
instead of letting the stack unwind itself&hellip;</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">perform-request!</span> <span class="n">session</span> <span class="n">method</span> <span class="n">params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="c1">;; ...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">request-denied?</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">update-session-id!</span> <span class="n">session</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">;; Something like this</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">return</span> <span class="p">(</span><span class="n">perform-request!</span> <span class="k">...</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="n">parse-json</span> <span class="n">response</span><span class="p">))</span><span class="err">
</span></span></span></code></pre></td></tr></table>
</div>
</div><h3 id="enter-callcc">Enter <code>call/cc</code></h3>
<p>A continuation can be viewed as the evaluation context surrounding an
expression or, in other words, a <strong>snapshot</strong> of the current control state
of the program.</p>
<p>Here&rsquo;s an example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="kn">#lang </span><span class="nn">racket/base</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; We&#39;ll keep the captured continuation here</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="n">cc</span> <span class="no">#f</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; This function returns the value 3 *and* stores the</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; continuation that represents the execution context</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; in which this function was called</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">val!</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="nb">call/cc</span>
</span></span><span class="line"><span class="cl">   <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">k</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">     <span class="p">(</span><span class="k">set!</span> <span class="n">cc</span> <span class="n">k</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">     <span class="mi">3</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; Stored continuation for this expression: (+ 1 (* 2 ?))</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="nb">+</span> <span class="mi">1</span> <span class="p">(</span><span class="nb">*</span> <span class="mi">2</span> <span class="p">(</span><span class="n">val!</span><span class="p">)))</span> <span class="c1">;-&gt; 7</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; Replays the continuation with different arguments</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">cc</span> <span class="mi">2</span><span class="p">)</span> <span class="c1">;-&gt;  5, or (+ 1 (* 2 2))</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">cc</span> <span class="mi">6</span><span class="p">)</span> <span class="c1">;-&gt; 13, or (+ 1 (* 2 6))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>It turns out that, if we rename <code>k</code> to <code>return</code>, this is exactly
what we need to fix that broken API client example:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">perform-request!</span> <span class="n">session</span> <span class="n">method</span> <span class="n">params</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">let/cc</span> <span class="n">return</span> <span class="c1">; same as (call/cc (lambda (return) body...))</span>
</span></span><span class="line"><span class="cl">    <span class="c1">;; ...</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">;; Retries the request and gives the control</span>
</span></span><span class="line"><span class="cl">    <span class="c1">;; back to the caller if request-denied?</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="k">when</span> <span class="p">(</span><span class="n">request-denied?</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="p">(</span><span class="n">update-session-id!</span> <span class="n">session</span> <span class="n">response</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">      <span class="p">(</span><span class="n">return</span> <span class="p">(</span><span class="n">perform-request!</span> <span class="n">session</span> <span class="n">method</span> <span class="n">params</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">parse-json</span> <span class="n">response</span><span class="p">)))</span><span class="err">
</span></span></span></code></pre></td></tr></table>
</div>
</div><p>Now the function captures the current continuation at the moment
<code>perform-request!</code> is called. Then, if the server denies a request, we
re-send the request with the given <code>SessionId</code> and use that captured
continuation to transfer control back to the caller.</p>
<p>Nice, don&rsquo;t you think? It&rsquo;s like we&rsquo;re freezing time at <code>let/cc</code>, doing some
stuff, and then resuming from there.</p>
<p>This is a common use case of continuations. Check out
<a href="https://github.com/danielfm/transmission-rpc-client">this project</a> if you want
to read the code that inspired this example.</p>
<h2 id="generators">Generators</h2>
<p><a href="http://en.wikipedia.org/wiki/Generator_%28computer_programming%29">Generators</a> can
be viewed as special routines that behave like iterators. If you are familiar
with Python, you&rsquo;ve probably seen code like this one:</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-python" data-lang="python"><span class="line"><span class="cl"><span class="k">def</span> <span class="nf">iterate</span><span class="p">(</span><span class="nb">list</span><span class="p">):</span>
</span></span><span class="line"><span class="cl">  <span class="s2">&#34;Generator function that iterates through list.&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="nb">list</span><span class="p">:</span>
</span></span><span class="line"><span class="cl">    <span class="k">yield</span> <span class="n">item</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1"># Usage</span>
</span></span><span class="line"><span class="cl"><span class="n">it</span> <span class="o">=</span> <span class="n">iterate</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="n">it</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> <span class="c1"># -&gt; 0</span>
</span></span><span class="line"><span class="cl"><span class="n">it</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> <span class="c1"># -&gt; 1</span>
</span></span><span class="line"><span class="cl"><span class="n">it</span><span class="o">.</span><span class="n">next</span><span class="p">()</span> <span class="c1"># -&gt; raises StopIteration error</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Do you see any resemblance between this example and the previous one? Although
Python doesn&rsquo;t provide a <code>call/cc</code>-like facility in the language, one can argue
that its generators are like a poor man&rsquo;s continuation.</p>
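<p>To make the analogy concrete, here&rsquo;s a small Python 3 sketch (not from the
original post) showing that a generator frame behaves like a resumable
continuation: each <code>yield</code> suspends the computation, and <code>send</code> resumes it
right where it left off:</p>

```python
def counter():
    total = 0
    while True:
        # Each `yield` suspends the frame here, like capturing the
        # current continuation; `send` resumes it with a value
        delta = yield total
        total += delta

c = counter()
print(next(c))    # prime the generator; runs to the first yield -> 0
print(c.send(2))  # resume with delta=2 -> 2
print(c.send(5))  # resume with delta=5 -> 7
```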
<p>Let&rsquo;s pretend for a moment that Racket didn&rsquo;t have a
<a href="http://docs.racket-lang.org/reference/Generators.html">generator library</a>
that does exactly this. How could this be implemented in Racket using
continuations?</p>
<p>What we need is a function that returns another function which, when called,
yields one item at a time, until the list is exhausted.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">iterate</span> <span class="n">lst</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">lambda</span> <span class="p">()</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="k">let/cc</span> <span class="n">return</span>
</span></span><span class="line"><span class="cl">      <span class="p">(</span><span class="nb">for-each</span>
</span></span><span class="line"><span class="cl">       <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">item</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">         <span class="p">(</span><span class="n">return</span> <span class="n">item</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">       <span class="n">lst</span><span class="p">))))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; Usage</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="n">next</span> <span class="p">(</span><span class="n">iterate</span> <span class="p">(</span><span class="nb">range</span> <span class="mi">3</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="c1">;-&gt; 0</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="c1">;-&gt; 0</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This code follows the same pattern as the previous ones, but it doesn&rsquo;t
work the way you might expect. The reason should be clear though: every call to
the lambda captures a fresh continuation and restarts <code>for-each</code> from the
beginning, so it always yields the list&rsquo;s first item.</p>
<p>To make this code work, we need to capture the current continuation from the
inside of <code>for-each</code> and store it so it can be used to resume the computation
when <code>next</code> is called again.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt"> 1
</span><span class="lnt"> 2
</span><span class="lnt"> 3
</span><span class="lnt"> 4
</span><span class="lnt"> 5
</span><span class="lnt"> 6
</span><span class="lnt"> 7
</span><span class="lnt"> 8
</span><span class="lnt"> 9
</span><span class="lnt">10
</span><span class="lnt">11
</span><span class="lnt">12
</span><span class="lnt">13
</span><span class="lnt">14
</span><span class="lnt">15
</span><span class="lnt">16
</span><span class="lnt">17
</span><span class="lnt">18
</span><span class="lnt">19
</span><span class="lnt">20
</span><span class="lnt">21
</span><span class="lnt">22
</span><span class="lnt">23
</span><span class="lnt">24
</span><span class="lnt">25
</span><span class="lnt">26
</span><span class="lnt">27
</span><span class="lnt">28
</span><span class="lnt">29
</span><span class="lnt">30
</span><span class="lnt">31
</span><span class="lnt">32
</span><span class="lnt">33
</span><span class="lnt">34
</span><span class="lnt">35
</span><span class="lnt">36
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-racket" data-lang="racket"><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">iterate</span> <span class="n">lst</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">;; Defines `state` as being a function that starts the</span>
</span></span><span class="line"><span class="cl">  <span class="c1">;; iteration via `for-each`</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">state</span> <span class="n">return</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="nb">for-each</span>
</span></span><span class="line"><span class="cl">     <span class="p">(</span><span class="k">lambda</span> <span class="p">(</span><span class="n">item</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">       <span class="c1">;; Here, we capture the continuation that represents the</span>
</span></span><span class="line"><span class="cl">       <span class="c1">;; current state of the iteration</span>
</span></span><span class="line"><span class="cl">       <span class="p">(</span><span class="k">let/cc</span> <span class="n">item-cc</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">         <span class="c1">;; Before the item is yielded, we update `state` to</span>
</span></span><span class="line"><span class="cl">         <span class="c1">;; `item-cc` so the computation is resumed the next</span>
</span></span><span class="line"><span class="cl">         <span class="c1">;; time the generator is called</span>
</span></span><span class="line"><span class="cl">         <span class="p">(</span><span class="k">set!</span> <span class="n">state</span> <span class="n">item-cc</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">         <span class="c1">;; Yields the current item to the caller</span>
</span></span><span class="line"><span class="cl">         <span class="p">(</span><span class="n">return</span> <span class="n">item</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">     <span class="n">lst</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">    <span class="c1">;; Yields &#39;done when the list is exhausted</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="n">return</span> <span class="o">&#39;</span><span class="ss">done</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl">  <span class="c1">;; Returns a function that calls the stored `state` with the</span>
</span></span><span class="line"><span class="cl">  <span class="c1">;; current continuation so we can yield one item at a time</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">define</span> <span class="p">(</span><span class="n">generator</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="nb">call/cc</span> <span class="n">state</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">  <span class="n">generator</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="c1">;; Usage</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="k">define</span> <span class="n">next</span> <span class="p">(</span><span class="n">iterate</span> <span class="o">&#39;</span><span class="p">(</span><span class="mi">0</span> <span class="mi">1</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="c1">;-&gt; 0</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="c1">;-&gt; 1</span>
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="n">next</span><span class="p">)</span> <span class="c1">;-&gt; &#39;done</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>If you are having trouble understanding how this code works, the following
diagram might help.</p>
<pre class="mermaid">stateDiagram-v2
  direction LR
  s1: State
  s2: (set! state item-cc)
  s3: (return 'item)
  state if_state <<choice>>

  [*] --> s1: (next)
  s1 --> if_state: (for-each)
  
  if_state --> [*]: (return 'done)
  if_state --> s2: Has items left

  s2 --> s3
  s3 --> if_state: Yield to caller
</pre>
<h2 id="other-examples">Other Examples</h2>
<p>Moving on to higher-level applications, in
<a href="http://docs.racket-lang.org/more/index.html">this example</a> Matthew Flatt
explains how to build a continuation-based web server in Racket. Still in the
realm of continuation-based web programming, if you are into Smalltalk, don&rsquo;t
forget to check out <a href="http://www.seaside.st/">Seaside</a>, a web application framework
that uses continuations to model multiple independent flows between different
components.</p>
<p>If you don&rsquo;t code Scheme or Smalltalk for a living, don&rsquo;t worry. Chances
are your language supports some flavor of continuations, either natively or
via a third-party library.</p>
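<p>As a rough sketch (not part of the original example), here is how the Scheme <code>iterate</code> generator above maps onto Python&rsquo;s built-in generators, which package up the same suspend-and-resume behavior without exposing <code>call/cc</code>:</p>

```python
def iterate(lst):
    """Yield one item of lst at a time, then a 'done' sentinel.

    This mirrors the Scheme `iterate` example: each call to next()
    resumes the function right where it last suspended.
    """
    for item in lst:
        yield item
    yield "done"

# Usage
gen = iterate([0, 1])
next(gen)  # -> 0
next(gen)  # -> 1
next(gen)  # -> 'done'
```

<p>Under the hood, a generator is a restricted (one-shot, resumable-from-one-point) form of the continuation machinery shown earlier.</p>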
<h2 id="conclusion">Conclusion</h2>
<p>As we have seen, continuations can be used to implement a wide variety of advanced
control constructs, including non-local exits, exception handling, backtracking,
and <a href="http://en.wikipedia.org/wiki/Coroutine">coroutines</a>.</p>
<p>In this post, I hope to have clarified some of the key aspects of
continuations. If you have any suggestions on how to improve this text, please
let me know.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Functional Programming 101 - With Clojure</title>
      <link>https://danielfm.me/posts/functional-programming-101-with-clojure/</link>
      <pubDate>Sun, 26 Jan 2014 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/functional-programming-101-with-clojure/</guid>
      <description>Solving the Hydra challenge with Clojure and functional programming concepts.</description>
      <content:encoded><![CDATA[<p>Here goes a simple yet interesting programming problem originally proposed by
<a href="http://www.iit.edu/csl/cs/faculty/beckman_mattox.shtml">Mattox Beckman</a>. After
seeing <a href="http://blog.gja.in/2014/01/functional-programming-101-with-haskell.html">Tejas Dinkar</a>&rsquo;s
take on this problem using <a href="http://haskell.org">Haskell</a>, I decided to give it
a go with <a href="http://clojure.org">Clojure</a>.</p>
<blockquote>
<p>You are Hercules, about to fight the dreaded Hydra. The Hydra has 9 heads.
When a head is chopped off, it spawns 8 more heads. When one of these 8 heads
is cut off, each one spawns out 7 more heads. Chopping one of these spawns 6
more heads, and so on until the weakest head of the hydra will not spawn out
any more heads.</p>
<p>Our job is to figure out how many chops Hercules needs to make in order to
kill all heads of the Hydra. And no, it&rsquo;s not <em>n!</em>.</p>
</blockquote>
<p>We can start by defining a function that returns an <code>n</code>-headed Hydra.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span><span class="lnt">7
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="kd">defn </span><span class="nv">new-hydra</span>
</span></span><span class="line"><span class="cl">  <span class="s">&#34;Returns a Hydra with n heads.&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">[</span><span class="nv">n</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="nb">repeat </span><span class="nv">n</span> <span class="nv">n</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="nf">new-hydra</span> <span class="mi">3</span><span class="p">)</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; (3 3 3)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>To make it easy to compare both solutions, the data structure I&rsquo;m using here
is the same one used by Dinkar: a list. In this list, each number represents
a living head and its level of strength.</p>
<p>Now, according to the problem description, when Hercules chops off a level 3
head, the Hydra grows two level 2 heads.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nf">chop-head</span> <span class="p">(</span><span class="nf">new-hydra</span> <span class="mi">3</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; (2 2 3 3)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Here&rsquo;s one possible implementation for such a function.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="kd">defn </span><span class="nv">chop-head</span>
</span></span><span class="line"><span class="cl">  <span class="s">&#34;Returns a new Hydra after chopping off its first head.&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">[</span><span class="nv">hydra</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">head</span> <span class="p">(</span><span class="nb">first </span><span class="nv">hydra</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="nb">into </span><span class="p">(</span><span class="nb">rest </span><span class="nv">hydra</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">          <span class="p">(</span><span class="nf">new-hydra</span> <span class="p">(</span><span class="nb">dec </span><span class="nv">head</span><span class="p">)))))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>This code should make sense even if you are not familiar with Clojure.</p>
<blockquote>
<p>What happens if Hercules tries to cut off the head of a headless Hydra?</p>
</blockquote>
<p>Most functional programming languages I know are built on top of a strong principle
called the <a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-15.html">closure property</a>.</p>
<blockquote>
<p><em>In general, an operation for combining data objects satisfies the closure
property if the results of combining things with that operation can
themselves be combined using the same operation. Closure is the key to power
in any means of combination because it permits us to create hierarchical
structures &ndash;  structures made up of parts, which themselves are made up of
parts, and so on.</em></p>
<p>&ndash; Gerald Jay Sussman, Hal Abelson</p>
</blockquote>
<p>To illustrate this concept with code, let&rsquo;s consider Clojure&rsquo;s <code>cons</code> function.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nb">cons </span><span class="mi">1</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="o">&#39;</span><span class="p">()))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; (1 2)</span>
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="p">(</span><span class="nb">cons </span><span class="mi">1</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">3</span> <span class="nv">nil</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; (1 2 3)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>That means <code>cons</code> follows the closure principle. But what about our <code>chop-head</code>
function? Does the principle hold?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nf">chop-head</span> <span class="p">(</span><span class="nf">chop-head</span> <span class="p">(</span><span class="nf">chop-head</span> <span class="o">&#39;</span><span class="p">(</span><span class="mi">2</span><span class="p">))))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; NullPointerException</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Apparently not. To fix that, we need to make sure <code>dec</code> is not called with
<code>nil</code>, since it&rsquo;s not possible to decrement a null value.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span><span class="lnt">6
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="kd">defn </span><span class="nv">chop-head</span>
</span></span><span class="line"><span class="cl">  <span class="s">&#34;Returns a new Hydra after chopping off its first head.&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">[</span><span class="nv">hydra</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">head</span> <span class="p">(</span><span class="nb">first </span><span class="nv">hydra</span><span class="p">)]</span>
</span></span><span class="line"><span class="cl">    <span class="p">(</span><span class="nb">into </span><span class="p">(</span><span class="nb">rest </span><span class="nv">hydra</span><span class="p">)</span>
</span></span><span class="line"><span class="cl">          <span class="p">(</span><span class="nf">new-hydra</span> <span class="p">(</span><span class="nb">dec </span><span class="p">(</span><span class="nb">or </span><span class="nv">head</span> <span class="mi">1</span><span class="p">))))))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>What about now?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nf">chop-head</span> <span class="p">(</span><span class="nf">chop-head</span> <span class="p">(</span><span class="nf">chop-head</span> <span class="o">&#39;</span><span class="p">(</span><span class="mi">2</span><span class="p">))))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; ()</span>
</span></span></code></pre></td></tr></table>
</div>
</div><h2 id="killing-the-hydra">Killing The Hydra</h2>
<p>In order for Hercules to kill the Hydra, he needs to repeatedly chop off the
Hydra&rsquo;s heads while it still has them.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span><span class="lnt">5
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="kd">defn </span><span class="nv">chop-until-dead</span>
</span></span><span class="line"><span class="cl">  <span class="s">&#34;Repeatedly chops Hydra&#39;s heads until no head is left.&#34;</span>
</span></span><span class="line"><span class="cl">  <span class="p">[</span><span class="nv">hydra</span><span class="p">]</span>
</span></span><span class="line"><span class="cl">  <span class="p">(</span><span class="nb">take-while </span><span class="o">#</span><span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nf">empty?</span> <span class="nv">%</span><span class="p">))</span>
</span></span><span class="line"><span class="cl">              <span class="p">(</span><span class="nb">iterate </span><span class="o">#</span><span class="p">(</span><span class="nf">chop-head</span> <span class="nv">%</span><span class="p">)</span> <span class="nv">hydra</span><span class="p">)))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>The <code>(iterate f x)</code> function returns a lazy (infinite) sequence of <code>x</code>, <code>(f x)</code>,
<code>(f (f x))</code>, and so on, provided that <code>f</code> is free of side effects.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; (0 1 2)</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Since <code>chop-head</code> respects the closure principle by always returning a list,
we can use it in <code>iterate</code> until we get an empty list, which means the Hydra
is DEAD.</p>
<p>Let&rsquo;s test it on a 3-headed baby Hydra.</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nf">chop-until-dead</span> <span class="p">(</span><span class="nf">new-hydra</span> <span class="mi">3</span><span class="p">))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; ((3 3 3) (2 2 3 3) (1 2 3 3) (2 3 3) (1 3 3) (3 3)</span>
</span></span><span class="line"><span class="cl"><span class="c1">;;     (2 2 3) (1 2 3) (2 3) (1 3) (3) (2 2) (1 2) (2) (1))</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>How many chops are needed in order to kill the original 9-headed Hydra?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nb">count </span><span class="p">(</span><span class="nf">chop-until-dead</span> <span class="p">(</span><span class="nf">new-hydra</span> <span class="mi">9</span><span class="p">)))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; 986409</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Another interesting question: what is the maximum number of heads Hercules
fought at once?</p>
<div class="highlight"><div class="chroma">
<table class="lntable"><tr><td class="lntd">
<pre tabindex="0" class="chroma"><code><span class="lnt">1
</span><span class="lnt">2
</span><span class="lnt">3
</span><span class="lnt">4
</span></code></pre></td>
<td class="lntd">
<pre tabindex="0" class="chroma"><code class="language-clojure" data-lang="clojure"><span class="line"><span class="cl"><span class="p">(</span><span class="nb">apply </span><span class="nv">max</span>
</span></span><span class="line"><span class="cl">       <span class="p">(</span><span class="nb">map </span><span class="nv">count</span>
</span></span><span class="line"><span class="cl">            <span class="p">(</span><span class="nf">chop-until-dead</span> <span class="p">(</span><span class="nf">new-hydra</span> <span class="mi">9</span><span class="p">))))</span>
</span></span><span class="line"><span class="cl"><span class="c1">;; =&gt; 37</span>
</span></span></code></pre></td></tr></table>
</div>
</div><p>Ah. Beautiful.</p>
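<p>As a closing sanity check (sketched in Python rather than Clojure, purely for brevity), the chop count also falls out of a simple recurrence: fully killing a single level-<em>k</em> head costs one chop plus the cost of killing the <em>k - 1</em> spawned heads of level <em>k - 1</em>, and an <em>n</em>-headed Hydra is just <em>n</em> independent level-<em>n</em> heads.</p>

```python
def chops_per_head(k):
    # One chop for the head itself, plus (k - 1) spawned heads
    # of level (k - 1), each of which must be killed in turn.
    return 1 if k == 1 else 1 + (k - 1) * chops_per_head(k - 1)

def total_chops(n):
    # An n-headed Hydra is n independent level-n heads.
    return n * chops_per_head(n)

total_chops(3)  # -> 15
total_chops(9)  # -> 986409
```

<p>Both numbers agree with the counts obtained by simulating <code>chop-until-dead</code> above.</p>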
]]></content:encoded>
    </item>
    <item>
      <title>Learning from Data - Course Review</title>
      <link>https://danielfm.me/posts/learning-from-data/</link>
      <pubDate>Sun, 03 Jun 2012 00:00:00 +0000</pubDate>
      <guid>https://danielfm.me/posts/learning-from-data/</guid>
      <description>Review of Caltech&amp;#39;s machine learning course with practical assignments and theory.</description>
      <content:encoded><![CDATA[<p>Machine learning students and practitioners looking for a solid
foundation on the subject probably heard about
<a href="http://work.caltech.edu/telecourse.html">Learning From Data</a>, a real
Caltech course taught by Professor
<a href="http://work.caltech.edu">Yaser Abu-Mostafa</a> and broadcast live, for
free.</p>
<p>As a contribution to this wonderful initiative, I gather here my
impressions regarding some aspects of the course.</p>
<h2 id="prerequisites">Prerequisites</h2>
<p>Basic probability, matrices, and calculus. According to the course
website, this is all you need to know. I would add <em>programming</em> to
that list though, since it&rsquo;s necessary to write some tricky
programs in order to answer some of the questions.</p>
<h2 id="broadcast">Broadcast</h2>
<p>The <a href="http://ustream.com/caltech">Ustream</a> platform was chosen to
broadcast the course and, apart from some small glitches, it worked
nicely the times I tried it. This is how I&rsquo;d watch the lectures if I
lived somewhere with a more favorable time zone (lectures ran from
2:30 P.M. to 3:30 P.M. in Brazil).</p>
<h2 id="material">Material</h2>
<p>Lucky for us, the lecture videos were made available in several
formats &ndash; in low and high quality versions &ndash; just a couple of days
after their broadcast. The recorded Ustream session was accessible
shortly after each lecture. The lectures were also posted on
<a href="http://www.youtube.com/playlist?list=PLD63A284B7615313A&amp;feature=plcp">YouTube</a>
and on the <a href="http://www.apple.com/education/itunes-u/">iTunes U</a> course app
for iOS devices. This was <em>really</em> impressive.</p>
<p>I liked the quality of the slides, which were made available in PDF
format. All pictures, plots, and equations are sharp and easy to
read. Another plus: the notation chosen to explain each concept and
algorithm matches the notation adopted in popular articles on the
same subject, for example, on <a href="http://wikipedia.org/">Wikipedia</a>. This
makes things easy for those who seek to enrich their knowledge even
further.</p>
<p>I ordered the <a href="http://www.amazon.com/gp/product/1600490069">textbook</a>
for the course, but unfortunately it didn&rsquo;t arrive in time for this
post. However, according to the few reviews on Amazon and the feedback
given in the course forum, it does its job pretty well.</p>
<p>As the course website states, this <em>really</em> isn&rsquo;t a watered-down
material and I&rsquo;m glad for that.</p>
<p>The subjects I enjoyed the most: VC Theory
(<a href="http://en.wikipedia.org/wiki/Shattered_set">Growth Function</a>,
<a href="http://en.wikipedia.org/wiki/Vc_dimension">VC Dimension</a>),
Regularization, <a href="http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29">Validation</a>,
and
<a href="http://en.wikipedia.org/wiki/Support_vector_machine">Support Vector Machines</a>.</p>
<h2 id="professors">Professors</h2>
<p>I must say the course teacher, Professor Yaser Abu-Mostafa, is one of
the most talented teachers I&rsquo;ve ever had, online or otherwise. He
truly masters the subject and knows what is relevant and what is noise
when explaining something.</p>
<p>Another impressive thing was how available he was in the Q&amp;A forum,
together with the Associate Professors, answering students&rsquo; questions
and collecting feedback in order to keep the course at its highest
level. Associate Professor
<a href="http://www.cs.rpi.edu/~magdon/">Malik Magdon-Ismail</a> also helped a
lot in some discussions by writing some real gems. Some of his posts
could be turned into nice articles very easily.</p>
<h2 id="assignments">Assignments</h2>
<p>I enjoyed every single one of them. They were thorough, challenging,
and fun to work on. Some of them made me &ldquo;spend&rdquo; the entire weekend in
order to answer all the questions! Well, I wasn&rsquo;t looking for an easy
time anyway. :-)</p>
<p>In fact, I would stop using the word <em>homework</em> because what we did
at the end of each week was actually an <em>experiment</em>. We played with
every aspect involved in the design and implementation of simple
machine learning systems, either by implementing everything from
scratch or by using third-party packages.</p>
<p>The ambiguity present in several answers (whether intentional or
not) made me more skeptical about what was coming out of my
programs. This encouraged students to engage in enlightening
discussions in the Q&amp;A forum.</p>
<p>To be honest, I don&rsquo;t like vBulletin, the platform chosen for the Q&amp;A
forum. I think even more users would have joined the discussions if a
friendlier (and more reward-oriented) option were chosen instead (some
<a href="http://stackoverflow.com/">Stack Overflow</a> clone would have been
better).</p>
<p>Still, I&rsquo;m impressed by the quality of the discussions started
there. Thanks to the very gifted and dedicated colleagues who helped
me clear my doubts &ndash; and even change my mind. I owe you! :-)</p>
<h2 id="my-solutions">My Solutions</h2>
<p>I chose <a href="http://www.gnu.org/software/octave/">Octave</a> to be my
programming language throughout the course, and this choice proved to
be the right one. Some questions involved heavy matrix-based
calculations and quadratic programming problems, and this is where
Octave shines.</p>
<p><del>Unfortunately, I&rsquo;m not allowed to post my solutions online because of
the honor code, but ask me if you want to discuss any question in
particular.</del></p>
<p><strong>Update (Jan 20, 2014):</strong> I participated in the reedition of this
<a href="https://www.edx.org/course/caltechx/caltechx-cs1156x-learning-data-1120">course on edX</a>
so I could earn a certificate as a reward for my hard work. In this edition,
the course staff encouraged students to discuss their solutions after each
week&rsquo;s deadline. My solutions are available
<a href="https://codeberg.org/danielfm/edx-learning-from-data">here</a>.</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
