
Short version: brew install groff - this package contains the soelim binary, and whatever you are trying to build should now build correctly. Good luck!

Longer version: you may have encountered an error message like:

  Entering subdirectory man1
PAGES=`cd .; echo *.1`; \
	for page in $PAGES; do \
		sed -e "s%LDVERSION%2.4.44%" \
...
	done
/bin/sh: soelim: command not found
/bin/sh: soelim: command not found
...
/bin/sh: soelim: command not found
make[5]: *** [all-common] Error 127
make[4]: *** [all-common] Error 1
make[3]: *** [all-common] Error 1
make[2]: *** [all-common] Error 1
make[1]: *** [sbin/slapadd] Error 2
make: *** [gitlab-openldap/libexec/slapd] Error 2

I encountered this while trying to set up LDAP for local testing using the GitLab Development Kit. It downloads an older version of OpenLDAP, configures, compiles, and installs locally in a pretty sane way. Unfortunately, when it attempts to install the man pages for OpenLDAP, it tries to “glue” all the associated files together using soelim instead of something more universal. Since this was missing on my MacBook Pro, the install failed there.

My attempts to bypass man-page compilation ended up with me digging through a lot of compile-time code and error messages, and it turned out to be a lot simpler to just get the soelim binary onto my machine. The main barrier to that was my own StackOverflow / DuckDuckGo-fu; I couldn't figure out where the heck this soelim binary comes from. I ended up getting help from some awesome colleagues and finding the answer on Slack. Figured I should post it somewhere publicly searchable on the internet.

The groff package contains the soelim binary, so now you should have everything you need to find and start fixing the next dependency problem you encounter.

tl;dr: check out this sample code on Github for an AWS Textract client and results parser

At my previous company, we wanted to use Textract to get some table-based data out of PDFs. This was in the medical field, so the only “programmatic” interface we had to the system was to set up an inbox that would receive emails from it, and those emails might contain PDFs with the data we wanted. Medicine be like that.

However, the output of Textract can be a little hard to work with. It’s just a big bag of “blocks” - elements it has identified on the page - with a geometry, confidence, and relationships to each other. They’re returned as a paginated list, and you have to reconstruct the element hierarchy in your client. This became critical when trying to visualize where in a given “type” of document the information we wanted was located. Was it the last LINE element on a page? Was it a WORD element located inside some other elements? I wanted to visualize this to get a better look.

The result-parsing code in their tutorials is in Python, and is of the most uninspiring big-bag-of-functions type, so I thought about how to manage this in Ruby. Mostly I just wanted some data structure where I could call .parent on a particular element and recurse up to the page level, kind of like the DOM in html-land.

I ended up with some code that looks like this:

class Node
  attr_reader :block, :parent, :children

  def initialize(block, parent: nil, blocks_map: {})
    @block = block
    @parent = parent
    @children = []
    return if block.relationships.nil?

    block.relationships.each do |rel|
      next unless rel.type == 'CHILD'
      next if rel.ids.nil? || rel.ids.empty?

      rel.ids.each do |block_id|
        blk = blocks_map[block_id]
        next if blk.nil?

        @children << self.class.new(blk, parent: self, blocks_map: blocks_map)
      end
    end
  end
end
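
For context, here's roughly how you'd build the blocks_map and kick off the tree from a finished Textract analysis job. This is a sketch assuming the aws-sdk-textract gem and an async get_document_analysis job; the job_id is a placeholder, and the actual client code in the sample repo may differ:

require "aws-sdk-textract"

client = Aws::Textract::Client.new
job_id = "your-textract-job-id" # placeholder

# Collect every block across all pages of results.
blocks = []
params = { job_id: job_id }
loop do
  resp = client.get_document_analysis(params)
  blocks.concat(resp.blocks)
  break unless resp.next_token
  params[:next_token] = resp.next_token
end

# Index blocks by ID so Node can resolve CHILD relationships.
blocks_map = blocks.each_with_object({}) { |blk, map| map[blk.id] = blk }

# Each PAGE block becomes the root of its own tree.
pages = blocks.select { |blk| blk.block_type == "PAGE" }
              .map { |blk| Node.new(blk, blocks_map: blocks_map) }

From there you can walk pages.first and its children like any other tree.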

This gave me a tree object with a reasonable structure. If I wanted to get fancier, I could add a grep method to search the node text and its children, or other recursive tree-based functionality. If we wanted to get really fancy, we could sort the tree by x * y in the geometry, making it easy to walk the tree from top-left to bottom-right.

But since we were writing pretty basic extractors, this was enough to let me walk through, find the element I wanted with the right block.text value, and walk up its parents to see where it lived in the document structure.
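
For the curious, those helpers might look something like this - a sketch layered on top of the Node class above, not code from the sample repo:

class Node
  # Walk up the parent chain from this node toward the page-level root.
  def ancestors
    chain = []
    node = parent
    while node
      chain << node
      node = node.parent
    end
    chain
  end

  # Recursively collect every node whose text matches the given pattern.
  def grep(pattern)
    matches = children.flat_map { |child| child.grep(pattern) }
    matches.unshift(self) if block.text.to_s.match?(pattern)
    matches
  end
end

Finding where a value lives is then just node = page.grep(/Total/).first followed by node.ancestors.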

I added some code to print the whole tree to console so you can easily visualize it:

  def to_s
    txt = if block.text.nil?
      ''
    elsif block.text.length > 10
      "#{block.text[0..7]}..."
    else
      block.text
    end
    "<#{block.block_type} #{txt} #{block.id}>"
  end

  def print_tree(indent = 0)
    indent_txt = indent > 0 ? ' ' * (indent * 2) : ''
    puts "#{indent_txt}#{to_s}"
    children.each {|chld| chld.print_tree(indent + 1) }
  end

This is in lieu of just making a nicer inspect method and using something like awesome_print or the built-in pp method. While those are great, we don't really need the Ruby object ids and other properties for this visualization - they just clutter up the terminal. We could override def inspect to show only the info we want, but I feel like that's a POLA violation, so it's better to just write this functionality where it belongs.

If you’d like to run a Textract analysis and play with the results, I’ve got the sample code up on Github. It’s not well-tested or ready for deployment, but it can be a starting point if you want to do a quick integration of Textract into your own Ruby project. Hope this helps someone!

tl;dr - if Dependabot isn’t opening a PR to fix a vulnerable package, especially with npm + yarn, try it manually every so often. Use commands like:

  • npm ls vulnpackage to see where the vulnerable package is in the npm dep tree
  • yarn why --recursive vulnpackage for Yarn’s explanation of all dep trees including the vulnerable package
  • yarn up --recursive vulnpackage to upgrade everything in the dep tree from the vulnerable package, but keeping the version constraints specified in your package.json (only available in Yarn 3.0+)

This was an issue we encountered at my last job. We used Github's Dependabot to track CVEs in our supply chain. There are similar tools from Snyk and others, but Dependabot comes included with Github, and it's compatible with a pretty broad array of languages.

There was a frustrating vulnerability that was open for some time - vm2 had a sandbox escape vulnerability, and it was required about 7 levels deep by Microsoft AppCenter’s react-native-code-push library. Rather than attempt to mitigate the issue, the vm2 maintainer decided to discontinue the project.

Eventually a mitigation was found: the code path using vm2 in code-push covers a very rare use case, and it could be removed by using a newer version of superagent. code-push was updated, then react-native-code-push was updated to fix the dependency tree and clear the Snyk and Dependabot alerts for AppCenter users.

Dependabot should have opened a pull request to update our app. Although react-native-code-push was pinned to version 8.0.0, the dependency specification for code-push in the yarn.lock file was "code-push@npm:^4.1.0", so there was room to move it. When we ran yarn up --recursive code-push, it successfully updated code-push in the yarn.lock file, removed the vm2 dependency, and looked ready to go. But Dependabot was throwing an error and saying that the dependency “could not be upgraded.”

I’ve managed to nearly reproduce this situation in a public repo - check out yarn-dependabot-example on my Github. The only difference here is that Dependabot doesn’t seem to be trying to open a pull request to fix the issue at all. Running yarn up --recursive code-push fixes it immediately, affecting only the lockfile.

I wasn’t able to clearly determine why Dependabot considered this issue unfixable, but it seems clear that the dep tree shaker written for Dependabot and the one Yarn uses are slightly different. For any “stubborn” vulnerabilities that aren’t getting fixes auto-generated, it’s worth trying the simple stuff manually now & then to see if a fix is available.

tl;dr - got a weird error when opening a .xlsx sheet using Roo, wrote a quick fix for it. The error had a stack trace like this:

	12: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/nokogiri-1.15.4-x86_64-darwin/lib/nokogiri/xml/node_set.rb:234:in `upto'
	11: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/nokogiri-1.15.4-x86_64-darwin/lib/nokogiri/xml/node_set.rb:235:in `block in each'
	10: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/sheet_doc.rb:224:in `block (2 levels) in extract_cells'
	 9: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/sheet_doc.rb:101:in `cell_from_xml'
	 8: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/nokogiri-1.15.4-x86_64-darwin/lib/nokogiri/xml/node_set.rb:234:in `each'
	 7: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/nokogiri-1.15.4-x86_64-darwin/lib/nokogiri/xml/node_set.rb:234:in `upto'
	 6: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/nokogiri-1.15.4-x86_64-darwin/lib/nokogiri/xml/node_set.rb:235:in `block in each'
	 5: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/sheet_doc.rb:114:in `block in cell_from_xml'
	 4: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/sheet_doc.rb:172:in `create_cell_from_value'
	 3: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/sheet_doc.rb:172:in `new'
	 2: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/cell/number.rb:16:in `initialize'
	 1: from /Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/cell/number.rb:27:in `create_numeric'
/Users/agius/.rbenv/versions/2.7.0/lib/ruby/gems/2.7.0/gems/roo-2.10.0/lib/roo/excelx/cell/number.rb:27:in `Integer': invalid value for Integer(): "" (ArgumentError)

I made a pull request with a fix here, but it hasn’t been addressed or merged. Kind of a bummer, feels like this happens nearly every time I try to contribute to open-source projects. You can implement the fix yourself by monkey-patching like so:

# frozen_string_literal: true

module Roo
  class Excelx
    class Cell
      class Number < Cell::Base
        def create_numeric(number)
          return number if Excelx::ERROR_VALUES.include?(number)
          case @format
          when /%/
            cast_float(number)
          when /\.0/
            cast_float(number)
          else
            (number.include?('.') || (/\A[-+]?\d+E[-+]?\d+\z/i =~ number)) ? cast_float(number) : cast_int(number, 10)
          end
        end
        
        def cast_float(number)
          return 0.0 if number == ''

          Float(number)
        end

        def cast_int(number, base = 10)
          return 0 if number == ''

          Integer(number, base)
        end
      end
    end
  end
end

If you’re running Rails, drop that in lib/roo/excelx/cell/number.rb and make sure it’s required in an initializer or on app boot. Your app should now be able to handle the “bad” spreadsheets producing the error.
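
For example, the initializer can be as small as this (file names are up to you; I'm assuming the patch lives at the lib/ path above):

# config/initializers/roo_patches.rb
# Load roo first so the patch re-opens the gem's class instead of defining a new one.
require "roo"
require Rails.root.join("lib/roo/excelx/cell/number").to_s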

The longer story

This wasn’t the initial error I was debugging; I think the original error was a NoMethodError on the Roo::Spreadsheet object or something. From seeing the initial error in Bugsnag, I was able to determine which file it came from, but it seemed like the stack trace was truncated. So I downloaded the sheet and tried to reproduce locally. The results were weird: I didn’t get the production error, and didn’t seem to get a “proper” local error, either. When running inside a rails console, I just saw the message:

> xlsx.each_row_streaming {|row| pp [:row, row] }
...lots of output...
[:row,
 [#<Roo::Excelx::Cell::String:0x00007f8982913580
   @cell_value="Header 1",
   @coordinate=[1, 1],
   @value="Header 1">,
  #<Roo::Excelx::Cell::String:0x00007f8982913148
   @cell_value="Header 2",
   @coordinate=[1, 2],
   @value="Header 2">,
  #<Roo::Excelx::Cell::String:0x00007f8982912ea0
   @cell_value="Header 3",
   @coordinate=[1, 3],
   @value="Header 3">]]
invalid value for Integer(): ""
> $!
nil
> $@
nil
> 

Like, huh? Where’s the stack trace? Why isn’t the error in $!, the magic Ruby global for “last error message”? I did manage to get a full stack trace by writing a test that tried to parse the “bad” spreadsheet. Maybe IRB or Pry was swallowing the error somehow.

Once I got the stack trace above, it looked like something was wrong in Roo itself, and not in our code or the way we were calling it. It seemed like something that should be a simple fix. The “Numbers” app that comes with OSX opened the bad spreadsheet no problem, and uploading to Google Sheets also didn’t pose any issue. So what was going on with Roo? Clearly it was trying to parse something as an Integer when it was an empty string, but how did it get there?

Investigating the raw XML

XLSX documents are just XML files zipped up in a particular way. You can unzip them and read the raw XML if you want. Since the each_row_streaming command above indicated which cell was causing the problem, I thought I’d dig in and see if it was some weird Unicode character conversion, mis-encoded file, or what it might be.

To unzip your XLSX file, just use your OS’s built-in unzipping tool:

$ unzip Example-sheet.xlsx -d Example-sheet-unzip
$ find Example-sheet-unzip
Example-sheet-unzip
Example-sheet-unzip/[Content_Types].xml
Example-sheet-unzip/_rels
Example-sheet-unzip/_rels/.rels
Example-sheet-unzip/xl
Example-sheet-unzip/xl/workbook.xml
Example-sheet-unzip/xl/worksheets
Example-sheet-unzip/xl/worksheets/sheet1.xml
...snip...
$ open Example-sheet-unzip/xl/worksheets/sheet1.xml

Yup! Sure is XML. It’s not, like, super-readable XML, but you can kinda reverse-engineer it. Like reading a restaurant menu in a language you don’t know.

<c r="K58" s="1" t="s">
  <v>4</v>
</c>

Here’s a cell in the spreadsheet. You can see it’s cell K58 - column K, row 58. The t="s" only seems to appear for cells with string values in them, not the numeric ones, so it probably means “type=string”. The s="1" looks like a style index pointing into the workbook’s styles, but don’t quote me on that; it was out of scope for this investigation.

The value inside the <v>4</v> tags indicates two possible things:

  1. for numeric-type cells, it’s the numeric value in the cell, either int or float
  2. for string-type cells, it’s a reference to the file sharedStrings.xml , which is just a big array of all the strings in the spreadsheet. Helps reduce filesize by de-duping strings, I guess

Knowing all that, and knowing the specific error message invalid value for Integer(): "" , I could kind of deduce what might have gone wrong with the cell that seemed to stump the Roo parser:

<c r="L58">
  <v/>
</c>

I checked some other spreadsheets that didn’t have any errors, and I couldn’t see any instances of a self-closing tag like this. Other spreadsheets had full valid XML tags:

<c r="N26">
  <v></v>
</c>

Roo didn’t blow a gasket on this type of cell. It seemed like this might be the issue. I did some Googling and figured out how to zip the dir back up as an XLSX file so I could test out my theory:

Modifying the XML in an XLSX file for fun & profit

I followed this guide to unzipping and re-zipping XLSX files. Re-posting in case the original site goes down or something:

  1. unzip Example-sheet.xlsx -d Example-sheet-unzip
  2. cd into the extracted zip dir Example-sheet-unzip
  3. make edits to the xml as desired, probably in a file like xl/worksheets/sheet1.xml
  4. use python2 ../zipdir.py Example-sheet-edited.xlsx . (see script below) to compress the directory contents back into an archive; note that shutil.make_archive appends a .zip extension, so rename the result back to .xlsx afterwards
  5. this should generate an .xlsx file open-able by Numbers, Excel, GSheets
#!/usr/bin/python

# Name: zipdir.py
# Version: 1.0
# Created: 2016-11-13
# Last modified: 2016-11-13
# Purpose: Creates a zip file given a directory where the files to be zipped
# are stored and the name of the output file. Don't include the .zip extension
# when specifying the zip file name.
# Usage: zipdir.py output_filename dir_name
# Note: if the output file name and directory are not specified on the
# command line, the script will prompt for them.

import sys, shutil

if len(sys.argv) == 1:
   dir_name = raw_input("Directory name: ")
   output_filename = raw_input("Zip file name: ")
elif len(sys.argv) == 3:
   output_filename = sys.argv[1]
   dir_name = sys.argv[2]
else:
   print "Incorrect number of arguments! Usage: zipdir.py output_filename dir_name"
   exit()

shutil.make_archive(output_filename, 'zip', dir_name)

Once I did all that and zipped the file back up, Roo was able to parse it and print the expected results in IRB and in my RSpec test case. Bingo! Scientific proof, basically.

I also took a stab at generating some “bad” XML files myself, since I didn’t want to check a real file with production data into our git repo if I could avoid it. I couldn’t get Numbers or Google Sheets to generate a self-closing XML tag in the output. They only generated valid open-and-close-style tags even for empty cells.

A little more investigation revealed the file came from an external web site’s reporting feature. The site was using Kendo UI with Angular on the front-end, which had a built-in export-to-Excel feature. Like most problems in web development, the blame falls on a horrible front-end framework reinventing the wheel with JavaScript.

Developing the fix

Since I couldn’t reproduce the exact path to make a “bad” file, I just wrote one myself. Made a quick, nearly-empty sheet in Numbers, exported it, and followed the steps above to change a valid <v></v> tag into an invalid <v/> tag. This I could check in to the repo along with an RSpec test reproducing the error condition. Now we could code something!

Since I had the stack trace, I could look at the file directly. The relevant class is here on Github, though I usually just use bundle show roo to inspect the exact code on my machine. Since there’s only one place with Integer() in the file, and this is basically another type of nil-check, it seemed like a pretty easy fix.

I’ve had bad luck getting fixes merged into open-source projects. Most maintainers seem overworked, short on time, and extremely particular about the kind of code they want in their repo. Even when you submit a full pull request that meets the maintainers’ expectations for code style, testing, documentation, and all the other ancillary considerations, it can take a lot of back-and-forth and usually weeks to get a PR merged. I didn’t think we should wait that long to get our process fixed, so I went with monkey-patching as a first approach.

I made the modifications you can see in the monkey-patch 🙈 above. I figured we’d probably want to check for this empty-string, no-format condition for both Float and Integer values, so it’s probably worth factoring out into a separate method. This would also make it easier to extend later with a rescue block, if needed.

Added the patch to our repo, ran the specs to make sure it worked, and put up a pull request on our app. A fix is never really done until it’s verified to be working on production, so once I got code review, merged & deployed, I re-ran the file to verify it produced the expected results instead of an error. Good to go!

Upstreaming

Since I had written an essentially-empty XLSX file to test on locally, I figured it would be easy to make a PR to Roo itself. I cloned their repo, and their test suite was in RSpec and MiniTest and mostly what I expected. They had a whole pile of files with a particular naming scheme, and an easy test harness to parse the file and check for expected results.

Converted the monkey-patch into a diff on the Roo::Excelx::Cell::Number class, added my empty file as a test case, and added a unit test for the class to boot. Maybe I could have written more tests to cover every possible branch condition, but I figured I’d get the PR up and see what the maintainers thought about the approach.

Sadly I can’t end it with “they loved my PR and merged it” - not sure why my PR got ghosted when others are getting reviewed and merged. But at least we got it working for our app, so ¯\_(ツ)_/¯

Wrap-up

Hope that helps somebody out there! I learned a lot just debugging this one bad spreadsheet, so I figured it was worth a write-up. If you have any questions or comments, feel free to email me about it, since social media has kinda fallen off a cliff lately. Best of luck, internet programmers!

You plug your shiny new eGPU into your MacBook Pro, and expect a game like Borderlands 3 from the Epic Games Store to run smoothly (30+ FPS) at medium settings on your 4k monitor. It’s a good graphics card, but when running the Benchmark animation, you’re still seeing drops into the 10-20 FPS range, and it stutters and becomes a slideshow. Is it a crappy port? Is your graphics card broken? No! Take heart, and follow these steps:

  1. Use a disk space calculator like DaisyDisk, and notice that there’s a huge folder at /Users/Shared/Epic\ Games/Borderlands3/ , where the Epic Games Store puts all of its games
  2. Open it in finder by running: open /Users/Shared/Epic\ Games/Borderlands3/ on the terminal
  3. ⌃+click (or right-click) on the Borderlands3 app in that folder and select “Get Info” from the menu
  4. If your eGPU is plugged in, you should see a checkbox that says “Prefer External GPU”, as explained in this Apple Support article. Check that box!
  5. Re-launch the game via the Epic Store. Your benchmarks should be markedly improved.

For some reason the game was trying to run using the MacBook Pro’s built-in mobile GPU instead of the honking, loud, industrial-strength graphics pipes of the eGPU. Checking that setting fixed it.

There’s a few things I don’t know about this setting:

  • will it get clobbered if Epic Games Store updates the game files?
  • is it retained after unplugging and re-plugging the eGPU?
  • is there any way to set it via command-line instead of in Finder?
  • can it be passed in as a command-line argument via Epic Game Store’s per-game “Additional Command Line Arguments” setting?

But at least this has solved my slideshow problem for now.

Sometimes, you’re testing a chat bot dingus, and you need a couple images so it can respond to @bot boop . I wanted more than 10, quickly, preferably without having to click through Google Image search and manually download them.

Giphy and Imgur both require you to sign up and make an OAuth app and blah blah blah before you can start using their API. Not worth it for a trivial one-off.

Turns out Reddit exposes a JSON version of any endpoint (just append .json to the URL) in addition to the normal web pages, and it’s publicly accessible. And Reddit has a whole community called /r/BoopableSnoots.

So I threw this command together:

curl "https://www.reddit.com/r/BoopableSnoots.json" | \
jq -c '.data.children | .[].data.url' | \
xargs -n 1 curl -O

What is this doing?

curl - sends a GET request to the specified URL and forwards the response body to stdout .

| - take the output of the previous command on stdout and feed it into the next command as stdin

jq -c '...' - jq is a command-line JSON parser and editor. This filter drills down the object structure returned by Reddit and returns the data.url field for each post; the .[] iterates across the array, outputting one URL per line. The -c flag keeps each result on a single line - the URLs still come out quoted, but xargs strips the surrounding quotes (jq -r would output raw, unquoted strings instead).

xargs - is a meta-command; it says “run the following command for each line of input on stdin.” It can run in parallel, or in a pool of workers, etc.

curl -O - sends a GET request to the specified URL and saves the response to the filesystem using the filename contained in the URL
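
If you’d rather keep it in Ruby (say, inside the bot itself), a rough standard-library equivalent might look like this. It’s an untested sketch - Reddit may also want a custom User-Agent if you hit it often:

require "net/http"
require "json"
require "uri"

# Same listing endpoint as the curl command above.
listing = JSON.parse(Net::HTTP.get(URI("https://www.reddit.com/r/BoopableSnoots.json")))

listing["data"]["children"].map { |post| post["data"]["url"] }.each do |url|
  uri = URI(url)
  filename = File.basename(uri.path)
  next if File.extname(filename).empty? # skip links that aren't direct image files

  # Like curl -O: save the response body under the filename from the URL.
  File.binwrite(filename, Net::HTTP.get(uri))
end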

Putting it all together:

  1. Get the ~25 most recent posts off Reddit’s BoopableSnoots community
  2. Filter down to just the images that people posted
  3. download those images to the current directory

This quickly got me some snoots for my bot to boop, and I could move on with my work.

<rant>

I tried to compile our mapbox-java sdk on my Macbook, and ran into a versioning error:

$ make build-config
./gradlew compileBuildConfig
Starting a Gradle Daemon (subsequent builds will be faster)

> Task :samples:compileBuildConfig FAILED
/Users/username/Workspace/mapbox-java/samples/build/gen/buildconfig/src/main/com/mapbox/sample/BuildConfig.java:4: error: cannot access Object
public final class BuildConfig
             ^
  bad class file: /modules/java.base/java/lang/Object.class
    class file has wrong version 56.0, should be 53.0
    Please remove or make sure it appears in the correct subdirectory of the classpath.
1 error

I had installed Java via Homebrew Cask, the normal way to install developer things on macOS. Running brew cask install java gets the java command all set up for you, but what version is that?

$ java -v
Unrecognized option: -v
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
# c'mon java, really :/ smh
$ java --version
openjdk version "12" 2019-03-19
OpenJDK Runtime Environment (build 12+33)
OpenJDK 64-Bit Server VM (build 12+33, mixed mode, sharing)

$ brew cask info java
java: 12.0.1,69cfe15208a647278a19ef0990eea691
https://jdk.java.net/
/usr/local/Caskroom/java/10.0.1,10:fb4372174a714e6b8c52526dc134031e (396.4MB)
/usr/local/Caskroom/java/12,33 (64B)
From: https://github.com/Homebrew/homebrew-cask/blob/master/Casks/java.rb
==> Name
OpenJDK Java Development Kit
==> Artifacts
jdk-12.0.1.jdk -> /Library/Java/JavaVirtualMachines/openjdk-12.0.1.jdk (Generic Artifact)

Which is cool. 12.0.1 , 12+33 and 56.0 are basically the same number.

So I guess I need a lower version of Java. No idea what version of Java will get me this 53.0 “class file,” but let’s try the last release. Multiple versions means you need a version manager, and it looks like jenv is Java’s version manager manager.

$ brew install jenv
$ eval "$(jenv init - zsh)"
$ jenv enable-plugin export
$ jenv add $(/usr/libexec/java_home)
$ jenv versions
* system (set by /Users/andrewevans/.jenv/version)
12
openjdk64-12

Jenv can’t build or install Java / OpenJDK versions for you, so you have to do that separately via Homebrew, then “add” those versions via jenv add /Some/System/Directory , because java. Also, the oh-my-zsh plugin doesn’t seem to quite work, as it doesn’t set the JAVA_HOME env var. I had to manually add the “jenv init” and “enable-plugin” to my shell init scripts.

Anyway, let’s try Java 11, as 11 is slightly less than 12 and 53 is slightly less than 56.

$ brew tap homebrew/cask-versions
$ brew cask install java11
$ jenv add /Library/Java/JavaVirtualMachines/openjdk-11.0.2.jdk/Contents/Home
$ jenv local 11.0
$ jenv shell 11.0

Had to add both of the latter jenv commands, as I guess jenv local only creates the .java-version file and doesn’t actually set JAVA_HOME . Sadly 11.0 is not 53.0 , so I still got basically the same error when I ran make build-config .

After asking my coworkers, I learned that Android and our mapbox-java repo use JDK 8. You could install this via a cask called, funnily enough, java8. Except Oracle torpedoed it. Sounds like they successfully ran the “embrace, extend, extinguish” playbook on the “open” OpenJDK (though I am not a Java and thus do not fully understand the insanity of these versions and licensing issues). tl;dr Homebrewers had to remove the java8 cask.

Homebrewers seemed to prefer AdoptOpenJDK, which is a perfectly cromulent name and doesn’t at all add to the confusion of the dozens of things named “Java.” So let’s get that installed:

$ brew cask install homebrew/cask-versions/adoptopenjdk8
$ jenv add /Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home
$ cd ~/Workspace/mapbox-java
$ jenv local 1.8
$ jenv shell 1.8 # apparently 'jenv local' wasn't enough??
$ jenv version
1.8 (set by /Directory/.java-version)
$ java -v
Unrecognized option: -v
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
# right, forgot about that, jeebus java please suck less
$ java --version
Unrecognized option: --version
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
# wtf java srsly?!
$ java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
# farking thank you finally
$ make build-config

This seemed to be the right version, and the make build-config command succeeded this time. JDK 8 and 1.8.0 and 53.0 are pretty similar numbers, so in retrospect this should’ve been obvious. And AdoptOpenJDK has more prefixes before “Java,” so I probably should’ve realized that was the “real” Java.

Anyway, now I can compile the SDK without having to have installed IntelliJ IDEA or Android Studio, which both seemed kinda monstrous and who knows what the hell they’d leave around my system. Goooooood times.

</rant>

I joined Mapbox roughly three months ago as a security engineer. I’d been a full-stack engineer for a little over ten years, and it was time for a change. Just made one CRUD form too many, I guess. I’ve always been mildly paranoid and highly interested in security, so I was delighted when I was offered the position.

It’s been an interesting switch. I was already doing a bunch of ops and internal tool engineering at Hired, and that work is similar to what our security team does. We are primarily a support team - we work to help the rest of the engineers keep everything secure by default. We’re collaborators and consultants, evangelists and educators. From the industry thought leaderers and thinkfluencers I read, that seems to be how good security teams operate.

That said, tons of things are just a little bit different. My coworkers come from a variety of backgrounds - full-stack dev, sysadmin, security researcher - and they tend to come at things from a slightly different angle than I do. Right when I joined, we made an app for our intranet and I thought “It’s an intranet app, no scaling needed!” My project lead corrected me, though: “Oh, some pentester is going to scan this, get a thousand 404 errors per hour and DDoS it.” That kinda thing has been really neat.

I thought it’d be good to list out some of the differences I’ve noticed:

code quality

I’ve worked on a lot of multi-year-old Rails monoliths. You read stuff like POODR and watch talks by Searls because it’s important your code scales with the team. Features change, plans change, users do something unexpected and now your architecture needs to turn upside-down. It’s worth refactoring regularly to keep velocity high.

In security, sure, it’s still important the rest of the team can read and maintain your code. But a lot of your work is one-off scripts or simple automations. Whole projects might be less than 200 LoC, and deployed as a single AWS lambda. Even if you write spaghetti code, at that point it’s simple enough to understand, and rewrite from scratch if necessary.

Fixes to larger & legacy projects are usually tweaks rather than overhauls, so there’s not much need to consider the overall architecture there, either.

scaling

Definitely less important than in full-stack development. Internal tooling rarely has to scale beyond the size of the company, so you’re not going to need HBase. You might need to monitor logs and metrics, but those are likely going to be handled already.

teamwork

Teamwork is, if anything, more important in security than it was in full-stack development. Previously I might have to chat with data science, front-end, or a specialist for some features. In security we need to be able to jump in anywhere and quickly get enough understanding to push out fixes. Even if we end up pushing all the coding to the developers, having a long iteration cycle and throwing code “over the wall” between teams is crappy. It’s much better if we can work more closely, pair program, and code review to make sure a fix is complete rather than catch-this-special-case.

You also need a lot of empathy and patience. Sometimes you end up being the jerk who’s blocking code or a deploy. Often you are dumping work on busy people. It can be very difficult to communicate with someone who doesn’t have a lot of security experience, about a vulnerability in a legacy project written in a language you don’t know.

technical challenge

I’m used to technical challenges like “how do we make the UI do this shiny animation?” Or “how do we scale this page to 100k reads / min?” The technical challenges I’ve faced so far have been nothing like that. They’ve been more like: “how do we encode a request to this server, such that when it makes a request to that server it hits this weird parsing bug in Node?” and subsequently “how do we automate that test?” Or “how do we dig through all our git repos for high-entropy strings?”

It’s not all fun and games, though. There’s plenty of work in filling out compliance forms, resetting passwords and permissions, and showing people how & why to use password managers. While not exactly rocket surgery, these are important things that improve the overall security posture of the company, so it’s satisfying that way.

deadlines

Most deadlines in consumer-facing engineering are fake. “We want to ship this by the end of the sprint” is not a real deadline. I often referred folks to the Voyager mission’s use of a once-per-175-years planetary alignment for comparison. In operations, you get some occasional “the site is slow / down,” but even then the goal was to slowly & incrementally improve systems such that the urgent things can be failed over and dealt with in good time.

In security the urgency feels a bit more real. Working with outside companies for things like penetration tests, compliance audits, and live hacking events means real legal and financial risk for running behind. “New RCE vulnerability in Nginx found” means a scramble to identify affected systems and see how quickly we can get a patch out. We have no idea how long we have before somebody starts causing measurable damage either for malicious purposes or just for the lulz.

learning

In full-stack engineering & ops, I would occasionally need to jump into a different language or framework to get something working. Usually I could get by with pretty limited knowledge: patching Redis into a data science system for caching, or fixing an unhandled edge-case in a frontend UI. I felt like I had a pretty deep knowledge of Ruby and some other core tools, and I could pick up whatever else I needed.

There’s a ton of learning any time you start at a new company: usually a new language, a new stack, legacy code and conventions. But throwing that out, I’ve been learning a ton about the lower-level functioning of computers and networks. Node and Python’s peculiar handling of Unicode, how to tack an EHLO request onto an HTTP GET request, and how to pick particular requests out of a flood of recorded network traffic.

Also seeing some of the methods of madness that hackers use in the real world: that thing you didn’t think would be a big deal because it’s rate-limited? They’ll script a cron job and let it run for months until it finds something and notifies them.

wrap-up

It’s been a blast, and I look forward to seeing what the next three months brings. I’m hopeful for more neat events, more learning, and maybe pulling some cool tricks of my own one of these days.

Security@ Notes

I went to HackerOne’s Security@ conference last week, and can vouch that it was pretty cool! Thanks to HackerOne for the invite and to Hired for leave to go around the corner from our building for the day.

My notes and main take-aways:

The name of the conference comes from an email inbox that every company should theoretically have. Ideally you’d have a real vulnerability disclosure program with a policy that lets hackers safely report vulnerabilities in your software. But not every company has the resources to manage that, so at least having a security@ email inbox can give you some warning.

As a company, you probably should not have a bug bounty program unless you are willing to dedicate the resources to managing it. To operate a successful bug bounty, you need to respond quickly to all reports and at least get them triaged. You should have a process in place to quickly fix vulnerabilities and get bounties paid. If hackers have reports sitting out there forever, it frustrates both parties and discourages working with the greater bounty community.

I was surprised during the panel with three of HackerOne’s top hackers (by bounty and reputation on their site). Two of them had full-time jobs in addition to pursuing bug bounties. They seemed to treat their hacking like a freelance gig on the side - pursue the quickest & most profitable bounties, and skip over low-rep or slow companies. Personally I find it difficult to imagine having the energy to research other companies for vulnerabilities after a full day’s work. But hey, if that’s your thing, awesome!

Natalie Silvanovich from Google’s Project Zero had a really interesting talk on how to reduce attack surface. It had a lot of similar themes to good product management in general: consider security when plotting a product’s roadmap, have a process for allocating time to security fixes, and spend time cleaning up code and keeping dependencies up to date. It’s easy to think that old code and old features aren’t hurting anyone: the support burden is low, the code isn’t getting in the way, and 3% of users love this feature, so why take time & get rid of it? Lowering your attack surface is a pretty good reason.

Coinbase’s CSO had an interesting note: the max payout from your bug bounty program is a proxy marker for how mature your program is. If your max bounty is $500, you’re probably paying enough bounties that $500 is all you can afford. They had recently raised their max bounty to $50,000 because they did not expect to be paying out a lot of high-risk bounties.

Fukuoka Ruby Night

Last Friday I also went to the Fukuoka Ruby Night. I guess the Fukuoka Prefecture is specifically taking an interest in fostering a tech and startup scene, which is pretty cool. They had talks from some interesting developers from Japan and SF, and they also brought in Matz for a talk. Overall a pretty cool evening.

Matz and another developer talked a bunch about mruby - the lightweight, fast, embeddable version of Ruby. It runs on an ultra-lightweight VM compiled for any architecture, and libraries are linked in rather than interpreted at runtime. I hadn’t heard much about it, and figured it was a thing for arduinos or whatever. Turns out it’s seen some more impressive use:

  • Yes, arduinos, Raspberry Pi’s, and other IoT platforms
  • MItamae - a lightweight Chef replacement distributed as a single binary
  • Nier Automata, a spiffy game for PS4 and PC

Matz didn’t have as much to say about Ruby 3. He specifically called out that if languages don’t get shiny new things, developers get bored and move on. I guess Ruby 3 will be a good counter-point to the “Ruby is Dead” meme. Ruby 3 will be largely backwards-compatible to avoid getting into quagmires like Ruby 1.9, Python 3, and PHP 6. They are shooting for 3x the performance of Ruby 2.x - the Ruby 3x3 project.

One way the core Ruby devs see for the language to evolve without breaking changes or fundamental shifts (such as a type system) is to focus on developer happiness. Building a language server into the core of Ruby 3 is one example that could drastically improve the tooling for developers.

He also talked about Duck Inference - an “80% compile time type checking” system. This could potentially catch a lot more type errors at compile time without requiring type hints, strict typing or other code-boilerplate rigamarole. Bonus: it would be fully backwards-compatible.

I’m a little skeptical - I personally find CTAGs and other auto-complete tools get in the way about as often as they help. For duck inferencing Matz mentioned saving type definitions and message trees into a separate file in the project directory, for manual tweaking as needed. Sounds like it could end up being pretty frustrating.

Guess we’ll see! Matz said the team’s goal is “before the end of this decade,” but to take that with a grain of salt. Good to see progress in the language and that Ruby continues to have a solid future.

Curses is a C library for terminal-based apps. If you are writing a screen-based app that runs in the terminal, curses (or the “newer” version, ncurses ) can be a huge help. There used to be an adapter for Ruby in the standard library, but since 2.1.0 it’s been moved into its own gem.

I took a crack at writing a small app with curses, and found the documentation and tutorials somewhat lacking. But after a bit of learning, and combining with the Verse and TTY gems, I think it came out kinda nice.

Here’s a screenshot of the app, which basically stays open and monitors a logfile:

[logwatch demo gif]

There are three sections - the left side is a messages pane, where the app will post “traffic alert” and “alert cleared” messages. The user can scroll that pane up and down with the arrow keys (or j/k if they’re a vim addict). On the right are two tables - the top one shows which sections of a web site are being hit most frequently. The bottom shows overall stats from the logs.

Here’s the code for it, and I’ll step through below and explain what does what:

require "curses"
require "tty-table"
require "logger"

module Logwatch
  class Window

    attr_reader :main, :messages, :top_sections, :stats

    def initialize
      Curses.init_screen
      Curses.curs_set 0 # invisible cursor
      Curses.noecho # don't echo keys entered

      @lines = []
      @pos = 0

      half_height = Curses.lines / 2 - 2
      half_width = Curses.cols / 2 - 3

      @messages = Curses::Window.new(Curses.lines, half_width, 0, 0)
      @messages.keypad true # translate function keys to Curses::Key constants
      @messages.nodelay = true # don't block waiting for keyboard input with getch
      @messages.refresh
      
      @top_sections = Curses::Window.new(half_height, half_width, 0, half_width)
      @top_sections.refresh
      
      @stats = Curses::Window.new(half_height, half_width, half_height, half_width)
      @stats << "Stats:"
      @stats.refresh
    end

    def handle_keyboard_input
      case @messages.getch
      when Curses::Key::UP, 'k'
        @pos -= 1 unless @pos <= 0
        paint_messages!
      when Curses::Key::DOWN, 'j'
        @pos += 1 unless @pos >= @lines.count - 1
        paint_messages!
      when 'q'
        exit(0)
      end
    end

    def print_msg(msg)
      @lines += Verse::Wrapping.new(msg).wrap(@messages.maxx - 10).split("\n")
      paint_messages!
    end

    def paint_messages!
      @pos ||= 0
      @messages.clear
      @messages.setpos(0, 0)
      @lines.slice(@pos, Curses.lines - 1).each { |line| @messages << "#{line}\n" }
      @messages.refresh
    end

    def update_top_sections(sections)
      table = TTY::Table.new header: ['Top Section', 'Hits'], rows: sections.to_a
      @top_sections.clear
      @top_sections.setpos(0, 0)
      @top_sections.addstr(table.render(:ascii, width: @top_sections.maxx - 2, resize: true))
      @top_sections.addstr("\nLast refresh: #{Time.now.strftime('%b %d %H:%M:%S')}")
      @top_sections.refresh
    end

    def update_stats(stats)
      table = TTY::Table.new header: ['Stats', ''], rows: stats.to_a
      @stats.clear
      @stats.setpos(0, 0)
      @stats.addstr(table.render(:ascii, width: @stats.maxx - 2, resize: true))
      @stats.addstr("\nLast refresh: #{Time.now.strftime('%b %d %H:%M:%S')}")
      @stats.refresh      
    end

    def teardown
      Curses.close_screen
    end

  end
end

Initialize

On initialize, we do some basic initialization of the curses gem - this will set up curses to handle all rendering to the terminal window.

Curses sets up a default Curses::Window object to handle rendering and listening for keyboard input, accessible from the stdscr method. This is where Curses.lines and Curses.cols come from, and represent the whole terminal.

I initially tried using the default window’s subwin method to set up the panes used by the app, but that proved to add a whole bunch of complication for no actual benefit. Long ago it may have provided a performance boost, but we’re well past that, I think.

Also tried using the Curses::Pad class so I wouldn’t have to handle scrolling myself, but that also had tons of wonky behavior. Rendering yourself isn’t that hard; save the trouble.

To handle keyboard input, we set keypad(true) on the messages window. We also set nodelay = true (yes, one is a method call, the other is assignment, no idea why) so we can call .getch but still update the screen while waiting for input.

We initialize the two stats windows mostly empty, then call refresh on all three to get them set up on the active terminal.

Main Render Loop

The class that loops and takes action isn’t the window manager, but the interface between them is pretty simple. There’s a loop that checks for updates from the log file, updates the stats data store, then calls the two render methods for the stat windows. It also tells the window manager to handle any keyboard input, and will call print_msg() if it needs to add an alert or anything to the main panel.
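
The driver ends up looking roughly like this. It’s a sketch with made-up collaborator names (StatsStore, LogTailer), not the actual Logwatch classes:

window = Logwatch::Window.new
stats  = StatsStore.new               # hypothetical: aggregates hits per section
tailer = LogTailer.new("access.log")  # hypothetical: yields new log lines since last check

begin
  loop do
    tailer.new_lines.each { |line| stats.record(line) }
    window.print_msg("Traffic alert: #{stats.alert}") if stats.alert?

    window.update_top_sections(stats.top_sections)
    window.update_stats(stats.totals)

    window.handle_keyboard_input # non-blocking thanks to nodelay; 'q' exits
    sleep 0.1
  end
ensure
  window.teardown # always restore the terminal, even on Ctrl-C or an exception
end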

The main way to get text onto the screen is to call addstr() or << on a Curses::Window , then call refresh() to paint the buffer to the screen.

The Window has a cursor, and it will add each character from the string and advance that, just like in a text editor. It tries to do a lot of other stuff; if you add characters beyond what the screen can show, it will scroll right and hide the first n columns. If you draw too many lines it will scroll down and provide no way to scroll back up. I tried dealing with scrl() and scroll() methods and such, but could never get the behavior working well. In the end, I did it manually.

I used the verse gem to wrap lines of text so that we never wrote past the window boundaries. The window manager keeps an array of all lines that have been printed during the program, and a position variable representing how far we’ve scrolled down in the buffer. On each update it:

  1. clears the Curses::Window buffer
  2. moves the cursor back to (0,0)
  3. prints the lines within range to the Curses::Window
  4. calls refresh() to paint the Curses::Window buffer to the screen

The stats windows are basically the same. I used the TTY::Table gem from the tty-gems collection to handle rendering the calculated stats into pretty ASCII tables.

Teardown

The teardown method clears the screen, which resets the terminal to non-visual mode. The handle_keyboard_input method calls exit(0) when a user wants to quit, but the larger program also handles the interrupt signal and uses an ensure block so the teardown method always gets called.

Wrap

Hope that’s helpful! I had the wrong model of how all this stuff worked in my head for most of the development of this simple app. Maybe having what I eventually figured out laid out here will be useful.
