Working In The Open

screenshot of @nova live-debugging Hachyderm on Twitch

The other day my daughter asked me about Software Engineering -- what do we actually do? She's eighteen and probably won't follow in my footsteps, which is fine, but I still want her to see my field.

I've always found this question hard to answer. I've been an engineering manager for a long time, and I'm happy to describe that job (emails, 1:1s, PRD reviews), but I don't think that's the heart of it. Plus these days I work for Google, where things are proprietary and deeply layered, which isn't much help for answering questions like these.

Recently I've found my way to Mastodon for obvious reasons. I chose the Hachyderm instance because it was well run by people who shared my values. It turns out that the values of the owners and operators of the things we use matter, huh.

I then learned a cool thing, that the Hachyderm admins do much of their work publicly. They livestream debugging sessions on Twitch, they write postmortems, they share live graphs. @nova happened to be live-streaming at that moment, not surprising since the team's been busy absorbing thousands of new users and fending off attacks. My daughter and I watched a bit together.

Team Hachyderm (@nova @dma @quintessence @Taniwha @hazelweakly @malte): thank you for running this service well. But also thank you for giving me something I'm proud to use and proud to show my kid.

The NY Times Bee Puzzle

How many different New York Times Spelling Bee puzzles are there? Or more precisely, how many combinations of seven letters can be used to build Bee-type puzzles?

It turns out 7,742 different seven-letter combinations can be used to generate Bee-style puzzles. There are more puzzles themselves, since each combination makes a different puzzle depending on which letter is chosen for the middle spot.

The majority of letter choices, about 62%, have just one pangram. That's lower than I expected, actually. It's not that uncommon to have two or three pangrams, which happens about 25% of the time, and nearly four out of ten puzzles will have more than one pangram. The full output is here, generated by this program.

Watch out for the combination einprst. If this one ever comes up, good luck finding all twenty-seven of its pangrams.
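Here's a minimal sketch of how a count like that can be made. It is not the program linked above, just an illustration under my own assumptions: the function names `pangram_counts` and `histogram` are made up, and the word list is whatever iterable of words you hand it. The idea is that any word with exactly seven distinct letters is a pangram for that letter set, so grouping such words by their sorted letters gives pangrams per combination.

```python
from collections import Counter


def pangram_counts(words):
    """Map each seven-letter set (as a sorted string) to its number of pangrams.

    `words` is any iterable of candidate words (the word list is an assumption).
    """
    counts = Counter()
    for word in words:
        w = word.lower()
        if not (w.isascii() and w.isalpha()):
            continue  # skip hyphenated words, apostrophes, accents
        letters = set(w)
        if len(letters) == 7:  # exactly seven distinct letters: a pangram
            counts["".join(sorted(letters))] += 1
    return counts


def histogram(counts):
    """How many letter sets have exactly 1, 2, 3, ... pangrams."""
    return Counter(counts.values())


sample = ["amphibian", "printers", "sprinter", "reprints", "cat"]
counts = pangram_counts(sample)
print(counts["einprst"])        # 3
print(dict(histogram(counts)))  # {1: 1, 3: 1}
```

With a full dictionary, `len(counts)` would be the number of usable letter combinations, and the exact figure depends entirely on which word list you feed it.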

The Bee Puzzle

Example Times Bee Puzzle

A Bee puzzle has seven letters with one "special" letter in the middle. Make as many words as you can find with at least four letters, using only the letters given, and each word has to use the center letter. Proper nouns aren't allowed. Every puzzle has at least one pangram, a word that uses all seven letters; this example's is amphibian. Wikipedia cites Frank Longo as the Bee's creator. has more on today's puzzle and some interesting stats about these puzzles in general. They don't seem to be affiliated with the NY Times but that seems to be OK and it's a nice site.
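The rules are simple enough to sketch in a few lines of Python. This is only an illustration: `is_valid` and `solve` are names I made up, the word list is assumed to already exclude proper nouns (code can't easily tell), and the center letter "a" in the example is a guess since the text above doesn't say which letter sits in the middle.

```python
def is_valid(word, letters, center):
    """Check one word against the rules: 4+ letters, uses the center letter,
    and uses only the seven given letters (repeats are fine)."""
    w = word.lower()
    return len(w) >= 4 and center in w and set(w) <= set(letters)


def solve(words, letters, center):
    """Return (all valid words, the pangrams among them), both sorted."""
    valid = sorted(w.lower() for w in words if is_valid(w, letters, center))
    pangrams = [w for w in valid if set(w) == set(letters)]
    return valid, pangrams


# Letters from the amphibian example; the center letter "a" is an assumption.
valid, pangrams = solve(
    ["amphibian", "main", "hip", "pimp", "banana"], "abhimnp", "a"
)
print(valid)     # ['amphibian', 'banana', 'main']
print(pangrams)  # ['amphibian']
```

"hip" is too short and "pimp" skips the center letter, so both drop out.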


Trick-or-treater volume at our Menlo Park house this Halloween was basically flat compared to last year. Last year we had 208 trick-or-treaters, this year 211. We remain down quite a bit from our 2012 peak. Maybe this is the new norm?

Everything happened later this year. Our first trick-or-treater didn't show up until 6:30, and peak wasn't until 8:45. That's thirty minutes or more later than prior years. I speculate it's because it was a warm weekend night. Why not stay out a bit later, no school tomorrow. And with the fall-back DST change everyone would be looking forward to a "free" hour of sleep.

As usual, the full story can be seen in the numbers. Check it out!



I'm excited to start my new job at YouTube in a few weeks. I'll manage the engineering team building the data warehouse for usage metrics.

I like that YouTube is important. It's firmly a part of our culture and I'm sure it will be how my kids watch video. YouTube's impressive statistics are the result. You don't see usage like that without a bunch of hard problems, and hard problems attract bright people. Indeed that's the clincher for why I'm looking forward to working there. People vote with their feet, and I have a lot of friends who have opted for Google, and YouTube specifically. They tell me that it's a great place to work.

YouTube is one of the world's foremost platforms for social commentary, education, and free speech. And it's plenty of entertainment too. Sounds like fun.

Thick Apps Still Lose

Microsoft Excel 2016 Error Message

Thick apps won mobile. Fine.

On laptop (and desktop) it's not so clear. What is better, thick or thin? I tend to live mostly in thin land, although I use some thick apps regularly, like Twitter's Mac client and Apple Photos.

Every so often I give a big native app a try: Excel instead of Google Sheets, instead of Gmail, Reminders instead of the barebones Tasks built into Gmail. (I can't bring myself to try Word). But it's disappointing to see how those fancy apps keep shooting themselves in the foot!

Take for example this Excel error message. Excel whined that it couldn't verify my subscription the first time I ran it untethered (version 15.11.2, for what it's worth). Sure, you can click through the warning, but would a newbie know to do that? At best off-putting, at worst downright disorienting. Why warn me of this at all? And why in a modal that stops me dead in my tracks?

It seems thick apps should win. They rock the unplugged use case. An even better situation is flaky networks -- tethered, conference WiFi, travelling. UIs deal notoriously poorly with intermittent or partial outages. A thick client, relying on that connection only for hitting APIs, can hide the network.

Another place they should shine is the UI itself. They should be fast, beautiful, and featureful. Too often they're not. For example I find to be clunky, difficult to customize, and its keyboard shortcuts few and poorly done. Gmail is pretty good!

Finally there's the upgrade problem. Thick apps need conscious effort from their users to upgrade, and only then does the developers' work see user time and earn feedback. And that's what drives innovation. Long cycles mean slower (less) invention. One example I love is Gmail's "undo send" feature. Boy, you sure do miss that when you need it and it's not there! That should be on every thick client by now, but I don't think it is. I do know that Gmail has it and still doesn't.

Maybe the Internet can help. Look at Chrome with its awesome auto updates. What makes this work is solid engineering and exceptional quality control. I've never seen behind the Google curtain, but I bet there's no magic, just a lot of good engineering that leads to good software. Like: good design and code reviews, tons of test coverage across many scenarios, diverse and well-instrumented canaries, and thorough performance and resource use testing. If Google didn't do all of that so well, then we wouldn't accept frequent pushes. Without the frequent upgrade cycle, Chrome's feature cycle would languish.

Electron is another bright spot. This is the framework that gives Slack and GitHub's thick clients their fit and finish. It makes these feel like true native apps, even though they are mostly web controls with JavaScript under the covers. Right-clicking still doesn't do what I want, and text controls are finicky, but it's close. But what those rough edges buy you, and the software producer, are frequent, reliable, and clean upgrades.

My natural preference would be for thick apps. If they were done well, I'd use them.

My Next Job


I left my last job a few weeks back and it's high time to look for a new one. If you're working on something interesting and think I could help, let me know!

It's nice to not have a day job while looking for another. I was lucky enough to do this once before in 2012, which turned out great. I learned then that time and flexibility let you talk to lots of friends and learn about a breadth of projects. I found a fun project in a new domain (online education), something I doubt I'd have found the normal way.

Maybe I'll get lucky again.

Enough small talk, what am I looking for?

I'm looking for some flavor of line manager. I'm a good senior manager and code-every-day engineer, but I'm exceptional at leading a team and running a project. That's what line managers do: lead engineers, not other managers or departments or matrix-anything. Also, if you're some kind of executive then coding is an indulgence, and I'd rather it just be part of my job. Mostly I'm talking to small companies, say 10-100 people (fun-size).

I want to build on my experience. I know infrastructure and cloud, SaaS and enterprise, and online education. I'm probably not the best person for your storage, security, gaming, e-commerce, or cryptocurrency company. I want to stay working on Internet technology. I like the (micro)services model. For my own projects I choose Python, JavaScript (frontend and backend), and Java. I know web operations, especially the Amazon stack.

Location is important: I don't want to do a daily Menlo Park to San Francisco round-trip. I'd like to work with friends if possible. And I want to do something worthwhile.

You can always get to my resume from the header here, or via this short link. I'm open to a bunch of things, just no kick boxing. Let's have coffee/drink or take a walk.

Lessons from Three Years in AWS

AWS Logo

I've spent the last three years building and operating web sites with Amazon Web Services and here are a few lessons I've learned. But I first have to come clean that I'm a fan of AWS with only casual experience with other IaaS/PaaS platforms.

S3 Is Amazing. They made the right engineering choices and compromises: cheap, practically infinitely scalable, fast enough, with good availability. $0.03/GB/mo covers up a lot of sins. Knowing it's there changes how you build systems.

IAM Machine Roles From The Start. IAM with Instance Metadata is a powerful way to manage secrets and rights. Trouble is, you can't add a role to an existing machine. Provision with machine roles in big categories (e.g. app servers, utility machines, databases) at the start, even if just placeholders.

Availability Zones Are Only Mostly Decoupled. After the 2011 us-east-1 outage we were reassured that a coordinated outage wouldn't happen again, but it happened again just last month.

They Will Lock You In And You'll Like It. Their secondary services work well, are cheap, and are handy. I'm speaking of SQS, SES, Glacier, even Elastic Transcoder. Who wants to run a durable queue again?

CloudFormation No. It's tough to get right. My objection isn't programming in YAML, I don't mind writing Ansible plays, it's the complexity/structure of CloudFormation that is impenetrable. Plus even if you get it working once, you'd never run it again on something that is running.

Boto Yes. Powerful and expressive. Don't script the CLI, use Boto. Easy as pie.

Qualify Machines Before Use. Some VMs have lousy networking, presumably due to a chatty same-host or same-rack neighbor. Test for loss and latency to other hosts you own and to EBS. (I've used home-grown scripts, don't know of a standard open-source widget, someone should write one).
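To give a flavor of what such a home-grown check might look like, here's a rough sketch. It is not the script I used. It times TCP connects to another host you own, which is only a stand-in for real loss and latency measurement (a failed connect is a crude proxy for packet loss). The names `probe` and `qualifies`, and the thresholds, are all made up and need tuning for your own network.

```python
import socket
import statistics
import time


def probe(host, port, attempts=20, timeout=1.0):
    """Time TCP connects to host:port. Returns a list of round-trip times in
    milliseconds, with None marking an attempt that failed or timed out."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            samples.append(None)
    return samples


def qualifies(samples, max_loss=0.05, max_median_ms=2.0):
    """Accept a machine only if few probes failed and the median is low.
    Both thresholds are illustrative assumptions, not recommendations."""
    ok = [s for s in samples if s is not None]
    if not samples or not ok:
        return False
    loss = 1 - len(ok) / len(samples)
    return loss <= max_loss and statistics.median(ok) <= max_median_ms


print(qualifies([0.5, 0.7, 0.6, 0.4]))    # True
print(qualifies([0.5, None, None, 0.6]))  # False: half the probes failed
```

In practice you'd call something like `qualifies(probe(peer, 22))` against a few peers right after boot, and throw the VM back if it fails.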

VPC Yes. If you have machines talking to each other (i.e. not a lone machine doing something lonely) then put them in a VPC. It's not hard.

NAT No. You think that'll improve security, but it will just introduce SPOFs and capacity chokepoints. Give your machines publicly routable IPs and use security groups.

Network ACLs Are A Pain. Try to get as far as you can with just security groups.

You'll Peer VPCs Someday. Choose non-overlapping subnet IP ranges at the start. It's hard to change later.
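Checking for overlap up front is cheap. Here's a small sketch using Python's standard `ipaddress` module; the function name `overlapping_pairs` and the example CIDR blocks are made up for illustration.

```python
import ipaddress
from itertools import combinations


def overlapping_pairs(cidrs):
    """Return every pair of CIDR blocks that share at least one address."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]


# Hypothetical ranges for three VPCs you might want to peer someday.
planned = ["10.0.0.0/16", "10.1.0.0/16", "10.1.128.0/17"]
print(overlapping_pairs(planned))
# [('10.1.0.0/16', '10.1.128.0/17')]
```

An empty list means the ranges are safe to peer as far as addressing goes.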

Spot Instances Are Tricky. They're only for a very specific use case that likely isn't yours. Setting up a test network? You can spend the money you save by using spot on swear jar fees.

Pick a Management Toolset. Ansible, Chef, all those things aren't all that different when it comes down to it. Just don't dither back and forth. There's a little bit of extra Chef love w/ AWS but not enough to tip the scales in your decision I'd reckon.

Tech Support Is Terrible. My last little startup didn't get much out of the business-level tech support we bought. We bought it so we could call in for help when we needed it, and we did use it to escalate some problems. It was nice to have a number to call when I urgently needed to up a system limit, say. But debugging something real, like a networking problem? Pretty rough.

...Unless You Are Big. Stanford, on the other hand, had a named rep who was responsive and helpful. I guess she was sales, but I used her freely on support issues and she worked the backchannels for us. Presumably this is what any big/important customer would get, that's just not you, sorry.

The Real Power Is On Demand. I'm reaffirming cloud koolaid here. Running this way lets you build and run systems differently, much better. I've relied on this to bring up emergency capacity. I've used it to convert a class of machines on the fly to the double-price double-RAM tier when hitting a surprising capacity crunch. There's a whole class of problems that get much easier when you can have 2x the machines for just a little while. When someone comes to you with that cost/benefit spreadsheet arguing why you should self-host, that's when you need your file of "the cloud saved my bacon" stories at the ready.