Old McDonald Celebrities

This tweet got me wondering: how many others meet the same criteria?

Googling around turned up a Stack Overflow question where someone helpfully harvested a list of about 4,000 people, which is good enough for this. Grepping turned up four candidates with the vowels in the right order, but with confounding ones.

Celine Dion
Eddie Cleanhead Vinson
Jennifer Aniston
Leslie Twiggy Lawson

Eddie "Cleanhead" Vinson fits the bill if we don't consider his nickname. He was a jazz musician who lived from 1917 to 1988. I think I'll check him out. And TIL Twiggy's real name.
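For anyone who wants to repeat the search, here is a minimal sketch of the vowel check. It assumes the target pattern is "eieio", that only a, e, i, o and u count as vowels (so "y" is ignored), and the helper names are made up for illustration. It is not the grep that was actually run.

```python
VOWELS = set("aeiou")  # assumption: "y" is not counted as a vowel


def vowels_of(name: str) -> str:
    """Return the vowels of a name, lowercased, in order of appearance."""
    return "".join(ch for ch in name.lower() if ch in VOWELS)


def has_in_order(name: str, pattern: str = "eieio") -> bool:
    """True if the pattern's vowels appear in order, allowing extras in between."""
    remaining = iter(vowels_of(name))
    # "v in remaining" consumes the iterator up to the first match,
    # so each pattern vowel must be found after the previous one.
    return all(v in remaining for v in pattern)


def is_exact(name: str, pattern: str = "eieio") -> bool:
    """True if the name's vowels are exactly the pattern, with nothing extra."""
    return vowels_of(name) == pattern


if __name__ == "__main__":
    for name in ["Jennifer Aniston", "Eddie Cleanhead Vinson", "Eddie Vinson"]:
        print(name, vowels_of(name), has_in_order(name), is_exact(name))
```

The loose check (`has_in_order`) corresponds to "right order, but with confounding ones", while the strict check (`is_exact`) only passes when there are no extra vowels.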

Also: Thank you, George Takei, for finding worthwhile things on X and crossposting to Mastodon. You are a national treasure.

Oppenheimer -- A Miss

Stetson Hat

I was so disappointed in Oppenheimer. It wasn't terrible, but what a missed opportunity.

Mystifying. Part of the job of a biopic is to give you insight into what a person is thinking and feeling. Especially for this movie, which is basically about this question: what would it be like to be the person who made these terrible weapons? I understand he's complicated, and it's complicated. But instead of dialog, or even some outright explaining, this movie leans on shots of Cillian Murphy staring into the distance or adjusting his hat.

Women. There were plenty of interesting women they could have brought in, either contemporaries, or from academia, or the Manhattan Project itself. There are only two women in the movie: one is a sex object with mental health issues (Pugh); the other is a mother/lush with a surprisingly sharp wit, but only for a minute (Blunt). The sex scene, oddly using Oppenheimer's most famous line, was gratuitous and super weird.

Strauss. I'm not invested at all in the conflict with Admiral Strauss. Paraphrasing the good folks at The Incomparable, "I don't care if some guy becomes Commerce Secretary in the Eisenhower Cabinet."

Given how much I like the history and the science of this period, I was really pulling for this movie. Dang.

Cleaning Up Photo Duplicates

Pie Chart

I did a data cleanup over the weekend. I doubt this is interesting for anyone else; I just wanted to capture my own notes.

Our family photos are stored in Google Photos. We have about 100k photos and short movies that take up 0.5 TB. I wanted to back them up to local storage, and I also liked the idea of trying the M-DISC format for long-term storage.

I ran Takeout and unzipped everything. I was surprised to see so many duplicates. Some are copies in the same folders with suffixes like "(1)"; most are multiple copies in different folders. Takeout seems to just store a copy for each album a photo is in.

Some poking at the data shows 88% of 302,755 files / 72% of the bytes were unique (histogram at the bottom of the post). Removing dups can save 200+ GB. Sure, it's not really worth the trouble, but why not.

Steps

1. On the NAS, gather file checksums (md5sum) and sizes for all the files under Takeout/Google Files. Mostly variations on find -print0 | xargs -0.

2. Data cleanup to get into nice file paths and clean delimiters, mostly interactively with vi.

3. Insert into sqlite using .separator and .import.

Here's what the database ended up looking like: Photos and Sizes are inputs, HashCounts, Duplicates, and Candidates are outputs.

CREATE TABLE Photos(hash TEXT,path TEXT);
CREATE TABLE Sizes(size INT,path TEXT);
CREATE TABLE HashCounts(hash TEXT,found INT);
CREATE TABLE Duplicates(path TEXT,hash TEXT,found INT,size INT);
CREATE TABLE Candidates(path TEXT,hash TEXT,size INT,pos INT);
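Steps 1 through 3 could also be collapsed into one script. The following is a minimal sketch, not what was actually run: it swaps the find/xargs/md5sum pipeline and the .import step for Python's hashlib and sqlite3 modules, and the function names (`md5_of`, `scan`, `load`) are hypothetical. It fills the same Photos and Sizes input tables.

```python
import hashlib
import os
import sqlite3


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large movies are not read into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root):
    """Yield (md5 hex digest, size in bytes, path) for every file under root."""
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            yield md5_of(path), os.path.getsize(path), path


def load(conn, root):
    """Fill the Photos and Sizes input tables; returns the number of files seen."""
    conn.execute("CREATE TABLE IF NOT EXISTS Photos(hash TEXT, path TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS Sizes(size INT, path TEXT)")
    rows = list(scan(root))
    conn.executemany("INSERT INTO Photos VALUES (?, ?)", [(h, p) for h, _s, p in rows])
    conn.executemany("INSERT INTO Sizes VALUES (?, ?)", [(s, p) for _h, s, p in rows])
    conn.commit()
    return len(rows)
```

Because the rows are inserted with bound parameters, there is no delimiter cleanup to do, which is the part of the manual route that needed vi.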

4. Find the dups. I'm surprised to find some with 50 or more copies, but it turns out some family favorites end up in lots of albums.

insert into HashCounts
select hash, sum(1) as found from Photos group by hash;
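To see what the grouping produces, here is a small sketch on made-up data, run through Python's sqlite3 module against an in-memory database. The hashes and paths are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Photos(hash TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO Photos VALUES (?, ?)",
    [
        # Invented rows: one photo that lives in three places, one that is unique.
        ("aaa", "2019/beach.jpg"),
        ("aaa", "Favorites/beach.jpg"),
        ("aaa", "Favorites/beach(1).jpg"),
        ("bbb", "2020/cake.jpg"),
    ],
)

# Count how many paths share each hash.
hash_counts = dict(conn.execute("SELECT hash, count(*) FROM Photos GROUP BY hash"))

# Anything seen more than once has duplicates.
duplicated = {h: n for h, n in hash_counts.items() if n > 1}
print(duplicated)
```

Any hash with a count above one marks a set of byte-identical files, which is what the next step joins against.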

5. Figure out candidates for deletion

insert into Duplicates
select Photos.path, Photos.hash, HashCounts.found, Sizes.size
from Photos, HashCounts, Sizes
where HashCounts.hash=Photos.hash and Photos.path=Sizes.path
and HashCounts.found > 1;

The first of each set is the one we'll keep.

insert into Candidates
select path, hash, size, pos
from (select path, hash, size,
        row_number() over (partition by hash order by path desc) as pos
    from Duplicates
) where pos > 1;

I was surprised that SQLite supports window functions, nice.

The inner reverse-alpha sort on "path" takes care of two cases. I tend to prefer keeping photos with names that start with years, and those come first alphabetically (nice). Also, within folders there are often many copies with "(1)" and "(2)" suffixes that are generally cruft and most worthy of removing, and those also sort last alphabetically (nice).
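The keep-one-per-hash idea can be tried on toy data before touching real files. This is a sketch under a few assumptions: the rows are invented, the ORDER BY sits inside the window so the numbering does not depend on the order of an inner query, and the bundled SQLite is 3.25 or newer (the first version with window functions).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Duplicates(path TEXT, hash TEXT, found INT, size INT)")
conn.executemany(
    "INSERT INTO Duplicates VALUES (?, ?, ?, ?)",
    [
        # Invented rows: three copies of one photo, two of another.
        ("b/pic.jpg", "aaa", 3, 100),
        ("a/pic.jpg", "aaa", 3, 100),
        ("c/pic.jpg", "aaa", 3, 100),
        ("x/cat.jpg", "bbb", 2, 50),
        ("y/cat.jpg", "bbb", 2, 50),
    ],
)

# Number the copies within each hash, reverse-alphabetically by path.
rows = conn.execute(
    """
    SELECT path, hash, size, pos FROM (
        SELECT path, hash, size,
               row_number() OVER (PARTITION BY hash ORDER BY path DESC) AS pos
        FROM Duplicates
    )
    ORDER BY hash, pos
    """
).fetchall()

keep = [path for path, _hash, _size, pos in rows if pos == 1]
delete = [path for path, _hash, _size, pos in rows if pos > 1]
bytes_saved = sum(size for _path, _hash, size, pos in rows if pos > 1)
print(keep, delete, bytes_saved)
```

Listing the `keep` set next to the `delete` set is a cheap sanity check: every hash should appear exactly once among the keepers.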

6. Dump out the "Candidates" using .output. Copy back to the NAS. Do lots of spot checks. Convert to a bash script of rm commands, run very carefully.
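Converting paths into rm commands is where quoting mistakes bite, since Takeout paths contain spaces and parentheses. A minimal sketch of one way to generate the script, assuming the candidate paths are already in a Python list (the function name `rm_script` is hypothetical):

```python
import shlex


def rm_script(paths):
    """Build a bash script with one safely quoted rm command per path."""
    # "set -eu" makes the script stop at the first failed command,
    # which is the cautious choice for a destructive script.
    lines = ["#!/bin/bash", "set -eu"]
    for path in paths:
        # "--" ends option parsing, so a path starting with "-" is not read as a flag.
        # shlex.quote wraps anything with spaces, quotes or parentheses in single quotes.
        lines.append("rm -- " + shlex.quote(path))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(rm_script(["Takeout/My Album/pic (1).jpg", "plain.jpg"]))
```

Writing the script to a file and reading it before running it keeps the spot-check step in the loop.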

Tools

SQLite is my go-to tool for ad-hoc work like this. It's fast and simple, but only for small jobs -- this one is MB-scale, about 57 MB in total.

select name, sum(pgsize) from dbstat group by name;
name           sum(pgsize)
-------------  -----------
Candidates     3006464
Duplicates     5541888
HashCounts     11051008
Photos         23240704
Sizes          13807616
sqlite_schema  4096

My Synology is a pretty good place for storage, with the ability to ssh in and run local commands. If I had to do this again, though, I would just buy a large locally connected SSD. All the transfers to and from the NAS were a hassle. Looking now, I'm stunned that you can get a 4 TB external SSD for under $300.

Some private notes here.

Histogram (source)

My Love Letter to ATP

ATP Podcast Art

I listen to a fair number of podcasts. The Accidental Tech Podcast is my favorite by a mile.

It's the one that I look forward to every week. It comes out on Thursdays, often right when I'm leaving work. It's a sign that the weekend is right around the corner.

I'll get some of the reasons why out of the way quickly, the reasons particular to me and my tastes.

  • The topic, the world of technology and all things around it, is right in my wheelhouse. It's my profession, but also my hobby. That's what got me in the door. If this stuff isn't your thing, then this podcast won't be either.
  • They talk a lot about Apple products and that ecosystem, which is where I spend my personal time.
  • But they also cover a lot of issues that touch on the world of technology: business, law & politics, companies, social media. I like their descriptions, their take, their rants.
  • They end up talking a lot about personal tech too. Like how to manage family photos and backups. The hard, fussy stuff that ends up taking up so much of our lives. For example, what's the best way to help parents deal with passwords. Hard stuff!
  • I especially like tech-adjacent topics like home audio/video, home automation, and gaming.

But what I appreciate more than the content of ATP are the hosts and the care they take to produce a good show.

  • They are friends. It's nice hanging out with people who like each other. It's kind of like we have permission to eavesdrop and be a part of that.
  • They're not afraid to discuss their lives. Stuff like families and work/life stress. This is good stuff for me because I'm in the same life situation as them: middle-aged tech dads. But bringing their whole selves to the show is a bit of vulnerability that I appreciate.
  • ATP is exceptionally well produced. I didn't appreciate this until I'd heard so many other podcasts that are produced terribly, with bad recordings or poor mixing. Usually you have to turn to the fancy corporate podcasts for that quality, but those also have lots of ads and are usually different kinds of shows.
  • And the show is so well edited. They edit it without losing any of the content or the pacing, while sparing us all the awkward "um where were we" and technical futzing.
  • They don't talk over each other. Part of this is discipline, part of this is just politeness. But also now that I've listened to the pre-edit "bootleg" a couple of times, I've come to appreciate how much of this is also fixed in the edit. Nice job Marco.
  • They are respectful. They don't put people down, they aren't mean. When they kid each other it's in good fun.
  • When (rarely) they wade into social justice or world events, they do so respectfully and thoughtfully. They understand that as three cisgender white guys, it's good to have views to share, but also right to listen and help others.
  • Their show is reliable. It's great that they keep to a regular format and schedule. It fits into the rest of your life and becomes something you count on.
  • Finally, some small things: Yay for proper use of chapter markers; hardly anybody else uses them properly. And I like their occasional forays into car talk, just because I like car stuff too.

Although they joke about how much of the show is devoted to feedback, it's one of my favorite parts. It shows that they listen and are learning. And they share that learning with all of us. A recent example was six minutes or so into Episode 570, when I learned that home battery systems, when full, shift the frequency of the AC power as a signal to the solar panels to back off. Fascinating!

I like the members-only specials. It's OK with me that these are only available to members. They've been candid lately about ad revenue drying up and they could use the extra revenue channel. These episodes are a nice way to reward members. They've managed to do these without compromising the core show.

The best thing they ever did on ATP was getting a sponsor to send John toasters to review. And review he did! I didn't appreciate how many bad toasters there are, and how they can be bad for so many reasons. My favorite reason was poor knob feel.

Nice work John, Marco, and Casey. Please keep it up for a long, long time.

The iPhone's SIM Tray Went Away Too Soon

SIM

If you've traveled internationally, you've likely used a SIM card for local data and calls. There is a nice ecosystem around SIMs with a wealth of easy and affordable Pay As You Go (PAYG) options.

But the newer iPhones did away with the SIM slot in favor of some new eSIM hotness. Apple has all kinds of info claiming they have good international support, but I found reality falls short.

  • Only a few carriers support eSIMs,
  • The few that do require a contract. A tourist or student studying abroad is better served by a PAYG plan, and
  • Even if you can stomach a contract, that would require a UK bank account; no way to easily pull that off.

We ended up falling back to international roaming. It works but is expensive.

I think Apple made the wrong call removing the trusty old SIM tray. Clearly the new models can be made to work well with it, since that is how they are sold in the UK. If you're unlucky enough to have bought your recent iPhone in the US, you're out of luck.

I think this is an example of Apple's bad tendency to sometimes choose form over function, "courage" over usability.

The Big Dig

Big Dig podcast title art (WGBH)

I just finished the Big Dig podcast and it's worth a listen. It's about Boston's Central Artery/Tunnel Project, the most expensive highway project in US history. They cover the whole story, from conception, to getting it approved, to years of execution, and then the fallout from cost overruns and mismanagement. The podcast is well produced and has a lot of primary-source interviews.

I have a little bit of a personal connection, since I lived in Boston during the project's later years when it was on the news all the time. I even toured a part under construction.

But what I find most compelling is how it connects to the question: can America build big infrastructure anymore? It's something I think a lot about. We benefit from the giant projects of the past (dams, bridges, interstates) but can't maintain them properly; we struggle to take on new things like high-speed rail.

They place most of the blame on headwinds that didn't exist in the public-works heyday of the twentieth century. Leaders are under more scrutiny; projects fall under a bunch of regulations intended to protect the environment and workers. Public commentary slows things down.

The problem is, while the old way was easier, it also caused a lot of harm. I learned the story of the Cambridge and East Boston families that stopped interstate projects that would have leveled their homes and neighborhoods. I know those places well. I used to own a home right where one of those roads was supposed to have been. I sure am grateful to those protesters who won!

It ends on a hopeful note. Not that it's easy, nor are we necessarily that much better at managing big projects now. But there are success stories.

One part I especially liked was 20 minutes into the final episode, in the final interview with Fred Salvucci, where he told the story of Saint Francis. God tells Francis to build a cathedral, but then tears it down. Francis builds another, God destroys it again. Why, Francis asks. Because it's not enough to build a physical cathedral; you also have to build the support for it in the hearts and minds of the people. Maybe that's the part we're not doing well enough now.

Bring Me {Problems,Solutions} Bosses

Pointy Haired Boss

Some bosses want you to bring them problems. They like to unscramble Rubik's cubes and are happy to work through it with you. Sure it's great if you have a proposal or recommendation, but be prepared to show your homework.

Other bosses want you to bring them solutions. If you bring them a problem, you'll get annoyance and "what do you want me to do about it?"

Most bosses have exceptions by domain. A common pattern there is the bring-me-problems boss who is also a techie -- they'll want to dig in on planning, say, but prefer you to solve messy people conflicts yourself.

I've worked with both and each has its virtues. When you have a new boss, figure out their style and adjust.

Working In The Open

screenshot of @nova live-debugging Hachyderm on Twitch

The other day my daughter asked me about Software Engineering -- what do we actually do? She's eighteen and probably won't follow in my footsteps, which is fine, but I still want her to see my field.

I've always found this question hard to answer. I've been an engineering manager for a long time, and I'm happy to describe that job (emails, 1:1's, PRD reviews), but I don't think that's the heart of it. Plus these days I work for Google, where things are proprietary and deeply layered, which is not much help for answering questions like these.

Recently I've found my way to Mastodon for obvious reasons. I chose the Hachyderm instance because it was well run by people who shared my values. It turns out that the values of the owners and operators of the things we use matter, huh.

I then learned a cool thing, that the Hachyderm admins do much of their work publicly. They livestream debugging sessions on Twitch, they write postmortems, they share live graphs. @nova happened to be live-streaming at that moment, not surprising since the team's been busy absorbing thousands of new users and fending off attacks. My daughter and I watched a bit together.

Team Hachyderm (@nova @dma @quintessence @Taniwha @hazelweakly @malte): thank you for running this service well. But also thank you for giving me something I'm proud to use and proud to show my kid.