Phil Eaton

For the last 10 years I've chased my way down the software stack starting from humble beginnings with the venerable jQuery and PHP.
Rss preview of Blog of Phil Eaton

In response to a developer asking about systems

2025-09-15 08:00:00

Sometimes I get asked questions that would be more fun to answer in public. All letters are treated as anonymous unless permission is otherwise granted.

Hey [Redacted]! It's great to hear from you. I'm very glad you joined the coffee club and met some good folks. :)

You asked how to learn about systems. A great question! I think I need to start first with what I mean when I say systems.

My definition of systems is all of the underlying software we developers use but are taught not to think about because it is so solid: our compilers and interpreters, our databases, our operating system, our browser, and so on. We think of it as basically not having bugs; we just count on it to be correct and fast enough so we can build the applications that really matter to users.

But 1) some developers do actually have to work on these fundamental blocks (compilers, databases, operating systems, browsers, etc.) and 2) it's not thaaaat hard to get into this development professionally and 3) even if you don't get into it professionally, having a better understanding of these fundamental blocks will make you a better application developer. At least I think so.

To get into systems I think it starts with you just questioning how each layer you build on works. Try building that layer yourself. For example, you've probably used a web framework like Rails or Next.js. But you can just go and write that layer yourself too (for education).

And you've probably used Postgres or SQLite or DynamoDB. But you can also just go and write that layer yourself (for education). It's this habit of thinking and digging into the next lower layer that will get you into systems. Basically, not being satisfied with the black box.

I do not think there are many good books on programming in general, and very very few must-read ones, but one that I recommend to everybody is Designing Data-Intensive Applications. I think it's best if you read it with a group of people. (My book club will read it in December when the 2nd edition comes out; you should join.) But this book is obviously specific to data and does not cover the fundamentals of other systems topics like compilers, operating systems, browsers, and so on.

Also, I see getting into this as a long-term thing. Throughout my whole career (almost 11 years now) I definitely always tried to dig into compilers and interpreters, I wrote and blogged about toy implementations a lot. And then 5 years ago I started digging into databases and saw that there was more career potential there. But it still took 4 years until I got my first job as a developer working on a database (the job I currently have).

Things take time to learn and that's ok! You have a long career to look forward to. And if you end up not wanting to dig into this stuff that's totally fine too. I think very few developers actually do. And they still have fine careers.

Anyway, I hope this is at least mildly useful. I hope you join the Software Internals Discord and nycsystems.xyz as well and look forward to seeing you at future coffee clubs!

Cheers,
Phil

A simple clustering and replication solution for Postgres

2025-09-10 08:00:00

This is an external post of mine.

Analytics query goes 6x faster with EDB Postgres Distributed's new analytics engine

2025-09-04 08:00:00

This is an external post of mine.

Set up a single-node EDB Postgres Distributed cluster on Ubuntu

2025-08-14 08:00:00

This is an external post of mine.

What even is distributed systems

2025-08-09 08:00:00

Distributed systems is simply the study of interactions between processes. Every two interacting processes form a distributed system, whether they are on the same host or not. Distributed systems create new challenges (compared to single-process systems) in terms of correctness (i.e. consistency), reliability, and performance (i.e. latency and throughput).

The best way to learn about the principles and fundamentals of distributed systems is to 1) read Designing Data-Intensive Applications and 2) read through the papers and follow the notes in the MIT Distributed Systems course.

For Designing Data-Intensive Applications (DDIA), I strongly encourage you to find buddies at work or online who will read it through with you. You can also always join the Software Internals Discord's #distsys channel to ask questions as you go. But it's still best if you have some partners to go through the book with, even if they are as new to it as you.

I also used to think that you might want to wait a few years into your career before reading DDIA but when you have friends to read it with I think you need not wait.

If you have only skimmed the book you should definitely go back and give it a thorough read. I have read it three times already and I will read it again as part of the Software Internals Book Club next year after the 2nd Edition is published.

Keep in mind that every chapter of DDIA provides references to papers you can keep reading should you end up memorizing DDIA itself.

When you've read parts of DDIA or the MIT Distributed Systems course and you want practice, the Fly.io x Jepsen Distributed Systems Challenge is one guided option. Other options might include simply implementing (getting progressively more complex down the list):

  • two-phase commit
  • three-phase commit
  • single-decree Paxos
  • highly available key-value store on top of a 3rd-party consensus library
  • chain replication (or CRAQ), using a 3rd-party consensus library
  • Raft
  • EPaxos
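
As a rough illustration of the first item on that list, here is a minimal single-process sketch of two-phase commit in Python. Everything in it is hypothetical and made up for the example: the Participant class, its can_commit flag, and the two_phase_commit function. It assumes in-memory participants and direct method calls instead of a network, and it leaves out everything that makes the real protocol hard (durable logs, timeouts, retries, coordinator crashes). Those gaps are where the interesting work starts.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


@dataclass
class Participant:
    """A hypothetical in-memory participant holding a tiny key-value store."""
    name: str
    can_commit: bool = True  # assumption: set to False to simulate a "no" vote
    data: dict = field(default_factory=dict)    # committed, visible state
    staged: dict = field(default_factory=dict)  # writes staged per transaction id

    def prepare(self, txid, writes):
        # Phase 1: stage the writes and vote. Nothing becomes visible yet.
        if not self.can_commit:
            return Vote.NO
        self.staged[txid] = dict(writes)
        return Vote.YES

    def commit(self, txid):
        # Phase 2 (commit): make the staged writes visible.
        self.data.update(self.staged.pop(txid, {}))

    def abort(self, txid):
        # Phase 2 (abort): discard anything staged for this transaction.
        self.staged.pop(txid, None)


def two_phase_commit(txid, writes, participants):
    """Run both phases; return True only if every participant voted yes."""
    # Phase 1: the coordinator collects a vote from every participant.
    votes = [p.prepare(txid, writes) for p in participants]
    decision = all(v is Vote.YES for v in votes)
    # Phase 2: the coordinator sends the same decision to every participant.
    for p in participants:
        if decision:
            p.commit(txid)
        else:
            p.abort(txid)
    return decision


if __name__ == "__main__":
    a, b = Participant("a"), Participant("b")
    print(two_phase_commit("tx1", {"x": 1}, [a, b]))  # both vote yes
    c = Participant("c", can_commit=False)
    print(two_phase_commit("tx2", {"x": 2}, [a, c]))  # c votes no, so a aborts too
```

A natural next step would be to put each participant in its own process, then kill the coordinator between the two phases and see what state the participants are left in.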

And if you get bored there you can see Alex Miller's Data Replication Design Spectrum for more ideas and variants.

And if you want more people to follow, check out the Distributed Systems section of my favorite blogs page.

If these projects and papers sound arcane or intimidating, know that you will see the problems these projects/papers solve whether or not you know and understand these solutions. Developers often end up reinventing hacky versions of these, which are more likely to have subtle bugs.

Instead, you can recognize and use one of these well-known building blocks. Or at least have the background to better reason about correctness should you be in a situation where you must work with a novel distributed system or you end up designing a new one yourself.

And again, if you want folks to bounce ideas off of or ask questions to, I strongly encourage you to join the Software Internals Discord and ask there!

Stack traces for Postgres errors with backtrace_functions

2025-07-31 08:00:00

This is an external post of mine.