Blog of John D. Cook

Root prime gap

2026-04-09 08:18:57

I recently found out about Andrica’s conjecture: the square roots of consecutive primes are less than 1 apart.

In symbols, Andrica’s conjecture says that if pn and pn+1 are consecutive prime numbers, then

√pn+1 − √pn < 1.

This has been empirically verified for primes up to 2 × 10¹⁹.

If the conjecture is true, it puts an upper bound on how long you’d have to search to find the next prime:

pn+1 < 1 + 2√pn  + pn,

which would be an improvement on the Bertrand-Chebyshev theorem that says

pn+1 < 2pn.
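Both the conjecture and the bound above are easy to check numerically. Here is a minimal sketch that verifies Andrica's inequality for all consecutive primes below 100,000, using a simple sieve (the cutoff and the sieve are my choices for illustration, not from the post):

```python
from math import isqrt, sqrt

def primes_up_to(n):
    """Sieve of Eratosthenes: return all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            # mark all multiples of p starting at p*p as composite
            sieve[p*p::p] = bytearray(len(range(p*p, n + 1, p)))
    return [i for i, is_p in enumerate(sieve) if is_p]

primes = primes_up_to(100_000)

# Andrica: sqrt(p_{n+1}) - sqrt(p_n) < 1 for consecutive primes
max_gap = max(sqrt(q) - sqrt(p) for p, q in zip(primes, primes[1:]))
assert max_gap < 1

# The implied search bound: p_{n+1} < 1 + 2*sqrt(p_n) + p_n
assert all(q < 1 + 2*sqrt(p) + p for p, q in zip(primes, primes[1:]))
```

Note that 1 + 2√p + p < 2p once p exceeds about 5.8, so for all primes from 7 on the Andrica bound is strictly tighter than the Bertrand–Chebyshev bound.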

 

The post Root prime gap first appeared on John D. Cook.

A Three- and a Four- Body Problem

2026-04-09 07:30:15

Last week I wrote about the orbit of Artemis II. The orbit of Artemis I was much more interesting.

Because Artemis I was unmanned, it could spend a lot more time in orbit: the Artemis I mission took 25 days, while Artemis II will take 10 days. Artemis I took an unusual path, orbiting the moon in the opposite direction of the moon’s orbit around earth. This video by Primal Space demonstrates the orbit both from the perspective of earth and from the perspective of the moon.

Another video from Primal Space describes the orbit of the third stage of Apollo 12. This stage was supposed to orbit around the sun in 1971, but an error sent it on a complicated, unstable orbit of the earth, moon, and sun. It returned briefly to earth in 2002 and is expected to return sometime in the 2040s.



Toffoli gates are all you need

2026-04-07 08:33:23

Landauer’s principle gives a lower bound on the amount of energy it takes to erase one bit of information:

E ≥ log(2) kBT

where kB is the Boltzmann constant and T is the ambient temperature in Kelvin. The lower bound applies no matter how the bit is physically stored. There is no theoretical lower limit on the energy required to carry out a reversible calculation.
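To get a feel for the size of this bound, here is a quick calculation at room temperature (300 K is my choice of ambient temperature for illustration):

```python
from math import log

k_B = 1.380649e-23  # Boltzmann constant, J/K (exact in the 2019 SI)
T = 300             # ambient temperature, K

# Landauer lower bound on the energy to erase one bit
E_min = log(2) * k_B * T
print(f"{E_min:.3e} J")  # about 2.9e-21 J per bit erased
```

So the theoretical floor is on the order of 10⁻²¹ joules per erased bit, which makes the billion-fold gap mentioned below still a tiny amount of energy in absolute terms.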

In practice the energy required to erase a bit is around a billion times greater than Landauer’s lower bound. You might reasonably conclude that reversible computing isn’t practical since we’re nowhere near the Landauer limit. And yet in practice reversible circuits have been demonstrated to use less energy than conventional circuits. We’re far from the ultimate physical limit, but reversibility still provides practical efficiency gains today.

A Toffoli gate is a building block of reversible circuits. A Toffoli gate takes three bits as input and returns three bits as output:

T(a, b, c) = (a, b, c XOR (a AND b)).

In words, a Toffoli gate flips its third bit if and only if the first two bits are ones.

A Toffoli gate is its own inverse, and so it is reversible. This is easy to prove. If a AND b = 1, the third bit is flipped, and applying the Toffoli gate again flips it back to what it was. If a AND b = 0, i.e. at least one of the first two bits is zero, the Toffoli gate doesn’t change anything.

There is a theorem that any Boolean function can be computed by a circuit made of only NAND gates. We’ll show that you can construct a NAND gate out of Toffoli gates, which shows any Boolean function can be computed by a circuit made of Toffoli gates, which shows any Boolean function can be computed reversibly.

To compute NAND, i.e. ¬(a AND b), send (a, b, 1) to the Toffoli gate. The third bit of the output will contain the NAND of a and b.

T(a, b, 1) = (a, b, ¬(a AND b))
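The gate, its self-inverse property, and the NAND construction can all be sketched in a few lines of Python:

```python
def toffoli(a, b, c):
    """Toffoli (CCNOT) gate: flip c iff a and b are both 1."""
    return a, b, c ^ (a & b)

# Self-inverse: applying the gate twice returns the original input
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert toffoli(*toffoli(a, b, c)) == (a, b, c)

def nand(a, b):
    """NAND via Toffoli: fix the third input bit to 1
    and read the answer off the third output bit."""
    return toffoli(a, b, 1)[2]

# NAND truth table: 1, 1, 1, 0
assert [nand(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [1, 1, 1, 0]
```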

A drawback of reversible computing is that you may have to send in more input than you’d like and get back more output than you’d like, as we can already see from the example above. NAND takes two input bits and returns one output bit. But the Toffoli gate simulating NAND takes three input bits and returns three output bits.



HIPAA compliant AI

2026-04-06 07:04:46

The best way to run AI and remain HIPAA compliant is to run it locally on your own hardware, instead of transferring protected health information (PHI) to a remote server by using a cloud-hosted service like ChatGPT or Claude [1].

There are HIPAA-compliant cloud options, but they’re both restrictive and expensive. Even enterprise options are not “HIPAA compliant” out of the box. Instead, they are “HIPAA eligible” or they “support HIPAA compliance,” because you still need the right Business Associate Agreement (BAA), configuration, logging, access controls, and internal processes around them, and the end product often ends up far less capable than a frontier model. The least expensive, and therefore most accessible, services do not even offer this as an option.

Specific examples:

  • Only sales-managed ChatGPT Enterprise or Edu customers are eligible for a BAA, and OpenAI explicitly says it does not offer a BAA for ChatGPT Business. The consumer ChatGPT Health product says HIPAA and BAAs do not apply. ChatGPT for Healthcare pricing is based on ChatGPT Enterprise, depends on organization size and deployment needs, and requires contacting sales. Even within Enterprise, OpenAI’s Regulated Workspace spec lists Codex and the multi-step Agent feature as “Non-Included Functionality,” i.e. off limits for PHI.
  • Google says Gemini can support HIPAA workloads in Workspace, but NotebookLM is not covered by Google’s BAA, and Gemini in Chrome is automatically blocked for BAA customers. If a work or school account does not have enterprise-grade data protections, chats in the Gemini app may be reviewed by humans and used to improve Google’s products.
  • GitHub Copilot, despite being a Microsoft product, is not under Microsoft’s BAA. Azure OpenAI Service is, but only for text endpoints. While Microsoft is working on their own models, it is unlikely that they will deviate significantly here.
  • Anthropic says its BAA covers only certain “HIPAA-ready” services, namely the first-party API and a HIPAA-ready Enterprise plan, and does not cover Claude Free, Pro, Max, Team, Workbench, Console, Cowork, or Claude for Office. The HIPAA-ready Enterprise offering is sales-assisted only. Bundled Claude Code seats are not covered. AWS Bedrock API calls can work, but this requires extensive configuration and carries its own complexities and restrictions.

Running AI locally is already practical as of early 2026. Open-weight models that approach the quality of commercial coding assistants run on consumer hardware. A single high-end GPU or a recent Mac with enough unified memory can run a 70B-parameter model at a reasonable token speed.

There’s an interesting interplay between economies of scale and diseconomies of scale. Cloud providers can run a data center at a lower cost per server than a small company can. That’s the economies of scale. But running HIPAA-compliant computing in the cloud, particularly with AI providers, incurs large direct costs and indirect bureaucratic costs. That’s the diseconomies of scale. Smaller companies may benefit more from local AI than larger companies if they need to be HIPAA-compliant.


[1] This post is not legal advice. My clients are often lawyers, but I’m not a lawyer.


Kalman and Bayes average grades

2026-04-04 23:00:14

This post will look at the problem of updating an average grade as a very simple special case of Bayesian statistics and of Kalman filtering.

Suppose you’re keeping up with your average grade in a class, and you know your average after n tests, all weighted equally.

m = (x1 + x2 + x3 + … + xn) / n.

Then you get another test grade back and your new average is

m′ = (x1 + x2 + x3 + … + xn + xn+1) / (n + 1).

You don’t need the individual test grades once you’ve computed the average; you can instead remember the average m and the number of grades n [1]. Then you know the sum of the first n grades is nm and so

m′ = (nm + xn+1) / (n + 1).

You could split that into

m′ = w1m + w2xn+1

where w1 = n/(n + 1) and w2 = 1/(n + 1). In other words, the new mean is the weighted average of the previous mean and the new score.

A Bayesian perspective would say that your posterior expected grade m′ is a compromise between your prior expected grade m and the new data xn+1. [2]

You could also rewrite the equation above as

m′ = m + (xn+1 − m)/(n + 1) = m + KΔ

where K = 1/(n + 1) and Δ = xn+1 − m. In Kalman filter terms, K is the gain, the proportionality constant relating the change in your state to the difference between what you saw and what you expected.
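The update rule is easy to check numerically: folding in one grade at a time with gain K = 1/(n + 1) reproduces the ordinary batch average. A minimal sketch, with a made-up list of grades:

```python
def update_mean(m, n, x):
    """Kalman-style update: new mean after seeing grade x,
    given current mean m over n grades. Gain K = 1/(n+1)."""
    K = 1 / (n + 1)
    return m + K * (x - m)

grades = [85, 90, 78, 92]  # hypothetical test scores
m, n = 0.0, 0
for x in grades:
    m = update_mean(m, n, x)
    n += 1

# running update agrees with the batch mean
assert abs(m - sum(grades) / len(grades)) < 1e-12
```

Note that only the pair (m, n) is carried forward, which is the sufficient-statistic point in footnote [1].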


[1] In statistical terms, the mean is a sufficient statistic.

[2] You could flesh this out by using a normal likelihood and a flat improper prior.


Roman moon, Greek moon

2026-04-04 00:31:54

I used the term perilune in yesterday’s post about the flight path of Artemis II. When Artemis is closest to the moon it will be furthest from earth because its closest approach to the moon, its perilune, is on the side of the moon opposite earth.

Perilune is sometimes called periselene. The two terms come from two goddesses associated with the moon, the Roman Luna and the Greek Selene. Since the peri- prefix is Greek, perhaps periselene would be preferable. But words for things associated with the moon are far more often based on Luna than on Selene.

The neutral terms for the closest and furthest points in an orbit are periapsis and apoapsis, but there are more colorful terms that are specific to orbiting particular celestial objects. The terms perigee and apogee for orbiting earth (from the Greek Gaia) are most familiar, and the terms perihelion and aphelion (not apohelion) for orbiting the sun (from the Greek Helios) are the next most familiar.

The terms perijove and apojove are unfamiliar, but you can imagine what they mean. Others like periareion and apoareion, especially the latter, are truly arcane.
