MoreRSS

site iconJohn D. CookModify

I have decades of consulting experience helping companies solve complex problems involving applied math, statistics, and data privacy.
Please copy the RSS to your reader, or quickly subscribe to:

Inoreader Feedly Follow Feedbin Local Reader

Rss preview of Blog of John D. Cook

Race between primes of the forms 4k + 1 and 4k + 3

2026-02-16 03:19:08

The last few posts have looked at expressing an odd prime p as a sum of two squares. This is possible if and only if p is of the form 4k + 1. I illustrated an algorithm for finding the squares with p = 2255 − 19, a prime that is used in cryptography. It is being used in bringing this page to you if the TLS connection between my server and your browser is uses Curve25519 or Ed25519.

World records

I thought about illustrating the algorithm with a larger prime too, such as a world record. But then I realized all the latest record primes have been of the form 4k + 3 and so cannot be written as a sum of squares. Why is p mod 4 equal to 3 for all the records? Are more primes congruent to 3 than to 1 mod 4? The answer to that question is subtle; more on that shortly.

More record primes are congruent to 3 mod 4 because Mersenne primes are easier to find, and that’s because there’s an algorithm, the Lucas-Lehmer test, that can test whether a Mersenne number is prime more efficiently than testing general numbers. Lucas developed his test in 1878 and Lehmer refined it in 1930.

Since the time Lucas first developed his test, the largest known prime has always been a Mersenne prime, with exceptions in 1951 and in 1989.

Chebyshev bias

So, are more primes congruent to 3 mod 4 than are congruent to 1 mod 4?

Define the function f(n) to be the ratio of the number of primes in each residue class.

f(n) = (# primes p < n with p = 3 mod 4) / (# primes p < n with p = 1 mod 4)

As n goes to infinity, the function f(n) converges to 1. So in that sense the number of primes in each category are equal.

If we look at the difference rather than the ratio we get a more subtle story. Define the lead function to be how much the count of primes equal to 3 mod 4 leads the number of primes equal to 1 mod 4.

g(n) = (# primes p < n with p = 3 mod 4) − (# primes p < n with p = 1 mod 4)

For any nf(n) > 1 if and only if g(n) > 0. However, as n goes to infinity the function g(n) does not converge. It oscillates between positive and negative infinitely often. But g(n) is positive for long stretches. This phenomena is known as Chebyshev bias.

Visualizing the lead function

We can calculate the lead function at primes with the following code.

from numpy import zeros
from sympy import primepi, primerange

N = 1_000_000
leads = zeros(primepi(N) + 1)
for index, prime in enumerate(primerange(2, N), start=1):
    leads[index] = leads[index - 1] + prime % 4 - 2

Here is a list of the primes at which the lead function is zero, i.e. when it changes sign.

[   0,     1,     3,     7,    13,    89,  2943,  2945,  2947,
 2949,  2951,  2953, 50371, 50375, 50377, 50379, 50381, 50393,
50413, 50423, 50425, 50427, 50429, 50431, 50433, 50435, 50437,
50439, 50445, 50449, 50451, 50503, 50507, 50515, 50517, 50821,
50843, 50853, 50855, 50857, 50859, 50861, 50865, 50893, 50899,
50901, 50903, 50905, 50907, 50909, 50911, 50913, 50915, 50917,
50919, 50921, 50927, 50929, 51119, 51121, 51123, 51127, 51151,
51155, 51157, 51159, 51161, 51163, 51177, 51185, 51187, 51189,
51195, 51227, 51261, 51263, 51285, 51287, 51289, 51291, 51293,
51297, 51299, 51319, 51321, 51389, 51391, 51395, 51397, 51505,
51535, 51537, 51543, 51547, 51551, 51553, 51557, 51559, 51567,
51573, 51575, 51577, 51595, 51599, 51607, 51609, 51611, 51615,
51617, 51619, 51621, 51623, 51627]

This is OEIS sequence A038691.

Because the lead function changes more often in some regions than others, it’s best to plot the function over multiple ranges.

The lead function is more often positive than negative. And yet it is zero infinitely often. So while the count of primes with remainder 3 mod 4 is usually ahead, the counts equal out infinitely often.

The post Race between primes of the forms 4k + 1 and 4k + 3 first appeared on John D. Cook.

Wagon’s algorithm in Python

2026-02-15 07:06:41

The last three posts have been about Stan Wagon’s algorithm for finding x and y satisfying

x² + y² = p

where p is an odd prime.

The first post in the series gives Gauss’ formula for a solution, but shows why it is impractical for large p. The bottom of this post introduces Wagon’s algorithm and says that it requires two things: finding a quadratic non-residue mod p and a variation on the Euclidean algorithm.

The next post shows how to find a quadratic non-residue.

The reason Wagon requires a non-residue is because he needs to find a square root of −1 mod p. The previous post showed how that’s done.

In this post we will complete Wagon’s algorithm by writing the modified version of the euclidean algorithm.

Suppose p is an odd prime, and we’ve found x such that x² = −1 mod p as in the previous posts. The last step in Wagon’s algorithm is to apply the Euclidean algorithm to x and p and stop when the numbers are both less than √p.

When we’re working with large integers, how do we find square roots? Maybe p and even √p are too big to represent as a floating point number, so we can’t just apply the sqrt function. Maybe p is less than the largest floating point number (around 10308) but the sqrt function doesn’t have enough precision. Floats only have 53 bits of precision, so an integer larger than 253 cannot necessarily be represented entirely accurately.

The solution is to use the isqrt function, introduced in Python 3.8. It returns the largest integer less than the square root of its argument.

Now we have everything necessary to finish implementing Wagon’s algorithm.

from sympy import legendre_symbol, nextprime
from math import isqrt

def find_nonresidue(p):
    q = 2
    while legendre_symbol(q, p) == 1:
        q = nextprime(q)
    return q

def my_euclidean_algorithm(a, b, stop):
    while a > stop:
        a, b = b, a % b
    return (a, b)

def find_ab(p):
    assert(p % 4 == 1)
    k = p // 4
    c = find_nonresidue(p)
    x = pow(c, k, p)
    return my_euclidean_algorithm(p, x, isqrt(p))

Let’s use this to find a and b such that x² + y² = p where p = 2255 − 19.

>>> a, b = find_ab(p := 2**255 - 19)
>>> a
230614434303103947632580767254119327050
>>> b
68651491678749784955913861047835464643
>>> a**2 + b**2 - p
0

Finis.

The post Wagon’s algorithm in Python first appeared on John D. Cook.

Finding a square root of -1 mod p

2026-02-15 06:07:05

If p is an odd prime, there is a theorem that says

x² = −1 mod p

has a solution if and only if p = 1 mod 4. When a solution x exists, how do you find it?

The previous two posts have discussed Stan Wagon’s algorithm for expressing an odd prime p as a sum of two squares. This is possible if and only if p = 1 mod 4, the same condition on p for −1 to have a square root.

In the previous post we discussed how to find c such that c does not have a square root mod p. This is most of the work for finding a square root of −1. Once you have c, set

xck

where p = 4k + 1.

For example, let’s find a square root of −1 mod p where p = 2255 − 19. We found in the previous post that c = 2 is a non-residue for this value of p.

>>> p = 2**255 - 19
>>> k = p // 4
>>> x = pow(2, k, p)

Let’s view x and verify that it is a solution.

>>> x
19681161376707505956807079304988542015446066515923890162744021073123829784752
>>> (x**2 + 1) % p
0

Sometimes you’ll see a square root of −1 mod p written as i. This makes sense in context, but it’s a little jarring at first since here i is an integer, not a complex number.

The next post completes this series, giving a full implementation of Wagon’s algorithm.

The post Finding a square root of -1 mod p first appeared on John D. Cook.

Finding a non-square mod p

2026-02-15 05:35:54

The previous post briefly mentioned Stan Wagon’s algorithm for expressing an odd prime p as a sum of two squares when it is possible (i.e. when p = 1 mod 4). Wagon’s algorithm requires first finding a non-square mod p, i.e. a number c such that cd² mod p for any d in 1, 2, 3, … p − 1.

Wagon recommends searching for c by testing consecutive primes q as candidates. You can test whether a number q is a square mod p by computing the Legendre symbol, which is implemented in SymPy as legendre_symbol(q, p). This returns 1 if q is a quadratic residue mod p and −1 if it is not.

Here’s Python code for doing the search.

from sympy import *

def find_nonresidue(p):
    q = 2
    while legendre_symbol(q, p) == 1:
        q = nextprime(q)
    return q

Let’s start with p = 2255 − 19. This prime comes up often in cryptography, such as Curve25519. Now p = 1 mod 4, so Wagon’s algorithm applies.

The code above returns 2, i.e. the first thing we tried worked. That was kinda disappointingly easy.

Here’s another large prime used in cryptography, in the NIST curve P-384.

p = 2384 − 2128 − 296 + 232 − 1

For this value of p, find_nonresidue(p) returns 19.

Why test prime as candidates for non-residues? You could pick candidates at random, and there’s a probability 1/2 that any candidate will work, since half the integers less than p are residues and half are non-residues. It’s very likely that you’d find a solution quickly, but it’s not guaranteed. In principle, you could try a thousand candidates before one works, though that’s vanishingly unlikely. If you test consecutive primes, there is a theoretical limit on how many guesses you need to make, if the Extended Riemann Hypothesis is true.

The next post explains why we wanted to find a non-residue.

The post Finding a non-square mod p first appeared on John D. Cook.

Expressing a prime as the sum of two squares

2026-02-13 10:34:58

I saw where Elon Musk posted Grok’s answer to the prompt “What are the most beautiful theorems.” I looked at the list, and there were no surprises, as you’d expect from a program that works by predicting the most likely sequence of words based on analyzing web pages.

There’s only one theorem on the list that hasn’t appeared on this blog, as far as I can recall, and that’s Fermat’s theorem that an odd prime p can be written as the sum of two squares if and only if p = 1 mod 4. The “only if” direction is easy [1] but the “if” direction takes more effort to prove.

If p is a prime and p = 1 mod 4, Fermat’s theorem guarantees the existence of x and y such that

x^2 + y^2 = p

Gauss’ formula

Stan Wagon [2] gave an algorithm for finding a pair (xy) to satisfy the equation above [2]. He also presents “a beautiful formula due to Gauss” which “does not seem to be of any value for computation.” Gauss’ formula says that if p = 4k + 1, then a solution is

\begin{align*} x &= \frac{1}{2} \binom{2k}{k} \pmod p \\ y &= (2k)\!!\, x \pmod p \end{align*}

For x and y we choose the residues mod p with |x| and |y| less than p/2.

Why would Wagon say Gauss’ formula is computationally useless? The number of multiplications required is apparently on the order of p and the size of the numbers involved grows like p!.

You can get around the problem of intermediate numbers getting too large by carrying out all calculations mod p, but I don’t see a way of implementing Gauss’ formula with less than O(p) modular multiplications [3].

Wagon’s algorithm

If we want to express a large prime p as a sum of two squares, an algorithm requiring O(p) multiplications is impractical. Wagon’s algorithm is much more efficient.

You can find the details of Wagon’s algorithm in [3], but the two key components are finding a quadratic non-residue mod p (a number c such that cx² mod p for any x) and the Euclidean algorithm. Since half the numbers between 1 and p − 1 are quadratic non-residues, you’re very likely to find a non-residue after a few attempts.

 

[1] The square of an integer is either equal to 0 or 1 mod 4, so the sum of two squares cannot equal 3 mod 4.

[2] Stan Wagon. The Euclidean Algorithm Strikes Again. The American Mathematical Monthly, Vol. 97, No. 2 (Feb., 1990), pp. 125-129.

[3] Wilson’s theorem gives a fast way to compute (n − 1)! mod n. Maybe there’s some analogous identity that could speed up the calculation of the necessary factorials mod p, but I don’t know what it would be.

 

The post Expressing a prime as the sum of two squares first appeared on John D. Cook.

Aligning one matrix with another

2026-02-11 21:23:50

Suppose you have two n × n matrices, A and B, and you would like to find a rotation matrix Ω that lines up B with A. That is, you’d like to find Ω such that

A = ΩB.

This is asking too much, except in the trivial case of A and B being 1 × 1 matrices. You could view the matrix equation above as a set of n² equations in real numbers, but the space of orthogonal matrices only has n(n − 1) degrees of freedom [1].

When an equation doesn’t have an exact solution, the next best thing is to get as close as possible to a solution, typically in a least squares sense. The orthogonal Procrustes problem is to find an orthogonal matrix Ω minimizing the distance between A and ΩB That is, we want to minimize

|| A − ΩB ||

subject to the constraint that Ω is orthogonal. The matrix norm used in this problem is the Frobenius norm, the sum of the squares of the matrix entries. The Frobenius norm is the 2-norm if we straighten the matrices into vectors of dimension n².

Peter Schönemann found a solution to the orthogonal Procrustes problem in 1964. His solution involves singular value decomposition (SVD). This shouldn’t be surprising since SVD solves the problem of finding the closest thing to an inverse of an non-invertible matrix. (More on that here.)

Schönemann’s solution is to set MABT and find its singular value decomposition

M = UΣVT.

Then

Ω = UVT.

Python code

The following code illustrates solving the orthogonal Procrustes problem for random matrices.

import numpy as np

n = 3

# Generate random n x n matrices A and B
rng = np.random.default_rng(seed=20260211) 
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))

# Compute M = A * B^T
M = A @ B.T

# SVD: M = U * Sigma * V^T
U, s, Vt = np.linalg.svd(M, full_matrices=False)

# R = U * V^T
R = U @ Vt

# Verify that R * R^T is very nearly the identity matrix
print("||R^T R - I||_F =", np.linalg.norm(R.T @ R - np.eye(n), ord="fro"))

In this example the Frobenius norm between RRT and I is 4 × 10−16, so essentially RRT = I to machine precision.

Related posts

[1] Every column of an orthogonal matrix Ω must have length 1, so that gives n constraints. Furthermore, each pair of columns must be orthogonal, which gives n choose 2 more constraints. We start with Ω having n² degrees of freedom, but then remove n and then n(n − 1)/2 degrees of freedom.

n² − nn(n − 1)/2 = n(n − 1)/2

The post Aligning one matrix with another first appeared on John D. Cook.