I got a lot of great feedback after posting about excellent numbers on Hacker News. An anonymous user, going by the nickname Someone, noticed a way to re-write the equation for excellent numbers.
Following this method to its conclusion yields explicit solutions for d-digit excellent numbers in terms of the factors of 10d - 1.
Thus, if you want to find all 42 digit excellent numbers, you need to first factorize 999,999,999,999,999,999,999,999,999,999,999,999,999,999. Then, you can go through those factors looking for excellent numbers instead of the 10d/2-digit numbers.
In this post, I am going to go over Somone’s insight.
Please consult my previous post for notation and definition.
A d-digit excellent number n ≡ a10(d/2) + b satisfies:
b2 - a2 = a10(d/2) + b
Set K ≡ 10(d/2) and rearrange:
b2 - b = a2 + aK
Someone’s insight is to re-write the left hand side as:
(b - 0.5)2 - 0.52
and the right hand side as:
(a + K/2)2 - (K/2)2,
and multiply both sides by 4 to get:
(2b - 1)2 - 1 = (2a + K)2 - K2.
Moving terms involving a and b to the left hand side, we get:
(2a + K)2 - (2b - 1)2 = K2 - 1.
Now, this is an equation of the form x2 - y2 = c. That means we can rewrite it as:
(2a + K - (2b - 1))(2a + K + (2b - 1)) = K2 - 1.
Rearranging variables one more time, we get:
(2(a - b) + K + 1)(2(a + b) + K - 1) = K2 - 1.
Note that K2 is just 10d. Consider any two integer factors of 10d-1, denoted p and q. Then, we have:
(2(a - b) + K + 1)(2(a + b) + K - 1) = pq.
This can hold in two ways. Suppose:
2(a - b) + K + 1 = p
2(a + b) + K - 1 = q
hold. Leaving the unknowns on the left hand side yields:
a - b = (p - (K + 1))/2
a + b = (q - (K - 1))/2.
We can solve this system using simple substitution:
a + a - (p - (K + 1))/2 = (q - (K - 1))/2
2a = (q - (K - 1))/2 + (p - (K + 1))/2
a = (p + q - 2K) / 4
Substituting back to get b, we have:
b = (p + q - 2K)/4 - (p - (K + 1))/2
b = (p + q - 2K - 2p + 2K + 2)/4
b = (q - p + 2)/4
Since both a and b must both be positive, we need to only consider factors (p, q) of 10d - 1 such that q > p - 2.
Also, both a and b must lie in the interval (K/10, K), i.e. (10(d/2)-1, 10d/2).
Those conditions impose additional conditions on which factors of 10d- 1 can give rise to excellent numbers: (p + q) > 0.6K, (p + q) < 3K, (q - p) > 0.4K - 2, and (q - p) < 4K - 2.
Looking at this linear constraint system, I feel like there is a neat algebraic structure behind the arithmetic, but it is not apparent to me right now. But then, integer programming is NP-hard, so it may not help much.
The important thing is that this solution transforms the search for excellent numbers into a constrained integer factorization problem.
A 38-digit number requires a little less than 127 bits. Previously, we were finding excellent numbers by searching among 19 digit numbers which meant we could store each part of the number in a 64-bit unsigned integer, and use a few 128 bit arithmetic operations.
In this case, because the integer factors of 1038 - 1 are involved, we’d need 128-bit unsigned integers to hold those if we needed to consider all of them, but the (p + q) < 3K constraint implied by the bounds on a means we don’t need that if we are careful about how we iterate over the prime factors.
Luckily, there is a project dedicated to factorizing numbers of the form (10d - 1)/9. Apparently, these numbers are called repunits. I guess we need factors of repnine numbers, i.e. a repunit multiplied by nine.
For example, 111,111 has the prime factors 3, 7, 11, 13, and 37. Therefore, 999,999 has the prime factors 33, 7, 11, 13, and 37. Of course, we need all integer factors of 999,999. That means going through all the different ways you can multiply these factors, but that is still a much smaller problem space than what one deals with when searching through 10d/2. So, provided you know the factors of 10d - 1, you can find all d-digit excellent numbers much more quickly.
It feels like cheating, because, let’s be honest, it is. Now, in a business context, where money is on the line, my first recommendation would be to go ahead and use the factors someone else found. In this case, it takes some of the fun out of the challenge.
I am hoping that the connection between d-digit excellent numbers and the factors on 10d - 1 will yield some more insight into the structure of excellent numbers. Otherwise, it is good to know that the problem is equivalent to integer factorization.
This was a useful exercise in that it really drove home to me the fact that one really may not need GMP for integers that can fit in 256 or fewer bits. With a bit of up-front analysis, one can save a lot of time, especially using gcc’s __int128
. I learned that Visual C++ has only clumsy support for 128-bit integers, and gcc seems to do a better job than clang at optimization.
Not all problems involving large numbers are this straightforward, so just these observations alone were worth it.
I really appreciate Someone’s insight. This shows the value of sharing your experience with a problem with others.
At one point, brian d foy did notice patterns involving the prime factors of the number of digits, but the explicit solution above did not occur to either of us.
While all that is involved here is high school math, manipulating the equation this way never occurred to me. I really thought I could gain something from trying to express the problem in terms of Vandermonde matrices … Goes to show the value of sticking longer with the simpler approaches.
It was a fun experience. As I mentioned before, I am not one to be normally interested in quests to find another prime, another digit of π, another factorization, but, for some reason, this one really was different.
It is proof that working on puzzles can be fun.
You can comment on this post on r/programming.
PS: Matthew Arcus takes this idea to the limit. It looks like he first mentioned this idea back in March 2015.