Conditional probabilities in a sudoku

Question

While making sudoku puzzles I came up with the following question:

Suppose there is a square $a$ in which there can be a $3$ or a $4$. Obviously, the probability of there being a $3$ is equal to that of there being a $4$.

But what happens when another square is involved? Suppose that if there was a $3$ in $a$ there can be a $5$, a $6$ or a $7$ in this new square $b$. But if there was a $4$ in $a$, there could only be a $5$ or a $6$ in $b$. We have no more information.

Here goes the question: is it true that it is more probable that there is a $3$ than a $4$ in $a$?

My intuition tells me that this is true, because this would lead to more solutions, but I'm not very convinced.

I don't know if this question even makes sense in probability or if it is too complex to answer without further information.

"Obviously..." Not at all obvious. First of all, Sudoku puzzles always have one solution, by definition. And there can be a lot of other indications if you are talking about random sudokus. But random sudoku puzzles are very hard to define. But the problem, for it to be governed by probability, needs to be made more explicit. — Thomas Andrews
– Thomas Andrews, Commented Sep 25 at 19:35
@ThomasAndrews I somewhat disagree with your response. I regard the posted question as a conditional probability question, based on very narrow information. I invite you to post comment(s) following my answer, where you criticize the analysis in my answer. — user2661923
– user2661923, Commented Sep 25 at 21:14
I think perhaps the only way such a question about the "probability" of the value of a specific sudoku cell could make sense is if we think of the joint distribution of all 81 values over the sample space of all sudoku solutions, under the assumption that they are uniformly distributed. Then an exact conditional probability would be computationally intractable, but I think in some cases, it should be reasonable to approximate it according to OP's suggestion. — heropup
– heropup, Commented Sep 26 at 0:49
To be clear, "...they are uniformly distributed" in in relation to the space of all sudoku solutions. — heropup
– heropup, Commented Sep 26 at 0:51
But, as Thomas Andrews already pointed out, the "conditional probability" over the "space of all sudoku solutions" trivializes as the given puzzle only has one solution. So the probability is 0 or 1 depending on if it agrees with the unique solution. — Emil Jeřábek
– Emil Jeřábek, Commented Sep 26 at 7:08

user2661923 · Accepted Answer · 2025-09-27 01:06:37Z

Addendum added to respond to the comment of Emil Jefabek, following the posted question.

Yes, your intuition is true.

First, I will avoid the issue of whether $~7~$ in square B is as equally likely as $~5~$ or $~6~$ in square B. Given this avoidance, it is still reasonable to assume, given the limited information at this point, that there exists some $~p~$ such that $~0 < p < 1,~$ and such that the probability of a $~7~$ in square $~B~$ equals $~p.~$

This implies, based on the limited information, that the probability of a $~3~$ in square A minus the probability of a $~4~$ in square A equals $~p.~$

Therefore, based on this narrow information, $~3~$ is more likely than $~4~$ in square A.

Note:
The preceding paragraphs make the critical assumption that if (for example) square B contains a $~5,~$ that the probability of a $~3~$ in square A equals the probability of a $~4~$ in square A.

That is, I am regarding each of the following (Square A, Square B) possibilities as equally likely:

$$(3,5), (3,6), (4,5), (4,6). \tag1 $$

I actually regard this assumption as unclear, though reasonable. It is conceivable that the presence of the possibility of a $~7~$ in square B, makes the assumption represented in (1) above false.

It comes down to the exact method for constructing sudoku puzzles. So, the comment of Thomas Andrews is reasonable.

However sudoku puzzles are created, I still have a hard time believing, based on the limited information available, that $~3~$ in square A is not more likely than $~4~$ in square A.

$\underline{\text{Addendum}}$

Responding to the comment of Emil Jefabek, following the originally posted question.

But, as Thomas Andrews already pointed out, the "conditional probability" over the "space of all sudoku solutions" trivializes as the given puzzle only has one solution. So the probability is 0 or 1 depending on if it agrees with the unique solution.

I disagree, because the solution to the actual sudoku problem involved is unknown. As I see it, this is akin to considering a coin that has already been flipped, where it is unknown whether the coin landed heads or tails.

If I understand the thinking correctly, the assertion would be that the probability that the coin landed heads is not 1/2 but rather 0 or 1, because the coin has already been flipped.

Focusing on the Sudoku problem:

For any set $~E~$ with a finite number of elements, let $~| ~E ~|~$ denote the number of elements in the set $~E.$

Let $~S~$ denote the set of all possible ordered two square pairings, where the two squares are selected (in order, sampling without replacement) from the 81 squares in a Sudoku problem. So, $~| ~S ~| = 81 \times 80.$

Let $~T~$ denote the set of all possible solvable Sudoku problems. For a problem to be solvable, one or more of the 81-squares must be initially known, one or more of the 81-squares must be initially unknown, and normal Sudoku analytical steps (whatever that means) must result in a unique solution.

For any $~s \in S,~$ let $~T_s~$ denote the subset of $~T~$ where an element $~t~$ is in $~T_s~$ if and only if all of the following is true:

The final solution has 3 or 4 in Square A.
The final solution has 5, 6, or 7 in Square B.
The starting point of $~t~$ represents a solvable sudoku puzzle, where Squares A and B are initially undetermined.
There exists at least one manual-step-by-step attack of problem $~t~$ where at some point during this attack, one can validly conclude each of the following:
- Square-A, which is currently unknown, must be 3 or 4.
- Square-B, which is currently unknown, must be 5, 6, or 7.
- If Square-B is 5 or 6, then Square-A must be 3 or 4.
- If Square-B is 7, then Square-A must be 3.

Let $~S'~$ denote the subset of $~S~$ where $~s \in S'~$ if and only if $~T_s \neq \emptyset.$

For each $~s \in S',~$ let $~T_{(s,3)}~$ represent the subset of $~T_s~$ where the final solution has a $~3~$ (rather than a $~4$) in Square-A.

Then, the overall probability of a $~3~$ in Square-A is

$$\left[ ~\frac{1}{| ~S' ~|} ~\right] \times \left[ ~\sum_{s \in S'} \frac{| ~T_{(s,3)}~|}{| ~T_s ~|} ~\right].$$

Note:

Actually, the above computation is arguable, because it assumes that each element in $~S'~$ is equally likely to represent the (Square-A,Square-B) pair of cells.

Suppose $~s_1, s_2 \in S',~$ where there are more analytical lines of attack that justify putting $~s_1~$ in $~S'~$ than there are analytical lines of attack that justify putting $~s_2~$ in $~S'.~$

Does this mean that $~s_1~$ and $~s_2~$ are not equally likely to occur?

As a related question, given any two valid lines of attack on a solvable Sudoku problem, is each line of attack equally likely to be the attack that was implemented?

Still another approach, that I can not criticize is to argue that if $~s_1, ~s_2 \in S',~$ then the relative probability of $~s_1~$ occurring, rather than $~s_2,~$ is $~\displaystyle \frac{| ~T_{s_1} ~|}{| ~T_{s_1} ~| + |~T_{s_2} ~|}.$

Stack Exchange Network

Conditional probabilities in a sudoku

1 Answer 1

You must log in to answer this question.

Hot Network Questions

Conditional probabilities in a sudoku

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions