Cantor-Schröder-Bernstein Theorem
The Cantor-Schröder-Bernstein theorem (CSB) states that for any two sets A and B, if |A| ≤ |B| and |B| ≤ |A|, then |A| = |B|.
In terms of functions, the Cantor-Schröder-Bernstein theorem states that if A and B are sets and there are injective functions f : A → B and g : B → A, then there exists a bijective function h : A → B.
In terms of relation properties, the Cantor-Schröder-Bernstein theorem shows that the order relation on cardinalities of sets is antisymmetric.
CSB is a fundamental theorem of set theory. It is a convenient tool for comparing cardinalities of infinite sets.
Proof
There are many different proofs of this theorem. We present here a direct proof by using the definitions of injective and surjective function.
Let \(A, B\) be sets and let \(f: A \to B\) and \(g : B \to A\) be injective functions. We need to show that there is a bijective function \(h : A \to B.\)
We will denote the range of the function \(f\) by \(f\left( A \right)\) and the range of the function \(g\) by \(g\left( B \right).\) By condition, the functions \(f: A \to B\) and \(g : B \to A\) are injective, so they may not have inverse functions (unless they are also surjective). However, the function \(g : B \to g\left( B \right)\) between sets \(B\) and \(g\left( B \right)\) is bijective, and therefore it has an inverse \(g^{-1} : g\left( B \right) \to B.\)
Successively applying the injections \(g\) and \(f,\) we get the following fractal-like structures:
Let \({A_0} = A\backslash g\left( B \right)\) be the complement of \(g\left( B \right)\) after the function \(g\) is applied for the first time. When we circle back to \(A\) with the next iteration, we get \({A_1} = \left( {g \circ f} \right)\left( {{A_0}} \right).\) Each cycle creates a new nested region \(A_1, A_2, \ldots, A_n, A_{n+1},\ldots\) (shaded in blue) according to the recurrence relation
We will use the symbol \({A_\infty }\) to denote the union over all \(n,\) so the entire blue region of the set \(A\) is given by
The remaining (orange) region in \(A\) is expressed as \(A \backslash A_{\infty}.\)
Thus, we partitioned the original set \(A\) into two subsets that "behave" differently. The first subset \(A_{\infty}\) maps the blue areas in \(A\) to the blue areas in \(B\) under the function \(f.\) The second subset \(A \backslash A_{\infty}\) maps the orange areas in \(A\) to the orange areas in \(B\) using the function \(g^{-1}.\)
The resulting function \(h : A \to B\) is defined as follows:
To see that \(h\) is injective, we pick two elements \(x,y \in A\) and show that \(h\left( x \right) = h\left( y \right)\) implies \(x = y.\) Consider three cases:
- If \(x,y \in A_{\infty},\) then
\[h\left( x \right) = h\left( y \right),\;\; \Rightarrow f\left( x \right) = f\left( y \right),\;\; \Rightarrow x = y,\]because the function \(f\) is injective.
- If \(x,y \not\in A_{\infty},\) then given that \(g\) is injective, we get
\[h\left( x \right) = h\left( y \right),\;\; \Rightarrow {g^{ - 1}}\left( x \right) = {g^{ - 1}}\left( y \right),\;\; \Rightarrow \left( {g \circ {g^{ - 1}}} \right)\left( x \right) = \left( {g \circ {g^{ - 1}}} \right)\left( y \right),\;\; \Rightarrow x = y.\]
- If \(x \in A_{\infty}\) but \(y \not\in A_{\infty},\) then \(x \in A_n\) for some \(n.\) This yields:
\[h\left( x \right) = h\left( y \right),\;\; \Rightarrow f\left( x \right) = {g^{ - 1}}\left( y \right),\;\; \Rightarrow \left( {g \circ f} \right)\left( x \right) = \left( {g \circ {g^{ - 1}}} \right)\left( y \right),\;\; \Rightarrow y = \left( {g \circ f} \right)\left( x \right) \in \left( {g \circ f} \right)\left( {{A_n}} \right) = {A_{n + 1}}.\]This contradicts the assumption that \(y \not\in A_{\infty}.\) Hence, this case cannot happen.
To see that \(h\) is surjective, we take an arbitrary element \(b \in B\) and show that there is a preimage \(x \in A\) such that \(h\left( x \right) = b.\) There are two cases to consider:
- If \(x = g\left( b \right) \not\in A_{\infty},\) then
\[h\left( x \right) = {g^{ - 1}}\left( x \right) = {g^{ - 1}}\left( {g\left( b \right)} \right) = \left( {{g^{ - 1}} \circ g} \right)\left( b \right) = b.\]
- If \(x = g\left( b \right) \in A_{\infty},\) then \(x \in A_n\) for some \(n.\) Here \(n \gt 0\) because \(x \in g\left( B \right).\) Using the recurrence relation \({A_{n}} = \left( {g \circ f} \right)\left( {{A_{n-1}}} \right),\) we can write \(x \in \left( {g \circ f} \right)\left( {{A_{n-1}}} \right).\) If we take an element \(z \in A_{n-1},\) then we have \(x = \left( {g \circ f} \right)\left( z \right) = g\left( {f\left( z \right)} \right).\) The function \(g\) is injective. Therefore
\[\left\{ \begin{array}{l} x = g\left( b \right)\\ x = g\left( {f\left( z \right)} \right) \end{array} \right., \Rightarrow b = f\left( z \right).\]We can choose \(x = z\) to have\[h\left( x \right) = f\left( x \right) = f\left( z \right) = b.\]
Thus for any \(b \in B,\) there is an element \(x \in A\) for which \(h\left( x \right) = b.\) So, the function \(h\) is surjective. Since \(h\) is both injective and surjective, it is bijective.
Corollary 1. \(\left| {\left( {0,1} \right]} \right| = \left| {\left( {0,1} \right)} \right|\)
We have already found a bijective function between the sets \({\left( {0,1} \right]}\) and \({\left( {0,1} \right)}\) in Example \(3\) on the Cardinality of a Set page. Now we solve the problem by using the Cantor-Schröder-Bernstein theorem.
The function \(f\left( x \right) = \frac{x}{2} + \frac{1}{4}\) is an injection \(\left( {0,1} \right] \to \left( {0,1} \right).\) Also, the function \(g\left( x \right) = x\) is an injection \(\left( {0,1} \right) \to \left( {0,1} \right].\) It then follows from the CSB theorem that
Thus, in order to prove that two sets have the same cardinality, we can just look for injections both ways instead of looking for a bijection.
Corollary 2. Equal Cardinalities of Unit Square and Unit Interval
Consider the open unit square \(I \times I = \left( {0,1} \right) \times \left( {0,1} \right)\) and the open unit interval \(I = \left( {0,1} \right).\)
To build an injection from \(I \times I\) to \(I,\) we represent the coordinates \(\left( {x,y} \right)\) of an arbitrary point of the square by their decimal expansions. Let
We assume here that \(x\) and \(y\) do not have a repeating sequence of \(9'\text{s}\) at the end. If there exists such a number, it must be represented as a terminating decimal:
We define the mapping \(f:I \times I \to I\) as follows:
If \(\left( {{x},{y}} \right) \ne \left( {{x^\prime},{y^\prime}} \right),\) then the two numbers have a different coordinate, and hence, their decimal expansions are also different, that is, \({z_1} \ne {z_2}.\) This means that the function \(f\) is injective.
There is a natural injection from \(I\) to \(I \times I:\) \(g\left( z \right) = \left( {a,z} \right),\) where \(a\) is any number from the interval \(\left( {0,1} \right).\) For example, if we take \(a = \frac{1}{2},\) then \(g\left( z \right) = \left( {\frac{1}{2}, z} \right).\)
By the Cantor-Schröder-Bernstein theorem,
Thus, the open unit square and the open unit interval have the same cardinality, which is an amazing result!
Corollary 3. \(\left| \mathbb{R} \right| = \left| {\mathbb{R} \times \mathbb{R}} \right|\)
We can map \(I \to \mathbb{R}\) using the function
This mapping is bijective.
Similarly, the mapping \(I \times I \to \mathbb{R} \times \mathbb{R}\) is given by the function
that is also bijective.
Then we have
that is, the set of points of a plane and the set of points of a number line have the same cardinality.
Corollary 4. \(\left| \mathbb{R} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|\)
Notice that the cardinality of \(\mathbb{R}\) is the same as the cardinality of the open unit interval \(I = \left( {0,1} \right)\) because there exists a bijective function \(f: I \to \mathbb{R}\) between the sets:
Next, similarly to the Corollary \(1,\) we can easily prove that the open interval \({\left( {0,1} \right)}\) and the closed interval \({\left[ {0,1} \right]}\) have the same cardinality. Hence, we will just need to show that \(\left| {\left[ {0,1} \right]} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|.\) By the Cantor-Schröder-Bernstein theorem, it suffices to find injections \(\left[ {0,1} \right] \to \mathcal{P}\left( \mathbb{N} \right)\) and \(\mathcal{P}\left( \mathbb{N} \right) \to \left[ {0,1} \right].\)
To define \(\left[ {0,1} \right] \to \mathcal{P}\left( \mathbb{N} \right),\) we use the fact that any real can be represented as a binary decimal. Let
Then the mapping function is given by
For example,
The function \(f\) is injective.
To define \(\mathcal{P}\left( \mathbb{N} \right) \to \left[ {0,1} \right],\) we use the similar approach. Let \(S\) be a subset of \(\mathbb{N}.\) The function \(g\left( S \right)\) is given by
where \(b_i = 1\) if \(i \in S\) and \(b_i = 0\) if \(i \not\in S.\) For example,
We also set
It is clear that the function \(g\) is injective.
By the Cantor-Schröder-Bernstein theorem, \(\left| \left[ {0,1} \right]\right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|\) and hence
Note that if \(A\) is an arbitrary set, its power set \(\mathcal{P}\left( A \right)\) can be represented as
Indeed, with every subset \(B \subseteq A,\) we can associate the characteristic function \({\chi _B}:A \to \left\{ {0,1} \right\}\) defined by
where \(a \in A.\)
Clearly, the total number of subsets of \(A\) is \(2^{\left| A \right|}.\) For the set of natural numbers, we obtain
Then