Calculus

Set Theory

Set Theory Logo

Cantor-Schröder-Bernstein Theorem

The Cantor-Schröder-Bernstein theorem \(\left( {\text{CSB}} \right)\) states that for any two sets \(A\) and \(B,\) if \(\left| A \right| \le \left| B \right|\) and \(\left| B \right| \le \left| A \right|,\) then \(\left| A \right| = \left| B \right|.\)

In terms of functions, the Cantor-Schröder-Bernstein theorem states that if \(A\) and \(B\) are sets and there are injective functions \(f : A \to B\) and \(g : B \to A,\) then there exists a bijective function \(h : A \to B.\)

In terms of relation properties, the Cantor-Schröder-Bernstein theorem shows that the order relation on cardinalities of sets is antisymmetric.

\( {\text{CSB}} \) is a fundamental theorem of set theory. It is a convenient tool for comparing cardinalities of infinite sets.

Proof

There are many different proofs of this theorem. We present here a direct proof by using the definitions of injective and surjective function.

Let \(A, B\) be sets and let \(f: A \to B\) and \(g : B \to A\) be injective functions. We need to show that there is a bijective function \(h : A \to B.\)

We will denote the range of the function \(f\) by \(f\left( A \right)\) and the range of the function \(g\) by \(g\left( B \right).\) By condition, the functions \(f: A \to B\) and \(g : B \to A\) are injective, so they may not have inverse functions (unless they are also surjective). However, the function \(g : B \to g\left( B \right)\) between sets \(B\) and \(g\left( B \right)\) is bijective, and therefore it has an inverse \(g^{-1} : g\left( B \right) \to B.\)

Two injections in the Cantor-Schroeder-Bernstein theorem.
Figure 1.

Successively applying the injections \(g\) and \(f,\) we get the following fractal-like structures:

Bijective function in the Cantor-Schroeder-Bernstein theorem.
Figure 2.

Let \({A_0} = A\backslash g\left( B \right)\) be the complement of \(g\left( B \right)\) after the function \(g\) is applied for the first time. When we circle back to \(A\) with the next iteration, we get \({A_1} = \left( {g \circ f} \right)\left( {{A_0}} \right).\) Each cycle creates a new nested region \(A_1, A_2, \ldots, A_n, A_{n+1},\ldots\) (shaded in blue) according to the recurrence relation

\[{A_{n + 1}} = \left( {g \circ f} \right)\left( {{A_n}} \right).\]

We will use the symbol \({A_\infty }\) to denote the union over all \(n,\) so the entire blue region of the set \(A\) is given by

\[{A_\infty } = \bigcup\limits_{n = 0}^\infty {{A_n}} = \bigcup\limits_{n = 0}^\infty {{{\left( {g \circ f} \right)}^n}\left( {A\backslash g\left( B \right)} \right)} .\]

The remaining (orange) region in \(A\) is expressed as \(A \backslash A_{\infty}.\)

Thus, we partitioned the original set \(A\) into two subsets that "behave" differently. The first subset \(A_{\infty}\) maps the blue areas in \(A\) to the blue areas in \(B\) under the function \(f.\) The second subset \(A \backslash A_{\infty}\) maps the orange areas in \(A\) to the orange areas in \(B\) using the function \(g^{-1}.\)

The resulting function \(h : A \to B\) is defined as follows:

\[h\left( x \right) = \left\{ {\begin{array}{*{20}{l}} {f\left( x \right)} &{\text{if }x \in {A_\infty }}\\ {{g^{ - 1}}\left( x \right)} &{\text{otherwise}} \end{array}} \right..\]

To see that \(h\) is injective, we pick two elements \(x,y \in A\) and show that \(h\left( x \right) = h\left( y \right)\) implies \(x = y.\) Consider three cases:

  1. If \(x,y \in A_{\infty},\) then
    \[h\left( x \right) = h\left( y \right),\;\; \Rightarrow f\left( x \right) = f\left( y \right),\;\; \Rightarrow x = y,\]
    because the function \(f\) is injective.
  2. If \(x,y \not\in A_{\infty},\) then given that \(g\) is injective, we get
    \[h\left( x \right) = h\left( y \right),\;\; \Rightarrow {g^{ - 1}}\left( x \right) = {g^{ - 1}}\left( y \right),\;\; \Rightarrow \left( {g \circ {g^{ - 1}}} \right)\left( x \right) = \left( {g \circ {g^{ - 1}}} \right)\left( y \right),\;\; \Rightarrow x = y.\]
  3. If \(x \in A_{\infty}\) but \(y \not\in A_{\infty},\) then \(x \in A_n\) for some \(n.\) This yields:
    \[h\left( x \right) = h\left( y \right),\;\; \Rightarrow f\left( x \right) = {g^{ - 1}}\left( y \right),\;\; \Rightarrow \left( {g \circ f} \right)\left( x \right) = \left( {g \circ {g^{ - 1}}} \right)\left( y \right),\;\; \Rightarrow y = \left( {g \circ f} \right)\left( x \right) \in \left( {g \circ f} \right)\left( {{A_n}} \right) = {A_{n + 1}}.\]
    This contradicts the assumption that \(y \not\in A_{\infty}.\) Hence, this case cannot happen.

To see that \(h\) is surjective, we take an arbitrary element \(b \in B\) and show that there is a preimage \(x \in A\) such that \(h\left( x \right) = b.\) There are two cases to consider:

  1. If \(x = g\left( b \right) \not\in A_{\infty},\) then
    \[h\left( x \right) = {g^{ - 1}}\left( x \right) = {g^{ - 1}}\left( {g\left( b \right)} \right) = \left( {{g^{ - 1}} \circ g} \right)\left( b \right) = b.\]
  2. If \(x = g\left( b \right) \in A_{\infty},\) then \(x \in A_n\) for some \(n.\) Here \(n \gt 0\) because \(x \in g\left( B \right).\) Using the recurrence relation \({A_{n}} = \left( {g \circ f} \right)\left( {{A_{n-1}}} \right),\) we can write \(x \in \left( {g \circ f} \right)\left( {{A_{n-1}}} \right).\) If we take an element \(z \in A_{n-1},\) then we have \(x = \left( {g \circ f} \right)\left( z \right) = g\left( {f\left( z \right)} \right).\) The function \(g\) is injective. Therefore
    \[\left\{ \begin{array}{l} x = g\left( b \right)\\ x = g\left( {f\left( z \right)} \right) \end{array} \right., \Rightarrow b = f\left( z \right).\]
    We can choose \(x = z\) to have
    \[h\left( x \right) = f\left( x \right) = f\left( z \right) = b.\]

Thus for any \(b \in B,\) there is an element \(x \in A\) for which \(h\left( x \right) = b.\) So, the function \(h\) is surjective. Since \(h\) is both injective and surjective, it is bijective.

Corollary 1. \(\left| {\left( {0,1} \right]} \right| = \left| {\left( {0,1} \right)} \right|\)

We have already found a bijective function between the sets \({\left( {0,1} \right]}\) and \({\left( {0,1} \right)}\) in Example \(3\) on the Cardinality of a Set page. Now we solve the problem by using the Cantor-Schröder-Bernstein theorem.

The function \(f\left( x \right) = \frac{x}{2} + \frac{1}{4}\) is an injection \(\left( {0,1} \right] \to \left( {0,1} \right).\) Also, the function \(g\left( x \right) = x\) is an injection \(\left( {0,1} \right) \to \left( {0,1} \right].\) It then follows from the CSB theorem that

\[\left| {\left( {0,1} \right]} \right| = \left| {\left( {0,1} \right)} \right|.\]

Thus, in order to prove that two sets have the same cardinality, we can just look for injections both ways instead of looking for a bijection.

Corollary 2. Equal Cardinalities of Unit Square and Unit Interval

Consider the open unit square \(I \times I = \left( {0,1} \right) \times \left( {0,1} \right)\) and the open unit interval \(I = \left( {0,1} \right).\)

Unit square and unit interval have the same cardinality.
Figure 3.

To build an injection from \(I \times I\) to \(I,\) we represent the coordinates \(\left( {x,y} \right)\) of an arbitrary point of the square by their decimal expansions. Let

\[x = 0.{x_1}{x_2}{x_3} \ldots ,\;\;y = 0.{y_1}{y_2}{y_3} \ldots \]

We assume here that \(x\) and \(y\) do not have a repeating sequence of \(9'\text{s}\) at the end. If there exists such a number, it must be represented as a terminating decimal:

\[0.73699999 \ldots = 0.737\]

We define the mapping \(f:I \times I \to I\) as follows:

\[f\left( {\left( {x,y} \right)} \right) = z = 0.{x_1}{y_1}{x_2}{y_2}{x_3}{y_3} \ldots \]

If \(\left( {{x},{y}} \right) \ne \left( {{x^\prime},{y^\prime}} \right),\) then the two numbers have a different coordinate, and hence, their decimal expansions are also different, that is, \({z_1} \ne {z_2}.\) This means that the function \(f\) is injective.

There is a natural injection from \(I\) to \(I \times I:\) \(g\left( z \right) = \left( {a,z} \right),\) where \(a\) is any number from the interval \(\left( {0,1} \right).\) For example, if we take \(a = \frac{1}{2},\) then \(g\left( z \right) = \left( {\frac{1}{2}, z} \right).\)

By the Cantor-Schröder-Bernstein theorem,

\[\left| {\left( {0,1} \right) \times \left( {0,1} \right)} \right| = \left| {\left( {0,1} \right)} \right|.\]

Thus, the open unit square and the open unit interval have the same cardinality, which is an amazing result!

Corollary 3. \(\left| \mathbb{R} \right| = \left| {\mathbb{R} \times \mathbb{R}} \right|\)

We can map \(I \to \mathbb{R}\) using the function

\[f\left( x \right) = \tan \left( {\pi x - \frac{\pi }{2}} \right).\]

This mapping is bijective.

Similarly, the mapping \(I \times I \to \mathbb{R} \times \mathbb{R}\) is given by the function

\[g\left( {x,y} \right) = \left( {\tan \left( {\pi x - \frac{\pi }{2}} \right),\tan \left( {\pi y - \frac{\pi }{2}} \right)} \right)\]

that is also bijective.

Then we have

\[\left| {\mathbb{R} \times \mathbb{R}} \right| = \left| {\left( {0,1} \right) \times \left( {0,1} \right)} \right| = \left| {\left( {0,1} \right)} \right| = \left| \mathbb{R} \right|,\]

that is, the set of points of a plane and the set of points of a number line have the same cardinality.

Corollary 4. \(\left| \mathbb{R} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|\)

Notice that the cardinality of \(\mathbb{R}\) is the same as the cardinality of the open unit interval \(I = \left( {0,1} \right)\) because there exists a bijective function \(f: I \to \mathbb{R}\) between the sets:

\[f\left( x \right) = \tan \left( {\pi x - \frac{\pi }{2}} \right).\]

Next, similarly to the Corollary \(1,\) we can easily prove that the open interval \({\left( {0,1} \right)}\) and the closed interval \({\left[ {0,1} \right]}\) have the same cardinality. Hence, we will just need to show that \(\left| {\left[ {0,1} \right]} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|.\) By the Cantor-Schröder-Bernstein theorem, it suffices to find injections \(\left[ {0,1} \right] \to \mathcal{P}\left( \mathbb{N} \right)\) and \(\mathcal{P}\left( \mathbb{N} \right) \to \left[ {0,1} \right].\)

To define \(\left[ {0,1} \right] \to \mathcal{P}\left( \mathbb{N} \right),\) we use the fact that any real can be represented as a binary decimal. Let

\[x = 0.{b_1}{b_2}{b_3}{b_4} \ldots \in \left[ {0,1} \right], \;\text{where }\;{b_i} \in \left\{ {0,1} \right\}.\]

Then the mapping function is given by

\[f\left( x \right) = \left\{ {i \in \mathbb{N} \mid {b_i} = 1} \right\}.\]

For example,

\[f\left( {0.110001 \cdots } \right) = \left\{ {1,2,6, \ldots } \right\},\;\;f\left( {0.00111010 \cdots } \right) = \left\{ {3,4,5,7, \ldots } \right\}.\]

The function \(f\) is injective.

To define \(\mathcal{P}\left( \mathbb{N} \right) \to \left[ {0,1} \right],\) we use the similar approach. Let \(S\) be a subset of \(\mathbb{N}.\) The function \(g\left( S \right)\) is given by

\[g\left( S \right) = 0.{b_1}{b_2}{b_3}{b_4} \ldots ,\]

where \(b_i = 1\) if \(i \in S\) and \(b_i = 0\) if \(i \not\in S.\) For example,

\[g\left( {\left\{ {1,2,7,8, \ldots } \right\}} \right) = 0,11000011 \cdots,\;\;g\left( {\left\{ {4,5,6, \ldots } \right\}} \right) = 0,000111 \cdots .\]

We also set

\[g\left( \varnothing \right) = 0,\;\;g\left( N \right) = 0.11111 \cdots .\]

It is clear that the function \(g\) is injective.

By the Cantor-Schröder-Bernstein theorem, \(\left| \left[ {0,1} \right]\right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|\) and hence

\[\left| \mathbb{R} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right|.\]

Note that if \(A\) is an arbitrary set, its power set \(\mathcal{P}\left( A \right)\) can be represented as

\[\left| {P\left( A \right)} \right| = {2^{\left| A \right|}}.\]

Indeed, with every subset \(B \subseteq A,\) we can associate the characteristic function \({\chi _B}:A \to \left\{ {0,1} \right\}\) defined by

\[{\chi _B}\left( a \right) = \left\{ {\begin{array}{*{20}{l}} {1} &{\text{if}\;\;a \in B}\\ {0} &{\text{if}\;\;a \not\in B} \end{array}} \right.,\]

where \(a \in A.\)

Clearly, the total number of subsets of \(A\) is \(2^{\left| A \right|}.\) For the set of natural numbers, we obtain

\[\left| {\mathcal{P}\left( \mathbb{N} \right)} \right| = {2^{\left| \mathbb{N} \right|}}.\]

Then

\[\left| \mathbb{R} \right| = \left| {\mathcal{P}\left( \mathbb{N} \right)} \right| = {2^{\left| \mathbb{N} \right|}}.\]

See solved problems on Page 2.

Page 1 Page 2