15. Space Complexity

Time is not the only resource that is of interest in algorithms. Another important one is the amount of memory that algorithms require. The minimum amount of memory required to solve various computational problems can be studied through space complexity.

Space Complexity Classes

The space cost of the Turing machine $M$ on input $x$ is the total number of distinct tape cells that are visited by $M$’s tape head before $M$ halts. When $M$ is a multitape machine, we count the total number of visited cells in all the tapes to obtain the space cost.

Definition. The (worst-case) space cost of the Turing machine $M$ is the function $s : \mathbb{N} \to \mathbb{N}$ where $s(n)$ is the maximum space cost of $M$ on any input $x$ of length $|x| = n$.

For every function $s : \mathbb{N} \to \mathbb{N}$, $\mathbf{SPACE}(s)$ is the set of all languages that can be decided by a Turing machine with worst-case space cost bounded above by $O(s)$; and

We can also extend the notion of space cost and space complexity classes to verifiers and to probabilistic Turing machines, by again counting the number of distinct squares of the input tape visited during execution. Doing so leads to analogues of the time complexity classes we have seen so far.

Space Class Definition
$\mathbf{PSPACE}$ \(\bigcup_{k \ge 1} \mathbf{SPACE}(n^k)\)
$\mathbf{NPSPACE}$ All languages decidable by polynomial-space verifiers.
$\mathbf{BPPSPACE}$ All languages $\frac13$-decidable by polynomial-space probabilistic TMs.
$\mathbf{EXPSPACE}$ \(\bigcup_{k \ge 0} \mathbf{SPACE}(2^{n^k})\)

The class $\mathbf{PSPACE}$ contains $\mathbf{P}$ because a Turing machine that runs in polynomial time can only visit a polynomial number of distinct cells on the tape. It also contains some problems that are not obviously solvable in polynomial time.

Proposition. $\textsf{SAT} \in \mathbf{PSPACE}$.

Proof. Consider the simple exhaustive search algorithm for solving $\textsf{SAT}$. Given a formula $\phi$, we start by checking to see if the assignment $x = 0^n$ satisfies it. If it does, we accept. Otherwise, we erase our work on the tape (leaving the encoding of $\phi$ and the assignment to $x$ that we just checked), increment the assignment, and repeat. If we have checked all the possible assignments of $x$ and none have satisfied$\phi$, then we halt and reject.

This simple algorithm decides $\textsf{SAT}$ because it accepts whenever $\phi$ has a satisfying assignment and rejects otherwise. And because it reuses the same memory for each assignment test, the space cost of this algorithm is the space required to check a single assignment, which is polynomial in the length of the encoding of $\phi$.

It even contains some languages that may not be in the polynomial hierarchy $\mathbf{PH}$.

Theorem. The Totally Quantified Boolean Formula language

\[\textsf{TQBF} = \{ \left< \phi \right> : \exists x_1 \forall x_2 \cdots \exists/\forall x_n \mbox{ such that } \phi(x_1,x_2,\ldots,x_n) = 1\}\]

is in $\mathbf{PSPACE}$.

Proof sketch. The proof strategy is the same as the one for $\textsf{SAT}$: we iterate over all possible assignments of $x_1,\ldots,x_n$ to see if the condition on $\phi$ is satisfied.

Time and Space

We can establish formal connections between time and space complexity classes in the following way.

Theorem. (Time-Space Hierarchy) For every function $s : \mathbb{N} \to \mathbb{N}$,

\[\mathbf{NTIME}(s) \subseteq \mathbf{SPACE}(s) \subseteq \mathbf{TIME}(2^{O(s)}).\]

Proof. We prove both inclusions separately with extensions of the universal Turing machine.

$\mathbf{NTIME}(s) \subseteq \mathbf{SPACE}(s)$:

Fix any $L \in \mathbf{NTIME}(s)$. Let $V$ be a verifier that decides $L$ and has time cost $O(s)$. We can simulate all paths of length $O(s)$ with a deterministic Turing machine that iterates over all possible certificates $c$ to $V$ of length $s$. By erasing the contents of the simulation part of the tape between each simulation, we end up with overall space cost $O(s)$ since that is the space cost of $V$ and also an upper bound on the additional space we require for the simulation. And the resulting simulator accepts if and only if there is a certificate that causes $V$ to accept, so it decides $L$.

$\mathbf{SPACE}(s) \subseteq \mathbf{TIME}(2^{O(s)})$:

Fix any $L \in \mathbf{SPACE}(s)$. Let $M$ be any Turing machine with $m$ states and tape alphabet $\Gamma$ that decides $L$ and has space cost $O(s)$. On any input of length $n$, $M$ can have at most $m \cdot O(s) \cdot |\Gamma|^{O(s)}$ distinct configurations. Since $M$ always terminates, it can never be in the same configuration twice in a computational path. Thus, it must have time cost $t(n) = 2^{O(s(n))}$.

This theorem immediately lets us place $\mathbf{PSPACE}$ within the time complexity hierarchy.

Corollary. $\mathbf{P} \subseteq \mathbf{NP} \subseteq \mathbf{PSPACE} \subseteq \mathbf{EXP}$.

Since $\mathbf{P} \subsetneq \mathbf{EXP}$, at least one of the inclusions in the corollary is proper. It is believed that all of them are proper, but so far all of them correspond to open problems. The first one would of course resolve the famous $\mathbf{P}$ vs. $\mathbf{NP}$ problem, but establishing either of the other two would also represent a significant breakthrough in complexity theory.

Savitch’s Theorem

The space complexity analogue of the $\mathbf{P}$ vs. $\mathbf{NP}$ problem is the natural question: Does $\mathbf{PSPACE} = \mathbf{NPSPACE}$?

Remarkably, we do have an answer to this question. Every nondeterministic Turing machine can be simulated by a deterministic Turing machine with at most a quadratic increase in the amount of space required. This result is known as Savitch’s Theorem.

To prove Savitch’s Theorem, we first consider a seemingly unrelated problem on graphs. Define the $(s,t)$-connectivity (or reachability) language on directed graphs to be

\[\textsf{STCON} = \{ \left<G,s,t,k\right> : \mbox{ there is a path of length } \le k \mbox{ from } s \mbox{ to } t \mbox{ in } G\}\]

The exhaustive search algorithm for $\textsf{STCON}$ that iterates through all possible paths of length $k$ between $s$ and $t$ in $G$ and reuses the same space to check all of these paths uses $O(k \log(n))$ squares of the tape on top of the ones used to store the input. (That is the space required to store $k$ vertices in memory, so that we remember where we are in the exhaustive search.)

We can do much better using a divide-and-conquer algorithm.

Theorem. There is a Turing machine that decides $\textsf{STCON}$ and uses only $O(\log k \cdot \log n)$ squares of the input tape in addition to the ones where the input is written.

Proof. Here is a divide-and-conquer algorithm for solving $\textsf{STCON}$:

  • STCON $(G, s,t, k)$
    • if $k=1$,
      • if $s = t$ or $(s, t) \in E(G)$, Accept; otherwise Reject
    • else
      • for each $u \in V(G)$
        • if STCON$(G,s,u,\lceil \frac k2 \rceil)$ and STCON$(G,u,t,\lfloor \frac k2 \rfloor)$, Accept
      • Reject

The correctness of this algorithm follows from the observation that there is a path of length $k$ from $s$ to $t$ in $G$ if and only if there exists some vertex $u$ for which there are paths of length at most $k/2$ from $s$ to $u$ and from $u$ to $t$.

And the bound on the space cost comes from the fact that a Turing machine implementing the STCON function with the algorithm above only needs to store the values of $s$, $t$, and $k$ for each instance of the algorithm on the current stack: since $k$ gets divided by $2$ in each call, the stack has depth at most $\log_2 k$, so in total the Turing machine can be implemented to use at most $O(\log n \cdot \log k)$ squares outside of the original input.

The connection between the $\textsf{STCON}$ language and Savitch’s theorem comes from the configuration graph of verifiers. Let us consider a graph whose vertices represent all the possible configurations of the verifier $V$. There is an edge from configuration $c_1$ to $c_2$ in this graph if and only if there is a certificate which causes $V$ to go from configuration $c_1$ to $c_2$ in one computational step. We can use the STCON algorithm to deterministically decide if there is a certificate that causes $V$ to accept some input $x$ with little space overhead.

Theorem. (Savitch’s Theorem) For every $s : \mathbb{N} \to \mathbb{N}$,

\[\mathbf{NSPACE}\big(s(n)\big) \subseteq \mathbf{SPACE}\big(s(n)^2\big).\]

Proof. Fix any language $L \in \mathbf{NSPACE}(s)$. Let $V$ be a verifier that decides $L$ and has space cost $O(s(n))$.

A string $x$ is in $L$ if and only if there exists a sequence of consecutive configurations starting from the initial configuration $c_0$ of $V$ on input $x$ with some certificate $a$ and ending with an accepting configuration $c_{\bf A}$. Without loss of generality, we can also assume that $c_{\bf A}$ is unique (say, by having $M$ erase the content of the tape before accepting).

As we have seen in the proof of the Time-Space Hierarchy Theorem, a Turing machine with space cost $O(s)$ can be in at most $2^{O(s)}$ distinct configurations. So the configuration graph $G$ for $V$ on input $x$ has $N = 2^{O(s)}$ vertices. And to determine if $V$ accepts $x$, we want to determine if there is a path of length at most $k = N = 2^{O(s)}$ from $c_0$ to $c_{\bf A}$ in this graph.

Using the STCON algorithm, we can solve this problem with $O(\log k \log N) = O(s^2)$ additional squares of the tape if the configuration graph $G$ is provided on the tape as part of the input. Of course, we don’t want to write $G$ explicitly on the tape, as that would blow up the space cost of the algorithm. But we don’t need to: it suffices to examine the transition function of $V$ to determine if any two vertices $u$, $v$ in its configuration graph $G$ are connected by an edge in the graph. Doing so give us the desired algorithmm that decides $L$ with space cost $O(s(n)^2)$.

Savitch’s Theorem immediately provides a solution to the $\mathbf{PSPACE}$ vs. $\mathbf{NPSPACE}$ question.

Corollary. $\mathbf{PSPACE} = \mathbf{NPSPACE}.$

Savitch’s Theorem does leave one more fundamental open problem: is the quadratic blowup in space complexity required to go from nondeterministic to deterministic Turing machines?


Eric Blais ©2024 — Last edited on Mar. 10, 2024

results matching ""

    No results matching ""