Please explain this formal definition of computation

7

I am trying to attack TAOCP once again; given the sheer literal heaviness of the volumes, I have trouble committing to it seriously. In TAOCP 1, page 8, Basic Concepts, Knuth writes:

Let $A$ be a finite set of letters. Let $A^*$ be the set of all strings in $A$ (the set of all ordered sequences $x_1$ $x_2$ ... $x_n$ where $n \ge 0$ and $x_j$ is in $A$ for $1 \le j \le n$). The idea is to encode the states of the computation so that they are represented by strings of $A^*$. Now let $N$ be a non-negative integer and $Q$ (the state) be the set of all $(\sigma, j)$, where $\sigma$ is in $A^*$ and $j$ is an integer $0 \le j \le N$; let $I$ (the input) be the subset of $Q$ with $j=0$ and let $\Omega$ (the output) be the subset with $j = N$. If $\theta$ and $\sigma$ are strings in $A^*$, we say that $\theta$ occurs in $\sigma$ if $\sigma$ has the form $\alpha \theta \omega$ for strings $\alpha$ and $\omega$. To complete our definition, let $f$ be a function of the following type, defined by the strings $\theta_j$, $\psi_j$ and the integers $a_j$, $b_j$ for $0 \le j \le N$:

$f((\sigma, j)) = (\sigma, a_j)$ if $\theta_j$ does not occur in $\sigma$

$f((\sigma, j)) = (\alpha \psi_j \omega, b_j)$ if $\alpha$ is the shortest possible string for which $\sigma = \alpha \theta_j \omega$

$f((\sigma,N)) = (\sigma, N)$

Not being a computer scientist, I have trouble understanding the whole passage. I get the idea that behind it is a system of opcodes, but I have not made any real progress in understanding it. I think the main problem is that I don't know how to read it effectively.

Would it be possible to explain the passage above so that I can understand it, and to give me a strategy for getting into the logic when interpreting statements like these?

formal-languages turing-machines computation-models Stefano Borini

Then you should not include your comment in the supposed quote, confusing anyone who does not have the book at hand. -.- I hope my answer helps ...

Raphael

@Raphael: the quote is literal from the book. I just added a parenthetical explanation of the symbols for $I$ and $\Omega$.

Stefano Borini

@StefanoBorini: But it is not an "explanation", it is wrong. I see how you can read the original text and arrive at the conclusion you did, but it is still not helpful. If you say you are quoting something and add comments, mark them as such so that people can take them with a grain of salt.

Raphael

Some context is missing here: what computation and what states?

reinierpost

8

We're missing some context, so I have no idea what point Knuth is trying to make, but here is how to interpret a Turing machine in this way. Maybe it will help you understand what is going on. In general, a good way to get a handle on a concept is to play with it. In the case of programming paradigms, that means writing a program. In this case, I will show how to write any program.

Suppose that the tape of the Turing machine has symbols $\{0,1,\epsilon\}$ (where $\epsilon$ means "empty"), and add one more symbol $H$ that represents the location of the head. Your states are going to be pairs of the form $(q,\alpha)$, where $q$ is a state of the Turing machine and $\alpha \in \{0,\ldots,14\}$. We also identify $(F,0)$ with $N$ for any final state $F$.

On (non-empty) input $x$, your starting point will be $(Hx,(s,0))$, where $s$ is the starting state. The difficult part is to encode states. Suppose that at state $q$, upon reading input $x$, you replace it with $a(q,x)$, move in direction $D(q,x) \in \{L,R\}$, and switch to state $\sigma(q,x)$. For the $\theta$s, we have

$\begin{align*} \theta_{q,0} &= 0H0, & \theta_{q,1} &= 0H1, & \theta_{q,2} &= 0H\epsilon, \\ \theta_{q,3} &= 1H0, & \theta_{q,4} &= 1H1, & \theta_{q,5} &= 1H\epsilon, \\ \theta_{q,6} &= \epsilon H0, & \theta_{q,7} &= \epsilon H1, & \theta_{q,8} &= \epsilon H\epsilon, \\ \theta_{q,9} &= H0, & \theta_{q,10} &= H1, & \theta_{q,11} &= H\epsilon, \\ \theta_{q,12} &= 0H, & \theta_{q,13} &= 1H, & \theta_{q,14} &= \epsilon H. \end{align*}$

For the $a$s, we have $a_{q,i} = (q,i+1)$ for $i < 14$, and $a_{q,14} = (q,14)$, though we should never really get that far. For the $b$s, we have

$\begin{align*} &b_{q,0} = b_{q,3} = b_{q,6} = b_{q,9} = (\sigma(q,0),0), \\ &b_{q,1} = b_{q,4} = b_{q,7} = b_{q,10} = (\sigma(q,1),0), \\ &b_{q,2} = b_{q,5} = b_{q,8} = b_{q,11} = b_{q,12} = b_{q,13} = b_{q,14} = (\sigma(q,\epsilon),0). \end{align*}$

Now it remains to determine the $\psi$s. Let $a_0 = a(q,0)$. If $D(q,0) = L$ then

$\begin{align*} \psi_{q,0} &= H0a_0, & \psi_{q,3} &= H1a_0, & \psi_{q,6} &= \psi_{q,9} = H\epsilon a_0. \end{align*}$

If $D(q,0) = R$ then

$\begin{align*} \psi_{q,0} &= 0a_0H, & \psi_{q,3} &= 1a_0H, & \psi_{q,6} &= \epsilon a_0 H, & \psi_{q,9} &= a_0H. \end{align*}$

Next, let $a_1 = a(q,1)$. If $D(q,1) = L$ then

$\begin{align*} \psi_{q,1} &= H0a_1, & \psi_{q,4} &= H1a_1, & \psi_{q,7} &= \psi_{q,10} = H\epsilon a_1. \end{align*}$

If $D(q,1) = R$ then

$\begin{align*} \psi_{q,1} &= 0a_1H, & \psi_{q,4} &= 1a_1H, & \psi_{q,7} &= \epsilon a_1 H, & \psi_{q,10} &= a_1 H. \end{align*}$

Finally, let $a_\epsilon = a(q,\epsilon)$. If $D(q,\epsilon) = L$ then

$\begin{align*} \psi_{q,2} &= H0a_\epsilon, & \psi_{q,5} &= H1a_\epsilon, & \psi_{q,8} &= \psi_{q,11} = H\epsilon a_\epsilon, \\ \psi_{q,12} &= H0a_\epsilon, & \psi_{q,13} &= H1a_\epsilon, & \psi_{q,14} &= H\epsilon a_\epsilon. \end{align*}$

If $D(q,\epsilon) = R$ then

$\begin{align*} \psi_{q,2} &= 0a_\epsilon H, & \psi_{q,5} &= 1a_\epsilon H, & \psi_{q,8} &= \epsilon a_\epsilon H, & \psi_{q,11} &= a_\epsilon H, \\ \psi_{q,12} &= 0a_\epsilon H, & \psi_{q,13} &= 1a_\epsilon H, & \psi_{q,14} &= \epsilon a_\epsilon H. \end{align*}$

Now apply $f$ repeatedly until you get stuck. If you follow the construction, you will see that we have simulated the running of the Turing machine.
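To make the construction easier to play with, here is a minimal Python sketch of it. Several things in it are assumptions made for illustration only: the character `_` stands for $\epsilon$, the pair $(q,i)$ is numbered as the integer $15k+i$ where $k$ is the index of $q$ in a list of non-final states, and the names `compile_tm`, `run` and the bit-flipping example machine are made up. When the head sits at the left end of the string and moves right, the sketch uses the replacement $a(q,x)H$ with no extra blank, so that the symbols to the right of the head stay adjacent to it.

```python
BLANK, HEAD = "_", "H"  # "_" plays the role of epsilon (an assumption)
SYMS = ["0", "1", BLANK]

def compile_tm(states, final, write, move, goto):
    """Build the theta/psi/a/b tables for a Turing machine.

    states: list of non-final states; final: set of final states.
    write, move, goto: dicts keyed by (state, symbol).
    """
    N = 15 * len(states)

    def counter(q, i):
        # Every final state is identified with N.
        return N if q in final else 15 * states.index(q) + i

    theta, psi, a, b = [], [], [], []
    for q in states:
        for i in range(15):
            if i < 9:      # a symbol on both sides of the head
                left, read = SYMS[i // 3], SYMS[i % 3]
                pattern = left + HEAD + read
            elif i < 12:   # head at the left end of the string
                left, read = "", SYMS[i - 9]
                pattern = HEAD + read
            else:          # head at the right end: it reads a blank
                left, read = SYMS[i - 12], BLANK
                pattern = left + HEAD
            written = write[q, read]
            if move[q, read] == "L":
                # The head moves onto `left`, or onto a fresh blank.
                replacement = HEAD + (left or BLANK) + written
            else:
                replacement = left + written + HEAD
            theta.append(pattern)
            psi.append(replacement)
            a.append(counter(q, min(i + 1, 14)))
            b.append(counter(goto[q, read], 0))
    return theta, psi, a, b, N

def run(word, pc, program, max_steps=10_000):
    """Apply f repeatedly until the program counter reaches N."""
    theta, psi, a, b, N = program
    for _ in range(max_steps):
        if pc == N:
            return word
        if theta[pc] in word:
            word, pc = word.replace(theta[pc], psi[pc], 1), b[pc]
        else:
            pc = a[pc]
    raise RuntimeError("no final state reached within max_steps")

# Hypothetical machine: flip every bit, stop at the first blank.
write = {("s", "0"): "1", ("s", "1"): "0", ("s", BLANK): BLANK}
move = {("s", "0"): "R", ("s", "1"): "R", ("s", BLANK): "L"}
goto = {("s", "0"): "s", ("s", "1"): "s", ("s", BLANK): "F"}
program = compile_tm(["s"], {"F"}, write, move, goto)
print(run(HEAD + "101", 0, program))  # prints 01H0_
```

The printed string is the final tape `010_` with the head marker in front of the cell it ended on. Most calls to $f$ in this run only advance the second component $i$ while searching for the pattern that matches; the tape changes only when one of the 15 patterns for the current state is found.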

Yuval Filmus

understood: nothing. Not your fault. Thank you anyway :(

3

"We're missing some context". It is: we should have a precise description of what we mean by a "method of computation"; here is one given by A. A. Markov; there are other equivalent ones, such as Turing machines.

rgrig

6

Let's break it down bit by bit. First of all, recall what Knuth wrote on page 7:

Let us formally define a computational method to be a quadruple $(Q,I,\Omega,f)$, in which $Q$ is a set containing subsets $I$ and $\Omega$, and $f$ is a function from $Q$ into itself. [...] The four quantities $Q$, $I$, $\Omega$, $f$ are intended to represent respectively the state of the computation, the input, the output, and the computational rule.

This is the outline. You have to read "represent" as "contain"; $Q$ is going to contain states (some of which are in $I$ , some are in $\Omega$ ) and $f$ is going to be a transition function between states; think of it as a program.

Let $A$ be a finite set of letters. Let $A^*$ be the set of all strings in $A$ (the set of all ordered sequences $x_1$ $x_2$ ... $x_n$ where $n \ge 0$ and $x_j$ is in $A$ for $1 \le j \le n$ ).

This is just a reiteration of what $A^*$ is. See also here.

The idea is to encode the states of the computation so that they are represented by strings of $A^*$ .

This is probably the key sentence. We are talking about computations, that is, execution sequences of some (programming language) statements which manipulate some state, which can be thought of as values in memory cells, or valuations of variables. Knuth says here that he wants to encode these states in an abstract way, namely as words over some alphabet.

Example: Consider a program that uses (at most) $k$ variables, each of which stores an integer. That is, a state is given by the tuple of values $(x_1, \dots, x_k)$ where $x_i$ is the (current) value of the $i$-th variable. In order to encode states of this form in a formal language, we can choose $A = \{0,1,\#\}$ with $\#$ a separator. Now model such a state by $\#\overline{x_1}\#\cdots\#\overline{x_k}\#$ where $\overline{x_i}$ is the binary encoding of $x_i$.

Specifically, $(3,5,0)$ would be $\#11\#101\#0\#$ .
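As a quick sanity check of this encoding, here is a small Python sketch. The function names `encode_state` and `decode_state` are made up for illustration, and the sketch assumes at least one variable and non-negative values.

```python
def encode_state(values):
    """Encode a non-empty tuple of non-negative integers over {0, 1, #}."""
    # format(v, "b") yields the binary digits of v without a "0b" prefix.
    return "#" + "#".join(format(v, "b") for v in values) + "#"

def decode_state(word):
    """Inverse of encode_state: recover the tuple of integers."""
    return tuple(int(part, 2) for part in word.strip("#").split("#"))

print(encode_state((3, 5, 0)))     # prints #11#101#0#
print(decode_state("#11#101#0#"))  # prints (3, 5, 0)
```

Since the separator is itself a letter of $A$, the whole state is a single string in $A^*$, which is all the definition asks for.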

Now let $N$ be a non-negative integer and Q be the set of all $(\sigma, j)$ , where $\sigma$ is in $A^*$ and j is an integer $0 \le j \le N$ ; let $I$ be the subset of Q with $j=0$ and let $\Omega$ be the subset with $j = N$ .

You misquoted there (bad Stefano!); the parentheses are not in the original text, and they were misleading (see above). Knuth defines $Q$ here as the set of all possible states ( $\sigma \in A^*$ ) at all possible places in the computation ( $j$ can be understood as program counter). Therefore, $Q$ contains all statement-indexed states any computation of the algorithm given by $f$ can assume. By definition, we start with program counter $0$ and end in $N$ , thus states indexed $0$ are input states and those indexed $N$ are output states.

If $\theta$ and $\sigma$ are strings in $A^*$ , we say that $\theta$ occurs in $\sigma$ if $\sigma$ has the form $\alpha \theta \omega$ for strings $\alpha$ and $\omega$ .

I hope that this is clear; it is just a (re)definition of substrings.

To complete our definition, let $f$ be a function of the following type, defined by the strings $\theta_j$, $\psi_j$ and the integers $a_j$, $b_j$ for $0 \le j \le N$:

$f((\sigma, j)) = (\sigma, a_j)$ if $\theta_j$ does not occur in $\sigma$

$f((\sigma, j)) = (\alpha \psi_j \omega, b_j)$ if $\alpha$ is the shortest possible string for which $\sigma = \alpha \theta_j \omega$

$f((\sigma,N)) = (\sigma, N)$

This is a small programming language; if you fix $\theta_j, \psi_j, a_j, b_j$, you have a program. At program counter $j$, $f$ replaces the left-most occurrence of $\theta_j$ in the state with $\psi_j$ and goes to statement $b_j$. If there is no $\theta_j$ in the current state, it goes to statement $a_j$. Once statement $N$ is reached, the program stays there forever, which models termination.
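As a small illustration (a made-up program, not one from the book), take $A = \{x, y\}$, $N = 1$, $\theta_0 = x$, $\psi_0 = y$, $a_0 = 1$ and $b_0 = 0$. Starting from the input state $(xyx, 0)$, repeated application of $f$ gives

$\begin{align*} f((xyx,0)) &= (yyx,0), \\ f((yyx,0)) &= (yyy,0), \\ f((yyy,0)) &= (yyy,1), \\ f((yyy,1)) &= (yyy,1). \end{align*}$

The first two lines use the second case of $f$ (with $\alpha$ empty and then $\alpha = yy$), the third uses the first case because $x$ no longer occurs, and the last is the third case: the output state $(yyy, 1)$ is a fixed point.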

On the upper half of page 8, there is a more concrete example of a "program" $f$ . Keep in mind that Knuth is going to use assembly language later on; this informs how he looks at programs (atomic statements connected by jumps).

Raphael

1

Now I have a somewhat better understanding of what is going on. However, two things are still not clear, and I would really appreciate it if you could expand your answer. First, $\theta_j$, $\psi_j$, $a_j$, $b_j$: what are these strings and numbers? What do they represent? If I understand correctly, $a_j$ and $b_j$ represent the step number or command counter for state $j+1$. But I am not sure what the strings $\theta_j$, $\psi_j$ mean. Can you explain what you mean by "if you fix $\theta_j, \psi_j, a_j, b_j$, you have a program"? Or rather, how would I fix them for some example?

Georgy Bolyuba

@GeorgyBolyuba: You are right about $a_j$ and $b_j$. The program's state is a string $\sigma$ and a "program counter" $j$. $\theta_j$ and $\psi_j$ are used to modify that state (see second case of $f$). They can have all kinds of shapes; it really depends on how you encode state as a string. See the book for an example.

Raphael

5

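That text describes the following Python code:

    # A program is three tables indexed by the program counter j.
    # The one-statement program below is only a placeholder.
    subs = [("ab", "c")]  # a list of string pairs (theta_j, psi_j)
    As = [1]              # a list of integers a_j
    Bs = [0]              # a list of integers b_j
    N = len(subs)         # pc == N means the computation has finished

    def f(state, pc):
        if pc == N:
            return (state, pc)
        theta, psi = subs[pc]
        if theta in state:
            # Replace only the leftmost occurrence of theta.
            return (state.replace(theta, psi, 1), Bs[pc])
        return (state, As[pc])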

The function f is presumably going to be applied repeatedly.

The last three bullet points (the three cases of $f$) are all you really need once you understand the notation. Everything that comes before is a bit like explaining how Python works before giving the Python code.
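To see what "applied repeatedly" looks like, here is a self-contained sketch that builds $f$ from the four tables and records every state it visits. The helper names `make_f` and `trace`, the `max_steps` cap, and the one-statement example program are assumptions made for illustration; the tables here cover statements $0$ to $N-1$ only, since statement $N$ never consults them.

```python
def make_f(thetas, psis, As, Bs):
    """Return the function f for the program given by the four tables."""
    N = len(thetas)  # statements are numbered 0..N-1; N means "halted"

    def f(state, pc):
        if pc == N:
            return (state, pc)
        if thetas[pc] in state:
            # str.replace with count=1 rewrites only the leftmost occurrence.
            return (state.replace(thetas[pc], psis[pc], 1), Bs[pc])
        return (state, As[pc])

    return f, N

def trace(f, N, state, pc=0, max_steps=1000):
    """Apply f repeatedly, collecting every (state, pc) pair visited."""
    seen = [(state, pc)]
    while pc != N and len(seen) <= max_steps:
        state, pc = f(state, pc)
        seen.append((state, pc))
    return seen

# Hypothetical one-statement program: keep rewriting "ba" into "ab".
f, N = make_f(thetas=["ba"], psis=["ab"], As=[1], Bs=[0])
for step in trace(f, N, "baba"):
    print(step)
# ('baba', 0)
# ('abba', 0)
# ('abab', 0)
# ('aabb', 0)
# ('aabb', 1)
```

With this particular choice of tables the single statement sorts a string of `a`s and `b`s: it loops back to itself ($b_0 = 0$) while `ba` still occurs and jumps to $N = 1$ ($a_0 = 1$) once it does not.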

rgrig

Ah ok, it's a Turing machine.

Stefano Borini

1

Rather, it is a different model of computation with the same power as a Turing machine.

Yuval Filmus

Well, three lines below your quote Knuth says that this is equivalent to Turing machines, so presumably you already knew this when you asked. I thought you were asking for help with the notation. Now I have no idea what it is that you wanted to ask.

rgrig
