Logic MT24, First-order logic
Flashcards
First-order languages
@Define a (countable) first-order language $\mathcal L$.
    A countable first-order language consists of two parts:
- A countable set of non-logical symbols, each of which categorised as one of the following kinds:
    - Constant symbols: $c _ 0, c _ 1, \ldots$
- $k$-ary predicate symbols for some $k \ge 1$: $P _ 0(\cdots), P _ 1(\cdots), \ldots$
- $k$-ary function symbols for some $k \ge 1$: $f _ 0(\cdots), f _ 1(\cdots), \ldots$
 
- The following (disjoint) set of logical symbols:
    - Connectives: $\to, \lnot$ (other connectives can be expressed by logically equivalent formulas)
- Quantifier: $\forall$ (other quantifiers can be expressed by logically equivalent formulas)
- Variables: $\lbrace x _ i\rbrace _ {i \in \mathbb N}$
- Punctuation: “$,$”, “$($” and “$)$”.
- Equality: $\doteq$
 
Given a countable first-order language $\mathcal L$, @define an $\mathcal L$-term.
    Recursively, a string in $\mathcal L$ is an $\mathcal L$-term if it has one of the following forms:
- A variable $x _ i$
- A constant symbol
- $f(t _ 1, \ldots, t _ k)$ where $f$ is a $k$-ary function symbol $\mathcal L$ and each $t _ i$ is a term
What’s missing from the following definition?
  Recursively, a string in $\mathcal L$ is an $\mathcal L$-term if it has one of the following forms:
  
    - A constant symbol
- $f(t  _  1, \ldots, t  _  k)$ where $f$ is a $k$-ary function symbol $\mathcal L$ and each $t  _  i$ is a term
    Recursively, a string in $\mathcal L$ is an $\mathcal L$-term if it has one of the following forms:
- A constant symbol
- $f(t _ 1, \ldots, t _ k)$ where $f$ is a $k$-ary function symbol $\mathcal L$ and each $t _ i$ is a term
Every variable is also an $\mathcal L$-term.
Given a countable first-order language $\mathcal L$ and its corresponding $\mathcal L$-terms, @define an atomic $\mathcal L$-formula.
    Any string of the form
\[P(t _ 1, \ldots, t _ k)\]or
\[t _ 1 \doteq t _ 2\]where each $t _ i$ is an $\mathcal L$-term.
Given a countable first-order language $\mathcal L$ and its corresponding $\mathcal L$-terms and atomic $\mathcal L$-formulas, @define an $\mathcal L$-formula.
    A string is an $\mathcal L$-formula if it has one of the following forms:
- An atomic $\mathcal L$-formula (i.e. $P(t _ 1, \ldots, t _ k)$ or $t _ 1 \doteq t _ 2$ for $\mathcal L$-terms $t _ 1, \ldots, t _ k$)
- $\lnot \phi$ or $(\phi \to \psi)$ where $\phi, \psi$ are $\mathcal L$-formulas
- $\forall x _ i \phi$ where $\phi$ is an $\mathcal L$-formula and $i \in \mathbb N$.
What’s missing from the following definition?
  A string is an $\mathcal L$-formula if it has one of the following forms:
  
    - An atomic $\mathcal L$-formula (i.e. $P(t  _  1, \ldots, t  _  k)$ for $\mathcal L$-terms $t  _  1, \ldots, t  _  k$)
- $\lnot \phi$ or $(\phi \to \psi)$ where $\phi, \psi$ are $\mathcal L$-formulas
- $\forall x  _  i \phi$ where $\phi$ is an $\mathcal L$-formula and $i \in \mathbb N$.
    A string is an $\mathcal L$-formula if it has one of the following forms:
- An atomic $\mathcal L$-formula (i.e. $P(t _ 1, \ldots, t _ k)$ for $\mathcal L$-terms $t _ 1, \ldots, t _ k$)
- $\lnot \phi$ or $(\phi \to \psi)$ where $\phi, \psi$ are $\mathcal L$-formulas
- $\forall x _ i \phi$ where $\phi$ is an $\mathcal L$-formula and $i \in \mathbb N$.
The definition of an atomic $\mathcal L$-formula also includes $t _ 1 \doteq t _ 2$.
Which is it?
  - If $\phi$ is an $\mathcal L$-formula where $x  _  i$ free in $\phi$, the $\forall x  _  i \phi$ is an $\mathcal L$-formula
- If $\phi$ is an $\mathcal L$-formula, then $\forall x  _  i \phi$ is an $\mathcal L$-formula
    2, If $\phi$ is an $\mathcal L$-formula, then $\forall x _ i \phi$ is an $\mathcal L$-formula.
Structures
Suppose $\mathcal L$ is a language. @Define an $\mathcal L$-structure $\mathcal M$.
    - A non-empty set $M$, called the domain of $\mathcal M$
- For each $k$-ary function symbol $f \in \mathcal L$, a $k$-ary function $f^\mathcal M : M^k \to M$
- For each $k$-ary predicate symbol $P \in \mathcal L$, a subset $P^\mathcal M \subseteq M^k$
- For each constant symbol $c \in \mathcal L$, an element $c^\mathcal M \in M$
This can be seen as a 4-tuple
\[\mathcal M = \langle A, (f^\mathcal A) _ {f \in \text{Func}(\mathcal L)}, (P^\mathcal A) _ {P \in \text{Pred}(\mathcal L)}, (c^\mathcal A) _ {c \in \text{Const}(\mathcal L)} \rangle\]What is meant by an interpretation of a language $\mathcal L$?
    A choice of $\mathcal L$-structure.
@Define the notation
\[\mathcal M = \langle M; f^\mathcal M, P^\mathcal M, c^\mathcal M \rangle\]
    A structure in the language $\lbrace f, P, c\rbrace$.
Assignments and valuations
Suppose $\mathcal M = \langle M; f^\mathcal M, f^\mathcal M, P^\mathcal M, c^\mathcal M \rangle$ is an $\mathcal L$-structure. @Define what is meant by an assignment in $\mathcal M$ and what it means for $\mathcal M \models  _  a \phi$.
    An assignment in $\mathcal M$ is a function
\[a : \lbrace x _ 0, x _ 1, \ldots\rbrace \to M\](i.e. it assigns values to each of the variables in $\mathcal L$).
It recursively determines a function
\[\tilde a : \text{Terms}(\mathcal L) \to M\]by:
- $\tilde a(x _ i) := a(x _ i)$ for $i = 0, 1, \ldots$
- $\tilde a(c) := c^\mathcal M$ where $c \in \mathcal L$ is a constant symbol
- $\tilde a(f(t _ 1, \ldots, t _ k)) := f^\mathcal M (\tilde a(t _ 1), \ldots, \tilde a(t _ k))$ where $f \in \mathcal L$ is a $k$-ary function symbol and $t _ i \in \text{Term}(\mathcal L)$.
Then whether
\[\mathcal M \models _ a \phi\]is also determined recursively:
- Atomic:
    - $\mathcal M \models _ a P(t _ 1, \ldots, t _ k)$ iff $(\tilde a(t _ 1), \ldots, \tilde a(t _ k)) \in P^\mathcal M$
- $\mathcal M \models _ a t _ 1 \doteq t _ 2$ iff $\tilde a(t _ 1) = \tilde a(t _ 2)$ (remember equality!)
 
- Built from atomic:
    - $\mathcal M \models _ a \lnot \psi$ iff $\mathcal M \not\models _ a \psi$
- $\mathcal M \models _ a (\psi \to \chi)$ iff $\mathcal M \not\models _ a \psi$ or $\mathcal M \models _ a \chi$
- $\mathcal M \models _ a \forall x _ i \psi$ iff for all assignments $a^\star$ such that $a^\star(x _ j) = a(x _ j)$ for all $j \ne i$, $\mathcal M \models _ {a^\ast} \psi$ (i.e. under all possible assignments of $x _ i$ while keeping every other assignment the same, the formula holds)
 
Suppose $a$ and $a^\star$ are assignments in a structure $\mathcal M$. @Define the notation
\[a^\star \sim  _  i a\]
    for all $j \ne i$.
Suppose:
  - $m$ is a constant symbol in a structure $\mathcal M$
- $a$ is an assignment
@Define the notation
\[a[m / x  _  i]\]
    The unique assignment such that $a [ m/x _ i ] \sim _ i a$ and $a [ m/x _ i ] (x _ i) = m$, i.e.
\[a [ m/x _ i ] (x _ j) = \begin{cases} a(x _ j) &\text{if } j \ne i \\\\ m &\text{if } j = i \end{cases}\]Entailment
@Define what it means for a first-order $\mathcal L$-formula $\phi$ to be logically valid, written $\models \phi$.
    for all $\mathcal L$-structures $\mathcal M$ and for all assignments $a$ in $\mathcal M$.
@Define what it means for a first-order $\mathcal L$-formula $\phi$ to be satisfiable.
    where $a$ is some assignment in $\mathcal M$ .
Suppose that $\Gamma \subseteq \text{Form}(\mathcal L)$, i.e. a subset of first-order formulas over a language $\mathcal L$. @Define what it means to write
\[\mathcal M \models  _  a \Gamma\]
    For all $\phi \in \Gamma$,
\[\mathcal M \models _ a \phi\]Suppose that $\Gamma \subseteq \text{Form}(\mathcal L)$, i.e. a subset of first-order formulas over a language $\mathcal L$ and that $\phi$ is a formula. @Define what it means for $\phi$ to be a logical consequence of $\Gamma$, written $\Gamma \models \phi$.
    For all $\mathcal L$-structures $\mathcal M$ and for all assignments $a$ in $\mathcal M$, then
\[\mathcal M \models _ a \Gamma \text{ implies } \mathcal M \models _ a \phi\]Suppose $\mathcal L$ is a first-order language and that $\phi, \psi \in \text{Form}(\mathcal L)$. @Define what it means for $\phi$ and $\psi$ be logically equivalent, written $\phi \equiv \psi$.
    Suppose:
  - $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\phi$ is a sentence of $\mathcal L$
Why is it unambiguous to write
\[\mathcal M \models \phi\]
without specifying the choice of assignment $a$?
    Since $\phi$ is a sentence it contains no free-variables. Hence its truth-value does not depend on the choice of assignment.
Suppose:
  - $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\sigma$ is a sentence of $\mathcal L$
What does it mean for $\mathcal M$ to be a “model” of $\sigma$?
    Tautologies
@Define what it means for a first-order formula $\phi \in \text{Form}(\mathcal L)$ to be a (first-order) tautology?
    $\phi$ comes from putting first-order formulas into a logically valid propositional logic formula, i.e. there exists some logically valid propositional formula $\alpha \in \text{Form}(\mathcal L _ 0)$ containing propositional variables $p _ 1, \ldots, p _ n$ and some $\psi _ 0, \ldots, \psi _ n \in \text{Form}(\mathcal L)$ such that $\phi$ is obtained by substituting each occurrence of $p _ i$ by $\psi _ i$.
@Prove that if a first-order formula $\phi \in \text{Form}(\mathcal L)$ is a tautology, then it is logically valid.
    By the definition of $\phi$ being a tautology, there exists some logically valid propositional formula $\alpha \in \text{Form}(\mathcal L _ 0)$ containing propositional variables $p _ 1, \ldots, p _ n$ and some $\psi _ 0, \ldots, \psi _ n \in \text{Form}(\mathcal L)$ such that $\phi$ is obtained by substituting each occurrence of $p _ i$ by $\psi _ i$.
Take any structure $\mathcal M$ and a corresponding assignment $a$. Consider the propositional valuation
\[v(p _ i) = \begin{cases} \mathbf{true} & \text{if } \mathcal M \models _ a \psi _ i \\\\ \mathbf{false} & \text{otherwise} \end{cases}\]Then it follows that
\[\tilde v(\alpha) = \mathbf{true}\]by the validity of $\alpha$. By the definition of $\models$, it follows that
\[\mathcal M \models _ a \phi\]Since $a$ and $\mathcal M$ were arbitrary, $\phi$ is logically valid.
Give an @example of first-order logical validity that is not a tautology.
    The only propositional formula that would work would $p _ 0$, which is not a valid propositional formula.
Free and bound variables
Suppose:
  - $\mathcal L$ is a first-order language
- $\phi$ is an $\mathcal L$-formula
- $x \in \lbrace x  _  0, x  _  1, \ldots\rbrace$ is a variable
@Define (inductively) what it means for an occurrence of $x$ in $\phi$ to be free.
    Either:
- $\phi$ is atomic
- $\phi = \lnot \psi$ and $x$ is free in $\psi$
- $\phi = (\psi \to \chi)$ and $x$ is free in $\psi$ or in $\chi$
- $\phi = \forall x _ i \psi$ and $x$ is free in $\psi$, and $x \ne x _ i$.
Suppose:
  - $\mathcal L$ is a first-order language
- $\phi$ is an $\mathcal L$-formula
- $\mathcal M$ is an $\mathcal L$-structure
- $a  _  1$, $a  _  2$ are assignments in $\mathcal M$
@Prove that if $a  _  1$ and $a  _  2$ agree on the free variables of $\phi$ (i.e. $a  _  1(x  _  i) = a  _  2(x  _  i)$ where $x  _  i$ free), then
\[\mathcal M \models  _  {a   _   1} \phi \text{ iff } \mathcal M \models   _   {a   _   2} \phi\]
    If $\phi$ is atomic, then all variables in $\phi$ must be free and so the result follows immediately since the assignments must actually be equal.
Induct on the length of $\phi$.
Case $\phi = \lnot \psi$: By the IH, it holds for $\psi$. Since whether $\mathcal M\models\lnot\psi$ is defined in terms of whether $\mathcal M \models \psi$, the result is immediate.
Case $\phi = (\psi \to \rho)$: By the IH, it holds for $\psi$ and $\rho$. Likewise, the result is immediate in this case.
Case $\phi = \forall x _ i \psi$: By the IH, it holds for $\psi$.
Suppose that $\mathcal M \models _ {a _ 1} \forall x _ i \psi$, we aim to show that $\mathcal M \models _ {a _ 2} \forall x _ i \psi$. By the definition of $\forall$, we need to show that for any assignment $a _ 2^\star$ where $a _ 2^\star \sim _ i a _ 2$, $\mathcal M \models _ {a _ 2^\star} \psi$.
Define
\[a _ 1^\star := a _ 1[a _ 2^\star(x _ i)/x _ i]\]Then $\mathcal M \models _ {a _ 1^\star} \psi$ since $a _ 1^\star \sim _ i a _ 1$ (all variables are the same apart from $x _ i$). We now aim to show that $a _ 1^\star$ and $a _ 2^\star$ agree on the free variables of $\psi$, so that we can conclude by the inductive hypothesis that $\mathcal M \models _ {a _ 2^\star} \psi$. This amounts to showing that if $x _ j$ occurs free in $\psi$, then $a _ 2^\star(x _ j) = a _ 1^\star (x _ j)$.
If $j = i$, this follows directly from the definition of $a _ 1^\star$. If $j \ne i$, then $x _ j$ occurs free in $\phi$ and so
\[a _ 2^\star (x _ j) = a _ 2(x _ j) = a _ 1(x _ j) = a _ 1^\star(x _ j)\]Hence $\mathcal M \models _ {a _ 2^\star} \psi$. Since $a _ 2^\star$ was arbitrary with $a _ 2^\star \sim _ i a _ 2$, it follows $\mathcal M \models _ {a _ 2} \forall x _ i \psi$.
Suppose:
  - $\mathcal L$ is a first-order language
- $\alpha, \beta \in \text{Form}(\mathcal L)$
@Prove that if $x  _  i$ has no free occurrence in $\alpha$, then:
\[\models (\forall x  _  i (\alpha \to \beta) \to (\alpha \to \forall x  _  i \beta))\]
(note this is the “entailment version” of the axiom A5, this result is used is used in the proof of the soundness theorem).
    Pick an arbitrary $\mathcal L$-structure $\mathcal M$ and an assignment $a$ such that
\[\mathcal M \models _ a \forall x _ i (\alpha \to \beta)\]We aim to show that then:
\[\mathcal M \models _ a (\alpha \to \forall x _ i \beta)\]To do this, suppose $\mathcal M \models _ a \alpha$. By the definition of $\forall$, we want to show that for all $a^\star \sim _ i a$, $\mathcal M \models _ {a^\star} \beta$. Since $a$ and $a^\star$ agree on all variables apart from $x _ i$ (which is not free in $\alpha$), by the result that says:
\[\mathcal M \models _ {a _ 1} \phi \text{ iff } \mathcal M \models _ {a _ 2} \phi\]Suppose:
- $\mathcal L$ is a first-order language
- $\phi$ is an $\mathcal L$-formula
- $\mathcal M$ is an $\mathcal L$-structure
- $a _ 1$, $a _ 2$ are assignments in $\mathcal M$
- $a _ 1$ and $a _ 2$ agree on the free variables of $\phi$ (i.e. $a _ 1(x _ i) = a _ 2(x _ i)$ where $x _ i$ free) Then:
It follows that
\[\mathcal M \models _ {a^\star} \alpha\]This is useful, because with the assumption that $\mathcal M \models _ a \forall x _ i (\alpha \to \beta)$ along with the definition of $\forall$ (using the reverse direction), it follows that
\[\mathcal M \models _ {a^\star} \beta\]But unwinding everything we have shown so far, this means that
\[\mathcal M \models _ a (\alpha \to \forall x _ i \beta)\]as required.
@Define what it means for a first-order formula $\sigma$ to be a sentence.
    It contains no free variables.
@Define what it means for a formula to be prenex normal form (PNF).
    It is of the form
\[Q _ 1 x _ {i _ 1} \cdots Q _ k x _ {i _ k} \psi\]for quantifiers $Q _ 1, \ldots, Q _ k$, variables $x _ {i _ 1}, \ldots, x _ {i _ k}$ and a quantifier-free formula $\psi$.
@Define what it means for a formula to be in rectified prenex form.
    It is of the form
\[Q _ 1 x _ {i _ 1} \cdots Q _ k x _ {i _ k} \psi\]for quantifiers $Q _ 1, \ldots, Q _ k$, variables $x _ {i _ 1}, \ldots, x _ {i _ k}$ and a quantifier-free formula $\psi$ such that $i _ j \ne i _ k$ for any $j, k$.
Substitutions
Suppose:
  - $\mathcal L$ is a first-order language
- $\phi \in \text{Form}(\mathcal L)$
- $x  _  i$ is a variable
- $t$ is a term $\mathcal L$
@Define what it means to substitute $t$ for $x  _  i$ in $\phi$, written
\[(\phi)[t/x  _  i]\]
.
    Replace each free occurrence of $x _ i$ in $\phi$ with the string $t$, as long as this does not introduce new bound occurrences of variables.
More precisely:
- If $\phi$ is atomic, $(\phi)[t/x _ i]$ is the result of replacing each instance of $x _ i$ in $\phi$ with $t$.
- $(\lnot \psi)[t/x _ i] := \lnot (\psi) [t/x _ i]$ (which is undefined if $(\psi)[t/x _ i]$ is)
- $((\psi \to \chi))[t/x _ i] := ((\psi)[t/x _ i] \to (\chi)[t/x _ i])$ (which is undefined if $(\psi)[t/x _ i]$ or $(\chi)[t/x _ i]$ is)
- $(\forall x _ i \psi)[t/x _ i] := \forall x _ i \psi$
- If $j \ne i$, then $(\forall x _ j \psi)[t/x _ i] := \forall x _ j (\psi) [t/x _ i]$, unless $x _ j$ occurs in $t$ and $x _ i$ occurs free in $\psi$ (in which case it is undefined).
Suppose:
  - $\mathcal L$ is a first-order language
- $t \in \text{Term}(\mathcal L)$
- $a$ is an assignment in an $\mathcal L$-structure
@Define the notation
\[a[t/x  _  i]\]
    @State the substitution lemma for first-order logic.
    Suppose:
- $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\phi \in \text{Form}(\mathcal L)$
- $t \in \text{Term}(\mathcal L)$
- $\phi[t/x _ i]$ is defined
Then:
\[\mathcal M \models _ a \phi[t/x _ i] \text{ iff } \mathcal M \models _ {a[t/x _ i]} \phi\]@Prove the substitution lemma, i.e. that if
  - $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\phi \in \text{Form}(\mathcal L)$
- $t \in \text{Term}(\mathcal L)$
- $\phi[t/x  _  i]$ is defined
then:
\[\mathcal M \models  _  a \phi[t/x  _  i] \text{ iff } \mathcal M \models  _  {a[t/x  _  i]} \phi\]
    We proceed by induction on the length of $\phi$.
Base case: $\phi$ is atomic, say $\phi = P(t _ 1, \ldots, t _ k)$. For each $u \in \text{Term}(\mathcal L)$ we define $u[t/x _ i]$ by replacing each occurrence of $x _ i$ in $u$ by $t$.
Note that $\widetilde{a[t/x _ i]}(u) = \tilde a(u[t/x _ i])$ (this can be proved inductively, but is intuitive; replacing $x _ i$ by $t$ in $u$ when you “bottom out” in the recursion on the left hand side is the same as replacing all occurrences of $x _ i$ by $t$ at the start on the right hand side).
Then note that:
\[\begin{aligned} &\mathcal M \models _ {a[t/x _ i]} \phi \\\\ \text{iff }& (\widetilde{a[t/x _ i]}(t _ 1), \ldots, \widetilde{a[t/x _ i]}(t _ k)) \in P^M \\\\ \text{iff }& (\tilde a(t _ 1[t/x _ i]), \ldots,\tilde a(t _ k[t/x _ i])) \in P^M \\\\ \text{iff }& \mathcal M \models _ a P(t _ 1[t/x _ i], \ldots, t _ k[t/x _ i]) \\\\ \text{iff } & \mathcal M \models _ a \phi[t/x _ i] \end{aligned}\]as required.
Likewise, if $t _ 1 \doteq t _ 2$, the same argument applies. So the lemma holds for atomic formulas.
Inductive hypothesis: The lemma holds for shorter formulas.
Case $\phi = \lnot \psi$: Follows from inductive hypothesis.
Case $\phi = (\psi \to \chi)$: Follows from inductive hypothesis.
Case $\phi = \forall x _ i \psi$ (note $x _ i$ is the same as the $x _ i$ in the statement): Then $\phi [t/x _ i] = \phi$, $x _ i \notin \text{Free}(\phi)$. Then $a$ and $a[t/x _ i]$ agree on all $x \in \text{Free}(\phi)$, so by the lemma that says:
\[\mathcal M \models _ {a _ 1} \phi \text{ iff } \mathcal M \models _ {a _ 2} \phi\]Suppose:
- $\mathcal L$ is a first-order language
- $\phi$ is an $\mathcal L$-formula
- $\mathcal M$ is an $\mathcal L$-structure
- $a _ 1$, $a _ 2$ are assignments in $\mathcal M$
- $a _ 1$ and $a _ 2$ agree on the free variables of $\phi$ (i.e. $a _ 1(x _ i) = a _ 2(x _ i)$ where $x _ i$ free) Then:
it follows that
\[\begin{aligned} &\mathcal M \models _ {a[t/x _ i]} \phi \\\\ \text{iff }& \mathcal M \models _ a \phi \\\\ \text{iff }& \mathcal M \models _ a \phi[t/x _ i] \end{aligned}\]as required.
Case $\phi = \forall x _ j \psi$, $j \ne i$: Expanding the definitions, note that $\phi [t/x _ i] = \forall x _ j (\psi)[t/x _ i]$.
If $x _ i$ is not free in $\psi$, then $\phi[t/x _ i] = \phi$ and we can argue identically to the previous case.
Now if $x _ i$ is free in $\psi$, then since $\phi[t/x _ i]$ is defined (by assumption), it follows that $x _ j$ does not occur in $t$.
We would like to argue like so:
\[\begin{aligned} &\mathcal M \models _ a \phi[t/x _ i] \\\\ \text{iff }& \mathcal M \models _ a \forall x _ j (\psi)[t/x _ i] && \text{by expanding out} \\\\ \text{iff }& \mathcal M \models _ {a^\star} \psi[t/x _ i] \text{ for all }a^\star \sim _ j a&& \text{by def. of }\forall \\\\ \text{iff }&\mathcal M \models _ {a^\star[t/x _ i]} \psi \text{ for all } a^\star \sim _ j a && \text{by inductive hypothesis} \\\\ \text{iff }&\mathcal M \models _ {a'} \psi \text{ for all } a' \sim _ j a[t/x _ i] &&\text{by }(\star) \\\\ \text{iff }&\mathcal M \models _ {a[t/x _ i]} \forall x _ j \psi &&\text{by def. of } \forall \\\\ \text{iff }& \mathcal M \models _ {a[t/x _ i]} \phi \end{aligned}\]Where $(\star)$ is justified by the following equality:
\[\lbrace a^\star [t/x _ i] \mid a^\star \sim _ j a\rbrace = \lbrace a' \mid a' \sim _ j a[t/x _ i]\rbrace\]To see $\subseteq$: Suppose that $a^\star \sim _ j a$. Then $\widetilde{a^\star}(t) = \tilde a(t)$, so $a^\star[t/x _ i] \sim _ j a[t/x _ i]$, since $a^\star[t/x _ i] = a^\star[\widetilde{a^\star}(t) / x _ i]$ and $a[t/x _ i] = a[\tilde a(t)/x _ i]$, and by assumption $a$ and $a^\star$ agree on all other variables.
To see $\supseteq$: Suppose that $a’ \sim _ j a[t/x _ i]$. Then $a’ = a^\star[t/x _ i]$ for some $a^\star \sim _ j a$, namely $a^\star := a[a’(x _ j)/x _ j]$, so we have the other conclusion.
This concludes the final case.
The substitution lemma states that if
  - $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\phi \in \text{Form}(\mathcal L)$
- $t \in \text{Term}(\mathcal L)$
- $\phi[t/x  _  i]$ is defined
then:
\[\mathcal M \models  _  a \phi[t/x  _  i] \text{ iff } \mathcal M \models  _  {a[t/x  _  i]} \phi\]
The typical proof proceeds by induction on the size of $\phi$. The crux is showing the following:
\[\mathcal M \models _ a \forall x _ j (\psi)[t/x _ i] \text{ iff } \mathcal M \models _ {a[t/x _ i]} \forall x _ j \psi\]
Where $x _ j \ne x _ i$ (if they are equal, the proof is much simpler). @Justify just this step.
    If $x _ i$ is not free in $\psi$, then $\phi[t/x _ i] = \phi$ and we can argue identically to the previous case.
Now if $x _ i$ is free in $\psi$, then since $\phi[t/x _ i]$ is defined (by assumption), it follows that $x _ j$ does not occur in $t$.
We would like to argue like so:
\[\begin{aligned} &\mathcal M \models _ a \phi[t/x _ i] \\\\ \text{iff }& \mathcal M \models _ a \forall x _ j (\psi)[t/x _ i] && \text{by expanding out} \\\\ \text{iff }& \mathcal M \models _ {a^\star} \psi[t/x _ i] \text{ for all }a^\star \sim _ j a&& \text{by def. of }\forall \\\\ \text{iff }&\mathcal M \models _ {a^\star[t/x _ i]} \psi \text{ for all } a^\star \sim _ j a && \text{by inductive hypothesis} \\\\ \text{iff }&\mathcal M \models _ {a'} \psi \text{ for all } a' \sim _ j a[t/x _ i] &&\text{by }(\star) \\\\ \text{iff }&\mathcal M \models _ {a[t/x _ i]} \forall x _ j \psi &&\text{by def. of } \forall \\\\ \text{iff }& \mathcal M \models _ {a[t/x _ i]} \phi \end{aligned}\]Where $(\star)$ is justified by the following equality:
\[\lbrace a^\star [t/x _ i] \mid a^\star \sim _ j a\rbrace = \lbrace a' \mid a' \sim _ j a[t/x _ i]\rbrace\]To see $\subseteq$: Suppose that $a^\star \sim _ j a$. Then $\widetilde{a^\star}(t) = \tilde a(t)$, so $a^\star[t/x _ i] \sim _ j a[t/x _ i]$, since $a^\star[t/x _ i] = a^\star[\widetilde{a^\star}(t) / x _ i]$ and $a[t/x _ i] = a[\tilde a(t)/x _ i]$, and by assumption $a$ and $a^\star$ agree on all other variables.
To see $\supseteq$: Suppose that $a’ \sim _ j a[t/x _ i]$. Then $a’ = a^\star[t/x _ i]$ for some $a^\star \sim _ j a$, namely $a^\star := a[a’(x _ j)/x _ j]$, so we have the other conclusion.
The substitution lemma states that if:
  - $\mathcal L$ is a first-order language
- $\mathcal M$ is an $\mathcal L$-structure
- $\phi \in \text{Form}(\mathcal L)$
- $t \in \text{Term}(\mathcal L)$
- $\phi[t/x  _  i]$ is defined
then:
\[\mathcal M \models  _  a \phi[t/x  _  i] \text{ iff } \mathcal M \models  _  {a[t/x  _  i]} \phi\]
@Justify, as a corollary, that for any formula $\phi$ and term $t$ where $\phi[t/x  _  i]$ is defined,
\[\models (\forall x  _  i \phi \to \phi[t/x  _  i])\]
(note this is the “entailment version” of the axiom A4, this result is used is used in the proof of the soundness theorem).
    Fix an assignment $a$ in an $\mathcal L$-structure $\mathcal M$.
Suppose that $\mathcal M \models _ a \forall x _ i \phi$. Then $\mathcal M \models _ {a[t/x _ i]} \phi$ since $a[t/x _ i] \sim _ i a$. Therefore $\mathcal M \models _ a \phi[t/x _ i]$ by the substitution lemma.
@Prove that if:
  - $\mathcal L$ is a first-order language
- $\alpha$ is an atomic $\mathcal L$-formula
then:
\[\models (x _ i \doteq x _ j \to (\alpha \to \alpha[x _ j/x _ i]))\]
(note this is the “entailment version” of the axiom A8, this result is used is used in the proof of the soundness theorem).
    Let $\mathcal M$ be an $\mathcal L$-structure and $a$ an assignment $\mathcal M$ such that $\mathcal M \models _ a x _ i \doteq x _ j$ and $\mathcal M \models _ a \alpha$. We aim to show that $\mathcal M \models _ a \alpha[x _ j / x _ i]$.
By the fact $\mathcal M$ entails $x _ i \doteq x _ j$, we have $a(x _ i) = a(x _ j)$ and thus
\[\tilde a(t[x _ j / x _ i]) = \tilde a(t)\]for any term $t$, by induction on the length of $t$.
There are two cases to consider, either $\alpha = P(t _ 1, \ldots, t _ k)$, or $\alpha = t _ 1 \doteq t _ 2$.
Suppose $\alpha = P(t _ 1, \ldots, t _ k)$. Then
\[\begin{aligned} M \models _ a \alpha &\implies (\tilde a(t _ 1), \ldots, \tilde a(t _ k)) \in P^\mathcal M \\\\ &\implies (\tilde a(t _ 1[x _ j / x _ i]), \ldots, \tilde a(t _ k[x _ j/x _ i])) \in P^\mathcal M \\\\ &\implies \mathcal M \models _ a P(t _ 1[x _ j/x _ i], \ldots, t _ k[x _ j / x _ i]) \\\\ &\implies \mathcal M \models _ a \alpha[x _ j / x _ i] \end{aligned}\]as required, and a similar argument can be made for $\alpha$ of the form $t _ 1 \doteq t _ 2$.
As $\mathcal M$ and $a$ was arbitrary, we have
\[\models (x _ i \doteq x _ j \to (\alpha \to \alpha[x _ j/x _ i]))\]as required.