Backpropagation


CSE 891: Deep Learning

Vishnu Boddeti

Wednesday, September 16, 2020

Univariate Backpropagation

  • Example: univariate least squares regression
  • Forward Pass:
$$ \begin{eqnarray} z &=& wx+b \nonumber \\ y &=& \sigma(z) \nonumber \\ \mathcal{L} &=& \frac{1}{2}(y-t)^2 \nonumber \\ \mathcal{R} &=& \frac{1}{2}w^2 \nonumber \\ \mathcal{L}_{reg} &=& \mathcal{L} + \lambda\mathcal{R} \nonumber \end{eqnarray} $$
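To make the chain rule concrete, here is a minimal NumPy sketch of this forward pass together with the corresponding backward pass. The function names and the `*_bar` convention for the error signals $\partial\mathcal{L}_{reg}/\partial(\cdot)$ are my own, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(w, b, x, t, lam):
    # Forward pass, following the slide's equations
    z = w * x + b
    y = sigmoid(z)
    L = 0.5 * (y - t) ** 2
    R = 0.5 * w ** 2
    L_reg = L + lam * R
    return z, y, L_reg

def backward(w, b, x, t, lam):
    # Backward pass: each *_bar is dL_reg / d(that variable)
    z, y, _ = forward(w, b, x, t, lam)
    y_bar = y - t                    # dL/dy
    z_bar = y_bar * y * (1.0 - y)    # chain through sigma: sigma'(z) = y(1 - y)
    w_bar = z_bar * x + lam * w      # loss term plus regularizer term
    b_bar = z_bar
    return w_bar, b_bar
```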

Multivariate Backpropagation

  • Example: Multilayer Perceptron (multiple outputs)
  • Forward Pass:
$$ \begin{eqnarray} z_i &=& \sum_{j}w^{(1)}_{ij}x_j + b^{(1)}_i \nonumber \\ h_i &=& \sigma(z_i) \nonumber \\ y_k &=& \sum_{i}w^{(2)}_{ki}h_i+b^{(2)}_k \nonumber \\ \mathcal{L} &=& \frac{1}{2}\sum_k(y_k-t_k)^2 \nonumber \end{eqnarray} $$
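Written as code, the backward pass simply runs these equations in reverse. The sketch below is a hedged NumPy version with hypothetical names; the backward equations are my own application of the chain rule to the forward pass above, using `@` for matrix-vector products and `np.outer` for the weight gradients:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_backprop(W1, b1, W2, b2, x, t):
    # Forward pass: x is the input vector, t the target vector
    z = W1 @ x + b1
    h = sigmoid(z)
    y = W2 @ h + b2

    # Backward pass: each *_bar is the error signal dL / d(that variable)
    y_bar = y - t                    # from L = 1/2 sum_k (y_k - t_k)^2
    W2_bar = np.outer(y_bar, h)      # dL/dW2_ki = y_bar_k * h_i
    b2_bar = y_bar
    h_bar = W2.T @ y_bar             # h_bar_i = sum_k y_bar_k * W2_ki
    z_bar = h_bar * h * (1.0 - h)    # sigma'(z) = h(1 - h)
    W1_bar = np.outer(z_bar, x)
    b1_bar = z_bar
    return W1_bar, b1_bar, W2_bar, b2_bar
```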

Vector Form

  • Consider a computation graph in which a vector $\mathbf{z}$ feeds into a vector $\mathbf{y}$:
  • Backprop rules:
  • $$ \begin{equation} z_{j}' = \sum_{k}y_k'\frac{\partial y_k}{\partial z_j} \quad \mbox{or} \quad \mathbf{z}' = \left(\frac{\partial \mathbf{y}}{\partial \mathbf{z}}\right)^{T}\mathbf{y}' \end{equation} $$
  • where $\frac{\partial \mathbf{y}}{\partial \mathbf{z}}$ is the Jacobian matrix: $\mathbf{J} = \frac{\partial \mathbf{y}}{\partial \mathbf{z}} = \begin{bmatrix} \frac{\partial y_1}{\partial z_1} & \dots & \frac{\partial y_1}{\partial z_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial z_1} & \dots & \frac{\partial y_m}{\partial z_n} \end{bmatrix}$
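For a quick numerical illustration of the rule $\mathbf{z}' = \left(\frac{\partial \mathbf{y}}{\partial \mathbf{z}}\right)^T\mathbf{y}'$, take the elementwise map $\mathbf{y} = \sigma(\mathbf{z})$, whose Jacobian is diagonal. A small self-contained check, with made-up values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Elementwise map y = sigmoid(z): its Jacobian is diagonal
z = np.array([0.5, -1.0, 2.0])
y = sigmoid(z)
J = np.diag(y * (1.0 - y))         # dy_k/dz_j = sigma'(z_j) if k == j, else 0

y_bar = np.array([1.0, 2.0, 3.0])  # some upstream error signal y'
z_bar = J.T @ y_bar                # backprop rule: z' = J^T y'

# For an elementwise map this reduces to z' = y' * sigma'(z)
assert np.allclose(z_bar, y_bar * y * (1.0 - y))
```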

MLP Backpropagation

  • Example: Multilayer Perceptron (vector form)
  • Forward Pass:
$$ \begin{eqnarray} \mathbf{z} &=& \mathbf{W}^{(1)}\mathbf{x}+\mathbf{b}^{(1)} \nonumber \\ \mathbf{h} &=& \sigma(\mathbf{z}) \nonumber \\ \mathbf{y} &=& \mathbf{W}^{(2)}\mathbf{h}+\mathbf{b}^{(2)} \nonumber \\ \mathcal{L} &=& \frac{1}{2}\|\mathbf{t}-\mathbf{y}\|^2 \nonumber \end{eqnarray} $$
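A standard way to validate a hand-derived backward pass like this one is a finite-difference gradient check. The sketch below (NumPy, with hypothetical function names and arbitrarily chosen shapes) compares the analytic gradient of one entry of $\mathbf{W}^{(1)}$ against central differences:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W1, b1, W2, b2, x, t):
    h = sigmoid(W1 @ x + b1)
    y = W2 @ h + b2
    return 0.5 * np.sum((t - y) ** 2)

def grads(W1, b1, W2, b2, x, t):
    # Analytic gradients via the vector-form backprop rules
    z = W1 @ x + b1
    h = sigmoid(z)
    y = W2 @ h + b2
    y_bar = y - t
    h_bar = W2.T @ y_bar
    z_bar = h_bar * h * (1.0 - h)
    return np.outer(z_bar, x), z_bar, np.outer(y_bar, h), y_bar

# Gradient check on one weight entry, using central differences
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x, t = rng.normal(size=3), rng.normal(size=2)

W1_bar, _, _, _ = grads(W1, b1, W2, b2, x, t)
eps = 1e-6
W1p, W1m = W1.copy(), W1.copy()
W1p[0, 0] += eps
W1m[0, 0] -= eps
fd = (loss(W1p, b1, W2, b2, x, t) - loss(W1m, b1, W2, b2, x, t)) / (2 * eps)
assert np.isclose(W1_bar[0, 0], fd)
```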