Supervised Learning Applications - I


CSE 891: Deep Learning

Vishnu Boddeti

Monday November 02, 2020

Deep Learning Applications


Image Classification


Video Recognition
Object Detection

Instance Segmentation
Semantic Segmentation

Medical Image Classification

Classification: Object Recognition

Object Detection

  • Key ideas today:
    • train longer
    • multi-scale backbone: feature pyramid networks
    • bigger backbones: e.g., ResNeXt
    • very big models work better
    • big ensembles, more data, etc
    • test-time augmentations
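
As an illustration of the last idea, here is a minimal sketch of test-time augmentation, assuming a classifier that maps a batch of images of shape (B, H, W, C) to class probabilities. The names `predict_with_tta` and `toy_model` are hypothetical, and only a horizontal flip is used. For detection, the predicted boxes would additionally have to be mapped back to the original image frame before merging, which this sketch does not cover.

```python
import numpy as np

def predict_with_tta(model, images):
    """Average class probabilities over the original and horizontally flipped batch.

    images: array of shape (B, H, W, C); model: callable returning (B, num_classes).
    """
    flipped = images[:, :, ::-1, :]          # flip along the width axis
    return 0.5 * (model(images) + model(flipped))

def toy_model(images):
    """Hypothetical stand-in classifier: 2-class softmax over the mean
    intensity of the left and right image halves."""
    half = images.shape[2] // 2
    left = images[:, :, :half].mean(axis=(1, 2, 3))
    right = images[:, :, half:].mean(axis=(1, 2, 3))
    logits = np.stack([left, right], axis=1)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
images = rng.random((3, 8, 8, 1))
probs = predict_with_tta(toy_model, images)
probs_flipped_input = predict_with_tta(toy_model, images[:, :, ::-1, :])
```

By construction, the averaged prediction is the same whether the input or its mirror image is presented, which is the invariance that flip-based augmentation is meant to provide.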

Face Matching

Feature Extraction

Face Recognition Models

  • DeepFace
  • FaceNet

Regression: Semantic Segmentation

Regression: Super-Resolution

Bayesian Deep Learning

  • Model Output Uncertainty: $p(\mathbf{y}^{*}|\mathbf{x}^{*}, \mathbf{X},\mathbf{Y})$
  • Model Parameter Uncertainty: $p(\mathbf{w}^{*}|\mathbf{x}^{*},\mathbf{X},\mathbf{Y})$
\begin{eqnarray} p(\mathbf{y}^{*}|\mathbf{x}^{*}, \mathbf{X},\mathbf{Y}) = \int p(\mathbf{y}^{*}|\mathbf{x}^{*},\mathbf{w}^{*})p(\mathbf{w}^{*}|\mathbf{x}^{*},\mathbf{X},\mathbf{Y})d\mathbf{w}^{*} \nonumber \end{eqnarray}
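
For deep networks this integral is generally intractable. A minimal sketch, assuming samples $\mathbf{w}_s$ can be drawn from the parameter posterior, is the Monte Carlo estimate that averages the model output over those samples. The example below uses Bayesian linear regression with a Gaussian prior and known noise variance, chosen only because its posterior has a closed form to check against. The function names and the values of `alpha` and `sigma2` are assumptions made for illustration.

```python
import numpy as np

def posterior(X, Y, alpha=1.0, sigma2=0.1):
    """Gaussian posterior N(mean, cov) over the weights.

    Assumes prior w ~ N(0, I / alpha) and likelihood y ~ N(x @ w, sigma2).
    """
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / sigma2
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ Y / sigma2
    return mean, cov

def predictive_mc(x_star, mean, cov, sigma2=0.1, num_samples=5000, seed=0):
    """Monte Carlo estimate of the predictive mean and variance at x_star."""
    rng = np.random.default_rng(seed)
    w_samples = rng.multivariate_normal(mean, cov, size=num_samples)  # (S, d)
    y_means = w_samples @ x_star      # E[y* | x*, w_s] for each posterior sample
    pred_mean = y_means.mean()
    # Law of total variance: observation noise plus spread from parameter uncertainty.
    pred_var = sigma2 + y_means.var()
    return pred_mean, pred_var

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
w_true = np.array([1.5, -0.5])
Y = X @ w_true + rng.normal(scale=np.sqrt(0.1), size=50)

mean, cov = posterior(X, Y)
x_star = np.array([1.0, 2.0])
mc_mean, mc_var = predictive_mc(x_star, mean, cov)

# Closed-form predictive for this conjugate model, used as a reference.
exact_mean = x_star @ mean
exact_var = 0.1 + x_star @ cov @ x_star
```

The second term of `pred_var` is the part of the output uncertainty that comes from parameter uncertainty; it shrinks as more training data is observed.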

Attention for Memory: Neural Turing Machines

NTM: Memory Read and Write

  • (Blurry) Read: Read everywhere with weights
  • \[\mathbf{r}_t = \sum_{i} w_t(i)\mathbf{M}_t(i)\]
  • (Blurry) Write: Erase and add everywhere with weights
  • \[\mathbf{M}_t(i) \leftarrow w_t(i)\mathbf{a}_t + \mathbf{M}_{t-1}(i)\odot(\mathbf{1}-w_t(i)\mathbf{e}_t)\]
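
The read and write rules above can be sketched in a few lines of NumPy. This is an illustrative sketch, assuming the memory is stored as an (N, D) array with one row per location, the weights $w_t$ as a length-N vector that sums to one, and the erase vector $\mathbf{e}_t$ with entries in [0, 1]. The function names are hypothetical.

```python
import numpy as np

def ntm_read(memory, w):
    """Blurry read: r = sum_i w(i) * M(i). memory is (N, D), w is (N,)."""
    return w @ memory

def ntm_write(memory_prev, w, erase, add):
    """Blurry write: erase, then add, at every location in proportion to w.

    erase and add are (D,) vectors; the products with M are elementwise.
    """
    erased = memory_prev * (1.0 - np.outer(w, erase))
    return erased + np.outer(w, add)

N, D = 4, 3
memory = np.ones((N, D))
w = np.array([0.0, 1.0, 0.0, 0.0])    # fully focused on location 1
erase = np.ones(D)                    # erase everything at the focused location
add = np.array([5.0, 6.0, 7.0])

memory_new = ntm_write(memory, w, erase, add)
r = ntm_read(memory_new, w)
```

With a one-hot weighting the write overwrites a single row and the read returns exactly that row. With softer weights both operations touch every location a little, which is what keeps them differentiable.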

NTM: Memory Addressing

  • Content Addressing:
  • \[w^c_t(i) = \frac{\exp(\beta_tK[\mathbf{k}_t,\mathbf{M}_t(i)])}{\sum_j \exp(\beta_tK[\mathbf{k}_t,\mathbf{M}_t(j)])} \]
  • Interpolation:
  • \[\mathbf{w}_t^g = g_t\mathbf{w}^c_t+(1-g_t)\mathbf{w}_{t-1}\]
  • Convolutional Shifting:
  • \[\tilde{w}_t(i) \leftarrow \sum_{j=0}^{N-1}w^g_t(j)s_t(i-j)\]
  • Sharpening:
  • \[w_t(i) \leftarrow \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_j \tilde{w}_t(j)^{\gamma_t}}\]
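
The four addressing steps above can be chained as in the following sketch. Two assumptions are made that the slide does not state: the similarity $K$ is taken to be cosine similarity, and the shift distribution $s_t$ is stored as a length-N vector indexed modulo N, so that `shift[1]` is the weight of a shift by +1 and `shift[N-1]` the weight of a shift by -1. The function names are hypothetical.

```python
import numpy as np

def cosine_similarity(key, memory, eps=1e-8):
    """K[k, M(i)] for every memory row i (cosine similarity is an assumption)."""
    num = memory @ key
    den = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps
    return num / den

def address(memory, w_prev, key, beta, g, shift, gamma):
    """Return the final weighting w_t over the N memory locations."""
    # Content addressing: softmax of similarities scaled by the key strength beta.
    scores = beta * cosine_similarity(key, memory)
    scores = scores - scores.max()            # for numerical stability
    w_c = np.exp(scores) / np.exp(scores).sum()

    # Interpolation with the previous weighting, gated by g in [0, 1].
    w_g = g * w_c + (1.0 - g) * w_prev

    # Convolutional shift: circular convolution, indices taken modulo N.
    n = len(w_g)
    w_tilde = np.array([
        sum(w_g[j] * shift[(i - j) % n] for j in range(n))
        for i in range(n)
    ])

    # Sharpening with gamma >= 1, followed by renormalization.
    w_sharp = w_tilde ** gamma
    return w_sharp / w_sharp.sum()

memory = np.eye(4)                            # four orthogonal memory rows
w_prev = np.full(4, 0.25)
key = memory[1]                               # key matches location 1
shift = np.array([0.0, 1.0, 0.0, 0.0])        # move the focus by +1
w = address(memory, w_prev, key, beta=50.0, g=1.0, shift=shift, gamma=2.0)
```

Here the content lookup focuses on location 1 and the shift then moves that focus to location 2, which is how the controller can iterate through memory after a content-based jump.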

NTM: Copy Performance

NTM: Copy Comparison

  • LSTM:
  • NTM:

NTM: Read Write

Regression: Object Alignment

Structured Output Prediction

  • Traditional Learning: Mapping $f : \mathcal{X} \rightarrow \mathbb{R}$
  • Structured Output Learning: Mapping $f : \mathcal{X} \rightarrow \mathcal{Y}$
Semantic Segmentation