Let's Talk ChatGPT

CSE 440: Introduction to Artificial Intelligence

Vishnu Boddeti

AI Progress

What is ChatGPT?

ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with the chatbot. The language model can answer questions and assist you with tasks, such as composing emails, essays, and code.

What can we learn from reconstructing the input?

Michigan State University is located in _____, Michigan.

Answer: trivia

What can we learn from reconstructing the input?

I put _____ fork down on the table.

Answer: syntax

What can we learn from reconstructing the input?

The woman walked across the street, checking for traffic over ___ shoulder.

Answer: coreference

What can we learn from reconstructing the input?

I went to the ocean to see the fish, turtles, seals, and _____.

Answer: lexical semantics/topic

What can we learn from reconstructing the input?

Overall, the value I got from the two hours watching it was the sum total of the popcorn and the drink. The movie was ___.

Answer: sentiment

What can we learn from reconstructing the input?

Iroh went into the kitchen to make some tea. Standing next to Iroh, Zuko pondered his destiny. Zuko left the ______.

Answer: some reasoning - this is harder

What can we learn from reconstructing the input?

I was thinking about the sequence that goes 1, 1, 2, 3, 5, 8, 13, 21, ____

Answer: some basic arithmetic; language models don't actually learn the Fibonacci sequence

The Shannon Game

Language Modeling As a Probability Model

Sequential Probabilistic Model:
Is a Markov model good enough?

Need very long-range context.
How much context is sufficient?
Current-day models: up to 2 million tokens of context
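
One way to write the sequential model and the Markov question above, as a sketch in the notation of the n-gram slides (the context size $k$ is an assumed symbol here). The chain rule is exact; a Markov model of order $k-1$ approximates each factor by truncating the history to the previous $k-1$ tokens:

$p(x_1, x_2, \dots, x_n) = \prod_{i=1}^n p(x_i|x_1, \dots, x_{i-1})$

$p(x_i|x_1, \dots, x_{i-1}) \approx q(x_i|x_{i-(k-1)}, \dots, x_{i-1})$

With a small $k$ the model forgets everything outside the window, which is why long-range context matters.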

Language Modeling: Unigram

Assumption: each word is generated independently

$p(x_1, x_2, \dots, x_n) = \prod_{i=1}^n q(x_i)$
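
A minimal sketch of the unigram model, assuming $q(x)$ is estimated as a relative frequency from a toy corpus. The corpus and the function names `train_unigram` and `unigram_log_prob` are made up for illustration.

```python
from collections import Counter
import math

def train_unigram(tokens):
    """Estimate q(x) as the relative frequency of each token."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def unigram_log_prob(q, sentence):
    """log p(x_1..x_n) = sum_i log q(x_i); -inf if any token is unseen."""
    return sum(math.log(q[w]) if w in q else float("-inf") for w in sentence)

# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat".split()
q = train_unigram(corpus)

# q("the") = 2/6 and q("cat") = 1/6, so p("the cat") = 1/18.
p = math.exp(unigram_log_prob(q, ["the", "cat"]))
# Word order does not change the probability under this model.
p_reversed = math.exp(unigram_log_prob(q, ["cat", "the"]))
```

Because every word is scored on its own, "the cat" and "cat the" get the same probability, which is the main weakness of the independence assumption.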

Language Modeling: Bigram

Assumption: each word depends only on the previous word

$p(x_1, x_2, \dots, x_n) = \prod_{i=1}^n q(x_i|x_{i-1})$
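
A minimal sketch of the bigram model, assuming $q(x_i|x_{i-1})$ is estimated from pair counts and that $x_0$ is a start-of-sentence symbol. The `START` token, the toy sentences, and the function names are assumptions for illustration.

```python
from collections import Counter

START = "<s>"  # hypothetical start symbol standing in for x_0

def train_bigram(sentences):
    """Estimate q(cur | prev) = count(prev, cur) / count(prev as context)."""
    pair_counts = Counter()
    context_counts = Counter()
    for sent in sentences:
        padded = [START] + sent
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return {pair: c / context_counts[pair[0]] for pair, c in pair_counts.items()}

def bigram_prob(q, sentence):
    """p(x_1..x_n) = prod_i q(x_i | x_{i-1}); unseen pairs get probability 0."""
    padded = [START] + sentence
    p = 1.0
    for prev, cur in zip(padded, padded[1:]):
        p *= q.get((prev, cur), 0.0)
    return p

# Toy training sentences (an assumption for illustration only).
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
q2 = train_bigram(sentences)

p_seen = bigram_prob(q2, ["the", "cat", "sat"])      # 2/3 * 1/2 * 1/2
p_novel = bigram_prob(q2, ["the", "cat", "ran"])     # never seen as a whole sentence
p_scrambled = bigram_prob(q2, ["sat", "cat", "the"])  # no sentence starts with "sat"
```

Unlike the unigram model, word order now matters: the scrambled sentence gets probability 0, while a new sentence built from seen word pairs still gets a nonzero probability.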

Language Modeling: N-gram

Assumption: each word depends only on the previous k-1 words

$p(x_1, x_2, \dots, x_n) = \prod_{i=1}^n q(x_i|x_{i-1},x_{i-2},\dots,x_{i-(k-1)})$
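
A minimal sketch of a general n-gram model with context size `k`, assuming count-based estimates and start-symbol padding. The `START` token, the toy sentences, and the function names are assumptions for illustration. Setting `k=1` gives the unigram case and `k=2` the bigram case.

```python
from collections import Counter, defaultdict

START = "<s>"  # hypothetical padding symbol for positions before x_1

def train_ngram(sentences, k):
    """Estimate q(word | context), where context is the previous k-1 tokens."""
    counts = defaultdict(Counter)
    for sent in sentences:
        padded = [START] * (k - 1) + sent
        for i in range(k - 1, len(padded)):
            context = tuple(padded[i - (k - 1):i])
            counts[context][padded[i]] += 1
    return {
        ctx: {w: c / sum(ctr.values()) for w, c in ctr.items()}
        for ctx, ctr in counts.items()
    }

def ngram_prob(q, sentence, k):
    """p(x_1..x_n) = prod_i q(x_i | previous k-1 tokens); unseen events get 0."""
    padded = [START] * (k - 1) + sentence
    p = 1.0
    for i in range(k - 1, len(padded)):
        context = tuple(padded[i - (k - 1):i])
        p *= q.get(context, {}).get(padded[i], 0.0)
    return p

# Toy training sentences (an assumption for illustration only).
sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]

bigram = train_ngram(sentences, k=2)
trigram = train_ngram(sentences, k=3)

p_bi = ngram_prob(bigram, ["the", "cat", "ran"], k=2)    # 2/3 * 1/2 * 1/2
p_tri = ngram_prob(trigram, ["the", "cat", "ran"], k=3)  # "the cat" never precedes "ran"
p_tri_seen = ngram_prob(trigram, ["the", "cat", "sat"], k=3)  # 2/3 * 1/2 * 1
```

The longer context makes the trigram model more confident about sentences it has seen but assigns zero to combinations it has not, which illustrates the sparsity cost of increasing `k` without smoothing.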

ChatGPT Overview

Initial Language Model: GPT Overview

Initial Language Model: GPT Training Data

Initial Language Model: GPT Training Process

Initial Language Model: GPT Language Generation

Training ChatGPT with Human Feedback

Training ChatGPT Reward Model

ChatGPT: Reinforcement Learning with Human Feedback

How much data are LLMs trained with?

How much compute are LLMs trained with?

How useful is current ChatGPT?

Ghost of the Internet

If you ask a question on the best available internet forum and get an answer, that answer is likely in the same ballpark of usefulness and correctness as the one GPT-4 provides. Notably, GPT-4 provides it in seconds.

There are pluses and minuses:

Superhuman for tasks like text summarization and style.
Solid performance for commonly discussed content.
Sketchy for less commonly discussed content.

Probably should not trust its math, reasoning and logic.
Driving a car is out of the question, since the task cannot be fully described in text.

What about the future of AI?

Behavior in LLMs is a reflection of our collective behavior.
Will AI kill everyone?

Short Term: definitely not in the next few years.
Medium Term: possible to have very lethal drone warfare.
Long Term: development of intelligence is inevitable, at least for competitive or capitalistic reasons.

Will we have superintelligent beings?

Don't think this will happen in my lifetime.
Even if it does, it may be possible to have some control over them (e.g., through cryptography).

Real Dangers of AI

The problem:

Job creation typically follows mechanization-induced job losses.
Loss of many jobs in the next couple of decades is very plausible.
Most humans may not be able to skill up.
Our society may not be equipped for such a future.
The jobs that become easy with AI are the ones to disappear soonest.

The solution:

Kaizen: continuous improvement, continuous learning, continuous adaptation.
Try to be as high on the skill (cognitive or physical) ladder as possible, since such jobs are harder to automate.

The dark art of "prompt engineering"

How to access Chatbots?

Q & A