🧠 Adding Chat Memory

Right now, your chat has a problem. Every time you ask a question, the AI feels like it’s meeting you for the first time. You can’t say “tell me more about that” or “what was the first thing you mentioned?” because the AI has no memory of your previous messages.

This makes conversations feel disconnected and frustrating. Real conversations build on what was said before.


What’s happening now:

You: "What's the capital of France?"
AI: "Paris is the capital of France."
You: "Tell me more about that city"
AI: "I'd be happy to help! What city are you asking about?"

The AI doesn’t remember you were just talking about Paris. Each message is completely isolated.

What you want:

You: "What's the capital of France?"
AI: "Paris is the capital of France."
You: "Tell me more about that city"
AI: "Paris is a beautiful city with over 2 million people. It's famous for the Eiffel Tower, the Louvre Museum, and its café culture..."

The AI remembers the context and continues the conversation naturally.


Instead of sending just the current message to OpenAI, you need to send the entire conversation history.

Without Memory (what you have now):

// Only sending current message
const response = await openai.responses.create({
  model: "gpt-4.1",
  input: currentMessage
})

With Memory (what you’re building):

// Sending entire conversation
const response = await openai.responses.create({
  model: "gpt-4.1",
  input: [
    { role: "user", content: "What's the capital of France?" },
    { role: "assistant", content: "Paris is the capital of France." },
    { role: "user", content: "Tell me more about that city" }
  ]
})

OpenAI expects messages in a specific format with roles. Understanding these roles is crucial for building proper conversation memory.

const conversationHistory = [
  { role: "developer", content: "You are a helpful assistant." },
  { role: "user", content: "Hello!" },
  { role: "assistant", content: "Hi! How can I help you today?" },
  { role: "user", content: "What's the weather like?" }
]

🧠 Understanding Roles in the Response API


When sending messages to the OpenAI Response API, each message must include a role. This helps the model understand who is speaking and how to behave in context.

Unlike the Chat Completions API, which uses system, user, and assistant, the Response API introduces a new role called developer.


🧑‍💻 developer – Define app behavior and context


Use this at the beginning of a conversation to set expectations for the assistant.

{
  role: "developer",
  content: "You are integrated into a customer support app. Always be professional and helpful. Ask for a ticket number when there's a complaint. If a user is angry or frustrated, offer to escalate to a human agent."
}

✅ This replaces system from Chat Completions — but with better alignment to your app’s purpose.

🙋 user – The end user’s input

Used for the actual input from the end user (your app’s customer or user):

{
  role: "user",
  content: "I'm having trouble with my order."
}

🤖 assistant – Previous AI responses

Used when you want to include or simulate previous AI responses in the thread:

{
  role: "assistant",
  content: "I'm sorry to hear that. Could you provide your order number so I can help?"
}

Here’s how a real conversation with memory looks:

const buildConversationHistory = (messages) => {
  return [
    {
      role: "developer",
      content: "You are a programming tutor. Break down complex concepts into simple steps. Use code examples when helpful."
    },
    ...messages.map(msg => ({
      role: msg.isUser ? "user" : "assistant",
      content: msg.text
    }))
  ]
}

// Usage
const messages = [
  { text: "Explain JavaScript closures", isUser: true },
  { text: "A closure is a function that has access to variables...", isUser: false },
  { text: "Can you show me an example?", isUser: true }
]
const conversationHistory = buildConversationHistory(messages)

For the developer role:

  • Set clear behavior expectations
  • Define the AI’s purpose in your app
  • Include any special instructions or constraints
  • Keep it concise but comprehensive

For conversation flow:

  • Always alternate between user and assistant
  • Never have two consecutive messages with the same role
  • Include all previous messages for full context

Example of proper role sequencing:

// ✅ Good - alternating roles
[
  { role: "developer", content: "You are a helpful assistant." },
  { role: "user", content: "Hello" },
  { role: "assistant", content: "Hi there!" },
  { role: "user", content: "How are you?" },
  { role: "assistant", content: "I'm doing well, thanks!" }
]

// ❌ Bad - consecutive user messages
[
  { role: "user", content: "Hello" },
  { role: "user", content: "How are you?" }, // This breaks the pattern
  { role: "assistant", content: "Hi! I'm well!" }
]

AI models have limits on how much text they can process at once (called the “context window”). For the models used here:

  • GPT-4o: ~128,000 tokens (about 96,000 words)
  • GPT-4.1: ~1,000,000 tokens (about 750,000 words)

What this means: Very long conversations might hit this limit, so you need strategies to manage it.
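One simple strategy is a sliding window: keep your developer message plus only the most recent turns. Here’s a minimal sketch, assuming your history array is shaped like the examples above and that maxMessages is a limit you pick for your app (not an API setting):

// Keep the developer message plus only the most recent turns.
// maxMessages is an app-specific tuning knob, not an API limit.
const trimHistory = (history, maxMessages = 20) => {
  const [developerMessage, ...rest] = history
  return [developerMessage, ...rest.slice(-maxMessages)]
}

This keeps every request bounded no matter how long the chat runs, at the cost of forgetting older turns.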

Every message in your conversation history costs tokens. Because the full history is re-sent with every request, a 10-message conversation can cost over 5x the tokens of the same messages sent in isolation.

Cost example:

  • Message 1: 100 tokens
  • Message 2: 200 tokens (100 new + 100 history)
  • Message 3: 300 tokens (100 new + 200 history)
  • Message 4: 400 tokens (100 new + 300 history)

Costs grow quickly in long conversations.
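To make the growth concrete, here’s a rough back-of-the-envelope calculation, assuming every message adds about 100 tokens as in the example above:

// If each message adds ~100 tokens and the full history is re-sent,
// each request costs more than the one before it.
const tokensPerMessage = 100
let totalTokensSent = 0

for (let turn = 1; turn <= 10; turn++) {
  totalTokensSent += turn * tokensPerMessage // history + new message
}

console.log(totalTokensSent) // 5500 tokens, vs ~1000 for 10 isolated messages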

You need to decide where to store conversation history:

  • Frontend only: Simple but lost when page refreshes
  • Backend storage: Database or session storage (see the sketch below)
  • User accounts: Persistent across devices and sessions
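As a rough sketch of the backend option, an in-memory Map keyed by a session ID is enough for development (the sessionId scheme here is an assumption; a production app would swap this for a database):

// Dev-only in-memory store: conversation history keyed by session ID.
// Data is lost on restart; replace with a database for production.
const conversations = new Map()

const getHistory = (sessionId) => conversations.get(sessionId) ?? []

const appendMessage = (sessionId, message) => {
  conversations.set(sessionId, [...getHistory(sessionId), message])
}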

In the next sections, you’ll add memory to your chat by:

  1. Modifying the backend to accept and manage conversation history (see the sketch below)
  2. Updating the frontend to send full conversation context
  3. Adding memory management to handle long conversations
  4. Implementing conversation storage for persistence

By the end, your AI will remember everything from the conversation and respond naturally to follow-up questions.
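As a preview of step 1, here’s a rough sketch of the backend change, assuming an Express server and the official openai SDK (your route names and setup may differ):

import express from "express"
import OpenAI from "openai"

const app = express()
app.use(express.json())
const openai = new OpenAI() // reads OPENAI_API_KEY from the environment

// The frontend sends the full conversation so far in the request body.
app.post("/chat", async (req, res) => {
  const { messages } = req.body // [{ role, content }, ...]
  const response = await openai.responses.create({
    model: "gpt-4.1",
    input: messages
  })
  res.json({ reply: response.output_text })
})

app.listen(3000)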


You’ll explore different approaches to manage memory:

  • Full history: Store everything in the frontend; perfect for short sessions.
  • Sliding window: Keep only the last N messages to control costs and context length.
  • Summarization: Summarize old parts of the conversation to maintain context while reducing tokens (see the sketch below).
  • Database storage: Save conversations to a database for long-term storage.
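As a taste of the summarization approach, one hedged sketch: ask the model itself to compress the older messages, then keep only that summary plus the recent turns (the prompt wording and the keepRecent split are illustrative choices, not fixed rules):

// Compress everything except the last few turns into one summary message.
// Assumes an existing `openai` client, as in the earlier examples.
const summarizeOldMessages = async (history, keepRecent = 4) => {
  const old = history.slice(0, -keepRecent)
  if (old.length === 0) return history // nothing old enough to summarize

  const summary = await openai.responses.create({
    model: "gpt-4.1",
    input: [
      { role: "developer", content: "Summarize this conversation in a few sentences, keeping key facts." },
      ...old
    ]
  })

  return [
    { role: "developer", content: `Summary of earlier conversation: ${summary.output_text}` },
    ...history.slice(-keepRecent)
  ]
}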


After adding memory, your chat will:

  • ✅ Remember previous messages in the conversation
  • ✅ Handle follow-up questions naturally
  • ✅ Maintain context throughout the session
  • ✅ Feel like talking to a real person
  • ✅ Support complex, multi-turn conversations

Ready to make your AI actually remember things? Let’s start with the backend! 🚀