🧠 Adding Chat Memory
Right now, your chat has a problem. Every time you ask a question, the AI feels like it’s meeting you for the first time. You can’t say “tell me more about that” or “what was the first thing you mentioned?” because the AI has no memory of your previous messages.
This makes conversations feel disconnected and frustrating. Real conversations build on what was said before.
🔄 The Memory Problem
What’s happening now:
```
You: "What's the capital of France?"
AI: "Paris is the capital of France."

You: "Tell me more about that city"
AI: "I'd be happy to help! What city are you asking about?"
```
The AI doesn’t remember you were just talking about Paris. Each message is completely isolated.
What you want:
```
You: "What's the capital of France?"
AI: "Paris is the capital of France."

You: "Tell me more about that city"
AI: "Paris is a beautiful city with over 2 million people. It's famous for the Eiffel Tower, the Louvre Museum, and its café culture..."
```
The AI remembers the context and continues the conversation naturally.
🧠 Key Parts You Need to Understand
1. Conversation History
Instead of sending just the current message to OpenAI, you need to send the entire conversation history.
Without Memory (what you have now):
```js
// Only sending the current message
const response = await openai.responses.create({
  model: "gpt-4.1",
  input: currentMessage
})
```
With Memory (what you’re building):
```js
// Sending the entire conversation
const response = await openai.responses.create({
  model: "gpt-4.1",
  input: [
    { role: "user", content: "What's the capital of France?" },
    { role: "assistant", content: "Paris is the capital of France." },
    { role: "user", content: "Tell me more about that city" }
  ]
})
```
2. Message Roles
OpenAI expects messages in a specific format with roles. Understanding these roles is crucial for building proper conversation memory.
```js
const conversationHistory = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Hello!" },
  { role: "assistant", content: "Hi! How can I help you today?" },
  { role: "user", content: "What's the weather like?" }
]
```
🧠 Understanding Roles in the Responses API
When sending messages to the OpenAI Responses API, each message must include a `role`. This helps the model understand who is speaking and how to behave in context.

Unlike the Chat Completions API, which uses `system`, `user`, and `assistant`, the Responses API introduces a new role called `developer`.
🧩 When to Use Each Role
🧑‍💻 developer – Define app behavior and context
Use this at the beginning of a conversation to set expectations for the assistant.
```js
{
  role: "developer",
  content: "You are integrated into a customer support app. Always be professional and helpful. Ask for a ticket number when there's a complaint. If a user is angry or frustrated, offer to escalate to a human agent."
}
```
✅ This replaces system from Chat Completions — but with better alignment to your app’s purpose.
🧍 user – Every human message
Used for the actual input from the end user (your app’s customer or user):
```js
{
  role: "user",
  content: "I'm having trouble with my order."
}
```
🤖 assistant – Every AI response
Used when you want to include or simulate previous AI responses in the thread:
```js
{
  role: "assistant",
  content: "I'm sorry to hear that. Could you provide your order number so I can help?"
}
```
Practical Example: Building Context
Here’s how a real conversation with memory looks:
```js
const buildConversationHistory = (messages) => {
  return [
    {
      role: "developer",
      content: "You are a programming tutor. Break down complex concepts into simple steps. Use code examples when helpful."
    },
    ...messages.map(msg => ({
      role: msg.isUser ? "user" : "assistant",
      content: msg.text
    }))
  ]
}

// Usage
const messages = [
  { text: "Explain JavaScript closures", isUser: true },
  { text: "A closure is a function that has access to variables...", isUser: false },
  { text: "Can you show me an example?", isUser: true }
]

const conversationHistory = buildConversationHistory(messages)
```
Role Best Practices
For the `developer` role:
- Set clear behavior expectations
- Define the AI’s purpose in your app
- Include any special instructions or constraints
- Keep it concise but comprehensive
For conversation flow:
- Always alternate between `user` and `assistant`
- Never have two consecutive messages with the same role
- Include all previous messages for full context
Example of proper role sequencing:
```js
// ✅ Good - alternating roles
[
  { role: "developer", content: "You are a helpful assistant." },
  { role: "user", content: "Hello" },
  { role: "assistant", content: "Hi there!" },
  { role: "user", content: "How are you?" },
  { role: "assistant", content: "I'm doing well, thanks!" }
]
```
```js
// ❌ Bad - consecutive user messages
[
  { role: "user", content: "Hello" },
  { role: "user", content: "How are you?" }, // This breaks the pattern
  { role: "assistant", content: "Hi! I'm well!" }
]
```
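The alternation rule is easy to check programmatically before each API call. Here’s a minimal sketch — the `validateRoleSequence` helper is our own, not part of any SDK — that flags consecutive messages with the same role:

```javascript
// Returns true if no two consecutive conversation turns share a role.
// A leading "developer" (or "system") message is allowed and skipped.
function validateRoleSequence(messages) {
  const turns = messages.filter(m => m.role === "user" || m.role === "assistant")
  for (let i = 1; i < turns.length; i++) {
    if (turns[i].role === turns[i - 1].role) return false
  }
  return true
}

console.log(validateRoleSequence([
  { role: "developer", content: "You are a helpful assistant." },
  { role: "user", content: "Hello" },
  { role: "assistant", content: "Hi there!" }
])) // true

console.log(validateRoleSequence([
  { role: "user", content: "Hello" },
  { role: "user", content: "How are you?" }
])) // false
```

Running a check like this before sending the history catches bugs early, instead of letting the model respond to a malformed thread.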
3. Context Window Limits
AI models have a limit on how much text they can process at once (called the “context window”). For recent OpenAI models:
- GPT-4o: ~128,000 tokens (about 96,000 words)
- GPT-4.1: ~1,000,000 tokens (about 750,000 words)
What this means: Very long conversations might hit this limit, so you need strategies to manage it.
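One common safeguard is to estimate the history’s token count and drop the oldest messages before hitting the limit. A minimal sketch — `estimateTokens` and `trimToBudget` are our own helpers, and the ~4 characters per token heuristic is only an approximation (a real app would use a tokenizer library such as `tiktoken`):

```javascript
// Rough heuristic: ~4 characters per token for English text.
const estimateTokens = (text) => Math.ceil(text.length / 4)

// Drop the oldest messages until the estimated total fits the budget.
function trimToBudget(messages, maxTokens) {
  const trimmed = [...messages]
  let total = trimmed.reduce((sum, m) => sum + estimateTokens(m.content), 0)
  while (total > maxTokens && trimmed.length > 1) {
    const removed = trimmed.shift()
    total -= estimateTokens(removed.content)
  }
  return trimmed
}

// Example: messages of roughly 100, 100, and 10 estimated tokens
const msgs = [
  { role: "user", content: "x".repeat(400) },
  { role: "assistant", content: "y".repeat(400) },
  { role: "user", content: "z".repeat(40) }
]
console.log(trimToBudget(msgs, 120).length) // 2 (oldest message dropped)
```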
4. Token Costs Add Up
Every message in your conversation history costs tokens. A 10-message conversation might cost 5x more than a single message because you’re sending the history each time.
Cost example:
- Message 1: 100 tokens
- Message 2: 200 tokens (100 new + 100 history)
- Message 3: 300 tokens (100 new + 200 history)
- Message 4: 400 tokens (100 new + 300 history)
Costs grow quickly in long conversations.
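That growth is easy to see in code. A quick sketch, assuming every message is a flat 100 tokens (`totalTokensSent` is our own illustrative helper):

```javascript
// Each request sends the new message plus all prior history,
// so per-request cost grows linearly and total cost quadratically.
const TOKENS_PER_MESSAGE = 100

function totalTokensSent(messageCount) {
  let total = 0
  for (let n = 1; n <= messageCount; n++) {
    total += n * TOKENS_PER_MESSAGE // the nth request sends n messages
  }
  return total
}

console.log(totalTokensSent(1))  // 100
console.log(totalTokensSent(4))  // 1000
console.log(totalTokensSent(10)) // 5500
```

After 10 messages you’ve sent 5,500 tokens in total, versus 1,000 for ten independent messages — which is where the roughly 5x figure comes from.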
5. Storage Considerations
You need to decide where to store conversation history:
- Frontend only: Simple but lost when page refreshes
- Backend storage: Database or session storage
- User accounts: Persistent across devices and sessions
🎯 What You’ll Build
In the next sections, you’ll add memory to your chat by:
- Modifying the backend to accept and manage conversation history
- Updating the frontend to send full conversation context
- Adding memory management to handle long conversations
- Implementing conversation storage for persistence
By the end, your AI will remember everything from the conversation and respond naturally to follow-up questions.
🔍 Memory Strategies Preview
You’ll explore different approaches to manage memory:
Simple Memory
Store everything in the frontend - perfect for short sessions.
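At its simplest, this is just an array in your frontend code. A minimal sketch in plain JavaScript (in a React app, this array would live in component state instead):

```javascript
// In-memory conversation store - lost on page refresh.
const history = []

function addMessage(role, content) {
  history.push({ role, content })
  return history
}

addMessage("user", "What's the capital of France?")
addMessage("assistant", "Paris is the capital of France.")
console.log(history.length) // 2
```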
Sliding Window Memory
Keep only the last N messages to control costs and context length.
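A sliding window is short to implement: keep any leading `developer` instructions, then only the most recent N turns. A sketch (`slidingWindow` is our own helper):

```javascript
// Keep the developer instructions plus the last `windowSize` turns.
function slidingWindow(messages, windowSize) {
  const developer = messages.filter(m => m.role === "developer")
  const turns = messages.filter(m => m.role !== "developer")
  return [...developer, ...turns.slice(-windowSize)]
}

const history = [
  { role: "developer", content: "You are a programming tutor." },
  { role: "user", content: "Explain closures" },
  { role: "assistant", content: "A closure is..." },
  { role: "user", content: "Show me an example" },
  { role: "assistant", content: "Sure, here's one..." }
]

console.log(slidingWindow(history, 2).length) // 3 (developer message + last 2 turns)
```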
Summary Memory
Summarize old parts of the conversation to maintain context while reducing tokens.
Persistent Memory
Save conversations to a database for long-term storage.
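Persistence boils down to keying history by a conversation ID. A sketch using an in-memory `Map` as a stand-in for a real database — in production you would swap this for Redis, Postgres, or similar:

```javascript
// Stand-in "database": conversationId -> array of messages.
const db = new Map()

function saveMessage(conversationId, message) {
  const history = db.get(conversationId) ?? []
  history.push(message)
  db.set(conversationId, history)
}

function loadHistory(conversationId) {
  return db.get(conversationId) ?? []
}

saveMessage("conv-1", { role: "user", content: "Hello" })
saveMessage("conv-1", { role: "assistant", content: "Hi there!" })
console.log(loadHistory("conv-1").length) // 2
console.log(loadHistory("conv-2").length) // 0 (unknown conversation)
```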
✅ What You’ll Gain
After adding memory, your chat will:
- ✅ Remember previous messages in the conversation
- ✅ Handle follow-up questions naturally
- ✅ Maintain context throughout the session
- ✅ Feel like talking to a real person
- ✅ Support complex, multi-turn conversations
Ready to make your AI actually remember things? Let’s start with the backend! 🚀