
Simple Memory Implementation

Let’s add memory to your streaming chat. We’ll store the entire conversation history in the frontend and send it with each request. This approach is perfect for short to medium-length conversations.

Note: You can apply this exact same process to your normal chat page - the only difference is the endpoint you call and how you handle the response.


Before Memory: Each message is isolated - the AI has no context

You: "My name is Sarah"
AI: "Nice to meet you! How can I help?"
You: "What's my name?"
AI: "I don't have access to that information."

After Memory: The AI remembers the entire conversation

You: "My name is Sarah"
AI: "Nice to meet you, Sarah! How can I help?"
You: "What's my name?"
AI: "Your name is Sarah, as you mentioned earlier."

Currently, your streaming chat sends only the current message:

// Current - no memory
body: JSON.stringify({ message: currentInput })

We’ll change it to send the full conversation:

// With memory
body: JSON.stringify({
  message: currentInput,
  conversationHistory: buildConversationHistory(messages)
})

🔄 Step 1: Update Your Streaming Backend

We need to modify your existing streaming endpoint to accept and process conversation history.

Your current backend only receives the current message. We’ll enhance it to:

  1. Accept conversation history from the frontend
  2. Build context by combining history with the current message
  3. Send enhanced context to the AI for better responses

Replace your /api/chat/stream route with this enhanced version:

// Updated streaming endpoint with memory
app.post("/api/chat/stream", async (req, res) => {
  try {
    // 🆕 MEMORY ADDITION: Accept conversationHistory from frontend
    const { message, conversationHistory = [] } = req.body;

    if (!message) {
      return res.status(400).json({ error: "Message is required" });
    }

    // Set headers for streaming (unchanged)
    res.writeHead(200, {
      'Content-Type': 'text/plain',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    });

    // 🆕 MEMORY ADDITION: Build context-aware message for the AI
    let contextualMessage = message;

    // If we have conversation history, include it as context
    if (conversationHistory.length > 0) {
      const context = conversationHistory
        .map(msg => `${msg.role === 'user' ? 'User' : 'Assistant'}: ${msg.content}`)
        .join('\n');

      contextualMessage = `Previous conversation:\n${context}\n\nCurrent question: ${message}`;
    }

    // Create streaming response using the Responses API (modified to use contextualMessage)
    const stream = await openai.responses.create({
      model: "gpt-4o-mini",
      input: contextualMessage, // 🔄 CHANGED: was just 'message', now includes context
      stream: true,
    });

    // Stream each chunk to the frontend - handle Responses API events (unchanged)
    for await (const event of stream) {
      switch (event.type) {
        case "response.output_text.delta":
          if (event.delta) {
            let textChunk = typeof event.delta === "string"
              ? event.delta
              : event.delta.text || "";
            if (textChunk) {
              res.write(textChunk);
              res.flush?.();
            }
          }
          break;

        case "text_delta":
          if (event.text) {
            res.write(event.text);
            res.flush?.();
          }
          break;

        case "response.created":
        case "response.completed":
        case "response.output_item.added":
        case "response.content_part.added":
        case "response.content_part.done":
        case "response.output_item.done":
        case "response.output_text.done":
          // Keep connection alive, no content to write
          break;

        case "error":
          console.error("Stream error:", event.error);
          res.write("\n[Error during generation]");
          break;
      }
    }

    // Close the stream (unchanged)
    res.end();
  } catch (error) {
    console.error("OpenAI Streaming Error:", error);

    // Handle error properly for streaming (unchanged)
    if (res.headersSent) {
      res.write("\n[Error occurred]");
      res.end();
    } else {
      res.status(500).json({
        error: "Failed to stream AI response",
        success: false,
      });
    }
  }
});

🆕 Accepting history: const { message, conversationHistory = [] } = req.body;

  • What it does: Accepts conversation history from the frontend, defaulting to an empty array when none is sent
  • Why: We need the previous messages to give the AI context

🆕 Context building: the if (conversationHistory.length > 0) block

  • What it does: Converts the conversation history into a readable transcript
  • Why: The AI needs to understand what was said before

🔄 Model input: input: contextualMessage

  • What changed: Was input: message; now it includes the full context
  • Why: Sends the enhanced message with history to get better responses
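
As an aside, flattening the history into one string is the simplest approach, but the Responses API also accepts a structured array of role/content messages as input. A sketch of that variant, using the same data (whether you prefer it is a style choice):

// 🔄 ALTERNATIVE (sketch): pass history as structured messages instead of
// a single flattened string.
const stream = await openai.responses.create({
  model: "gpt-4o-mini",
  input: [
    // Previous turns, already in { role, content } form from the frontend
    ...conversationHistory,
    // The current question as the final user turn
    { role: "user", content: message }
  ],
  stream: true,
});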

🔄 Step 2: Update Your Streaming Frontend


Now we’ll enhance your React component to build and send conversation history.

Your current frontend only sends the current message. We’ll enhance it to:

  1. Build conversation history from your existing messages
  2. Send history with each request to provide context
  3. Keep everything else the same - your UI and streaming logic don’t change

Add this new function to your StreamingChat component, right after your state declarations:

// 🆕 MEMORY ADDITION: Function to build conversation history
const buildConversationHistory = (messages) => {
  return messages
    .filter(msg => !msg.isStreaming) // Only include completed messages
    .map(msg => ({
      role: msg.isUser ? "user" : "assistant",
      content: msg.text
    }));
};

What this function does:

  • Filters: Only includes completed messages (not currently streaming ones)
  • Maps: Converts your message format to the format the backend expects
  • Returns: Array of {role, content} objects for the backend
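
For example, calling it with two completed messages in your existing format produces (illustrative values):

// Illustrative input/output for buildConversationHistory
const history = buildConversationHistory([
  { text: "My name is Sarah", isUser: true, id: 1 },
  { text: "Nice to meet you, Sarah!", isUser: false, id: 2, isStreaming: false }
]);
// => [
//      { role: "user", content: "My name is Sarah" },
//      { role: "assistant", content: "Nice to meet you, Sarah!" }
//    ]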

Step 2b: Updated Complete Component with Highlighted Changes


Here’s your complete component with all memory additions highlighted:

import { useState, useRef } from 'react'
import { Send, Bot, User } from 'lucide-react'

function StreamingChat() {
  const [messages, setMessages] = useState([])
  const [input, setInput] = useState('')
  const [isStreaming, setIsStreaming] = useState(false)
  const abortControllerRef = useRef(null)

  // 🆕 MEMORY ADDITION: Function to build conversation history
  const buildConversationHistory = (messages) => {
    return messages
      .filter(msg => !msg.isStreaming) // Only include completed messages
      .map(msg => ({
        role: msg.isUser ? "user" : "assistant",
        content: msg.text
      }));
  };

  const sendMessage = async () => {
    if (!input.trim() || isStreaming) return

    const userMessage = { text: input, isUser: true, id: Date.now() }
    setMessages(prev => [...prev, userMessage])
    const currentInput = input
    setInput('')
    setIsStreaming(true)

    // Create AI message placeholder
    const aiMessageId = Date.now() + 1
    const aiMessage = { text: '', isUser: false, id: aiMessageId, isStreaming: true }
    setMessages(prev => [...prev, aiMessage])

    try {
      // 🆕 MEMORY ADDITION: Build conversation history from current messages
      const conversationHistory = buildConversationHistory(messages)

      // Create abort controller for canceling requests
      abortControllerRef.current = new AbortController()

      const response = await fetch('http://localhost:8000/api/chat/stream', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          message: currentInput,
          conversationHistory: conversationHistory // 🆕 MEMORY ADDITION: Include history
        }),
        signal: abortControllerRef.current.signal,
      })

      if (!response.ok) {
        throw new Error('Failed to get response')
      }

      // Read the stream (unchanged)
      const reader = response.body.getReader()
      const decoder = new TextDecoder()

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        const chunk = decoder.decode(value, { stream: true })

        // Update the AI message with new content
        setMessages(prev =>
          prev.map(msg =>
            msg.id === aiMessageId
              ? { ...msg, text: msg.text + chunk }
              : msg
          )
        )
      }

      // Mark streaming as complete (unchanged)
      setMessages(prev =>
        prev.map(msg =>
          msg.id === aiMessageId
            ? { ...msg, isStreaming: false }
            : msg
        )
      )
    } catch (error) {
      if (error.name === 'AbortError') {
        console.log('Request was cancelled')
      } else {
        console.error('Streaming error:', error)
        // Update AI message with error
        setMessages(prev =>
          prev.map(msg =>
            msg.id === aiMessageId
              ? { ...msg, text: 'Sorry, something went wrong.', isStreaming: false }
              : msg
          )
        )
      }
    } finally {
      setIsStreaming(false)
      abortControllerRef.current = null
    }
  }

  const handleKeyPress = (e) => {
    if (e.key === 'Enter' && !e.shiftKey && !isStreaming) {
      e.preventDefault()
      sendMessage()
    }
  }

  const stopStreaming = () => {
    if (abortControllerRef.current) {
      abortControllerRef.current.abort()
    }
  }

  return (
    <div className="min-h-screen bg-gray-100 flex items-center justify-center p-4">
      <div className="bg-white rounded-lg shadow-lg w-full max-w-2xl h-[600px] flex flex-col">
        {/* Header */}
        <div className="bg-blue-500 text-white p-4 rounded-t-lg">
          <h1 className="text-xl font-bold">Streaming AI Chat with Memory</h1>
          <p className="text-blue-100">Real-time responses with conversation context!</p>
        </div>

        {/* Messages */}
        <div className="flex-1 overflow-y-auto p-4 space-y-4">
          {messages.length === 0 && (
            <div className="text-center text-gray-500 mt-20">
              <Bot className="w-12 h-12 mx-auto mb-4 text-gray-400" />
              <p>Send a message to see streaming and memory in action!</p>
            </div>
          )}

          {messages.map((message) => (
            <div
              key={message.id}
              className={`flex items-start space-x-3 ${
                message.isUser ? 'justify-end' : 'justify-start'
              }`}
            >
              {!message.isUser && (
                <div className="bg-blue-500 p-2 rounded-full">
                  <Bot className="w-4 h-4 text-white" />
                </div>
              )}

              <div
                className={`max-w-xs lg:max-w-md px-4 py-2 rounded-lg ${
                  message.isUser
                    ? 'bg-blue-500 text-white'
                    : 'bg-gray-200 text-gray-800'
                }`}
              >
                {message.text}
                {message.isStreaming && (
                  <span className="inline-block w-2 h-4 bg-blue-500 ml-1 animate-pulse" />
                )}
              </div>

              {message.isUser && (
                <div className="bg-gray-500 p-2 rounded-full">
                  <User className="w-4 h-4 text-white" />
                </div>
              )}
            </div>
          ))}
        </div>

        {/* Input */}
        <div className="border-t p-4">
          <div className="flex space-x-2">
            <input
              type="text"
              value={input}
              onChange={(e) => setInput(e.target.value)}
              onKeyPress={handleKeyPress}
              placeholder="Type your message..."
              className="flex-1 border border-gray-300 rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
              disabled={isStreaming}
            />
            {isStreaming ? (
              <button
                onClick={stopStreaming}
                className="bg-red-500 hover:bg-red-600 text-white px-4 py-2 rounded-lg transition-colors"
              >
                Stop
              </button>
            ) : (
              <button
                onClick={sendMessage}
                disabled={!input.trim()}
                className="bg-blue-500 hover:bg-blue-600 disabled:bg-gray-300 text-white p-2 rounded-lg transition-colors"
              >
                <Send className="w-5 h-5" />
              </button>
            )}
          </div>
        </div>
      </div>
    </div>
  )
}

export default StreamingChat

🆕 The buildConversationHistory function

  • What it does: Converts your messages to the backend format
  • When it runs: Before each API request

🆕 const conversationHistory = buildConversationHistory(messages)

  • What it does: Builds the history from the current messages
  • Why: We need this to send to the backend

🆕 conversationHistory in the request body

  • What it does: Includes the conversation history in each request
  • Why: The backend needs it to provide context to the AI

🔄 Updated header text

  • What changed: The header now reads “Streaming AI Chat with Memory”
  • Why: Visual confirmation that memory is enabled

🧪 Step 3: Test the Memory

  1. Start both servers (backend and frontend)
  2. Open your streaming chat
  3. Test the memory with this conversation:
You: "My name is Sarah and I'm 25 years old"
AI: "Nice to meet you, Sarah! It's great to know you're 25. How can I help you today?"
You: "What's my name and age?"
AI: "Your name is Sarah and you're 25 years old, as you mentioned earlier."
You: "Can you remember what I told you?"
AI: "Yes! You told me your name is Sarah and that you're 25 years old."
You: "Tell me a joke about my age"
AI: "Here's a joke for a 25-year-old: Why don't 25-year-olds ever feel old? Because they're still in their 'twenty-fun' years!"

The AI should remember everything you’ve told it and reference earlier parts of the conversation!
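
You can also run the same check without the UI. Here's a quick sketch of a Node script, assuming Node 18+ (for the global fetch), run as an ES module (for top-level await), with the backend on port 8000:

// Smoke test for the memory-enabled streaming endpoint.
const response = await fetch("http://localhost:8000/api/chat/stream", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    message: "What's my name?",
    conversationHistory: [
      { role: "user", content: "My name is Sarah" },
      { role: "assistant", content: "Nice to meet you, Sarah!" }
    ]
  })
});

// Print the plain-text stream as it arrives, chunk by chunk.
const decoder = new TextDecoder();
for await (const chunk of response.body) {
  process.stdout.write(decoder.decode(chunk, { stream: true }));
}

If the reply mentions Sarah, the history is reaching the model.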


Here’s how a conversation flows from frontend state, through backend processing, to the AI’s input:

// 1. Frontend state (your messages array)
messages = [
  { text: "I'm Sarah", isUser: true },
  { text: "Nice to meet you!", isUser: false }
]

// 2. Backend processing (conversationHistory from the request body)
conversationHistory = [
  { role: "user", content: "I'm Sarah" },
  { role: "assistant", content: "Nice to meet you!" }
]

// 3. AI input (the contextual message the backend builds)
contextualMessage = `Previous conversation:
User: I'm Sarah
Assistant: Nice to meet you!

Current question: What's my name?`
Here’s a longer example of the context the backend builds:

// If you have this conversation history:
conversationHistory = [
  { role: "user", content: "My name is Sarah and I'm 25" },
  { role: "assistant", content: "Nice to meet you, Sarah!" },
  { role: "user", content: "I love programming" },
  { role: "assistant", content: "That's awesome! What languages?" }
]

// And the current message is: "What do you know about me?"
// The backend creates this contextual message:
contextualMessage = `
Previous conversation:
User: My name is Sarah and I'm 25
Assistant: Nice to meet you, Sarah!
User: I love programming
Assistant: That's awesome! What languages?

Current question: What do you know about me?
`

And here’s the conversion on the frontend side:

// Your messages array format (unchanged):
messages = [
  { text: "My name is Sarah", isUser: true, id: 1, isStreaming: false },
  { text: "Nice to meet you!", isUser: false, id: 2, isStreaming: false },
  { text: "What's my name?", isUser: true, id: 3, isStreaming: false }
]

// buildConversationHistory converts the earlier messages to the backend format:
conversationHistory = [
  { role: "user", content: "My name is Sarah" },
  { role: "assistant", content: "Nice to meet you!" }
]

// Note: the current question "What's my name?" is not in the history because
// the history is built before the new message is added to state; the question
// is sent separately as the 'message' parameter.

If you want to add the same memory to your normal chat page, apply the same two changes there: first the backend endpoint, then the frontend send logic.

// Same changes, but for the /api/chat endpoint
app.post("/api/chat", async (req, res) => {
  try {
    // 🆕 MEMORY ADDITION: Accept conversationHistory
    const { message, conversationHistory = [] } = req.body;

    // 🆕 MEMORY ADDITION: Build context-aware message
    let contextualMessage = message;
    if (conversationHistory.length > 0) {
      const context = conversationHistory
        .map(msg => `${msg.role === 'user' ? 'User' : 'Assistant'}: ${msg.content}`)
        .join('\n');
      contextualMessage = `Previous conversation:\n${context}\n\nCurrent question: ${message}`;
    }

    const response = await openai.responses.create({
      model: "gpt-4o-mini",
      input: contextualMessage, // 🔄 CHANGED: was 'message', now includes context
    });

    res.json({
      response: response.output_text,
      success: true,
    });
  } catch (error) {
    console.error("OpenAI API Error:", error);
    res.status(500).json({
      error: "Failed to get AI response",
      success: false,
    });
  }
});
// 🆕 MEMORY ADDITION: Add the same buildConversationHistory function
const buildConversationHistory = (messages) => {
  return messages
    .filter(msg => !msg.isStreaming)
    .map(msg => ({
      role: msg.isUser ? "user" : "assistant",
      content: msg.text
    }));
};

// In your sendMessage function:
const sendMessage = async () => {
  // ... your existing code ...

  // 🆕 MEMORY ADDITION: Build conversation history
  const conversationHistory = buildConversationHistory(messages)

  const response = await fetch('http://localhost:8000/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      message: input,
      conversationHistory: conversationHistory // 🆕 MEMORY ADDITION
    }),
  })

  // ... rest of your existing code ...
}
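
Since /api/chat returns JSON rather than a stream, the response handling in your existing code keeps the same shape. For reference, a sketch of what that looks like (your state updates may differ):

// The /api/chat endpoint responds with JSON: { response, success }
const data = await response.json()
if (data.success) {
  // Append the AI reply using your existing message shape
  setMessages(prev => [
    ...prev,
    { text: data.response, isUser: false, id: Date.now() + 1 }
  ])
}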

Keep these limitations in mind:

  • Session-based: Memory only lasts during the current session
  • Refresh resets: A page refresh clears all conversation history
  • No persistence: Data isn’t saved to a database or local storage
  • Token usage: Each message now sends the full conversation history (see the sliding-window sketch after these lists)
  • Growing costs: Longer conversations = more tokens = higher costs
  • API limits: Very long conversations might hit OpenAI’s token limits

This approach works well for:

  • Short conversations (under 50 messages)
  • Demo applications and prototypes
  • Single-session chats where context matters
  • Interactive tutorials or guided conversations

It’s a poor fit for:

  • Long conversations (50+ messages)
  • Multi-session apps requiring persistent memory
  • Cost-sensitive applications with high volume
  • Production apps needing persistent conversation history
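
If token growth becomes a concern, a simple mitigation is a sliding window that only sends the most recent completed messages. A minimal sketch (not part of the tutorial code; MAX_HISTORY_MESSAGES is an illustrative constant):

// 🆕 OPTIONAL: cap the history at the last N completed messages to bound
// token usage. MAX_HISTORY_MESSAGES is an illustrative value - tune it to
// your model's context window and budget.
const MAX_HISTORY_MESSAGES = 20;

const buildConversationHistory = (messages) => {
  return messages
    .filter(msg => !msg.isStreaming)  // only completed messages
    .slice(-MAX_HISTORY_MESSAGES)     // keep only the most recent ones
    .map(msg => ({
      role: msg.isUser ? "user" : "assistant",
      content: msg.text
    }));
};

The trade-off: the AI forgets anything older than the window, which is often acceptable for the short sessions this approach targets.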

Your enhanced chat now provides:

  • Full conversation memory - AI remembers everything in the current session
  • Context-aware responses - References and builds on previous messages
  • Natural conversations - Can ask follow-up questions and get relevant answers
  • Seamless streaming - Memory doesn’t affect the real-time streaming experience
  • Minimal changes - Only 3 small additions to your existing code
  • Backward compatible - Works with your existing streaming implementation
  • Error handling - A missing conversation history safely defaults to an empty array
  • Lean requests - Only completed messages are sent; in-progress streaming content is filtered out
  • Immediate feedback - Users see their messages appear instantly as before
  • Smart responses - AI provides relevant, contextual answers
  • Natural flow - Conversations feel more human and connected
  • Visual indicators - Updated header shows memory is enabled

Next Steps: Ready to explore persistent memory that survives page refreshes? Let’s build browser storage and database solutions! 🚀