AI Agent Memory: From Manual Implementation to Mem0 to AWS AgentCORE

Sudarshan Gouda

Introduction

AI agents need memory to remember past conversations, user preferences, and learned information. Just like humans have different types of memory (short-term, long-term, episodic), AI agents use different memory systems to function effectively.

This guide explains memory in simple terms and shows you how to implement it both without external tools (using pure Python) and with external tools (using specialized services). We'll end with complete end-to-end solutions: first Mem0, which combines all memory types behind one API, and then AWS AgentCORE Memory as an AWS-native alternative.


Understanding Memory Types (Simple Explanation)

Think of AI agent memory like human memory:

| Memory Type | What It Does | Simple Example |
| --- | --- | --- |
| Short-term Memory | Remembers current conversation | "What did the user just say?" |
| Long-term Memory | Remembers across sessions | "User prefers dark mode" (even after days) |
| Episodic Memory | Remembers specific past events | "Last week, user asked about Python" |
| Semantic Memory | Remembers facts and knowledge | "User is a software developer" |

Part 1: Memory Without External Tools

When you don't want to use external databases or services, you can implement memory using pure Python. This is great for:

  • Learning and prototyping
  • Small applications
  • Full control over your data

1.1 Simple Short-Term Memory (Current Conversation)

What it does: Keeps track of the current conversation.

```python
class SimpleShortTermMemory:
    """Remembers the current conversation"""

    def __init__(self, max_messages=10):
        self.messages = []
        self.max_messages = max_messages

    def add_message(self, role, content):
        """Add a message (user or assistant)"""
        self.messages.append({"role": role, "content": content})
        # Keep only recent messages
        if len(self.messages) > self.max_messages:
            self.messages.pop(0)  # Remove oldest

    def get_conversation(self):
        """Get all messages for the LLM"""
        return self.messages


# Usage
memory = SimpleShortTermMemory(max_messages=5)
memory.add_message("user", "Hi, I'm Alice")
memory.add_message("assistant", "Hello Alice! How can I help?")
memory.add_message("user", "What's my name?")

# Get conversation context
context = memory.get_conversation()
# The LLM can now see: user said "Hi, I'm Alice" and the assistant responded
```

1.2 Simple Long-Term Memory (User Preferences)

What it does: Remembers user preferences across sessions.

```python
import json
import os


class SimpleLongTermMemory:
    """Remembers user preferences and facts"""

    def __init__(self, storage_file="memory.json"):
        self.storage_file = storage_file
        self.data = self._load()

    def _load(self):
        """Load from file"""
        if os.path.exists(self.storage_file):
            with open(self.storage_file, 'r') as f:
                return json.load(f)
        # Both stores are keyed by user_id
        return {"preferences": {}, "facts": {}}

    def _save(self):
        """Save to file"""
        with open(self.storage_file, 'w') as f:
            json.dump(self.data, f, indent=2)

    def remember_preference(self, user_id, key, value):
        """Remember a user preference"""
        if user_id not in self.data["preferences"]:
            self.data["preferences"][user_id] = {}
        self.data["preferences"][user_id][key] = value
        self._save()

    def get_preference(self, user_id, key):
        """Get a user preference"""
        return self.data["preferences"].get(user_id, {}).get(key)

    def remember_fact(self, user_id, fact):
        """Remember a fact about the user"""
        if user_id not in self.data["facts"]:
            self.data["facts"][user_id] = []
        self.data["facts"][user_id].append(fact)
        self._save()

    def get_facts(self, user_id):
        """Get all facts about a user"""
        return self.data["facts"].get(user_id, [])


# Usage
ltm = SimpleLongTermMemory()

# Remember preferences
ltm.remember_preference("alice_123", "theme", "dark")
ltm.remember_preference("alice_123", "language", "Python")

# Remember facts
ltm.remember_fact("alice_123", "User is a software developer")
ltm.remember_fact("alice_123", "User works at TechCorp")

# Later, retrieve memories
theme = ltm.get_preference("alice_123", "theme")  # Returns "dark"
facts = ltm.get_facts("alice_123")                # Returns list of facts
```

1.3 Simple Episodic Memory (Past Interactions)

What it does: Remembers specific past conversations to learn from them.

```python
class SimpleEpisodicMemory:
    """Remembers past interactions"""

    def __init__(self, max_episodes=100):
        self.episodes = []
        self.max_episodes = max_episodes

    def add_episode(self, user_query, assistant_response, outcome="success"):
        """Store a past interaction"""
        episode = {
            "query": user_query,
            "response": assistant_response,
            "outcome": outcome
        }
        self.episodes.append(episode)
        # Keep only recent episodes
        if len(self.episodes) > self.max_episodes:
            self.episodes.pop(0)

    def find_similar(self, query, top_k=3):
        """Find similar past interactions"""
        # Simple keyword matching
        query_words = set(query.lower().split())
        scored = []
        for episode in self.episodes:
            episode_words = set(episode["query"].lower().split())
            # Count matching words
            matches = len(query_words.intersection(episode_words))
            if matches > 0:
                scored.append((matches, episode))
        # Sort by match count (the key avoids comparing dicts on ties)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [ep for _, ep in scored[:top_k]]


# Usage
episodic = SimpleEpisodicMemory()

# Store past successful interactions
episodic.add_episode(
    "How do I create a Python virtual environment?",
    "Use: python -m venv myenv, then activate with: source myenv/bin/activate",
    outcome="success"
)
episodic.add_episode(
    "What's the best way to handle Python dependencies?",
    "Use requirements.txt or pyproject.toml with pip or poetry",
    outcome="success"
)

# Find similar past interactions
similar = episodic.find_similar("How do I set up a Python project?")
# Returns similar past episodes that can be used as examples
```

1.4 Simple Semantic Memory (Knowledge Base)

What it does: Stores facts and knowledge that can be searched.

```python
class SimpleSemanticMemory:
    """Stores and searches knowledge"""

    def __init__(self):
        self.knowledge = []

    def add_knowledge(self, content, category="general"):
        """Add a piece of knowledge"""
        self.knowledge.append({
            "content": content,
            "category": category
        })

    def search(self, query, top_k=3):
        """Search for relevant knowledge"""
        query_words = set(query.lower().split())
        scored = []
        for item in self.knowledge:
            content_words = set(item["content"].lower().split())
            matches = len(query_words.intersection(content_words))
            if matches > 0:
                scored.append((matches, item))
        # Sort by match count (the key avoids comparing dicts on ties)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in scored[:top_k]]


# Usage
semantic = SimpleSemanticMemory()

# Add knowledge
semantic.add_knowledge("Alice is a data scientist at TechCorp", "user_profile")
semantic.add_knowledge("Alice prefers detailed technical explanations", "preferences")
semantic.add_knowledge("Alice uses Python and scikit-learn", "tools")

# Search for relevant knowledge
results = semantic.search("What tools does Alice use?")
# Top result: the "Alice uses Python and scikit-learn" entry
```

1.5 Complete Example: All Memory Types Together

```python
class SimpleMemoryAgent:
    """Agent with all memory types (no external tools)"""

    def __init__(self):
        self.short_term = SimpleShortTermMemory(max_messages=10)
        self.long_term = SimpleLongTermMemory()
        self.episodic = SimpleEpisodicMemory()
        self.semantic = SimpleSemanticMemory()

    def process_query(self, user_id, user_query):
        """Process a user query using all memory types"""
        # 1. Get long-term memories (preferences, facts)
        preferences = self.long_term.data.get("preferences", {}).get(user_id, {})
        facts = self.long_term.get_facts(user_id)

        # 2. Get similar past episodes (few-shot examples)
        similar_episodes = self.episodic.find_similar(user_query, top_k=2)

        # 3. Get relevant knowledge
        relevant_knowledge = self.semantic.search(user_query, top_k=2)

        # 4. Build context for the LLM
        episode_lines = "\n".join(
            f"Q: {e['query']}\nA: {e['response']}" for e in similar_episodes
        )
        knowledge_lines = "\n".join(k["content"] for k in relevant_knowledge)
        context = f"""User Preferences: {preferences}
Known Facts: {facts}

Similar Past Interactions:
{episode_lines}

Relevant Knowledge:
{knowledge_lines}

Current Conversation:
{self.short_term.get_conversation()}
"""

        # 5. Add to short-term memory
        self.short_term.add_message("user", user_query)

        # 6. Generate response (placeholder - replace with an actual
        #    LLM call that receives `context`)
        response = f"Response to: {user_query}"

        # 7. Store in episodic memory
        self.episodic.add_episode(user_query, response, outcome="success")

        # 8. Add response to short-term memory
        self.short_term.add_message("assistant", response)

        return response


# Usage
agent = SimpleMemoryAgent()

# Set up some memories
agent.long_term.remember_preference("alice_123", "theme", "dark")
agent.semantic.add_knowledge("Alice is a Python developer", "profile")

# Process queries
response1 = agent.process_query("alice_123", "Hi, I'm Alice")
response2 = agent.process_query("alice_123", "What's my favorite theme?")
# Agent remembers from long-term memory: "dark"
```

Part 2: Memory With External Tools

External tools provide better scalability, persistence, and advanced features like semantic search. They are a better fit for:

  • Production applications
  • Large-scale systems
  • Multiple users
  • Advanced search capabilities

2.1 LangGraph Checkpointer (Short-Term + Persistence)

What it does: Manages conversation state with automatic persistence.

```python
from typing import Annotated, TypedDict

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.graph.message import add_messages


# Define state
class ConversationState(TypedDict):
    messages: Annotated[list, add_messages]


# Initialize LLM
llm = ChatOpenAI(model="gpt-4")


# Create graph
def chat_node(state: ConversationState):
    response = llm.invoke(state["messages"])
    return {"messages": [response]}


workflow = StateGraph(ConversationState)
workflow.add_node("chat", chat_node)
workflow.add_edge(START, "chat")
workflow.add_edge("chat", END)

# Add checkpointer for persistence
checkpointer = InMemorySaver()
graph = workflow.compile(checkpointer=checkpointer)

# Usage - each thread_id maintains a separate conversation
config = {"configurable": {"thread_id": "user_alice"}}

# First message
result = graph.invoke(
    {"messages": [HumanMessage(content="Hi, I'm Alice!")]},
    config
)

# Second message - remembers the previous conversation
result = graph.invoke(
    {"messages": [HumanMessage(content="What's my name?")]},
    config
)
# LLM remembers: "Alice"
```

2.2 ChromaDB (Semantic Memory)

What it does: Vector database for semantic search over knowledge.

```python
import os
import uuid

import chromadb
from chromadb.utils import embedding_functions


class ChromaSemanticMemory:
    """Semantic memory using ChromaDB"""

    def __init__(self, collection_name="knowledge"):
        self.client = chromadb.PersistentClient(path="./chroma_db")
        # Use OpenAI embeddings
        self.embedding_fn = embedding_functions.OpenAIEmbeddingFunction(
            api_key=os.environ["OPENAI_API_KEY"],
            model_name="text-embedding-3-small"
        )
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            embedding_function=self.embedding_fn
        )

    def add_knowledge(self, content, metadata=None):
        """Add knowledge to the database"""
        self.collection.add(
            ids=[str(uuid.uuid4())],  # ChromaDB requires an ID per document
            documents=[content],
            metadatas=[metadata] if metadata else None
        )

    def search(self, query, n_results=3):
        """Search for semantically similar knowledge"""
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results
        )
        return results['documents'][0]  # Returns list of relevant content


# Usage
memory = ChromaSemanticMemory()

# Add knowledge
memory.add_knowledge("Alice prefers Python over JavaScript")
memory.add_knowledge("Alice is building a recommendation system")

# Search semantically
results = memory.search("What programming language does Alice like?")
# Returns: ["Alice prefers Python over JavaScript"]
# Even though the query doesn't match exactly, semantic search finds it
```

2.3 Pinecone (Episodic Memory at Scale)

What it does: Cloud vector database for storing millions of past interactions.

```python
from openai import OpenAI
from pinecone import Pinecone, ServerlessSpec


class PineconeEpisodicMemory:
    """Episodic memory using Pinecone"""

    def __init__(self, index_name="episodes"):
        self.pc = Pinecone()  # Reads PINECONE_API_KEY from the environment
        self.openai = OpenAI()

        # Create index if needed
        if index_name not in [idx.name for idx in self.pc.list_indexes()]:
            self.pc.create_index(
                name=index_name,
                dimension=1536,  # OpenAI embedding dimension
                metric="cosine",
                spec=ServerlessSpec(cloud="aws", region="us-east-1")
            )
        self.index = self.pc.Index(index_name)

    def _embed(self, text):
        """Create an embedding for the given text"""
        return self.openai.embeddings.create(
            model="text-embedding-3-small",
            input=text
        ).data[0].embedding

    def store_episode(self, episode_id, query, response, user_id):
        """Store a past interaction"""
        # Create embedding
        embedding = self._embed(f"Query: {query}\nResponse: {response}")

        # Store in Pinecone
        self.index.upsert(vectors=[{
            "id": episode_id,
            "values": embedding,
            "metadata": {
                "query": query,
                "response": response,
                "user_id": user_id
            }
        }])

    def find_similar(self, query, user_id=None, top_k=3):
        """Find similar past interactions"""
        # Create query embedding
        embedding = self._embed(query)

        # Search, optionally filtering by user
        results = self.index.query(
            vector=embedding,
            top_k=top_k,
            include_metadata=True,
            filter={"user_id": user_id} if user_id else None
        )
        return [match.metadata for match in results.matches]


# Usage
episodic = PineconeEpisodicMemory()

# Store episodes
episodic.store_episode(
    "ep_001",
    "How do I optimize a database query?",
    "Add indexes, use EXPLAIN, and optimize WHERE clauses",
    user_id="alice_123"
)

# Find similar
similar = episodic.find_similar(
    "My database is slow, what should I do?",
    user_id="alice_123"
)
# Returns similar past interactions
```

Part 3: End-to-End Solution with Mem0 (All Memory Types)

Mem0 is a specialized service that handles all memory types automatically. It extracts, stores, and retrieves memories intelligently.

Complete Mem0 Implementation

```python
from typing import Dict, List

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI
from mem0 import MemoryClient


class Mem0MemoryAgent:
    """Complete agent using Mem0 for all memory types"""

    def __init__(self):
        # Initialize Mem0 (requires the MEM0_API_KEY environment variable)
        self.mem0 = MemoryClient()

        # Initialize LLM
        self.llm = ChatOpenAI(model="gpt-4")

        # Create prompt template with memory context
        self.prompt = ChatPromptTemplate.from_messages([
            ("system", """You are a helpful personal assistant with memory.
Use the provided memories to personalize your responses.

Relevant Memories:
{memories}

Use these memories to provide personalized, context-aware responses."""),
            MessagesPlaceholder(variable_name="history"),
            ("user", "{input}")
        ])

    def get_memories(self, query: str, user_id: str) -> str:
        """Retrieve relevant memories for the current query"""
        try:
            results = self.mem0.search(query, user_id=user_id)
            if results.get("results"):
                memories = [f"- {mem['memory']}" for mem in results["results"]]
                return "\n".join(memories)
            return "No relevant memories found."
        except Exception as e:
            print(f"Memory retrieval error: {e}")
            return "No relevant memories found."

    def save_interaction(self, user_id: str, user_input: str,
                         assistant_response: str):
        """Save interaction to Mem0 - it automatically extracts memories"""
        try:
            self.mem0.add(
                messages=[
                    {"role": "user", "content": user_input},
                    {"role": "assistant", "content": assistant_response}
                ],
                user_id=user_id
            )
        except Exception as e:
            print(f"Memory save error: {e}")

    def chat(self, user_input: str, user_id: str,
             history: List[Dict] = None) -> str:
        """Main chat function with full memory integration"""
        history = history or []

        # 1. Retrieve relevant memories (Mem0 handles all memory types)
        memories = self.get_memories(user_input, user_id)

        # 2. Generate response with memory context
        chain = self.prompt | self.llm
        response = chain.invoke({
            "memories": memories,
            "history": history,
            "input": user_input
        })

        # 3. Save interaction (Mem0 automatically extracts and stores memories)
        self.save_interaction(user_id, user_input, response.content)

        return response.content

    def get_all_memories(self, user_id: str) -> List[Dict]:
        """Get all memories for a user"""
        try:
            results = self.mem0.get_all(user_id=user_id)
            return results.get("results", [])
        except Exception as e:
            print(f"Error retrieving memories: {e}")
            return []

    def delete_memory(self, memory_id: str):
        """Delete a specific memory"""
        try:
            self.mem0.delete(memory_id=memory_id)
        except Exception as e:
            print(f"Error deleting memory: {e}")


# Complete Usage Example
def main():
    """End-to-end example using Mem0"""
    # Initialize agent
    agent = Mem0MemoryAgent()
    user_id = "alice_123"
    conversation_history = []

    print("=== Conversation 1 ===")
    # First interaction
    user_input1 = ("Hi! I'm Alice and I love hiking in the mountains. "
                   "I'm a Python developer at TechCorp.")
    response1 = agent.chat(user_input1, user_id, conversation_history)
    print(f"User: {user_input1}")
    print(f"Assistant: {response1}\n")

    # Update history
    conversation_history.append({"role": "user", "content": user_input1})
    conversation_history.append({"role": "assistant", "content": response1})

    print("=== Conversation 2 (Same Session) ===")
    # Second interaction - Mem0 remembers from the first conversation
    user_input2 = "What outdoor activities would you recommend for this weekend?"
    response2 = agent.chat(user_input2, user_id, conversation_history)
    print(f"User: {user_input2}")
    print(f"Assistant: {response2}\n")
    # Mem0 recalls: Alice loves hiking → recommends hiking activities

    print("=== Conversation 3 (New Session - Days Later) ===")
    # New session - Mem0 still remembers!
    user_input3 = "What programming language should I use for my new project?"
    response3 = agent.chat(user_input3, user_id, [])  # Empty history, but Mem0 remembers
    print(f"User: {user_input3}")
    print(f"Assistant: {response3}\n")
    # Mem0 recalls: Alice is a Python developer → recommends Python

    print("=== All Memories for User ===")
    # View all stored memories
    all_memories = agent.get_all_memories(user_id)
    for i, mem in enumerate(all_memories, 1):
        print(f"{i}. {mem.get('memory', 'N/A')}")

    print("\n=== Memory Types Handled by Mem0 ===")
    print("""
Mem0 automatically handles:
- Short-term Memory: Current conversation context
- Long-term Memory: User preferences and facts (persisted)
- Episodic Memory: Past interactions and experiences
- Semantic Memory: Knowledge about the user and domain

All extracted automatically from conversations!
""")


if __name__ == "__main__":
    main()
```

How Mem0 Handles All Memory Types

Mem0 automatically extracts and manages different memory types:

  1. Short-term Memory: Maintains conversation context during the session
  2. Long-term Memory: Extracts user preferences and facts, stores them persistently
  3. Episodic Memory: Remembers specific past interactions and their outcomes
  4. Semantic Memory: Builds a knowledge base about users and topics

Key Benefits of Mem0:

  • ✅ Automatic memory extraction (no manual coding)
  • ✅ Intelligent retrieval (finds relevant memories)
  • ✅ Handles all memory types automatically
  • ✅ Production-ready and scalable
  • ✅ Simple API

Part 4: AWS AgentCORE Memory (Alternative to Mem0)

AWS Bedrock AgentCORE Memory is a fully managed AWS service that provides similar capabilities to Mem0. It's designed for applications already using AWS services and offers enterprise-grade features.

Can AWS AgentCORE Memory be Used Like Mem0?

Yes! AWS AgentCORE Memory can be used similarly to Mem0. Both provide:

  • Short-term and long-term memory
  • Automatic memory extraction
  • Context-aware retrieval
  • Multi-session persistence

Key Differences

| Feature | Mem0 | AWS AgentCORE Memory |
| --- | --- | --- |
| Deployment | Open-source + managed | Fully managed AWS service |
| Integration | Works with any LLM | Optimized for AWS Bedrock |
| Setup | Simple API key | AWS account + IAM setup |
| Cost | Usage-based pricing | AWS pricing model |
| Customization | Open-source option available | AWS-managed (less customization) |
| Best For | Multi-cloud, flexibility | AWS-native applications |

AWS AgentCORE Memory Implementation

Note: This requires the bedrock-agentcore Python SDK. Install with:

```bash
pip install bedrock-agentcore
```
```python
from datetime import datetime
from typing import Dict, List

from bedrock_agentcore.memory import MemoryClient
from bedrock_agentcore.memory.session import MemorySessionManager
from bedrock_agentcore.memory.constants import ConversationalMessage, MessageRole


class AWSAgentCOREMemory:
    """Agent using AWS Bedrock AgentCORE Memory"""

    def __init__(self, region_name="us-east-1", memory_name="AgentMemory"):
        # Initialize Memory Client
        self.memory_client = MemoryClient(region_name=region_name)

        # Create or get memory resource
        self.memory = self._get_or_create_memory(memory_name)
        self.memory_id = self.memory['id']

        # Initialize session manager
        self.session_manager = MemorySessionManager(
            memory_id=self.memory_id,
            region_name=region_name
        )

    def _get_or_create_memory(self, name: str) -> Dict:
        """Create or retrieve memory resource"""
        try:
            # Try to get existing memory
            memories = self.memory_client.list_memories()
            for mem in memories.get('memories', []):
                if mem.get('name') == name:
                    return mem

            # Create new memory if not found
            memory = self.memory_client.create_memory(
                name=name,
                description="Memory store for AI agent",
                eventExpiryDuration=30,  # Store events for 30 days
                memoryStrategies=[
                    {
                        "userPreferenceMemoryStrategy": {
                            "name": "UserPreferences",
                            "namespaces": ["agent/{actorId}/preferences"]
                        }
                    },
                    {
                        "semanticMemoryStrategy": {
                            "name": "SemanticKnowledge",
                            "namespaces": ["agent/{actorId}/knowledge"]
                        }
                    }
                ]
            )
            return memory
        except Exception as e:
            print(f"Error creating memory: {e}")
            raise

    def store_interaction(self, user_id: str, session_id: str,
                          user_message: str, assistant_message: str):
        """Store interaction in short-term memory"""
        try:
            # Create or get session
            session = self.session_manager.create_memory_session(
                actor_id=user_id,
                session_id=session_id
            )
            # Add conversation turns
            session.add_turns(
                messages=[
                    ConversationalMessage(user_message, MessageRole.USER),
                    ConversationalMessage(assistant_message, MessageRole.ASSISTANT)
                ]
            )
        except Exception as e:
            print(f"Error storing interaction: {e}")

    def get_recent_events(self, user_id: str, session_id: str,
                          max_results: int = 10) -> List[Dict]:
        """Get recent events from short-term memory"""
        try:
            events = self.memory_client.list_events(
                memory_id=self.memory_id,
                actor_id=user_id,
                session_id=session_id,
                max_results=max_results
            )
            return events.get('events', [])
        except Exception as e:
            print(f"Error retrieving events: {e}")
            return []

    def retrieve_long_term_memories(self, user_id: str, query: str,
                                    top_k: int = 5) -> List[Dict]:
        """Retrieve long-term memories (preferences, facts)"""
        try:
            # Search in the preferences namespace
            preferences = self.memory_client.retrieve_memory_records(
                memory_id=self.memory_id,
                namespace=f"agent/{user_id}/preferences",
                searchCriteria={
                    "searchQuery": query,
                    "topK": top_k
                }
            )
            # Search in the knowledge namespace
            knowledge = self.memory_client.retrieve_memory_records(
                memory_id=self.memory_id,
                namespace=f"agent/{user_id}/knowledge",
                searchCriteria={
                    "searchQuery": query,
                    "topK": top_k
                }
            )
            # Combine results
            all_memories = (preferences.get('memoryRecords', []) +
                            knowledge.get('memoryRecords', []))
            return all_memories[:top_k]
        except Exception as e:
            print(f"Error retrieving long-term memories: {e}")
            return []

    def get_all_long_term_memories(self, user_id: str) -> List[Dict]:
        """Get all long-term memories for a user"""
        try:
            session = self.session_manager.create_memory_session(
                actor_id=user_id,
                session_id="retrieval_session"
            )
            # List all memory records
            memory_records = session.list_long_term_memory_records(
                namespace_prefix=f"agent/{user_id}/"
            )
            return list(memory_records)
        except Exception as e:
            print(f"Error getting all memories: {e}")
            return []

    def chat(self, user_input: str, user_id: str, session_id: str,
             llm_callback=None) -> str:
        """
        Main chat function with AgentCORE Memory integration

        Args:
            user_input: User's message
            user_id: Unique user identifier
            session_id: Session identifier
            llm_callback: Function to call the LLM (you provide this)

        Returns:
            Assistant response
        """
        # 1. Get recent events (short-term memory)
        recent_events = self.get_recent_events(user_id, session_id, max_results=5)

        # 2. Get long-term memories (preferences, facts)
        long_term_memories = self.retrieve_long_term_memories(
            user_id, user_input, top_k=3
        )

        # 3. Build context from memories
        context = self._build_context(recent_events, long_term_memories)

        # 4. Generate response using the LLM (you provide this function)
        if llm_callback:
            assistant_response = llm_callback(user_input, context)
        else:
            # Placeholder response
            assistant_response = f"Response to: {user_input}"

        # 5. Store interaction in memory
        self.store_interaction(user_id, session_id, user_input, assistant_response)

        return assistant_response

    def _build_context(self, recent_events: List[Dict],
                       long_term_memories: List[Dict]) -> str:
        """Build context string from memories"""
        context_parts = []

        # Add recent conversation context
        if recent_events:
            context_parts.append("Recent Conversation:")
            for event in recent_events[-5:]:  # Last 5 events
                for msg in event.get('messages', []):
                    role = msg.get('role', '')
                    content = msg.get('content', '')
                    context_parts.append(f"{role}: {content}")

        # Add long-term memories
        if long_term_memories:
            context_parts.append("\nRelevant Memories:")
            for mem in long_term_memories:
                content = mem.get('content', {}).get('text', '')
                if content:
                    context_parts.append(f"- {content}")

        return "\n".join(context_parts)


# Usage Example
def llm_generate(user_input: str, context: str) -> str:
    """
    Example LLM callback function

    In production, replace with an actual LLM call (Bedrock, OpenAI, etc.)
    """
    # This is a placeholder - replace with your LLM
    return f"Based on context: {context[:50]}... Response to: {user_input}"


def main_aws():
    """Example using AWS AgentCORE Memory"""
    # Initialize AgentCORE Memory
    agent = AWSAgentCOREMemory(region_name="us-east-1",
                               memory_name="MyAgentMemory")
    user_id = "alice_123"
    session_id = f"session_{datetime.now().timestamp()}"

    print("=== AWS AgentCORE Memory Example ===\n")

    # First interaction
    print("--- Conversation 1 ---")
    user_input1 = ("Hi! I'm Alice and I love hiking in the mountains. "
                   "I'm a Python developer at TechCorp.")
    response1 = agent.chat(user_input1, user_id, session_id,
                           llm_callback=llm_generate)
    print(f"User: {user_input1}")
    print(f"Assistant: {response1}\n")
    # AgentCORE stores this in short-term memory and extracts long-term memories

    # Second interaction - AgentCORE remembers from short-term memory
    print("--- Conversation 2 (Same Session) ---")
    user_input2 = "What outdoor activities would you recommend for this weekend?"
    response2 = agent.chat(user_input2, user_id, session_id,
                           llm_callback=llm_generate)
    print(f"User: {user_input2}")
    print(f"Assistant: {response2}\n")
    # AgentCORE recalls: Alice loves hiking (from short-term memory)

    # New session - long-term memory persists
    print("--- Conversation 3 (New Session - Days Later) ---")
    new_session_id = f"session_{datetime.now().timestamp()}"
    user_input3 = "What programming language should I use for my new project?"
    response3 = agent.chat(user_input3, user_id, new_session_id,
                           llm_callback=llm_generate)
    print(f"User: {user_input3}")
    print(f"Assistant: {response3}\n")
    # AgentCORE recalls from long-term memory: Alice is a Python developer

    # View all stored memories
    print("--- All Long-Term Memories for User ---")
    all_memories = agent.get_all_long_term_memories(user_id)
    for i, mem in enumerate(all_memories, 1):
        content = mem.get('content', {}).get('text', 'N/A')
        print(f"{i}. {content}")


if __name__ == "__main__":
    # Note: Requires:
    # 1. AWS credentials configured (aws configure)
    # 2. Bedrock AgentCORE access enabled
    # 3. Install: pip install bedrock-agentcore
    #
    # Uncomment to run:
    # main_aws()
    pass
```

How AWS AgentCORE Memory Works

AWS AgentCORE Memory provides:

  1. Short-Term Memory:

    • Stores raw interaction events using create_event() or add_turns() (see the sketch after this list)
    • Events organized by actor (user) and session
    • Maintains chronological order for conversation flow
    • Configurable retention (up to 365 days)
  2. Long-Term Memory:

    • Uses Memory Strategies to extract insights from events
    • Built-in strategies: userPreferenceMemoryStrategy, semanticMemoryStrategy
    • Stores extracted memories in hierarchical namespaces
    • Persists across sessions automatically
    • Retrieved using retrieve_memory_records() with search queries
  3. Memory Strategies:

    • User Preference Strategy: Extracts user preferences and settings
    • Semantic Strategy: Extracts facts and knowledge
    • Custom strategies can be defined for specific needs
    • Strategies process events and create long-term memory records
  4. Security:

    • Data encrypted at rest and in transit
    • AWS-managed or customer-managed KMS keys
    • Fine-grained access control via namespaces
    • IAM-based authentication
  5. Scalability:

    • Fully managed service - no infrastructure to manage
    • Handles large volumes efficiently
    • Low latency retrieval
    • Built for production workloads
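
For reference, here is a minimal sketch of the raw create_event() path mentioned in point 1, modeled on AWS's published bedrock-agentcore samples. Treat the argument names and the (text, role) message-tuple format as assumptions that may vary across SDK versions:

```python
from bedrock_agentcore.memory import MemoryClient

client = MemoryClient(region_name="us-east-1")

# Store one conversational turn as a raw short-term event.
# memory_id is the ID returned by create_memory(); messages are
# (text, role) pairs, following the AWS sample code.
client.create_event(
    memory_id="your-memory-id",   # placeholder - use your real memory ID
    actor_id="alice_123",         # the user this event belongs to
    session_id="session_001",     # groups events into one conversation
    messages=[
        ("Hi, I'm Alice", "USER"),
        ("Hello Alice! How can I help?", "ASSISTANT"),
    ],
)
```

Any memory strategies attached to the memory resource then process these events asynchronously into long-term records.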

When to Use AWS AgentCORE Memory vs Mem0

Choose AWS AgentCORE Memory if:

  • ✅ You're already using AWS services
  • ✅ You need enterprise-grade security and compliance
  • ✅ You want fully managed infrastructure
  • ✅ You're building AWS-native applications

Choose Mem0 if:

  • ✅ You want open-source flexibility
  • ✅ You're using multiple cloud providers
  • ✅ You need more customization
  • ✅ You want simpler setup (just API key)

Setup Requirements for AWS AgentCORE Memory

  1. AWS Account: Active AWS account with Bedrock AgentCORE access
  2. Install SDK: pip install bedrock-agentcore
  3. AWS Credentials: Configure using aws configure or IAM roles
  4. IAM Permissions: Required permissions for Bedrock AgentCORE Memory
  5. Region: Available in specific AWS regions (e.g., us-east-1, us-west-2)
```python
# Prerequisites setup (one-time)
"""
1. Install the SDK:
   pip install bedrock-agentcore

2. Configure AWS credentials:
   aws configure
   # Or use IAM roles if running on EC2/Lambda

3. Required IAM permissions:
   - bedrock:CreateMemory
   - bedrock:GetMemory
   - bedrock:ListMemories
   - bedrock:UpdateMemory
   - bedrock:DeleteMemory
   - bedrock:CreateEvent
   - bedrock:ListEvents
   - bedrock:RetrieveMemoryRecords
   - bedrock:ListMemoryRecords

4. Enable Bedrock AgentCORE in the AWS Console:
   - Go to the AWS Bedrock Console
   - Request access to AgentCORE features
   - Wait for approval (if required)

5. Create a memory resource:
   - The code above creates a memory resource automatically
   - Or create one manually via the AWS Console/CLI
"""
```

Comparison: Memory Solutions

Quick Comparison Table

| Aspect | Manual (No Tools) | Mem0 | AWS AgentCORE Memory |
| --- | --- | --- | --- |
| Setup Complexity | Very simple | Simple (API key) | Moderate (AWS setup) |
| Scalability | Single machine | High | Enterprise-scale |
| Search Quality | Keyword matching | Semantic search | Semantic search |
| Memory Extraction | Manual coding | Automatic | Automatic |
| Persistence | File-based | Database-backed | AWS-managed |
| Cost | Free | Usage-based | AWS pricing |
| Best For | Learning, prototyping | Production (flexible) | AWS-native apps |
| Open Source | Yes | Yes (option) | No (AWS-managed) |
| Multi-Cloud | N/A | Yes | No (AWS only) |

Detailed Comparison

Manual Memory (No External Tools)

  • Pros: Free, full control, simple setup, no dependencies
  • Cons: Limited scalability, manual extraction, basic search
  • Use When: Learning, prototyping, small applications

Mem0

  • Pros: Automatic extraction, simple API, open-source option, multi-cloud
  • Cons: Requires API key, usage costs for managed version
  • Use When: Production apps needing flexibility, multi-cloud deployments

AWS AgentCORE Memory

  • Pros: Enterprise-grade, AWS integration, fully managed, high security
  • Cons: AWS-only, more complex setup, AWS account required
  • Use When: AWS-native applications, enterprise requirements, need AWS integration

Best Practices

1. Choose the Right Approach

  • Start Simple: Use manual memory for learning and prototyping
  • Scale Up: Move to external tools when you need production features
  • Consider Mem0: If you want automatic memory management with flexibility
  • Consider AWS AgentCORE Memory: If you're building AWS-native applications and need enterprise features

2. Memory Hygiene

  • Regular Cleanup: Remove old or irrelevant memories
  • Deduplication: Avoid storing duplicate information (see the sketch after this list)
  • Validation: Check memory quality before storing
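
To make the deduplication point concrete, here is a minimal sketch that fingerprints normalized memory text before storing, so trivially re-worded duplicates are written only once. The normalization rule is an illustrative assumption; production systems often use embedding similarity instead:

```python
import hashlib


class DedupingMemoryStore:
    """Wraps a memory list and skips exact duplicates after normalization."""

    def __init__(self):
        self.seen_hashes = set()
        self.memories = []

    def _fingerprint(self, text: str) -> str:
        # Normalize whitespace and case so trivial variants collide
        normalized = " ".join(text.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def add(self, text: str) -> bool:
        """Store a memory; returns False if it was a duplicate."""
        fp = self._fingerprint(text)
        if fp in self.seen_hashes:
            return False
        self.seen_hashes.add(fp)
        self.memories.append(text)
        return True


store = DedupingMemoryStore()
store.add("Alice prefers dark mode")    # stored
store.add("alice prefers  DARK mode")   # skipped as a duplicate
```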

3. Privacy and Security

  • Encrypt Sensitive Data: Protect user information
  • User Consent: Get permission before storing memories
  • User Control: Let users view and delete their memories

4. Performance

  • Batch Operations: Store multiple memories at once when possible
  • Caching: Cache frequently accessed memories (see the sketch after this list)
  • Indexing: Use proper indexes for fast retrieval
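
As a sketch of the caching idea, here is a small TTL cache wrapped around a memory lookup. The fetch_fn callback is a stand-in for whatever search call your backend exposes (Mem0, ChromaDB, etc.), so the interface here is an assumption:

```python
import time


class CachedMemoryReader:
    """Caches memory lookups for a short TTL to cut repeated retrievals."""

    def __init__(self, fetch_fn, ttl_seconds=60):
        self.fetch_fn = fetch_fn  # e.g. a Mem0 or ChromaDB search call
        self.ttl = ttl_seconds
        self.cache = {}           # (user_id, query) -> (timestamp, result)

    def get(self, user_id, query):
        key = (user_id, query)
        entry = self.cache.get(key)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]       # fresh cache hit
        result = self.fetch_fn(user_id, query)
        self.cache[key] = (time.time(), result)
        return result


# Usage with a hypothetical fetch function
reader = CachedMemoryReader(lambda uid, q: [f"memories for {uid}: {q}"])
reader.get("alice_123", "theme")   # hits the backend
reader.get("alice_123", "theme")   # served from the cache
```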

5. Memory Selection

  • Relevance First: Prioritize memories relevant to current context (a combined scoring sketch follows this list)
  • Recency Matters: Give more weight to recent memories
  • Success Filtering: Prefer memories from successful interactions
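
Here is a minimal sketch that combines all three selection signals into one score. The weights and half-life are illustrative assumptions; the relevance value would come from your keyword or vector search:

```python
import math
import time


def rank_memories(memories, now=None, half_life_days=30.0):
    """Score memories by relevance, recency, and outcome, highest first.

    Each memory is a dict with keys: 'relevance' (0..1 from search),
    'timestamp' (unix seconds), and 'outcome' ('success' or other).
    The 0.6/0.3/0.1 weights below are illustrative, not canonical.
    """
    now = now or time.time()
    scored = []
    for mem in memories:
        age_days = (now - mem["timestamp"]) / 86400
        # Exponential recency decay: half weight every half_life_days
        recency = math.exp(-math.log(2) * age_days / half_life_days)
        success = 1.0 if mem.get("outcome") == "success" else 0.5
        score = 0.6 * mem["relevance"] + 0.3 * recency + 0.1 * success
        scored.append((score, mem))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [mem for _, mem in scored]


# Usage (illustrative data): the recent, successful memory ranks first
mems = [
    {"relevance": 0.9, "timestamp": time.time() - 5 * 86400, "outcome": "success"},
    {"relevance": 0.9, "timestamp": time.time() - 90 * 86400, "outcome": "failure"},
]
best = rank_memories(mems)[0]
```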

6. Testing

  • Test Memory Retrieval: Ensure relevant memories are found
  • Test Memory Persistence: Verify memories survive restarts (example test after this list)
  • Test Memory Extraction: Confirm automatic extraction works correctly
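
As an example of the persistence check, here is a minimal pytest sketch against the SimpleLongTermMemory class from Part 1. The module name in the import is hypothetical; the tmp_path fixture keeps the test isolated from your real memory file:

```python
# test_memory.py -- run with: pytest test_memory.py
from my_agent_memory import SimpleLongTermMemory  # hypothetical module name


def test_memories_survive_restart(tmp_path):
    storage = str(tmp_path / "memory.json")

    # First "process": store a preference, then discard the object
    ltm = SimpleLongTermMemory(storage_file=storage)
    ltm.remember_preference("alice_123", "theme", "dark")
    del ltm

    # Simulated restart: a fresh instance must reload from disk
    reloaded = SimpleLongTermMemory(storage_file=storage)
    assert reloaded.get_preference("alice_123", "theme") == "dark"
```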

Conclusion

Memory is essential for building intelligent AI agents. Whether you start with simple Python implementations or use advanced tools like Mem0 or AWS AgentCORE Memory, the key is understanding what each memory type does and when to use it.

Quick Decision Guide:

  • Learning/Prototyping: Use manual memory (Part 1)
  • Production App (Flexible): Use Mem0 (Part 3) - works with any cloud
  • Production App (AWS): Use AWS AgentCORE Memory (Part 4) - AWS-native
  • Custom Needs: Use individual tools like ChromaDB, Pinecone (Part 2)

Start simple, understand the concepts, then scale up as needed. The examples in this guide provide working code you can adapt to your needs.

Key Takeaway: Both Mem0 and AWS AgentCORE Memory can be used similarly - they both provide automatic memory extraction and management. Choose based on your infrastructure preferences (multi-cloud vs AWS-only).

