Agentic AI, Enterprise

Zep Memory Assistant - AI Agent with Human-Like Memory

An enterprise-ready AI agent platform with persistent memory that enables intelligent, personalized, and context-aware conversations across sessions using Zep Cloud and Microsoft AutoGen.

Dec 2025
12 min read

Project Overview

Traditional AI chatbots forget everything between sessions, which leads to repetitive conversations and a poor user experience. We built a memory-powered agent system in which AI agents maintain long-term context using Zep Cloud's vector memory store, integrated with Microsoft AutoGen for multi-agent orchestration. The platform also includes enterprise features: JWT authentication, multi-tenant organizations with RBAC, PayPal payments, and SendGrid email integration.

95% Context Retention
<500ms Response Latency
99% Memory Accuracy
Session Persistence

System Architecture

The system uses a hub-and-spoke architecture with FastAPI as the central backend orchestrator. The React/Vite frontend communicates with the API, which manages multiple subsystems: Zep Cloud for vector-based long-term memory, AutoGen for agent orchestration, and PostgreSQL for persistent data. It also integrates with PayPal for payments, SendGrid for email, and LLM providers through OpenRouter.

Figure 1: System Architecture Diagram

ZepConversableAgent

Custom AutoGen agent with Zep memory hooks for automatic message persistence

Zep Cloud Memory

Vector store for semantic fact retrieval with configurable minimum rating thresholds

FastAPI Backend

RESTful API with async support, JWT auth, and comprehensive OpenAPI documentation

Multi-Tenant Organizations

RBAC-enabled organization management with Owner/Admin/Member roles

React + Vite Frontend

TypeScript-based modern SPA with responsive design
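
The Owner/Admin/Member roles listed above could be enforced with a simple ordered-role check. The sketch below is only an illustration under stated assumptions, not the platform's actual code: the `Role` ordering, the `REQUIRED_ROLE` table, the action names, and `can_perform` are all hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    """Organization roles, ordered from least to most privileged (assumed ordering)."""
    MEMBER = 1
    ADMIN = 2
    OWNER = 3


# Hypothetical actions and the minimum role each one requires.
REQUIRED_ROLE = {
    "read_conversations": Role.MEMBER,
    "invite_member": Role.ADMIN,
    "manage_billing": Role.OWNER,
}


def can_perform(role: Role, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`.

    Unknown actions are denied by default.
    """
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required
```

Denying unknown actions by default keeps a missing table entry from silently granting access. In a multi-tenant setup the role would be looked up per organization before this check runs.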

Implementation Details

Code Example

python
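from autogen import ConversableAgent
from zep_cloud import Message
from zep_cloud.client import Zep


class ZepConversableAgent(ConversableAgent):
    """Custom AutoGen agent with Zep memory integration."""

    def __init__(self, name: str, system_message: str, llm_config: dict,
                 zep_session_id: str, zep_client: Zep, min_fact_rating: float):
        super().__init__(name=name, system_message=system_message, llm_config=llm_config)
        self.zep_session_id = zep_session_id
        self.zep_client = zep_client
        self.min_fact_rating = min_fact_rating  # read by _zep_fetch_and_update_system_message
        self.original_system_message = system_message
        self.register_hook('process_message_before_send', self._zep_persist_assistant_messages)

    def _zep_persist_assistant_messages(self, sender, message, recipient, silent):
        """Persist outgoing messages to Zep (minimal sketch; assumes zep-cloud v2 Message fields)."""
        content = message.get('content', '') if isinstance(message, dict) else message
        self.zep_client.memory.add(
            session_id=self.zep_session_id,
            messages=[Message(role_type='assistant', role=self.name, content=content)],
        )
        return message  # the hook must return the message so AutoGen can send it

    def _zep_fetch_and_update_system_message(self):
        """Fetch facts and inject into system prompt."""
        memory = self.zep_client.memory.get(self.zep_session_id, min_rating=self.min_fact_rating)
        context = memory.context or 'No specific facts recalled.'
        self.update_system_message(
            self.original_system_message + f'\n\nRelevant facts:\n{context}'
        )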

Agent Memory

Setting min_fact_rating=0.7 excludes facts rated below 0.7, so only higher-rated facts are injected into the agent's context. This reduces the risk of the agent building on uncertain memories while maintaining conversational continuity.
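
To make the effect of the threshold concrete, here is a minimal sketch of rating-based filtering. In the agent above the filtering is requested from Zep through `min_rating`; this example applies the same idea locally to show what ends up in the prompt. The `Fact` class and `build_system_message` are hypothetical stand-ins, not part of the Zep SDK.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    """Stand-in for a rated fact from the memory store (hypothetical shape)."""
    text: str
    rating: float  # assumed to lie in [0.0, 1.0]


def build_system_message(base: str, facts: list[Fact], min_fact_rating: float = 0.7) -> str:
    """Append facts rated at or above the threshold to the base system message."""
    kept = [f.text for f in facts if f.rating >= min_fact_rating]
    # Fall back to the same placeholder text the agent uses when nothing is recalled.
    context = "\n".join(f"- {t}" for t in kept) if kept else "No specific facts recalled."
    return f"{base}\n\nRelevant facts:\n{context}"
```

With facts rated 0.9 and 0.4 and the default threshold, only the 0.9 fact reaches the prompt. Raising the threshold trades recall for precision, so the right value depends on how the facts are rated.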

Workflow

1. Authentication: User logs in via JWT auth endpoint.
2. Session Creation: A new Zep session is created linking user to memory context.
3. Message Ingestion: User message is persisted to Zep and sent to AutoGen agent.
4. Memory Retrieval: Agent fetches relevant facts from Zep's vector store.
5. Response Generation: OpenRouter LLM generates context-aware response.
6. Memory Update: Response is stored in Zep for future context retrieval.
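
The six steps above can be sketched end to end with in-memory stand-ins. This is an illustration under stated assumptions rather than the production flow: `InMemoryStore`, `authenticate`, `handle_turn`, the token table, and the `llm` callable are all hypothetical, and real authentication would verify a signed JWT instead of looking up a token in a dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the memory service: session id -> (role, text) pairs."""
    sessions: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def create_session(self, session_id: str) -> None:
        # Idempotent: an existing session keeps its history.
        self.sessions.setdefault(session_id, [])

    def add(self, session_id: str, role: str, text: str) -> None:
        self.sessions[session_id].append((role, text))

    def recall(self, session_id: str) -> list[str]:
        # A real store would do semantic retrieval; this returns all user messages.
        return [text for role, text in self.sessions[session_id] if role == "user"]


def authenticate(token: str, tokens: dict[str, str]) -> str:
    """Step 1 (simplified): map a bearer token to a user id, or raise."""
    if token not in tokens:
        raise PermissionError("invalid token")
    return tokens[token]


def handle_turn(token: str, message: str, store: InMemoryStore,
                tokens: dict[str, str], llm: Callable[[list[str], str], str]) -> str:
    user_id = authenticate(token, tokens)       # 1. Authentication
    session_id = f"session-{user_id}"
    store.create_session(session_id)            # 2. Session creation
    store.add(session_id, "user", message)      # 3. Message ingestion
    facts = store.recall(session_id)            # 4. Memory retrieval
    reply = llm(facts, message)                 # 5. Response generation
    store.add(session_id, "assistant", reply)   # 6. Memory update
    return reply
```

Because the session id is derived from the user id, a second call with the same token sees the messages stored by the first, which is the cross-session continuity the workflow is designed to provide.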

Figure 2: Workflow Diagram

Results & Impact

"The Zep Memory Assistant transformed our customer support—agents now remember past interactions, reducing resolution time by 60% and dramatically improving customer satisfaction."

Context Retention

Eliminated 'Who are you again?' moments with persistent memory

Developer Experience

Full OpenAPI docs, TypeScript frontend, and modular architecture

Enterprise Ready

Multi-tenancy, payments, and email built-in from day one

Agentic AI, Memory, AutoGen, FastAPI, Zep Cloud, Multi-Tenancy, Vector DB, React, PostgreSQL, JWT, RBAC, PayPal, SendGrid

About the Author

Rahul Patil, AI Context Engineer

20+
Projects Delivered
1.5+
Industry Experience

Apex Neural

Rahul is an AI Context Engineer experienced in architecting agentic AI systems, scalable backend services, and full-stack SaaS platforms. His work includes LLM integrations, automation systems, OCR and document processing, web scraping, and fine-tuned AI models. He focuses on delivering production-ready AI solutions that solve real business problems.
