Configuring LLM backends

Connecting MemGPT to various LLM backends

You can use MemGPT with various LLM backends, including the OpenAI API, Azure OpenAI, and local (or self-hosted) LLM backends.

OpenAI

To use MemGPT with an OpenAI API key, simply set the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY=YOUR_API_KEY # on Linux/Mac
set OPENAI_API_KEY=YOUR_API_KEY # on Windows (cmd.exe)
$Env:OPENAI_API_KEY = "YOUR_API_KEY" # on Windows (PowerShell)
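
To sanity-check that your key is valid, you can query the OpenAI models endpoint directly (this check is optional and happens outside of MemGPT; it assumes you have curl installed):

# should return a JSON list of available models if the key is valid
curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"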

When you run memgpt configure, make sure to select openai for both the LLM inference provider and embedding provider, for example:

$ memgpt configure
? Select LLM inference provider: openai
? Override default endpoint: https://api.openai.com/v1
? Select default model (recommended: gpt-4): gpt-4
? Select embedding provider: openai
? Select default preset: memgpt_chat
? Select default persona: sam_pov
? Select default human: cs_phd
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
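
Once the configuration is saved, you can start chatting with an agent using memgpt run:

$ memgpt run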

OpenAI Proxies

To use a custom OpenAI-compatible endpoint (such as a proxy), enter the proxy URL at the endpoint override prompt when running memgpt configure; the custom endpoint will be saved as the default endpoint.
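
For example, assuming a hypothetical proxy serving an OpenAI-compatible API at https://openai-proxy.example.com/v1 (substitute the base URL of your own proxy), the relevant prompts would look like:

$ memgpt configure
? Select LLM inference provider: openai
? Override default endpoint: https://openai-proxy.example.com/v1
...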

Azure OpenAI

To use MemGPT with Azure, export the following variables and then re-run memgpt configure:

# see https://github.com/openai/openai-python#microsoft-azure-endpoints
export AZURE_OPENAI_KEY=...
export AZURE_OPENAI_ENDPOINT=...
export AZURE_OPENAI_VERSION=...

# set the below if you are using deployment ids
export AZURE_OPENAI_DEPLOYMENT=...
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT=...

For example, if your endpoint is customproject.openai.azure.com (for both your GPT model and your embeddings model), you would set the following:

# change AZURE_OPENAI_VERSION to the latest version
export AZURE_OPENAI_KEY="YOUR_AZURE_KEY"
export AZURE_OPENAI_VERSION="2023-08-01-preview"
export AZURE_OPENAI_ENDPOINT="https://customproject.openai.azure.com"
export AZURE_OPENAI_EMBEDDING_ENDPOINT="https://customproject.openai.azure.com"

If you gave your deployments names other than the model defaults, you would also set the following:

# assume you called the gpt-4 (1106-Preview) deployment "personal-gpt-4-turbo"
export AZURE_OPENAI_DEPLOYMENT="personal-gpt-4-turbo"

# assume you called the text-embedding-ada-002 deployment "personal-embeddings"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT="personal-embeddings"

Replace export with set or $Env: if you are on Windows (see the OpenAI example).
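
To sanity-check your Azure credentials outside of MemGPT, you can send a minimal request directly to your chat deployment (this assumes curl is installed and that AZURE_OPENAI_DEPLOYMENT names a chat model deployment; the exact request shape can vary across api-versions):

# should return a short completion if the key, endpoint, and deployment are valid
curl "$AZURE_OPENAI_ENDPOINT/openai/deployments/$AZURE_OPENAI_DEPLOYMENT/chat/completions?api-version=$AZURE_OPENAI_VERSION" \
  -H "api-key: $AZURE_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}], "max_tokens": 16}'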

When you run memgpt configure, make sure to select azure for both the LLM inference provider and embedding provider, for example:

$ memgpt configure
? Select LLM inference provider: azure
? Select default model (recommended: gpt-4): gpt-4-1106-preview
? Select embedding provider: azure
? Select default preset: memgpt_chat
? Select default persona: sam_pov
? Select default human: cs_phd
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite

Note: your Azure endpoint must support function calling or you will get an error. See this GitHub issue for more information.

Google AI (Gemini) API

To run MemGPT with Gemini, simply select the appropriate settings during memgpt configure:

$ memgpt configure
? Select LLM inference provider: google_ai
? Enter your Google AI (Gemini) API key (see https://aistudio.google.com/app/apikey): *********
? Enter your Google AI (Gemini) service endpoint (see https://ai.google.dev/api/rest): generativelanguage
? Select default model: gemini-pro
Got context window 30720 for model gemini-pro (from Google API)
? Select your model's context window (see https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versioning#gemini-model-versions): 30720
? Select embedding provider: openai
? Select default preset: memgpt_chat
? Select default persona: sam_pov
? Select default human: basic
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
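
To sanity-check your Google AI (Gemini) key outside of MemGPT, you can list the models it has access to (this assumes curl is installed; replace YOUR_API_KEY with your key):

# should return a JSON list of Gemini models if the key is valid
curl "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_API_KEY"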

Anthropic (Claude) API

To run MemGPT with Claude, simply select the appropriate settings during memgpt configure:

$ memgpt configure
? Select LLM inference provider: anthropic
? Enter your Anthropic API key (starts with 'sk-', see https://console.anthropic.com/settings/keys): *********
? Override default endpoint: https://api.anthropic.com/v1
? Select default model: claude-3-opus-20240229
Got context window 200000 for model claude-3-opus-20240229
? Select your model's context window (see https://docs.anthropic.com/claude/docs/models-overview): 200000
? Select embedding provider: openai
? Select default preset: memgpt_chat
? Select default persona: sam_pov
? Select default human: basic
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
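
To sanity-check your Anthropic key outside of MemGPT, you can send a minimal request to the Messages API (this assumes curl is installed; replace YOUR_API_KEY with your key):

# should return a short completion if the key is valid
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-3-opus-20240229", "max_tokens": 16, "messages": [{"role": "user", "content": "hello"}]}'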

Cohere API

To run MemGPT with the Cohere API, simply select the appropriate settings during memgpt configure:

$ memgpt configure
? Select LLM inference provider: cohere
? Enter your Cohere API key (see https://dashboard.cohere.com/api-keys): ****************************************
? Override default endpoint: https://api.cohere.ai/v1
? Select default model: command-r-plus
Got context window 116000 for model command-r-plus
? Select your model's context window (see https://docs.cohere.com/docs/command-r): 116000
? Select embedding provider: openai
? Select default preset: memgpt_chat
? Select default persona: sam_pov
? Select default human: basic
? Select storage backend for archival data: chroma
? Select chroma backend: persistent
? Select storage backend for recall data: sqlite
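
To sanity-check your Cohere key outside of MemGPT, you can send a minimal request to the Chat API (this assumes curl is installed; replace YOUR_API_KEY with your key):

# should return a short reply if the key is valid
curl https://api.cohere.ai/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "content-type: application/json" \
  -d '{"model": "command-r-plus", "message": "hello"}'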

Local Models & Custom Endpoints

MemGPT supports running open source models, whether run locally or served as a hosted service. Setting up MemGPT to run with open models requires a bit more setup; follow the instructions here.
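
Before pointing MemGPT at a local backend, it can help to confirm the server is reachable. Many local backends expose an OpenAI-compatible API; assuming a hypothetical server listening on localhost port 5000 (substitute your own backend's address and port), you can check it with:

# should return the models the local server exposes
curl http://localhost:5000/v1/models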