Caching

The SDK includes a thread-safe, in-memory TTL cache that reduces API calls by storing prompt versions locally. Caching is enabled by default with a 300-second (5-minute) TTL.

Default Behavior

When caching is enabled, get_prompt_version stores the result keyed by prompt name and label/version. Subsequent calls with the same arguments return the cached value until the TTL expires.

Python

# First call hits the API
version = client.get_prompt_version("greeting")

# Second call returns cached result (no API call)
version = client.get_prompt_version("greeting")

Ruby

# First call hits the API
version = client.get_prompt_version("greeting")

# Second call returns cached result (no API call)
version = client.get_prompt_version("greeting")

Setting a Custom TTL

Change the default TTL when creating the client:

Python

client = ScopeClient(
    credentials=credentials,
    cache_ttl=600,  # 10 minutes
)

Ruby

client = ScopeClient::Client.new(
  credentials: credentials,
  cache_ttl: 600,  # 10 minutes
)

Per-Request Cache Control

Skip the cache for a single request

Python

# Bypass cache and fetch fresh data
version = client.get_prompt_version("greeting", cache=False)

Ruby

# Bypass cache and fetch fresh data
version = client.get_prompt_version("greeting", cache: false)

Override TTL for a single request

Python

# Cache this result for 60 seconds instead of the default
version = client.get_prompt_version("greeting", cache_ttl=60)

Ruby

# Cache this result for 60 seconds instead of the default
version = client.get_prompt_version("greeting", cache_ttl: 60)
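
To make the two per-request options concrete, here is a minimal Python sketch of how a wrapper could honor a bypass flag and a TTL override. It is not the SDK's implementation: CachingFetcher, fetch, and clock are hypothetical names, locking is left out for brevity, and it assumes that a bypassed request neither reads nor updates the cached entry, which the text above does not state.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CachingFetcher:
    """Hypothetical wrapper for illustration; not the SDK's real code."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        default_ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # stands in for the real API call
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(
        self,
        name: str,
        cache: bool = True,
        cache_ttl: Optional[float] = None,
    ) -> str:
        if cache:
            entry = self._entries.get(name)
            if entry is not None and self._clock() < entry[0]:
                return entry[1]  # still fresh: no fetch needed

        value = self._fetch(name)

        if cache:
            # A per-request TTL takes precedence over the default.
            ttl = self._default_ttl if cache_ttl is None else cache_ttl
            self._entries[name] = (self._clock() + ttl, value)
        # Assumption: with cache=False the stored entry is left untouched.
        return value
```

In this sketch the per-request TTL applies only to the entry written by that call; later calls that omit cache_ttl fall back to the default when they store a new entry.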

Clearing the Cache

Remove all cached entries programmatically:

Python

client.clear_cache()

Ruby

client.clear_cache

Disabling Caching Globally

Turn off caching entirely for a client:

Python

client = ScopeClient(
    credentials=credentials,
    cache_enabled=False,
)

Ruby

client = ScopeClient::Client.new(
  credentials: credentials,
  cache_enabled: false,
)

How Caching Works

Cache keys are derived from the prompt name plus the label or version ID (e.g., prompt:greeting:production).
Entries expire lazily: an expired entry is evicted the next time it is accessed after its TTL has elapsed.
The cache is thread-safe (it uses locks/monitors internally).
Each client instance has its own independent cache.
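
The properties above can be illustrated with a small Python sketch of a lock-protected TTL cache with lazy eviction. This is an assumption-laden illustration rather than the SDK's source: TTLCache, make_key, and the clock parameter are hypothetical names, and only the key format is taken from the example above.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


def make_key(name: str, label_or_version: str) -> str:
    # Mirrors the key shape shown above, e.g. "prompt:greeting:production".
    return f"prompt:{name}:{label_or_version}"


class TTLCache:
    """Hypothetical thread-safe TTL cache with lazy expiry."""

    def __init__(
        self,
        default_ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self._default_ttl if ttl is None else ttl
        with self._lock:
            self._entries[key] = (self._clock() + lifetime, value)

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            expires_at, value = entry
            if self._clock() >= expires_at:
                # Lazy eviction: the stale entry is removed only on access.
                del self._entries[key]
                return None
            return value

    def clear(self) -> None:
        with self._lock:
            self._entries.clear()

    def __len__(self) -> int:
        with self._lock:
            return len(self._entries)
```

Because each TTLCache object owns its own dictionary and lock, two instances never share entries, which matches the per-client isolation described above. The sketch uses time.monotonic so that wall-clock adjustments do not affect expiry.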
