Caching

The SDK includes a thread-safe, in-memory time-to-live (TTL) cache that reduces API calls by storing prompt versions locally. Caching is enabled by default with a 300-second (5-minute) TTL.

Default Behavior

When caching is enabled, get_prompt_version stores the result keyed by prompt name and label/version. Subsequent calls with the same arguments return the cached value until the TTL expires.

# First call hits the API
version = client.get_prompt_version("greeting")

# Second call returns cached result (no API call)
version = client.get_prompt_version("greeting")

Setting a Custom TTL

Change the default TTL when creating the client:

client = ScopeClient(
    credentials=credentials,
    cache_ttl=600,  # 10 minutes
)

Per-Request Cache Control

Skip the cache for a single request

# Bypass cache and fetch fresh data
version = client.get_prompt_version("greeting", cache=False)

Override TTL for a single request

# Cache this result for 60 seconds instead of the default
version = client.get_prompt_version("greeting", cache_ttl=60)

Clearing the Cache

Remove all cached entries programmatically:

client.clear_cache()

Disabling Caching Globally

Turn off caching entirely for a client:

client = ScopeClient(
    credentials=credentials,
    cache_enabled=False,
)

How Caching Works

  • Cache keys are derived from the prompt name plus the label or version ID (e.g., prompt:greeting:production)
  • Entries expire lazily: an expired entry is evicted only when it is next accessed after its TTL has passed
  • The cache is thread-safe (uses locks/monitors internally)
  • Each client instance has its own independent cache
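
The behavior listed above can be illustrated with a minimal sketch. This is not the SDK's actual implementation: the TTLCache class, its make_key helper, and the injectable clock parameter are hypothetical names chosen for illustration. It only shows one way that key derivation, lazy expiration, and lock-guarded access could fit together.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Hypothetical thread-safe TTL cache with lazy expiration (illustration only)."""

    def __init__(self, default_ttl: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._default_ttl = default_ttl
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._lock = threading.Lock()
        # Maps key -> (expires_at, value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def make_key(name: str, label_or_version: str) -> str:
        # Mirrors the documented key shape, e.g. "prompt:greeting:production".
        return f"prompt:{name}:{label_or_version}"

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            expires_at, value = entry
            if self._clock() >= expires_at:
                # Lazy expiration: the stale entry is evicted on access.
                del self._entries[key]
                return None
            return value

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        # A per-call ttl overrides the default for this entry only.
        effective_ttl = self._default_ttl if ttl is None else ttl
        with self._lock:
            self._entries[key] = (self._clock() + effective_ttl, value)

    def clear(self) -> None:
        with self._lock:
            self._entries.clear()


# Usage with a fake clock, so the TTL can be advanced deterministically.
now = [0.0]
cache = TTLCache(default_ttl=300.0, clock=lambda: now[0])
key = TTLCache.make_key("greeting", "production")

cache.set(key, "v1")
assert cache.get(key) == "v1"  # still within the TTL

now[0] = 301.0
assert cache.get(key) is None  # expired, and evicted by this access
```

Because each TTLCache instance owns its own dictionary and lock, two instances never share entries, which matches the per-client independence described above. Note that this sketch cannot distinguish a miss from a cached None value; a fuller version would use a sentinel for that.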