Caching
The SDK includes a thread-safe, in-memory TTL cache that reduces API calls by storing prompt versions locally. Caching is enabled by default with a 300-second (5-minute) TTL.
Default Behavior
When caching is enabled, get_prompt_version stores the result
keyed by prompt name and label/version. Subsequent calls with the
same arguments return the cached value until the TTL expires.
- Python
# First call hits the API
version = client.get_prompt_version("greeting")
# Second call returns cached result (no API call)
version = client.get_prompt_version("greeting")
- Ruby
# First call hits the API
version = client.get_prompt_version("greeting")
# Second call returns cached result (no API call)
version = client.get_prompt_version("greeting")
Setting a Custom TTL
Change the default TTL when creating the client:
- Python
client = ScopeClient(
    credentials=credentials,
    cache_ttl=600,  # 10 minutes
)
- Ruby
client = ScopeClient::Client.new(
  credentials: credentials,
  cache_ttl: 600, # 10 minutes
)
Per-Request Cache Control
Skip the cache for a single request
- Python
# Bypass cache and fetch fresh data
version = client.get_prompt_version("greeting", cache=False)
- Ruby
# Bypass cache and fetch fresh data
version = client.get_prompt_version("greeting", cache: false)
Override TTL for a single request
- Python
# Cache this result for 60 seconds instead of the default
version = client.get_prompt_version("greeting", cache_ttl=60)
- Ruby
# Cache this result for 60 seconds instead of the default
version = client.get_prompt_version("greeting", cache_ttl: 60)
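To make the interaction between the client default and the per-request options concrete, here is a minimal sketch in Python. It is not the SDK's implementation: `CachingFetcher`, its `fetch` callable, and the injectable `clock` are hypothetical names introduced for illustration, and it omits locking for brevity. It also assumes that a call with `cache=False` neither reads from nor writes to the cache, which the SDK may or may not do.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class CachingFetcher:
    """Hypothetical stand-in for a client; illustrates option resolution only."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        cache_ttl: float = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch              # the function that "hits the API"
        self._default_ttl = cache_ttl    # client-level default TTL in seconds
        self._clock = clock              # injectable so expiry can be simulated
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(
        self,
        name: str,
        cache: bool = True,
        cache_ttl: Optional[float] = None,
    ) -> Any:
        now = self._clock()
        if cache:
            hit = self._entries.get(name)
            if hit is not None and now < hit[1]:
                return hit[0]            # still fresh: no fetch
        value = self._fetch(name)
        if cache:
            # A per-request TTL, when given, takes precedence over the default.
            ttl = self._default_ttl if cache_ttl is None else cache_ttl
            self._entries[name] = (value, now + ttl)
        return value
```

In this sketch the per-request TTL applies only to the entry written by that call; later calls without `cache_ttl` fall back to the client default when they next write.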
Clearing the Cache
Remove all cached entries programmatically:
- Python
client.clear_cache()
- Ruby
client.clear_cache
Disabling Caching Globally
Turn off caching entirely for a client:
- Python
client = ScopeClient(
    credentials=credentials,
    cache_enabled=False,
)
- Ruby
client = ScopeClient::Client.new(
  credentials: credentials,
  cache_enabled: false,
)
How Caching Works
- Cache keys are derived from the prompt name plus the label or version ID (e.g., prompt:greeting:production)
- Entries expire lazily: they are evicted when accessed after TTL expiration
- The cache is thread-safe (uses locks/monitors internally)
- Each client instance has its own independent cache
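The properties listed above can be sketched as a small Python class. This is an illustrative approximation under stated assumptions, not the SDK's internal code: `TTLCache`, `make_key`, and the injectable `clock` are hypothetical names, and the key format simply follows the `prompt:greeting:production` example.

```python
import threading
import time
from typing import Any, Callable, Dict, Optional, Tuple


def make_key(name: str, label_or_version: str) -> str:
    """Build a key shaped like the documented example (assumed format)."""
    return f"prompt:{name}:{label_or_version}"


class TTLCache:
    """Hypothetical thread-safe TTL cache with lazy expiry."""

    def __init__(
        self,
        default_ttl: float = 300,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._default_ttl = default_ttl
        self._clock = clock
        self._lock = threading.Lock()    # guards every access to _entries
        self._entries: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            entry = self._entries.get(key)
            if entry is None:
                return None
            value, expires_at = entry
            if self._clock() >= expires_at:
                # Lazy eviction: the stale entry is removed only now,
                # when something tries to read it.
                del self._entries[key]
                return None
            return value

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        lifetime = self._default_ttl if ttl is None else ttl
        with self._lock:
            self._entries[key] = (value, self._clock() + lifetime)

    def clear(self) -> None:
        with self._lock:
            self._entries.clear()

    def __len__(self) -> int:
        with self._lock:
            return len(self._entries)
```

One consequence of lazy expiry worth noting: an expired entry that is never read again stays in memory until it is overwritten or the cache is cleared. Because each `TTLCache` instance owns its own dictionary and lock, two instances never share entries, which mirrors the per-client isolation described above.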