
4 posts tagged with "kubernetes"


The Multi-Cloud Design: Engineering your code for Portability

· 6 min read
Irfan Shah
Founder at base14

In our previous post on Cloud-Native foundations, we explored why running on one cloud isn't lock-in—but designing for one cloud is. Now let's look at how to implement that portability.

Portability is not the ability to run everywhere at once; chasing that is usually a path to over-engineering. It is better understood as reversibility: the technical confidence that, if a migration ever becomes necessary, the system can support it. That quality doesn't come from any particular cloud provider; it comes from deliberately layering code and environment. Many teams focus on the destination of their deployment, but true portability is found in the methodology of the build.
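To make that layering concrete, here's a minimal sketch (mine, not the post's) of the seam it implies: business logic depends only on a narrow interface, and the provider-specific adapter is selected by configuration at the edge of the system. All names below are illustrative.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// BlobStore is the only storage contract the rest of the code sees.
// Provider-specific adapters (S3, GCS, a local disk, ...) implement it.
type BlobStore interface {
	Put(ctx context.Context, key string, body io.Reader) error
}

// memStore is a stand-in adapter used here for illustration; a real system
// would pair it with s3Store, gcsStore, and so on.
type memStore struct{ data map[string]string }

func (m *memStore) Put(ctx context.Context, key string, body io.Reader) error {
	b, err := io.ReadAll(body)
	if err != nil {
		return err
	}
	m.data[key] = string(b)
	return nil
}

// newStore picks an adapter from configuration, keeping the provider choice
// at the edge of the system instead of inside business logic.
func newStore(provider string) (BlobStore, error) {
	switch provider {
	case "memory":
		return &memStore{data: map[string]string{}}, nil
	default:
		return nil, fmt.Errorf("unknown storage provider %q", provider)
	}
}

func main() {
	store, err := newStore("memory") // in practice, read this from configuration
	if err != nil {
		panic(err)
	}
	_ = store.Put(context.Background(), "report.txt", strings.NewReader("hello"))
}
```

Reversing a decision then means adding one adapter and flipping one configuration value, not rewriting every caller.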

Live Metric Registry: find and understand observability metrics across your stack

· 9 min read
Ranjan Sakalley
Founder at base14

Introducing Metric Registry: a live, searchable catalog of 3,700+ (and rapidly growing) observability metrics extracted directly from source repositories across the OpenTelemetry, Prometheus, and Kubernetes ecosystems, including cloud provider metrics. Metric Registry is open source and built to stay current automatically as projects evolve.

What you can do today with Metric Registry

Search across your entire observability stack. Find metrics by name, description, or component, whether you're looking for HTTP-related histograms or database connection metrics.

Understand what metrics actually exist. The registry covers 15 sources including OpenTelemetry Collector receivers, Prometheus exporters (PostgreSQL, Redis, MySQL, MongoDB, Kafka), Kubernetes metrics (kube-state-metrics, cAdvisor), and LLM observability libraries.

See which metrics follow standards. Each metric shows whether it complies with OpenTelemetry Semantic Conventions, helping you understand what's standardized versus custom.

Trace back to the source. Every metric links to its origin: the repository, file path, and commit hash. When you need to understand a metric's exact definition, you can go straight to the source (a hypothetical record shape is sketched after this list).

Trust the data. Metrics are extracted automatically from source code and official metadata files, and the registry refreshes nightly to stay current as projects evolve.
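To make those properties concrete, here's a hypothetical sketch of what one registry record could carry. The actual Metric Registry schema may differ; the field names and placeholder values below are illustrative, apart from the metric name and description, which come from the OpenTelemetry Semantic Conventions.

```go
package main

import "fmt"

// MetricEntry is a hypothetical shape for one registry record: identity,
// provenance (repository, file, commit), and semantic-convention compliance.
type MetricEntry struct {
	Name             string // e.g. "http.server.request.duration"
	Description      string
	Source           string // which indexed project the metric came from
	FilePath         string // file the metric was extracted from
	CommitHash       string // commit the extraction ran against
	SemConvCompliant bool   // follows OpenTelemetry Semantic Conventions?
}

func main() {
	entry := MetricEntry{
		Name:             "http.server.request.duration",
		Description:      "Duration of HTTP server requests.",
		Source:           "open-telemetry/semantic-conventions",
		FilePath:         "model/http/metrics.yaml", // placeholder path
		CommitHash:       "abc1234",                 // placeholder commit
		SemConvCompliant: true,
	}
	fmt.Printf("%+v\n", entry)
}
```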

Can't find what you're looking for? Open an issue or, better yet, submit a PR to add new sources or improve existing extractors.

Sources already indexed

| Category | Sources |
| --- | --- |
| OpenTelemetry | Collector Contrib, Semantic Conventions, Python, Java, JavaScript |
| Prometheus | node_exporter, postgres_exporter, redis_exporter, mysql_exporter, mongodb_exporter, kafka_exporter |
| Kubernetes | kube-state-metrics, cAdvisor |
| LLM Observability | OpenLLMetry, OpenLIT |
| CloudWatch | RDS, ALB, DynamoDB, Lambda, EC2, S3, SQS, API Gateway |

The Cloud-Native Foundation Layer: A Portable, Vendor-Neutral Base for Modern Systems

· 4 min read
Irfan Shah
Founder at base14
Cloud-Native Foundation Layer

Cloud-native began with containers and Kubernetes. Since then, it has become a set of open standards and protocols that let systems run anywhere with minimal friction.

Today's engineering landscape spans public clouds, private clouds, on-prem clusters, and edge environments - far beyond the old single-cloud model. Teams work this way because it's the only practical response to cost, regulation, latency, hardware availability, and outages.

If you expect change, you need an architecture that can handle it.
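As one concrete, hedged illustration of what those open standards buy you (assuming the OpenTelemetry Go SDK and the OTLP protocol): the same binary can send telemetry to any OTLP-capable backend, because the destination is configuration rather than code.

```go
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// The OTLP exporter reads its target from standard configuration
	// (e.g. OTEL_EXPORTER_OTLP_ENDPOINT), so switching backends is a
	// deployment change, not a code change.
	exporter, err := otlptracegrpc.New(ctx)
	if err != nil {
		log.Fatalf("creating OTLP exporter: %v", err)
	}

	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	defer func() { _ = tp.Shutdown(ctx) }()
	otel.SetTracerProvider(tp)

	// Emit one span; where it lands is decided entirely by configuration.
	_, span := otel.Tracer("portability-demo").Start(ctx, "hello")
	span.End()
}
```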

Making Certificate Expiry Boring

· 21 min read
Ranjan Sakalley
Founder at base14
Certificate expiry issues are entirely preventable

On 18 November 2025, GitHub had an hour-long outage that affected the heart of their product: Git operations. The post-incident summary was brief and honest - the outage was triggered by an internal TLS certificate that had quietly expired, blocking service-to-service communication inside their platform. It's the kind of issue every engineering team knows can happen, yet it still slips through because certificates live in odd corners of a system, often far from where we normally look.

What struck me about this incident wasn't that GitHub "missed something." If anything, it reminded me how easy it is, even for well-run, highly mature engineering orgs, to overlook certificate expiry in their observability and alerting posture. We monitor CPU, memory, latency, error rates, queue depth, request volume - but a certificate that's about to expire rarely shows up as a first-class signal. It doesn't scream. It doesn't gradually degrade. It just keeps working… until it doesn't.

And that's why these failures feel unfair. They're fully preventable, but only if you treat certificates as operational assets, not just security artefacts. This article is about building that mindset: how to surface certificate expiry as a real reliability concern, how to detect issues early, and how to ensure a single date on a single file never brings down an entire system.
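As a minimal sketch of what "detect issues early" can look like in practice (my illustration, not a tool from this article): dial the endpoint, read the leaf certificate's NotAfter, and turn time-remaining into a number you can alert on, just like CPU or latency.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"os"
	"time"
)

// certTimeRemaining connects to addr ("host:port") over TLS and returns how
// long the presented leaf certificate remains valid.
func certTimeRemaining(addr string) (time.Duration, error) {
	conn, err := tls.Dial("tcp", addr, &tls.Config{})
	if err != nil {
		return 0, err
	}
	defer conn.Close()

	leaf := conn.ConnectionState().PeerCertificates[0]
	return time.Until(leaf.NotAfter), nil
}

func main() {
	const warnThreshold = 30 * 24 * time.Hour // alert when fewer than 30 days remain

	remaining, err := certTimeRemaining("example.com:443")
	if err != nil {
		fmt.Fprintln(os.Stderr, "certificate check failed:", err)
		os.Exit(1)
	}
	if remaining < warnThreshold {
		fmt.Printf("WARNING: certificate expires in %s\n", remaining.Round(time.Hour))
		os.Exit(2)
	}
	fmt.Printf("OK: certificate valid for another %s\n", remaining.Round(time.Hour))
}
```

Exported as a gauge instead of printed, the same time-remaining value becomes a first-class signal your existing alerting can watch.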