When your startup lands its first enterprise client, your traffic might 10x overnight. When your game goes viral, you need to handle millions of new players. When your trading platform expands to new markets, latency requirements get stricter.

The infrastructure patterns in this guide let you scale from dozens to millions of users without re-architecting. We’ll cover how SaaS companies, fintech platforms, and gaming studios build systems that grow with their business.

Choosing Your Kubernetes Distribution

Managed Kubernetes Services

For most cloud deployments, managed Kubernetes is the right choice:

  • AWS EKS: Deeply integrated with AWS services
  • Google GKE: Widely regarded as the most mature, with strong defaults
  • Azure AKS: Good for Microsoft-centric organizations

Managed services handle:

  • Control plane availability
  • Kubernetes version upgrades
  • Security patches
  • etcd backups
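
As a quick illustration, a managed EKS cluster can be created with eksctl. The cluster name, region, and node counts below are placeholders to adapt:

eksctl create cluster \
  --name production \
  --region us-east-1 \
  --nodegroup-name standard-workers \
  --nodes 3 \
  --nodes-min 2 \
  --nodes-max 10 \
  --managed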

Application Deployment Best Practices

Resource Requests and Limits

Always specify resource constraints:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
      - name: api
        image: myapp/api:v1.2.3
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        ports:
        - containerPort: 8080
  • Requests: Guaranteed resources; used for scheduling
  • Limits: Maximum resources; prevents runaway containers
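
To right-size these values, compare requests against what pods actually consume over time. With metrics-server installed, a quick check looks like this (the node name is a placeholder):

# Current CPU and memory usage for the api-server pods
kubectl top pod -l app=api-server

# Total requests and limits scheduled onto a node
kubectl describe node <node-name> | grep -A5 "Allocated resources"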

Health Checks

Configure probes for reliability:

spec:
  containers:
  - name: api
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
    startupProbe:
      httpGet:
        path: /health
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
  • Liveness: Restart container if failing
  • Readiness: Remove from service if not ready
  • Startup: Allow slow-starting containers
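
When a container restarts unexpectedly, probe failures show up in the pod's events, which is the first place to look (the pod name is a placeholder):

kubectl describe pod <pod-name> | grep -A10 Events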

Pod Disruption Budgets

Ensure availability during updates:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: api-server
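
PDBs are enforced through the eviction API, so voluntary disruptions such as a node drain will pause rather than drop below the budget:

# Evictions block if they would leave fewer than 2 api-server pods
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data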

Anti-Affinity for High Availability

Spread pods across nodes:

spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: api-server
          topologyKey: kubernetes.io/hostname
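
For spreading across zones as well as nodes, topology spread constraints are a common complement to anti-affinity; a minimal sketch:

spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: api-server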

Secrets Management

External Secrets Operator

Sync secrets from cloud providers:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: database-credentials
  data:
  - secretKey: username
    remoteRef:
      key: production/database
      property: username
  - secretKey: password
    remoteRef:
      key: production/database
      property: password
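
The ExternalSecret above references a ClusterSecretStore named aws-secrets-manager. A sketch of that store, assuming IRSA-based authentication (region and service account details are placeholders):

apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-sa
            namespace: external-secrets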

Sealed Secrets for GitOps

Encrypt secrets for version control:

# Fetch the public cert from the sealed-secrets controller
kubeseal --fetch-cert > pub-cert.pem

# Create sealed secret
kubectl create secret generic my-secret \
  --from-literal=password=supersecret \
  --dry-run=client -o yaml | \
  kubeseal --cert pub-cert.pem -o yaml > sealed-secret.yaml
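
The resulting sealed-secret.yaml is safe to commit. Once applied, the in-cluster controller decrypts it into a regular Secret:

kubectl apply -f sealed-secret.yaml
kubectl get secret my-secret  # Created by the sealed-secrets controller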

Autoscaling

Horizontal Pod Autoscaler

Scale based on metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
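
The scaleDown policy above removes at most 10% of replicas per minute, and only after a five-minute stabilization window, which prevents flapping under bursty traffic. Scaling decisions can be watched live:

kubectl get hpa api-server-hpa --watch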

Cluster Autoscaler

Scale nodes based on pending pods:

# AWS EKS example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --nodes=2:20:my-node-group
        - --scale-down-unneeded-time=10m
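
Instead of listing node groups explicitly, the autoscaler can discover auto-scaling groups by tag, which avoids editing flags as groups change (the cluster name is a placeholder):

        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/<cluster-name>
        - --balance-similar-node-groups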

Network Policies

Restrict pod-to-pod traffic:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-network-policy
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:  # Allow DNS
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
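
Allowlist policies like this are most effective layered on top of a namespace-wide default deny, so that anything not explicitly permitted is blocked:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress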

Observability

Prometheus + Grafana Stack

Deploy with Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace
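
To scrape the api-server application with this stack, add a ServiceMonitor. This sketch assumes the app's Service exposes a port named metrics and that the release label matches the Helm release name used above:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-server
  labels:
    release: prometheus  # Must match kube-prometheus-stack's selector
spec:
  selector:
    matchLabels:
      app: api-server
  endpoints:
  - port: metrics  # Named port on the Service (assumption)
    interval: 30s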

Structured Logging

Configure applications for JSON logging:

spec:
  containers:
  - name: api
    env:
    - name: LOG_FORMAT
      value: "json"

Collect with Fluent Bit:

apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
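        # On containerd-based nodes, use the cri parser instead of docker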
        Parser            docker
        Tag               kube.*
    
    [OUTPUT]
        Name              es
        Match             *
        Host              elasticsearch
        Port              9200
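
A kubernetes filter between the input and output enriches each record with pod metadata; with Merge_Log enabled it also lifts the JSON log bodies produced above into structured fields:

    [FILTER]
        Name              kubernetes
        Match             kube.*
        Merge_Log         On
        Keep_Log          Off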

GitOps with ArgoCD

Manage deployments declaratively:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-server
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-manifests
    targetRevision: main
    path: apps/api-server
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
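
With automated sync, prune, and selfHeal enabled, ArgoCD continuously converges the cluster to whatever is on main. Status checks and manual syncs remain available through the CLI:

argocd app get api-server   # Show sync and health status
argocd app sync api-server  # Trigger a sync on demand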

Conclusion

Production Kubernetes requires:

  • Proper resource management and health checks
  • Security through RBAC, network policies, and secrets management
  • Scalability with HPA and cluster autoscaler
  • Observability with metrics, logs, and traces
  • GitOps for reliable, auditable deployments

At Sajima Solutions, we help organizations deploy and operate Kubernetes at scale. Contact us to discuss your cloud-native journey.