When your startup lands its first enterprise client, your traffic might 10x overnight. When your game goes viral, you need to handle millions of new players. When your trading platform expands to new markets, latency requirements get stricter.
The infrastructure patterns in this guide let you scale from dozens to millions of users without re-architecting. We’ll cover how SaaS companies, fintech platforms, and gaming studios build systems that grow with their business.
Choosing Your Kubernetes Distribution
Managed Kubernetes Services
For most cloud deployments, managed Kubernetes is the right choice (a minimal provisioning sketch follows the lists below):
- AWS EKS: Deeply integrated with AWS services
- Google GKE: Most mature, best default configurations
- Azure AKS: Good for Microsoft-centric organizations
Managed services handle:
- Control plane availability
- Kubernetes version upgrades
- Security patches
- etcd backups
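On AWS, for example, a managed cluster can be declared in a config file and checked into version control with eksctl. The sketch below is illustrative rather than prescriptive: the cluster name, region, instance type, and node-group sizing are placeholders you would adapt to your workload.

```yaml
# Hypothetical eksctl cluster config -- names and sizes are placeholders
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: production          # placeholder cluster name
  region: us-east-1         # placeholder region
managedNodeGroups:
  - name: general
    instanceType: m5.large  # adjust to your workload profile
    minSize: 2
    maxSize: 20
    desiredCapacity: 3
```

Creating the cluster is then `eksctl create cluster -f cluster.yaml`; GKE and AKS offer comparable declarative or Terraform-based flows.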
Application Deployment Best Practices
Resource Requests and Limits
Always specify resource constraints:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
      - name: api
        image: myapp/api:v1.2.3
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        ports:
        - containerPort: 8080
```
- Requests: Guaranteed resources; used for scheduling
- Limits: Maximum resources; prevents runaway containers
Health Checks
Configure probes for reliability:
```yaml
spec:
  containers:
  - name: api
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      initialDelaySeconds: 5
      periodSeconds: 5
      failureThreshold: 3
    startupProbe:
      httpGet:
        path: /health
        port: 8080
      failureThreshold: 30
      periodSeconds: 10
```
- Liveness: Restart container if failing
- Readiness: Remove from service if not ready
- Startup: Allow slow-starting containers
Pod Disruption Budgets
Ensure availability during updates:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-server-pdb
spec:
  minAvailable: 2  # Or use maxUnavailable: 1
  selector:
    matchLabels:
      app: api-server
```
Anti-Affinity for High Availability
Spread pods across nodes:
```yaml
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: api-server
          topologyKey: kubernetes.io/hostname
```
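Anti-affinity spreads replicas across nodes; if you also want them balanced across availability zones, topology spread constraints cover the same ground with an explicit skew budget. A minimal sketch, assuming your nodes carry the standard topology.kubernetes.io/zone label:

```yaml
spec:
  topologySpreadConstraints:
  - maxSkew: 1                                  # at most one replica of imbalance between zones
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway           # soft constraint; use DoNotSchedule to make it hard
    labelSelector:
      matchLabels:
        app: api-server
```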
Secrets Management
External Secrets Operator
Sync secrets from cloud providers:
```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: database-credentials
  data:
  - secretKey: username
    remoteRef:
      key: production/database
      property: username
  - secretKey: password
    remoteRef:
      key: production/database
      property: password
```
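The ExternalSecret above references a ClusterSecretStore named aws-secrets-manager, which is not defined here. A sketch of what that store could look like for AWS Secrets Manager follows; the region and the IRSA-bound service account name are assumptions for illustration.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1                  # placeholder region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets       # assumed service account with IAM access to the secrets
            namespace: external-secrets
```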
Sealed Secrets for GitOps
Encrypt secrets for version control:
```bash
# Fetch the controller's public certificate (requires the kubeseal CLI
# and the sealed-secrets controller running in the cluster)
kubeseal --fetch-cert > pub-cert.pem

# Create a sealed secret safe to commit to Git
kubectl create secret generic my-secret \
  --from-literal=password=supersecret \
  --dry-run=client -o yaml | \
  kubeseal --cert pub-cert.pem -o yaml > sealed-secret.yaml
```
Autoscaling
Horizontal Pod Autoscaler
Scale based on metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
```
Cluster Autoscaler
Scale nodes based on pending pods:
```yaml
# AWS EKS example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: cluster-autoscaler
        image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.0
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        - --nodes=2:20:my-node-group
        - --scale-down-unneeded-time=10m
```
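Pinning node groups with --nodes works when you have only a few of them; on AWS the autoscaler can instead discover Auto Scaling groups by tag, so adding a group does not require editing the deployment. A sketch of the alternative flags (the tag keys follow the project's documented convention; the cluster name is a placeholder):

```yaml
        command:
        - ./cluster-autoscaler
        - --cloud-provider=aws
        # Discover any ASG tagged for this cluster instead of listing groups explicitly
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/my-cluster
        - --scale-down-unneeded-time=10m
```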
Network Policies
Restrict pod-to-pod traffic:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-server-network-policy
spec:
  podSelector:
    matchLabels:
      app: api-server
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432
  - to:  # Allow DNS
    - namespaceSelector: {}
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
```
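Allow-rules like the one above are most useful on top of a default-deny baseline; pods not selected by any policy otherwise remain wide open. A common per-namespace baseline looks like the sketch below (apply it in each namespace you want locked down):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}   # selects every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```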
Observability
Prometheus + Grafana Stack
Deploy with Helm:
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace
```
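With the kube-prometheus-stack in place, application metrics are typically scraped via ServiceMonitor resources rather than static scrape configs. The sketch below assumes the api-server Service exposes a named metrics port and that the operator selects monitors by the Helm release label, which depends on how you installed the chart.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-server
  labels:
    release: prometheus   # assumed to match the Helm release the operator watches
spec:
  selector:
    matchLabels:
      app: api-server
  endpoints:
  - port: http            # assumed name of the Service port serving /metrics
    path: /metrics
    interval: 30s
```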
Structured Logging
Configure applications for JSON logging:
```yaml
spec:
  containers:
  - name: api
    env:
    - name: LOG_FORMAT
      value: "json"
```
Collect with Fluent Bit:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [INPUT]
        Name    tail
        Path    /var/log/containers/*.log
        Parser  docker
        Tag     kube.*

    [OUTPUT]
        Name    es
        Match   *
        Host    elasticsearch
        Port    9200
```
GitOps with ArgoCD
Manage deployments declaratively:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-server
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/k8s-manifests
    targetRevision: main
    path: apps/api-server
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true
```
Conclusion
Production Kubernetes requires:
- Proper resource management and health checks
- Security through RBAC, network policies, and secrets management
- Scalability with HPA and cluster autoscaler
- Observability with metrics, logs, and traces
- GitOps for reliable, auditable deployments
At Sajima Solutions, we help organizations deploy and operate Kubernetes at scale. Contact us to discuss your cloud-native journey.