GKE workload identity

misankim 2023. 7. 1. 23:06

# Grant GCP permissions to pods using workload identity

-> Instead of attaching a GCP service account to individual nodes or to the entire cluster, create a k8s service account and allow it to impersonate a GCP service account. Only the pods that run as that k8s service account receive the GCP permissions.

 

https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity?hl=ko#authenticating_to

 

 

# Create k8s service account

 

kubectl create serviceaccount gcs-ksa \
    --namespace flask-gcs
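
The command above assumes the flask-gcs namespace already exists. A minimal sketch for creating it only when it is missing; the function name `ensure_namespace` is hypothetical and not part of kubectl.

```shell
# Hypothetical helper: create the namespace only if it does not exist yet.
# Usage: ensure_namespace NAMESPACE
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}
```

Usage: `ensure_namespace flask-gcs`, then run the `kubectl create serviceaccount` command.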

 

 

# Create GCP service account

 

gcloud iam service-accounts create my-test-gcs-gsa \
    --project=my-test-project

 

Assign roles as needed

 

-> Add the “Storage Object Admin” role (roles/storage.objectAdmin) in the GCP IAM console, or grant it with the gcloud command below.

-> Reference: https://cloud.google.com/iam/docs/understanding-roles?hl=ko#cloud-storage-roles

 

gcloud projects add-iam-policy-binding my-test-project \
    --member "serviceAccount:my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com" \
    --role "roles/storage.objectAdmin"
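
To confirm that the role was granted, the project IAM policy can be filtered down to the roles held by one service account. A minimal sketch using `gcloud projects get-iam-policy` with its `--flatten`, `--filter`, and `--format` options; the wrapper name `gsa_roles` is hypothetical.

```shell
# Hypothetical helper: list the project-level roles bound to a GCP service account.
# Usage: gsa_roles PROJECT_ID GSA_EMAIL
gsa_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:$2" \
    --format="table(bindings.role)"
}
```

Usage: `gsa_roles my-test-project my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com`. If the grant succeeded, roles/storage.objectAdmin should appear in the table.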

 

Policy binding

 

-> Bind the roles/iam.workloadIdentityUser role on the GCP service account so that the k8s service account gcs-ksa in the flask-gcs namespace can impersonate it through the cluster's workload identity pool.

-> To delete the workload identity policy binding later, run the same command with “add-iam-policy-binding” replaced by “remove-iam-policy-binding”.

 

gcloud iam service-accounts add-iam-policy-binding my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com \
    --project=my-test-project \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:my-test-project.svc.id.goog[flask-gcs/gcs-ksa]"
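
The `--member` value follows the pattern `serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]`. As a minimal sketch, the helpers below build that string and wrap the removal command described in the note above; the function names `wi_member` and `wi_unbind` are hypothetical and not part of gcloud.

```shell
# Hypothetical helper: build the workload identity member string
# for a k8s service account.
# Usage: wi_member PROJECT_ID NAMESPACE KSA_NAME
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Hypothetical helper: remove the workloadIdentityUser binding
# from a GCP service account.
# Usage: wi_unbind PROJECT_ID GSA_NAME NAMESPACE KSA_NAME
wi_unbind() {
  gcloud iam service-accounts remove-iam-policy-binding \
    "$2@$1.iam.gserviceaccount.com" \
    --project="$1" \
    --role roles/iam.workloadIdentityUser \
    --member "$(wi_member "$1" "$3" "$4")"
}
```

Usage: `wi_unbind my-test-project my-test-gcs-gsa flask-gcs gcs-ksa`. Building the member string in one place avoids typos in the bracketed namespace/name part, which is where a binding most easily goes wrong.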

 

Check the binding

 

gcloud iam service-accounts get-iam-policy my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com --project=my-test-project

 

 

# Add annotation to k8s service account

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com
  name: gcs-ksa
  namespace: flask-gcs
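
Since the service account was already created with kubectl, the same annotation can also be added from the command line instead of applying a manifest. A minimal sketch using `kubectl annotate`; the wrapper name `annotate_ksa` is hypothetical.

```shell
# Hypothetical helper: annotate a k8s service account with the
# GCP service account it should impersonate.
# Usage: annotate_ksa NAMESPACE KSA_NAME GSA_EMAIL
annotate_ksa() {
  kubectl annotate serviceaccount "$2" \
    --namespace "$1" \
    "iam.gke.io/gcp-service-account=$3"
}
```

Usage: `annotate_ksa flask-gcs gcs-ksa my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com`. If the annotation already exists with a different value, kubectl refuses the change unless `--overwrite` is added.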

 

 

# Add k8s service account to pod

 

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: flask-gcs
  name: deploy-my-flask-app
spec:
  selector:
    matchLabels:
      app: my-flask-app
  replicas: 1
  template:
    metadata:
      labels:
        app: my-flask-app
    spec:
      serviceAccountName: gcs-ksa
      containers:
      - image: asia.gcr.io/my-test-project/flaks-gcs:1.0
        imagePullPolicy: Always
        name: my-flask-app
        ports:
        - containerPort: 5000

 

 

# Check with a test pod

 

vim test.yaml

apiVersion: v1
kind: Pod
metadata:
  name: workload-identity-test
  namespace: flask-gcs
spec:
  containers:
  - image: google/cloud-sdk:slim
    name: workload-identity-test
    command: ["sleep","infinity"]
  serviceAccountName: gcs-ksa
  nodeSelector:
    iam.gke.io/gke-metadata-server-enabled: "true"

 

 

kubectl apply -f test.yaml

kubectl exec -it workload-identity-test \
  --namespace flask-gcs \
  -- /bin/bash

curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/

 

Output

root@workload-identity-test:/# curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/

default/
my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com/
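
The listing shows which accounts the metadata server exposes. To see which account `default` resolves to, the `default/email` path can be queried from inside the same pod; when workload identity is working, it should return the GCP service account email rather than a node account. A minimal sketch; `metadata_sa_email` is a hypothetical function name.

```shell
# Hypothetical helper: print the email of the service account that the
# metadata server presents as "default" to this pod.
metadata_sa_email() {
  curl -sS --fail -H "Metadata-Flavor: Google" \
    "http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email"
}
```

Usage: run `metadata_sa_email` in the shell opened by `kubectl exec` and compare the result with my-test-gcs-gsa@my-test-project.iam.gserviceaccount.com.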

 

 

(Note) Authentication errors right after deploying a new pod that uses workload identity

-> A newly created pod can get an authentication error at first, yet authenticates normally once the pod is restarted.

-> The GKE metadata server needs a few seconds after pod creation before it accepts requests, so add an init container that waits for it.

 

https://cloud.google.com/kubernetes-engine/docs/troubleshooting/troubleshooting-security#troubleshoot-timeout

 

  initContainers:
  - image: gcr.io/google.com/cloudsdktool/cloud-sdk:alpine
    name: workload-identity-initcontainer
    command:
    - '/bin/bash'
    - '-c'
    - |
      curl -sS -H 'Metadata-Flavor: Google' 'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token' --retry 30 --retry-connrefused --retry-max-time 60 --connect-timeout 3 --fail --retry-all-errors > /dev/null && exit 0 || echo 'Retry limit exceeded. Failed to wait for metadata server to be available. Check if the gke-metadata-server Pod in the kube-system namespace is healthy.' >&2; exit 1
  containers:
  - image: gcr.io/your-project/your-image
    name: your-main-application-container