The Problem
Security code reviews are a bottleneck. Most teams either skip them or rely on static analysis tools that generate walls of false positives. I wanted to build something that actually thinks about code: a reviewer that understands context, identifies real vulnerabilities, and explains what to fix.
CodeGuardian AI is a production-grade security code reviewer: developers submit code, Claude Sonnet analyzes it and returns structured findings with severity ratings, CWE/OWASP mappings, and actionable fix suggestions.
Architecture
The system runs on AWS EKS with a FastAPI backend, React frontend, and AWS Bedrock for Claude Sonnet inference. Everything is deployed via Terraform and managed through ArgoCD GitOps.
- Frontend: React SPA served via nginx, handles code submission and results display
- Backend: FastAPI (Python). Validates input, manages prompt engineering, calls Bedrock, parses structured output
- AI Engine: AWS Bedrock Claude Sonnet. Receives code + security analysis prompt, returns JSON findings
- Infrastructure: EKS cluster, ALB ingress, RDS for persistence, ElastiCache for rate limiting
- GitOps: ArgoCD watches the Helm chart repo, auto-syncs on merge to main
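To make the backend's Bedrock round trip concrete, here is a minimal sketch of the request-building and output-parsing steps. The system prompt, model request shape, and finding schema below are illustrative assumptions, not the production code:

```python
import json

# Illustrative system prompt; the real prompt engineering lives in the backend.
SYSTEM_PROMPT = (
    "You are a security code reviewer. Return ONLY a JSON array of findings, "
    'each with "severity", "cwe", "owasp", "description", and "fix" fields.'
)

def build_request(code: str, max_tokens: int = 2048) -> dict:
    """Request body for the Anthropic messages API on Bedrock."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "system": SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": f"Review this code:\n\n{code}"}],
    }

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low", "info"}

def parse_findings(raw_text: str) -> list[dict]:
    """Validate model output before it reaches users (no raw LLM output)."""
    findings = json.loads(raw_text)
    if not isinstance(findings, list):
        raise ValueError("expected a JSON array of findings")
    for f in findings:
        if f.get("severity") not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {f.get('severity')!r}")
    return findings
```

The body produced by `build_request` is what gets serialized and sent via the boto3 `bedrock-runtime` client's `invoke_model` call; `parse_findings` is the "structured output parsing" gate that keeps malformed model output from leaking to the frontend.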
Key Design Decisions
Why EKS over Lambda?
Lambda would have been simpler for a single-endpoint API, but I chose EKS because:
- Lambda cold starts stacked on top of long Bedrock calls would push response times past 30 seconds
- I needed persistent connections for streaming responses back to the frontend
- EKS gives me a real platform story: HPA for scaling, a service mesh for observability, Pod Security Standards for workload hardening
- At enterprise scale (~200 AWS accounts), teams already run EKS. This fits into existing patterns
Why ArgoCD over Flux?
Both are solid. I went with ArgoCD because the UI makes it easy to demo rollback scenarios and sync status. For a portfolio project that I'll be presenting in interviews, visual feedback matters. ArgoCD's app-of-apps pattern also scales well: I can add new microservices without touching the root config.
VPC Endpoint Cost Optimization
AWS charges roughly $7/month per interface VPC endpoint per availability zone. When you need endpoints for ECR (dkr + api), S3, STS, Bedrock, CloudWatch Logs, and SSM, that adds up fast. I consolidated where possible and used the free S3 gateway endpoint instead of interface endpoints wherever the service supported it.
6-Layer Security Model
Building a security tool that isn't itself secure would be ironic. Here's the defense-in-depth approach:
| Layer | Implementation |
|---|---|
| Network | Private subnets, security groups with least-privilege, VPC endpoints |
| Cluster | EKS managed node groups, IRSA for pod-level IAM, no SSH access |
| Application | Input validation, rate limiting, structured output parsing (no raw LLM output to users) |
| Data | TLS in transit, KMS encryption at rest, no code persistence after analysis |
| Identity | IRSA (IAM Roles for Service Accounts). Each pod gets exactly the permissions it needs |
| Observability | Prometheus metrics, Grafana dashboards, CloudWatch alarms on anomalous patterns |
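The application layer's rate limiting could look something like the fixed-window counter below. This is a sketch under assumptions: the key scheme and limits are hypothetical, and a tiny in-memory fake stands in for the ElastiCache (Redis) client, whose `incr`/`expire` calls mirror redis-py:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter keyed per caller, backed by a Redis-like client."""

    def __init__(self, client, limit: int = 30, window_s: int = 60):
        self.client = client
        self.limit = limit
        self.window_s = window_s

    def allow(self, caller_id: str) -> bool:
        window = int(time.time()) // self.window_s
        key = f"rl:{caller_id}:{window}"       # hypothetical key scheme
        count = self.client.incr(key)
        if count == 1:
            # First hit in this window: set a TTL so stale keys expire.
            self.client.expire(key, self.window_s)
        return count <= self.limit

class FakeRedis:
    """In-memory stand-in for a redis-py client, for local testing only."""

    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

    def expire(self, key, ttl):
        pass  # TTL is ignored in the fake
```

In production the same `allow` check would run against ElastiCache with a real redis-py client, so all backend pods share one counter per caller.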
Observability
Prometheus + Grafana stack deployed alongside the application. Key metrics I track:
- Bedrock latency: p50, p95, p99 response times per model
- Token usage: Input/output tokens per request (cost tracking)
- Error rates: Bedrock throttling, timeout rates, malformed response rates
- Business metrics: Submissions per hour, findings by severity distribution
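The token-usage metric feeds a per-request cost estimate along these lines. The per-1K-token prices below are placeholders for illustration, not actual Bedrock pricing:

```python
# Assumed USD prices per 1K tokens; look up the real Bedrock rate card.
PRICE_PER_1K_USD = {"input": 0.003, "output": 0.015}

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one analysis request from its token counts."""
    return (
        input_tokens / 1000 * PRICE_PER_1K_USD["input"]
        + output_tokens / 1000 * PRICE_PER_1K_USD["output"]
    )
```

Emitting this per request (e.g. as a Prometheus counter) turns the token-usage metric directly into a spend dashboard.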
What I'd Do Differently
- Streaming responses: The backend currently waits for the full Bedrock response before returning; it should stream findings back as they're generated
- Multi-model support: Add GPT-4o and Gemini as alternative analyzers, let users compare findings across models
- GitHub integration: Auto-trigger reviews on PR creation, post findings as review comments
- Fine-tuning: Collect labeled data from user feedback on findings to improve prompt engineering or fine-tune a smaller model
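The streaming item above could be built on Bedrock's `invoke_model_with_response_stream` API. A rough sketch, where the event shape follows the Anthropic messages stream and the fake client exists only to illustrate the flow:

```python
import json

def stream_findings(bedrock_client, body: dict, model_id: str):
    """Yield text deltas as Bedrock produces them; suitable as the
    generator behind a FastAPI StreamingResponse."""
    resp = bedrock_client.invoke_model_with_response_stream(
        modelId=model_id, body=json.dumps(body)
    )
    for event in resp["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            yield chunk["delta"].get("text", "")

class FakeStreamClient:
    """Stand-in for a boto3 bedrock-runtime client, for local testing only."""

    def invoke_model_with_response_stream(self, modelId, body):
        texts = ('[{"sev', 'erity": "low"}]')  # one finding split across events
        events = [
            {"chunk": {"bytes": json.dumps(
                {"type": "content_block_delta", "delta": {"text": t}}
            ).encode()}}
            for t in texts
        ]
        return {"body": iter(events)}
```

The open question this sketch leaves unsolved is the interesting part: the findings are a JSON array, so streaming them usefully means incrementally parsing complete finding objects out of the delta stream rather than forwarding raw text.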
Results
The system processes code submissions in under 15 seconds, returns structured JSON findings with severity levels and remediation steps, and costs less than $50/month for the entire infrastructure stack at moderate usage.
More importantly, it demonstrates a real platform engineering skill set: Terraform IaC, Kubernetes operations, GitOps delivery, AI/ML integration, security best practices, and full-stack observability. This is the kind of system I build at work, just distilled into a single, cohesive project.