The Problem
Security code reviews are a bottleneck. Most teams either skip them or rely on static analysis tools that generate walls of false positives. I wanted to build something that actually thinks about code: a reviewer that understands context, identifies real vulnerabilities, and explains what to fix.
CodeGuardian AI is a production-grade security code reviewer: developers submit code, Claude Sonnet analyzes it and returns structured findings with severity ratings, CWE/OWASP mappings, and actionable fix suggestions.
Architecture
The system runs on AWS EKS with a FastAPI backend, React frontend, and AWS Bedrock for Claude Sonnet inference. Everything is deployed via Terraform and managed through ArgoCD GitOps.
- Frontend: React SPA served via nginx, handles code submission and results display
- Backend: FastAPI (Python). Validates input, manages prompt engineering, calls Bedrock, parses structured output
- AI Engine: AWS Bedrock Claude Sonnet. Receives code + security analysis prompt, returns JSON findings
- Infrastructure: EKS cluster, ALB ingress, RDS for persistence, ElastiCache for rate limiting
- GitOps: ArgoCD watches the Helm chart repo, auto-syncs on merge to main
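To make the backend's Bedrock round trip concrete, here is a minimal sketch of the request-building and output-parsing steps. The system prompt, model request shape, and finding schema below are illustrative assumptions, not the production code:

```python
import json

# Illustrative system prompt; the real prompt engineering lives in the backend.
SYSTEM_PROMPT = (
    "You are a security code reviewer. Return ONLY a JSON array of findings, "
    'each with "severity", "cwe", "owasp", "description", and "fix" fields.'
)

def build_request(code: str, max_tokens: int = 2048) -> dict:
    """Request body for the Anthropic messages API on Bedrock."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "system": SYSTEM_PROMPT,
        "messages": [{"role": "user", "content": f"Review this code:\n\n{code}"}],
    }

ALLOWED_SEVERITIES = {"critical", "high", "medium", "low", "info"}

def parse_findings(raw_text: str) -> list[dict]:
    """Validate model output before it reaches users (no raw LLM output)."""
    findings = json.loads(raw_text)
    if not isinstance(findings, list):
        raise ValueError("expected a JSON array of findings")
    for f in findings:
        if f.get("severity") not in ALLOWED_SEVERITIES:
            raise ValueError(f"unexpected severity: {f.get('severity')!r}")
    return findings
```

The body produced by `build_request` is what gets serialized and sent via the boto3 `bedrock-runtime` client's `invoke_model` call; `parse_findings` is the "structured output parsing" gate that keeps malformed model output from leaking to the frontend.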
Key Design Decisions
Why EKS over Lambda?
Lambda would have been simpler for a single-endpoint API, but I chose EKS because:
- Lambda cold starts stacked on top of long Bedrock calls would push response times past 30 seconds
- I needed persistent connections for streaming responses back to the frontend
- EKS gives me a real platform story: HPA for scaling, a service mesh for observability, Pod Security Standards for workload hardening
- At enterprise scale (~200 AWS accounts), teams already run EKS. This fits into existing patterns
Why ArgoCD over Flux?
Both are solid. I went with ArgoCD because the UI makes it easy to demo rollback scenarios and sync status. For a portfolio project that I'll be presenting in interviews, visual feedback matters. ArgoCD's app-of-apps pattern also scales well: I can add new microservices without touching the root config.
VPC Endpoint Cost Optimization
AWS charges roughly $7/month per interface VPC endpoint per availability zone. When you need endpoints for ECR (dkr + api), S3, STS, Bedrock, CloudWatch Logs, and SSM, that adds up fast. I consolidated where possible and used the free S3 gateway endpoint instead of interface endpoints wherever the service supported it.
6-Layer Security Model
Building a security tool that isn't itself secure would be ironic. Here's the defense-in-depth approach:
| Layer | Implementation |
|---|---|
| Network | Private subnets, security groups with least-privilege, VPC endpoints |
| Cluster | EKS managed node groups, IRSA for pod-level IAM, no SSH access |
| Application | Input validation, rate limiting, structured output parsing (no raw LLM output to users) |
| Data | TLS in transit, KMS encryption at rest, no code persistence after analysis |
| Identity | IRSA (IAM Roles for Service Accounts). Each pod gets exactly the permissions it needs |
| Observability | Prometheus metrics, Grafana dashboards, CloudWatch alarms on anomalous patterns |
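The application layer's rate limiting could look something like the fixed-window counter below. This is a sketch under assumptions: the key scheme and limits are hypothetical, and a tiny in-memory fake stands in for the ElastiCache (Redis) client, whose `incr`/`expire` calls mirror redis-py:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter keyed per caller, backed by a Redis-like client."""

    def __init__(self, client, limit: int = 30, window_s: int = 60):
        self.client = client
        self.limit = limit
        self.window_s = window_s

    def allow(self, caller_id: str) -> bool:
        window = int(time.time()) // self.window_s
        key = f"rl:{caller_id}:{window}"       # hypothetical key scheme
        count = self.client.incr(key)
        if count == 1:
            # First hit in this window: set a TTL so stale keys expire.
            self.client.expire(key, self.window_s)
        return count <= self.limit

class FakeRedis:
    """In-memory stand-in for a redis-py client, for local testing only."""

    def __init__(self):
        self.store = {}

    def incr(self, key):
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]

    def expire(self, key, ttl):
        pass  # TTL is ignored in the fake
```

In production the same `allow` check would run against ElastiCache with a real redis-py client, so all backend pods share one counter per caller.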
Observability
Prometheus + Grafana stack deployed alongside the application. Key metrics I track:
- Bedrock latency: p50, p95, p99 response times per model
- Token usage: Input/output tokens per request (cost tracking)
- Error rates: Bedrock throttling, timeout rates, malformed response rates
- Business metrics: Submissions per hour, findings by severity distribution
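The token-usage metric feeds a per-request cost estimate along these lines. The per-1K-token prices below are placeholders for illustration, not actual Bedrock pricing:

```python
# Assumed USD prices per 1K tokens; look up the real Bedrock rate card.
PRICE_PER_1K_USD = {"input": 0.003, "output": 0.015}

def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one analysis request from its token counts."""
    return (
        input_tokens / 1000 * PRICE_PER_1K_USD["input"]
        + output_tokens / 1000 * PRICE_PER_1K_USD["output"]
    )
```

Emitting this per request (e.g. as a Prometheus counter) turns the token-usage metric directly into a spend dashboard.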
What I'd Do Differently
- Streaming responses: The backend currently waits for the full Bedrock response before returning; it should stream findings back as they're generated
- Multi-model support: Add GPT-4o and Gemini as alternative analyzers, let users compare findings across models
- GitHub integration: Auto-trigger reviews on PR creation, post findings as review comments
- Fine-tuning: Collect labeled data from user feedback on findings to improve prompt engineering or fine-tune a smaller model
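The streaming item above could be built on Bedrock's `invoke_model_with_response_stream` API. A rough sketch, where the event shape follows the Anthropic messages stream and the fake client exists only to illustrate the flow:

```python
import json

def stream_findings(bedrock_client, body: dict, model_id: str):
    """Yield text deltas as Bedrock produces them; suitable as the
    generator behind a FastAPI StreamingResponse."""
    resp = bedrock_client.invoke_model_with_response_stream(
        modelId=model_id, body=json.dumps(body)
    )
    for event in resp["body"]:
        chunk = json.loads(event["chunk"]["bytes"])
        if chunk.get("type") == "content_block_delta":
            yield chunk["delta"].get("text", "")

class FakeStreamClient:
    """Stand-in for a boto3 bedrock-runtime client, for local testing only."""

    def invoke_model_with_response_stream(self, modelId, body):
        texts = ('[{"sev', 'erity": "low"}]')  # one finding split across events
        events = [
            {"chunk": {"bytes": json.dumps(
                {"type": "content_block_delta", "delta": {"text": t}}
            ).encode()}}
            for t in texts
        ]
        return {"body": iter(events)}
```

The open question this sketch leaves unsolved is the interesting part: the findings are a JSON array, so streaming them usefully means incrementally parsing complete finding objects out of the delta stream rather than forwarding raw text.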
Results
The system processes code submissions in under 15 seconds, returns structured JSON findings with severity levels and remediation steps, and costs less than $50/month for the entire infrastructure stack at moderate usage.
More importantly, it demonstrates a real platform engineering skill set: Terraform IaC, Kubernetes operations, GitOps delivery, AI/ML integration, security best practices, and full-stack observability. This is the kind of system I build at work, just distilled into a single, cohesive project.