ArgoCD taught me that GitOps is about audit trails, not about git
After running ArgoCD in homelab and production for over a year, the real value is not deployment automation. It is the complete, immutable audit trail.
Most GitOps content focuses on the deployment mechanism. You push a manifest to a git repository. ArgoCD detects the change. It reconciles the cluster state to match the repository. The cluster is always in sync with git. This is the pitch.
After running ArgoCD in both my homelab k3s cluster and FinanceOps production for over a year, I think the deployment mechanism is the least interesting part. The part that matters, especially in fintech, is the audit trail.
Leadership got more concrete for me once I realized that release engineering and infrastructure are really trust systems. This post builds on what I learned earlier in “How I use Grafana dashboards to run engineering meetings instead of slide decks.” The infrastructure stack, ctrlpane, and even my dotfiles all orbit the same idea now: the best teams move fast because the defaults are stable, not because the heroics are impressive.
What the Audit Trail Actually Gives You
Every infrastructure change at FinanceOps is a git commit. Every commit has an author, a timestamp, a description, and a diff. Every deployment is traceable to a specific commit. Every rollback is a revert commit that itself is traceable.
This means that when a compliance auditor asks “who changed the database connection pool size on November 3rd, and why?” the answer is not “let me check with the team.” The answer is a git commit with a PR link, a review approval, and a description of the business reason for the change.
- Every configuration change has a human-readable description in the commit message
- Every change has a reviewer who approved it before it reached the cluster
- Every change has a timestamp in a hash-chained history, so rewriting it after the fact is tamper-evident as long as the main branch is protected from force pushes
- Every change can be diffed against the previous state to show exactly what changed
- Every rollback is itself a change with the same audit properties
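To make the auditor question concrete, here is a minimal sketch of how it could be answered from the repository alone. It assumes the manifests live in a local git checkout and shells out to the standard `git log` command with its `--since`, `--until`, and `-S` (pickaxe) options. The function names, the file path, and the search string are hypothetical and only there for illustration; the "why" still comes from the commit message and the linked PR, which this sketch does not fetch.

```python
import subprocess
from dataclasses import dataclass

# git's %x1f / %x1e placeholders emit control bytes that are unlikely
# to appear in commit text, so they work as field and record separators.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"
LOG_FORMAT = "%H%x1f%an%x1f%cI%x1f%s%x1e"  # hash, author, commit date, subject


@dataclass
class Change:
    commit: str
    author: str
    date: str
    subject: str


def build_git_log_args(path, search, since, until):
    """Build a `git log` command for commits that touch `path` and whose
    diff adds or removes the string `search` within the date window."""
    return [
        "git", "log",
        f"--since={since}",
        f"--until={until}",
        f"-S{search}",
        f"--pretty=format:{LOG_FORMAT}",
        "--", path,
    ]


def parse_git_log(output):
    """Turn the separator-delimited `git log` output into Change records."""
    changes = []
    for record in output.split(RECORD_SEP):
        record = record.strip("\n")
        if not record:
            continue
        commit, author, date, subject = record.split(FIELD_SEP)
        changes.append(Change(commit, author, date, subject))
    return changes


def who_changed(repo_dir, path, search, since, until):
    """Run the query inside `repo_dir` and return the matching changes."""
    result = subprocess.run(
        build_git_log_args(path, search, since, until),
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_git_log(result.stdout)
```

A call such as `who_changed(repo_dir, "apps/api/values.yaml", "poolSize", since, until)` (path and key are made up) returns the commits whose diff touched that setting in the window. Each commit hash then leads to the PR, the reviewer, and the stated reason.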
In fintech, this audit trail is worth more than the deployment automation. SOC 2 requires change management controls. PCI DSS requires audit trails for infrastructure changes. GitOps gives you both, not as a bolted-on compliance feature, but as a natural consequence of how the system works.
What ArgoCD Does That Manual GitOps Does Not
You can get some of these benefits by just committing your Kubernetes manifests to git without ArgoCD. But ArgoCD adds three things that matter:
- Drift detection. ArgoCD continuously compares the cluster state to the git state and marks the application OutOfSync when they diverge. Without this, someone can run kubectl apply directly against the cluster, bypassing the audit trail, and nobody would notice. ArgoCD makes unauthorized changes visible almost immediately.
- Sync history. ArgoCD maintains a history of every sync operation, including success, failure, and partial sync states. This history is separate from git history and includes operational details that git commits do not capture.
- Automated reconciliation. When the cluster drifts from the desired state, ArgoCD can automatically reconcile it. With automated sync and self-heal enabled, a manual kubectl change gets overwritten within minutes. The git repository is not just the record. It is the enforced source of truth.
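The drift detection and reconciliation ideas above can be illustrated with a small conceptual sketch. This is not ArgoCD's actual diffing logic, which handles defaulted fields, ignore rules, and much more. It only assumes that the desired state (from git) and the live state (from the cluster) are plain nested dictionaries, and the function names are hypothetical.

```python
_MISSING = object()  # sentinel for "field not present in the live state"


def find_drift(desired, live, path=""):
    """Return (path, desired_value, live_value) for every field declared in
    `desired` that is missing or different in `live`. Fields that exist only
    in `live` (for example status set by the cluster) are ignored."""
    drift = []
    for key, want in desired.items():
        here = f"{path}.{key}" if path else str(key)
        have = live.get(key, _MISSING)
        if isinstance(want, dict) and isinstance(have, dict):
            drift.extend(find_drift(want, have, here))
        elif have != want:
            drift.append((here, want, None if have is _MISSING else have))
    return drift


def reconcile(desired, live):
    """Self-heal sketch: overwrite every drifted field in `live` with the
    value from `desired`. Assumes keys do not contain dots."""
    for dotted, want, _ in find_drift(desired, live):
        node = live
        *parents, leaf = dotted.split(".")
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = want
    return live


desired = {"spec": {"replicas": 3, "template": {"image": "api:1.4.2"}}}
live = {
    "spec": {"replicas": 5, "template": {"image": "api:1.4.2"}},  # manual scale-up
    "status": {"readyReplicas": 5},
}

print(find_drift(desired, live))  # [('spec.replicas', 3, 5)]
reconcile(desired, live)
print(find_drift(desired, live))  # []
```

The first print shows the manual change as drift; after `reconcile` the live state matches git again. In ArgoCD itself, the overwrite behavior corresponds to enabling automated sync with `selfHeal` in the Application's `syncPolicy`.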
Lessons from Running It in Both Environments
Running ArgoCD on my homelab first was the best decision I made. The homelab is where I learned the operational patterns, made configuration mistakes, and developed the runbooks that the FinanceOps production deployment relies on.
Specific lessons from running ArgoCD at both scales:
- Application-of-applications pattern is worth the setup cost. Managing 20 ArgoCD applications individually is painful. Managing one root application that references all others is clean.
- Sync waves matter for databases and migrations. Deploying a schema migration and the application that depends on it in the wrong order is a common failure mode. Sync waves enforce the sequence.
- Health checks need to be application-specific. A passing Kubernetes readiness probe is not sufficient on its own for ArgoCD health assessment. Custom health checks that verify business-level readiness prevent ArgoCD from marking a broken deployment as healthy.
- Notifications to Slack on sync failures are essential. A failed sync that nobody notices for hours defeats the purpose of continuous reconciliation.
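The sync wave lesson above is easier to see with an example. ArgoCD reads the `argocd.argoproj.io/sync-wave` annotation on each resource and applies lower waves first, waiting for them to become healthy before moving on. The sketch below only mimics that ordering on plain dictionaries; it is not ArgoCD's implementation, and the resource names and the `plan_waves` helper are hypothetical.

```python
from itertools import groupby

WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def sync_wave(manifest):
    """Read the sync wave from a manifest's annotations (default wave is 0).
    Kubernetes annotation values are strings, so convert to int."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return int(annotations.get(WAVE_ANNOTATION, "0"))


def plan_waves(manifests):
    """Group manifests by wave, lowest wave first, as (wave, [kind/name])."""
    ordered = sorted(manifests, key=sync_wave)
    return [
        (wave, [f'{m["kind"]}/{m["metadata"]["name"]}' for m in group])
        for wave, group in groupby(ordered, key=sync_wave)
    ]


manifests = [
    {"kind": "Deployment",
     "metadata": {"name": "api", "annotations": {WAVE_ANNOTATION: "2"}}},
    {"kind": "Job",
     "metadata": {"name": "db-migrate", "annotations": {WAVE_ANNOTATION: "1"}}},
    {"kind": "ConfigMap", "metadata": {"name": "api-config"}},  # no annotation
]

for wave, resources in plan_waves(manifests):
    print(wave, resources)
# 0 ['ConfigMap/api-config']
# 1 ['Job/db-migrate']
# 2 ['Deployment/api']
```

Putting the migration Job in an earlier wave than the Deployment that depends on it is what keeps the schema change ahead of the application rollout.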
The Real Argument for GitOps
By the time I wrote this, the lesson was bigger than the tool or incident. The job had become setting defaults a team could trust, then proving those defaults in systems like infrastructure and ctrlpane. That is leadership work, not just technical taste.
GitOps is not about using git to deploy. It is about using git as an immutable, auditable, reviewable record of every infrastructure decision. The deployment automation is a side effect. The audit trail is the product.
When I talk to engineering leaders about GitOps, I no longer lead with deployment speed or consistency. I lead with the audit question: can you tell me, right now, who changed what in your infrastructure last month, and why? If the answer involves digging through Slack threads, checking CloudTrail logs, and asking someone who might remember, you have an audit trail problem that GitOps solves.
The deployment automation is nice. The infrastructure-as-code discipline is valuable. But the immutable, reviewable, auditable record of every change is the thing that makes my compliance team happy, my on-call engineers effective, and my incident retrospectives productive. That is why I run ArgoCD. The deployment part is just how it gets there.
The audit trail benefit of GitOps is underappreciated until you need it. When a production incident occurs and the first question is what changed, having every deployment tracked as a git commit with a timestamp, author, and diff is invaluable. ArgoCD made our deployment history searchable and our rollbacks trivial. The operational overhead of maintaining ArgoCD on a small cluster was real, but the confidence it provided during incidents paid for itself within the first quarter. GitOps is not about automation convenience. It is about operational accountability.