Grafana Migration: From On-Prem to Cloud
How I migrated Grafana from on-prem to Grafana Cloud with staged cutover, alert parity checks, and RBAC mapping.
Grafana Migration: From On-Prem to Cloud
I recently worked through a Grafana migration from an on-prem setup to Grafana Cloud. The biggest goal was reducing operational overhead while keeping dashboards, alerts, and access controls stable for teams that rely on them daily.
Why We Moved
Our on-prem Grafana instance worked, but it came with recurring issues:
- Manual upgrades and plugin compatibility checks
- Backup and restore responsibility on our side
- Alerting reliability concerns during infrastructure maintenance
- Growing support load for authentication and access management
Moving to cloud let us offload platform management and focus on improving observability itself.
Migration Plan I Used
I broke the move into small, reversible stages:
- Inventory dashboards, folders, alerts, and datasources
- Classify dashboards by criticality (business-critical vs nice-to-have)
- Recreate datasources in Grafana Cloud with secure auth methods
- Migrate dashboards and validate query behavior
- Migrate alert rules and notification channels
- Run both environments in parallel before final cutover
This helped avoid a risky “all at once” migration.
What Needed Extra Attention
Datasource Authentication
Some on-prem datasources used patterns that were fine internally but not ideal for cloud migration. I updated credentials handling and moved to tighter scoped access where possible.
Alerting Parity
Dashboard migration is usually straightforward; alert migration is where surprises happen. I tested:
- Threshold behavior
- Evaluation intervals
- Notification routing
- Mute timings and schedules
I treated alert validation as a separate project, not an afterthought.
Folder Permissions and RBAC
We had to match existing team access exactly to prevent disruption. I mapped old folder permissions to cloud RBAC and validated with a few power users before rollout.
Cutover Strategy
For cutover, I used a short freeze window:
- Freeze dashboard edits on on-prem
- Final export/import sync
- Switch user traffic and update internal docs/bookmarks
- Keep on-prem read-only for a rollback window
Keeping rollback available lowered risk and made approvals easier.
Result
After migration:
- Platform maintenance overhead dropped significantly
- Teams got more consistent alert delivery
- Access management became cleaner and easier to audit
- New dashboard onboarding became faster
My Takeaways
- Spend more time on alerting validation than dashboard import
- Define ownership for folders and alerts before migration day
- Keep the old environment briefly as a safety net
- Communicate early with teams that depend on dashboards daily
If I do this again, I would automate more of the inventory and validation checks up front so I can reduce manual comparisons.