CC's - Notice history

All systems operational

May 5

Incident with Actions
  • Resolved

    On May 20, 2026, between 16:00 UTC and 17:45 UTC, GitHub Actions customers experienced run start delays exceeding 5 minutes. Approximately 4.5% of all runs were delayed during the impact window, with scale set jobs disproportionately affected: 30% of scale set jobs were delayed and 4% failed to start entirely.

    The incident was caused by a misconfigured health check on an internal service that assigns jobs to runners. A brief latency spike in an upstream dependency triggered health check failures across several pods, removing them from service and concentrating load on the remaining capacity. The added load drove memory pressure that escalated into a cascading failure in one regional cluster, leaving it unable to self-recover.

    Responders mitigated the incident by scaling capacity in the healthy regional clusters and draining traffic away from the impaired one, after which run start latency recovered. To prevent recurrence, we are strengthening our health check configuration to avoid cascading failure scenarios and evaluating automated mitigations to rebalance traffic when a region is degraded.

  • Update

    Customer impact has fully subsided. We are maintaining yellow status while we deploy a permanent fix to prevent recurrence.

  • Update

    We've applied a mitigation to fix the issues with queuing and running Actions jobs. We are seeing improvements in telemetry and are monitoring for full recovery.

  • Monitoring

    The degradation affecting Actions has been mitigated. We are monitoring to ensure stability.

  • Update

    A subset of runners are taking longer than expected to connect, which may delay some jobs from beginning execution. We are actively working to mitigate the issue.

  • Investigating

    We are investigating reports of degraded performance for Actions.
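
    The resolution summary for this incident describes health check failures removing several pods at once and concentrating load on the rest. The following is a minimal sketch of one common way to harden a health check against that pattern, not the fix GitHub deployed. It assumes two hypothetical tuning knobs: a consecutive-failure threshold, so a brief latency spike does not eject a pod, and a capacity floor, so the checker never ejects more than a fixed share of pods. All names and values below are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    consecutive_failures: int = 0
    in_service: bool = True


@dataclass
class HealthChecker:
    # Hypothetical tuning knobs; the incident report does not state real values.
    failure_threshold: int = 3          # failed probes in a row before ejection
    min_in_service_ratio: float = 0.7   # never drop below this share of pods
    pods: list[Pod] = field(default_factory=list)

    def record_probe(self, pod: Pod, healthy: bool) -> None:
        """Update a pod's state after one health probe."""
        if healthy:
            # A single healthy probe resets the counter and restores the pod.
            pod.consecutive_failures = 0
            pod.in_service = True
            return

        pod.consecutive_failures += 1
        if pod.consecutive_failures < self.failure_threshold:
            return  # tolerate short spikes in an upstream dependency
        if not pod.in_service:
            return  # already ejected

        in_service = sum(p.in_service for p in self.pods)
        # Capacity floor: keep serving from a failing pod rather than push
        # all load onto a shrinking set of survivors.
        if (in_service - 1) / len(self.pods) < self.min_in_service_ratio:
            return
        pod.in_service = False
```

    With ten pods and the defaults above, one failed probe per pod ejects nothing, and sustained failures on every pod eject at most three of them, so the remaining capacity is not overwhelmed by the checker itself.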

Apr 4

Mar 3

Elevated delays in Actions workflow runs and Pull Request status updates
  • Resolved

    On March 30, 2026, between 10:11 UTC and 13:25 UTC, GitHub Actions experienced degraded performance. During this time, approximately 2.65% of workflow jobs triggered by pull request events experienced start delays exceeding 5 minutes. The issue was caused by replication lag on an internal database cluster used by Actions, which triggered write throttling in our database protection layer and slowed job queue processing.

    The replication lag originated from planned maintenance to scale the internal database. Newly added database hosts triggered guardrails in the throttling layer, restricting write throughput. The incident was mitigated by excluding the new hosts from replication delay calculations.

    To prevent recurrence, we have updated our maintenance procedures to ensure new hosts are excluded from throttling assessments during scaling operations. Additionally, we are investing in automation to streamline this type of maintenance activity.

  • Update

    The degradation has been mitigated. We are monitoring to ensure stability.

  • Monitoring

    The degradation affecting Actions and Pull Requests has been mitigated. We are monitoring to ensure stability.

  • Investigating

    We are investigating reports of degraded performance for Actions and Pull Requests.
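
    The resolution summary for this incident says write throttling was triggered by replication lag on newly added hosts, and that the mitigation excluded those hosts from the replication delay calculation. A minimal sketch of that idea follows. The `Replica` type, the `provisioning` flag, and the 5-second threshold are assumptions made for illustration; the actual throttling layer is not described in the report.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Replica:
    host: str
    lag_seconds: float
    provisioning: bool = False  # hypothetical flag: host is still catching up


def effective_lag(replicas: Iterable[Replica],
                  exclude_provisioning: bool = True) -> float:
    """Return the worst replication lag among the replicas that count."""
    considered = [
        r for r in replicas
        if not (exclude_provisioning and r.provisioning)
    ]
    # With no eligible replicas, report zero lag rather than raising.
    return max((r.lag_seconds for r in considered), default=0.0)


def should_throttle_writes(replicas: Iterable[Replica],
                           max_lag_seconds: float = 5.0,
                           exclude_provisioning: bool = True) -> bool:
    """Throttle writes only when an established replica falls too far behind."""
    return effective_lag(replicas, exclude_provisioning) > max_lag_seconds


cluster = [
    Replica("db-1", lag_seconds=0.4),
    Replica("db-2", lag_seconds=1.2),
    Replica("db-new", lag_seconds=900.0, provisioning=True),
]
```

    In this sketch, `should_throttle_writes(cluster)` is false because the host that is still catching up is ignored, while passing `exclude_provisioning=False` reproduces the behavior that restricted write throughput.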

Git operations for users on the west coast are experiencing an increase in latency
  • Resolved

    On March 19, 2026, between 16:10 UTC and 00:05 UTC (March 20), Git operations (clone, fetch, push) from the US west coast experienced elevated latency and degraded throughput. Users reported clone speeds dropping from typical speeds to under 1 MiB/s in extreme cases. The root cause was saturation of network transport links at our Seattle edge site, where a fiber cut affecting our backbone transport caused saturation and packet loss. A planned scale-up was already in progress for the site, and we accelerated it to relieve the backbone capacity pressure. We also brought additional edge capacity online in a cloud region and redirected some users there. The upgraded network capacity, which increased from 800 Gbps to 3.2 Tbps in total on this path, is sufficient to prevent recurrence. We will continue to monitor network health and respond to any further issues.

  • Update

    Git operations have stabilized following the changes we deployed today.

  • Update

    We are seeing early signs of improvement. We are working on one more small change to further improve traffic routing on the west coast.

  • Update

    We have completed the rollout of our new network path and are monitoring its impact.

  • Update

    We are beginning the rollout of our new network path. During this change, users will continue to see higher latency from the west coast. We will provide another update when the rollout is complete.

  • Update

    We are working to enable a new network path on the west coast to reduce load and will monitor the impact on latency for Git operations.

  • Update

    We are still seeing elevated latency for Git operations on the west coast and are continuing to investigate.

  • Update

    We are redirecting traffic back to our Seattle region, and customers should see a decrease in latency for Git operations.

  • Investigating

    We are investigating reports of degraded performance for Git operations.

Elevated deployment and function invocation failures
  • Resolved

    This incident has been resolved.

  • Monitoring

    We have rolled out a second mitigation for elevated Build errors and are seeing recovery. All builds are now excluding the Dubai region (dxb1) from their deployment targets as a temporary measure. We will provide additional updates as they become available.

  • Update

    We have rolled out a first mitigation for elevated Build errors. Builds that use Middleware are now excluding the Dubai region (dxb1) from their deployment targets as a temporary measure, and should complete successfully again. We are now working on a mitigation for Builds that are using Edge Functions.

  • Update

    We are currently deploying a mitigation for elevated Build errors. Builds that use Middleware or Edge Functions will exclude the Dubai region (dxb1) from their deployment targets as a temporary measure. We will provide additional updates as they become available.

  • Update

    We are still seeing elevated errors in Builds in all regions, because Middleware and Edge Functions may be deployed globally. Builds that don't use Middleware or Edge Functions are not impacted. We are continuing to work on a fix for this issue.

  • Update

    The dxb1 Edge traffic is currently being rerouted to the nearest Edge region (bom1) to mitigate the impact. We will provide additional updates as they become available.

  • Update

    We have rolled out mitigations and are seeing recovery. If you are still seeing build failures and are using Dubai (dxb1) as your primary Vercel Functions region, you can switch to another region as a workaround.

  • Identified

    Since 05:00 UTC, we have been seeing failures to deploy and invoke functions in the Dubai region (dxb1). Deployments with Middleware Functions are also impacted in all regions, because Middleware Functions are deployed globally for production deployments. Our team is actively investigating the issue.
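
    The updates for this incident describe a temporary measure in which builds exclude the Dubai region (dxb1) from their deployment targets. A minimal sketch of that kind of filter is shown below. It is not Vercel's build code: the function name and the error behavior are assumptions, and `iad1` in the usage note is used only as an illustrative additional region. The region identifiers `dxb1` and `bom1` come from the updates above.

```python
from typing import Iterable

# Regions temporarily removed from deployment targets (from the updates above).
EXCLUDED_REGIONS = frozenset({"dxb1"})


def deployment_targets(requested: Iterable[str],
                       excluded: frozenset = EXCLUDED_REGIONS) -> list[str]:
    """Return the requested regions minus excluded ones, preserving order."""
    targets = [region for region in requested if region not in excluded]
    if not targets:
        # Hypothetical behavior: fail loudly instead of deploying nowhere,
        # mirroring the advice to switch to another primary region.
        raise ValueError(
            "all requested regions are excluded; choose another region"
        )
    return targets
```

    Calling `deployment_targets(["iad1", "dxb1", "bom1"])` yields the list without `dxb1`, while a project that requests only `dxb1` gets an explicit error rather than an empty deployment.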

Mar 3 to May 5
