Relay 2 LEXI connection issue

Incident Report for Ai-Media

Postmortem

Post Incident Review LEXI/iCap Outage (29 January 2026) 

 

Overview:

Incident Title: Relay 2 LEXI Connection Issue – 29 January 2026 

Date & Time (UTC): Jan 29, 18:07 - Jan 30, 01:00 

Date & Time (AEDT): Jan 30, 05:07 - Jan 30, 12:00

Date & Time (EST): Jan 29, 13:07 - Jan 29, 20:00 

Affected Service: iCap, LEXI 

Impact Scope: Global

Severity: High 

Duration: 6 hours 53 minutes 

Summary of Incident: 

On January 29, 2026, a deployment to Relay 2 (13.52.203.48) completed at 18:07 UTC. Following this deployment, encoders configured to prefer relay_2 began experiencing connection failures, with the first customer report received at 20:06 UTC. 

As the issue progressed, the Ai-Media Support team began receiving additional cases, including reports that LEXI jobs were not running. This indicated that the impact extended beyond encoder connectivity and affected downstream captioning workflows as well. 

The root cause was identified as a deployment script that unintentionally blocked traffic from certain IPs to relay_2 while other relays remained unaffected.  The blocks were removed at 01:00 UTC (30 Jan), restoring service, and customers confirmed normal operation shortly thereafter 

 

Timeline of Events (UTC): 

 

Impact:

  •  Duration: 6 hours 53 minutes 
  • Services Affected: iCap/LEXI 

Clients Impacted:  

  • Encoder and LEXI connections became stuck in disconnected state. 
  • The issue impacted a small subset of customers—between 1% and 5%—and only those whose encoders were configured to Relay 2. 

Root Cause Analysis (RCA):

 A script used during deployment unintentionally applied blocking rules that affected a subset of customer IP ranges — but only on relay_2, not on the other relay servers. This caused encoder/LEXI connections set to prefer relay_2 to fail to connect. 

 

Resolution:

 At 01:00 UTC — 30 January 2026, the technical team removed the traffic blocks that had been unintentionally applied by the deployment script on relay_2, and initial testing immediately showed that the issue was resolved. 

 

Corrective Actions & Improvements: 

Immediate Fixes 

  • Removal of the traffic blocks from Relay 2. 
  • Validation of successful encoder reconnections. 

Short Term Improvements

  • Add deployment script auditing before execution. 
  • Implement diff-check or rule-compare for firewall changes per relay. 
  • Update internal runbooks to include:  

    • Early detection of relay specific behavior
    • Explicit post-deploy traffic validation 

Long Term Prevention 

  • Introduce automated configuration validation pipelines for relay nodes. 
  • Enforce “canary-style” deployment (1 relay first, monitor results). 
  • Improve monitoring to detect unusual connection failure patterns quickly. 
  • Build alerting on encoder disconnect spikes for any specific relay.  

Commitment to Clients:

 We recognize the impact this incident had on our clients’ operations and deeply regret the disruption. AI-Media is committed to: 

  • Maintaining transparent communication during and after incidents. 
  • Strengthening our technical safeguards to reduce the likelihood of similar outages. 
  • Continuously improving our disaster recovery capabilities to ensure service reliability.
Posted Feb 05, 2026 - 02:44 UTC

Resolved

The deployed fix has been successful and we are marking this as resolved. We will continue to monitor to ensure stability. Please contact support if any further assistance is required. Thank you for your patience as this was resolved.
Posted Jan 30, 2026 - 01:43 UTC

Monitoring

A fix is deployed and we are seeing relay 2 service return to normal. We are continuing monitor.

Please contact support if you are encountering any issues. Thank you.
Posted Jan 30, 2026 - 01:13 UTC

Identified

A cause has been identified and a fix is being tested to resolve any ongoing issues. Thanks for your patience. Please contact support if you are still impacted by this. Thank you.
Posted Jan 30, 2026 - 01:00 UTC

Investigating

We are currently experiencing an issue with LEXI connecting to encoders set with a preference to use the relay 2 server. Effected encoders will likely not be able see LEXI captions during this time. Our team is working to identify the cause and will share updates shortly.

If your encoder is impacted, please remove the relay preference in iCap Admin (https://accounts.eegicap.com/) under the encoder user account and recreate the LEXI instance. Please reach out to support for further assistance as needed.
Posted Jan 29, 2026 - 22:59 UTC
This incident affected: iCap and Lexi.