There are no Open Network Issues Currently
Charlotte, NC Service Impacting Maintenance: 8/16 @ 8:45pm (Resolved) Critical

Affecting Other - Charlotte, NC Datacenter

  • 08/16/2024 20:45 - 08/17/2024 08:14
  • Last Updated 08/16/2024 23:56

Dear Customer,

Thank you for your continued business!  


This is a service impacting maintenance notification for our Charlotte, NC datacenter. If you do not have services at this datacenter, then you can safely ignore this notice.


The existing datacenter building on the Segra campus (currently 1612 Cross Beam Dr) is approaching planned obsolescence. As such, all services are being moved to a brand new facility on the same campus, completed just this year at 3101 International Airport Dr. All of our networking has been upgraded and duplicated at the new location; it is already "live", and similar test migrations have been completed successfully, so we expect the process to go smoothly.


We will be cleanly powering off servers and migrating one cabinet at a time, then powering them on at the new location. You can expect more redundancy, more on-net carriers, and a better, higher-capacity network immediately, with more to come in the future (and no need for a similar migration for decades)!


Dedicated and colo server customers: Please begin powering down your server at 8:45PM EST on Friday, August 16th; otherwise, we will power down any remaining servers manually.


Expected impact:

8:45PM EST until 11:30PM EST


UPDATE 8/16/2024 @ 8:05pm EST: We are prepared and ready for the migration as planned. Teams are on site and will begin powering down servers at 8:45pm if your server(s) are not already powered down.

UPDATE 8/16/2024 @ 11:30pm EST: Maintenance has been successful! Most servers are already powered back up; you should see your servers up now or soon. If you suspect a server should be up but isn't, please open a support ticket.

Emergency Network Maintenance - OREGON (Resolved) Critical

Affecting Other - Bend, OR

  • 07/22/2024 03:31 - 07/22/2024 06:26
  • Last Updated 07/22/2024 06:26

2024-07-21 12:31am PST - A fiber issue affecting both of our primary redundant providers (Fatbeam and Lumen) between Sandy, OR and Maupin, OR is currently being addressed, with a repair window on July 22, 2024 between midnight and 6am PST. The Lumen NOC has advised that a partially damaged cable has been identified. The damage was caused by a construction project in the area, and the cable needs to be fully cut and spliced to repair the damage. Services are expected to be impacted for 5 hours within the maintenance window.

UPDATE: 12:40am PST, we have failed over to a tertiary provider. Connectivity services are restored, but customers may notice increased latency or packet loss throughout this window. If your services are still completely offline, please post an urgent support request.

UPDATE: 3:21AM PST, one of our primary providers is back up and traffic has been restored. 


Service Impacting Maintenance: Bend, OR Datacenter (Resolved) Medium

Affecting Other - Bend, OR datacenter

  • 07/13/2024 21:30 - 07/15/2024 11:56
  • Last Updated 07/13/2024 01:01

Dear valued clients,

Please be advised that we will be performing emergency network maintenance in our Bend, OR datacenter tomorrow, 7/13/2024.

----

Timeline (Please note only 10-30 minutes downtime is expected within this window):

Start time: 7/13/2024 9:30PM pacific time

End time: 7/13/2024 11:30PM pacific time

---

Description of issue:

This maintenance is necessary to resolve a connectivity issue related to a Cisco bug for which no SMU (hotfix) is available. As such, we will be performing a full software update on our core routers. Due to the way our routers are clustered, this will result in approximately 10-30 minutes of downtime.

---

 
If you have any questions, please do not hesitate to contact us.

Thank you for your continued business!

Node: OR-Poppy RAID Failure (Resolved) Critical

Affecting Other - Hypervisor: OR-Poppy

  • 06/20/2024 16:25 - 06/22/2024 21:58
  • Last Updated 06/22/2024 00:59

At approximately 4:25PM Pacific time our NOC began to notice alarms for the VPS Hypervisor "OR-Poppy".

We immediately dispatched a technician to physically investigate the server, and it was discovered that two drives in the RAID array, on the same span, had failed. We attempted multiple ways to bring the RAID array back online, with no success. We are now preparing a replacement server onto which backups will be restored. No ETA is currently available.
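
For context on why two drive failures took the whole array down: in a spanned array such as RAID 10, the array tolerates multiple drive failures as long as no single span (mirror pair) loses all of its members. A minimal sketch of that failure condition (the drive/span layout below is hypothetical, not OR-Poppy's actual configuration):

```python
# Why two failed drives in the SAME span take a spanned array offline, while
# two failures in different spans are survivable. Layout is hypothetical.
def array_survives(spans: dict, failed: set) -> bool:
    """A spanned mirror survives only while every span keeps >= 1 healthy drive."""
    for span_id, drives in spans.items():
        if all(d in failed for d in drives):
            print(f"span {span_id} lost all members {drives} -> array offline")
            return False
    return True

# Four two-drive mirrored spans (e.g. RAID 10 over 8 disks).
layout = {0: ["sda", "sdb"], 1: ["sdc", "sdd"],
          2: ["sde", "sdf"], 3: ["sdg", "sdh"]}

print(array_survives(layout, {"sda", "sdc"}))  # True: failures in different spans
print(array_survives(layout, {"sda", "sdb"}))  # False: both drives of span 0
```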

UPDATE: 7:25PM Pacific: The replacement hypervisor physical hardware is ready, and the OS is installed. We are currently preparing the virtualization software.

UPDATE: 7:41PM Pacific: The replacement hypervisor is now ready. We are beginning to restore backups; this will take many hours. VMs will come online one by one as each restoration finishes. No ETA for your specific VPS is available.

UPDATE: 10:53AM Pacific: We have restored approximately 40% of VMs, and they are continuing to restore. An archived backup from December 25th is being used.

UPDATE: 9:58PM Pacific: All VMs have been restored. Thank you for your patience with this process, if you have any additional questions please do not hesitate to open a support ticket.

Dallas, TX - Facility Outage (Resolved) Critical

Affecting Other - Dallas, TX - datacenter-wide outage

  • 05/26/2024 06:17 - 06/15/2024 16:54
  • Last Updated 05/30/2024 01:44

Dear Customer,

At approximately 6:17am CST our NOC began to see alarms in our Dallas, TX POP. As we are still investigating the issue, we do not yet have an ETA for resolution, but hopefully it will be very soon. Meanwhile, your patience is appreciated. We will post updates shortly.


UPDATE: 6:35am CST: Upon further investigation, we suspect the entire facility at Carrier-1/Prime Datacenters in Dallas has gone dark.   It is likely power related.   We are awaiting updates from onsite.   

UPDATE: 7:25AM CST: We have received confirmation from the facility that this is a power outage. They have confirmed that there is no physical damage to the facility from the storm in the area. The generators are running; however, the UPSes are not receiving any power from the generators. Facilities staff and generator contractors are en route with an ETA of 30-45 minutes.


UPDATE: 9:02AM CST: Power has been restored to our pods and office in Dallas. Networking is back up and most servers are back online. If you still have servers or services down, please post a support ticket and we will investigate with urgency.


UPDATE: 11AM CST: We are still working on scattered issues related to the earlier outage. Servers in cabinets BI04 and DK12 are up but without proper networking. We are working urgently on this, as well as on the remaining hosts that have been reported to us.

UPDATE: 1:24PM CST: We have resolved the issues in cabinets BI04 and DK12. We are now working our way through any remaining individual servers.

A full post mortem will be posted here, and as an announcement, once we have gathered all of the facts. The underlying issue was that despite redundant power systems that were online and functional, the facility may have had some type of malfunction with the automatic transfer switch (ATS) system. Powerful storms and tornadoes in Texas overnight caused deaths and destruction in the area, but the facility should have remained unaffected even with an extended loss of utility power, as it has before. These systems are indeed routinely tested. Please have no doubt that our team will ensure that all necessary investigations, and ultimately fixes, are carried out to avoid even the slightest chance of a repeat in the future. We appreciate your patience and understanding.

Preliminary RFO:

Full Outage Summary

Utility power is lost and both Gen A and Gen B start running. Gen A then tripped off immediately after load transfer for an undetermined reason. Gen A continued trying to restart, completely draining and killing its starting battery, even though the battery had recently passed its annual PM. UPS 1A and 2A fully discharged their batteries. Once that happened, STS 1A and 2A switched to their secondary source, the Catcher UPS, which was being fed from Gen B. This overloaded the Catcher UPS and forced it into bypass without fully discharging its batteries. The resulting load on Gen B caused significant voltage fluctuations, which caused UPS 1B and 2B to declare the source unavailable and stay on battery until fully discharged, at which point the units went offline. Load on blocks 1A and 2A remained powered through the Catcher UPS in bypass, fed from Gen B. When the utility returned, the open transition between Gen B and utility caused the remaining online equipment to trip offline, tripping all the PDU main input breakers. This is why the customer load was not immediately restored along with the utility.

 

Timeline

5:57 AM – Utility power to the facility was lost.

• Gen A starts but eventually fails, draining its battery trying to restart.

• Gen B starts and continues to operate until utility returns, with significant voltage and frequency instability (due to overloading).

5:58 AM – Overall System Status

• UPS 1A / 2A discharging due to no generator power available.

• UPS 1B / 2B discharging due to ATS transition.

• Mechanical load fed from Gen B back online (half capacity)

5:58 AM – ATS-1B and ATS-2B Load transferred to Generator.

6:01 AM – UPS-1A Offline, STS-1A transfers to Source 2 (Catcher UPS)

6:03 AM – UPS-2A Offline, STS-2A transfers to Source 2 (Catcher UPS)

6:04 AM – Load on the Catcher exceeds its capacity, forcing the Catcher UPS into bypass fed from Gen B.

6:05 AM – Gen B starts to experience significant voltage and frequency fluctuations due to overloading.

6:06 AM – UPS-1B and UPS-2B, due to poor power quality from the generator, declare input power substandard and resume discharging from batteries.

6:15 AM – UPS-1B, with input power considered bad, fully discharges its batteries and downstream load is lost. Downstream STS-1B is unable to switch to source 2 (Catcher) due to power quality and shuts down. Downstream PDUs open their main input breakers.

6:21 AM – UPS-2B, with input power considered bad, fully discharges its batteries and downstream load is lost. Downstream STS-2B is unable to switch to source 2 (Catcher) due to power quality and shuts down. Downstream PDUs open their main input breakers.

6:28 AM – Utility power returns and all CRACs come back. When ATS-B performs the open transition from generator to utility, the remaining PDUs operating on Gen B lose power, as no UPS has any remaining battery capacity, opening the remaining PDU main input breakers.

7:15 AM – Technician identifies UPS 1A, 1B and 2A are all offline. UPS 2B is online in bypass with no load.

8:20 AM – All UPS’s reset and brought back online in normal operation.

8:50 AM – All tripped PDU main input breakers reset, and customer load restored.

Ashburn, VA Emergency Network Maintenance (Resolved) High

Affecting Other - Ashburn, VA POP

  • 05/16/2024 17:30 - 05/16/2024 19:26
  • Last Updated 05/16/2024 18:41

UPDATE: 5/16/2024 5:11PM eastern time: The end time for this maintenance is being extended by one hour to 7:30PM eastern time.


Dear valued clients,

Please be advised that we will be performing emergency network maintenance in our Ashburn, VA datacenter today, 5/16/2024.

----

Timeline:

Start time: 5/16/2024 5:30PM eastern time

End time: 5/16/2024 6:30PM eastern time

---

Description of issue:

This maintenance is necessary to resolve an IPv6 connectivity issue related to a Cisco bug for which no SMU (hotfix) is available. As such, we will be performing a full software update on our core routers. Due to the way our routers are clustered, this will result in approximately 10-30 minutes of downtime.

---

 

If you have any questions, please do not hesitate to contact us.

Thank you for your continued business!

Service Impacting Maintenance Notification: Bend, OR 12/11 & 12/12 (Resolved) Medium

Affecting Other - Bend, OR Datacenter

  • 12/11/2023 22:00 - 12/13/2023 08:38
  • Last Updated 12/01/2023 17:50

Dear Valued Clients,

This is a service impacting maintenance notification for our Bend, OR location.

We have identified necessary power maintenance, which will require shutting off the main power distribution units one at a time. First the A side PDU will be serviced on the evening of 12/11/2023, and then the B side PDU will be serviced on the evening of 12/12/2023. To reduce the impact to services, we will be "swinging" rack-level PDUs temporarily during the maintenance, and then "swinging" them back to their original diverse feed after the maintenance is completed.

We will also be performing switch and router software upgrades at this time. You may experience one to three network outages of approximately 15 minutes each during these periods.

Maintenance Window One (A side):
December 11th from 11PM Pacific Time, until December 12th 7AM Pacific Time

Maintenance Window Two (B side):
December 12th from 11PM Pacific Time, until December 13th 7AM Pacific Time

Dedicated server customers: Most of our dedicated servers have dual power supplies, meaning that the only impact to your service will be during the network maintenances. If you have a single power supply server, then we will cleanly shut down your server, swing the power supply to the other feed, and power it back up - expected downtime of approximately 10-15 minutes. We will then repeat the reboot process at the end of the power maintenance, moving the power supply back to its original source. If you have an operating system that will not accept an ACPI shutdown command cleanly (ESXi, XenCenter, etc.), please ensure that we have an up-to-date root/administrator password on file; otherwise we will be forced to hard power cycle the server.

VPS server customers: The only impact to your service will be during the network maintenances.

Shared hosting customers: The only impact to your service will be during the network maintenances.

Colocation customers: If you have purchased A+B power, there will be no power disruption to your services; the only impact will be during the network maintenances. If you have not purchased A+B power, then we will cleanly shut down your server, swing the power supply to the other feed, and power it back up - expected downtime of approximately 10-15 minutes. We will then repeat the reboot process at the end of the power maintenance, moving the power supply back to its original source. If you have an operating system that will not accept an ACPI shutdown command cleanly (ESXi, XenCenter, etc.), please ensure that we have an up-to-date root/administrator password on file; otherwise we will be forced to hard power cycle the server.
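
For the technically curious, the clean-shutdown-with-fallback procedure described above can be sketched roughly as follows, using the standard ipmitool CLI against a server's BMC. This is an illustrative sketch only: the host and credentials are placeholders, and it is not necessarily the exact tooling our technicians use.

```python
# Sketch of a clean shutdown with a hard power-off fallback via IPMI, using
# the standard ipmitool CLI. Host and credentials are placeholders.
import subprocess
import time

def ipmi(host: str, *args: str) -> str:
    cmd = ["ipmitool", "-I", "lanplus", "-H", host,
           "-U", "admin", "-P", "changeme", *args]
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout

def clean_shutdown(host: str, timeout_s: int = 300) -> None:
    ipmi(host, "chassis", "power", "soft")  # request an ACPI soft shutdown
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if "off" in ipmi(host, "chassis", "power", "status").lower():
            print(f"{host}: powered off cleanly")
            return
        time.sleep(10)
    # The OS ignored the ACPI request (e.g. no graceful-shutdown path):
    # as a last resort, force power off, mirroring the hard cycle above.
    print(f"{host}: no clean shutdown within {timeout_s}s, forcing power off")
    ipmi(host, "chassis", "power", "off")

clean_shutdown("bmc.example.net")
```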

If you have any questions regarding this maintenance event, please do not hesitate to contact us. This maintenance will allow us to continue to expand in our Bend, OR location and increase our redundancy, security, and capacity. We appreciate your understanding and thank you for being a valued client!

Fiber Cut - Bend, OR (Resolved) Critical

Affecting Other - Bend, OR Datacenter

  • 10/19/2023 16:40 - 10/19/2023 23:16
  • Last Updated 10/20/2023 02:19

10/19/2023 4:40PM PST: Our NOC was alerted to a network issue in our Bend, OR location and is currently investigating.

UPDATE 5:06PM PST: We have confirmed that there is a fiber cut at the same convergence point as the cut that caused the 09/19/2023 network outage - unfortunately, the diverse route we have ordered is not yet available. We are engaging a tertiary provider in the meantime.

UPDATE 5:13PM PST: We have engaged a tertiary provider and subnets are beginning to propagate.
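
(As an aside, prefix propagation of this kind can be verified from outside our network. Below is a rough sketch using RIPEstat's public routing-status endpoint; the prefix is an arbitrary example, and the response fields are read defensively since they should be confirmed against current RIPEstat documentation.)

```python
# Sketch: confirm a prefix is visible in the global table via RIPEstat's
# public routing-status endpoint. Prefix is an example; fields are read
# defensively and should be checked against current RIPEstat docs.
import json
import urllib.request

def routing_status(prefix: str) -> dict:
    url = ("https://stat.ripe.net/data/routing-status/data.json"
           f"?resource={prefix}")
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp).get("data", {})

data = routing_status("108.181.226.0/24")
visibility = data.get("visibility", {}).get("v4", {})
print("RIS peers seeing the prefix:", visibility.get("ris_peers_seeing"),
      "of", visibility.get("total_ris_peers"))
```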

UPDATE 7:49PM PST: We have received an update from our providers, they have identified a fiber cut located in Sandy Oregon. Their splicers have just arrived on-site to start repairs. Updates are to follow as more information becomes available.

UPDATE 8:56PM PST: The splicers have prepared the sheath and are commencing fusion splicing. They are giving a maximum ETTR of 01:00AM PST (four hours from now). The next update will be provided at that time or as additional information is obtained.

UPDATE 11:16PM PST: One of our transit providers has regained connectivity. Network access is now restored, please note that we do not yet have full redundancy. This incident is now closed.

Bend, OR Network Event (Resolved) Critical

Affecting Other - Bend, OR Network

  • 09/19/2023 00:31 - 09/19/2023 02:34
  • Last Updated 09/19/2023 12:44

At approximately 12:10 AM PST 9/19/2023 our NOC detected a network outage in our Bend, OR datacenter. We are currently investigating this outage.

Update 12:33AM PST: We have determined that all of our redundant network providers are currently hard down. While one was scheduled for maintenance tonight, no others were. We are currently in contact with both providers, working to establish an ETTR.

Update 1:02AM PST: We have confirmed with both upstream providers that, unfortunately, one provider made a network topology change which created a convergence point we were unaware of; both providers now ride the same sheath, which is currently undergoing maintenance, resulting in a loss of service. We are engaging a tertiary upstream provider in order to restore services as quickly as possible. Please note the maximum ETTR for the originally announced maintenance window is 6:00AM PST.

Update 2:01AM PST: We have several prefixes beginning to propagate via a tertiary provider, and are working on bringing up the remaining prefixes.

Update 2:34AM PST: One of our primary providers has come back up, and services are now fully restored.

We will be conducting a thorough RCA of this incident, including investigating when this convergence point was changed and why we were not notified, and we will be changing topology to restore full path redundancy.

Network Issue - Dallas, TX (Resolved) Critical

Affecting Other - Dallas, TX datacenter

  • 09/12/2023 10:39 - 09/12/2023 11:04
  • Last Updated 09/12/2023 12:06

At approximately 10:38AM CST our NOC became aware of a network issue in our Dallas, TX POP. We are actively investigating and will provide updates once available.

Update 10:51AM CST: We have identified the root cause of the issue and are working to implement a fix. ETTR is within ten minutes.

Update: 11:04AM CST: The issue has been resolved. An RCA will be created and available upon request in 1-2 weeks.

URGENT Network Issue - Dallas, TX (Resolved) High

Affecting Other - Dallas, TX datacenter

  • 08/14/2023 14:33 - 08/14/2023 16:33
  • Last Updated 08/14/2023 16:34

We are investigating a routing issue involving certain subnets in Dallas, TX.

The following ranges are affected:

  172.107.31.xxx
  172.107.80.xxx
  104.217.227.xxx
  45.35.213.xxx
  172.107.176.xxx
  108.181.226.xxx
  108.181.227.xxx
  108.181.150.xxx
  108.181.98.xxx
  108.181.240.xxx

We are working with upstream providers to resolve the issue ASAP.  Full connectivity will be restored shortly.
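
If you would like to check programmatically whether one of your IPs falls inside the ranges above, here is a small sketch using Python's standard ipaddress module, interpreting each "xxx" entry as the corresponding /24 (an assumption on our part):

```python
# Check whether an address falls inside the affected ranges listed above,
# reading each "x.y.z.xxx" entry as the /24 network x.y.z.0/24 (assumption).
import ipaddress

AFFECTED = [ipaddress.ip_network(n) for n in (
    "172.107.31.0/24", "172.107.80.0/24", "104.217.227.0/24",
    "45.35.213.0/24", "172.107.176.0/24", "108.181.226.0/24",
    "108.181.227.0/24", "108.181.150.0/24", "108.181.98.0/24",
    "108.181.240.0/24",
)]

def is_affected(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in AFFECTED)

print(is_affected("108.181.226.10"))  # True: inside an affected /24
print(is_affected("192.0.2.1"))       # False
```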

Update: 4:33PM EST, this issue has been resolved.

6/1/2023 Staten Island, NY Network Maintenance (Resolved) Medium

Affecting Other - Staten Island, NY Network Maintenance

  • 06/01/2023 21:00 - 06/10/2023 09:59
  • Last Updated 05/30/2023 15:45

Dear valued clients,

Thank you for your continued business!

Please note we will be performing service impacting network maintenance in our Staten Island, NY location in order to complete a necessary expansion and capacity upgrade. Expected downtime is only 30 minutes; however, over one hour is being allocated in case of unforeseen issues. During this time you may see 2-3 brief network outages.

Start time: 6/1/2023 9:00PM EDT
End time: 6/1/2023 11:30PM EDT

If you have any questions, please do not hesitate to get in contact with us.

North Carolina - Power Issues (Resolved) Critical

Affecting Other - CLT4 North Carolina - ENTIRE DATACENTER

  • 03/26/2023 06:38 - 03/26/2023 23:10
  • Last Updated 03/26/2023 19:10

On 3/26/2023 at approximately 3:36AM EST our NOC became aware of a major electrical issue in the North Carolina datacenter affecting all services. We are handling this with top priority. Updates will be posted as information is gathered. There is no ETA, but services will be restored ASAP.

UPDATE 5:52AM EST: We are presently experiencing an issue at our CLT4 data center which is causing temperatures to rise to levels unsafe for equipment. We currently have 2 maintenance vendors onsite to diagnose and resolve the issue; however, we have no ETA at this time. Out of an abundance of caution, and to avoid any unnecessary damage to our equipment, we may power down some equipment.

UPDATE 1:47PM EST: HVAC issues have been temporarily resolved and we are working to begin powering equipment back up as temperature thresholds allow.

UPDATE 2:06PM EST: We continue to power equipment back on gradually. We still have more than five racks of equipment offline and will continue to power them back on as temperature thresholds allow.

UPDATE 3:50PM EST: HVAC and electrical contractors have isolated the issue at the CLT4 data center which was causing the increased temperatures, and ambient temperatures are beginning to drop. It is now safe to power on equipment, and we continue to actively do so. We are still awaiting resolution of the root cause; the current Estimated Time to Repair from Duke Energy is 8:00 PM Eastern. The facility will remain on generator power until Duke has completed their work.

UPDATE 6:42PM EST: We have worked through all remaining service alerts. If you currently have a service impacting issue, please open a high priority ticket from your client portal.

Routing issues - New York (NY2) (Resolved) Critical

Affecting Other - NY2 - Datacenter

  • 01/02/2023 23:30 - 01/05/2023 13:57
  • Last Updated 01/02/2023 23:03

As of 11:30pm EST, we are currently experiencing network downtime in our Staten Island location.   We are working on re-establishing BGP sessions with upstream providers.   There is currently no ETA, but we are hoping for a resolution at any moment.

As of 1:50am EST, it seems the issue has recurred. We are investigating this and looking for a permanent fix.

The issue appears to be that our BGP sessions with our upstream providers keep flapping. Their network engineers will be onsite ASAP; meanwhile, our only option is to have onsite techs reboot our networking equipment. We are currently awaiting confirmation that devices have been cycled once more. Unfortunately, this process may repeat until a final, permanent solution is in place.

North Carolina - partial outage Nov 9th, 2022 (Resolved) Critical

Affecting Other - North Carolina (NC)

  • 11/09/2022 05:35 - 11/09/2022 11:12
  • Last Updated 11/09/2022 06:53

As of Nov 9th, 2022 at 8:30am EST, we are investigating a partial outage in North Carolina (Segra). Staff are onsite investigating the issue. There is no ETA known at this time, but we are optimistic that it is not a major issue.

 

UPDATE Nov 9th at 9:45am EST:

Onsite staff have identified an issue with the cross connects from 2 of our cabinets to our main networking cabinet. There is no power-related issue. We are diagnosing and attempting to repair the issue without any further disruption. There is still no ETA, but we are hoping to have a resolution very soon.

Dallas, TX Distribution Switch Maintenance (Resolved) Medium

Affecting Other - Dallas, TX

  • 10/27/2022 22:00 - 10/28/2022 20:31
  • Last Updated 10/27/2022 17:02

Dear valued clients,

Please note we will be performing a service impacting maintenance on Friday 10/28/2022 at approximately 10PM central time. The expected downtime is 10 minutes; however, 30 minutes is being allocated. This is due to a critical security update to a distribution switch that requires a reboot to apply.

Thank you for your continued business!

UPS Maintenance - 7/11 8AM to 12PM - 1612 Cross Beam Drive, Charlotte NC (Resolved) Medium

Affecting Other - CLT4 UPS B

  • 07/11/2022 08:00 - 07/12/2022 10:24
  • Last Updated 07/10/2022 15:45

CLT4 UPS B Maintenance - 7/11 8AM to 12PM - 1612 Cross Beam Drive, Charlotte NC 


Dear Valued Customer,

Please be advised that we will be performing non-customer-impacting maintenance on Monday, July 11th on our UPS systems in the Charlotte data center. We will be replacing physical batteries in the UPS systems during this maintenance window. We do not anticipate any disruption to services during this time; however, we wanted to advise you of this upcoming maintenance window. During this window, the UPS systems will be online but will not have battery backup power until the work has been completed. Should you have any questions, please contact support and we will be happy to address them with you.
 
Thank you,
 

Cooling - Dallas, TX (Resolved) High

Affecting Other - DC2 - entire building

  • 06/11/2022 18:35 - 06/12/2022 00:31
  • Last Updated 06/11/2022 17:08

We observed above-normal temperatures in building 2 at Carrier 1 in Dallas, TX. The facilities team is onsite ensuring that HVAC is operating normally.

UPDATE: as of 7pm CST, we are hearing that temperatures are moderating and the situation is under control.   We will mark this incident as resolved but will provide further updates here once we receive details.

Charlotte NC Power Outage (Resolved) Critical

Affecting Other - Segra, Charlotte NC Datacenter

  • 05/04/2022 05:59 - 05/04/2022 09:24
  • Last Updated 05/04/2022 09:25

We were informed of Duke Energy performing emergency maintenance on circuits affecting building CLT4 in North Carolina today. This building is where most of our network and hardware is housed.

No impact was expected, but we experienced an issue with one of the UPSes that caused a downstream impact within the data center. Two PDUs (A1/A4) had their main breakers trip during the cutover (and then the failback). Any colo client single-sourced to one of those PDUs, and our cabinets unlucky enough to have A side from one and B side from the other, were impacted.

We are back on utility power now, and we've put eyes on reported customers' cages/racks, where everything is up. Any remaining servers down should be reported to us by opening a new URGENT TICKET so that we can filter out the previously resolved issues.

We sincerely apologize for the impact, but we were at the mercy of the building power systems; the quick failover was not as seamless as expected, so we are investigating this and will run tests and resolve those issues accordingly.

At approximately 8:59AM EST 5/4/2022 our NOC became aware of multiple service issues in our Charlotte, NC datacenter. Initial investigations showed loss of link to multiple cabinets from our network distribution.

At approximately 9:05AM EST 5/4/2022 our NOC contacted datacenter site operations, who confirmed that there is a known datacenter-wide power issue, affecting multiple distribution PDUs

At approximately 9:14AM EST 5/4/2022 power was restored.

At approximately 11:13AM EST 5/4/2022 power was lost again, and restored at approximately 11:15AM EST.

We are currently working through all services and ensuring all servers boot successfully. If your services are currently down, please open a trouble ticket and we will ensure they come back up as soon as possible.

Dallas, TX Routing Issues (Resolved) High

Affecting Other - Networking / Psychz Routes

  • 03/02/2022 16:47 - 03/02/2022 17:54
  • Last Updated 03/02/2022 16:50

Dallas, TX - Update 6:49PM CST

We are experiencing latency and intermittent connections from isolated parts of the world coming via Psychz routes only. A few small subnets are affected. We are working with the upstream provider to route around the issue. There is no specific ETA, but it should be resolved any minute.

Staten Island (NY2) Migration (Resolved) Medium

Affecting Other - Staten Island Datacenter

  • 02/02/2022 23:30 - 02/07/2022 05:29
  • Last Updated 02/03/2022 01:23

-- UPDATE: 4:15am EST Feb 3rd, 2022 --

We continue to work on remaining servers.   There were a few issues with stuck rails and other complications but most of our infrastructure is back online.    We will continue to work until all servers are up and responding.    Your continued patience is much appreciated.

-- UPDATE: 3:00am EST Feb 3rd, 2022 --

The migration continues at this time.   We've successfully moved routing equipment and servers will begin powering back up shortly.    Please allow until at least 4am EST for all servers to be re-cabled and connectivity restored.

-- UPDATE: 12:00am (midnight) EST Feb 3rd, 2022 --

We received word that the team is a bit behind schedule, so the maintenance will start a bit later than expected and perhaps extend a bit later as well. We are updating the estimated window to 12:30am EST thru 3:00am EST based on current pace and progress.


--URGENT UPDATE--

We continue to plan maintenance the evening of Feb 2nd at 11:30pm EST.   Please note that the datacenter building has issued a maintenance window that extends from the evening of February 1st, 2022 thru the morning of February 4th, 2022.   While no other downtime is scheduled, you should keep in mind that there is a large physical migration at the building and other unexpected (but hopefully very brief) interruptions are not completely out of the question.    We thank you for your patience and understanding.

Dear valued customer,

Thank you for your continued business!

Please note we will be conducting scheduled maintenance in our Staten Island (NY2) datacenter.

During this time, full cabinets will be migrated to a new suite in the building to facilitate future growth.  We will be shutting down servers gracefully, moving them, and then starting them up in the new suite.

Start date/time: Feb 2nd, 2022 11:30PM EST
End date/time: Feb 3rd, 2022 1:30AM EST

You can either shut down your server shortly before 11:30PM EST, or we will shut it down gracefully via ctrl+alt+del where possible.

Your patience is appreciated and we very much value your business!  Thank you for your understanding.

NY2 Outage (Resolved) Critical

Affecting Other - Staten Island Datacenter

  • 01/31/2022 13:06 - 01/31/2022 15:03
  • Last Updated 01/31/2022 13:08

At approximately 12:49PM Pacific time our monitors alerted us to a network issue regarding our Staten Island (NY2) datacenter.

We are currently investigating the issue, and will post any updates as available. We appreciate your patience.

Scheduled Maintenance - Ashburn, VA (Resolved) Critical

Affecting Other - Ashburn, VA Network - Colo, Dedicated, VPS

  • 01/22/2022 12:30 - 01/24/2022 06:23
  • Last Updated 01/23/2022 08:44

UPDATE (1/23/2022 11:45AM EST):    We have moved all but 2 cabinets successfully and efficiently.   The final cabinets will be moved today, Sunday, Jan 23rd starting at 9PM EST.  

Dear Valued Customer,

 

As you may know, Tier.Net experienced rapid growth in Ashburn, VA during 2020 and 2021. We have been able to scale on demand up until now, but it is necessary to consolidate our space into Tier.Net's new area (our cage). We will be relocating cabinets one by one into our cage space starting Saturday, January 22nd and working thru at least Monday, January 24th. We are planning on a 1 hour maintenance window for each cabinet, but have allocated up to 4 hours for each in case of unexpected complications. We will try to make the moves during off-peak hours as much as is humanly possible, but some of the work will be done during daytime hours to accommodate the schedules of electricians or network specialists as necessary.

 

IMPACT: < 1 hour of downtime for VPS, dedicated, and colo clients currently residing in our original space in Ashburn, VA. In addition, all clients will see a ~15 minute outage (though 30 minutes is being allocated) as we move our networking cabinet, at approximately 9PM EST on Sunday, January 23rd, 2022.

 

RESULTS: A contiguous, consolidated datacenter space that will provide enhanced scalability, reliability, and ease of access for our onsite staff and clients.

 

Please note: If your server or colo space was provisioned within the last 60 days, you are likely already in our "new space" and will not be affected by this maintenance; you will only see the limited ~15 minute outage as we move our networking cabinet.

 

We will post further announcements to our server status page as they become available:

 

https://billing.tier.net/serverstatus.php

Connectivity issues - 216.173.11x.xxx (Resolved) Critical

Affecting Other - IP subnet

  • 01/02/2022 12:00 - 01/02/2022 12:41
  • Last Updated 01/02/2022 12:19

Hello,

We are aware of an ROA (Route Origin Authorization) issue with the range 216.173.112.0/21. This subnet's netblock owner made improper changes, and we are in the process of reversing them. We estimate resolution of this issue within 1 hour.
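
For background: an ROA binds a prefix to an authorized origin ASN (with a maximum prefix length), and routes that do not match any covering ROA's origin or length become RPKI-invalid, which validating networks may drop. A simplified sketch of that matching logic (the ASN and ROA values below are invented for illustration):

```python
# Simplified RPKI origin validation: a route is VALID if some ROA covers its
# prefix, permits its length (maxLength), and lists the same origin ASN.
# The ROA and ASNs below are invented for illustration.
import ipaddress
from dataclasses import dataclass

@dataclass
class Roa:
    prefix: ipaddress.IPv4Network
    max_length: int
    origin_asn: int

def validate(route: ipaddress.IPv4Network, origin: int, roas: list) -> str:
    covering = [r for r in roas if route.subnet_of(r.prefix)]
    if not covering:
        return "not-found"  # no ROA covers this route at all
    for roa in covering:
        if route.prefixlen <= roa.max_length and origin == roa.origin_asn:
            return "valid"
    return "invalid"        # covered, but wrong origin ASN or too-long prefix

roas = [Roa(ipaddress.ip_network("216.173.112.0/21"), 21, 64200)]
print(validate(ipaddress.ip_network("216.173.112.0/21"), 64200, roas))  # valid
print(validate(ipaddress.ip_network("216.173.112.0/21"), 64999, roas))  # invalid
print(validate(ipaddress.ip_network("216.173.112.0/24"), 64200, roas))  # invalid
```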

Charlotte, NC Packetloss (Resolved) Critical

Affecting Other - Network

  • 12/02/2021 07:55 - 12/03/2021 08:33
  • Last Updated 12/02/2021 07:57

We are currently investigating a DDOS attack and a possible complication with our anti-DDOS gear. ETA < 30 min.

North Carolina Network Issues (Resolved) Critical

Affecting Other - Network

  • 10/28/2021 00:00 - 10/27/2021 22:28
  • Last Updated 10/27/2021 22:14

We are currently investigating what appears to be a large (spoofed) DDOS attack in North Carolina.  It is currently causing packetloss to several segments of our network in North Carolina.   The issue is actively being worked on and should be resolved momentarily.

Network issue with 198.37.xxx range (Resolved) Critical

Affecting Other - 198.37.xxx range

  • 10/15/2021 06:15 - 10/15/2021 04:25
  • Last Updated 10/15/2021 03:44

Dear Customer,

As of 6:15AM EST, we started getting monitor alerts for IPs in the 198.37.xxx range. We recently created ROAs (Route Origin Authorizations) for these ranges after a netblock ownership change, and it seems to have gone awry. We are working on correcting this, and all services should be restored once the fixes propagate.

ETA 30min - 1hour

Bend Oregon - Lumen Fiber Cut (Resolved) Critical

Affecting Other - Bend, Oregon Datacenter

  • 09/17/2021 12:27 - 09/17/2021 14:39
  • Last Updated 09/17/2021 14:38

At 12:27PM PST the NOC observed anomalies in our Bend, Oregon datacenter. Upon investigation it appears that one of our primary transit links, to Lumen/CenturyLink/Level3, is flapping; as such, we are engaging the Lumen NOC team.

Update: At 12:55PM PST the Lumen NOC team confirmed a partial fiber cut in the area affecting our circuits, causing the bounces. They have advised that additional bounces and/or full outages are possible. As such, we have temporarily removed Lumen from our network mix until the issue is resolved. Please note that you may see packet loss during this time. Once Lumen has confirmed that the fiber cut is fully repaired, we will update this network advisory.

Update: At 2:39PM PST the Lumen NOC team confirmed that the partial fiber cut has been repaired. As such, we are closing this network event.

North Carolina: Network Issue (Resolved) Critical

Affecting Other - North Carolina

  • 08/30/2021 10:30
  • Last Updated 08/30/2021 10:34

North Carolina: We are currently investigating an internal network anomaly that is causing packetloss.  There is no ETA at this time but we have all hands on deck investigating and working towards a resolution, hopefully any minute.

 

As of 1:05pm EST, we resolved what turned out to be a spoofed outbound attack originating from inside our own network. The magnitude of the bandwidth should not have affected our network, but the packet rate and the type of packets sent (payload) managed to overload some of our devices and cause packetloss and misbehavior. We've identified the source of the attack vector and are working with Cisco and our device manufacturers to harden our settings to permanently protect against this sort of abuse. We sincerely apologize for the problems this caused.
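
To illustrate the detection problem described above: traffic can be abusive by packet rate (pps) even when its bit rate looks benign. A toy sketch of flagging high-pps sources from flow records (the records and threshold here are synthetic):

```python
# Toy sketch: flag sources whose packet rate (pps) is excessive even when the
# bit rate looks benign. Flow records and the limit are synthetic examples.
from collections import defaultdict

def find_pps_offenders(flows, pps_limit=50_000, window_s=1.0):
    """flows: iterable of (timestamp, src_ip, packet_count) tuples."""
    buckets = defaultdict(int)  # (src_ip, window index) -> packets in window
    for ts, src, pkts in flows:
        buckets[(src, int(ts / window_s))] += pkts
    return sorted({src for (src, win), count in buckets.items()
                   if count > pps_limit})

sample = [(0.1, "10.0.0.5", 30_000), (0.6, "10.0.0.5", 40_000),
          (0.4, "10.0.0.9", 2_000)]
print(find_pps_offenders(sample))  # ['10.0.0.5'] exceeds 50k packets in window 0
```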

Scheduled Maintenance - Charlotte, NC (Resolved) Medium

Affecting Other - Distribution Switching

  • 05/14/2021 19:00 - 05/14/2021 16:28
  • Last Updated 05/13/2021 12:58

Dear Valued Client,

 

This notice is to inform you of upcoming scheduled maintenance in our Charlotte, NC facility on Friday, May 14th at 7pm EST. We intend to fail over from temporary backup distribution switches to our primary (redundant) cluster. The goal is to restore full redundancy to our switching infrastructure.

 

Background: On Saturday, May 8th, we experienced cascading PSU failures in our primary distribution switch cluster. This caused a complete outage, and we ultimately failed over to backup/standby switches. These switches have been in use since then, but they are not suitable for continued production use due to a lack of redundancy.

 

Scope of work: We will be replacing the 2 failed PSUs (power supplies) with 4 new, tested, and fully redundant PSUs. Our distribution switch cluster will effectively have twice the power redundancy it had when it experienced the multiple failures.

 

Timing / window: Starting at 7pm EST, we are scheduling a 30 minute window for the work, but actual downtime is not expected to exceed 5-7 minutes as cabling is switched from the failover switches back to the primary cluster. Both switch sets will be operating normally before we proceed, so there should be little chance of unexpected issues or extended downtime.

 

As always, your patience and understanding is appreciated.   We look forward to serving you for many more years with incredible uptime in Charlotte, NC.   

North Carolina Network Issues (Resolved) Critical

Affecting Other - Networking / distribution switches

  • 05/08/2021 06:20
  • Last Updated 05/08/2021 09:44

Update 12:40PM EST:   Confirmed all network connectivity is restored.

Update 12:21PM EST:   Connectivity is mostly restored.   The remaining disconnected cabinets will be back online momentarily.

Update 12:00PM EST: We are currently transitioning all cabling to our backup distribution switches. Our initial ETA holds true and we should have all services restored by 2PM EST.

Update: 9:10am EST. We are preparing to replace the distribution switches. A few last-ditch efforts are being worked in parallel to determine if the existing cluster will come back to life. The emergency maintenance window will extend through 2pm EST, though we hope to have services restored before then.

Update: 7:45am EST. The distribution switch cluster appears to show multiple failed PSUs, with red lights and an otherwise totally dark chassis. We are deciding whether to replace the entire redundant cluster with a backup distribution switch or fix it in place. ETA is currently unknown.

Update:  7:09am EST.   Confirmed issues with distribution switches.   We are working on a resolution.   Your continued patience is appreciated.


As of 6:20am EST on Saturday, May 8th, we are experiencing a datacenter-wide outage in North Carolina. We suspect an issue with our distribution switches and are currently investigating, but have no ETA. Hopefully services will be restored shortly. We appreciate your patience.

Dallas routing issues (Resolved) Critical

Affecting Other - Networking

  • 05/02/2021 11:35 - 05/02/2021 12:00
  • Last Updated 05/02/2021 09:11

Dear Customer,

We are very sorry for the interruption today. An issue with our upstream transit (Psychz) caused us to attempt failover, but there were apparently multiple issues happening simultaneously. Our routing failed back over, and the issues seemed to be resolved on our upstream's end at that time, but we are still waiting to hear what caused the problem in the first place.


On Wednesday, May 5th, Tier.Net will be switching to an upgraded router cluster (as we notified clients about last week). We also have plans to completely eliminate Psychz DDOS protection and bandwidth, since it has caused each and every network blip and issue we've experienced. We will be switching over to a new premium bandwidth mix and DDOS protection next week as well. Based on the changes we are making, we will enjoy not only rock solid stability, but also the ability to increase capacity seamlessly as we continue to grow. No expense has been spared to completely eliminate this type of issue once and for all. We sincerely apologize; we value your business, and we understand downtime is unacceptable. Our ongoing efforts to eliminate the use of the upstream providers that have failed us are in their last stage. Thank you for bearing with us!

Dallas TX Network Maintenance (Resolved) Medium

Affecting Other - Dallas, TX Datacenter

  • 05/05/2021 08:30 - 05/08/2021 08:41
  • Last Updated 04/28/2021 17:59

Dear Valued Clients,

Thank you for your continued business! Please note we will be performing service impacting maintenance in the Dallas, TX datacenter.

Services affected: Dallas, TX Datacenter
Date: Wednesday, May 5th, 2021
Start time: 8:30AM CDT (GMT -6)
End time: 9AM CDT (GMT-6)
Outage Duration: Approximately 5 minutes, though 30 minutes is being allocated
Scope of work: We will be performing the final cutover to our new core router cluster. This new router cluster will significantly bolster our network performance, capacity, and stability in our Dallas, TX POP. This is the second-to-last step of our complete network overhaul, undertaken in order to deliver a more stable product. The new router cluster will be capable of 32Tbps of capacity, allowing seamless upgrades as needed in the future.

If you have any questions, please do not hesitate to reach out to us simply by opening a support ticket in our portal at billing.tier.net

Dallas outage (Resolved) Critical

Affecting Other - Dallas

  • 03/10/2021 10:33
  • Last Updated 03/10/2021 10:52

Dear Customer,

We were given reports of CPU usage spikes on upstream routing devices.   We manually failed over to a redundant backup transit provider.   Some small subnets are still affected.   Most users should be up and running as normal.   If your services are still affected, they should be coming back online shortly as well.

Our sincere apologies for the problems this caused.

Dallas TX Power Issues (Resolved) Critical

Affecting Other - Carrier-1, Dallas TX

  • 01/29/2021 13:22 - 01/30/2021 09:14
  • Last Updated 01/29/2021 16:08

Our NOC became aware of a network issue in our Dallas, TX datacenter at 1:14PM PST. We are investigating the issue and all updates will be posted here.

UPDATE: 1:49PM PST. At this time we are beginning to see services restored. Initial reports are showing a potential power issue. We are awaiting an update from DC Ops for the RCA.

UPDATE: 2:03PM PST. This is a confirmed power outage affecting both A and B power legs in the DK row at Carrier-1 which is where our networking cabinet is located. Most services have been restored and we are working through the additional issues remaining in the DK row.

UPDATE: 3:49PM PST. Power has been restored to all cabinets. If you still have a server down, please open a support ticket so we can investigate.

DDOS Attack - Carrier 1/Psychz Dallas, TX (Resolved) Critical

Affecting Other - Network

  • 12/25/2020 03:15 - 12/25/2020 05:16
  • Last Updated 12/25/2020 03:19

At approximately 3:15AM CST, we experienced a particularly vicious DDOS attack. This never-before-seen attack targeted random IPs on random subnets and was obviously designed simply to disrupt our entire network. It was over 150Gbps and seemingly completely random, which increased the difficulty of isolating the patterns used to overwhelm our devices. Network engineers immediately began work to mitigate the attack, and we succeeded as of approximately 5:15am CST. We apologize for the problems that this disruption caused, and we are ensuring that our mitigation methods are updated to fully protect our network against future similar attacks.

 

Happy Holidays and Merry Xmas
-Tier.Net

Dallas TX Network Issues (Resolved) Critical

Affecting Other - Carrier-1, Dallas TX

  • 12/16/2020 09:09 - 12/16/2020 10:34
  • Last Updated 12/16/2020 10:34

At approximately 9:09AM PST our NOC began receiving monitor alerts for our Dallas, TX location. We immediately began investigating. It became apparent that our primary transit was down, which initially appears to have been caused by a very large and sophisticated DDOS. Unfortunately, the BGP session with that provider stayed up, meaning that our routers did not automatically fail over to our backup transit providers.
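
The failure mode described here - a transit provider black-holing traffic while its BGP session stays established - is generally caught by active path probing rather than by session state alone. A hedged sketch of that idea (the probe targets and failover hook are placeholders, not our actual tooling):

```python
# Sketch: a BGP session can stay "up" while the provider drops traffic, so we
# actively probe well-known targets routed via the provider and flag sustained
# loss. Targets and the failover action below are placeholders.
import subprocess

PROBE_TARGETS = ["192.0.2.1", "198.51.100.1", "203.0.113.1"]  # documentation IPs
LOSS_THRESHOLD = 0.8  # fraction of unreachable targets that triggers failover

def reachable(ip: str) -> bool:
    """Single ICMP probe; assumes a Linux-style ping with -c/-W flags."""
    return subprocess.run(["ping", "-c", "1", "-W", "2", ip],
                          capture_output=True).returncode == 0

def check_provider(name: str) -> None:
    loss = sum(not reachable(ip) for ip in PROBE_TARGETS) / len(PROBE_TARGETS)
    if loss >= LOSS_THRESHOLD:
        print(f"{name}: {loss:.0%} probe loss -> fail out of this provider")
        # Placeholder hook: deprioritize or withdraw routes via this provider,
        # e.g. by lowering local-preference or shutting the BGP session.
    else:
        print(f"{name}: healthy ({loss:.0%} probe loss)")

check_provider("primary-transit")
```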

 

At approximately 9:24AM PST, after determining the issue to be a transit provider problem, we manually failed out our primary transit provider, and services began returning to normal. A few small ranges are still experiencing packetloss or complete route failure.

 

At approximately 10:33AM PST, our primary transit provider gave the "all clear" and services were fully restored.

 

If you are still experiencing any service impacting issues, please contact us via ticket.

Network Issue (Resolved) Critical

Affecting Other - CenturyLink

  • 08/30/2020 05:09 - 08/30/2020 09:58
  • Last Updated 08/30/2020 05:10

At approximately 6:10AM EST, we started getting reports of inaccessible servers from clients in some specific regions around the world. After a thorough investigation, we discovered that the issue is not related to our own networking. Rather, it seems CenturyLink, a major North American ISP and telco provider, is having some sort of major issue. (Link: https://downdetector.com/status/centurylink/) This is causing sporadic inaccessibility, packetloss, and latency on many routes to many locations in the USA.

As CenturyLink is one of our direct transit providers, we have contacted them, but they are seemingly overwhelmed by the widespread issues and have not been able to promptly reply with specific information. Through our own connections, we have heard that they expect to have all issues sorted by 11:30am EST. We have done our best to route around their networks as much as possible; since they are a major ISP in the USA, this proves impossible from some locations. If your route to us passes through CenturyLink, you will likely experience intermittent connectivity until this is fully resolved on their end. Meanwhile, our network and all systems are 100% operational at this time. We will continue to monitor this situation and hope that CenturyLink resolves their issues ASAP. Please do not hesitate to contact us with any questions.

 

Bend, OR (Cascade Divide) Network Maintenance (Resolved) Medium

Affecting Other - Cascade Divide Datacenter

  • 08/12/2020 20:00 - 08/13/2020 11:09
  • Last Updated 08/10/2020 15:41

Beginning 8/12/2020 at 8PM PST we will be upgrading our core network infrastructure to provide better redundancy and speed, and to allow for greater future growth.

Services affected: Cascade Divide, Bend OR datacenter
Date: August 12th, 2020
Start time: 8PM PDT (GMT -8)
End time: 9:30PM PDT (GMT-8)
Outage Duration: Approximately 15 minutes, though 30 minutes is being allocated

Dallas Network Maintenance 2/28/2020 10:00 CST (Resolved) Medium

Affecting System - Dallas, TX Datacenter

  • 02/28/2020 10:00 - 04/28/2020 11:55
  • Last Updated 02/14/2020 13:53

Location: Carrier-1/Psychz, Dallas TX

Time: Feb 28 10AM CST - 2PM CST

Details: Tier.Net will be performing the final stage of our multi-piece network upgrade in our Dallas, TX datacenter. During this time, customers will see one brief (1-5 minute) outage per IP subnet, as subnets are migrated from the old router to our new HA router cluster. In-house Tier.net Network Engineers will be on-site to ensure a smooth transition.

This maintenance marks the final step in our successful transition to a 100% Tier.Net-owned layer 3 network, which will soon feature Tier.Net's own BGP bandwidth blend. We have previously added additional redundancies, completed a successful migration to our wholly owned distribution switches, and much more. The result of this hard work has been a substantial increase in redundancy and a wonderful uptime history in 2019 and 2020. Rest assured that our own in-house staff will be extremely careful to minimize any impact to your service. The time slot was chosen based on the availability of engineers at the facility and upstream transit providers.

Charlotte, NC Network Issues (Resolved) Critical
  • 01/03/2020 22:26 - 01/04/2020 00:08
  • Last Updated 01/04/2020 00:08

Dear Customer,

We are currently seeing issues with some IP ranges in our Charlotte, NC location.   Network engineers are working on this issue right now.  There is no ETA but we are hoping the issue will be resolved any minute.    This issue appears to affect only certain IPs/ranges. 

UPDATE (1:37am):   Gateways are pinging and we are expecting routing issues to be solved any minute.

UPDATE (2:28am):  Partial service restoration - we expect full resolution shortly.

UPDATE (2:49am): Most services restored. We are learning that the cause of this was poorly communicated upstream network maintenance. We are working with the provider to resolve this issue.

 

UPDATE (2:50am):   Incident fully resolved.   The issue was indeed a result of PLANNED network maintenance by our provider, who did not properly communicate ahead of time.   We are taking action to prevent this from occurring again in the future.   Our apologies for the problems this caused.   We will hold our upstream provider accountable.      Please contact us for any more details or for SLA requests.


Bend, OR packetloss (Resolved) Critical

Affecting Other - Cascade Datacenter

  • 12/13/2019 12:00 - 12/13/2019 12:27
  • Last Updated 12/13/2019 12:32

A DDOS attack has been mitigated in Bend, OR. The attack resulted in a 15-25 minute (depending on your network segment) period of network packetloss. The incident is fully resolved, and measures are in place to prevent it from recurring.

Router Upgrades + Addition of Dallas, TX Distribution Switch (Resolved) Medium

Affecting Other - Dallas, TX - Psychz

  • 09/19/2019 11:00 - 09/20/2019 09:48
  • Last Updated 09/18/2019 09:07

Dear Valued Client,
 
We are pleased to receive feedback that most of you have noticed improvements in network stability within the Dallas, TX (Carrier 1 / Psychz) facility during the past few months.   In our ongoing quest for maximum redundancy and reliability, we are also happy to share our next set of plans and goals.  

On Wednesday, September 18th at 5am CST (Dallas local time), Psychz has scheduled a software upgrade of its routers. This should cause a brief interruption as reboots are performed. According to Psychz, this should eliminate issues affecting performance and stability, and enable features that will expand overall throughput. The maintenance window is 1 hour, but connectivity loss should be brief. Meanwhile, we have our own plans to eliminate Psychz routing entirely in the future (other than for redundancy purposes). Please read below.

On Thursday, September 19th, our experts will be bringing additional networking equipment online, including a Cisco Nexus 7000 distribution switch. The scheduled maintenance window is from 10:30am CST thru 11pm CST, but there should be MINIMAL service impact (under 30 seconds of packetloss) at the end of this window. Our own onsite, in-house staff will be performing this maintenance, in coordination with Carrier-1 and Psychz. Devices will be fully tested before going into production.
 
This new redundant Cisco device will provide several improvements:
 
1) Networking between Tier.Net cabinets will now only pass through Tier.Net's wholly owned networking gear.   We will be eliminating switches and network segments that we are not in direct control of.
 
2) We will be able to identify broadcast storms and other outbound forms of abuse more quickly. Some of the previous incidents were caused by bad actors attempting to sabotage the Carrier 1 facility's network from the inside. While this won't make us immune to this type of abuse, we will have greater control over it. We have put other safeguards in place as well to eliminate this threat.
 
3) It paves the path for our final step in Dallas Carrier-1, which will be to switch all layer 3 networking over to 100% Tier.Net-owned devices on Tier.Net's own ASN (autonomous system number). The existing bandwidth mix will remain as a BGP failover/redundancy option, and Tier.Net will be providing its own primary BGP mix. On-premise DDOS protection will not be affected in any way, nor will there be any other negative functional impacts whatsoever.
 
Finally, you should expect exciting news in the coming weeks regarding Tier.Net's 2nd point of presence in Dallas, featuring QuadraNet DDOS protection along with Tier.Net's own routing and bandwidth mix. We will send updates as soon as possible. Thank you for your continued business!

Emergency Maintenance 5PM PDT Bend, OR Datacenter (Resolved) Medium

Affecting Other - Cascade Divide Datacenter

  • 09/11/2019 17:00 - 09/11/2019 18:12
  • Last Updated 09/11/2019 12:33

Dear valued clients,

Thank you for your continued business! This is an emergency service impacting maintenance notification:

Services affected: Cascade Divide, Bend OR datacenter
Date: September 11th, 2019
Start time: 5PM PDT (GMT -8)
Duration: Approximately 15 minutes, though 30 minutes is being allocated
Scope: Replacing a faulty line card that appears to have unexpectedly rebooted this morning, out of an abundance of caution

We appreciate your cooperation and continued business.

Dallas, TX Network Maintenance 7/17/2019 (Resolved) Critical

Affecting Other - Carrier-1 / Psychz Dallas

  • 07/17/2019 04:00 - 07/17/2019 08:27
  • Last Updated 07/17/2019 08:27

Update (9:45AM CST) - The network maintenance has been completed but many clients are still seeing spotty connectivity due to what appears to be broadcast storms.    Engineers are working to eliminate this.  There is no solid ETA but all hands are on the task and it will be resolved as soon as humanly possible.

Update (10:25AM CST) - All broadcast storms have been mopped up, and all service has returned to normal. If you are still having any issues, please open a ticket.

Dear valued clients,

Thank you for your continued business! This is a notification of an upcoming network maintenance that has potential to cause a very brief network outage. Our network engineers will be upgrading multiple aggregation points, reworking some of the network topology and upgrading network software for a more robust and stable overall network.

Details -

Date: Wednesday, July 17, 2019

Start Time: 4:00AM CST

Work Window: 4AM CST - 9:00AM CST

Outage Estimated Duration: 5-10 Minutes

Effect: Brief Outage

Facilities: Dallas

If you have any questions about this maintenance or want more information on how it will relate to your specific services with us, please feel free to open a ticket at billing.tier.net

June 25th/26th 2019 Segra Datacenters Maintenance (Resolved) Medium

Affecting Other - Segra Datacenters (DC74, Charlotte NC)

  • 06/26/2019 00:01 - 07/16/2019 10:39
  • Last Updated 06/18/2019 17:06

Dear Valued Customer,

The SEGRA Data Center network team has been planning network upgrades at our Charlotte, NC data centers - CLT1, CLT2 and CLT4 - for the past few months. During each night of the maintenance window, we will be performing router reboots to complete these upgrades. Each reboot is expected to last only 5-10 minutes.

These reboots will allow us to provide additional services at the data center, and we do not anticipate any outage beyond the standard reboot time of each router itself.

If you have any questions or concerns, please contact H4Y Customer Care by ticket at billing.tier.net and we will be happy to assist you further.

Dallas, TX issues - Psychz/Carrier 1 (Resolved) Critical

Affecting Other - AS40676 Routing

  • 04/28/2019 02:54 - 04/28/2019 03:45
  • Last Updated 04/28/2019 01:54

At approximately 2:54am CST, we received alerts that the network in Dallas, TX was offline.    A Psychz core router had gone offline in the facility.   Network engineers immediately began investigating and were able to bring the router back online by 3:45am CST.    We continue to await the full details on the cause and ultimate resolution.   

Rest assured that Tier.Net is sparing no expense to establish failover outside of Psychz's routing equipment in Dallas. We understand the negative impact that these repeated network events can have on your business. Our efforts include establishing routing via our own ASN, an additional non-Psychz POP in Dallas, and migration away from the LACNIC ranges that previously caused issues. These efforts are still underway, but we are making steady progress and hope to provide more updates very soon. Your patience and understanding are much appreciated!

Dallas, TX issues - Psychz/Carrier-1 3/18/2019 (Resolved) Critical

Affecting Other - Dallas, TX

  • 03/18/2019 07:30 - 03/18/2019 08:45
  • Last Updated 03/18/2019 13:18

Dear Valued Customer,

At approximately 7:30am CST on 3/18/2019, our connectivity in Dallas, TX was suddenly disrupted. The datacenter's initial assessment was that a widespread power outage/battery backup failure had crashed both core routers. OOB (out of band) failures complicated the issue. The issues were resolved after software fixes were implemented at approximately 8:45am CST. The datacenter is working on a full report and a detailed plan to eliminate the possibility of this failure ever repeating.

However, at Tier.Net, we understand the incredibly negative impact this recent network instability has had on our clients. We share your frustration and sincerely apologize for the inconveniences you have experienced. Although we have been assured that the Psychz/Carrier-1 facility issues have been resolved, we understand the many well-justified concerns. We are working on solutions that entirely bypass the causes behind each of these issues. The sole common denominator is the routing equipment at the facility itself; not only Tier.Net but many other providers at this same facility have been impacted. As this equipment is upstream from ours, we must rely on a 3rd party, and in no uncertain terms, that 3rd party has been failing us (and our clients) recently. At Tier.Net, we understand the importance of 100% SLA-backed uptime, and we have been working on every possible solution to this recent instability.

Actions we HAVE taken:

1) Earlier this year, we implemented LACP routing redundancy directly to each cabinet in Dallas. This means that two Juniper routers are utilized at all times; it takes a failure of BOTH routers to cause a loss of connectivity (see the illustrative configuration sketch after this list). Unfortunately, though this brought a long period of stability, the recent issues prove that it was not enough to provide adequate protection against these types of failures.

2) Prior to #1, the datacenter also replaced core routers and all line cards entirely.   Unfortunately, this does not provide 100% protection against software or power failures, despite each of these systems being redundant.

3) Due to new LACNIC policies, LACNIC IPs have been moved to our own routing.   Unfortunately, at this moment we still rely on the core routers at the facility.   This has been the source of each recent issue.
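As promised above, here is a rough sketch of what the cabinet-facing side of such a setup looks like in Junos. The interface names and address are hypothetical, and note that bundling member links toward two physically separate routers additionally requires a multi-chassis mechanism (e.g. MC-LAG) on the router side, which is omitted here.

    chassis {
        aggregated-devices {
            ethernet {
                device-count 1;              /* enable one ae bundle */
            }
        }
    }
    interfaces {
        ge-0/0/0 {
            gigether-options {
                802.3ad ae0;                 /* member link toward router A */
            }
        }
        ge-0/0/1 {
            gigether-options {
                802.3ad ae0;                 /* member link toward router B */
            }
        }
        ae0 {
            aggregated-ether-options {
                lacp {
                    active;                  /* actively negotiate LACP */
                }
            }
            unit 0 {
                family inet {
                    address 198.51.100.2/24; /* hypothetical uplink address */
                }
            }
        }
    }

With both member links up, traffic is shared across the routers; if either link or router fails, LACP drops the dead member and traffic continues over the survivor.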

Actions we ARE taking:

1) Tier.Net has its own ASN and is moving towards completely replacing all local routing that is currently upstream of our equipment. None of the problems we have had lately have been the result of any direct failure by Tier.Net or any of Tier.Net's wholly owned equipment. Tier.Net wholly owns all rack-level networking equipment and hardware; all failures have occurred upstream of this. It is clear that we need to take control from endpoint to endpoint.

2) Tier.Net is establishing a new presence at a nearby Dallas facility, and details will be provided shortly. Once our ASN is established and routing is moved to our own equipment, this facility will serve as an additional POP for the Carrier-1 location; it will also give clients who have lost confidence in the current facility an alternative physical location in Dallas with similar DDOS protection. We will provide free migration assistance for anyone who chooses to migrate. To be clear, we will NOT force any migration; we will simply provide the option while also using the new site to add resilience to the existing Carrier-1 location.

3) Those with LACNIC IPs are also encouraged to switch to the new ARIN IPs we have announced. We will gladly assign a 2nd set of IPs, at no additional charge, for you to use for as long as you need while migrating away from the LACNIC ranges. The LACNIC ranges start with 181.xxx or 191.xxx. Please contact us for more details.

Tier.Net is working on the above items at the highest priority and is in ongoing communication with the existing facility to ensure that this never happens again. We sincerely appreciate your understanding of the situation and promise to be in contact with updates again soon.

Sincerely,
Your Tier.Net Staff

Dallas Network Outage (8/10/2018) (Resolved) High

Affecting System - Carrier-1

  • 08/10/2018 14:50 - 08/10/2018 17:37
  • Last Updated 12/18/2018 16:38

At approximately 4:50pm CST, we experienced a catastrophic failure of our primary Juniper router at the Carrier-1 facility in Dallas, TX. It was immediately obvious that routing did not fail over properly as designed, so engineers power cycled the primary router and began identifying and correcting what seemed to be a buffer overflow that caused it to freeze. Software changes were made that should permanently correct the issue, and a line card was replaced as an added precaution. Engineers are still testing and verifying at this time, but we do not expect additional downtime, though there is a small risk of latency or packet loss in the coming hours. Juniper's emergency response team was also contacted within minutes to assist; the facility has the highest level contract with Juniper JTAC for support and hardware replacement.

By approximately 5:37pm CST, the primary router was back online and all identified issues with redundancy were also resolved. However, we are now planning additional VRRP (Virtual Router Redundancy Protocol) conversions within our portion of the network at Carrier-1 in Dallas. This will occur in the upcoming week or two, and again, no additional downtime is expected. With VRRP in each of our cabinets, a failure of the primary router (software, hardware, or human error) should no longer cause this type of outage. We will spare no expense and leave no stone unturned to ensure that this incident, and nothing like it, can happen again. We greatly value your business, and downtime is unacceptable to everyone. Rest assured that our efforts to PERMANENTLY resolve this will be successful. Please contact us if you have any further questions or concerns.
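For those curious how VRRP achieves this, the following is a minimal illustrative sketch (hypothetical interfaces and addressing, not our production configuration): two routers share one virtual gateway address, and if the master stops sending VRRP advertisements, the backup takes over that address within seconds.

    interfaces {
        ge-0/0/0 {
            unit 0 {
                family inet {
                    address 198.51.100.2/24 {             /* router A's own address */
                        vrrp-group 10 {
                            virtual-address 198.51.100.1; /* shared gateway used by servers */
                            priority 200;                 /* higher priority wins mastership */
                            preempt;                      /* reclaim mastership after recovery */
                            accept-data;                  /* answer traffic sent to the virtual IP */
                        }
                    }
                }
            }
        }
    }

Router B carries the mirror-image configuration with its own address (e.g. 198.51.100.3/24) and a lower priority such as 100; servers simply point their default gateway at the shared virtual address.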

Dallas, TX (US Central) Network Maintenance (Resolved) Medium

Affecting System - Dallas TX datacenter

  • 10/20/2018 01:00 - 10/24/2018 10:03
  • Last Updated 10/19/2018 16:54

Hello Psychz Networks - Dallas, TX (US Central) Customers,

Planned Time: 1AM PDT - 6AM PDT, October 20, 2018
Backup Planned Time: 1AM PDT - 6AM PDT, October 21, 2018
Work Window: 5 Hours
Outage Estimated Duration: 5-20 minutes

Best practices, including proper reversion procedures, will be used to avoid any downtime.

Reason - Maintenance on a core router (MX480), as well as restoring core services to their fully redundant mode.

If you have any questions about this maintenance or want more information on how it will relate to your specific services with us, please feel free to contact us.

Emergency Maintenance: PDU Replacement (Resolved) Critical

Affecting System - North Carolina Rack 105 PDU 2

  • 01/19/2018 18:00 - 01/19/2018 16:16
  • Last Updated 01/19/2018 13:15

Dear Valued Clients,

During a routine walkthrough, we noticed that PDU 2 of Rack 105 in our Lumos Datacenters (DC74), Charlotte NC facility appeared to be malfunctioning. We have scheduled an emergency maintenance beginning 1/19/2018 at 6PM EST. At that time we will gracefully shut down servers that are fed by a single power supply on PDU 2. Dual power supply servers and servers on PDU 1 will not be affected. We will attempt to have all servers back up as soon as possible; we estimate the downtime to be approximately 30 minutes or less.

We sincerely apologize for the inconvenience. If you have any questions, please do not hesitate to contact us.

Service Impacting Emergency Maintenance: 6PM (Resolved) Medium

Affecting System - Dallas Rack BI04

  • 12/07/2017 18:00 - 01/19/2018 13:15
  • Last Updated 12/06/2017 15:48

Thank you for your continued business. This is a service impacting emergency maintenance notification for December 7th, 2017, beginning at 6PM EST. Cisco has identified a bug in the switching platform we use that requires an immediate upgrade. The upgrade window is scheduled for 1.5 hours; however, downtime should be far less than that. Rest assured that every precaution has been taken: onsite technicians will be standing by and spare switches are immediately available.

Once again, thank you for your continued business, feel free to contact us if you have any questions.

Service Affecting Maintenance Sat, November 5 (Resolved) Medium

Affecting System - Cascade Divide

  • 11/05/2016 18:00
  • Last Updated 11/06/2016 08:19

UPDATE 7:34AM PST - All servers up. If you are still experiencing any issues, please file a ticket. Welcome to Bend, OR!

UPDATE 4:32AM PST - 50% of servers are networked, cabled, and powered up. We continue to work on the remainder. All networking and routing have been tested and given the OK.

UPDATE 1:33AM PST - All equipment has arrived at the Bend facility.   Staff is unpacking and racking the equipment into our cabinets as planned.  Switches will be powered up first followed by servers. 

UPDATE 10:03PM PST - IP routing has been tested and confirmed except for a few stragglers. Equipment is still en route, and more updates will follow shortly.

UPDATE 7:38PM PST - All servers have been cleanly shut down.   We are now safely transporting them to the Bend facility.   So far so good!   Please check back for updates.


Dear Valued Clients,

This notice is to inform you that we will be relocating equipment from our Cascade Divide Roseburg, OR location to the Cascade Divide Bend, OR facility overnight on Saturday, November 5th, beginning at 6pm PST. This relocation is necessary due to our requirements for a larger and more redundant facility. The new facility features more transit providers, additional redundancy, more space, and larger capacity in general. This relocation will indeed cause a service interruption on the night of November 5th, but it will be kept as minimal as humanly possible. There will be NO IP space changes or rack-level networking changes, and the new facility is in the same geographic region and state. We chose the most off-peak time, while allowing for unforeseen conditions, so the maintenance completes with plenty of time to spare before the following morning. Your data, IPs, and configurations are NOT at risk. There will be no functional difference to your service, though you can look forward to enhanced redundancy and reliability, plus pricing benefits for bandwidth and much more in the future. Updates will be posted as we go at https://billing.tier.net/serverstatus.php

Details:

If you have a dedicated or colo server, we suggest shutting it down (halting it) before November 5th at 6PM PST. If that is not possible, our staff will begin powering down equipment at that time. We will attempt to shut down all servers cleanly in all cases (using CTRL+ALT+DEL where possible, or by logging in and halting). We will also cleanly shut down all VPS accounts and shared servers. All servers will be booted up and checked once they arrive at the Bend, OR facility.

The physical relocation will begin by 7pm PST. We have at least one staff member assisting PER CABINET, so the equipment will be loaded and relocated as quickly as possible. We expect that equipment will be at the Bend facility and powering up within 4 hours. The maintenance period will extend to 3AM PST on November 6th to account for any unforeseen issues. Please check our network status page for updates, and be aware that we will be busy with phone calls and support requests the entire night. If possible, refrain from contacting us for status updates until we announce that servers are racked, cabled, and should be powered back up.

Your business is appreciated! Please contact us if you have any questions, concerns, or special instructions for us during this relocation. Our goal is to ensure that clients waking up Sunday morning can simply return to business as usual. We will post status updates as they become available on the night of November 5th. We will also post the Bend, OR datasheet and info to our site within the coming days. Thank you for choosing us!

**EMERGENCY** Scheduled Network Maintenance (Resolved) Medium

Affecting System - Peer1 LA

  • 07/22/2016 00:00 - 08/09/2016 11:12
  • Last Updated 07/21/2016 18:56

Dear Customer,

Please be advised that we will be performing scheduled network maintenance in our Los Angeles (West 7th Street) facility during the following date and time:

From: July 22, 2016 - 00:00 PDT (July 22, 07:00 UTC)

To:   July 22, 2016 - 02:00 PDT (July 22, 09:00 UTC)

The window will occur on Friday, July 22nd from 00:00 - 02:00 PDT. During this timeframe, network engineers will reboot a virtual switch chassis. Due to the nature of the maintenance, downtime of around 10 minutes is expected during this window, and services will be affected.

This work will be SERVICE IMPACTING. The appropriate staff will be present for the entire duration of this maintenance window.

Cascade Divide Datacenter Outage 7/18 (Resolved) Critical

Affecting Other - Cascade Divide Datacenter

  • 07/18/2016 15:01 - 07/18/2016 20:38
  • Last Updated 07/18/2016 15:52

07/18/2016 15:01 PST: We are currently investigating an outage in our Cascade Divide, Roseburg OR datacenter. Datacenter staff is aware of the outage. Updates to follow.

7/18/2016 15:49 PST: We have been advised by the datacenter that their primary router has failed and that automatic failover to the backup router also failed. A manual failover is currently taking place.

Cascade Divide Datacenter Outage (Resolved) Critical

Affecting System - Cascade Divide Datacenter

  • 04/13/2016 10:57 - 04/18/2016 10:16
  • Last Updated 04/13/2016 17:49

At 10:57 AM PST (GMT-8) our NOC team noticed an outage at our Cascade Divide, Roseburg OR datacenter. The NOC made contact with the datacenter and they are aware of the issue, and investigating.

UPDATE: 11:12 AM PST -- There is a confirmed fiber cut which is affecting all circuits (including redundant and protected circuits) in and out of the datacenter.

UPDATE: 11:23 AM PST -- Response teams are en route from all major fiber providers to the facility (LSN, HE, Level3, etc). We do not yet have any ETA.

UPDATE: 12:04 PM PST -- Teams are on site from all carriers and repairs are underway. Substantial damage has been done to multiple fiber paths and utility poles in the area, including full fiber sheath cuts. Crews are giving a rough estimate of 4-5 hours.

UPDATE: 1:55PM PST -- Crews have updated their original estimate. The new expected resolution time is 7:00PM PST. Our on-scene technicians have shared pictures, available at: http://imgur.com/a/023V2

UPDATE: 4:06PM PST -- Aerial lines have been pulled across roadways, and work is underway on terminating and splicing the fiber bundles. The ETA remains the same.

UPDATE: 5:47PM PST -- Service has been restored to the first fiber bundle. At this time if you have any remaining issues please open a ticket so we can address them.

Cascade Divide Network Issues (Resolved) Critical

Affecting Other - Cascade Divide - All services in Roseburg, OR

  • 05/31/2015 19:39 - 05/01/0015 00:00
  • Last Updated 05/31/2015 20:21

5/31/15 @ 7:30pm PST - There is a large DDOS in progress affecting upstream providers at our Oregon location. We are working on this and should have it resolved shortly. This affects beta.tier.net shared servers, VPS, and dedicated/colo in Roseburg, OR / Cascade Divide.

DC74 Datacenter Maintenance April 11th (Resolved) Medium

Affecting Other - DC74 Datacenter

  • 04/11/2015 23:00 - 07/21/2016 18:55
  • Last Updated 04/09/2015 18:29

Thank you for your continued business!

Please be advised of the following service impacting maintenance notification.

Reason for Notification: Router Maintenance
Location: DC74 Data Centers – CLT4 facility, 1612 Cross Beam Drive, Charlotte NC
Start Date & Time: Saturday, April 11th, 2015 starting at 2300 EST
Expected End Date & Time: Saturday, April 11th, 2015 ending at 2330 EST

PLEASE NOTE: This affects DC74 datacenter ONLY.

Description: We will be performing router maintenance that may require a reboot of the routers. This will impact BGP routing and could cause a 15-30 minute loss of network connectivity. The chosen date and time fall within the lowest-traffic period for the entire datacenter. Please be aware of this maintenance window and the potential service disruption.

Please let us know if you have any questions.

12/3/14 Cascade Divide Final Network Upgrade (Resolved) Medium

Affecting Other - Cascade Divide datacenter

  • 12/03/2014 23:30 - 07/21/2016 18:55
  • Last Updated 11/30/2014 20:37

Please note that we will be performing the final upgrade to our networking infrastructure at our Cascade Divide, Roseburg OR location starting December 3rd, 2014 at 11:30PM PST (GMT-8) and lasting until 12:00AM.

You may experience a service disruption of approximately five minutes or less during this window.

Thank you for your continued business.

[SERVICE IMPACTING MAINTENANCE] Sat. 11/29/2014 (Resolved) Medium

Affecting System - DC74 Datacenter

  • 11/29/2014 22:00 - 11/30/2014 20:33
  • Last Updated 11/24/2014 15:54

Thank you for your continued business! This is a service impacting maintenance notification scheduled for Saturday, 11/29/2014 beginning at 10PM EST (GMT-5) for our DC74, Charlotte NC facility.

At this time we will be upgrading our core switching infrastructure. Clients will go down one by one as they are moved to the new switch stack. We expect the maintenance window to be complete by 11PM EST (GMT-5), with an expected 5-10 minutes of downtime per machine.

If you have any questions, please do not hesitate to contact us.

SERVICE IMPACTING Maintenance notification for Cascade Divide, Roseburg OR (Resolved) Medium

Affecting System - Cascade Divide, Roseburg OR

  • 09/06/2014 23:45 - 11/24/2014 15:09
  • Last Updated 08/22/2014 17:50

Thank you for your continued business! This is a SERVICE IMPACTING maintenance notification. This notification is for our Cascade Divide, Roseburg OR location.

We will be performing switch upgrades in order to provide a more robust and redundant switching infrastructure.

This upgrade is scheduled to take place on 9/6/2014 from 23:45 (11:45 PM) until 9/7/2014 00:15 (12:15 AM) GMT-8 (PST).

Should you have any questions, please do not hesitate to contact us regarding this maintenance.