OVH server facility incident

The fire in the OVH server room in Strasbourg has affected many companies around the world, including ours, and through us, you and your customers.

To help you better understand the event and its impact on our system, we have gathered below the information we currently have. We describe the course of the event, the crisis scenarios we had prepared in advance, the recovery work carried out during the outage and, most importantly, the security plan for the future.

We will update this post as our work progresses. We hope that this will make the whole situation clearer and our actions fully understandable.

 

10.03.2021 incident

A very serious fire broke out in OVH's Strasbourg server room shortly after midnight on March 10. As it is the largest data centre of its kind in Europe, the fire took more than 3.6 million websites offline worldwide. Many shops, news sites, banks and government sites were affected in France, the UK and Poland. Unfortunately, BaseLinker is also among those affected.

 

OVH is a well-known and reputable server provider in Europe. We have used this hosting provider for many years and have placed great trust in the company. Our data was stored on some of the most expensive machines available (from the latest hardware lines). Although the fire did not damage our servers, access to them was temporarily restricted because the entire area was shut down. To our knowledge, such a prolonged failure of a single server room has never happened before.

 

As soon as we found out about the fire, we started our internal data recovery procedures and, without waiting for information from OVH, immediately began setting up a new server infrastructure.

 

For the vast majority of users we managed to get the key parts of the system up and running in less than 2 days: the failure occurred on Wednesday 10.03 at 4:00, and the system was restored on Thursday 11.03 at 14:30 for Panel-B and at 23:30 for Panel-A. Which panel an account belongs to depends solely on the order in which accounts were registered.
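For readers who want to check the "less than 2 days" figure, the downtime for each panel can be worked out from the times given above. The short Python sketch below does only that arithmetic. The variable and function names are ours and purely illustrative, and it assumes all times are in the same local time zone.

```python
from datetime import datetime

# Timestamps taken from the paragraph above (same local time zone assumed).
FAILURE = datetime(2021, 3, 10, 4, 0)
RESTORED = {
    "Panel-B": datetime(2021, 3, 11, 14, 30),
    "Panel-A": datetime(2021, 3, 11, 23, 30),
}


def downtime_hours(failure, restored):
    """Return the length of the outage in hours."""
    return (restored - failure).total_seconds() / 3600


for panel, restored_at in RESTORED.items():
    print(f"{panel}: {downtime_hours(FAILURE, restored_at):.1f} h")
# Panel-B: 34.5 h
# Panel-A: 43.5 h
```

Both values are below 48 hours, which is consistent with the statement above.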

 

According to our statistics, by Friday 12.03 most users were already shipping their usual volume of parcels, and new orders were being processed as they came in. Some minor system functions were restored only gradually over the following days. However, we wanted the main function of the system, the Order Manager, to be available as soon as possible.

 

Unfortunately, for some users (Panel-C) access to accounts is temporarily limited, because restoring it requires action on the part of OVH. According to OVH, the data is secure, but because of the work being carried out in the affected area, access to the machines will only be possible at a later date. OVH has declared a deadline of 23.03. Other blocks of the server room are being restarted, but there is a risk that the date given by OVH may slip. We are taking all possible measures to keep the final date as close as possible to the one originally declared.

 

So that Panel-C users can handle their current sales, we have prepared the option to set up temporary accounts on Panel-D. Setting up a new panel normally takes weeks of preparation. In this exceptional situation we did everything we could to have it ready the day after the failure.

 

Security procedures – data backup

As soon as the first information about the failure appeared, we launched our security procedures to begin restoring the system to full operation.

 

Data security is a priority for us, so we built our solution around a backup system. Our servers in Strasbourg had so-called mirrors for all panels, i.e. copies that take over automatically and immediately if the main server fails.

 

The events of March 10 forced us to implement the most difficult scenario possible: restarting the system from a backup. Although our main server and mirror server were not damaged in the fire, OVH shut them down because they were located in the area affected by the failure. With the data in Strasbourg temporarily inaccessible, we fell back on the copy available in Warsaw. Restoring from it could not be immediate because of the amount of data that had to be copied.

 

The team rose to the challenge. With the help of our own and external specialists, we did our best to set up an alternative system infrastructure from the existing backup in a dozen or so hours.

 

The incident’s impact on our business

We are aware that the OVH server room fire has affected your work. We too have suffered serious consequences, both reputational and financial.

 

We set up the new infrastructure as quickly as possible, regardless of cost. We fought for every hour to restore the system and make it available to our users as soon as possible. Because we urgently needed new, powerful infrastructure, the cost of obtaining it was very high. We used AWS (Amazon Web Services), which is also regarded as a reputable provider and is able to deliver servers in minutes. We immediately opted for the most powerful machines available on the market, so that we would have spare capacity while the system was not yet optimised.

 

The first few days of the new infrastructure generated costs similar to those previously generated over an entire month. To date, each day of operation of the new infrastructure is several times more expensive than before. Server costs will return to normal, but only over the next few weeks, once the new environment is properly optimised.

 

During this period, the entire team has been working very intensively and with great dedication, often around the clock and on weekends, dealing exclusively with the failure. This is our own internal effort, and entirely natural in this situation.

 

The actual expenses incurred so far by our company already amount to hundreds of thousands of zloty.

 

The failure has affected everyone, including BaseLinker as a company, not just as a system. In normal times we would have been focused on development and new features of the system, but because of the failure all our resources were directed towards stabilising the situation. In an extremely short period of time, our specialists rebuilt from scratch an infrastructure that had originally been built up over years.

 

Future system security

We encountered an extremely rare situation, which in this day and age should no longer happen – part of the server room of OVH, Europe’s leading hosting provider, burned down.

 

Although the incident was a random event, we decided to prepare an even stronger safety net. The new scenario will take into account even the least likely situations. We will take steps that go far beyond the standard for this type of protection. No one can prepare as well as someone who has had the misfortune of experiencing such an event.

 


 

Furthermore, we want every BaseLinker user who wishes to do so to be able to easily create their own backup copies.

 

We’ve designed a new feature – Google Drive Backup. This is an integration with Google Drive that will allow you to send information about orders, receipts, invoices, products in stock, etc. to your own drive every night.

 

If this is still not enough, you can go a step further and set Google Drive to sync with your computer or NAS drive. This will let you keep daily copies of the data locally as well, on your own drive or on an office computer.
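As an illustration of the local-copy idea described above, here is a minimal Python sketch that copies a synced folder into a dated snapshot directory once a day. It is only an example under stated assumptions, not part of the Google Drive Backup feature: the folder names are hypothetical and depend on where Google Drive syncs on your machine, and the `snapshot` function is our own illustrative helper.

```python
import shutil
from datetime import date
from pathlib import Path


def snapshot(sync_dir, backup_root, today=None):
    """Copy sync_dir into backup_root/<YYYY-MM-DD>; skip if that copy exists."""
    today = today or date.today()
    target = Path(backup_root) / today.isoformat()
    if target.exists():
        return target  # a snapshot for this day was already made
    shutil.copytree(sync_dir, target)  # also creates missing parent folders
    return target


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own synced folder and backup location.
    src = Path.home() / "Google Drive" / "BaseLinker"
    if src.is_dir():
        print(snapshot(src, Path.home() / "baselinker-backups"))
```

Run once a day (for example from a scheduled task), this keeps one folder per date, so an earlier day's data remains available even if the synced folder later changes.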

Please take a look at the infographic to learn more about the security strategy:

 

Thank you for your support

Dear Users, we hope that the above information has given you a broader perspective on this complicated situation.

 

We are very sorry for any inconveniences and disruptions you have experienced due to the system unavailability.

 

Our goal is to provide the highest quality of service, which is why we make investments in tools, security and our own development in order to provide you with the best possible product and support for your ecommerce service.

 

On behalf of the entire BaseLinker team, we would like to thank you for your understanding, patience, kind words and all the expressions of support we have received from you since the very beginning of the incident.
