Can you share advice for leaders looking to shift to a multi-cloud strategy for better resilience in the event of major disruption (such as the recent CrowdStrike outage)? What best practices or common pitfalls are most important to consider?

CISO in Energy and Utilities, 10 months ago

Regarding a multi-cloud strategy, I've always advocated for a 51/49 split. For example, we might allocate 51% to Microsoft Azure and 49% to Amazon Web Services (AWS). This approach is supported by robust security tools, enabling us to quickly detect and address any misconfigurations. Our cloud infrastructure constantly evolves, and we're continually building and maturing our cloud stack to stay ahead.

We apply the same philosophy to our internet providers, ensuring we never go all-in with a single vendor. This strategy enhances our resilience and strengthens our negotiating position during contract renewals. Vendors understand that they're just a few points away from being replaced, which keeps them competitive and responsive.

Preparing your organization for potential disruptions is always challenging. What if Microsoft Office were to fail? Many companies rely heavily on Microsoft products, which makes a robust Business Continuity Management (BCM), Business Continuity Planning (BCP), and Disaster Recovery (DR) strategy essential. It's about more than having a plan on paper: it's about ensuring that your organization can keep operating smoothly during unexpected outages. Being proactive in this area safeguards your operations and strengthens overall organizational resilience.

My advice for enhancing resilience during significant disruptions is simple: don't put all your eggs in one basket, and ensure you have robust monitoring in place. Start with the basics, evolve, and continually build on your cloud infrastructure to achieve a mature, secure environment. It might not sound groundbreaking, but it's a practical and reliable approach to ensure your organization is prepared for any major disruptions.

Group Director of Information Security in Banking, a year ago

Hi
A multi-cloud strategy for better resilience is only part of the story in the overall scheme of digital business resilience. It's NOT as simple as it sounds. The other, often overlooked, parts are:

1. Cost management and the intricacies of maintaining multi-cloud licences.
2. Defining and executing the process for deciding which workloads to shift to which cloud, and monitoring assets on each cloud thereafter.
3. Internal skill sets and overall governance for maintaining workloads across multiple clouds.

Yes, it does give you a bit of vendor negotiation leverage, but an in-depth understanding of one CSP's licensing models may give you the same advantage. Overall, a multi-cloud strategy is a nightmare in the making from an information security perspective.

Now, the CrowdStrike outage had little to do with Azure or AWS as CSPs and more to do with Windows as the OS. A multi-cloud strategy doesn't play much of a role in it, as you can always run a non-Windows OS on Azure too.

Best practices to consider are:
a. Host your most critical workloads on a Linux/Unix OS stack, even if it runs on Azure or AWS. These systems have withstood the test of time.
b. Harden your POS systems (only the customer-facing terminals in the case of banks, airline operators, OT systems, etc.) and do not bring them under an active AV update regime; instead, push updates in batches once a week. CrowdStrike also recommends that now.
c. Use Regions and High Availability Zones on a single CSP instead of multiple CSPs as part of your application architecture design. Ensure non-synchronous updates across Regions to safeguard against ransomware spreading across cloud storage and VMs. Get business sign-off on RPO/RTO.
d. Take good care of your network segmentation strategy on cloud, including container segmentation, to guard against malicious lateral traffic. (Refer to the OWASP Kubernetes Top 10 risks.)

https://owasp.org/www-project-kubernetes-top-ten/2022/en/src/K07-network-segmentation.
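
Point (b) above, pushing updates in batches rather than to every terminal at once, could be sketched roughly as follows. This is only an illustrative Python sketch under my own assumptions, not CrowdStrike's or any other vendor's mechanism: `apply_update` and `is_healthy` are hypothetical callbacks you would supply, and the host names and batch size are made up.

```python
from typing import Callable, Dict, List, Sequence


def split_into_batches(hosts: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split the host list into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(hosts[i:i + batch_size]) for i in range(0, len(hosts), batch_size)]


def staged_rollout(
    hosts: Sequence[str],
    batch_size: int,
    apply_update: Callable[[str], None],   # hypothetical: pushes the update to one host
    is_healthy: Callable[[str], bool],     # hypothetical: post-update health check
    max_failures: int = 0,
) -> Dict[str, object]:
    """Update one batch at a time and halt if a batch has too many unhealthy hosts."""
    updated: List[str] = []
    for batch in split_into_batches(hosts, batch_size):
        failures = 0
        for host in batch:
            apply_update(host)
            updated.append(host)
            if not is_healthy(host):
                failures += 1
        # Stop before touching the next batch so a bad update stays contained.
        if failures > max_failures:
            return {"updated": updated, "halted": True}
    return {"updated": updated, "halted": False}


if __name__ == "__main__":
    terminals = [f"pos-{i}" for i in range(1, 7)]
    # Pretend pos-3 fails its health check after the update.
    outcome = staged_rollout(terminals, 2, lambda h: None, lambda h: h != "pos-3")
    print(outcome)
```

The design choice here is that a failure only stops the batches that have not started yet, so the blast radius of a bad update is limited to the batches already touched.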

Hope this helps.

CISO/CPO & Adjunct Law Professor in Finance (non-banking), a year ago

1. Ensure the “different” providers are not front ends for the same foundational provider.

2. Review each provider's overall track record, as well as its record with organizations of your size and, if possible, your industry. Some organizations may choose a provider based largely on cost if their uptime requirements aren’t as stringent. Consider, for example, the different needs of a hospital versus a management consulting firm.

3. Ensure the provider can meet your geographic restrictions if necessary. Vendors for the government may need to keep data and data access within the US.

4. Determine how well the providers bolt up to your existing products and services. If a provider requires you to rip and replace a functional process, then the new provider may not really be a bargain.

5. Evaluate the disaster recovery aspect of the complete configuration, taking diverse routing and actual redundancy into account. Lighting up dark fiber in the same sheath isn’t a defense against an errant backhoe.

6. Outside of the big three (Google, Microsoft, Amazon), check the financials of the provider carefully. Consolidation or other financial events could change the provider’s ability to provide resilience.

7. Review and project costs. The risk of downtime should be compared to the constant costs of the multi-cloud solution. A person will never be stranded if they tow a support trailer containing new tires and a pit crew, but that is over-engineering.
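
The comparison in point 7 can be made concrete with a rough back-of-the-envelope calculation. The Python sketch below is only an illustration under stated assumptions: every figure (outage hours, cost per hour, the annual multi-cloud premium) is invented for the example, and the function names are hypothetical.

```python
def expected_downtime_cost(outage_hours_per_year: float, cost_per_hour: float) -> float:
    """Expected yearly loss from outages: hours of downtime times cost per hour."""
    return outage_hours_per_year * cost_per_hour


def multicloud_pays_off(
    hours_single_cloud: float,
    hours_multi_cloud: float,
    cost_per_hour: float,
    annual_multicloud_premium: float,
) -> bool:
    """True if the downtime cost avoided exceeds the extra yearly multi-cloud spend."""
    avoided = expected_downtime_cost(
        hours_single_cloud - hours_multi_cloud, cost_per_hour
    )
    return avoided > annual_multicloud_premium


if __name__ == "__main__":
    # All numbers are made-up assumptions for illustration only.
    premium = 1_200_000  # extra yearly cost of running a second cloud
    hospital = multicloud_pays_off(8, 1, 500_000, premium)
    consultancy = multicloud_pays_off(8, 1, 20_000, premium)
    print("hospital:", hospital)        # 7 h * 500,000 = 3,500,000 avoided
    print("consultancy:", consultancy)  # 7 h * 20,000 = 140,000 avoided
```

With these assumed inputs the same premium is justified for the hospital but not for the consulting firm, which is the point of comparing downtime risk against the constant cost.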
