Secure Resiliency in the Cloud

By Andrew Black

Andy leads the Emerging Technology portfolio for AWS National Security, which includes AI/ML and Quantum. Prior to AWS, Andy led Gartner's business with the Army, State Department, USAID, and DARPA. Prior to that, he was the founder and CEO of data and analytics firms supporting U.S. national security and multinational customers. Andy also serves as an Adjunct Professor at the Georgetown University School of Foreign Service.

SPONSORED CONTENT

“This is a time of historic challenges for the CIA and the entire intelligence profession, with geopolitical and technological shifts posing as big a test as we’ve ever faced. Success will depend on blending traditional human intelligence with emerging technologies in creative ways. It will require, in other words, adapting to a world where the only safe prediction about change is that it will accelerate.” – CIA Director William J. Burns, Foreign Affairs

Director Burns’s prescient early 2024 article in Foreign Affairs set the tone for national security and defense professionals across the community: we are living at the collision of unprecedented geopolitical challenges and technological advancements. From AWS’s vantage point, our government customers and partners confront the challenge of “adapting to a world where the only safe prediction about change is that it will accelerate” every day. Our national security and defense customers also understand two things: 1) as new technologies emerge, adversaries are working to enhance and exploit them to gain advantage over the U.S.; and 2) to win a competition staged on a constantly changing landscape, our government customers must shore up their own technological assets, most significantly their data, so that they alone are able to access it and wield it in support of U.S. interests.

Current geopolitical events are unfortunately providing several proving grounds for modern technology in international crises. These are the worst-case scenarios for which governments and other critical infrastructure providers plan and practice. As a long-standing technology provider to the U.S. Government and critical industries, AWS is continually learning how emerging technologies, along with cloud computing best practices, can deliver security and resiliency for critical workloads.

Framing Resiliency and Security
Resiliency is the ability of a workload to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions, such as misconfigurations or transient network issues. For regulated industries and governments especially, system end users must have secure, reliable access to their data and applications at all times. Whether the end user is a physician delivering care through telemedicine, a consular officer managing visa services, or a warfighter in need of battlefield intelligence, resiliency is table stakes for any IT system. When one node goes down due to overload, attack, or any other event, the next node must come online providing the end user’s expected service without disruption. A resilient architecture improves the options and methods for securely distributing information at the edge while also providing cost and performance enhancements for back-hauling data to the enterprise. 
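The failover behavior described above can be sketched in a few lines of Python. This is an illustration only, not an AWS API: the `Node` class, the node names, and `route_request` are hypothetical, and a production system would normally rely on managed health checks and load balancing rather than hand-written routing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A hypothetical service node; `healthy` stands in for a real health check."""
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"{self.name} served {request}"


def route_request(nodes: list[Node], request: str) -> str:
    """Try each node in priority order and return the first successful response."""
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError:
            continue  # this node is down; fail over to the next one
    raise RuntimeError("no healthy node available")


nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
print(route_request(nodes, "GET /report"))  # handled by node-a

nodes[0].healthy = False  # simulate an outage on the first node
print(route_request(nodes, "GET /report"))  # fails over to node-b
```

The end user sends the same request in both cases; only the node that answers changes, which is the "without disruption" property the paragraph describes.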

Resiliency, along with cybersecurity practices like Zero Trust, should be incorporated as a core tenet into the planning for each application, workload, and network. In the U.S., the Cybersecurity & Infrastructure Security Agency (CISA) provides helpful guidance and resources for governments and industries to develop resilient digital operations. 


This is sponsored content.  Consider publishing your national security-related, thought leadership content in The Cipher Brief, with a monthly audience reach of more than 500K national security influencers from the public and private sectors.  Drop us a note at [email protected].


Operationalizing Secure Resiliency
Thankfully, operations planners have an ever-growing list of capabilities at their disposal. There are far too many to enumerate here, so we’ll focus on key cloud capabilities.

Data centers are a core building block of the cloud, and the best strategies for resiliency account for both the physical construction and the distribution of workloads across data centers. The world’s critical industries run on AWS, so we’ve instilled resilience into our infrastructure, service design and deployment, operational model, and processes from day one. One of the unique ways we build resiliency into our infrastructure is with the design of our Regions. Our infrastructure Regions typically have three or more Availability Zones. An Availability Zone (AZ) is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center.
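A rough calculation shows why spreading a workload across AZs helps. The sketch below assumes that each AZ fails independently and that the workload stays up as long as at least one AZ is up. The 99.9% per-zone figure is purely illustrative, not a published AWS number, and `combined_availability` is a hypothetical helper.

```python
def combined_availability(per_zone: float, zones: int) -> float:
    """Availability of a workload that survives while at least one zone is up.

    Assumes zone failures are independent, which real outages only approximate.
    """
    if not 0.0 <= per_zone <= 1.0:
        raise ValueError("per_zone must be between 0 and 1")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    # The workload is down only when every zone is down at the same time.
    return 1.0 - (1.0 - per_zone) ** zones


for n in (1, 2, 3):
    print(f"{n} zone(s): {combined_availability(0.999, n):.9f}")
```

Under these assumptions each additional zone multiplies the probability of a total outage by the per-zone failure rate, which is why a multi-AZ deployment can exceed what any single data center offers.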

This distinct Region design helps ensure applications are protected against disruptions, like human mistakes, unexpected traffic spikes, floods, utility failures, earthquakes, or even a global pandemic. Critical systems can be run out of several Regions across the globe, achieving resiliency with even the most extreme-scale workloads. In addition, AWS control planes and the AWS Management Console are distributed across Regions and include regional API endpoints. These endpoints are designed to operate securely for at least 24 hours if isolated from the global control plane functions, without requiring customers to access the Region or its API endpoints via external networks during any isolation. AWS also proactively prepares for potential environmental threats, like natural disasters and fire, at each of our sites. And of course, we develop and test our Business Continuity Plan to ensure the security and safety of our sites and regularly run drills that simulate different scenarios.

Architecting for Resiliency and Security

All resilient capabilities rely on resilient architectures that include not just secure infrastructure, but services for automation, self-healing, and disaster recovery. Resilient architectures automate seamless failover as well as the building of software and production environments—this automation is key to quickly restoring operations in case of a disruption. 

AWS is architected to be the most secure and resilient global cloud infrastructure and continually innovates on behalf of our customers. We have learned the following lessons about developing and sustaining secure resilient architectures over the past decade of supporting national security, defense, and other critical workloads:

  1. Create a culture of resiliency and security. Beyond the layers of security, infrastructure, and services that contribute to any resilient architecture, resilience depends on the consistent awareness, if not active participation, of people across the organization. By creating a culture of resiliency and security (that is, a culture in which we optimistically think big but are fully prepared for complete failure at any moment), AWS is able to maintain the highest standards of security and resilience. Furthermore, our cultural commitment to resiliency and security enables us to safely experiment, and fail, so that we remain digitally agile as technological changes accelerate.
  2. Design with resiliency in mind. Resiliency is a proactive, continuous process that should begin by defining metrics for how available and resilient an application or system needs to be. Implement test-driven development and continuously measure performance against these metrics for the duration of the system’s lifecycle. The AWS Resilience Analysis Framework is a great starting point for analyzing failure modes and how they could impact workloads.
  3. Regularly test architecture resiliency. Testing should cover a range of scenarios, from massive failures where every system component is down to more limited ones that affect only one or a few components. What happens when just one component is unavailable? Does the system fail over, or does it keep running in the hope that the component comes back online? Does the system even realize there is a problem? Does it recover gracefully once connectivity is restored?
  4. Storage replication. Because end users require secure, reliable access to their data and applications at all times, decide which data to push to which locations so that the data are available when and where end users need them. AI and machine learning can use your metadata to assist with storage replication as well as security.
  5. Database replication. Copy data from a primary database to one or more replica databases to improve data accessibility and system fault-tolerance and reliability. AWS offers services to securely replicate databases, and as always, customers choose which data go where.
  6. Simplify infrastructure management with containers. AWS offers a fully managed container orchestration service to help customers efficiently deploy, manage, and scale containerized applications to easily and securely achieve high availability and resilience. 
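
The second lesson's advice to define availability metrics and measure against them can be made concrete with a small calculation. This is a sketch under stated assumptions: the 99.9% target, the probe results, and the function names are all hypothetical, and real measurements would come from a monitoring system rather than a hard-coded list.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(target: float) -> float:
    """Yearly downtime budget implied by an availability target such as 0.999."""
    return (1.0 - target) * MINUTES_PER_YEAR


def measured_availability(probe_results: list[bool]) -> float:
    """Fraction of health probes that succeeded."""
    if not probe_results:
        raise ValueError("need at least one probe result")
    return sum(probe_results) / len(probe_results)


def meets_target(probe_results: list[bool], target: float) -> bool:
    """True when the measured availability is at or above the target."""
    return measured_availability(probe_results) >= target


# Hypothetical data: 10,000 health probes, 12 of which failed.
probes = [True] * 9988 + [False] * 12
print(f"budget: {allowed_downtime_minutes(0.999):.1f} min/year")
print(f"measured: {measured_availability(probes):.4f}")
print("meets 99.9% target:", meets_target(probes, 0.999))
```

Running a check like this continuously, rather than once at launch, is what turns an availability target into a metric the team can act on.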

When Director Burns says the CIA is “transforming our approach to emerging technology,” his agency is not alone. AWS is fortunate to work with transformational organizations across government, and those customers are routinely raising the bar on their expectations for the cloud. These customers are taking the proactive approach of focusing on building resilient systems from the start. They are learning the lessons from world events in real time and are asking AWS to anticipate disruptions and ensure their operations remain secure and resilient. Backed by the security, power, agility, scalability, and cost efficiency of the cloud, AWS is primed and able to continue supporting the immediate and long-term resiliency needs of our national security and defense customers.

