Ahead in the Cloud: Best Practices for Navigating the Future of Enterprise IT

by Stephen Orban, Andy Jassy, Adrian Cockcroft, and Mark Schwartz

Finished Reading on Saturday, January 19, 2019. 267 Highlights

Some of the most compelling lessons I’ve learned in my time managing AWS are shared in Stephen’s book. In my opinion, the single biggest differentiator between those who talk a lot about the cloud and those who have actual success is the senior leadership team’s conviction that they want to take the organization to the cloud. It’s not enough to think it’s a good idea. It’s not enough to talk about it. It’s not enough to get a few of your leaders to agree with you. Inertia is a very powerful blocker in large organizations. Senior leaders need to present the vision of why they want to move to the cloud, get their entire leadership team aligned and committed to making the move, set an aggressive top-down goal to push the organization to move faster than they would otherwise, and have mechanisms to see the actual progress and avoid the pocket vetoing that sometimes transpires in large organizations.

144

As we work on technology migration with enterprises, it’s usually people and processes that are the blockers, not technology problems. Getting to cloud effectively means you need to figure out DevOps, but DevOps is often a re-org, not adding or re-naming a team. The whole organization needs to be recruited and compensated with an aligned strategy and rewards to get the culture right. In other words, you won’t get long-term strategic results like Amazon or Netflix without a culture that is supported by a long-term, focused compensation structure.

198

The right culture also unlocks internal talent, because you don’t add innovation to a company—you get out of its way. An executive once told me “We can’t copy Netflix because we don’t have the people.” My response was “Where do you think they used to work? We hired them from you and got out of their way…”

203

“digital transformation.” Startups have no need to transform; they are what they are. Enterprises, however, have to break free from what they have been.

216

That’s organizational culture: the norms that arise to reinforce behavior that has led to success.

224

Similarly, companies create rules—bureaucracy, essentially—based on activities that have worked for them in the past. Standard operating procedures, for example.

225

Most enterprises have not optimized for agility. If anything, they have optimized for efficiency – for doing what they do at the lowest cost. This is the dilemma. An enterprise must transform by changing its culture, changing its bureaucracy, changing its organization, changing its technical architecture—and making them agile. There is one indispensable tool available to help them do so. That is the cloud.

244

In the cloud, you spin up the infrastructure right when you need it, you dispose of it when you no longer need it, and you pay only for the time you used it. That is agility.

253

continuous transformation.

261

One of my favorite quotes from Jeff Bezos, Amazon’s founder and CEO, is that “we are willing to be misunderstood for long periods of time.”

280

At AWS, we define cloud computing as the on-demand delivery of information technology (IT) resources via the internet with pay-as-you-go pricing.

333

The process prong focused on giving each line of business more freedom to experiment and to react more quickly to shifting customer demands. To do this, we adopted continuous delivery practices and streamlined our project approval process.

520

I also came to believe that the cloud paradigm combines several technology disciplines that were traditionally considered distinct. Building and managing applications that scale up and down automatically through code requires skills that cross the boundaries of software development, system administration, database administration, and network engineering (and probably others).

535

“The idea of dev and ops working together is really a throwback to when we weren’t all so specialized and everyone just did what needed to get done.”

548

The primary role of the DevOps team would be to enable all teams to DevOps themselves and to provide them the tools and capabilities to make that easier.

551

We then came up with and communicated three tenets for the team, all of which were inspired by the DevOps movement.

553

The first tenet was that DevOps had to treat the application teams as paying customers.

554

The second tenet was that DevOps had to automate everything. Our view at the time was that if you were going to deploy anything in the cloud, you should do it “right” by leveraging a cloud-native architecture.

560

The third tenet was that DevOps was not going to be responsible for the ongoing operations of the applications that the lines of business deployed to the cloud.

565

We were going to become a “run what you build” culture, and each line of business would leverage the best practices, reference architectures, and non-negotiables from the DevOps team, but otherwise be accountable for the ongoing operations and change management of their applications.

566

Once a line of business deployed something, it was on them to own and maintain it from then on.

568

We always tried to create the right balance of non-negotiables that every line of business had to implement for their projects while maintaining their ability to innovate using new tools, services, and open-source technologies.

571

More often than not, we intentionally didn’t backfill their old roles. So, if a system administrator or network engineer supporting Dow Jones Newswires, for example, moved from their traditional role into the DevOps team, our DevOps capacity increased at the same rate as our capacity to operate in the status quo decreased.

577

The DevOps team was responsible for developing the best practices and building the capabilities that made this hybrid architecture work well, and these capabilities became more sophisticated over time as our needs matured.

585

One of the most impactful things Milin Patel, who led the DevOps team, did was build a curriculum he called “DevOps Days.” This two-day curriculum started with a half day of AWS basics. It continued with another day and a half about how all the reference architectures, best practices, and governance that the DevOps team had built should be used. In addition to being a great way to educate our teams, it was a great way to get feedback from those that were already using those practices.

589

in this case, we didn’t fundamentally change the architecture, and we ended up saving 30% by primarily lifting-and-shifting!

625

The Dow Jones Technology department became a driving force in the business, and it was recognized as something that could make a very positive difference in the products we delivered to our customers. During the course of our transition, we went from roughly 400 employees and 1,100 contractors to about 450 employees and 300 contractors. This was a considerable decrease, but a much better mix of motivated people who owned their product areas, and genuinely wanted to move quickly to benefit their customers and the business.

644

It’s about becoming an organization that is capable of quickly deploying technology to meet business needs, regardless of where the technology comes from.

682

I’ve come up with an imperfect pattern that I call the Stages of Adoption. Each of these stages (stage one: project, stage two: foundation, stage three: migration, stage four: reinvention) represents the kind of thing that happens in a large organization during the course of its never-ending journey to becoming a digital company.

687

STAGE 1: PROJECT

690

I always suggest they pick a project that is important enough that people will care, but also not something where there is no appetite for learning (in other words, don’t pick something that will get you fired). Once they get a feel for the cloud, they tend to want to do more.

693

STAGE 2: FOUNDATION

696

I need to make a couple of foundational investments so I can scale these new capabilities throughout my organization.” This typically includes the creation of a cross-functional team dedicated to their transformation efforts (which we call a Cloud Center of Excellence, or CCoE, Chapter 24-31) and the deployment of an “AWS landing zone” so that they have the right governance and operating model for leveraging the cloud at scale.

698

STAGE 3: MIGRATION

702

Here they develop a business case to quantify the benefits they can achieve by migrating their legacy systems to the cloud.

705

STAGE 4: REINVENTION

706

Many enterprises, including GE Oil & Gas, find that it’s easier to optimize their applications after they’ve migrated them to the cloud, because of the expertise they gained along the way. Many of these organizations begin to feel as though they’ve reinvented themselves, and are applying their newfound capabilities across their entire business.

709

Different enterprises arrive at a cloud-first policy at different points in their journey. Some CIOs with confident instincts will declare cloud-first early in their journey;

717

Our newfound ability to deliver technology to the business quickly became our “hero” project, which helped encourage my team (many of whom became anxious until we trained them) and executive stakeholders to come on the journey with

759

start with a project that they can get results from in a few weeks.

763

What I’ve found most important is that organizations pick something that will deliver value to the business, but something that isn’t so important that there’s no appetite for learning. Avoid analysis paralysis, and use your early cloud projects as a way to start experimenting.

773

WHO SHOULD WORK ON YOUR EARLY CLOUD PROJECTS? (HINT: ATTITUDE MATTERS)

Regardless of where in your organization you get started, you should expect that your early cloud projects will make some people excited and some people uncomfortable. Start to nurture the people who get excited, and consider how you can turn them into your own cloud champions/evangelists.

781

1. CREATE A CLOUD CENTER OF EXCELLENCE TEAM

820

I believe the creation of a CCoE is one of the most critical foundational investments an organization can make, particularly if you’re looking to evolve your culture.

823

BUILD REFERENCE ARCHITECTURES TO REUSE ACROSS YOUR BUSINESS

837

Encourage your teams to look for common patterns in the applications they own. If you find a reference architecture that meets the needs of several applications, create scripts that automate the construction of that reference architecture while baking in your security and operational controls.
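
As a note to self, the "scripts that bake in controls" idea could look something like the sketch below. It is only an illustration, not the book's method: the function name `build_bucket_template`, the required tag keys, and the sample values are all my own assumptions. The resource type and property names follow the CloudFormation `AWS::S3::Bucket` schema as I understand it.

```python
import json

# Hypothetical baseline: tag keys a CCoE might require on every resource.
REQUIRED_TAG_KEYS = ("CostCenter", "Environment", "Owner")


def build_bucket_template(logical_id: str, tags: dict) -> dict:
    """Return a CloudFormation-style template for one S3 bucket with controls baked in."""
    # Operational control: refuse to generate a template without the required tags.
    missing = [key for key in REQUIRED_TAG_KEYS if key not in tags]
    if missing:
        raise ValueError(f"missing required tags: {missing}")

    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Security control: encrypt objects at rest by default.
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
                        ]
                    },
                    # Security control: block every form of public access.
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                    "Tags": [{"Key": k, "Value": v} for k, v in sorted(tags.items())],
                },
            }
        },
    }


if __name__ == "__main__":
    template = build_bucket_template(
        "ReportsBucket",
        {"CostCenter": "1234", "Environment": "test", "Owner": "example-team"},
    )
    print(json.dumps(template, indent=2))
```

Because application teams call the function instead of writing the template by hand, the encryption and public-access settings cannot be forgotten, which is the point of a reusable reference architecture.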

839

3. CREATE A CULTURE OF EXPERIMENTATION AND EVOLVE YOUR OPERATING MODEL

852

4. EDUCATE YOUR STAFF AND GIVE YOUR TEAM A CHANCE TO LEARN

869

Education is the most effective mechanism I’ve found to get your team to come along with you.

872

THE MIGRATION PROCESS

932

consists of five phases: Opportunity Evaluation, Portfolio Discovery and Planning, Application Design, Migration & Validation, and Operate.

934

APPLICATION MIGRATION STRATEGIES: “THE 6 R’S”

940

Rehosting (otherwise known as “lift-and-shift”)
Replatforming (I sometimes call this “lift-tinker-and-shift”)
Repurchasing (migrate to a different product/license, often SaaS)
Refactoring (re-architect or re-imagine leveraging cloud-native capabilities)
Retire (get rid of)
Retain (do nothing, usually “revisit later”)
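
To make the six strategies easier to remember, here is a toy triage sketch. The `Application` fields and the order of the checks in `suggest_strategy` are my own assumptions for illustration; the book gives no such decision rule, and a real portfolio review would weigh far more than five flags.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    REHOST = "lift-and-shift"
    REPLATFORM = "lift-tinker-and-shift"
    REPURCHASE = "move to a different product, often SaaS"
    REFACTOR = "re-architect with cloud-native capabilities"
    RETIRE = "get rid of"
    RETAIN = "do nothing for now, revisit later"


@dataclass
class Application:
    """Hypothetical facts gathered about one application during discovery."""
    name: str
    still_used: bool = True
    recently_upgraded: bool = False
    saas_alternative: bool = False
    needs_new_capabilities: bool = False
    small_tweaks_pay_off: bool = False


def suggest_strategy(app: Application) -> Strategy:
    """Toy triage: the order of these checks is an assumption, not a rule from the book."""
    if not app.still_used:
        return Strategy.RETIRE
    if app.recently_upgraded:
        return Strategy.RETAIN
    if app.saas_alternative:
        return Strategy.REPURCHASE
    if app.needs_new_capabilities:
        return Strategy.REFACTOR
    if app.small_tweaks_pay_off:
        return Strategy.REPLATFORM
    # Default: move the steady-state application as-is.
    return Strategy.REHOST


if __name__ == "__main__":
    for app in (
        Application("legacy-reporting", still_used=False),
        Application("crm", saas_alternative=True),
        Application("billing"),
    ):
        print(f"{app.name}: {suggest_strategy(app).name}")
```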

944

PHASE 1: OPPORTUNITY EVALUATION

981

PHASE 2: PORTFOLIO DISCOVERY AND PLANNING

1001

The complexity of migrating existing applications varies, depending on the architecture and existing licensing arrangements.

1008

mainframe at the high-complexity end of the spectrum. I suggest starting with something on the low-complexity end of the spectrum for the obvious reason that it will be easier to complete—which will give you some immediate positive reinforcement or “quick wins” as you learn.

1010

PHASES 3 AND 4: DESIGNING, MIGRATING, AND VALIDATING APPLICATIONS

1018

“migration factory,”

1021

Finally, make sure you have a strategy for testing and decommissioning the old systems.

1030

you may have to run parallel environments for a period of time while you migrate traffic, users, or content.

1032

PHASE 5: MODERN OPERATING MODEL

1034

Rehosting—Otherwise known as “lift-and-shift.”

1071

GE Oil & Gas, for instance, found that, even without implementing any cloud optimizations, it could save roughly 30 percent of its costs simply by rehosting.

1074

We’ve also found that applications are easier to optimize/rearchitect once they’re already running in the cloud.

1078

Replatforming—I sometimes call this “lift-tinker-and-shift.”

1081

Here you might make a few cloud (or other) optimizations in order to achieve some tangible benefit, but you aren’t otherwise changing the core architecture of the application.

1082

Repurchasing—Moving to a different product. I most commonly see repurchasing as a move to a SaaS platform.

1090

Refactoring/Re-architecting—Re-imagining how the application is architected and developed, typically using cloud-native features. This

1092

Retire—Get rid of. Once you’ve discovered everything in your environment, you might ask each functional area who owns each application. We’ve found that as much as 10 percent (I’ve seen 20 percent) of an enterprise IT portfolio is no longer useful, and can simply be turned off.

1098

Retain—Usually this means “revisit” or do nothing (for now). Maybe you’re still riding out some depreciation, aren’t ready to prioritize an application that was recently upgraded, or are otherwise not inclined to migrate some applications.

1101

Somewhere in the middle of rehosting and re-architecting is what we call re-platforming, where you’re not spending the time on a complete re-architecture, but, rather, making some adjustments to take advantage of cloud-native features or otherwise optimize the application.

1131

GE Oil & Gas rehosted hundreds of applications to the cloud as part of a major digital overhaul. In the process, they reduced their TCO by 52 percent.

1163

First, rehosting takes a lot less time, particularly when automated, and typically yields a TCO savings in the neighborhood of 30 percent.
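
A quick worked example of what "in the neighborhood of 30 percent" means in money terms. The baseline figure below is made up for illustration, and the function name is my own; only the 30 percent rate comes from the highlight above.

```python
def tco_after_rehosting(baseline_annual_cost: float, savings_rate: float = 0.30) -> float:
    """Annual cost after rehosting, given a fractional TCO savings rate."""
    if not 0.0 <= savings_rate <= 1.0:
        raise ValueError("savings_rate must be between 0 and 1")
    return baseline_annual_cost * (1.0 - savings_rate)


if __name__ == "__main__":
    baseline = 1_000_000.0  # hypothetical on-premises annual cost
    after = tco_after_rehosting(baseline)
    print(f"before: {baseline:,.0f}  after: {after:,.0f}  saved: {baseline - after:,.0f}")
```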

1171

Second, it becomes easier to re-architect and constantly reinvent your applications once they’re running in the cloud. This is partly because of the obvious toolchain integration, and partly because your people will learn an awful lot about what cloud-native architectures should look like through re-hosting.

1174

look to rehost or re-platform the steady-state applications that you aren’t otherwise going to repurchase, retire,

1183

REASON 1—SSDS RULE

1213

one benefit of this variety is the boost in performance that occurs when solid-state drives (SSDs) are leveraged, particularly for storage I/O-intensive workloads like databases. The price of all storage types continues to go down, but

1216

REASON 2—COVERING APPLICATION SINS

Most people think about the cloud’s elasticity from a scale-out perspective, but scale-up is just as valid. This can be done on-premises, but AWS provides an upper threshold that’s greater than most environments. For example, one of AWS’ largest instances in memory and CPU comes from our X1 family of virtual servers. The x1e.32xlarge has 128 vCPUs, 4TB of memory, and it’s backed by SSD storage with dedicated bandwidth to EBS (14,000 Mbps). This is an instance that is typically used for workloads like SAP HANA. One customer I know had an application that was in a critical period and realized there were some bad queries causing performance bottlenecks. Changing the code was too risky, so the database server was upped to an X1 instance and then ramped back down to a more reasonable instance size once the critical period was over. Being on the application development side of the IT house, I always appreciated when infrastructure had the ability to cover application sins. I’d rather catch application issues earlier in the development cycle, but it’s sure nice to know that AWS can provide help when you’re in a bind.

REASON 3—HORSES FOR COURSES

The relational database (RDBMS) has been the de facto back-end for applications over the past 40 years. While the RDBMS is great at many kinds of queries, there are some workloads that the RDBMS is simply not well-suited for. Full-text search is a good example, which explains why Lucene-based technologies such as Apache Solr and ElasticSearch are so popular and much better suited for this use case.

1222

REASON 4—EVOLVING MONOLITHS

1248

My belief is that, in order to be independently deployable and scalable, the microservice should be isolated from the code repository, presentation layer, business logic, down through the persistence store.

1257

Yes, You Can Migrate Your Mainframe to the Cloud

1286

RE-HOSTING

A re-hosting solution runs existing mainframe applications on an x86-64 based Amazon EC2 instance using a mainframe emulator (e.g., Micro Focus Enterprise Server, TMaxSoft OpenFrame, Oracle Tuxedo ART). This migration is seamless from an end-user perspective and it does not require changes to standard mainframe technologies like 3270 screens, Web COBOL, JCL, and DB2.

1311

BATCH JOB MIGRATION

Batch jobs often form a large portion of the mainframe application portfolio, and while some are business-critical, usually a significant number of these jobs are of low business value and consume a large amount of MIPS.

1317

Whether they are file based or near-real-time processes, offloading the heavy lifting to the AWS Cloud will enable customers to gain additional insights to their data and reduce MIPS consumption on their existing mainframe.

RE-ENGINEERING

The re-engineering approach is recommended when the existing mainframe application is no longer able to meet future-state business requirements or an Agile target architecture.

1319

Most executives I talk with tell me that their journey to the cloud is more about business and cultural transformation than it is about technology adoption.

1338

(Since its inception in 1955, for example, the Fortune 500 has seen between 20 and 50 companies fall off the list each year. Advances in technology are largely behind this steady rate of turnover, with the cloud being the most recent cause of large-scale disruption.)

1353

Provide executive support – projects are much more likely to succeed when the boss supports them, and large-scale change management typically comes from the top.

1369

Educate staff – people can sometimes be afraid of what they don’t know.

1372

Create a culture of experimentation – having access to a seemingly infinite amount of on-demand IT resources can change the game for any organization,

1375

Failure is a lot less expensive when you can just spin down what didn’t work.

1376

Pick the right partners – the ecosystem of system integrators, digital consultancies, managed service providers, and tools around cloud has grown substantially over the years.

1377

Create a Cloud Center of Excellence – most organizations that are doing anything meaningful with the cloud have a team of people dedicated to deeply understanding how cloud technology can be used across a distributed organization,

1380

Implement a hybrid architecture – the cloud is not an all-or-nothing value proposition.

1383

Implement a cloud-first strategy – once your organization has some experience with implementing cloud at scale, it may be time to make it a strategic imperative to accelerate the results across your organization.

1385

It’s not just about changing your organization’s technology—it’s about changing the way your IT department delivers technology and adds business value.

1406

Merging business and technology. Cloud adoption offers more than a technology shift. It also offers a new way to do business.

1416

Providing clarity of purpose. Just as it’s important to tie technology and business results together for your executive stakeholders, tying your team’s roles back to the business benefit will help them understand how they fit in—especially when it involves changes to their roles.

1421

Breaking (Making) new rules. Most traditional IT operating models won’t allow you to take full advantage of what the cloud has to offer.

1427

Leaders lead in many different ways. Some lead by fear, some by example, some with charisma, and some lead through others. And while every leader’s style is slightly different, experience has taught me one thing that stays the same: people are most likely to follow those they understand.

1503

Whatever your short- or long-term motivations may be, I’d encourage you to make them known and make them measurable. Clearly articulate your motivations and goals to your team and your stakeholders, and hold everyone accountable for moving the needle in the right direction.

1516

It wasn’t until I started to clearly articulate what was important about our strategy that the behavior of my team started to change.

1520

Making it clear what everyone’s options are in light of the changing direction will give them a clear path to understanding how they can participate, and likely some peace of mind.

1528

we made automation a hard requirement for everything we did at the beginning of our journey.

1537

I’ve found that it’s best to treat the bumps you hit as learning opportunities, not to chastise your team for making mistakes (although you shouldn’t accept the same mistake twice), and swiftly address skepticism that goes against your purpose. Don’t let those who would be more comfortable reverting to the status quo influence the potential of your vision. This isn’t always easy, but your patience and perseverance will pay off. Remember, practice makes permanent.

1544

Good leaders enforce the rules. Great leaders know when the old rules no longer apply and that it’s time to make new ones.

1553

Leading an organization on its journey to the cloud is one of the best opportunities technology leaders will have to make new rules.

1556

Once we illustrated that our controls were greatly improved because of the new rules we were employing around automation, our auditors became more comfortable with our future direction.

1592

These same tactics apply when dealing with your security and legal teams. Involve them early and often, and partner with them to ensure that everyone’s needs are met.

1595

We eventually institutionalized training of our own. Our DevOps team started to host “DevOps Days” where others in the organization learned about the best practices, frameworks, and governance models we developed on top of the cloud.

1641

Give your teams a hands-on, time-constrained opportunity to do something meaningful to your business with the cloud, and see what happens.

1671

Every organization’s journey will be unique, but there are some commonalities that I’ve seen in organizations that do this well. Here are 11 considerations that speak to these commonalities:

1684

Start with something meaningful, but basic. Your teams will quickly see the practical benefits of cloud technologies when they accomplish something important to the business.

1686

Leverage AWS training. AWS offers a wide variety of great training programs.

1692

Give your teams time to experiment. Creating a culture of experimentation is the next best practice on the journey, and it is particularly relevant when motivating your staff to learn.

1697

Set goals that encourage learning and experimentation. Most companies set goals and/or KPIs for their staff and tie these goals to performance. Using these existing mechanisms is a great way to reinforce your strategy and produce the behavior you’re after.

1700

Set time constraints, and pace yourselves. This is especially important as you move toward a culture of experimentation.

1705

Spot and deal with change resistance. All of these considerations are aimed at curbing your staff’s resistance to change by giving people the tools they need to succeed.

1709

swiftly deal with unnecessary friction.

1712

Don’t be afraid to give people new roles. Moving to the cloud in a meaningful way is as much a cultural shift as it is a technology shift.

1713

Show your staff how they fit into the bigger picture. It’s much easier to get excited about your job when you know how it fits into the organization’s big picture.

1719

Go to industry events and see what others are doing. Most people learn a lot from the successes

1722

Learn from your partners. There are tens of thousands of organizations in the AWS Partner Network.

1726

Institutionalize your own flavor of training in your organization. As you progress on your journey, you will hopefully find that a few teams or individuals in your organization will want to share what they learn with others.

1731

the best way to get the most out of your cloud investment is to invest in training to help build cloud skills within your organization.

1751

YOU ALREADY HAVE THE PEOPLE YOU NEED

1755

VALIDATE KNOWLEDGE WITH CERTIFICATIONS

Encourage staff to get certified so everyone feels confident in the team’s skills.

1772

Phase 1: Cloud awareness and essentials training for a very broad range of employees
Phase 2: Role-based foundational training for technical employees and key lines of business
Phase 3: Associate certification for select technical staff with relevant experience
Phase 4: Advanced and specialty training for select technical staff as required
Phase 5: Professional certification for select technical staff with relevant experience

1791

The highly skilled, proactive, and dedicated team I had was the team I needed. The team members just needed a path, an incentive, and someone with empathy to listen and address their totally human fears of the technology unknown.

1835

Your engineers must accept the fact that they have the ability to learn AWS cloud skills and become experts.

1844

start with the AWS Technical Essentials course.

1849

There is no compression algorithm for experience. So, hands-on time is now required. Even if it’s a little clunky, engineers need to play away and configure stuff in a safe space.

1853

CREATE YOUR TWO-PIZZA TEAM

The first engineering team you put together should consist of a thorough mix of core skills—Network, Database, Linux Server, Application, Automation, Storage, and Security.

1858

BRING IN SOME EXPERTS

There is no compression algorithm for experience (continued). So, now you should bring in some real experts. Indeed, adding some expert-level engineers who have the right attitude when it comes to sharing their learnings and best practices is essential at this point.

1863

MAKE IT REAL

At this juncture, the goal of the Agile two-pizza team should be to build something real and in production.

1873

SCALE THE LEARNING WITH CELLULAR MITOSIS

1881

And you need to gently but consciously split this first team, which has gained the experience and best practices, into two new four-person teams, and then introduce four more engineers into each team. This will be difficult and should be handled with care.
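
The split described above can be sketched in a few lines. This is only my reading of it: the eight-person starting team is inferred from "two new four-person teams", and `split_team` and the engineer names are hypothetical.

```python
def split_team(team: list[str], newcomers: list[str]) -> tuple[list[str], list[str]]:
    """Split an experienced team in half and top each half up with newcomers.

    Each new team keeps half of the veterans and receives half of the
    newcomers, so both end up the same size as the original team.
    """
    if len(team) % 2 != 0:
        raise ValueError("team size must be even to split it in half")
    if len(newcomers) != len(team):
        raise ValueError("need as many newcomers as there are veterans")

    half = len(team) // 2
    first = team[:half] + newcomers[:half]
    second = team[half:] + newcomers[half:]
    return first, second


if __name__ == "__main__":
    veterans = [f"veteran-{i}" for i in range(1, 9)]   # inferred eight-person team
    newcomers = [f"newcomer-{i}" for i in range(1, 9)]
    team_a, team_b = split_team(veterans, newcomers)
    print(team_a)
    print(team_b)
```

Repeating the split doubles the number of teams each round while every team keeps a core of people who have already built something real.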

1883

pass the certification in their own time and at their own pace.

1892

I advocate starting with the Associate-level certification and building up to the Professional-level certification.

1892

RECOGNIZE AND REWARD EXPERTISE (IN A VERY LOUD AND PROUD WAY!) Your goal as an IT executive is to shout from the rooftops the names of every engineer that passes every certification exam. Reward

1907

and recognize technical progress in any way that you can.

1909

TAKE THE CHALLENGE YOURSELF

1913

CREATE A UNIFYING JOB FAMILY PORTFOLIO

1919

Technical Program Manager (TPM)—Typically responsible for the Agile execution, release train congruence, and team interdependencies.
AWS Infrastructure Engineer (IE)—Previously data center systems engineers, who were typically Linux/Wintel/Network, etc. Now creating CloudFormation code for different AWS building blocks as required for the product teams. AWS experts.
Software Development Engineer (SDE)—Writing logic and working with data constructs in a variety of software languages.

1922

Software Quality Engineer (SQE)—Using test-driven design principles. Ensuring that testing is considered and executed throughout the lifecycle.
Security Engineer—Ensuring that security is holistic.
Engineering Manager—The manager responsible for both intent and supervising a group of engineers comprised from the above skill groups.

1926

Create a Culture of Experimentation Enabled by the Cloud

1971

Here are four things to consider and four things to avoid when building a culture of experimentation in your organization.

2024

DO manage expectations. Not every experiment will deliver the results you envision, but every experiment is an opportunity to learn and improve your operations. If your organization is not used to the concept of failing to learn, start small and make sure everyone knows which projects you consider experiments.

2025

DON’T start with a project where everyone is set on a specific outcome. If you’re acting as a change agent trying to create a culture of experimentation, don’t experiment too early in your journey with a project where your stakeholders demand a specific outcome.

2031

DO encourage your teams to propose experiments. Every organization has its own way of determining which projects get technology resources. Unfortunately, some organizations now treat the technology or IT department as a cost center and have pushed ideation too far away from those implementing it.

2035

DON’T pursue an experiment until you know how to measure it. You want to spend time on the right experiments and ensure the lessons learned from them will improve your operations and your products. Before you let your team move forward with an experiment, you should agree on what they will measure during the experiment, and how.

2041

DO consider DevOps to institutionalize experimentation. A DevOps culture can be a powerful way to codify experimentation into your organization.

2046

DON’T doubt your team. Doubt is one of the most powerful ways to discourage your team and open the door for failure.

2050

DO encourage the whole organization to participate. As you start to deliver results faster through experimentation, other areas of the organization will become attracted to your methods. Involve those people.

2055

DON’T let experiments slow or halt delivery. Don’t let your teams off the hook to deliver just because something is an experiment. While it’s okay to fail and learn, it’s not okay to underdeliver on the experiment.

2061

Your CCoE should be responsible for building the best practices, governance, and frameworks that the rest of the organization leverages when implementing systems on (or migrating systems to) the cloud.

2303

Starting small, iterating, and growing based on what you learn is a recurring theme in any successful Enterprise Cloud journey.

2318

staffing their CCoE with one or two skeptics. If you have leaders in your organization who carry a lot of influence and are wary of your cloud direction, you might consider tying their success to how quickly you’re able to gain value from the cloud by putting them in or around your CCoE.

2343

Your CCoE should be high enough in your organization’s power structure that it can create impactful change.

2348

Your organization’s CCoE should start small and grow as it adds value to the business. Organizations that do this well set metrics or KPIs for the CCoE and measure progress against them. I’ve seen metrics range from IT resource utilization, to the number of releases each day/week/month as a sign of increasing agility, to the number of projects the CCoE is influencing.
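
One of the metrics mentioned above, releases per week, is easy to compute once release dates are recorded somewhere. A minimal sketch, assuming the dates are already available as a list (the function name and sample dates are mine, not from the book):

```python
from collections import Counter
from datetime import date


def releases_per_week(release_dates: list[date]) -> dict[tuple[int, int], int]:
    """Count releases per ISO (year, week), returned in chronological order."""
    counts: Counter = Counter()
    for released_on in release_dates:
        iso_year, iso_week, _weekday = released_on.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [date(2019, 1, 7), date(2019, 1, 9), date(2019, 1, 14)]
    for (year, week), count in releases_per_week(sample).items():
        print(f"{year}-W{week:02d}: {count} release(s)")
```

A rising count over successive weeks would be one rough sign of the increasing agility the highlight describes.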

2359

Identity management. How do you want to map roles and permissions in your cloud environment to the roles and responsibilities that you already have in your organization? What services and features are you comfortable leveraging in what environments? How do you want to integrate with your Active Directory and/or single-sign-on (SSO) platform?

2368

Account and cost management. Do you want to map accounts to business units and cost centers so you can logically separate your IT services and/or understand business-unit-specific costs?

2373

Asset management/tagging. What kind of information do you want to track for each of the resources that you provision? Some examples I’ve seen include budget code/cost center, business units, environments (e.g., test, staging, production), and owners.

2377
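
A tagging standard like the one described above is easier to enforce when it is written as a check. The following is a minimal sketch, not something from the book: the tag keys and the allowed environment names are assumptions based on the examples in the highlight, and `tag_problems` is a hypothetical helper.

```python
# Hypothetical tag keys, based on the examples in the highlight above.
REQUIRED_TAGS = ("cost-center", "business-unit", "environment", "owner")
# Assumed set of environment names; adjust to your own naming.
ALLOWED_ENVIRONMENTS = {"test", "staging", "production"}


def tag_problems(tags: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the tags comply."""
    # A tag counts as missing if the key is absent or its value is empty.
    problems = [f"missing tag: {key}" for key in REQUIRED_TAGS if not tags.get(key)]
    env = tags.get("environment")
    if env and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unknown environment: {env}")
    return problems
```

A check like this could run in a provisioning pipeline, so that untagged resources are rejected before they are created rather than hunted down afterwards.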

Reference architectures. How can you build security and governance into your environment from the very beginning, and rely on automation to keep it up to date? If you can find and define commonalities in the tools and approaches you use across your applications, you can begin to automate the installation, patching, and governance of them.

2384

Alternatively, you might want multiple reference architectures for different classes or tiers of applications.

2387

defining an automation strategy, exploring a hybrid architecture, providing continuous delivery capabilities to enable business units to move more quickly and run-what-they-build, defining data governance practices, and implementing dashboards that give transparency to the metrics/KPIs that are important to your business.

2391

EXECUTIVE SUPPORT It’s hard for a CCoE to succeed if it’s not driven by strong leadership.

2406

EDUCATING YOUR STAFF The CCoE should be leading the charge to educate the rest of your organization on cloud and how your organization uses it, and to evangelize the best practices, governance, and frameworks that you use to support your business.

2413

EXPERIMENTATION The CCoE provides the guardrails that allow the rest of the organization to experiment quickly, while enhancing the organization’s security posture. By implementing reference architectures for common application patterns and developing one or more continuous integration platforms, the CCoE can enable dependent business units to experiment in a consistent and compliant way, allowing the organization to run-what-they-build, fail fast, learn, and deliver value to the business faster than before.

2421

PARTNERS Partners are, as I’ve mentioned, there to accelerate your cloud strategy, and your CCoE can help accelerate your partner strategy. You can use your CCoE to stay on top of the evolving partner ecosystem, evaluate new tools, and steward the best practices of how the new wave of cloud tools and consultants are integrated into the complex enterprise environment.

2425

HYBRID Cloud is not an all-or-nothing value proposition, and any enterprise that has been running IT for a significant period of time will run some form of a hybrid architecture. Your CCoE should drive your hybrid strategy and develop the standards and reference architectures for how your cloud and on-premises applications can call each other and migrate to the cloud over time.

2432

CLOUD-FIRST At some point, your CCoE will prove to some (and eventually all) of your business units that they’re better off asking themselves why they shouldn’t use cloud for their projects than asking why they should.

2440

Using automation and having reference architectures for your legacy and/or compliant applications will lead to faster time-to-market, and you should find that your business units want to work with your CCoE rather than have to be coerced.

2442

I encourage companies wanting to shift to a DevOps culture to do so in a DevOps fashion—start with small projects, iterate, learn, and improve. I encourage them to consider implementing strategies that produce commonly accepted practices across the organization, and to begin embracing the idea that, when automated, ongoing operations can be decentralized and trusted in the hands of many teams that will run what they build.

2461

By implementing and inventing frameworks, best practices, and governance, and by automating everything, DevOps became one of our key levers for driving innovation and accelerating product development.

2466

three tenets to start:

2471

1.Be customer service-oriented throughout your organization. Businesses today should think of their internal stakeholders as customers.

2472

2.Automate everything. It’s widely understood that to get the most out of the cloud, you’ll need to be able to reliably reproduce your systems using code.

2477

3.Run what you build. This is often where I see traditional IT become anxious.

2481

Start small and be satisfied with incremental improvements. Cultural changes don’t happen overnight. Slowly begin applying these concepts across your portfolio, and both new and old ways of working can improve.

2490

1. CUSTOMER-SERVICE-CENTRICITY IMPROVES THE IT BRAND

2512

My experience has taught me that when someone finds an easier way to execute a task, they’ll likely take it. If they aren’t getting the service they need from IT, they’ll try to find that service elsewhere.

2519

2. CUSTOMER-SERVICE-CENTRICITY IS GOOD FOR YOUR CAREER In a DevOps model, where application teams run what they build, ownership and customer service go hand in hand.

2534

Ownership simply means that any individual responsible for a product or service should treat that product or service as his or her own business.

2537

Ownership creates better customer service because it places responsibility and reputation directly on the shoulders of the person overseeing the product.

2540

I try to encourage executives to make run what you build a crucial tenet in their DevOps-driven organizations. Here are just a few benefits and behaviors that I’ve seen organizations reap:

2578

Design for production. Run what you build forces development teams to think about how their software is going to run in production as they design it.

2580

Greater employee autonomy. The run-what-you-build mentality encourages ownership and accountability, which, in my experience, leads to more independent, responsible employees and even career growth in the organization.

2585

Greater transparency. No one wants their personal time interrupted.

2588

More automation. Developers hate repeating manual tasks, so if they find they have to do something over and over in production to address an issue, they’re more likely to get to the root cause and automate things along the way.

2592

Better operational quality. Things like transparency and automation will make your teams more efficient, and will continue to raise the bar for operational excellence.

2594

More-satisfied customers. Run what you build forces the entire IT team to understand more about the customer.

2596

EXPECT TO GROW YOUR DEVOPS PRACTICE SLOWLY Like most things worth doing, implementing a culture of DevOps takes time. I encourage anyone going down this path to start small and deliberately.

2608

benefits I’ve seen organizations celebrate as they mature a culture of DevOps:

2632

Constant releases. A DevOps culture should be conducive to smaller changes being made more frequently.

2633

Globally distributed apps. Watching an application scale up and down across different regions of the world as each time zone conducts its business is one of the most rewarding experiences I’ve had as an IT executive.

2639

The data center migration. Everything IT does should be driven by the business.

2644

three trends

2676

These three myths are:

2677

Myth One: Hybrid is a permanent destination. Permanent is too strong a word to describe this point of view.

2677

four factors that are working toward accelerating this transition:

2681

1.The economies of scale that cloud providers achieve are continuing to grow with adoption. These benefits, one way or another, will benefit cloud consumers.
2.The pace of innovation coming from cloud technologies is unprecedented. AWS released 516 new services and features in 2014, 722 in 2015, and 1,017 in 2016.
3.The technologies that companies depend on to run their business (e-mail, productivity, HR, CRM, etc.) are increasingly being built on the cloud.
4.The technologies and businesses that exist to help companies migrate to the cloud are growing rapidly in number. To get an idea, check out the AWS Marketplace and the AWS Partner Network.

2682
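
To put the release figures quoted above in perspective, the year-over-year growth can be worked out directly. This is only an arithmetic sketch over the three numbers in the highlight; `yearly_growth` is a hypothetical helper name.

```python
# Release counts quoted in the highlight above (new AWS services and features).
releases = {2014: 516, 2015: 722, 2016: 1017}


def yearly_growth(counts: dict[int, int]) -> dict[int, float]:
    """Percentage growth of each year over the previous one."""
    years = sorted(counts)
    return {
        year: (counts[year] / counts[prev] - 1) * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(releases)
for year, pct in growth.items():
    print(f"{year}: {pct:.1f}% more releases than {year - 1}")
```

Both years come out at roughly 40% growth over the year before.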

Myth Two: Hybrid allows you to seamlessly move applications between on-premises infrastructure and the cloud.

2689

Architecting your applications to seamlessly work across your data centers and the cloud will limit you to the functionality of the lowest common denominator.

2693

Myth Three: Hybrid allows you to seamlessly broker your applications across several cloud providers.

2694

Like Myth Two, this also limits the functionality to the lowest common denominator.

2703

“Surround yourself with the best people you can find, delegate authority, and don’t interfere as long as the policy you’ve decided upon is being carried out.” —RONALD REAGAN

2757

STAGE 4—OPTIMIZATION As our AWS footprint grows, we continually look for ways to optimize the cost and improve speed by automating reoccurring deployment activities. Some of the optimization efforts are “tuning” the infrastructure that’s allocated for each application.

2880

Other optimization efforts have been bolder, refactoring traditional platforms to use a serverless model. We currently have several key applications that are converting over to this pattern, and we have staffed an Agile team to enable our software engineers to use these new services, similar to what was done initially with a CCoE.

2886

Learning: Invest early on the foundation, including:
1.Establish your landing zone
2.Train your team
3.Transform your security practices
4.Simplify your CI/CD practices and operational tooling
5.Establish an API approach including a micro-services POV, tooling, and reference architectures, especially if you have monoliths to decompose!

2995

COMMUNICATING EARLY AND OFTEN This is a complex process with many factors. Having buy-in from all levels of the system is critical to the execution of this transformation.

3000

Learning: Keep your stakeholders in the loop by communicating early and often. Transitioning ownership and accountability from the core group developing the business case to those that will ultimately own the execution is critical. Include it in your plan—timing is everything!

3002

Cloud allows us to experiment in a way that is not easily possible on-premises.

3036

Do not think of the cloud as just another Engineering project. To get the full benefits of the cloud (an adaptable infrastructure, experimentation and innovation, and DevOps), many of your processes and even your organizational structure will need to change.

3052

develop a Cloud Policy. In simple, easy-to-understand language, this document lays down the rules for operating in the cloud.

3058

The CIS AWS Foundations Benchmark provides many example policies that you can incorporate into your policy document, which include:
All root accounts will be protected with MFA
Internet gateways will not be used
AWS IAM Password Policy will match Corporate Password Policy
Encryption at Rest and In Flight is required for all AWS services

3063
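
Rules like the ones listed above can also be expressed as automated checks, so the policy document and its enforcement stay in step. The sketch below is an illustration only: the account snapshot is plain data with hypothetical field names, and "password policy matches corporate policy" is reduced to a minimum-length comparison for brevity. A real implementation would read these values from the cloud provider.

```python
from typing import Callable

# Hypothetical snapshot of one account's configuration, held as plain data.
Account = dict

# Each rule pairs the policy wording with a predicate over the snapshot.
POLICY: list[tuple[str, Callable[[Account], bool]]] = [
    ("root account protected with MFA", lambda a: a["root_mfa_enabled"]),
    ("no internet gateways", lambda a: a["internet_gateway_count"] == 0),
    # Simplification: only the minimum password length is compared.
    ("IAM password policy matches corporate policy",
     lambda a: a["iam_min_password_length"] >= a["corporate_min_password_length"]),
    ("encryption at rest and in flight",
     lambda a: a["encrypted_at_rest"] and a["encrypted_in_flight"]),
]


def violations(account: Account) -> list[str]:
    """Return the names of the policy rules this account breaks."""
    return [name for name, check in POLICY if not check(account)]
```

Keeping the rule names close to the wording of the policy document makes a violation report readable by the same people who approved the policy.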

Once the document is finished, the Cloud Services (CS) team can then proceed to develop the governance and operational guidelines and processes used to implement the policy.

3068

From a technology perspective, we viewed multi-cloud as a strategic hindrance. We felt that a multi-cloud approach would limit us to the lowest-common-denominator of features across clouds.

3080

we developed a multi-region strategy with a HOT/WARM failover scenario. AWS has made this process significantly easier with the release of both more North American regions and Direct Connect Gateway.

3085

Next, we abstracted away the operating system using containers (specifically Docker).

3088

Third, we have completely automated our cloud deployments. This has two benefits to mitigate single vendor risk:

3090

1.The entire environment is in code so we can easily audit what needs to be changed if we need to change cloud providers.

3091

2.With automated deployments, changing providers is just a matter of changing the automation.

3093

Additionally, we use simple low-cost abstraction layers where appropriate. For example, we have written a simple wrapper around Amazon Simple Queue Service (SQS) as it has a simple application interface that is common across many messaging platforms.

3095

Please note that we try to avoid writing abstraction layers as much as possible. We believe that adding these layers forces us to keep up with AWS’s pace of innovation in areas that aren’t core to our business, which can negatively impact our investments in software engineering.

3097
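
The kind of thin wrapper described above might look like the following. This is a sketch under assumptions, not the authors' code: `MessageQueue`, `InMemoryQueue`, and `drain` are hypothetical names, and the in-memory class stands in for a provider-backed implementation (for example one built on SQS) that would expose the same two methods.

```python
from collections import deque
from typing import Optional, Protocol


class MessageQueue(Protocol):
    """The small interface that application code is allowed to depend on."""

    def send(self, body: str) -> None: ...

    def receive(self) -> Optional[str]: ...


class InMemoryQueue:
    """Stand-in implementation; a provider-backed class would match this shape."""

    def __init__(self) -> None:
        self._messages: deque[str] = deque()

    def send(self, body: str) -> None:
        self._messages.append(body)

    def receive(self) -> Optional[str]:
        # Returns None when the queue is empty instead of blocking.
        return self._messages.popleft() if self._messages else None


def drain(queue: MessageQueue) -> list[str]:
    """Read messages until the queue is empty, using only the interface."""
    received = []
    while (body := queue.receive()) is not None:
        received.append(body)
    return received
```

Because the interface has only two methods, swapping the backing service means writing one new class. That narrow surface is what keeps a wrapper like this low-cost, in line with the caution above about abstraction layers.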

MIGRATING YOUR FIRST APPLICATION Choosing your first application to migrate is a key decision in the AWS adoption.

3127

our key takeaways for your cloud adoption and digital transformation:
Your first application should be “Cloud Obvious”
Try to ignore TCO as a primary cloud driver
Build a policy document that describes your rules for operating on AWS
Get support from senior leadership outside of IT on your cloud and digital transformation journey
Develop a multi-cloud strategy which may not include multiple cloud providers
Engage with a Partner to help build out your initial Cloud platform and practices

3155

prime directive. It says, “Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.”

3318

“If you spend too much time thinking about a thing, you’ll never get it done.” —BRUCE LEE

3348

2.Use the two-week rule for refactoring decisions. We started our two-year migration plan with a flexible principle that allowed us to be opportunistic about refactoring; but nothing could delay the two-year target due to the data center lease expiration,

3758

the team developed a specific two-week rule of thumb that it still uses today. If we could refactor a sub-optimal component or service in our stack within two weeks, we would refactor versus lift-and-shift.

3761
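
The two-week rule above is simple enough to state as a decision function. This is a sketch only: `migration_approach` is a hypothetical name, and treating an estimate of exactly two weeks as "within two weeks" is an assumption.

```python
from datetime import timedelta

# The refactoring budget from the rule of thumb above.
REFACTOR_BUDGET = timedelta(weeks=2)


def migration_approach(estimated_refactor_time: timedelta) -> str:
    """Refactor if the estimate fits the budget; otherwise lift-and-shift."""
    if estimated_refactor_time <= REFACTOR_BUDGET:
        return "refactor"
    return "lift-and-shift"
```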

In XOIT we have one golden rule: each role and each group that has accountability or is responsible for a domain must have full autonomy and authority to decide how it will reach its purpose (whether group or role).

3869

The infrastructure teams have started to take advantage of the automation that virtualization enables, but for the most part this capability has not been extended to the development teams. Self-service delivery models are still mostly manual, often requiring days or weeks of approvals using limited automation. Changes are hard since teams are still operating in silos, and orchestrating changes across these silos often comes with a lot of bureaucratic overhead that is rarely managed well. Ultimately, very little has been achieved in terms of builder velocity. Builders are still frustrated by limited automation, which impacts their productivity. It still takes too long to get things done for the business.

3929

Cloud Center of Excellence. The CCoE team develops and maintains the core infrastructure templates, designs reference architectures, educates the dev teams and helps them migrate applications to the cloud. The development teams leverage infrastructure as code pipelines and are starting to develop continuous integration pipelines for their applications.

3953

The Cloud COE also focuses more on developing reference architectures, governance, and compliance frameworks and allowing the development team more autonomy to deploy both infrastructure and applications through a unified CI/CD pipeline.

3984

“Tenet: a principle, belief, or doctrine generally held to be true; especially: one held in common by members of an organization, movement, or profession.” —MERRIAM-WEBSTER

4037

At Amazon, our Leadership Principles guide and shape our behavior and culture, and they’re at the heart of our ability to innovate at a rapid pace to serve our customers.

4047

My recommendation is to define a set of cloud tenets to help guide you to the decisions that make the most sense for your organization.

4050

The charter or mission states the what; the tenets state the how. Tenets are principles and core values that the program or team uses to fulfill the mission or charter.

4066

Be memorable. Being

4068

Each tenet has only one main idea.

4069

Be program specific. Good tenets will get people excited about the cloud program.

4070

cloud capabilities.” Counsel. Tenets help individuals make hard choices and trade-offs.

4073

Tenets keep you honest. It’s easy to get caught up in group-think or distracted by the nuances of a specific project and lose sight of the overall goals.

4076

Give app teams the control / ability to consume cloud services without artificial barriers—“If AWS deployed to the public, why can’t we use it?”

4081

Our mission statement was to figure out the right tooling and practices that would empower our development teams to deliver awesome digital experiences for our customers with agility and confidence.

4116

12 steps to get going when starting, that consistently deliver results:

4354

STEP 1—DON’T OVER-THINK IT; ASSIGN A DEVELOPER AND START!

4355

STEP 2—EMPOWER A SINGLE-THREADED LEADER

4362

STEP 3—CREATE YOUR 2-PIZZA CLOUD BUSINESS OFFICE

4372

Amazon’s 2-pizza team concept means a team of around 8–10 people. And, in this case, I’m referring to the virtual leadership team, which needs to provide strategic oversight and tactical air cover for engineers and developers as you move to the public cloud.

4374

STEP 4—ESTABLISH YOUR TENETS (AND BE PREPARED TO AMEND THEM AS YOU GO)

4390

1.Be Clear on your Business Goal—Are you reducing cost? Transforming to digital native? Reducing your app footprint? Or closing your data center?

4397

2.Choose a Predominant Public Cloud Partner—This provides focus for your organization to get to an expert level with a predominant platform, avoiding the distractions that come with too many platforms, across people, process, and technology paradigms.

4400

3.Agree on Your Security Objectives. I recommend reading these excellent white papers

4402

4.Remember That the Team You Have Is the Team You Need.

4405

5.You Build It, You Support It—small 2 pizza teams that own what they build

4407

6.Command and Control or Trust, But Verify approaches for your Engineers and Developers.

4409

STEP 5—CREATE YOUR QUESTIONS PARKING LOT The leadership team (everyone) will have lots of questions. Unfortunately, many hours will be wasted trying to answer them without the right folks in the room, and your progress could stall.

4410

Top Tip—The very best way to answer a lot of questions quickly is to arrange an executive briefing centre session with AWS.

4415

STEP 6—CREATE YOUR 2-PIZZA CLOUD ENGINEERING TEAM Creating your holistic cloud engineering team,

4418

STEP 7—BRING IN A PARTNER OR AWS PROSERVE

4433

STEP 8—WORK BACKWARDS FROM YOUR SECURITY, COMPLIANCE AND AVAILABILITY OBJECTIVES

4438

STEP 9—SHIP SOMETHING TO PRODUCTION THAT IS IMPORTANT, BUT NOT CRITICAL You need to get something that’s meaningful into production.

4449

STEP 10—TRAIN, GAIN EXPERIENCE, AND CERTIFY YOUR TEAMS The key role of the CCoE is to ensure that the people journey for everyone is managed positively and proactively.

4456

STEP 11—START MIGRATING—“PLANS ARE WORTHLESS, BUT PLANNING IS EVERYTHING”—DWIGHT D. EISENHOWER Once you have multiple teams you can really start thinking about migrating.

4461

STEP 12—TRUST, BUT VERIFY Finally, the question that many larger enterprises come back to time and time again is “How do I balance control (especially security) and innovation?”