Friday, June 26, 2009

Automated Infrastructure enables Agile Operations

"Agile" been applied to such unanticipated domains as enterprises, start ups, investing, etc. Agile encompasses several generic common sense principles (eg: simple design, sustainable pace, many incremental changes, action over bureaucracy, etc.) so the desire to bestow its virtues on all kinds of endeavors is understandable.

But why contemplate the idea of Agile Operations? Why would Agile Operations even make sense?

Let's start by playing devil's advocate. Some of the Agile principles appear to contradict well-established and accepted systems administration goals, namely stability and availability. Traditional culture in operations leans towards risk aversion and stasis in an attempt to assure maximum service levels. Many operations groups play a centralized role serving multiple business lines and have evolved a top-down, command-and-control management structure that seeks to limit who can make changes. From their point of view, change is the enemy of stability and availability. With stability and availability being the primary goals of operations, it's easy to see where the skepticism towards Agile Operations comes from.


The calls for Agile Operations have initially been driven by product development groups that employ Agile practices. These groups churn out frequent, small improvements to their software systems on a daily basis. The difference in change management philosophy has been the cause of a growing clash between development and operations. The clash intensifies when the business wants to drive these rapid product development iterations all the way through to production (even 10+ times a day).


So, if operations is to avoid being a bottleneck to this Agile-empowered flow of product changes, how can it do so in a way that won't create unmanageable chaos?

To apply Agile to the world of operations, one must first see all infrastructure as programmable. Rather than see infrastructure as islands of equipment that were set up by reading a manual and typing commands into a terminal, one sees infrastructure as a set of components that are bootstrapped and maintained through programs. In other words, infrastructure is managed by executing code, not by applying changes manually at the keyboard.
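
As a rough illustration of what "managed by executing code" can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the Host class is an in-memory stand-in for a real server, and WEB_SERVER_SPEC and converge() are hypothetical names, not the API of any particular tool. The idea it tries to show is that the desired state is written down as data, and a program compares it with the actual state and applies only the difference.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """In-memory stand-in for a real server (hypothetical, for illustration only)."""
    name: str
    packages: set = field(default_factory=set)
    files: dict = field(default_factory=dict)


# Desired state, kept under version control like any other source file.
# The package names and file path are made up for this example.
WEB_SERVER_SPEC = {
    "packages": {"httpd", "openssl"},
    "files": {"/etc/motd": "Managed by code. Do not edit by hand.\n"},
}


def converge(host, spec):
    """Bring the host in line with the spec and return the list of changes made."""
    changes = []
    # Install only the packages that are missing.
    for pkg in sorted(spec["packages"] - host.packages):
        host.packages.add(pkg)
        changes.append(f"install {pkg}")
    # Rewrite only the files whose content differs from the spec.
    for path, content in sorted(spec["files"].items()):
        if host.files.get(path) != content:
            host.files[path] = content
            changes.append(f"write {path}")
    return changes


if __name__ == "__main__":
    web01 = Host("web01", packages={"openssl"})
    print(converge(web01, WEB_SERVER_SPEC))  # first run applies what is missing
    print(converge(web01, WEB_SERVER_SPEC))  # second run has nothing left to do
```

Because a second run makes no changes, the same code can be run repeatedly and against many hosts, which is what makes it a reasonable replacement for typing commands by hand.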


Replacing manual tasks with executable code is the crucial enabler to sharing a common set of change management principles between development and operations. This alignment is truly the key first step in allying development and operations to support the business' time to market needs. This shared change management model also facilitates a few additional beneficial practices.

  • Shared code bases: Store and control application and infrastructure code in the same place so both dev and ops staff have clear visibility into everything needed to create a running service.
  • Collaborative configuration management: Application and infrastructure configuration management code can be jointly developed early in the development cycle and tested in development integration environments. Code and configuration become the currency between dev and ops.
  • Skill transfer: App and ops engineers can transfer knowledge about the inner workings of the runtime application system and develop skills around tooling to maintain them.
  • Reproducibility: Reproducing a running application from source and a build specification is vital to managing a business at scale. (http://www.itpi.org/home/visibleops.php)

While some may argue that "Agile" in its entirety does not completely apply to the world of operations, an automated infrastructure based on principles like code sharing as a form of collaboration between dev and ops is a sound basis to enable business agility.

5 comments:

The IT Skeptic said...

OK I give up. How? Are you proposing a new generation of management tools?

Alex Honor said...

The short answer is yes but it does not necessarily mean the management tools are commercial or even third party. In my opinion, the essence of getting smoother collaboration between dev and ops is through code sharing. That code might plug into a formalized management framework but it might also just reflect established shared conventions.

Bill said...

We already share code with configuration management through a continuous build platform. It's not enough.
If you have a dedicated platform, it may work, but in the case of a shared platform you're still blocked. It is also a change of philosophy. In a dev team, Agile is visible; you can feel and touch it. Not in production today.

I agree with you on the continuous deployment approach. But I think most of the job to be done is training OPS.

Alex Honor said...

Hi Bill,
Yes, I do believe the cultural/philosophical shift is just as important. In a sense, the ops group (at least those that build/manage infrastructure) must change into something like a dev group, only this dev group in ops drives infrastructure and app integration software projects.

Besides the need to better align with dev/product side of the org, ops needs to adopt a code-driven infrastructure development model due to the growing complexity and scale. Code sharing between dev and ops can also help address poor release and bugfix handoffs by testing the infrastructure and change process early in test environments.

My question is this: how else can ops guarantee consistency across operational environments without resorting to painstakingly slow, eyeball-verified manual procedures?
Thanks

Damon Edwards said...

@Bill Treating your infrastructure as code is a cultural shift as much as a tooling shift.

Old habits die hard. "Tweak the live system until we get it right" is still all too common an operating paradigm.

@ITSkeptic, the tools may be new for some but examples have been around for a couple of years and are already in production use. Look at open source tools like Puppet, ControlTier, Chef, Cobbler... all specification driven tools. The specification is the "code".

The fully automated provisioning whitepaper we talked about in previous posts shows these tools in action:

http://blog.controltier.com/2009/04/new-whitepaper-achieving-fully.html