Wednesday, January 2, 2008

Where are the design patterns for software operations?

In the world of software development, application developers are accustomed to drawing from the wealth of design patterns that address common programming problems, codify best practices, and establish proven, reusable solutions. There are several well-known design pattern repositories that catalog solutions into various categories, from fundamental ones described by the GangOfFour to architecture-specific ones like J2EE Patterns, and even ones for social organization. An Anti-pattern is a pattern that tells how to go from a problem to a bad solution. Design patterns help avoid re-inventing solutions and, when combined, can form the basis of a problem-solving "playbook." When used effectively, design patterns become a common problem-solving language and can lead to better-written software.

But what happens after the code is written? For most organizations today, software operations - the acts of deploying, configuring, and operating software (and all of its related code and data artifacts) - is arguably as important as writing the software itself. If such an organization can't efficiently and reliably operate the software, the quality of the software will not matter. But if one looks for design patterns that codify best practices for automating software operations, nothing turns up. Where is the catalog of design patterns that address the problems encountered when managing environments of software deployments and the overall life cycle of the business service?

Anyone who has managed software operations for different organizations will recognize the same kinds of problems and will often re-invent solutions that were successful in the past. Others who work closer to the bleeding edge will encounter problems that other groups will face later. If these problems could be discussed in terms of design patterns (or failures as anti-patterns), solutions and best practices for managing software operations would be more consistent across organizations.

Here are two specific problem areas that everyone can identify with:

  • Packages: Depending on the application and infrastructure, one will find multiple package formats in use. Operating systems use their own (e.g., .rpm, .deb, .pkg, .msi) and so do software runtime environments (e.g., Java, .NET). Each format has its own way (to a greater or lesser extent) of being created, extracted, and described (including dependencies). These differences lead to multiple package silos and administrative gray areas (cumbersome handoffs between dev and admin groups). It would be preferable to have a common repository that can host any kind of package type, and a homogeneous interface for controlling the package life cycle (creation, installation, and removal).

  • Services: At a certain level, one can view applications as a set of interacting long-running processes. Again, depending on the application architecture, these processes might be standalone Unix-style daemons or Windows services. Each service has its own way of being started or stopped, as well as a procedure for checking its current runtime state. Often, shutting down a service is not a simple matter of invoking a single command. Things go wrong at shutdown, requiring other logic to figure out the next course of action. Besides coping with these differences, the deployment process is also difficult because changes of runtime state and software package installation are intertwined. Software operations would benefit from a body of design patterns that describe proven strategies for managing runtime state and a common model for describing these states.
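
To make the "homogeneous interface" idea from the Packages item a little more concrete, here is a minimal sketch in Python using the classic Adapter pattern. Everything named here (PackageFormat, RpmFormat, DebFormat, format_for) is hypothetical and invented for illustration; it is not an existing library. The sketch only builds the command lines that the native tools would be given (rpm -i / rpm -e and dpkg -i / dpkg -r) and does not execute them.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from pathlib import Path


class PackageFormat(ABC):
    """Hypothetical common interface over format-specific package tools."""

    suffix: str  # file extension this adapter handles

    @abstractmethod
    def install_command(self, package_file: str) -> list[str]:
        """Return the native command that installs the given package file."""

    @abstractmethod
    def remove_command(self, package_name: str) -> list[str]:
        """Return the native command that removes the named package."""


class RpmFormat(PackageFormat):
    suffix = ".rpm"

    def install_command(self, package_file: str) -> list[str]:
        return ["rpm", "-i", package_file]

    def remove_command(self, package_name: str) -> list[str]:
        return ["rpm", "-e", package_name]


class DebFormat(PackageFormat):
    suffix = ".deb"

    def install_command(self, package_file: str) -> list[str]:
        return ["dpkg", "-i", package_file]

    def remove_command(self, package_name: str) -> list[str]:
        return ["dpkg", "-r", package_name]


# Registry of known adapters, keyed by file extension.
FORMATS = {fmt.suffix: fmt for fmt in (RpmFormat(), DebFormat())}


def format_for(package_file: str) -> PackageFormat:
    """Pick the adapter for a package file based on its extension."""
    suffix = Path(package_file).suffix
    try:
        return FORMATS[suffix]
    except KeyError:
        raise ValueError(f"no adapter registered for {suffix!r} packages") from None
```

A caller would write format_for("app-1.0.rpm").install_command("app-1.0.rpm") without caring which tool sits underneath. Supporting another format would mean adding one more adapter class, assuming its tool can be driven the same way.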

Here is a sampling of general recurring problems in the world of software operations:

  • Complex application deployments: Applications are based on technologies from different vendors, are spread out over numerous machines in multiple environments, and use different architectures.

  • Inconsistent management interfaces: Every application component and supporting piece of infrastructure has a different way of being managed. This includes both how components are controlled and how they are configured.

  • Hard-to-scale administrative management: As the layers of software components increase, so does the difficulty of coordinating actions across them. This is especially difficult when the same application can be set up to run in a minimal footprint in one place and designed to support massive load and redundancy in another.

  • Incoherent life cycles: Applications are typically multi-tiered, where each tier may be on its own development track, with its own release paradigm and requisite tools.

Generally, these problems are found in combination, which makes coping with them as a whole a difficult challenge.

What's needed: Domain-specific patterns for software operations

The body of existing design patterns can and should be used to analyze and solve some of the above problems. To make the design patterns more readily useful to software operations, we need a set of domain-specific patterns. These patterns would be expressed in terms of concepts familiar to software operations groups (e.g., package, service, process, node) and would be geared to coping with the typical problems they face (e.g., various startup and shutdown strategies for services, among many others). Ideally, these patterns can be composed into a system of patterns that helps solve larger-scale problems.
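
As one illustration of what a domain-specific pattern might look like, here is a rough sketch of a shutdown strategy expressed in operations terms: ask the service to stop gracefully, check its runtime state, and escalate to a forced kill only if it is still running. The names (State, FakeService, graceful_then_forced) are hypothetical, and FakeService is an in-memory stand-in for a real daemon so the example runs anywhere. A real implementation would also need timeouts and waiting between attempts.

```python
from enum import Enum


class State(Enum):
    """A deliberately small, hypothetical model of service runtime state."""
    RUNNING = "running"
    STOPPED = "stopped"


class FakeService:
    """Stand-in for a real daemon; ignores the first `stubborn` graceful stops."""

    def __init__(self, name, stubborn=0):
        self.name = name
        self.state = State.RUNNING
        self._stubborn = stubborn

    def stop(self):
        # A graceful stop request that a misbehaving service may ignore.
        if self._stubborn > 0:
            self._stubborn -= 1
        else:
            self.state = State.STOPPED

    def kill(self):
        # A forced stop that always succeeds in this simulation.
        self.state = State.STOPPED

    def status(self):
        return self.state


def graceful_then_forced(service, attempts=2):
    """Try a graceful stop up to `attempts` times, then escalate to kill.

    Returns the list of actions taken so the caller can log or audit them.
    """
    actions = []
    for _ in range(attempts):
        service.stop()
        actions.append("stop")
        if service.status() is State.STOPPED:
            return actions
    service.kill()
    actions.append("kill")
    if service.status() is not State.STOPPED:
        raise RuntimeError(f"{service.name} is still running after kill")
    return actions
```

Because the strategy depends only on stop, kill, and status, the same function could in principle be applied to a Unix daemon wrapper or a Windows service wrapper, provided each exposes those three operations.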

Developing patterns is a bit of an organic process, but the most durable patterns are ones that have been proven over and over again in different contexts. The first step is to establish a repository to which various patterns can be contributed and a supporting forum where their merits can be discussed. Ultimately, the software operations community will find consensus about some of these patterns, thus establishing a common vocabulary and a basis for framework development.


External links:
PortlandPatternRepository
Hillside

5 comments:

Unknown said...

Michael Nygard's Release It! covers many of these topics, and well. It even includes a small pattern language.

Alex Honor said...

I hadn't seen this book yet but it looks interesting. The table of contents suggests that it has many suggestions for how to design software systems so they are overall more stable and predictable (and as a result less problematic for ops teams). I am personally interested in a pattern language that allows software operations groups to implement management software and best practice. The book may cover this so I look forward to finding out what it has to say in that regard.

Anonymous said...

Disclaimer: Didn't read the book. I just browsed through the excerpt available on the publisher's site.

It seems that what I saw were more tips for administrators than design patterns in the classic sense. Pattern is a pretty broad term, but IMHO a pattern is really a direct precursor to executable code. What I think (and please correct me if I am wrong!) Alex is describing here is a need for software design patterns for people who are writing operations tools.

Alex Honor said...

One immediate interest of mine is to identify design patterns that are applicable to the software operations management world.
To make the business service reliable for the end user, there are two dimensions to consider:
1) the implementation of the service software.
2) the implementation of the service-management software.

General software design best practices apply to both efforts. But the business logic of service-management is an area where I believe there is a lack of established and accepted design patterns.

Design patterns establish paradigms which eventually get captured into frameworks and common libraries. If I ask system engineers from different companies how they manage some operations activity for a business service, it seems everyone has reinvented the wheel. This "wheel reinventing" is an indication there are common solutions but there isn't a common conceptual language that would have made them look consistent at an implementation level. If there were popular frameworks and libraries based on accepted paradigms for service-management, implementations would also look more consistent.

JohnWinner said...

I'm looking for things like how to avoid your code being stolen, your software being cracked, how to package / obfuscate your libraries so people can't decompile them, and how to manage the software updates. I have a large "helper" library that I link to almost all my projects. Is there a way to avoid distributing the whole library with the project, and ship only the portion of code used?
I'm looking for answers/patterns about those questions.
Any help appreciated!