This is how we do DevOps

At ImpacTech we optimise for flow, avoid drift and break down silos by following these DevOps best practices

IT companies strive to get their product to market faster and offer customers value and quality service through continuous evolution. How well they do this reflects their DevOps practice.

The first objective is a function of the development team, while the second is the responsibility of operations. Doing both well is essential to success. Yet despite their shared responsibility for software deployment and application support, the two departments don’t often interact. In most cases, they operate in siloed functions that do not encourage teamwork.

As a result, these departments frequently find themselves at cross purposes, and failure to address this issue over time results in competitive disadvantage in the market.

SILOS RESULT IN DISCONNECTED TEAMS

Development teams seek to deploy the changes and features requested by customers and product owners, and to address reported bugs as soon as possible. They work to deploy these changes without affecting the product roadmap, team velocity or the tasks they’ve committed to completing.

By contrast, the primary goal of an operations team is to provide a reliable, stable and secure environment for customers. Operations oppose changes to software that put their production services into a yellow state or result in any service degradation. The mindset is to avoid an “outage” at all costs.

When left unchecked, the discord between these teams increases the time needed to create new features and address bugs. It also reduces software quality and accelerates a downward spiral of deployments going wrong.

Over time the situation worsens. As management, product owners and customers push for more features and faster bug fixes, the teams respond with easy fixes and semi-prepared solutions, which increase the technical debt (see definition) of the IT company.

As the company moves deeper down the spiral under the weight of the accumulated debt, development that once took hours ends up taking days, and deployments that used to take days spread out over weeks.

Technical Debt

Technical Debt describes the impact of decisions made that lead to problems that become more difficult/complex to solve and limit the options available in the future. Like financial debt, technical debt carries significant “interest” over time.

Case study: LINKEDIN

Following its successful IPO in 2011, LinkedIn faced major issues that were magnified by the huge growth of the organization: failing deployments, short outages and every deployment posing a risk to the reliability of the system.
The technical debt accumulated over the years had finally caught up with LinkedIn, and the primary monolithic service could not keep up with the vision of even faster growth.
In response, management took the brave decision to put all new development on pause for two months to re-architect and refactor the code base.

(reference: https://www.linkedin.com/pulse/when-your-tech-debt-comes-due-kevin-scott)

THE DOWNWARD SPIRAL

The root of the problem is the desire of the Operations team to ensure applications/infrastructure run smoothly and products deliver value to the customers.

As technical debt accumulates, day-to-day operations become more complex and fragile. Workarounds are applied more frequently, and the Development team promises that a definitive fix will be applied in the next release, when more time is available.

But that time is never available.

Development complexity increases, and the product becomes more fragile to change, no matter how many new hands are on deck.

Deadlines are missed, and customers raise more tickets with the Sales and Customer Support teams, which increases the need to compensate for the last change that broke functionality or caused a delay.

To placate the customers a huge new feature is promised.

The product owner pressurises the development team to deliver the new feature on time, fix the bugs and achieve the new project’s goals.

However, the technical challenges cannot be met without an easy fix. Another workaround is applied to Production with the inevitable consequence: even greater technical debt.

Over time, processes become more complex, team members work at full capacity, and effective communication drops to the point where the company becomes so tightly coupled that even the smallest change can cause big failures.

Engineers avoid touching anything that is still working, operations avoid deploying unless it is critical, and management brings in even more gatekeepers (overpopulating the approval chain).

With every cycle the IT company plunges further down the spiral under the weight of increasing technical debt.

The software delivery cycle becomes slower, customer feedback slows, and the company is unable to respond to the demands of the market. Competitive advantage is lost, and the best engineers look for opportunities to get out as the most important customers cancel their contracts.

But it doesn’t have to be like this.

A BETTER WAY FOR DEVOPS TO CO-EXIST

For DevOps to work harmoniously and achieve the company’s common goals, organizational excellence is required. The components of this model are:

Small development teams that work independently and validate their work in production-like environments. Their code is deployed quickly and safely.
Deployments for the Operations team are easy and predictable and happen during normal business hours.
Feedback is provided for each step of the software lifecycle. Automated testing assures the Development team that bugs are found as soon as possible and assures the Operations team that technical debt is addressed as soon as it is discovered.
Environments exist that provide telemetry data (system metrics, log monitoring and tracing capabilities) to identify and fix any issue as soon as it appears.
Software/infrastructure architecture that helps speed up the lifecycle and decouples each engineer’s work from the others’. This makes scaling possible and practical.
The lifecycle is fast enough that we value the fail fast approach. We can take risks, for example by releasing a feature to a subset of customers to identify any issues before offering it as a general release.
Every engineer owns his or her work and seeks to achieve the highest quality by ensuring that every nut and bolt is tested automatically.
And when something goes wrong, we safely roll back, fix the issue and identify at the Retrospective what actions need to be taken to ensure that the issue does not happen again.
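
The idea of testing a feature on some customers before a general release is often implemented as a percentage rollout. The following is a minimal Python sketch under stated assumptions, not a description of any particular tool: the function names `rollout_bucket` and `is_feature_enabled`, the feature name `new-checkout` and the customer IDs are all hypothetical. It hashes the feature name and customer ID so that each customer always lands in the same bucket.

```python
import hashlib


def rollout_bucket(feature: str, customer_id: str) -> int:
    """Map a (feature, customer) pair to a stable bucket in the range 0-99."""
    key = f"{feature}:{customer_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def is_feature_enabled(feature: str, customer_id: str, rollout_percent: int) -> bool:
    """Return True if the customer falls inside the rollout percentage."""
    if not 0 <= rollout_percent <= 100:
        raise ValueError("rollout_percent must be between 0 and 100")
    # Buckets below the threshold see the feature; everyone else does not.
    return rollout_bucket(feature, customer_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical customer IDs, used only for illustration.
    customers = [f"customer-{n}" for n in range(1000)]
    enabled = [c for c in customers if is_feature_enabled("new-checkout", c, 10)]
    print(f"{len(enabled)} of {len(customers)} customers see the feature")
```

Because a customer's bucket never changes, raising the percentage only adds customers and never removes those who already have the feature. Setting the percentage to 0 acts as a simple rollback switch, and setting it to 100 corresponds to a general release.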

In this organisational model, quality is paramount, and the model provides the space to learn continuously from any failures. Engineers can take personal responsibility, unencumbered, in pursuit of the goal of dominating the marketplace by providing higher value to our customers.

DOES THIS APPROACH PROVIDE VALUE FOR A BUSINESS?

According to the annual State of DevOps report issued by Puppet Labs, high-performing organisations that follow these DevOps principles outperform their peers in the following areas:

Throughput metrics
Thirty times more frequent code deployments
Two hundred times faster lead time from code commit to production
Reliability metrics
Production deployments with a 60% higher success rate
168 times faster mean time to restore service during incidents
Organisational performance metrics
Twice as likely to exceed organisational goals on productivity, market share and profitability
50% higher market capitalisation growth over three years
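
To compare your own organisation against figures like these, the throughput and reliability metrics can be derived from a simple deployment log. The sketch below is one possible way to do it in Python and is not the report's methodology: the `Deployment` record and its field names are assumptions made for illustration, and mean time to restore is measured here from the failed deployment to the moment service was restored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical deployment record; the field names are illustrative."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    succeeded: bool                         # False if it caused an incident
    restored_at: Optional[datetime] = None  # when service was restored


def deployment_frequency(deploys: List[Deployment], period_days: int) -> float:
    """Average number of deployments per day over the period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deploys) / period_days


def mean_lead_time(deploys: List[Deployment]) -> timedelta:
    """Average time from code commit to production."""
    if not deploys:
        raise ValueError("no deployments recorded")
    total = sum((d.deployed_at - d.committed_at for d in deploys), timedelta())
    return total / len(deploys)


def change_success_rate(deploys: List[Deployment]) -> float:
    """Fraction of deployments that did not cause an incident."""
    if not deploys:
        raise ValueError("no deployments recorded")
    return sum(1 for d in deploys if d.succeeded) / len(deploys)


def mean_time_to_restore(deploys: List[Deployment]) -> Optional[timedelta]:
    """Average time from a failed deployment to restored service, if any failed."""
    durations = [
        d.restored_at - d.deployed_at
        for d in deploys
        if not d.succeeded and d.restored_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


if __name__ == "__main__":
    # Made-up sample data covering four days.
    t0 = datetime(2024, 1, 1, 9, 0)
    day, hour = timedelta(days=1), timedelta(hours=1)
    log = [
        Deployment(t0, t0 + 2 * hour, True),
        Deployment(t0 + day, t0 + day + 4 * hour, False,
                   restored_at=t0 + day + 4 * hour + timedelta(minutes=30)),
        Deployment(t0 + 2 * day, t0 + 2 * day + 3 * hour, True),
        Deployment(t0 + 3 * day, t0 + 3 * day + 3 * hour, True),
    ]
    print("Deployments per day:", deployment_frequency(log, 4))
    print("Mean lead time:", mean_lead_time(log))
    print("Change success rate:", change_success_rate(log))
    print("Mean time to restore:", mean_time_to_restore(log))
```

Tracking these four numbers over time shows whether a change in process is actually improving flow and stability.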

And as a bonus, engineers get to work with cool tools.

HOW TO ACHIEVE THIS DEVOPS APPROACH

These DevOps principles are a combination of proven best practices and years of learning in important movements in both IT and manufacturing.

The Agile Manifesto suggests following a development discipline of continuous building, testing and integration. But why not extend this approach to add continuous delivery?

To achieve this, an organization needs to undergo a transformation based on three key principles:

Principle of Flow: Accelerate the delivery of work from business requirements to Development to Operations and finally to customers.
Principle of Feedback: Provide continuous feedback throughout the lifecycle of the product to create safer, more reliable systems.
Principle of Continual Learning: Daily work should produce lessons learned that can be promoted to organizational practices. These practices focus on continual improvement in conjunction with the above principles and result in a competitive advantage in the market.

This approach will break down any existing silos and prevent new ones from forming. You will more easily identify drift and be better equipped to resolve it. And with the resulting optimal flow you can unlock agility across departments, maintain product stability and ultimately ensure the quality your customers demand.