Visions and Principles

Vision Drafts

“Our DevOps systems empower teams to automatically release high-quality functionality built with secure, scalable, and reusable standards.”

“Consistent automated releases of secure, reliable, and high-quality functionality to customers.”

Key Elements:

Consistent: Implies scheduling, predictability, and frequent, standardized releases.
Automated: A non-negotiable requirement covering build, pipelines, deployment, testing, and monitoring.
Release: Encompasses more than just deployment and build; it extends to customer delivery.
Reliable: Must be measurable against a defined standard, which in turn requires observability.
High-quality: Requires well-defined testing and quality assurance.

“Empower our developers to focus directly on providing customer value without worrying about how to securely develop, build, test, and deploy. The CI/CD platform provides common interfaces that satisfy the needs of all stakeholders across supported languages and environments.”

Principle Drafts

1. Standardized

In Practice:

Maintain a finite selection of languages, runtimes, cloud services, security tools, and code templates.
Stay flexible enough to adapt as software development life cycle (SDLC) needs evolve.
Implement critical guardrails to manage security and operational constraints with minimal overhead.
Use standards to enable releases at scale.
Provide freedom to build within secure constraints.
Enforce appropriate standards for workflows and tools.
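
As an illustration of a guardrail with minimal overhead, the sketch below checks a service against a finite list of approved runtimes and an approved code template. The names `APPROVED_RUNTIMES`, `ServiceManifest`, and `guardrail_violations` are hypothetical, and the runtime values are placeholders, not a list taken from this document.

```python
from dataclasses import dataclass

# Hypothetical allowlist: the real set of approved runtimes is an assumption.
APPROVED_RUNTIMES = {"python3.12", "go1.22", "node20"}


@dataclass(frozen=True)
class ServiceManifest:
    """Minimal, assumed description of a service submitted to the pipeline."""
    name: str
    runtime: str
    uses_approved_template: bool


def guardrail_violations(manifest: ServiceManifest) -> list[str]:
    """Return human-readable violations; an empty list means the service passes."""
    violations = []
    if manifest.runtime not in APPROVED_RUNTIMES:
        violations.append(
            f"{manifest.name}: runtime '{manifest.runtime}' is not on the approved list"
        )
    if not manifest.uses_approved_template:
        violations.append(
            f"{manifest.name}: not built from an approved code template"
        )
    return violations


if __name__ == "__main__":
    for problem in guardrail_violations(ServiceManifest("legacy", "perl5", False)):
        print(problem)
```

A check like this could run as an early pipeline step, so teams keep the freedom to build anything that stays inside the constraints.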

2. Metrics-Driven

In Practice:

Continuously measure and improve SDLC performance.
Implement a scorecard for tracking progress.
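
One possible shape for such a scorecard is sketched below. It compares each metric with its previous value and with a target. The metric names and numbers in the usage example are illustrative assumptions, and `Metric`, `score`, and `scorecard` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    previous: float  # value in the prior reporting period
    current: float   # value in this reporting period
    target: float    # agreed goal for the metric
    higher_is_better: bool = True


def score(metric: Metric) -> dict:
    """Report whether a metric meets its target and whether it improved."""
    if metric.higher_is_better:
        meets_target = metric.current >= metric.target
        improved = metric.current > metric.previous
    else:
        meets_target = metric.current <= metric.target
        improved = metric.current < metric.previous
    return {
        "metric": metric.name,
        "meets_target": meets_target,
        "improved": improved,
    }


def scorecard(metrics: list[Metric]) -> list[dict]:
    """Build one scorecard row per metric."""
    return [score(m) for m in metrics]


if __name__ == "__main__":
    # Assumed example metrics; substitute whatever the team actually tracks.
    rows = scorecard([
        Metric("deploys_per_week", previous=3, current=5, target=5),
        Metric("change_failure_rate", previous=0.10, current=0.12,
               target=0.05, higher_is_better=False),
    ])
    for row in rows:
        print(row)
```

Tracking both "meets target" and "improved" separates teams that are below target but trending well from teams that are regressing.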

3. Remote and Distributed

In Practice:

Enforce SDLC practices designed for distributed teams.
Architect systems that scale to geographically diverse users.
Promote remote-friendly development practices.
Maintain comprehensive documentation and runbooks.
Establish clear ownership of systems and processes.
Enhance local development environments using cloud resources.
Ensure equal access to networking and deployment capabilities for all developers.
Support async approvals and on-call best practices.

4. Reusable

In Practice:

Favor libraries over duplicated code.
Use shared network and storage patterns over custom one-off solutions.
Contribute to shared codebases when feasible.
Prefer primitives over frameworks (adapted from AWS).
Follow modular and composable design principles (inspired by HashiCorp and Unix philosophy).

5. Scalable

In Practice:

Reduce technical debt by enabling refactoring over full rewrites.
Build systems using small, independently iterated components.
Design microservices and nanoservices for easy component swaps.
Ensure monoliths can evolve efficiently through structured refactoring.
Architect for seamless growth and adaptability (adapted from AWS).

6. Automated

In Practice:

Deploy code to customers without manual intervention.
Automate build, deployment, canary testing, and feature flag management.
Use automation to enforce constraints, standards, and best practices.
Remove manual steps that introduce the risk of human error.
Strive for zero operational toil (adapted from Google SRE and AWS).

7. Optimize for Easy Onboarding

In Practice:

Ensure new and existing team members can seamlessly work with systems, both locally and remotely.
Minimize effort required for non-engineers to engage in SDLC workflows.
Maintain high parity between local and remote development environments.
Enable designers, PMs, architects, QA engineers, and SREs to integrate into engineering workflows.
Provide a scorecard for assessing adherence to principles.

8. Design API-First

In Practice:

Prioritize system integration through APIs over direct code interactions.
Define a control plane API to facilitate automation.
Ensure all operational tasks can be executed via APIs.
Maintain modular system architecture with API-driven automation (adapted from AWS).

9. Expect Failures

In Practice:

Document failure states and recovery strategies.
Automate outage recovery processes.
Provide a control plane API for system operators.
Define:
- How to detect failure states.
- How to differentiate between atomic and partial failures.
- Which indicators signal degraded performance.
- Which steps automated outage recovery follows (adapted from AWS).
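
The definitions above can be sketched as a small classifier over health probes. This sketch assumes that "atomic" means every probed component failed and "partial" means only some did; the document does not define those terms, so treat that reading, the latency budget, and the recovery action names as hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    PARTIAL_FAILURE = "partial_failure"
    ATOMIC_FAILURE = "atomic_failure"


@dataclass(frozen=True)
class Probe:
    component: str
    ok: bool
    latency_ms: float


# Assumed threshold: a probe slower than this counts as degraded performance.
LATENCY_BUDGET_MS = 500.0

# Hypothetical mapping from detected state to an automated recovery step.
RECOVERY_ACTIONS = {
    State.HEALTHY: "none",
    State.DEGRADED: "scale_out",
    State.PARTIAL_FAILURE: "restart_failed_components",
    State.ATOMIC_FAILURE: "fail_over_to_standby",
}


def classify(probes: list[Probe]) -> State:
    """Classify overall system state from per-component probe results."""
    if not probes:
        raise ValueError("at least one probe result is required")
    failed = [p for p in probes if not p.ok]
    if len(failed) == len(probes):
        return State.ATOMIC_FAILURE
    if failed:
        return State.PARTIAL_FAILURE
    if any(p.latency_ms > LATENCY_BUDGET_MS for p in probes):
        return State.DEGRADED
    return State.HEALTHY


def plan_recovery(probes: list[Probe]) -> str:
    """Pick the recovery step associated with the detected state."""
    return RECOVERY_ACTIONS[classify(probes)]
```

Keeping detection (`classify`) separate from the action table makes it possible to document each failure state next to its recovery strategy.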

10. Performance as a Competitive Advantage

In Practice:

Continuously improve performance.
Implement deep observability and testing strategies.
Define measurable performance metrics and SLAs.
Commit to iterative system improvements through refactoring.
Conduct performance testing at release and in production.
Utilize A/B testing to optimize performance.
Implement logging, metrics, and tracing across all system tiers.
Monitor resource budgets (cloud, licensing, etc.).
Track system changes and SDLC contributions.
Generate comprehensive test reports.
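
A measurable performance metric can be as simple as a latency percentile compared with a target. The sketch below uses the nearest-rank percentile method, chosen here for simplicity. The function names, the 95th percentile, and the 300 ms target are assumptions for illustration, not values from this document.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(samples: list[float], pct: float, target_ms: float) -> bool:
    """True when the chosen latency percentile is within the target."""
    return percentile(samples, pct) <= target_ms


if __name__ == "__main__":
    # Assumed request latencies in milliseconds from one test run.
    latencies = [120.0, 95.0, 310.0, 180.0, 150.0, 140.0, 200.0, 110.0, 130.0, 160.0]
    print(percentile(latencies, 95))
    print(meets_latency_target(latencies, 95, target_ms=300.0))
```

The same check could run both in the release pipeline and against production telemetry, which matches the goal of testing performance at release and in production.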

By following these principles, we ensure that our development and operational practices are scalable, reliable, and focused on delivering high-quality software efficiently.