diff --git a/expectations/roles-and-responsibilities.md b/expectations/roles-and-responsibilities.md index 5d79a2b..048e235 100644 --- a/expectations/roles-and-responsibilities.md +++ b/expectations/roles-and-responsibilities.md @@ -1,4 +1,4 @@ -# Roles and Responsibilities Framework: Product Owners, Engineers, and IT/Operations +# Roles and Responsibilities Framework: Product Owners, Engineers, and Platform Engineering ## Product Owners @@ -8,6 +8,7 @@ - UX Design & User Interactions: Define user experience flows, wireframes, and interaction patterns - Success Criteria Definition: Establish measurable KPIs, acceptance criteria, and business objectives - Product Documentation: Maintain product specifications, user stories, and feature documentation +- Product Functional Testing: Ensures product functionality matches the proposed success criteria ### Product Owners Additional Responsibilities @@ -21,16 +22,12 @@ ### Product Owners Boundaries -- Does not make technical architecture decisions -- Does not directly manage engineering resources or timelines -- Does not configure or deploy infrastructure - -### Product Owners Escalation and Decision-Making Authority - -- Product Decisions: Product Owners have final authority on feature requirements and business priorities -- Technical Decisions: Engineers have authority on technical implementation and architecture choices -- Operational Decisions: IT/Operations has authority on infrastructure, deployment methods, and operational procedures -- Cross-functional Conflicts: Escalate to management when decisions impact multiple domains +- Does not make technical architecture decisions (e.g., cannot mandate specific + frameworks, database choices, or API designs) +- Does not directly manage engineering resources or timelines (e.g., cannot assign + tasks to individual engineers or set development sprint schedules) +- Does not configure or deploy infrastructure (e.g., cannot modify CI/CD pipelines, + deployment configurations, or cloud resources) ## Engineers/Developers @@ -38,54 +35,83 @@ - End-to-End Software Development: Design, code, and implement UX, backend, and data components - Testing: Create and execute unit tests, integration tests, and functional tests +- CI/CD Pipeline Development: Build, test, and validate continuous integration and + deployment pipelines for all environments - Software Security: Implement security best practices and address vulnerabilities - Release Documentation: Provide technical inputs for release notes and deployment guides -- Deployment Configuration Support: Specify runtime requirements and configuration needs for IT +- Deployment Configuration Build-out and Support: Develop and specify runtime + requirements and configuration needs for product deployment ### Engineers/Developers Additional Responsibilities -- Technical Architecture: Design system architecture, APIs, and data models -- Code Quality & Standards: Establish coding standards, conduct code reviews, and maintain technical debt +- Technical Architecture: Design system architecture, APIs, security, and data + models following the Architectural and Platform Engineering Team guidelines and + recommendations. Present significant architectural changes and proposals at the + weekly architectural review call for team alignment and approval. +- Code Quality & Standards: Establish coding standards, implement automated lint + and test checks, conduct code reviews, and maintain technical debt - Performance Optimization: Ensure applications meet performance and scalability requirements -- Technical Documentation: Create API documentation, technical specs, and architecture diagrams -- Environment Management: Support development, staging, and testing environment needs +- Technical Documentation: Create API documentation, technical specs, and + architecture diagrams. Ensure product documentation is deployed following the + latest guidelines and standards. +- Environment Management: Define configuration requirements and specifications for + development, staging, testing, and production environments. Coordinate with + Platform Engineering to ensure secrets and run-time dependencies are available. +- Deployment Operations: Implement and manage product deployment configurations, + including scaling policies, load-balancing rules, auto-scaling groups, and + deployment strategies. Monitor and optimize deployment processes for reliability + and performance. - Incident Response Support: Provide technical expertise during outages and debugging - Technology Evaluation: Research and recommend new tools, frameworks, and technologies -- Cross-functional Collaboration: Work with Product Owners on feasibility and IT on deployment requirements +- Cross-functional Collaboration: Work with Product Owners on feasibility and Platform + Engineering on deployment requirements ### Engineers/Developers Boundaries -- Does not define business requirements or product strategy -- Does not manage production infrastructure or CI/CD pipelines +- Does not define business requirements or product strategy (e.g., cannot decide + which features to build or prioritize product roadmap) +- Does not provision underlying cloud infrastructure or manage cloud resources + directly (e.g., cannot create AWS accounts, VPCs, IAM roles, or modify cloud + billing settings without Platform Engineering) +- Does not bypass Platform Engineering review and approval processes for deployment + changes (e.g., cannot deploy infrastructure changes to production without proper + review and approval) - Does not directly handle customer support or business stakeholder communication + (e.g., does not negotiate SLAs or respond to customer feature requests) -## IT/Operations +## Platform Engineering -### IT/Operations Primary Responsibilities +### Platform Engineering Primary Responsibilities -- CI/CD Management: Configure and maintain build, test, and deployment pipelines +- Review & Approval Authority: Review and approve all CI/CD pipelines, deployment + configurations, environment changes, scaling policies, and infrastructure + modifications developed by Engineers. Ensure all changes comply with organizational + security policies, compliance requirements, and operational standards. - Infrastructure Provisioning: Manage cloud resources, servers, and networking components -- Service Deployment: Execute application deployments across environments - Monitoring & Alerting: Implement and maintain system monitoring, logging, and alerting - Incident Management: Respond to outages, coordinate resolution, and conduct post-mortems - Service Health Reporting: Track and report on uptime, performance metrics, and SLA compliance -### IT/Operations Additional Responsibilities +### Platform Engineering Additional Responsibilities - Infrastructure Security: Implement security controls, access management, and compliance frameworks - Capacity Planning: Monitor resource usage and plan for scaling requirements - Backup & Disaster Recovery: Maintain data backup systems and disaster recovery procedures -- Environment Management: Provision and maintain development, staging, and production environments +- Environment Management: Provision and maintain development, staging, and production + environments based on Engineer-defined specifications - Cost Optimization: Monitor cloud costs and optimize resource allocation - Vendor Management: Manage relationships with cloud providers and third-party service vendors - Documentation & Runbooks: Maintain operational procedures and troubleshooting guides - Compliance & Auditing: Ensure infrastructure meets security and regulatory requirements -### IT/Operations Boundaries +### Platform Engineering Boundaries -- Does not make product feature decisions or business requirements -- Does not write application code or business logic -- Does not define user experience or product functionality +- Does not make product feature decisions or business requirements (e.g., cannot + decide which features to implement or change product priorities) +- Does not write application code or business logic (e.g., does not develop + application features, APIs, or user interfaces) +- Does not define user experience or product functionality (e.g., cannot determine + UI workflows, feature specifications, or user interaction patterns) ## Cross-Role Collaboration Points @@ -93,31 +119,141 @@ - Product Owners define requirements and success criteria - Engineers provide technical feasibility and effort estimates -- IT/Operations advise on infrastructure constraints and deployment considerations +- Platform Engineering advise on infrastructure constraints and deployment considerations ### Development Phase - Product Owners provide clarification and acceptance testing -- Engineers develop features and coordinate deployment requirements with IT -- IT/Operations prepare environments and deployment configurations +- Engineers develop features, build deployment configurations, and implement + deployments subject to Platform Engineering approval +- Platform Engineering review and approve environment and deployment configurations ### Release Phase - Product Owners coordinate go-to-market activities and communicate with stakeholders -- Engineers support deployment activities and provide technical guidance -- IT/Operations execute deployments and monitor system health +- Engineers execute deployments and troubleshoot deployment issues +- Platform Engineering approve deployments and monitor infrastructure health, security, and costs ### Post-Release Phase - Product Owners gather user feedback and measure success metrics -- Engineers address bugs and performance issues -- IT/Operations monitor system performance and manage operational concerns +- Engineers address bugs, resolve security issues, and optimize performance +- Platform Engineering monitor infrastructure cost and manage operational concerns -## Summary of Escalation and Decision-Making Authority +### Ongoing Architectural Governance + +- Engineers present architectural changes and proposals at weekly 1-hour architectural review call +- Architectural team (cross-product representation) provides feedback, guidance, and approval +- Ensures architectural consistency and knowledge sharing across the engineering organization +- Decisions and recommendations documented for reference by all teams + +## Approval Workflows + +### Items Requiring Platform Engineering Review and Approval + +The following items require formal review and approval from Platform Engineering before +implementation in production environments: + +- CI/CD pipeline configurations and modifications +- Production environment configuration changes +- Deployment configurations including scaling policies and load balancing rules +- Infrastructure-as-Code (IaC) templates and modifications +- Security group rules and network configurations +- Access management and permissions changes + +### Items Requiring Architectural Team Review + +The following items are presented at the weekly 1-hour architectural review call +for team alignment and approval: -### Product Decisions +- Significant system architecture changes or new architectural patterns +- New technology stack additions or framework changes +- API design patterns and cross-service integration approaches +- Data model changes affecting multiple services or products +- Security architecture decisions and implementations +- Performance and scalability architectural considerations + +### Approval Process + +1. **Submission**: Engineers submit change requests via designated channels (e.g., + pull requests, ticketing system, or infrastructure change request forms) +2. **Review**: Platform Engineering reviews for compliance with security policies, + operational standards, and best practices +3. **Feedback**: Platform Engineering provides feedback within agreed-upon SLA (typically + 1-2 business days for standard changes) +4. **Approval**: Platform Engineering approves or requests modifications +5. **Implementation**: Engineers implement approved changes +6. **Verification**: Platform Engineering verifies implementation meets approved specifications + +### Expedited Approval Process + +For urgent production issues or security vulnerabilities: + +- Engineers may request expedited review via designated escalation channels +- Platform Engineering provides accelerated review (target: within 4 hours during business hours) +- Post-implementation review conducted within 24 hours for compliance verification + +### Disagreement Resolution + +If Engineers and Platform Engineering disagree on approval decisions: + +1. Document technical rationale from both perspectives +2. Schedule joint review meeting within 24 hours +3. If unresolved, escalate to engineering management for final decision +4. Document decision and rationale for future reference + +## Summary of Escalation and Decision-Making Authority -- Product Decisions: Product Owners have final authority on feature requirements and business priorities -- Technical Decisions: Engineers have authority on technical implementation and architecture choices -- Operational Decisions: IT/Operations has authority on infrastructure, deployment methods, and operational procedures -- Cross-functional Conflicts: Escalate to management when decisions impact multiple domains +### Decision Authority by Domain + +- **Product Domain**: Product Owners have final authority on feature requirements + and business priorities +- **Technical Domain**: Engineers have authority on technical implementation, + architecture choices (with architectural team approval), and deployment + configuration design +- **Operational Domain**: Platform Engineering has authority on infrastructure provisioning, + deployment approvals, and operational standards/compliance +- **Cross-functional Conflicts**: Escalate to management when decisions impact multiple domains + +### Ticketing System Guidance + +#### Ticket Creation Requirements + +All work items must be tracked through Jira tickets to ensure proper coordination, +visibility, and accountability across teams. Tickets should be created in the +appropriate product Jira board for: + +- **Feature Development**: New functionality or capabilities +- **Enhancements**: Improvements to existing features +- **Bug Fixes**: Defect resolution and corrections +- **CI/CD Changes**: Pipeline modifications, build process updates, or testing framework changes +- **Deployment Updates**: Configuration changes, scaling adjustments, or deployment strategy modifications + +#### Platform Engineering Support Requests + +When Engineers require Platform Engineering team support for infrastructure, deployment, +or operational needs: + +1. **Create Ticket in Product Board**: Create the support request ticket within the + same product Jira board (not a separate Platform Engineering board) +2. **Assign to Platform Engineering Team Member**: Assign the ticket to one of the + designated Platform Engineering team members: + - Alan Sherman + - Antonia Gaete + - Trevor Bramwell +3. **Slack Communication**: After creating and assigning the ticket, post a message + in the `#lfx-devops` Slack channel that includes: + - A link to the Jira ticket + - An `@mention` tag referencing the assigned team member + - Brief context about the request if not immediately clear from the ticket title + +**Example Slack Message**: +> @alan I've created ticket LFXV2-302 for the Nats: Automated Backup Solution +> implementation: [Jira link](https://linuxfoundation.atlassian.net/browse/LFXV2-302) + +#### Benefits of This Workflow + +- **Centralized Tracking**: All product-related work remains in the product board +- **Clear Ownership**: Explicit assignment ensures accountability +- **Transparent Communication**: Slack notification provides immediate visibility +- **Audit Trail**: Complete history of requests and resolutions in one location