Purpose
This procedure keeps changes to Maelstrom AI’s production systems controlled, tested, and documented, so that security and availability are maintained while still enabling rapid innovation.
Scope
Applies to all repositories in our GitHub organisation, including:
- Rust backend services (provii-verifier including hosted mode, provii-issuer)
- Rust libraries (provii-crypto, provii-mobile-sdk, shared-rate-limit)
- TypeScript portals and workers (admin-portal, provii-management, provii-credit-management, provii-status, shared-portal-lib)
- TypeScript SDKs and sites (provii-agegate, provii-website, provii-docs)
- Mobile applications (provii-mobile)
- Integration and demo projects (provii-demos)
- Infrastructure configuration (wrangler.toml, KV settings)
- Security controls
- Cryptographic parameters
- Access permissions
Change Types
Standard Changes
Definition: Routine, low-risk changes following established procedures
Examples:
- Dependency updates (non-breaking)
- Documentation updates
- Configuration tweaks
- Bug fixes (non-security)
Approval: Automatic once all CI/CD checks pass
Process:
- Create feature branch
- Implement change
- CI/CD tests pass (required)
- Code review via pull request (self-review; two-person approval is not feasible for a sole operator, see Review)
- Merge to main
- Automatic deployment
Timeline: Minutes to hours
Normal Changes
Definition: Significant changes that require review but are not emergencies
Examples:
- New features
- Refactoring
- Third-party integrations
- Schema changes
Approval: Code review + Security Lead signoff (if security-relevant)
Process:
- Create feature branch
- Threat model (if security-relevant)
- Implement with tests
- Code review
- CI/CD validation
- Merge to main
- Monitor post-deployment
Timeline: Days to weeks
Emergency Changes
Definition: Urgent changes to resolve critical incidents
Examples:
- Security vulnerability patches
- Production outages
- Signing key rotations (if compromised)
- Critical bug fixes
Approval: ISMS Owner or Security Lead (see Emergency Change Authorisation for P0 incidents)
Process:
- Assess urgency and risk
- Document change and justification
- Implement fix
- Expedited review (can be post-deployment for P0)
- Deploy immediately
- Retrospective within 24 hours
Timeline: Minutes to hours
Change Workflow
1. Planning
For Normal/Emergency Changes:
- Define scope and objectives
- Identify affected systems
- Assess risks and impacts
- Plan rollback strategy
- Determine testing requirements
- Schedule change window (if applicable)
2. Development
All changes:
- Create Git branch (git checkout -b feature/description)
- Implement change
- Write/update tests
- Run locally (wrangler dev --local)
- Commit with descriptive message
Security-Relevant Changes:
- Threat model documented
- Security testing included
- Security Lead review requested
3. Testing
Automated Testing (runs on every push, defined in .github/workflows/ci.yml):
- Security scanning (CodeQL, cargo audit, npm audit)
- Unit tests
- Integration tests
- Linting and formatting
- Build validation
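The automated stages above can be approximated locally before pushing. The following is a minimal sketch for a Rust crate only, not the contents of the real workflow file: the function names run_step and run_checks and the DRY_RUN variable are hypothetical, cargo audit assumes the cargo-audit subcommand is installed, and CodeQL is left out because it runs in GitHub rather than locally.

```shell
# Hypothetical local mirror of the CI stages for a Rust crate.
# The authoritative definition is .github/workflows/ci.yml.

# Print a command, then run it unless DRY_RUN=1.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Run each stage in order and stop at the first failure.
run_checks() {
  run_step cargo audit &&
    run_step cargo test &&
    run_step cargo fmt --check &&
    run_step cargo clippy -- -D warnings &&
    run_step cargo build --release
}
```

Calling DRY_RUN=1 run_checks prints the commands without executing them, which is a cheap way to confirm the stage order. For the TypeScript repositories the equivalent stages would use npm audit and the project's own test and lint scripts.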
Manual Testing:
- Smoke testing for major changes
- E2E testing for user-facing features
- Performance testing (if relevant)
4. Review
Code Review Requirements:
- Self-review via pull request with passing CI status checks before merge. Two-person approval is not feasible for a sole operator (accepted limitation, see the risk register).
- Reviewers check:
- Code quality and style
- Security implications
- Test coverage
- Documentation updates
- Where a second reviewer is available, the approver cannot be the author
Security Review Triggers:
- Changes to authentication/authorisation
- Cryptographic code changes
- New third-party dependencies
- Infrastructure configuration
- Access control changes
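One way to make these triggers harder to miss is to flag changed paths automatically. The sketch below is illustrative only: the function names and the path patterns (directories containing "auth" or "crypto", wrangler.toml, and the dependency manifests Cargo.toml and package.json) are assumptions rather than the organisation's actual layout, and a match is a prompt to request review, not a substitute for judgement.

```shell
# Read changed file paths on stdin (one per line) and print those that
# match a security-review trigger. The patterns are illustrative guesses.
security_review_paths() {
  grep -E '(^|/)[^/]*(auth|crypto)[^/]*/|(^|/)wrangler\.toml$|(^|/)(Cargo\.toml|package\.json)$' || true
}

# Succeed (exit 0) if at least one path on stdin matches a trigger.
needs_security_review() {
  [ -n "$(security_review_paths)" ]
}
```

A possible use on a feature branch is: git diff --name-only origin/main...HEAD | needs_security_review && echo "Request Security Lead review".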
5. Approval
Automated Approval:
- CI/CD passes ✅
- Code review approval ✅
- Dismiss stale reviews on new pushes ✅
Manual Approval (if required):
- Security Lead: Security-relevant changes
- ISMS Owner: Emergency changes, major releases
6. Deployment
Automated Deployment (standard/normal):
# Triggered by merge to main
git checkout main
git merge feature/description
git push origin main
# GitHub Actions runs wrangler deploy
Manual Deployment (emergency):
# From local machine (requires Cloudflare credentials)
cd provii-verifier # or provii-issuer/worker
wrangler deploy --env production
Deployment Verification:
- Health check passes
- Smoke tests pass
- Monitoring shows normal operation
- Rollback prepared if needed
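The health check step can be scripted so that a slow start is not mistaken for a failed deployment. This is a sketch under assumptions: the function name wait_for_healthy, the retry count, and the delay are hypothetical, and it assumes the health endpoint answers HTTP 200 when the service is healthy.

```shell
# Poll a health endpoint until it returns HTTP 200 or attempts run out.
# Usage: wait_for_healthy URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthy() {
  local url="$1" attempts="${2:-5}" delay="${3:-2}" status="" i=1
  while [ "$i" -le "$attempts" ]; do
    # -s silent, -o discard the body, -w print only the status code.
    # "|| true" keeps the loop going when curl itself fails (status 000).
    status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url" || true)"
    if [ "$status" = "200" ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "unhealthy: last status ${status:-none}" >&2
  return 1
}
```

For example, wait_for_healthy https://verify.provii.app/health 5 2 could be run immediately after wrangler deploy, with a non-zero exit treated as a signal to start the rollback procedure.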
7. Post-Deployment
Monitoring (first 30 minutes):
- Error rates (Cloudflare Workers Logs in Grafana Loki)
- Response times
- Authentication success rates
- Any anomalous behaviour
Rollback Triggers:
- Error rate >5%
- Critical functionality broken
- Security issue introduced
- Performance degradation >50%
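The two numeric triggers can be evaluated mechanically. The sketch below assumes that "performance degradation" means the relative increase in response time over a pre-deployment baseline, which the procedure does not state explicitly; the function name should_rollback and its argument order are hypothetical. The non-numeric triggers (broken functionality, a security issue) still need a human decision.

```shell
# Decide whether post-deployment metrics cross a numeric rollback trigger.
# Usage: should_rollback ERROR_RATE_PCT BASELINE_LATENCY_MS CURRENT_LATENCY_MS
# Exit status 0 means "roll back", 1 means "keep"; the reason is printed.
should_rollback() {
  awk -v err="$1" -v base="$2" -v cur="$3" 'BEGIN {
    if (err > 5) {
      print "rollback: error rate " err "% exceeds 5%"
      exit 0
    }
    # Assumed definition: relative latency increase over the baseline.
    if (base > 0 && (cur - base) / base > 0.5) {
      print "rollback: latency degraded by more than 50%"
      exit 0
    }
    print "ok"
    exit 1
  }'
}
```

Both thresholds are strict, matching the ">" in the list above, so an error rate of exactly 5% or a slowdown of exactly 50% does not trigger a rollback on its own.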
Communication:
- Internal: Team notification
- External: Status page (if customer-impacting)
Rollback Procedures
Automatic Rollback
Not currently implemented. Manual rollback required.
Manual Rollback
Process:
# 1. Identify last known good commit
git log --oneline
# 2. Revert the problematic commit
git revert <commit-hash>
git push origin main
# 3. Or deploy specific good version
git checkout <good-commit>
cd provii-verifier
wrangler deploy --env production
# 4. Verify rollback successful
curl https://verify.provii.app/health
Timeline: Target <5 minutes for critical rollbacks
Change Documentation
Git Commit Messages
Format:
<type>: <brief description>
<detailed explanation>
<why the change was needed>
<what alternatives were considered>
Fixes: #issue-number
Types: feat, fix, refactor, docs, test, chore, security
Example:
security: Implement rate limiting for challenge creation
Add token bucket rate limiting to prevent API abuse.
Limit: 10 challenges per minute per IP.
Fixes: #123
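The subject-line format can be enforced with a small check. This is a sketch, not an existing hook in the repositories: the function name valid_commit_subject is hypothetical, and it validates only the first line against the listed types, not the body or the Fixes trailer.

```shell
# Validate the subject line of a commit message read from stdin.
# Accepts "<type>: <brief description>" with one of the listed types.
valid_commit_subject() {
  head -n 1 | grep -Eq '^(feat|fix|refactor|docs|test|chore|security): [^ ].*$'
}
```

Git passes the path of the message file as the first argument to a commit-msg hook, so inside such a hook the check could be invoked as valid_commit_subject < "$1" || exit 1.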
Change Log
For Major Releases:
- Update CHANGELOG.md
- Follow Keep a Changelog format
- Group by Added/Changed/Fixed/Security
Special Change Types
Cryptographic Changes
Extra requirements:
- Cryptographer review (if available) or thorough self-review
- Extensive testing (including property-based, fuzz testing)
- Backward compatibility considered
- Documentation updated
- May require coordinated release with dependent services
Infrastructure Changes
Examples: wrangler.toml, KV namespace changes, Durable Object migrations
Requirements:
- Test in development environment first
- Document expected impact
- Plan for rollback (KV changes may not be easily reversible)
- ISMS Owner approval
Access Control Changes
Examples: GitHub permissions, Cloudflare roles, API key changes
Requirements:
- Documented justification
- Principle of least privilege verified
- Security Lead approval
- Audit log check post-change
Change Calendar
Change Windows
Preferred Windows:
- Standard changes: Any time (automated)
- Normal changes: Business hours (easier monitoring)
- Major changes: Tuesday-Thursday (avoid Fridays/Mondays)
Change Freeze:
- None currently (continuous deployment)
- May implement for major events (e.g., days before certification audit)
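The Tuesday to Thursday preference for major changes is easy to check before scheduling a deployment. A minimal sketch, assuming the local system date is the relevant one; the function name in_major_change_window is hypothetical and the window is a preference, not a hard gate.

```shell
# Succeed if the ISO weekday (1 = Monday .. 7 = Sunday) falls within the
# preferred Tuesday-Thursday window. Defaults to today's weekday.
in_major_change_window() {
  local dow="${1:-$(date +%u)}"
  [ "$dow" -ge 2 ] && [ "$dow" -le 4 ]
}
```

For example, in_major_change_window || echo "Outside the preferred window for major changes" prints a reminder on Fridays through Mondays.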
Metrics and KPIs
Tracked Metrics:
- Deployment frequency (current: multiple per week)
- Lead time for changes (commit to production)
- Change failure rate (target: <5%)
- Mean time to recovery (target: <1 hour)
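Change failure rate and mean time to recovery can be derived from a simple deployment log. The sketch below assumes a CSV with a header row and the hypothetical columns deploy_id,failed,recovery_minutes, where failed is 0 or 1; neither the file format nor the function name change_metrics comes from an existing tool.

```shell
# Compute change failure rate (%) and mean time to recovery (minutes)
# from a CSV on stdin: deploy_id,failed,recovery_minutes (with header row).
change_metrics() {
  awk -F, '
    NR > 1 {
      total++
      if ($2 == 1) { failures++; downtime += $3 }
    }
    END {
      if (total == 0) { print "no deployments"; exit }
      rate = 100 * failures / total
      mttr = (failures > 0) ? downtime / failures : 0
      printf "change_failure_rate=%.1f%% mttr_minutes=%.1f\n", rate, mttr
    }'
}
```

With four deployments of which two failed and took 30 and 90 minutes to recover, the output is change_failure_rate=50.0% mttr_minutes=60.0, which would miss the <5% failure-rate target and sit exactly on the one-hour recovery limit, so it would not meet the <1 hour target either.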
Quarterly Review:
- Analyse metrics
- Identify improvement opportunities
- Update procedures as needed
Emergency Change Authorisation
Who Can Authorise:
- ISMS Owner - Any emergency change
- Security Lead - Security incidents
- On-call Engineer - P0 incidents (retrospective approval within 24h)
Documentation:
- Record in incident ticket
- Explain urgency and risk
- Document testing performed
- Post-mortem within 24 hours
Related Documents
- Incident Response - Emergency changes during incidents
- Access Control Policy - Access control changes
- Business Continuity Plan - Major disruption changes
- Statement of Applicability - Control A.8.32
Document Information
- Version: 1.1
- Effective Date: 2025-01-13
- Last Updated: 2026-05-21
- Owner: ISMS Owner
- Maintained By: Security Lead
- Review Frequency: Annually
- Next Review: 2026-11-21
- Classification: Public