    fix(ci): stabilize Playwright workflows with build-once shards (#3096) · 47548349
    Piotr Mankowski authored
    * fix(ci): split Playwright build from shard execution
    
    Separate analyzer Playwright CI into build-once and shard test phases with cached image reuse, merged blob reporting, and explicit required gate jobs. This restores harness file-import parity in CI and formalizes local demo-video execution without slowing CI.
    
    * fix(ci): address PR review findings for report merge and image export
    
    Run report merge jobs even when tests fail, make merges tolerant to missing blob artifacts, ensure compose image exports include pulled external images, and sanitize doc examples with Linux/macOS plus WSL guidance.
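
A minimal sketch of what "tolerant to missing blob artifacts" could look like, assuming shard blob reports are downloaded as `.zip` files into a single directory before merging. The function name `merge_command` and the directory layout are hypothetical; only the `npx playwright merge-reports` command is Playwright's own CLI.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def merge_command(blob_root: Path) -> Optional[List[str]]:
    """Return the merge command, or None when there is nothing to merge.

    Hypothetical helper: a missing directory or a directory without blob
    zips is treated as "skip the merge" rather than as a failure.
    """
    blobs = sorted(blob_root.glob("*.zip")) if blob_root.is_dir() else []
    if not blobs:
        return None
    return ["npx", "playwright", "merge-reports", "--reporter", "html", str(blob_root)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(merge_command(root))  # no blobs yet, so the merge is skipped
        (root / "report-1.zip").write_bytes(b"")
        print(merge_command(root))
```

In a workflow, the merge step would run regardless of the test result and exit successfully when the helper returns nothing to merge.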
    
    * fix(ci): run analyzer harness reliably from PR workflow
    
    Move analyzer harness execution behind a reusable workflow invoked by the PR-triggered Playwright workflow so harness checks are always scheduled on pull requests. Add an explicit PR label opt-out path and keep gate behavior deterministic by accepting intentional analyzer skips.
    
    * refactor(ci): normalize PR check naming across workflows
    
    Apply a consistent CI naming taxonomy across backend, frontend, Playwright, analyzer harness, and legacy Cypress checks so PR status is grouped and scan-friendly. Align documentation to the new names and ruleset-based CI enforcement model for a cleaner cutover.
    
    * docs(specify): codify cypress deprecation and CI guardrails
    
    Add explicit Playwright-first and Cypress-legacy guidance in .specify artifacts, and document ruleset-aware CI rename guardrails so workflow naming changes stay aligned with required checks.
    
    * refactor(ci): simplify labels and surface analyzer harness grouping
    
    Remove redundant CI prefixes from core checks, make Cypress deprecation explicit in E2E check names, and rename Playwright analyzer checks to explicitly reference Analyzer Harness. Update docs and CI command guidance to match the revised naming model.
    
    * docs(playwright): declare canonical guidance source and align references
    
    Designate .specify/guides/playwright-best-practices.md as the single source of truth and update AGENTS/CLAUDE/frontend Playwright docs to reference it consistently while keeping README for operational details.
    
    * refactor(ci): align workflow taxonomy across files and checks
    
    Split frontend and deprecated Cypress into separate workflows, rename core workflow files under Backend/Frontend/E2E buckets, and standardize required gate contexts for clearer PR check grouping. Update docs and helper scripts to the new workflow names and check taxonomy.
    
    * fix(ci): format playwright README to pass prettier check
    
    * refactor(ci): rename gate jobs to Required Checkpoint labels
    
    Rename required gate job labels to clearer "Required Checkpoint" naming across backend, frontend, Playwright, and deprecated Cypress workflows, and sync the CI stabilization document accordingly.
    
    * refactor(ci): sort required checkpoints first in checks list
    
    Prefix required checkpoint gate names with 00 so they sort to the top of each workflow bucket in GitHub checks while preserving unique ruleset contexts.
    
    * refactor(ci): number required checkpoints for stable sorting
    
    Rename required gate jobs to numbered checkpoint labels so they sort predictably to the top of each workflow bucket while staying explicit in the checks UI and ruleset contexts.
    
    * refactor(ci): simplify checkpoint labels in checks UI
    
    Drop repeated category text from numbered checkpoint gate names so the rendered GitHub checks rely on the workflow bucket for context while still sorting checkpoints first.
    
    * refactor(ci): make backend follow leaf plus checkpoint pattern
    
    Split backend into a Build + Test leaf job plus the numbered checkpoint gate so all primary CI buckets use the same workflow structure and merge-gate shape.
    
    * refactor(ci): add explicit context to checkpoint checks
    
    Restore domain context in required checkpoint names so PR checks remain
    sortable while clearly indicating backend, frontend, and e2e ownership.
    
    * fix(ci): stabilize analyzer packaging and normalize checkpoint naming
    
    Make analyzer harness image export robust by naming astm-simulator explicitly
    and saving only locally built compose images, then align workflow/checkpoint
    naming, action versions, and docs/ruleset contexts for consistent PR check UX.
    
    * fix(ci): restore analyzer plugin test-jar dependency
    
    Keep the root analyzer build on -DskipTests so the OpenELIS tests classifier
    jar is still produced for GenericASTM and related plugin dependency resolution.
    
    * Split analyzer demo into dedicated job and add workflow concurrency
    
    Move analyzer demo execution out of shard 1 into its own job with a separate timeout budget, and wire it into the analyzer required gate to prevent shard timeout cancellations. Add top-level concurrency cancellation to Playwright and Cypress workflows so stale PR runs are automatically superseded.
    
    * Add contextual checkpoint names for required CI gates
    
    Rename the four checkpoint gate jobs to include Backend/Frontend/Playwright/Cypress context and align branch protection contexts with those emitted check names. This keeps required checks unambiguous in PR UI while preserving deterministic ordering.
    
    * fix(ci): shard demo, remove timeouts, GHCR images, guardrail consolidation
    
    Unblock green CI:
    - Shard demo into 2 matrix jobs (same pattern as harness test-shards)
    - Remove all timeout-minutes from Playwright/analyzer jobs (GitHub's 6-hour job default applies)
    - Replace docker save/load tar artifacts with GHCR push/pull + image map
    - Apply build-once + GHCR pattern to Cypress workflow
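
The shard fan-out above can be sketched as follows, assuming each matrix job passes Playwright's `--shard=<index>/<total>` flag. The helper name `shard_args` is hypothetical, and the actual workflow may wire the matrix differently.

```python
from typing import List


def shard_args(total: int) -> List[str]:
    """Build one Playwright --shard argument per matrix job (1-based index)."""
    if total < 1:
        raise ValueError("total shards must be at least 1")
    return [f"--shard={index}/{total}" for index in range(1, total + 1)]


if __name__ == "__main__":
    # Two demo shards, matching the "2 matrix jobs" described above.
    for arg in shard_args(2):
        print("playwright test", arg)
```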
    
    Guardrail consolidation:
    - Fix npx playwright test refs in constitution, best-practices, testing-roadmap
    - Harden Cypress deprecation (AGENTS.md, cypress-best-practices, skill)
    - Add Playwright execution invariants and contract to SKILL.md + AGENTS.md
    - Add pw:test:harness and pw:test:demo npm scripts
    - Trim CLAUDE.md to minimal pointers
    - Create .cursor/rules/playwright.mdc
    - Fix .gitignore to track .cursor/rules/ and .claude/settings.local.json
    - Update restart-analyzer-harness for local dev vs LE clarity
    - Pin submodules to remote default branch HEAD
    
    * Fix GHCR image naming and reduce duplicate CI image builds.
    
    Lowercase GHCR owner paths to avoid invalid reference failures and make Cypress reuse analyzer-built compose images before falling back to a one-time local build.
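
Container image references must use lowercase repository paths, which is why a mixed-case GitHub owner produces an invalid reference. A minimal sketch of the normalization, assuming a hypothetical helper `ghcr_image_ref` and placeholder owner and image names:

```python
def ghcr_image_ref(owner: str, image: str, tag: str) -> str:
    """Build a GHCR reference with the owner and image path lowercased.

    The tag is left untouched; only the repository path must be lowercase.
    """
    return f"ghcr.io/{owner.lower()}/{image.lower()}:{tag}"


if __name__ == "__main__":
    # "Example-Org" and "webapp" are placeholders, not names from this repository.
    print(ghcr_image_ref("Example-Org", "webapp", "47548349"))
```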
    
    * Fix demo patient selection and make Cypress cache reuse safe.
    
    Avoid stale Cypress image reuse by switching build-once to registry-backed layer caching, and harden the demo patient selection flow so CI can progress even when search results omit fields required by Add Order validation.
    
    * Enforce per-run image parity across CI build-once jobs.
    
    Keep Cypress and Playwright analyzer pipelines independent, use commit-immutable analyzer image tags, and require Cypress shards to consume images built and published by their own build-once job.
    
    * Use one image producer and remove duplicate Cypress builds.
    
    Make Cypress consume analyzer-built GHCR images keyed by commit SHA and wait for publication, so each run uses one shared build artifact set instead of rebuilding the same compose images twice.
    
    * Unify PR execution under a single E2E umbrella workflow.
    
    Run Cypress as a reusable job inside the main E2E workflow and disable standalone Cypress pull_request triggers so one PR run owns shared build, framework fan-out, and final gate.
    
    * Single shared build path for all E2E: Playwright + Cypress under one umbrella.
    
    One top-level Shared Build job produces all Docker images and plugin jars.
    Playwright Core, Analyzer Harness shards, Demo shards, and Cypress shards
    all consume from the same artifact set. No duplicate builds, no polling,
    no cross-workflow image resolution. Removing Cypress later only requires
    deleting one reusable workflow call and one file.
    
    * Shard Playwright Core and simplify gate naming.
    
    Split core-app tests into 2 shards for consistency with all other E2E
    branches, remove the unnecessary single-job report merge, and rename
    the gate to just 'Required' since the workflow name already provides
    the E2E context.
    
    * fix(ci): remove Cypress reusable workflow concurrency deadlock
    
    Drop concurrency from the called Cypress workflow so it no longer blocks behind the parent 03 - E2E run and its shard jobs can start after shared-build.