Onboarding launch stuck at “Check resource” / ~10%

What the UI means

  • The label Check resource is the first row of the admin checklist. Control Plane shows it when no prego_ui_step_* rows exist yet for the job’s trace_id (or when it falls back to legacy progress because UI-step traces are absent).
  • ~10% is usually the quantized fallback value shown when progress_from_ui is undefined (job is Pending with few trace_events).

So “stuck on step 1” usually means one of two things: the GitHub / Ansible pipeline has not written prego_ui_step_01 yet, or the POST /internal/trace-events call from Actions failed silently, which could happen before this repo’s workflow hardening.
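
As a rough illustration of that distinction, the sketch below classifies a job from the trace event names recorded for its trace_id. The function name classify_stuck_launch and its three output labels are hypothetical, and it assumes the event names have already been exported one per line (for example, copied from the timeline output). It is not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in prego-control-plane): reads trace event names,
# one per line, on stdin and prints a coarse diagnosis label.
classify_stuck_launch() {
  local events
  events="$(cat)"
  if grep -qx 'prego_ui_step_01' <<<"$events"; then
    # The first UI step was recorded, so the UI should move past the fallback.
    echo "step-01-recorded"
  elif grep -qx 'pipeline_started' <<<"$events"; then
    # The pipeline reported in, but no UI-step trace has arrived yet.
    echo "pipeline-started-no-ui-step"
  else
    # Nothing from the pipeline: it never ran, or its trace POSTs failed.
    echo "no-pipeline-trace"
  fi
}
```

For example, `printf 'pipeline_started\n' | classify_stuck_launch` prints `pipeline-started-no-ui-step`, which points at the pipeline itself rather than at trace delivery.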

Operator checklist (ordered)

  1. D1 timeline for trace_id

    From repo prego-control-plane:

    cd prego-control-plane
    ./scripts/d1-provision-trace-timeline.sh "<trace_id>"

    Or: D1_TARGET=local ./scripts/d1-provision-trace-timeline.sh "<trace_id>" for local D1.

    Inspect:

    • provision_jobs: status, updated_at, region, plan_tier
    • workflow_dispatch_log: whether a dispatch row exists for that job_id
    • trace_events: presence of pipeline_started, prego_ui_step_01, resolve_server_ok, prego_ui_step_02, …
  2. GitHub Actions

    Workflow: Provision Tenant (provision-tenant.yml). Find the run for the same job_id / trace_id inputs.

    • resolve-server failed before any trace was posted: typical causes are a misconfigured CONTROL_PLANE_API_KEY, a wrong CONTROL_PLANE_URL, or no server target (target_server_id empty and create_new_server false). In these cases the job fails fast. Trace steps now use curl -f, so HTTP failures fail the step.
    • Empty trace_id input: early UI-step trace steps are skipped entirely (if: inputs.trace_id != '').
  3. Control Plane logs (Workers)

    Search JSON logs for:

    • provision_pipeline_trigger_skipped — missing queue + missing GITHUB_TOKEN / GITHUB_WORKFLOW_DISPATCH_URL
    • provision_workflow_dispatch_failed / _http_error / _fetch_failed
    • provision_job_accepted_but_pipeline_not_triggered — job row accepted but trigger returned false
    • provision_jobs_possibly_stuck (cron) — stale Pending / Running
  4. Browser (user session)

    • Network: job poll returns status, progress, provision_ui_current_label
    • Session: prego_provision_job_id present; if prego_provision_pipeline_never_triggered is 1, CP returned pipeline_triggered: false when the job was created (admin-web shows an alert banner).
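
The log search in step 3 can be sketched as a single filter over newline-delimited JSON logs. The function name filter_provision_log_events is hypothetical, and the sketch assumes the abbreviated _http_error and _fetch_failed entries expand to provision_workflow_dispatch_http_error and provision_workflow_dispatch_fetch_failed. How the logs reach stdin depends on your log tooling and is left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keeps only log lines that mention one of the
# provisioning events listed in step 3. Reads logs on stdin.
# Assumption: the "_http_error" / "_fetch_failed" shorthand shares the
# "provision_workflow_dispatch" prefix.
filter_provision_log_events() {
  grep -E 'provision_pipeline_trigger_skipped|provision_workflow_dispatch_(failed|http_error|fetch_failed)|provision_job_accepted_but_pipeline_not_triggered|provision_jobs_possibly_stuck'
}
```

Because the filter is plain grep, it exits non-zero when none of the events appear. That is itself useful: no matches suggests the Control Plane did trigger the pipeline, so the problem is more likely on the Actions side.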

Product / platform follow-ups

  • Keep GitHub trace POST steps failing loudly (curl -f, no continue-on-error) so D1 matches Actions failure state.
  • Surface pipeline_triggered: false on Launch (implemented in admin-web when CP exposes the field).
  • Optional: persist last pipeline stage on provision_jobs for UI when traces are missing.
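
To illustrate the first follow-up, here is a minimal sketch of a trace POST that fails loudly. Only the endpoint path, the two environment variable names, and the use of curl -f come from this page. The function name post_trace_event, the Authorization header scheme, and the JSON field names are assumptions made for illustration and may not match the real workflow.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a loud trace POST. With -f, curl exits non-zero
# (code 22) on HTTP errors of 400 or above, so the calling step fails too.
# Assumptions: Bearer auth and the {"trace_id","event"} payload shape are
# illustrative; values are assumed to contain no double quotes.
post_trace_event() {
  local trace_id="$1" event="$2"
  curl -f -sS -X POST "${CONTROL_PLANE_URL}/internal/trace-events" \
    -H "Authorization: Bearer ${CONTROL_PLANE_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "{\"trace_id\":\"${trace_id}\",\"event\":\"${event}\"}"
}
```

The function returns curl's exit status unchanged, so a workflow step that calls it without continue-on-error stops on the first rejected POST instead of leaving D1 out of sync with the Actions run.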