TEST-INFRA: [ci-execution-time] Avoid Docker-in-Docker for building and testing #7426

Open
opened 2026-04-10 19:12:51 +00:00 by HAL9000 · 3 comments
Owner

Metadata

  • Branch: task/ci-execution-time-avoid-dind
  • Commit Message: chore(ci): replace Docker-in-Docker with shared socket or dedicated build service
  • Milestone: N/A — Backlog (see note below)
  • Parent Epic: #1678

Backlog note: This issue was discovered during autonomous operation
on milestone v3.2.0. It does not block milestone completion and has been
placed in the backlog for human review and future milestone assignment.

Duplicate Check

  • Search Queries: "docker in docker", "dind", "slow docker build"
  • Results: 0 issues found for each query.
  • Reasoning: No existing issues address the use of Docker-in-Docker.

Summary

The docker and helm CI jobs use Docker-in-Docker (DinD), which adds significant overhead. DinD requires a privileged container, starts a nested Docker daemon on every run, and cannot share the host's layer cache, so builds are slow, resource-intensive, and less reliable.

Current Behavior

The docker and helm jobs in .forgejo/workflows/ci.yml use Docker-in-Docker (docker:dind service or equivalent). This means:

  • A new Docker daemon is started inside the runner container on every CI run.
  • No layer caching is shared between runs, causing full image rebuilds each time.
  • Privileged mode is required, increasing the attack surface of the CI runner.
  • Build times are significantly longer than necessary.
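
The indicators listed above can be checked mechanically before any change is made. The following is a minimal sketch, assuming the workflow declares DinD through a `docker:...dind` image tag and a `privileged` option; the pattern names and the `find_dind_indicators` helper are hypothetical and are not part of the repository.

```python
import re

# Hypothetical indicator patterns; adjust them to match the real workflow file.
DIND_PATTERNS = {
    "dind image": re.compile(r"docker:[\w.\-]*dind"),
    "privileged flag": re.compile(r"--privileged|privileged:\s*true"),
}


def find_dind_indicators(workflow_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, indicator_name, stripped_line) for each match."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        # Skip YAML comments so notes that merely mention DinD are not flagged.
        if line.lstrip().startswith("#"):
            continue
        for name, pattern in DIND_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name, line.strip()))
    return findings


if __name__ == "__main__":
    # Illustrative snippet only; it is not copied from .forgejo/workflows/ci.yml.
    sample = """\
jobs:
  docker:
    services:
      dind:
        image: docker:dind
        options: --privileged
"""
    for number, name, line in find_dind_indicators(sample):
        print(f"line {number}: {name}: {line}")
```

Reading the real file with `pathlib.Path(".forgejo/workflows/ci.yml").read_text()` and passing the result to the helper would give a starting list for the audit subtask. A text scan like this can miss DinD configured in other ways, so it complements a manual review rather than replacing it.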

Expected Behavior

The CI pipeline should use an alternative to Docker-in-Docker that:

  • Does not require a privileged container.
  • Can share the host Docker socket or use a dedicated build service (e.g., Kaniko, Buildah, or a remote Docker daemon).
  • Supports layer caching to speed up repeated builds.
  • Is more reliable and easier to debug than DinD.

Acceptance Criteria

  • The CI pipeline does not use Docker-in-Docker for the docker or helm jobs.
  • The alternative approach (shared socket, Kaniko, Buildah, or equivalent) is documented in the workflow file with a comment explaining the rationale.
  • Docker image build times are measurably reduced compared to the DinD baseline.
  • The reliability of Docker builds in CI is improved, and privileged mode is no longer required.
  • The change does not break the docker or helm CI jobs.
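
One way to make "measurably reduced" concrete is to compare median job durations before and after the change. This is a sketch under stated assumptions: the `percent_reduction` helper is hypothetical, durations are assumed to be collected by hand or from CI logs in seconds, and the median is used only because it is less sensitive to a single slow run.

```python
from statistics import median


def percent_reduction(baseline_seconds: list[float], candidate_seconds: list[float]) -> float:
    """Percentage by which the median candidate duration undercuts the median baseline."""
    if not baseline_seconds or not candidate_seconds:
        raise ValueError("both samples need at least one duration")
    baseline = median(baseline_seconds)
    if baseline <= 0:
        raise ValueError("baseline median must be positive")
    return (baseline - median(candidate_seconds)) / baseline * 100.0


if __name__ == "__main__":
    # Placeholder numbers for illustration only; these are not measurements.
    dind_runs = [300.0, 320.0, 310.0]
    new_runs = [150.0, 160.0, 155.0]
    print(f"median reduction: {percent_reduction(dind_runs, new_runs):.1f}%")
```

A negative result would indicate that the candidate is slower than the DinD baseline.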

Supporting Information

  • Related issues: #7206 (Optimize Docker image builds with layer caching), #7216 (Review and optimize CI job dependencies), #1678 (Epic: CI Execution Time Optimization).
  • Alternatives to evaluate: shared Docker socket (/var/run/docker.sock; avoids privileged mode but gives the job control of the host daemon), Kaniko (daemonless, does not need privileged mode), Buildah (OCI-compliant, daemonless), remote Docker daemon.

Subtasks

  • Audit current docker and helm job configurations in .forgejo/workflows/ci.yml to document the DinD setup.
  • Evaluate alternatives: shared Docker socket, Kaniko, Buildah, remote Docker daemon.
  • Select and implement the preferred alternative in .forgejo/workflows/ci.yml.
  • Add a comment in the workflow file explaining the chosen approach and rationale.
  • Verify the docker and helm jobs pass with the new configuration.
  • Measure and document build time improvement vs. DinD baseline.
  • Run nox (all default sessions), fix any errors.
  • Verify coverage >= 97% via nox -s coverage_report.

Definition of Done

This issue is complete when:

  • All subtasks above are completed and checked off.
  • A Git commit is created where the first line of the commit message matches the Commit Message in Metadata exactly, followed by a blank line, then additional lines providing relevant details about the implementation.
  • The commit is pushed to the remote on the branch matching the Branch in Metadata exactly.
  • The commit is submitted as a pull request to master, reviewed, and merged before this issue is marked done.
  • All nox stages pass.
  • Coverage >= 97%
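
The commit message rule above (exact first line, a blank line, then details) can be checked before pushing. The following is a minimal sketch; the `commit_message_is_valid` helper is hypothetical, and the expected subject is copied from the Metadata section.

```python
EXPECTED_SUBJECT = (
    "chore(ci): replace Docker-in-Docker with shared socket or dedicated build service"
)


def commit_message_is_valid(message: str, expected_subject: str = EXPECTED_SUBJECT) -> bool:
    """Check subject line, blank separator, and a non-empty body."""
    lines = message.splitlines()
    if len(lines) < 3:
        return False
    subject, separator, body = lines[0], lines[1], lines[2:]
    return (
        subject == expected_subject          # first line must match exactly
        and separator == ""                  # second line must be blank
        and any(line.strip() for line in body)  # at least one line of details
    )
```

The message of the latest commit can be obtained with `git log -1 --format=%B` and passed to the helper as a string.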

Automated by CleverAgents Bot
Supervisor: Test Infrastructure | Agent: new-issue-creator

Author
Owner

Verified — CI optimization: avoid Docker-in-Docker. MoSCoW: Could-have. Priority: Low.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — CI optimization: avoid Docker-in-Docker. MoSCoW: Could-have. Priority: Low.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

Author
Owner

Verified — CI optimization: avoid Docker-in-Docker. MoSCoW: Could-have. Priority: Low.


Automated by CleverAgents Bot
Supervisor: Project Owner | Agent: project-owner-pool-supervisor

No milestone
No project
No assignees
1 participant
Reference
cleveragents/cleveragents-core#7426