
Runner Integration

Steps are delivered to Step Runner as a YAML blob in the GitLab CI syntax. Runner interacts with Step Runner over a gRPC service, StepRunner, which is started on a local socket in the execution environment. This is the same way that Nesting serves a gRPC service in a dedicated Mac instance. The service has three RPCs: Run, Follow, and Cancel.

Run is the initial delivery of the steps. Follow requests a streaming response of step traces. Cancel stops execution and cleans up resources as soon as possible.

Step Runner operating in gRPC mode will be able to execute multiple step payloads at once. That is, each call to Run starts a new goroutine and executes the steps until completion, and multiple calls to Run may be made simultaneously. This is also why components are cached by location, version, and hash: we cannot change which ref is checked out while multiple concurrent executions are using the underlying files.
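To illustrate why a cache keyed by location, version, and hash is safe under concurrency, here is a minimal sketch. It is not the Step Runner implementation: the names componentKey, cache, and fetch, and the cache root path, are all assumptions made for illustration. Each distinct key maps to its own directory, which is populated at most once and never switched to another ref afterwards.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path/filepath"
	"sync"
)

// componentKey identifies one immutable copy of a component.
// The field names are assumptions, not taken from Step Runner.
type componentKey struct {
	Location string // for example, a repository path
	Version  string // for example, a tag or branch name
	Hash     string // for example, the resolved commit SHA
}

// dir derives a unique directory for the key under root.
func (k componentKey) dir(root string) string {
	sum := sha256.Sum256([]byte(k.Location + "\x00" + k.Version + "\x00" + k.Hash))
	return filepath.Join(root, hex.EncodeToString(sum[:8]))
}

type entry struct {
	once sync.Once
	dir  string
	err  error
}

// cache hands out component directories to concurrent executions.
type cache struct {
	root    string
	mu      sync.Mutex
	entries map[componentKey]*entry
}

func newCache(root string) *cache {
	return &cache{root: root, entries: map[componentKey]*entry{}}
}

// get returns the directory for k, calling fetch at most once per key.
// Concurrent callers asking for the same key wait for that single fetch.
func (c *cache) get(k componentKey, fetch func(dir string) error) (string, error) {
	c.mu.Lock()
	e, ok := c.entries[k]
	if !ok {
		e = &entry{dir: k.dir(c.root)}
		c.entries[k] = e
	}
	c.mu.Unlock()
	e.once.Do(func() { e.err = fetch(e.dir) })
	return e.dir, e.err
}

func main() {
	c := newCache("/tmp/step-runner/components") // hypothetical cache root
	fetch := func(dir string) error {
		fmt.Println("fetching into", dir) // a real fetch would clone or download here
		return nil
	}
	k := componentKey{Location: "gitlab.com/components/echo", Version: "v1", Hash: "0a1b2c"}
	d1, _ := c.get(k, fetch)
	d2, _ := c.get(k, fetch) // cache hit: fetch is not called again
	fmt.Println(d1 == d2)
}
```

Because a different hash yields a different directory, two executions that need different refs of the same component never share a working tree.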

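// StepRunner is served on a local socket in the execution environment.
service StepRunner {
    // Run delivers the steps and starts executing them.
    rpc Run(RunRequest) returns (RunResponse);
    // Follow streams back step traces as steps are executed.
    rpc Follow(FollowRequest) returns (stream FollowResponse);
    // Cancel stops execution and cleans up resources as soon as possible.
    rpc Cancel(CancelRequest) returns (CancelResponse);
}

// The Steps and StepResult message types are not shown here.
message RunRequest {
    string id = 1;
    oneof job_oneof {
        string ci_job = 2;
        Steps steps = 3;
    }
}

message RunResponse {
}

message FollowRequest {
    string id = 1;
}

message FollowResponse {
    StepResult result = 1;
}

message CancelRequest {
    string id = 1;
}

message CancelResponse {
}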

As steps are executed, traces are streamed back to GitLab Runner, so execution can be followed at least at the step level. If a more granular follow is required, we can introduce a gRPC step type which can stream back logs as they are produced.
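The relationship between Run and Follow can be sketched without any gRPC machinery. In the example below, registry, stepResult, run, and follow are hypothetical names, and step execution is replaced by a placeholder that always succeeds. The point is only the shape: run returns immediately after starting a goroutine, and follow delivers one result per finished step until the run completes.

```go
package main

import (
	"fmt"
	"sync"
)

// stepResult is a stand-in for the StepResult message (fields assumed).
type stepResult struct {
	Name   string
	Status string
}

// registry tracks in-flight runs by id.
type registry struct {
	mu   sync.Mutex
	runs map[string]chan stepResult
}

func newRegistry() *registry {
	return &registry{runs: map[string]chan stepResult{}}
}

// run starts executing steps in a new goroutine and returns immediately.
func (r *registry) run(id string, steps []string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, exists := r.runs[id]; exists {
		return fmt.Errorf("run %q already exists", id)
	}
	results := make(chan stepResult, len(steps))
	r.runs[id] = results
	go func() {
		defer close(results) // closing the channel marks the run as complete
		for _, s := range steps {
			// Placeholder: a real implementation would execute the step here.
			results <- stepResult{Name: s, Status: "success"}
		}
	}()
	return nil
}

// follow calls send once per finished step and returns when the run ends.
func (r *registry) follow(id string, send func(stepResult)) error {
	r.mu.Lock()
	results, ok := r.runs[id]
	r.mu.Unlock()
	if !ok {
		return fmt.Errorf("unknown run %q", id)
	}
	for res := range results {
		send(res)
	}
	return nil
}

func main() {
	r := newRegistry()
	if err := r.run("job-1", []string{"build", "test"}); err != nil {
		panic(err)
	}
	err := r.follow("job-1", func(res stepResult) {
		fmt.Println(res.Name, res.Status)
	})
	if err != nil {
		panic(err)
	}
}
```

A single channel per run keeps the sketch short, but it supports only one follower and does not replay results to a later caller. A real service would likely need to buffer or index results per run.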

Here is how we will connect to Step Runner in each runner executor:

Instance

The Instance executor is accessed via SSH, the same as today. However, instead of starting a bash shell and piping in commands, Runner connects to the Step Runner socket in a known location and makes gRPC calls. This is the same way Runner calls the Nesting server in dedicated Mac instances to make VMs.

This requires that Step Runner is present and started in the job execution environment.

Docker

The same requirement that Step Runner is present and started applies to the Docker executor (and docker-autoscaler). However, in order to connect to the socket inside the container, we must exec a bridge process in the container. This will be another command on the Step Runner binary which proxies STDIN and STDOUT to the local socket in a known location, allowing the caller of exec to make gRPC calls inside the container.
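A bridge of this kind is essentially two byte copies. The following sketch shows one possible shape, not the actual Step Runner command: bridge and roundTrip are hypothetical names, the socket file name is made up, and an echo server on a temporary Unix socket stands in for the Step Runner service so the example is self-contained. In a real bridge command, the reader and writer passed to bridge would be os.Stdin and os.Stdout.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// bridge proxies bytes between (in, out) and conn. It returns once the
// socket side has been fully read, that is, when the server closes.
func bridge(conn net.Conn, in io.Reader, out io.Writer) error {
	go func() {
		io.Copy(conn, in) // caller -> socket
		// Half-close if supported, so the server sees EOF but can still reply.
		if hc, ok := conn.(interface{ CloseWrite() error }); ok {
			hc.CloseWrite()
		}
	}()
	_, err := io.Copy(out, conn) // socket -> caller
	conn.Close()
	return err
}

// roundTrip sends msg through the bridge to an echo server listening on a
// temporary Unix socket and returns whatever comes back.
func roundTrip(msg string) (string, error) {
	dir, err := os.MkdirTemp("", "bridge")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	socketPath := filepath.Join(dir, "step-runner.sock") // hypothetical name

	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go func() { // stand-in for the Step Runner service: echo one connection
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		io.Copy(c, c)
	}()

	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	err = bridge(conn, strings.NewReader(msg), &out)
	return out.String(), err
}

func main() {
	got, err := roundTrip("ping")
	if err != nil {
		fmt.Fprintln(os.Stderr, "bridge failed:", err)
		return
	}
	fmt.Printf("%q\n", got)
}
```

On the calling side, the exec session's input and output streams would then be wrapped as a connection and handed to the gRPC client, so that the RPCs travel over the bridged bytes.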

Kubernetes

The Kubelet on Kubernetes Nodes exposes an exec API which will start a process in a container of a running Pod. We will use this exec to create a bridge process that will allow the caller to make gRPC calls inside the Pod, the same as in the Docker executor.

In order to access this protected Kubelet API, we must use the Kubernetes API, which provides an exec sub-resource on Pod. A caller can POST to the URL of a pod suffixed with /exec and then negotiate the connection up to the SPDY protocol for bidirectional byte streaming. GitLab Runner can therefore use the Kubernetes API to connect to the Step Runner service and deliver job payloads.
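As a concrete illustration of the exec sub-resource, the sketch below builds the URL a caller would POST to. It uses only the Go standard library and stops before authentication and the protocol upgrade. The API server address, namespace, pod, container, and the step-runner proxy command are all hypothetical values chosen for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// execURL builds the URL of the exec sub-resource for a container in a pod.
// The command is passed as repeated "command" query parameters, and the
// stdin/stdout/stderr flags select which streams to attach.
func execURL(apiServer, namespace, pod, container string, command []string) (string, error) {
	u, err := url.Parse(apiServer)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/namespaces/" + namespace + "/pods/" + pod + "/exec"

	q := url.Values{}
	q.Set("container", container)
	q.Set("stdin", "true")
	q.Set("stdout", "true")
	q.Set("stderr", "true")
	for _, arg := range command {
		q.Add("command", arg)
	}
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// All values here are placeholders for illustration.
	u, err := execURL("https://k8s.example:6443", "ci", "runner-pod", "build",
		[]string{"step-runner", "proxy"})
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

In practice Runner would not assemble this request by hand. The client-go libraries build the same request and perform the stream negotiation.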

This is the same way that kubectl exec works. In fact, most of the internals, such as SPDY negotiation, are provided as client-go libraries, so Runner can call the Kubernetes API directly by importing the necessary libraries rather than shelling out to kubectl.

Historically, one of the weaknesses of the Kubernetes executor was running a whole job through a single exec. To mitigate this, Runner uses the attach command instead, which can "re-attach" to an existing shell process and pick up where it left off.

This is not necessary for Step Runner, however, because the exec only establishes a bridge to the long-running gRPC process. If the connection drops, Runner will just "re-attach" by exec'ing another connection and continuing to make RPC calls like Follow.
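The re-attach behavior amounts to a retry loop around dialing and following. The sketch below is one way to express it, under stated assumptions: dialFunc, followFunc, followWithReattach, and fakeServer are hypothetical names, errDropped stands in for whatever error a broken exec connection produces, and the fake server is assumed to resume the stream where the previous connection stopped. The source does not say whether Follow resumes or replays, so a real client may need to de-duplicate results.

```go
package main

import (
	"errors"
	"fmt"
)

// errDropped stands in for the error seen when the exec connection breaks.
var errDropped = errors.New("connection dropped")

// followFunc streams step names for a run until it completes or fails.
type followFunc func(id string, onResult func(step string)) error

// dialFunc establishes a new bridge (for example, a fresh exec) and returns
// a Follow call bound to that connection.
type dialFunc func() (followFunc, error)

// followWithReattach follows run id, dialing a new connection whenever the
// current one drops, for at most maxAttempts connections.
func followWithReattach(dial dialFunc, id string, maxAttempts int, onResult func(string)) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		follow, err := dial()
		if err != nil {
			lastErr = err
			continue
		}
		err = follow(id, onResult)
		if err == nil {
			return nil // stream ended normally: the run is complete
		}
		if !errors.Is(err, errDropped) {
			return err // not a connection problem, so retrying will not help
		}
		lastErr = err
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

// fakeServer simulates a long-running service whose first connection drops
// after dropAfter results. The run itself keeps its progress in sent.
type fakeServer struct {
	steps     []string
	sent      int
	dropAfter int
	conns     int
}

func (s *fakeServer) dial() (followFunc, error) {
	s.conns++
	first := s.conns == 1
	return func(id string, onResult func(string)) error {
		delivered := 0
		for s.sent < len(s.steps) {
			if first && delivered == s.dropAfter {
				return errDropped
			}
			onResult(s.steps[s.sent])
			s.sent++
			delivered++
		}
		return nil
	}, nil
}

func main() {
	srv := &fakeServer{steps: []string{"build", "test", "deploy"}, dropAfter: 1}
	err := followWithReattach(srv.dial, "job-1", 3, func(step string) {
		fmt.Println("finished:", step)
	})
	fmt.Println("connections used:", srv.conns, "error:", err)
}
```

The job's state lives in the long-running service rather than in the exec session, which is why a dropped connection costs only a reconnect. A production loop would also add a delay between attempts.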