Skip to content

Conversation

ColinKinloch
Copy link
Contributor

@ColinKinloch ColinKinloch commented Jan 24, 2025

This branch adds crate::clone_git_repository to clone git repos generally and Job::clone_git_repository to clone specifically defined by the gitlab job.

clone_git_repository is heavily based on PrepareFetch::fetch_only

Since the gitoxide PrepareFetch and PrepareCheckout structs delete the target path when they are dropped I opted to checkout the repository to a temporary directory, then move it's contents to the path provided to the target directory.

This adds a rather unwieldy GitCheckoutError enum. gitoxides Error types triggers the result_large_err clippy warning.

This requires rust version 1.73.0 or later.

Fixes #21

@ColinKinloch ColinKinloch force-pushed the git branch 2 times, most recently from 587d8dd to ff98e93 Compare January 28, 2025 10:58
@ColinKinloch ColinKinloch force-pushed the git branch 4 times, most recently from 082f212 to 7428588 Compare January 28, 2025 17:03
@ColinKinloch ColinKinloch force-pushed the git branch 3 times, most recently from dbcc18f to c0f9c71 Compare February 10, 2025 18:02
@ColinKinloch ColinKinloch marked this pull request as ready for review February 10, 2025 18:03
Copy link

@andrewshadura andrewshadura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks okay, but I haven’t gone deeply into details. I trust you tested this code well? (It would be great if it also included some automated tests.)

}

// TODO: Should clone_git_repository be async
// gitoxide doesn't currently have async implemented for http backend
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is viable without either async or cloning in a separate thread (which would make cancellation really tricky, though you seem to have some interrupt-related stuff wired up already); blocking an entire I/O thread for what might be a very long clone process is not workable. Should this maybe just be using libgit2 or something else instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw i'd prefer to stay with gitoxide to avoid more C dependences in the core of gitlab-runner. But yes it should be done in a dedicated thread if there is no async api

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to use tokio::task::spawn_blocking which runs it on a different thread.
I used a CancellationToken and tokio::select to trigger gitoxide to interrupt, CancellationToken is already used in the gitlab-runner-rs API so hopefully is okay.
gitoxide re-exports Progress from the prodash crate to allow the caller to monitor progress. Is that something we should do?

* * Clones to BuildDir
* * Checks out git_info.sha and fetches all in git_info.refspecs
* * Hardcoded username "gitlab-ci-token"
* * credHelperCommand?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit tricky to understand out-of-context, might want to add a few more details for some of these points

}

let checkout_progress = progress.add_child("checkout".to_string());
let (checkout, _outcome) = fetch.fetch_then_checkout(checkout_progress, &should_interrupt)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: if the goal is just to be able to read files from the repo, do we need to do an entire checkout to begin with? couldn't it clone a bare repo and then load files as-needed from there? though the apis might be harder to make for that...but it would likely also be a lot faster.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw i wouldn't mind this being a potential future update ; I'm not sure how much faster it would be ("likely be a lot of faster" is always a dangerous statement in the land of optimisation). I guess it primarily depend on what work would be saved, which i guess is primarily unpacking the pack file.

Regardless; the time would be comparable to a normal checking in a normal gitlab runner (or faster as iirc gix is faster at this point then git).

@ColinKinloch ColinKinloch force-pushed the git branch 4 times, most recently from 246508e to 6b04326 Compare July 25, 2025 20:37
* `clone_git_repository` to clone arbitrary repo
* `Job::clone_git_repository` to clone the jobs repo
Also throw an error if the provided HEAD ref cannot be parsed.
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements Git repository cloning functionality for the GitLab runner by adding two main functions: clone_git_repository for general Git repository cloning and Job::clone_git_repository for job-specific cloning. The implementation leverages the gitoxide library and includes comprehensive error handling through a new GitCheckoutError enum.

Key changes:

  • Adds comprehensive Git cloning functionality with support for refspecs, depth control, and SHA-specific checkout
  • Implements proper error handling for Git operations with detailed error categorization
  • Provides both async and sync interfaces for Git operations with cancellation support

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
gitlab-runner/src/run.rs Fixes error message for build directory creation
gitlab-runner/src/lib.rs Adds core Git cloning functions and comprehensive error handling
gitlab-runner/src/job.rs Implements job-specific Git repository cloning method
gitlab-runner/src/client.rs Defines Git information structures and error types
gitlab-runner/examples/demo-runner.rs Adds example commands for testing Git functionality
gitlab-runner/Cargo.toml Updates dependencies and adds gitoxide library

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +575 to +576
if let Some(depth) = depth.and_then(NonZeroU32::new) {
fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(depth));
Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using NonZeroU32::new will return None for depth value 0, but depth 0 should be valid for unlimited depth. Consider using a separate check or handling depth 0 explicitly.

Suggested change
if let Some(depth) = depth.and_then(NonZeroU32::new) {
fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(depth));
match depth {
Some(0) => {
// Depth 0 means unlimited depth; do not set shallow clone.
}
Some(d) => {
if let Some(nonzero_depth) = NonZeroU32::new(d) {
fetch = fetch.with_shallow(remote::fetch::Shallow::DepthAtRemote(nonzero_depth));
}
}
None => {
// No depth specified; do not set shallow clone.
}

Copilot uses AI. Check for mistakes.


index.write(Default::default())?;

Ok(repo_dir.keep())
Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The function returns the temporary directory path, but the calling code is expected to move its contents manually. Consider returning the final destination path or handling the move operation within this function to improve the API design.

Copilot uses AI. Check for mistakes.

clone_git_repository(
self.build_dir(),
&self.response.git_info.repo_url,
Some(&self.response.git_info.sha),
Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The SHA is always passed as Some(), but the generic clone_git_repository function expects Option<&str>. Consider whether this should handle cases where SHA might be empty or invalid.

Suggested change
Some(&self.response.git_info.sha),
if self.response.git_info.sha.is_empty() {
None
} else {
Some(&self.response.git_info.sha)
},

Copilot uses AI. Check for mistakes.

Comment on lines +157 to +158
std::fs::rename(repo_path, path)
.map_err(|e| outputln!("Failed to move repo path: {}", e.to_string()))?;
Copy link

Copilot AI Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using std::fs::rename to move a directory may fail across filesystem boundaries. Consider using a recursive copy operation or std::fs::create_dir_all followed by moving individual files.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

Should support checking out the git repository

4 participants