Skip to content

Conversation

@rchincha
Copy link
Contributor

Currently, our dedupe scheme is very complicated since master copy is kept in one of the repo dirs and if that image is deleted, the owning repo is changed, requiring multiple repo locks causing unnecessary contention.

What type of PR is this?

Which issue does this PR fix:

What does this PR do / Why do we need it:

If an issue # is not available please add repro steps and logs showing the issue:

Testing done on this change:

Automation added to e2e:

Will this break upgrades or downgrades?

Does this PR introduce any user-facing change?:


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rchincha
Copy link
Contributor Author

Should pair with #2968

@rchincha rchincha force-pushed the fix-dedupe branch 5 times, most recently from 6edb9fb to b4c3c46 Compare September 4, 2025 05:23
DefaultGCInterval = 1 * time.Hour
S3StorageDriverName = "s3"
LocalStorageDriverName = "local"
GlobalBlobsRepo = "_blobstore"
Copy link
Contributor

@andaaron andaaron Sep 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure this blobs repo is per store and substore. Because substores may be on different partitions.

@rchincha rchincha force-pushed the fix-dedupe branch 6 times, most recently from 4b2da37 to da2884a Compare September 11, 2025 05:24
@rchincha rchincha force-pushed the fix-dedupe branch 3 times, most recently from 49521a6 to 23f1ebc Compare October 10, 2025 06:47
Currently, zot uses one of the repos as the master copy for a blob to
achieve dedupe. However, blobs can be deleted from repos and this
complicates dedupe tracking logic.

Now use a single hidden global repo as a blob store instead.

Signed-off-by: Ramkumar Chinchani <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Signed-off-by: Ramkumar Chinchani <[email protected]>
Copy link
Contributor

@andaaron andaaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logging is too verbose, and should be trimmed

substore := c.StoreController.SubStore[route]
if substore != nil {
substore.RunDedupeBlobs(time.Duration(0), c.taskScheduler)
//substore.RunDedupeBlobs(time.Duration(0), c.taskScheduler)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just temporary, or we're dropping dedupe altogether?

return nil
}

// create nested deduped bucket where we store all the deduped blobs + original blob
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// create nested deduped bucket where we store all the deduped blobs + original blob
// create nested deduped bucket where we store all the deduped blobs while excluding the original blob

Correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the logging is too verbose for merge. It would be too much information to include logs for the code paths not returning errors.


dedupeBlob := d.getOne(deduped)
if dedupeBlob != nil {
d.log.Debug().Str("digest", digest.String()).Str("path", path).Msg("more in dedupe bucket, leaving original alone")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous logic was replacing the old original blob with one of the duplicates, and then delete the blob.
This new logic would keep the original blob around forever? How/when would it be GCed?

}

return nil
return zerr.ErrCacheMiss
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under which scenario would this return the error?
The updated code is a bit confusing as it is returning nil in several other places and it is unclear in which scenario this line is reached.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Are there no other needed changes in this file?
  2. Also the redis implementation should be reviewed

}

origin := bucket.Bucket([]byte(constants.OriginalBucket))
if origin != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And if origin is nil, then we should return nil, because the blob does not exist.

It is not in the duplicates list and it is not the original, so it does not exist, so the deletion should be marked as successful, correct?

}

// skip the global blobs repo
if repo == constants.GlobalBlobsRepo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is fishy.
Why is this here? constants.GlobalBlobsRepo is not an actual repo because it has an invalid repo name. How is GC detecting it when "walking" the disk if it doesn't have a valid repo name? It shouldn't call this function at all for constants.GlobalBlobsRepo?


return err
}
goto retry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're refactoring this, can you pick up #2969?
Or do you want me to rebase and merge that in advance?

Is this code working right now, wouldn't it potentially execute the code multiple times?

		if err := is.storeDriver.Move(src, gdst); err != nil {
			is.log.Error().Err(err).Str("src", src).Str("dst", gdst).Str("component", "dedupe").
				Msg("failed to rename blob")

			return err
		}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants