glob with ** leads to incorrect listings_cache state (missing and duplicated directories listings) #988

@ischurov

Using ** in glob appears to leave the listings cache in an incorrect state, so that subsequent calls such as glob or ls return incorrect results.

I have two examples.

Missing directories

import s3fs

print(s3fs.__version__)
s3 = s3fs.S3FileSystem()
s3.glob("s3://<redacted bucket>/<redacted directory>/*/<redacted>/**/*.txt")
print(s3.ls("s3://<redacted bucket>/<redacted directory>"))

Here s3.ls misses several directories that are contained in <redacted directory>. If I comment out the s3.glob call, or replace ** with * (which does not change the glob output, as there is only one subdirectory level in <redacted>), the ls output contains all the directories.

Duplicated directories

I tried to reproduce the previous example with a simpler directory structure and ran into a different problem.

I have a very simple structure inside my bucket:

a/
b/somefile.pdf
c/

Then I do

import s3fs

print(s3fs.__version__)
s3 = s3fs.S3FileSystem()

s3.glob("s3://<test bucket>/**/*.pdf")
print(s3.ls("s3://<test bucket>/"))

The output:

2025.9.0
['<test bucket>/a/', '<test bucket>/b', '<test bucket>/b', '<test bucket>/b/', '<test bucket>/c/']

So directory b is listed three times (once with a trailing /, twice without).
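A quick way to check a listing for this kind of duplication is to normalize trailing slashes and count the entries. The sketch below is only an illustration: `duplicated_entries` is a hypothetical helper, not part of s3fs, and the input is the ls output shown above.

```python
from collections import Counter


def duplicated_entries(listing):
    """Return {normalized_path: count} for paths that occur more than once.

    Paths are compared after stripping trailing slashes, so 'b' and 'b/'
    count as the same directory.
    """
    counts = Counter(path.rstrip("/") for path in listing)
    return {path: n for path, n in counts.items() if n > 1}


# The ls output reported above (after glob with **).
listing = [
    "<test bucket>/a/",
    "<test bucket>/b",
    "<test bucket>/b",
    "<test bucket>/b/",
    "<test bucket>/c/",
]

print(duplicated_entries(listing))  # {'<test bucket>/b': 3}
```

An empty dict means every directory appears once, regardless of whether it is listed with or without a trailing slash.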

If I replace ** with *, I get different output:

2025.9.0
['<test bucket>/a', '<test bucket>/b', '<test bucket>/c']

In all cases, if I initialize the filesystem with s3 = s3fs.S3FileSystem(use_listings_cache=False), the problems disappear. So I believe the problem is the interaction between ** in glob and the cache.
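If disabling the cache for the whole filesystem is not an option, a possible workaround is to drop the cached listings right before calling ls. The sketch below assumes the filesystem object exposes the fsspec-style `invalidate_cache()` and `ls()` methods, as `S3FileSystem` does; `ls_uncached` is a hypothetical helper name, and I have not verified that it avoids the problem in every case.

```python
def ls_uncached(fs, path):
    """List `path` after clearing the filesystem's cached listings.

    `fs` is expected to be an fsspec-style filesystem (for example an
    s3fs.S3FileSystem instance). Calling invalidate_cache() with no
    argument clears all cached listings, so the following ls() has to
    fetch a fresh listing instead of reusing what glob left behind.
    """
    fs.invalidate_cache()
    return fs.ls(path)


# Intended usage (requires s3fs and access to the bucket):
#
#   import s3fs
#   s3 = s3fs.S3FileSystem()
#   s3.glob("s3://<test bucket>/**/*.pdf")
#   print(ls_uncached(s3, "s3://<test bucket>/"))
```

This trades an extra listing request for a result that does not depend on the cache state, so it only makes sense as a stopgap until the caching behaviour itself is fixed.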

I am using s3fs 2025.9.0 (the problem also reproduces with 2025.9.0+3.g2ccadeb) with Python 3.12.9 on macOS.
