LitData v0.2.54
Lightning AI ⚡ is excited to announce the release of LitData v0.2.54
Highlights
Lightning AI Storage - Direct download
Lightning Studios have special directories for data connections that are available to an entire teamspace. LitData functions that reference those directories see a significant performance increase, because uploads and downloads go directly to and from the bucket that backs the folder. LitData already supported folder types such as S3 and GCS folders; this release adds support for the recently launched lightning_storage folders.
For example, the following code downloads data directly from the my-bucket-1 Lightning Storage bucket:

    from litdata import StreamingDataset

    if __name__ == "__main__":
        data_dir = "/teamspace/lightning_storage/my-bucket-1/data"
        dataset = StreamingDataset(data_dir)

        for sample in dataset:
            print(sample)

References to any of the following directories will work similarly:
- /teamspace/lightning_storage/...
- /teamspace/s3_connections/...
- /teamspace/gcs_connections/...
- /teamspace/s3_folders/...
- /teamspace/gcs_folders/...
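
To check ahead of time whether a path falls under one of the directories listed above, a small helper along these lines may be useful. This is only a sketch: `DIRECT_ACCESS_ROOTS` and `is_direct_access_path` are hypothetical names made up for illustration. They are not part of LitData and do not describe how LitData resolves these paths internally.

```python
from pathlib import PurePosixPath

# Directory roots taken from the list in these release notes.
# The constant name is hypothetical, not a LitData identifier.
DIRECT_ACCESS_ROOTS = (
    "/teamspace/lightning_storage",
    "/teamspace/s3_connections",
    "/teamspace/gcs_connections",
    "/teamspace/s3_folders",
    "/teamspace/gcs_folders",
)


def is_direct_access_path(path: str) -> bool:
    """Hypothetical helper: True if `path` lies inside one of the listed roots."""
    parts = PurePosixPath(path).parts
    for root in DIRECT_ACCESS_ROOTS:
        root_parts = PurePosixPath(root).parts
        # Compare whole path components so that a sibling such as
        # /teamspace/lightning_storage_other does not match, and require
        # at least one component below the root (the folder name).
        if len(parts) > len(root_parts) and parts[: len(root_parts)] == root_parts:
            return True
    return False


if __name__ == "__main__":
    print(is_direct_access_path("/teamspace/lightning_storage/my-bucket-1/data"))
```

Comparing path components rather than raw string prefixes is a deliberate choice here, since a plain `startswith` check would also accept unrelated directories that merely share a name prefix.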
Changes
Added
- Add downloader for R2 by @pwgardipee in #711
Full Changelog: v0.2.53...v0.2.54
🧑‍💻 Contributors
We thank everyone who submitted issues, features, fixes, and doc changes. It's the only way we can collectively make LitData better for everyone. Nice job!
Key Contributors
Thank you ❤️ and we hope you'll keep them coming!