@MichaelEischer

The first commit of this series solves the problem that long RCS histories of large files (nearly 30k revisions resulting in a 4 MB file) require a tremendous amount of memory (200 GB of RAM were not enough...). The solution is to keep only a hash digest for revisions that will no longer be used for diffing. This way, commit coalescing is still possible by using the hash, but it requires a lot less memory.
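As a rough illustration of the digest idea (not the actual patch; the `Revision` class, its methods, and the choice of SHA-256 are all assumptions made for this sketch), a revision can drop its text and keep only a fixed-size digest once no later delta needs to be applied against it:

```ruby
require 'digest'

# Hypothetical revision record; names are illustrative and not taken from
# rcs-fast-export itself.
class Revision
  attr_reader :id, :text, :digest

  def initialize(id, lines)
    @id = id
    @text = lines   # array of lines, needed while deltas are applied against it
    @digest = nil
  end

  # Replace the full text with a fixed-size digest once the text is no
  # longer needed for diffing.
  def compact!
    return if @text.nil?
    @digest = Digest::SHA256.hexdigest(@text.join)
    @text = nil
  end

  # Content identity that works both before and after compaction, so
  # revisions can still be compared for coalescing.
  def content_key
    @digest || Digest::SHA256.hexdigest(@text.join)
  end
end

a = Revision.new('1.2', ["hello\n", "world\n"])
b = Revision.new('1.1', ["hello\n", "world\n"])
a.compact!                        # a now holds only a 64-character hex digest
same = a.content_key == b.content_key
```

After `compact!`, memory per revision no longer depends on file size, while equality checks between revisions keep working through `content_key`.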

The next three changes avoid some unnecessary string and array copies.

This is complemented by applying the diff using a linear scan, which avoids lots of small array allocations. This change might be problematic, as it introduces the new assumption that the line numbers in a diff are always increasing.
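A minimal sketch of what such a linear scan could look like, assuming the delta uses the `a<line> <count>` / `d<line> <count>` commands produced by `diff -n`. The function name `apply_delta` is hypothetical and this is not the code from the commit. It makes the ordering assumption explicit by raising when a command points backwards:

```ruby
# Apply an RCS-style delta to old_lines in one forward pass.
# Assumption: command line numbers never decrease; otherwise an error is raised.
def apply_delta(old_lines, delta_lines)
  result = []
  pos = 0   # number of lines of old_lines consumed so far
  i = 0     # index into delta_lines

  # Copy untouched lines from old_lines up to (not including) index upto.
  copy = lambda do |upto|
    while pos < upto
      result << old_lines[pos]
      pos += 1
    end
  end

  while i < delta_lines.length
    m = /\A([ad])(\d+) (\d+)\s*\z/.match(delta_lines[i])
    raise ArgumentError, "malformed command: #{delta_lines[i].inspect}" unless m
    op = m[1]
    line = m[2].to_i
    count = m[3].to_i
    i += 1

    if op == 'd'
      # Delete count lines starting at 1-based line number line.
      start = line - 1
      raise ArgumentError, 'line numbers must not decrease' if start < pos
      copy.call(start)
      pos = start + count
    else
      # Add the next count delta lines after 1-based line number line.
      raise ArgumentError, 'line numbers must not decrease' if line < pos
      copy.call(line)
      count.times do
        result << delta_lines[i]
        i += 1
      end
    end
  end

  copy.call(old_lines.length)   # copy whatever follows the last command
  result
end

old = ["a\n", "b\n", "c\n", "d\n"]
delta = ["d2 1\n", "a3 2\n", "x\n", "y\n"]
new_text = apply_delta(old, delta)
```

Each line of the old text is visited at most once and only a single result array is built, which is the property the linear scan relies on.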

This has the potential to drastically reduce the memory usage for large
files with many revisions.

The text of a commit is no longer needed once its child commit has been
processed. The memory usage optimization does not work for branches, as
these can't be processed reasonably by rcs-fast-export anyway.

replace will already copy the array contents on its own

flatten will just ignore those empty arrays

This avoids the creation of intermediate arrays and speeds up the whole
conversion by approx. 30%
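
The two notes about replace and flatten rely on standard Ruby `Array` behavior, which can be checked in isolation. The variable names below are illustrative only and unrelated to the script:

```ruby
src = ["a\n", "b\n"]
dst = []
dst.replace(src)   # no src.dup needed: dst becomes an independent (shallow) copy
dst << "c\n"       # src is unaffected and still has two elements

parts = [["a\n"], [], ["b\n"], []]
flat = parts.flatten   # the empty arrays contribute nothing to the result
```

Since `replace` copies the element list itself, an extra `dup` before it only creates an additional temporary array.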
@lcn2

lcn2 commented Mar 17, 2025

We agree with PR #7, PR #8, and PR #10, and have applied all of them to this rcs-fast-export repo:

https://github.com/lcn2/rcs-fast-export

along with a Makefile, better #! line and other mods.

@MichaelEischer
Author

MichaelEischer commented Mar 22, 2025

@lcn2 Judging from a quick look at https://github.com/lcn2/rcs-fast-export/commits/master/ you didn't include my changes from https://github.com/MichaelEischer/rcs-fast-export/commits/master/ . Either way, from my side this PR only exists in case someone still has a use for it. I no longer have any RCS repositories.

@lcn2

lcn2 commented Mar 22, 2025

> @lcn2 Judging from a quick look at https://github.com/lcn2/rcs-fast-export/commits/master/ you didn't include my changes from https://github.com/MichaelEischer/rcs-fast-export/commits/master/ .

We will update our forked repo later today with any missed mods from https://github.com/MichaelEischer/rcs-fast-export/commits/master/ . Stay tuned.

> Either way, from my side this PR only exists in case someone still has a use for it. I no longer have any RCS repositories.

We no longer have RCS repositories either. Our repo exists in case anyone discovers an old RCS directory, such as one from an old backup.

UPDATE 0

Fixed as suggested.
