@FZambia FZambia commented Jul 29, 2022

Hello! First of all, thanks for the awesome library and all the hard work on search in the Go ecosystem.

We are using an in-memory Bluge index. We also have a separate event stream of changes in the system. On application start we do the initial indexing from the database and then consume the event stream to keep the index up to date (starting from the event ID known before the initial indexing).

One thing we want to improve is the time between application start and the point where the index can be used. The idea is to take index backups periodically (using the Reader.Backup API) and then use the latest backup on start to initialize the state of an in-memory-only index, applying the missing updates from the event stream. This is similar to Redis RDB periodic snapshots, except that we can still catch up to the current state using our event log.

This way we can reduce startup time from ~20s to ~1s.
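To make the startup flow concrete, here is a minimal sketch of the snapshot-plus-catch-up idea using only the standard library. It does not use Bluge at all: `Event`, `Snapshot`, `Index`, `Restore`, and `CatchUp` are hypothetical names invented for illustration, and the map stands in for the real index. The assumption is that every snapshot records the ID of the last event it contains, so replay can skip anything already covered.

```go
package main

import "fmt"

// Event is a hypothetical change record from the event stream.
// An empty Body means the document was deleted.
type Event struct {
	ID    uint64
	DocID string
	Body  string
}

// Snapshot is a hypothetical stand-in for an on-disk backup,
// stored together with the ID of the last event it includes.
type Snapshot struct {
	LastEventID uint64
	Docs        map[string]string
}

// Index is a toy in-memory index (not a Bluge index).
type Index struct {
	lastEventID uint64
	docs        map[string]string
}

// Restore builds a fresh in-memory index from a snapshot.
// The documents are copied so the index never aliases the snapshot.
func Restore(s Snapshot) *Index {
	docs := make(map[string]string, len(s.Docs))
	for id, body := range s.Docs {
		docs[id] = body
	}
	return &Index{lastEventID: s.LastEventID, docs: docs}
}

// Snapshot captures the current state for a later restart.
func (ix *Index) Snapshot() Snapshot {
	docs := make(map[string]string, len(ix.docs))
	for id, body := range ix.docs {
		docs[id] = body
	}
	return Snapshot{LastEventID: ix.lastEventID, Docs: docs}
}

// CatchUp applies events in order, skips those already covered by
// the restored state, and returns how many were actually applied.
func (ix *Index) CatchUp(events []Event) int {
	applied := 0
	for _, e := range events {
		if e.ID <= ix.lastEventID {
			continue // already reflected in the snapshot
		}
		if e.Body == "" {
			delete(ix.docs, e.DocID)
		} else {
			ix.docs[e.DocID] = e.Body
		}
		ix.lastEventID = e.ID
		applied++
	}
	return applied
}

func main() {
	log := []Event{
		{ID: 1, DocID: "a", Body: "first"},
		{ID: 2, DocID: "b", Body: "second"},
		{ID: 3, DocID: "a", Body: "first, edited"},
		{ID: 4, DocID: "b"}, // delete b
	}

	// First run: index events 1..2, then take a periodic snapshot.
	ix := Restore(Snapshot{})
	ix.CatchUp(log[:2])
	snap := ix.Snapshot()

	// Restart: load the snapshot, then replay the whole log.
	restored := Restore(snap)
	applied := restored.CatchUp(log)
	fmt.Println("applied after restore:", applied)
	fmt.Println("docs:", restored.docs)
}
```

Skipping events by ID keeps the replay idempotent, so it is safe to start reading the stream from an ID that is slightly older than the snapshot.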

This pull request contains changes that allow starting an in-memory index from an on-disk backup. It seems to work, and the provided test passes. To be honest, though, I am not sure I fully understand how Bluge works internally or whether this approach is correct. I spent some time reading the code, but I still need help from someone who understands the internals, so I am trying to validate the idea here.

Having said all this, a couple of questions:

  1. Does the implementation here seem reasonable in terms of correctness? That is, we load the initial state of the in-memory index from a disk-based directory obtained from a Reader.Backup call, but then switch to a fully in-memory index implementation.
  2. If the answer to the first question is yes, would such a pull request be of interest for Bluge? I can rework the API if required.

@FZambia FZambia changed the title from "Initialize in-memory writer from disk snapshot" to "Initialize in-memory writer from disk backup" Jul 29, 2022
@mschoch mschoch (Member) commented Jul 30, 2022

Thanks for sharing this. Yes, the ability to take a backup from an in-memory index was a sort of nice surprise feature that emerged from two independent improvements to Bluge. And as you found, it sure would be nice to have a way to resume using an in-memory index by restoring the backup (but then continue on in-memory again).

So, I think this sounds like a great addition to the library, but I have not yet had a chance to review it. Thanks in advance for your patience.

@FZambia FZambia (Author) commented Oct 7, 2022

Hello @mschoch. Could you provide some guidance on what we can do with this? Maintaining a Bluge fork seems like the wrong way to go, since it is quite a sophisticated piece of software to maintain, but there does not seem to be much activity on the repo. Do you have a short-term or long-term plan to continue working on Bluge? Or possibly a call for maintainers? Sorry for such questions; any answer is totally understandable. We are just trying to evaluate our way forward with the search approach.
