Add support for loading model from in-memory buffer #7560
Replies: 2 comments 1 reply
If I'm not mistaken, there should be an option to make a memory buffer "appear" as a virtual file. If you can do that, then you should be able to reuse …
Ran into this discussion after trying my hand at writing some functions. The ggml code base is pretty massive, and Rider doesn't want to give me IntelliSense for it, so I might have made some duplicate functions. My C++ knowledge mainly comes from Unreal Engine, so this isn't the easiest transition, lol, and I've only touched C a handful of times. From my testing, my changes to llama_load_model_from_file don't seem to affect performance or break anything.

On the note of all this: I think being able to run inference directly from a buffer would be nice in systems where accessing the file system is a hassle, frowned upon, or simply not possible. With a virtual file, the previously mentioned issues still arise.
LlamaCpp currently supports loading a model file from an absolute path.
Are there any plans to add an API for loading a llama model from a buffer pointer that already resides in memory? With such support, we could pass a pointer obtained via memory-mapped I/O or other means, which would cover more usage scenarios.