@lucifertrj
Contributor

@lucifertrj lucifertrj commented Oct 8, 2025

Add FastEmbed as an embedder for local embedding inference use cases.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g. code style improvements, linting)
  • Documentation update

How Has This Been Tested?

So far, I have tested the FastEmbed embedder directly through a mem0 Memory config:

from mem0 import Memory
import os
os.environ['GOOGLE_API_KEY'] = "<api-key>"

config = {
    "llm": {
        "provider": "gemini",
        "config": {
            "model": "gemini-2.5-flash-lite",
            "temperature": 0.8,
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "travel",
            "path": "/tmp/db",
            "embedding_model_dims": 768,
        }
    },
    "embedder": {
        "provider": "fastembed",
        "config": {
            "model": "jinaai/jina-embeddings-v2-base-en"
        }
    }
}
client = Memory.from_config(config)
messages = [
    {"role": "user", "content": "What is the must try food in Baroda"},
    {"role": "assistant", "content": "Sev Usal is must"},
    {"role": "user", "content": "I'm not into street food, I prefer Gujarati thalis."},
    {"role": "assistant", "content": "Head to Mandap in Baroda, it’s famous for authentic Gujarati thalis."},
]
result1 = client.add(messages, user_id="personal", metadata={"category": "food"})
print(result1)

I will also push the test scripts to the tests folder.

Please delete options that are not relevant.

  • Unit Test
  • Test Script (please provide)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed

@CLAassistant

CLAassistant commented Oct 8, 2025

CLA assistant check
All committers have signed the CLA.

@parshvadaftari
Contributor

Hey @lucifertrj, thank you for implementing this, but can you please look into the failing tests?

self.config.model = self.config.model or "thenlper/gte-large"
self.config.embedding_dims = self.config.embedding_dims or 1024

self.dense_model = TextEmbedding(model_name = self.config.model,max_lenth = self.config.embedding_dims)
Contributor

Can you remove max_lenth from here? FastEmbed doesn't support this parameter.
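
For illustration, a minimal sketch of the initializer with the unsupported keyword dropped. The names `EmbedderConfig`, `FastEmbedSketch`, and `recording_factory` are hypothetical and not part of this PR; `model_factory` stands in for `fastembed.TextEmbedding` so the snippet runs without downloading a model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class EmbedderConfig:
    # Hypothetical stand-in for the project's embedder config object.
    model: Optional[str] = None
    embedding_dims: Optional[int] = None


class FastEmbedSketch:
    """Hypothetical wrapper; in practice model_factory would be fastembed.TextEmbedding."""

    def __init__(self, config: EmbedderConfig, model_factory: Callable[..., Any]):
        self.config = config
        # Same defaults as the lines under review.
        self.config.model = self.config.model or "thenlper/gte-large"
        self.config.embedding_dims = self.config.embedding_dims or 1024
        # Only model_name is passed to the model constructor;
        # embedding_dims stays on the config and is not forwarded.
        self.dense_model = model_factory(model_name=self.config.model)


def recording_factory(**kwargs: Any) -> dict:
    # Stub that returns the keyword arguments it received.
    return kwargs


embedder = FastEmbedSketch(EmbedderConfig(), recording_factory)
print(embedder.dense_model)  # {'model_name': 'thenlper/gte-large'}
```

With the real library, the last line of `__init__` would presumably read `TextEmbedding(model_name=self.config.model)`.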

Contributor Author

Can you check now?

@parshvadaftari
Contributor

Hey @lucifertrj the tests are failing. Can you please check?

@parshvadaftari
Contributor

Can you resolve merge conflicts?

@lucifertrj
Contributor Author

Can you resolve merge conflicts?

I have resolved them. I am still getting a "Vercel: Authorization required to deploy" message, although I have already approved it.

@parshvadaftari
Contributor

Vercel deployment is for the maintainers of the project, so that is fine and not an issue on your end.

@lucifertrj
Contributor Author

Vercel deployment is for the maintainers of the project, so that is fine and not an issue on your end.

Alright got it.

@lucifertrj
Contributor Author

This is the test.py I used for testing (screenshot attached):

from mem0 import Memory
import os
os.environ['GOOGLE_API_KEY'] = ""

config = {
    "llm": {
        "provider": "gemini",
        "config": {
            "model": "gemini-2.5-flash-lite",
            "temperature": 0.8,
        }
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "devfestbaroda",
            "path": "/tmp/db",
            "embedding_model_dims": 768,
        }
    },
    "embedder": {
        "provider": "fastembed",
        "config": {
            "model": "jinaai/jina-embeddings-v2-base-en"
        }
    }
}
client = Memory.from_config(config)
messages = [
    {"role": "user", "content": "What is the must try food in Baroda"},
    {"role": "assistant", "content": "Sev Usal is must"},
    {"role": "user", "content": "I'm not into street food, I prefer Gujarati thalis."},
    {"role": "assistant", "content": "Head to Mandap in Baroda, it’s famous for authentic Gujarati thalis."},
]
result1 = client.add(messages, user_id="personal", metadata={"category": "food"})
print(result1)
[Screenshot: 2025-10-16 at 22 11 05]

@parshvadaftari
Contributor

parshvadaftari commented Oct 16, 2025

@lucifertrj there's no support for embedding_dims with FastEmbed. How will the dimensions be decided, and what if I want to specify a particular dimension? That should be either documented or implemented. The rest of the implementation looks good to me.

@lucifertrj
Contributor Author

@lucifertrj there's no support for embedding_dims with FastEmbed. How will the dimensions be decided, and what if I want to specify a particular dimension? That should be documented. The rest of the implementation looks good to me.

In that case, they will have to use a specific vector store, mainly Qdrant. FastEmbed goes hand in hand with the Qdrant vector store. I can help out with the documentation.
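
To make the dimension question concrete: the output size is fixed by the chosen model, so the vector store's embedding_model_dims has to match it (768 for jinaai/jina-embeddings-v2-base-en in the test config). Below is a rough sketch of a consistency check that could accompany the documentation. `KNOWN_DIMS` and `check_dims` are hypothetical helpers, not part of mem0 or FastEmbed, and the sizes are taken from the models' published cards. As far as I know, FastEmbed's `TextEmbedding.list_supported_models()` reports a `dim` for each model and could replace the hard-coded table.

```python
# Output sizes of a few models, per their published model cards.
KNOWN_DIMS = {
    "jinaai/jina-embeddings-v2-base-en": 768,
    "thenlper/gte-large": 1024,
    "BAAI/bge-small-en-v1.5": 384,
}


def check_dims(config: dict, known_dims: dict = KNOWN_DIMS) -> int:
    """Return the embedder model's dimension, raising if the vector store disagrees."""
    model = config["embedder"]["config"]["model"]
    if model not in known_dims:
        raise KeyError(f"unknown embedding size for model {model!r}")
    model_dim = known_dims[model]
    # embedding_model_dims is optional here; only an explicit mismatch is an error.
    store_dim = config["vector_store"]["config"].get("embedding_model_dims")
    if store_dim is not None and store_dim != model_dim:
        raise ValueError(
            f"vector store expects {store_dim} dims but {model!r} produces {model_dim}"
        )
    return model_dim


config = {
    "vector_store": {"provider": "qdrant", "config": {"embedding_model_dims": 768}},
    "embedder": {
        "provider": "fastembed",
        "config": {"model": "jinaai/jina-embeddings-v2-base-en"},
    },
}
print(check_dims(config))  # 768
```

A user who needs a different dimension would pick a different model and set embedding_model_dims accordingly, rather than passing a dimension to the embedder.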

@parshvadaftari
Contributor

Okay, cool. It looks good to me!

Contributor

@parshvadaftari parshvadaftari left a comment

Looks good to me!

@parshvadaftari parshvadaftari merged commit 1090784 into mem0ai:main Oct 16, 2025
6 of 7 checks passed
@parshvadaftari
Contributor

Thanks a lot for your contribution @lucifertrj 🚀

@lucifertrj
Contributor Author

Thanks a lot for your contribution @lucifertrj 🚀

Let's go. Let me add the documentation too.

@parshvadaftari
Contributor

Let me know once you raise the PR!🔥
