157 changes: 157 additions & 0 deletions docs/components/vectordbs/dbs/alibabacloud_mysql.mdx
@@ -0,0 +1,157 @@
[AlibabaCloud MySQL](https://www.alibabacloud.com/product/apsaradb-for-rds-mysql) is a fully managed MySQL database service. It supports vector storage and similarity search through MySQL's native `VECTOR` data type, vector functions, and HNSW indexing.

### Usage

<CodeGroup>
```python Python
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "alibabacloud_mysql",
        "config": {
            "dbname": "vector_db",
            "collection_name": "memories",
            "user": "your_username",
            "password": "your_password",
            "host": "your-mysql-host.mysql.rds.aliyuncs.com",
            "port": 3306,
            "embedding_model_dims": 1536,
            "distance_function": "euclidean",
            "m_value": 16
        }
    }
}

m = Memory.from_config(config)
messages = [
    {"role": "user", "content": "I'm planning to watch a movie tonight. Any recommendations?"},
    {"role": "assistant", "content": "How about a thriller movie? They can be quite engaging."},
    {"role": "user", "content": "I'm not a big fan of thriller movies but I love sci-fi movies."},
    {"role": "assistant", "content": "Got it! I'll avoid thriller recommendations and suggest sci-fi movies in the future."}
]
m.add(messages, user_id="alice", metadata={"category": "movies"})
```

```python Python (with connection string)
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "alibabacloud_mysql",
        "config": {
            "connection_string": "mysql://username:password@host:3306/database",
            "collection_name": "memories",
            "distance_function": "cosine"
        }
    }
}

m = Memory.from_config(config)
```

```python Python (with SSL)
import os
from mem0 import Memory

os.environ["OPENAI_API_KEY"] = "sk-xx"

config = {
    "vector_store": {
        "provider": "alibabacloud_mysql",
        "config": {
            "dbname": "vector_db",
            "collection_name": "memories",
            "user": "your_username",
            "password": "your_password",
            "host": "your-mysql-host.mysql.rds.aliyuncs.com",
            "port": 3306,
            "embedding_model_dims": 1536,
            "ssl_ca": "/path/to/ca-cert.pem",
            "ssl_cert": "/path/to/client-cert.pem",
            "ssl_key": "/path/to/client-key.pem"
        }
    }
}

m = Memory.from_config(config)
```
</CodeGroup>

### Config

Here are the parameters available for configuring AlibabaCloud MySQL:

| Parameter | Description | Default Value |
| --- | --- | --- |
| `dbname` | The name of the database | `mem0` |
| `collection_name` | The name of the collection (table) | `mem0` |
| `embedding_model_dims` | Dimensions of the embedding model | `1536` |
| `user` | Database username | `None` |
| `password` | Database password | `None` |
| `host` | Database host address | `None` |
| `port` | Database port | `3306` |
| `distance_function` | Distance function for vector index ('euclidean' or 'cosine') | `euclidean` |
| `m_value` | M parameter for HNSW index (3-200). Higher values = more accurate but slower | `16` |
| `ssl_disabled` | Disable SSL connection | `False` |
| `ssl_ca` | SSL CA certificate file path | `None` |
| `ssl_cert` | SSL certificate file path | `None` |
| `ssl_key` | SSL key file path | `None` |
| `connection_string` | MySQL connection string (overrides individual connection parameters) | `None` |
| `charset` | Character set for the connection | `utf8mb4` |
| `autocommit` | Enable autocommit mode | `True` |
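
To keep credentials out of source files, the parameters above can be assembled from environment variables. The following is a minimal sketch: the helper `build_vector_store_config` and the `MYSQL_*` / `EMBEDDING_DIMS` variable names are hypothetical and are not read by mem0 itself.

```python
import os
from typing import Mapping, Optional


def build_vector_store_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Assemble a mem0 vector store config from environment variables.

    The variable names used here are illustrative assumptions,
    not names that mem0 looks up on its own.
    """
    env = os.environ if env is None else env
    return {
        "vector_store": {
            "provider": "alibabacloud_mysql",
            "config": {
                "dbname": env.get("MYSQL_DBNAME", "vector_db"),
                "collection_name": env.get("MYSQL_COLLECTION", "memories"),
                "user": env["MYSQL_USER"],          # required, KeyError if missing
                "password": env["MYSQL_PASSWORD"],  # required, KeyError if missing
                "host": env["MYSQL_HOST"],          # required, KeyError if missing
                "port": int(env.get("MYSQL_PORT", "3306")),
                "embedding_model_dims": int(env.get("EMBEDDING_DIMS", "1536")),
                "distance_function": env.get("MYSQL_DISTANCE", "euclidean"),
                "m_value": int(env.get("MYSQL_HNSW_M", "16")),
            },
        }
    }
```

The resulting dictionary can be passed to `Memory.from_config` in the same way as the hand-written `config` in the Usage section.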

### Features

- **Native Vector Support**: Uses MySQL's native VECTOR data type for optimal performance
- **HNSW Indexing**: Supports Hierarchical Navigable Small World (HNSW) indexing for fast similarity search
- **Multiple Distance Functions**: Supports both Euclidean and Cosine distance functions
- **SSL Support**: Full SSL/TLS encryption support for secure connections
- **JSON Payload**: Store additional metadata as JSON alongside vectors
- **Connection Pooling**: Efficient connection management with context managers
- **Flexible Configuration**: Support for connection strings or individual parameters

### Prerequisites

1. **AlibabaCloud MySQL Instance**: You need an AlibabaCloud RDS MySQL instance with vector support enabled
2. **Database Setup**: Ensure your MySQL instance supports the VECTOR data type and vector functions
3. **Network Access**: Configure security groups and whitelist to allow connections from your application
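
Before configuring mem0, it can help to confirm that the instance is reachable from the application host. The sketch below only checks that a TCP connection can be opened; it does not authenticate or verify vector support. The helper name `can_reach` is hypothetical.

```python
import socket


def can_reach(host: str, port: int = 3306, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A `False` result for a host that is known to be running usually points at the security group or whitelist settings mentioned above.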

### Connection String Format

The connection string should follow this format:
```
mysql://username:password@host:port/database
```

Example:
```
mysql://myuser:mypassword@your-mysql-host.mysql.rds.aliyuncs.com:3306/vector_db
```
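
To illustrate how the parts of this format map onto the individual connection parameters, here is a minimal sketch using Python's standard `urllib.parse`. It is not the connector's own parsing code, which is not shown here; the helper name `parse_connection_string` and the fallback to port 3306 are assumptions made for the example.

```python
from urllib.parse import unquote, urlparse


def parse_connection_string(conn_str: str) -> dict:
    """Split a mysql:// URL into the individual connection parameters."""
    parsed = urlparse(conn_str)
    if parsed.scheme != "mysql":
        raise ValueError(f"expected a mysql:// URL, got scheme {parsed.scheme!r}")
    return {
        # Credentials may be percent-encoded in a URL, so decode them.
        "user": unquote(parsed.username) if parsed.username else None,
        "password": unquote(parsed.password) if parsed.password else None,
        "host": parsed.hostname,
        "port": parsed.port or 3306,  # assumed fallback for this sketch
        "dbname": parsed.path.lstrip("/") or None,
    }
```

If the string is treated as a standard URL, as in this sketch, characters such as `@` or `/` inside a password would need to be percent-encoded (for example `%40` for `@`).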

### Distance Functions

- **euclidean**: Uses Euclidean distance for vector similarity
- **cosine**: Uses Cosine distance for vector similarity

The distance function is set during collection creation and affects the HNSW index configuration.
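
For intuition about how the two options differ, the sketch below computes both distances in plain Python. It assumes the common convention that cosine distance is `1 - cosine similarity`; the exact definition used by the database's vector functions is not stated here, so treat this as an illustration rather than a reference.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; depends on direction only, not magnitude."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)
```

Two vectors that point in the same direction but differ in length have a cosine distance of zero and a non-zero Euclidean distance, which is why the choice matters when embeddings are not normalized.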

### Performance Tuning

- **m_value**: Controls the HNSW index quality vs speed tradeoff
  - Lower values (3-16): Faster indexing, less memory, lower accuracy
  - Higher values (32-200): Slower indexing, more memory, higher accuracy
- **embedding_model_dims**: Should match your embedding model's output dimensions exactly
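
A dimension mismatch is easy to catch before any data is written. The following sketch checks a sample embedding against the configured values; `check_tuning` is a hypothetical helper, and the 3-200 range is the one given for `m_value` above.

```python
from typing import Sequence


def check_tuning(vector: Sequence[float], embedding_model_dims: int, m_value: int = 16) -> None:
    """Raise ValueError if m_value is out of range or the vector length is wrong."""
    if not 3 <= m_value <= 200:
        raise ValueError(f"m_value must be between 3 and 200, got {m_value}")
    if len(vector) != embedding_model_dims:
        raise ValueError(
            f"vector has {len(vector)} dimensions, expected {embedding_model_dims}"
        )
```

Running this once against a real embedding from your model is a cheap way to confirm that `embedding_model_dims` is set correctly.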

### Error Handling

The connector includes comprehensive error handling and logging:
- Connection errors are automatically caught and logged
- Failed transactions are rolled back automatically
- Detailed error messages help with debugging
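
The rollback-and-log behaviour described above follows a common DB-API pattern, sketched below. This is not the connector's actual code, which is not shown here; the `transaction` helper is hypothetical, and any DB-API connection with `cursor()`, `commit()`, and `rollback()` should fit.

```python
import logging
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def transaction(conn):
    """Yield a cursor; commit on success, roll back and log on any exception."""
    cur = conn.cursor()
    try:
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        logger.exception("Transaction failed and was rolled back")
        raise  # re-raise so the caller still sees the original error
    finally:
        cur.close()
```

Statements executed inside `with transaction(conn) as cur:` are either all committed or all rolled back, and the original exception still reaches the caller.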
64 changes: 64 additions & 0 deletions mem0/configs/vector_stores/alibabacloud_mysql.py
@@ -0,0 +1,64 @@
from typing import Any, Dict, Optional

from pydantic import BaseModel, Field, model_validator


class MySQLVectorConfig(BaseModel):
    dbname: str = Field("mem0", description="Default name for the database")
    collection_name: str = Field("mem0", description="Default name for the collection")
    embedding_model_dims: Optional[int] = Field(1536, description="Dimensions of the embedding model")
    user: Optional[str] = Field(None, description="Database user")
    password: Optional[str] = Field(None, description="Database password")
    host: Optional[str] = Field(None, description="Database host. Default is localhost")
    port: Optional[int] = Field(None, description="Database port. Default is 3306")
    distance_function: Optional[str] = Field("euclidean", description="Distance function for vector index ('euclidean' or 'cosine')")
    m_value: Optional[int] = Field(16, description="M parameter for HNSW index (3-200). Higher values = more accurate but slower")
    # SSL and connection options
    ssl_disabled: Optional[bool] = Field(False, description="Disable SSL connection")
    ssl_ca: Optional[str] = Field(None, description="SSL CA certificate file path")
    ssl_cert: Optional[str] = Field(None, description="SSL certificate file path")
    ssl_key: Optional[str] = Field(None, description="SSL key file path")
    connection_string: Optional[str] = Field(None, description="AlibabaCloud MySQL connection string (overrides individual connection parameters)")
    charset: Optional[str] = Field("utf8mb4", description="Character set for the connection")
    autocommit: Optional[bool] = Field(True, description="Enable autocommit mode")

    @model_validator(mode="before")
    @classmethod
    def check_auth_and_connection(cls, values):
        # If connection_string is provided, skip validation of individual connection parameters
        if values.get("connection_string") is not None:
            return values

        # Otherwise, validate individual connection parameters.
        # Use `or` so that a missing value in either pair is rejected, as the messages state.
        user, password = values.get("user"), values.get("password")
        host, port = values.get("host"), values.get("port")
        if not user or not password:
            raise ValueError("Both 'user' and 'password' must be provided when not using connection_string.")
        if not host or not port:
            raise ValueError("Both 'host' and 'port' must be provided when not using connection_string.")
        return values

    @model_validator(mode="before")
    @classmethod
    def validate_distance_function(cls, values):
        distance_function = values.get("distance_function", "euclidean")
        if distance_function not in ["euclidean", "cosine"]:
            raise ValueError("distance_function must be either 'euclidean' or 'cosine'")
        return values

    @model_validator(mode="before")
    @classmethod
    def validate_m_value(cls, values):
        m_value = values.get("m_value", 16)
        # m_value is Optional, so only range-check it when a value is given
        if m_value is not None and not (3 <= m_value <= 200):
            raise ValueError("m_value must be between 3 and 200")
        return values

    @model_validator(mode="before")
    @classmethod
    def validate_extra_fields(cls, values: Dict[str, Any]) -> Dict[str, Any]:
        # Reject any key that is not a declared field of this model
        allowed_fields = set(cls.model_fields.keys())
        input_fields = set(values.keys())
        extra_fields = input_fields - allowed_fields
        if extra_fields:
            raise ValueError(
                f"Extra fields not allowed: {', '.join(extra_fields)}. Please input only the following fields: {', '.join(allowed_fields)}"
            )
        return values
1 change: 1 addition & 0 deletions mem0/utils/factory.py
@@ -186,6 +186,7 @@ class VectorStoreFactory:
"baidu": "mem0.vector_stores.baidu.BaiduDB",
"cassandra": "mem0.vector_stores.cassandra.CassandraDB",
"neptune": "mem0.vector_stores.neptune_analytics.NeptuneAnalyticsVector",
"alibabacloud_mysql": "mem0.vector_stores.alibabacloud_mysql.MySQLVector",
}

@classmethod