Commit 3cd9936

Merge branch 'develop'
2 parents 2e1336c + fcf0ff3 commit 3cd9936

20 files changed: 473 additions and 1201 deletions

.github/workflows/ci-cd.yml

Lines changed: 3 additions & 3 deletions
@@ -40,7 +40,7 @@ jobs:
     - name: Install dependencies
       run: |
         python -m pip install --upgrade pip
-        pip install -e .[dev]
+        pip install -e ".[dev,all]"

     - name: Run tests with coverage
       run: |
@@ -50,7 +50,7 @@ jobs:
       if: matrix.python-version == '3.11'
       uses: codecov/codecov-action@v5
       with:
-        file: ./coverage.xml
+        files: ./coverage.xml
         flags: unittests

   security-scan:
@@ -68,7 +68,7 @@ jobs:
     - name: Install dependencies for security scan
       run: |
         python -m pip install --upgrade pip
-        pip install -e .[dev]
+        pip install -e ".[dev,all]"
         pip install bandit safety

     - name: Run Bandit security scan

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,5 +1,17 @@
 # Changelog

+## 0.9.2
+### Improvements
+- Added ability for sources and segments to have multiple names in chatterlang.
+- Removed signature segments.
+- Added Anthropic as a chat provider.
+- Updated the install options so that not all the providers have to be installed. Include the provider
+you want as a parameter (e.g. pip install talkpipe[anthropic]). Current options are anthropic, openai,
+and ollama. Use "all" to install all three.
+
+## 0.9.1
+Forgot to import the lancedb module in talkpipe/__init__.py, so it wasn't registering the segments.
+
 ## 0.9.0
 ### New and Updated Segments and Sources
 - Added **set**, which simply assigns some constant to a key.

Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ COPY --chown=builder:builder tests/ tests/
 RUN python3 -m pip install --user --upgrade pip setuptools wheel build
 RUN python3 -m pip install --user numpy pandas matplotlib scikit-learn scipy
 ENV SETUPTOOLS_SCM_PRETEND_VERSION_FOR_TALKPIPE=0.1.0
-RUN python3 -m pip install --user -e .[dev]
+RUN python3 -m pip install --user -e .[dev,all]
 RUN python3 -m pytest --log-cli-level=DEBUG
 RUN python3 -m build --wheel

README.md

Lines changed: 11 additions & 0 deletions
@@ -62,6 +62,17 @@ Install TalkPipe:
 pip install talkpipe
 ```

+For LLM support, install the provider(s) you need:
+```bash
+# Install specific providers
+pip install talkpipe[openai] # For OpenAI
+pip install talkpipe[ollama] # For Ollama
+pip install talkpipe[anthropic] # For Anthropic Claude
+
+# Or install all LLM providers
+pip install talkpipe[all]
+```
+
 Create a multi-turn chat function in 2 lines:
 ```python
 from talkpipe.chatterlang import compiler

docs/quickstart.md

Lines changed: 11 additions & 0 deletions
@@ -8,6 +8,17 @@ Welcome to TalkPipe! This guide will help you get up and running quickly with Ta
 pip install talkpipe
 ```

+For LLM support, install the provider(s) you need:
+```bash
+# Install specific providers
+pip install talkpipe[openai] # For OpenAI
+pip install talkpipe[ollama] # For Ollama
+pip install talkpipe[anthropic] # For Anthropic Claude
+
+# Or install all LLM providers
+pip install talkpipe[all]
+```
+
 ## Basic Concepts

 TalkPipe provides two ways to build data processing pipelines:

docs/tutorials/README.md

Lines changed: 5 additions & 1 deletion
@@ -152,7 +152,11 @@ This isn't hypothetical – it's exactly how these tutorials were designed to wo
 ### Start Running Immediately
 One of TalkPipe's core strengths is getting you from idea to working prototype faster than traditional approaches:

-1. **Install TalkPipe**: `pip install talkpipe`
+1. **Install TalkPipe**:
+   ```bash
+   pip install talkpipe[all] # Includes all LLM providers
+   # Or install specific providers: talkpipe[ollama], talkpipe[openai], talkpipe[anthropic]
+   ```
 2. **Run a tutorial**: `cd Tutorial_1-Document_Indexing && ./Step_1_CreateSyntheticData.sh`
 3. **See results**: Working applications in minutes, not hours
158162

pyproject.toml

Lines changed: 14 additions & 4 deletions
@@ -24,7 +24,6 @@ dependencies = [
     'prompt_toolkit',
     'parsy',
     'pydantic',
-    'ollama',
     'requests',
     'numpy',
     'numba==0.62.1',
@@ -34,17 +33,16 @@ dependencies = [
     'readability-lxml',
     'lxml',
     'lxml_html_clean',
-    'openai',
     'fastapi[standard]',
     'ipywidgets',
     'pymongo',
     'umap-learn',
     'scikit-learn',
-    'cryptography',
     'uvicorn',
     'whoosh',
     'lancedb',
-    'deprecated'
+    'deprecated',
+    'pyyaml'
 ]
 dynamic = ["version"]

@@ -55,6 +53,18 @@ dev = [
     'pytest-cov',
     'mongomock'
 ]
+ollama = [
+    'ollama'
+]
+openai = [
+    'openai'
+]
+anthropic = [
+    'anthropic'
+]
+all = [
+    'talkpipe[openai,ollama,anthropic]'
+]

 [project.scripts]
 chatterlang_workbench = "talkpipe.app.chatterlang_workbench:main"
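
Review note: with `ollama`, `openai`, and `anthropic` moved into optional extras, code that touches a provider can no longer assume its package is importable. The following is a minimal sketch of one way to check for that at runtime. It is not taken from this commit; the helper names `installed_providers` and `require_provider` and the `PROVIDER_MODULES` mapping are hypothetical, and the mapping assumes each extra installs a top-level module of the same name.

```python
from importlib.util import find_spec

# Hypothetical mapping: extra name -> top-level module it is assumed to install.
PROVIDER_MODULES = {
    "ollama": "ollama",
    "openai": "openai",
    "anthropic": "anthropic",
}


def installed_providers(modules=PROVIDER_MODULES):
    """Return the sorted extras whose module can be found on the import path."""
    return sorted(name for name, mod in modules.items() if find_spec(mod) is not None)


def require_provider(name, modules=PROVIDER_MODULES):
    """Raise an ImportError with an install hint if the provider package is missing."""
    if name not in modules:
        raise ValueError(f"Unknown provider: {name!r}")
    if find_spec(modules[name]) is None:
        raise ImportError(
            f"Provider {name!r} is not installed. Try: pip install 'talkpipe[{name}]'"
        )


if __name__ == "__main__":
    print("Installed providers:", installed_providers())
```

`find_spec` only locates the module without importing it, so the check stays cheap even when several providers are installed.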

src/talkpipe/__init__.py

Lines changed: 4 additions & 1 deletion
@@ -1,3 +1,7 @@
+import warnings
+warnings.filterwarnings("ignore", message=".*ColPaliEmbeddings.*has conflict with protected namespace.*")
+warnings.filterwarnings("ignore", message=".*SigLipEmbeddings.*has conflict with protected namespace.*")
+
 from talkpipe.pipe.basic import *
 from talkpipe.pipe.math import *
 from talkpipe.pipe.io import *
@@ -12,7 +16,6 @@
 from talkpipe.operations.filtering import *
 from talkpipe.operations.transforms import *
 from talkpipe.operations.matrices import *
-from talkpipe.operations.signatures import *
 from talkpipe.app.chatterlang_serve import *
 from talkpipe.search.whoosh import *
 from talkpipe.search.simplevectordb import *
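
Review note: the `message` argument of `warnings.filterwarnings` is a regular expression matched against the start of the warning text, which is why both patterns above begin with `.*`. A small sketch of that behavior follows; the `emit` helper and the sample warning strings are illustrative assumptions, not text captured from a real run.

```python
import warnings

PATTERN = ".*ColPaliEmbeddings.*has conflict with protected namespace.*"


def emit(messages):
    """Emit each message as a UserWarning and return the texts that were not filtered."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # known baseline: record everything
        # filterwarnings inserts at the front, so this rule is checked first.
        warnings.filterwarnings("ignore", message=PATTERN)
        for text in messages:
            warnings.warn(text, UserWarning)
    return [str(w.message) for w in caught]


shown = emit([
    # Illustrative sample text only.
    'Field "model_name" in ColPaliEmbeddings has conflict with protected namespace "model_".',
    "some unrelated warning",
])
print(shown)  # ['some unrelated warning']
```

Because the filters are installed at import time of `talkpipe`, they apply process-wide, so keeping the patterns this narrow avoids hiding unrelated warnings.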

src/talkpipe/app/chatterlang_reference_browser.py

Lines changed: 74 additions & 35 deletions
@@ -18,24 +18,36 @@

 class TalkPipeDoc:
     """Represents a single TalkPipe component (class or function)."""
-
-    def __init__(self, name: str, chatterlang_name: str, doc_type: str,
-                 module: str, base_classes: List[str], docstring: str,
+
+    def __init__(self, name: str, chatterlang_names: List[str], doc_type: str,
+                 module: str, base_classes: List[str], docstring: str,
                  parameters: Dict[str, str]):
         self.name = name
-        self.chatterlang_name = chatterlang_name
+        self.chatterlang_names = chatterlang_names # List of all names for this component
+        self.primary_name = chatterlang_names[0] # Primary name for display
         self.doc_type = doc_type # 'Source', 'Segment', 'Field Segment'
         self.module = module
         self.base_classes = base_classes
         self.docstring = docstring
         self.parameters = parameters

+    @property
+    def chatterlang_name(self):
+        """Backward compatibility property."""
+        return self.primary_name
+
+    @property
+    def all_names_display(self):
+        """Display string showing all names."""
+        return ", ".join(self.chatterlang_names)
+

 class TalkPipeBrowser:
     """Interactive terminal browser for TalkPipe documentation."""
-
+
     def __init__(self):
-        self.components: Dict[str, TalkPipeDoc] = {}
+        self.components: Dict[str, TalkPipeDoc] = {} # Maps primary name to component
+        self.name_to_primary: Dict[str, str] = {} # Maps any name to primary name
         self.modules: Dict[str, List[str]] = {}
         self.load_components()

@@ -44,23 +56,38 @@ def _extract_parameters(self, cls: type) -> Dict[str, str]:
         return extract_parameters_dict(cls)

     def load_components(self):
-        """Load all components from the plugin system."""
+        """Load all components from the plugin system, grouping multiple names for the same class."""
         load_plugins() # Ensure plugins are loaded
-
+
+        # Group components by class to consolidate multiple names
+        class_to_names = {}
+        class_to_type = {}
+
         # Load sources
         for chatterlang_name, cls in input_registry.all.items():
-            component_info = extract_component_info(chatterlang_name, cls, "Source")
-            if component_info:
-                self._load_component_from_info(component_info)
-
+            if cls not in class_to_names:
+                class_to_names[cls] = []
+                class_to_type[cls] = "Source"
+            class_to_names[cls].append(chatterlang_name)
+
         # Load segments
         for chatterlang_name, cls in segment_registry.all.items():
-            component_type = detect_component_type(cls, "Segment")
-            component_info = extract_component_info(chatterlang_name, cls, component_type)
+            if cls not in class_to_names:
+                class_to_names[cls] = []
+                class_to_type[cls] = detect_component_type(cls, "Segment")
+            class_to_names[cls].append(chatterlang_name)
+
+        # Create consolidated components
+        for cls, names in class_to_names.items():
+            # Sort names to ensure consistent primary name selection
+            names.sort()
+            primary_name = names[0]
+
+            component_info = extract_component_info(primary_name, cls, class_to_type[cls])
             if component_info:
-                self._load_component_from_info(component_info)
+                self._load_component_from_info(component_info, names)

-    def _load_component_from_info(self, component_info):
+    def _load_component_from_info(self, component_info, all_names: List[str]):
         """Load a single component from ComponentInfo into the browser."""
         try:
             # Convert parameters from ParamSpec list to dict for browser compatibility
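
Review note: the grouping step in `load_components` can be read in isolation as "invert a name-to-class registry, then pick the alphabetically first name as primary". The sketch below illustrates that idea with a plain dict standing in for the registries; the classes `Echo` and `Reader`, the sample names, and both helper functions are hypothetical and not part of TalkPipe.

```python
from typing import Dict, List


class Echo:  # hypothetical component registered under two names
    pass


class Reader:  # hypothetical component registered under one name
    pass


# Stand-in for a registry's name -> class mapping.
registry: Dict[str, type] = {"print": Echo, "echo": Echo, "readFile": Reader}


def group_names_by_class(reg: Dict[str, type]) -> Dict[type, List[str]]:
    """Collect every registered name per class, sorted so names[0] is stable."""
    grouped: Dict[type, List[str]] = {}
    for name, cls in reg.items():
        grouped.setdefault(cls, []).append(name)
    for names in grouped.values():
        names.sort()
    return grouped


def build_alias_index(grouped: Dict[type, List[str]]) -> Dict[str, str]:
    """Map every name (primary or alias) to its class's primary name."""
    return {name: names[0] for names in grouped.values() for name in names}


grouped = group_names_by_class(registry)
index = build_alias_index(grouped)
print(grouped[Echo])   # ['echo', 'print']
print(index["print"])  # echo
```

Note that a plain `sort()` is case-sensitive, so the primary name is the lexicographically smallest one rather than the first one registered.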
@@ -100,20 +127,26 @@ def _load_component_from_info(self, component_info):
             # Create component
             component = TalkPipeDoc(
                 name=component_info.name,
-                chatterlang_name=component_info.chatterlang_name,
+                chatterlang_names=all_names,
                 doc_type=component_info.component_type,
                 module=component_info.module,
                 base_classes=component_info.base_classes,
                 docstring=component_info.docstring,
                 parameters=parameters
             )
-
-            self.components[component_info.chatterlang_name] = component
+
+            # Store component under primary name
+            primary_name = all_names[0]
+            self.components[primary_name] = component
+
+            # Map all names to the primary name for lookup
+            for name in all_names:
+                self.name_to_primary[name] = primary_name

             # Group by module
             if component_info.module not in self.modules:
                 self.modules[component_info.module] = []
-            self.modules[component_info.module].append(component_info.chatterlang_name)
+            self.modules[component_info.module].append(primary_name)

         except Exception as e:
             print(f"Warning: Failed to load component {component_info.chatterlang_name}: {e}")
@@ -223,24 +256,29 @@ def _list_module_components(self, module_name: str):
                 type_icon = "🔧"
             else:
                 type_icon = "⚙️"
-            print(f"{type_icon} {comp.chatterlang_name:<20} ({comp.name})")
+            print(f"{type_icon} {comp.all_names_display:<30} ({comp.name})")
         print()

     def _show_component(self, component_name: str):
         """Show detailed information about a component."""
-        # Try exact match first
-        component = self.components.get(component_name)
-
-        # If not found, try case-insensitive search
+        # Try exact match using name lookup
+        primary_name = self.name_to_primary.get(component_name)
+        component = None
+
+        if primary_name:
+            component = self.components.get(primary_name)
+
+        # If not found, try case-insensitive search in all names
         if not component:
-            matches = [name for name in self.components.keys()
+            matches = [name for name in self.name_to_primary.keys()
                        if name.lower() == component_name.lower()]
             if matches:
-                component = self.components[matches[0]]
+                primary_name = self.name_to_primary[matches[0]]
+                component = self.components[primary_name]

         # If still not found, suggest similar names
         if not component:
-            similar = [name for name in self.components.keys()
+            similar = [name for name in self.name_to_primary.keys()
                        if component_name.lower() in name.lower()]
             if similar:
                 print(f"Component '{component_name}' not found. Did you mean:")
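
Review note: the lookup in `_show_component` is a three-stage fallback: exact alias match, then case-insensitive match, then substring suggestions. A compact sketch of the same flow as a pure function is shown here; the `resolve` function and the sample alias table are hypothetical and only mirror the logic in the hunk above.

```python
from typing import Dict, List, Optional, Tuple


def resolve(name: str, name_to_primary: Dict[str, str]) -> Tuple[Optional[str], List[str]]:
    """Return (primary_name, suggestions); suggestions is filled only on a miss."""
    # Stage 1: exact match on any registered name.
    primary = name_to_primary.get(name)
    if primary is not None:
        return primary, []

    # Stage 2: case-insensitive match on any registered name.
    lowered = name.lower()
    for alias, alias_primary in name_to_primary.items():
        if alias.lower() == lowered:
            return alias_primary, []

    # Stage 3: no match, so offer names that contain the query.
    suggestions = sorted(a for a in name_to_primary if lowered in a.lower())
    return None, suggestions


# Hypothetical alias table: every name maps to its primary name.
aliases = {"echo": "echo", "print": "echo", "readFile": "readFile"}

print(resolve("print", aliases))     # ('echo', [])
print(resolve("READFILE", aliases))  # ('readFile', [])
print(resolve("read", aliases))      # (None, ['readFile'])
```

Returning the suggestions instead of printing them keeps the lookup testable independently of the terminal output.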
@@ -252,7 +290,7 @@ def _show_component(self, component_name: str):

         # Display component details
         print(f"\n{'='*60}")
-        print(f"📋 {component.chatterlang_name}")
+        print(f"📋 {component.all_names_display}")
         print(f"{'='*60}")
         print(f"Class/Function: {component.name}")
         print(f"Type: {component.doc_type}")
@@ -281,11 +319,12 @@ def _search_components(self, search_term: str):
         """Search for components by name or description."""
         search_lower = search_term.lower()
         matches = []
-
+
         for comp_name, component in self.components.items():
-            # Search in chatterlang name, class name, and docstring
-            if (search_lower in comp_name.lower() or
-                search_lower in component.name.lower() or
+            # Search in all chatterlang names, class name, and docstring
+            name_match = any(search_lower in name.lower() for name in component.chatterlang_names)
+            if (name_match or
+                search_lower in component.name.lower() or
                 search_lower in component.docstring.lower()):
                 matches.append(component)

@@ -296,14 +335,14 @@ def _search_components(self, search_term: str):
         print(f"\nSearch Results for '{search_term}' ({len(matches)} found):")
         print("-" * 60)

-        for component in sorted(matches, key=lambda x: x.chatterlang_name):
+        for component in sorted(matches, key=lambda x: x.primary_name):
             if component.doc_type == "Source":
                 type_icon = "🔌"
             elif component.doc_type == "Field Segment":
                 type_icon = "🔧"
             else:
                 type_icon = "⚙️"
-            print(f"{type_icon} {component.chatterlang_name:<20} ({component.module})")
+            print(f"{type_icon} {component.all_names_display:<30} ({component.module})")

             # Show brief description
             if component.docstring:
