Conversation

@ogabrielluiz

@ogabrielluiz ogabrielluiz commented Sep 26, 2025

Summary by CodeRabbit

  • New Features
    • Added starter projects: Basic Prompting, Blog Writer, Document Q&A, Financial Report Parser, Knowledge Ingestion, Knowledge Retrieval, Memory Chatbot, Research Translation Loop, SEO Keyword Generator, SaaS Pricing, Search Agent, and Twitter Thread Generator. These ready-to-run flows cover chat, retrieval, parsing, agents, pricing, SEO, and social content generation.
  • Refactor
    • Improved logging import compatibility so existing imports continue to work without changes.

- Introduced multiple starter project JSON files, including "Basic Prompt Chaining", "Blog Writer", "Document Q&A", and others, to provide diverse templates for chatbot functionalities.
- Each project includes detailed descriptions and structured data to facilitate easy integration and usage.
- Enhanced the overall framework for building chatbots, content generation, and data processing tasks, improving user experience and flexibility in application development.
…ion 1.6.0

- Introduced a new test suite to validate that starter project templates from version 1.6.0 can be loaded without import errors.
- Implemented asynchronous fetching of starter project JSON files from GitHub and caching them locally for testing.
- Added multiple test cases to ensure the existence, validity, and proper execution of starter projects, while checking for import errors related to langflow and lfx modules.
- Enhanced overall test coverage for backward compatibility, ensuring existing projects remain functional with the latest codebase.
- Introduced a new module to maintain compatibility for imports from lfx.logging.logger, redirecting functionality to lfx.log.logger.
- Ensured all original exports are preserved to facilitate a smooth transition for existing codebases relying on the previous import path.
…n backward compatibility tests

- Updated the test suite to utilize httpx for asynchronous HTTP requests instead of aiohttp, enhancing performance and compatibility.
- Improved error handling by checking response status codes and ensuring proper JSON parsing.
- Added handling for known failing starter projects due to import issues in version 1.6.0, marking them as expected failures in tests.
- Enhanced documentation within tests to clarify known issues with specific starter projects.
@coderabbitai

coderabbitai bot commented Sep 26, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a backward-compatibility module lfx.logging.logger that re-exports symbols from lfx.log.logger. Introduces multiple new starter project JSON templates under tests/data/starter_projects_1_6_0 for various workflows (e.g., Document Q&A, Memory Chatbot, Research Translation Loop), containing component configurations and embedded template code.

Changes

  • Logging back-compat re-export (src/lfx/src/lfx/logging/logger.py): New shim module re-exporting InterceptHandler, LogConfig, configure, logger, setup_gunicorn_logger, setup_uvicorn_logger from lfx.log.logger; defines __all__ accordingly.
  • Starter project templates, additions (src/lfx/tests/data/starter_projects_1_6_0/*): Added multiple JSON starter flows: Basic Prompting, Blog Writer, Document Q&A, Financial Report Parser, Knowledge Ingestion, Knowledge Retrieval, Memory Chatbot, Research Translation Loop, SEO Keyword Generator, SaaS Pricing, Search agent, Twitter Thread Generator. Each includes nodes, edges, UI metadata, and embedded component templates/config.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

enhancement, size:XXL, lgtm

Suggested reviewers

  • jordanrfrazier
  • pedrocassalpacheco
  • Cristhianzl

Pre-merge checks and finishing touches

❌ Failed checks (1 error, 1 warning, 1 inconclusive)
  • Test Coverage For New Implementations (❌ Error): The PR introduces a new compatibility module lfx.logging.logger without adding any tests to validate its behavior, so no regression or unit test covers that code; the remaining changes are JSON fixtures under tests/data, which provide resources but do not themselves exercise new or existing logic. Because the newly added Python module is untested, the PR fails the requirement that new functionality be accompanied by tests. Resolution: add tests that import lfx.logging.logger and confirm it re-exports the expected symbols, or otherwise supply coverage for the new functionality, before merging.
  • Test Quality And Coverage (⚠️ Warning): The PR only adds fixture-like JSON definitions under src/lfx/tests/data/starter_projects_1_6_0/ plus a compatibility re-export module; it does not introduce new test cases or adjust existing suites to exercise these additions. There is therefore no validation of the new starter project behaviors, no assertions around loading or executing them, and no verification of the logging re-export. Resolution: add targeted tests that load and exercise each new starter project configuration (including error paths where applicable) and cover the new logging re-export before re-running this check.
  • Test File Naming And Structure (❓ Inconclusive): No new or modified executable test files were introduced in this pull request, only JSON fixture data under src/lfx/tests/data/... and a Python re-export module, so there are no backend, frontend, or integration test files whose naming conventions, structure, or coverage can be evaluated. Resolution: point to the actual test files affected by this change so they can be assessed; if no test code is being added, the check may not apply.
✅ Passed checks (4 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly indicates that the pull request adds tests for the 1.6.0 starter projects in the lfx module, which aligns with the bulk of the changes. It is concise, specific, and free of vague terms.
  • Docstring Coverage (✅ Passed): No functions found in the changes; docstring coverage check skipped.
  • Excessive Mock Usage Warning (✅ Passed): No test files in this pull request introduce or rely on mocks; the changes add data fixtures and a compatibility re-export module without any mocking logic.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🧹 Nitpick comments (13)
src/lfx/src/lfx/logging/logger.py (1)

7-15: Good compatibility shim; consider emitting a deprecation warning.

Re-export looks correct. To guide migrations to lfx.log.logger, optionally emit a DeprecationWarning on import.

 """Backwards compatibility module for lfx.logging.logger.
 
 This module provides backwards compatibility for code that imports from lfx.logging.logger.
 All functionality has been moved to lfx.log.logger.
 """
 
+import warnings
+warnings.warn(
+    "lfx.logging.logger is deprecated; use lfx.log.logger instead.",
+    DeprecationWarning,
+    stacklevel=2,
+)
 
 # Ensure we maintain all the original exports
 from lfx.log.logger import (
     InterceptHandler,
     LogConfig,
     configure,
     logger,
     setup_gunicorn_logger,
     setup_uvicorn_logger,
 )
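As a rough, self-contained sketch of how such a deprecation shim behaves (the pkg.* module names below are stand-ins registered in sys.modules, not the real lfx package):

```python
import sys
import types
import warnings

# Stand-in modules (pkg.* names are illustrative, not the real lfx package).
sys.modules["pkg"] = types.ModuleType("pkg")
sys.modules["pkg.log"] = types.ModuleType("pkg.log")
canonical = types.ModuleType("pkg.log.logger")
canonical.logger = object()  # the canonical export
sys.modules["pkg.log.logger"] = canonical

# The shim module body: warn once on import, then re-export.
shim_body = (
    "import warnings\n"
    "warnings.warn(\n"
    "    'pkg.logging.logger is deprecated; use pkg.log.logger instead.',\n"
    "    DeprecationWarning,\n"
    "    stacklevel=2,\n"
    ")\n"
    "from pkg.log.logger import logger\n"
    "__all__ = ['logger']\n"
)
shim = types.ModuleType("pkg.logging.logger")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    exec(shim_body, shim.__dict__)  # simulate importing the shim module

# Old import path yields the very same object, plus a DeprecationWarning.
assert shim.logger is canonical.logger
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Because the shim re-exports the identical objects, existing call sites keep working while the warning nudges them toward the new path.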
src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (2)

610-650: Headers default diverges from settings; ensure consistency or clarify.

Template sets User-Agent to a literal "langflow" while code defaults to get_settings_service().settings.user_agent. Consider aligning to settings for consistency.


147-164: SplitText defaults in code vs JSON template mismatch.

Code uses chunk_size=1000, chunk_overlap=200 while the template sets 100 and 0. Verify intended defaults or update to avoid confusion in tests.

Based on learnings

src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1)

1743-1779: Minor: prefer explicit asyncio.TimeoutError and verify provider options.

Same Language Model component as other flows. Ensure API key resolution works; consider aligning exception type if adding timeouts later.

Based on learnings

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (2)

479-496: Avoid using private Chroma client internals for embeddings retrieval.

collection = chroma._client.get_collection(...) relies on a private attribute; this is brittle across versions. Prefer public APIs if available, or gate by hasattr and fall back safely.

-                collection = chroma._client.get_collection(name=self.knowledge_base)
-                embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])
+                collection = getattr(chroma, "_collection", None)
+                if collection is None and hasattr(chroma, "_client"):
+                    collection = chroma._client.get_collection(name=self.knowledge_base)  # last-resort fallback
+                if collection is not None:
+                    embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])
+                else:
+                    embeddings_result = {"metadatas": [], "embeddings": []}

Also consider exposing embeddings only when explicitly requested to minimize payload size.


193-213: ChatOutput includes 'clean_data' in field_order but not in template.

Code guards with getattr, so it won’t crash, but UI won’t expose it. Either add the input or remove from field_order.

src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (3)

137-145: Align field name in Chat Input: field_order uses store_message but input is should_store_message.

This mismatch can confuse the UI ordering and bindings.

-              "store_message",
+              "should_store_message",

Verify no other references rely on the old name.

Also applies to: 307-321


246-248: Fix prompt typo.

Minor copy tweak for clarity.

-                "value": "What is this document is about?"
+                "value": "What is this document about?"

1187-1195: Reduce sensitive logging of args.

Logging the full args dict includes file_path and processing settings. Consider lowering verbosity or omitting the path to avoid leaking server paths in logs.

-        self.log(args)
+        self.log({k: v for k, v in args.items() if k != "file_path"})
src/lfx/tests/data/starter_projects_1_6_0/Financial Report Parser.json (1)

742-776: Handle missing keys when formatting with pattern to avoid KeyError.

Formatting rows/data with .format(**…) will throw if a key is missing. Safer to use a dict that returns empty strings for missing keys.

-                formatted_text = self.pattern.format(**row.to_dict())
+                formatted_text = self.pattern.format_map(defaultdict(str, row.to_dict()))
...
-            formatted_text = self.pattern.format(**data.data)
+            formatted_text = self.pattern.format_map(defaultdict(str, data.data or {}))

Add from collections import defaultdict to the component's imports; format_map then substitutes an empty string for any key the pattern references but the data lacks.
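A minimal standalone sketch of the missing-key-safe formatting (the pattern and row values are made up for illustration):

```python
from collections import defaultdict

# Hypothetical pattern referencing a key ("revenue") absent from the row.
pattern = "Company: {company} | Revenue: {revenue}"
row = {"company": "Acme"}

# pattern.format(**row) would raise KeyError: 'revenue'.
# format_map with a defaultdict(str) substitutes "" for missing keys instead.
safe = pattern.format_map(defaultdict(str, row))
print(safe)  # Company: Acme | Revenue:
```

Unlike a plain dict comprehension over the existing keys, the defaultdict covers keys that appear only in the pattern, which is exactly where the KeyError originates.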

src/lfx/tests/data/starter_projects_1_6_0/Search agent.json (2)

503-513: Chat Output field_order lists ‘clean_data’ but no input is defined.

UI lists “clean_data” but the template doesn’t define it here (unlike other components). convert_to_string already guards with getattr(..., False). Either add the input or remove it from the order to avoid UI confusion.

-              "text_color",
-              "clean_data"
+              "text_color"

Or add a BoolInput named clean_data if you intend to expose it.

Also applies to: 556-556


1129-1169: Avoid listing non-existent/future model names by default.

Options include speculative entries like gpt-5* and o4-*. They can confuse users and may break dynamic fetch flows.

Trim to known-good defaults and let dynamic refresh populate the rest.

src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (1)

1100-1131: Validate the static model_name list and default.

The list includes speculative entries (e.g., gpt-5*, o4-*). Consider deriving options dynamically from the selected provider or confirm they exist in 1.6.0 to avoid user confusion.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0ac23a and 098da15.

📒 Files selected for processing (13)
  • src/lfx/src/lfx/logging/logger.py (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Blog Writer.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Financial Report Parser.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Memory Chatbot.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Research Translation Loop.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/SEO Keyword Generator.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Search agent.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-18T18:25:54.486Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-07-18T18:25:54.486Z
Learning: Starter project files auto-format after langflow run; these formatting changes can be committed or ignored

Applied to files:

  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json
  • src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Update Starter Projects
🔇 Additional comments (12)
src/lfx/src/lfx/logging/logger.py (1)

17-24: __all__ matches the re-exported API. LGTM.

src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json (2)

570-589: ChatOutput type validation looks good and robust.


835-852: Verify API key resolution pipeline
The api_key input defaults to "OPENAI_API_KEY" with load_from_db=True and is resolved at runtime via update_params_with_load_from_db_fields, which calls VariableService.get_variable("OPENAI_API_KEY", …). Ensure your settings enable store_environment_variables=True and include "OPENAI_API_KEY" in variables_to_get_from_environment so that initialize_user_variables imports it into the database before build_model runs.

src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1)

1401-1672: Prompt template variables and wiring look consistent. LGTM.

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (2)

1-86: Starter JSON content looks consistent; formatting diffs are acceptable.

Based on learnings


462-478: Confirm OpenAIEmbeddings parameter name: Versions of LangChain expect openai_api_key instead of api_key; verify against your installed version to avoid runtime errors.

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (1)

409-426: Validate RecursiveUrlLoader kwargs
_create_loader passes base_url, autoset_encoding, exclude_dirs, link_regex to RecursiveUrlLoader. Manually verify these parameters against your installed langchain_community.document_loaders.RecursiveUrlLoader version and drop or guard any unsupported kwargs to prevent runtime errors.

src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (2)

1397-1418: Add a timeout and handle hangs in subprocess.run.

Docling conversion is executed via subprocess.run without a timeout. If the child process hangs, the component will stall indefinitely.

Add a timeout and handle TimeoutExpired to surface a clean error:

-        proc = subprocess.run(  # noqa: S603
+        try:
+            proc = subprocess.run(  # noqa: S603
             [sys.executable, "-u", "-c", child_script],
             input=json.dumps(args).encode("utf-8"),
             capture_output=True,
-            check=False,
-        )
+            check=False,
+            timeout=300,
+        )
+        except subprocess.TimeoutExpired as e:
+            return Data(data={"error": f"Docling subprocess timeout: {e}", "file_path": file_path})

Likely an incorrect or invalid review comment.
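A standalone sketch of the timeout pattern (the sleeping child script and the 1-second limit are illustrative, not values from the PR):

```python
import subprocess
import sys

# A child that hangs; the timeout turns an indefinite stall into a clean error.
child_script = "import time; time.sleep(60)"
try:
    proc = subprocess.run(
        [sys.executable, "-u", "-c", child_script],
        input=b"{}",
        capture_output=True,
        check=False,
        timeout=1,
    )
    result = {"returncode": proc.returncode}
except subprocess.TimeoutExpired as exc:
    # subprocess.run kills the child on timeout, then raises TimeoutExpired,
    # so we can surface a structured error instead of blocking forever.
    result = {"error": f"subprocess timed out after {exc.timeout}s"}

print(result)  # {'error': 'subprocess timed out after 1s'}
```

Choosing the timeout is a policy decision; for document conversion a generous limit (minutes, as in the suggested diff) is reasonable, while the sketch uses one second only to terminate quickly.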


1200-1236: Fix NameError: VlmPipelineOptions is referenced but never imported/defined in child script.

Inside the Docling subprocess child script, VlmPipelineOptions is used but not imported or defined. This will raise a NameError whenever pipeline == "vlm".

Apply this minimal fix (the variable isn’t used; safest is to remove the line):

-                        vl_pipe = VlmPipelineOptions()

Alternatively, if you intended to use options, import the correct symbol and pass it where needed:

-                        from docling.pipeline.vlm_pipeline import VlmPipeline
+                        from docling.pipeline.vlm_pipeline import VlmPipeline
+                        from docling.pipeline.vlm_pipeline import VlmPipelineOptions  # if truly needed

Likely an incorrect or invalid review comment.

src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (3)

669-693: Calculator component looks solid.

Safe AST evaluation restricted to arithmetic operators, clear error handling, and Data-wrapped result. Good for tool mode.


929-945: Agent error handling and logging are reasonable.

Granular exception handling with async logging and re-raise preserves traceability without swallowing errors.


395-410: Ensure test/runtime deps include orjson and fastapi.

ChatOutput code imports orjson and fastapi.encoders. Confirm these are present in the test environment to prevent import errors when loading the component.

Comment on lines +1408 to +1413
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "Memory Chatbot",

⚠️ Potential issue | 🟠 Major

Correct the recorded last_tested_version.

Although this template is part of the 1.6.0 suite, the metadata still states 1.4.3. That undermines the new verification coverage and any automation that keys off the tested version. Please align it with 1.6.0.

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "Memory Chatbot",
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.6.0",
"name": "Memory Chatbot",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/Memory Chatbot.json around lines
1408 to 1413 the metadata field "last_tested_version" is still set to "1.4.3";
update that value to "1.6.0" so the template correctly reflects the 1.6.0 test
suite; ensure the value is quoted exactly as "1.6.0" and preserve JSON
formatting and trailing commas as appropriate.

"tool_mode": false,
"trace_as_input": true,
"type": "prompt",
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: ${monthly_infrastructure_costs}\nCustomer support: ${customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"

⚠️ Potential issue | 🟠 Major

Fix mixed placeholder syntax in Prompt template.

The template mixes ${var} and {var}. Message.from_template won’t interpolate ${...}. Use {var} consistently to avoid unresolved placeholders.

Apply these replacements:

-...Monthly infrastructure costs: ${monthly_infrastructure_costs}
+...Monthly infrastructure costs: {monthly_infrastructure_costs}
- Customer support: ${customer_support_cost}
+ Customer support: {customer_support_cost}
📝 Committable suggestion


Suggested change
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: ${monthly_infrastructure_costs}\nCustomer support: ${customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: {monthly_infrastructure_costs}\nCustomer support: {customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around line 291,
the prompt mixes ${var} and {var} placeholders so Message.from_template will not
interpolate the ${...} form; change all ${monthly_infrastructure_costs} and
${customer_support_cost} to {monthly_infrastructure_costs} and
{customer_support_cost} respectively, and ensure {continuous_development_cost},
{desired_profit_margin}, and {estimated_subscribers} are used consistently
throughout the template so all placeholders are {var} style.
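The underlying confusion is that {var} belongs to str.format-style templating while ${var} is string.Template syntax; a formatter built on one will not resolve the other's placeholders. A quick sketch (variable names are made up):

```python
from string import Template

text = "Infra: ${infra} | Support: {support}"

# str.format resolves {var}; the "$" is just a literal character,
# so "${infra}" renders as "$" followed by the substituted value.
formatted = text.format(infra=100, support=50)
print(formatted)   # Infra: $100 | Support: 50

# string.Template resolves ${var} and leaves {var} untouched.
templated = Template(text).safe_substitute(infra=100)
print(templated)   # Infra: 100 | Support: {support}
```

Whichever engine the prompt component uses, mixing the two styles guarantees that some placeholders survive unresolved or pick up stray "$" characters, hence the recommendation to standardize on {var}.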

"show": true,
"title_case": false,
"type": "code",
"value": "from collections.abc import Generator\nfrom typing import Any\n\nimport orjson\nfrom fastapi.encoders import jsonable_encoder\n\nfrom langflow.base.io.chat import ChatComponent\nfrom langflow.helpers.data import safe_convert\nfrom langflow.inputs.inputs import BoolInput, DropdownInput, HandleInput, MessageTextInput\nfrom langflow.schema.data import Data\nfrom langflow.schema.dataframe import DataFrame\nfrom langflow.schema.message import Message\nfrom langflow.schema.properties import Source\nfrom langflow.template.field.base import Output\nfrom langflow.utils.constants import (\n MESSAGE_SENDER_AI,\n MESSAGE_SENDER_NAME_AI,\n MESSAGE_SENDER_USER,\n)\n\n\nclass ChatOutput(ChatComponent):\n display_name = \"Chat Output\"\n description = \"Display a chat message in the Playground.\"\n documentation: str = \"https://docs.langflow.org/components-io#chat-output\"\n icon = \"MessagesSquare\"\n name = \"ChatOutput\"\n minimized = True\n\n inputs = [\n HandleInput(\n name=\"input_value\",\n display_name=\"Inputs\",\n info=\"Message to be passed as output.\",\n input_types=[\"Data\", \"DataFrame\", \"Message\"],\n required=True,\n ),\n BoolInput(\n name=\"should_store_message\",\n display_name=\"Store Messages\",\n info=\"Store the message in the history.\",\n value=True,\n advanced=True,\n ),\n DropdownInput(\n name=\"sender\",\n display_name=\"Sender Type\",\n options=[MESSAGE_SENDER_AI, MESSAGE_SENDER_USER],\n value=MESSAGE_SENDER_AI,\n advanced=True,\n info=\"Type of sender.\",\n ),\n MessageTextInput(\n name=\"sender_name\",\n display_name=\"Sender Name\",\n info=\"Name of the sender.\",\n value=MESSAGE_SENDER_NAME_AI,\n advanced=True,\n ),\n MessageTextInput(\n name=\"session_id\",\n display_name=\"Session ID\",\n info=\"The session ID of the chat. 
If empty, the current session ID parameter will be used.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"data_template\",\n display_name=\"Data Template\",\n value=\"{text}\",\n advanced=True,\n info=\"Template to convert Data to Text. If left empty, it will be dynamically set to the Data's text key.\",\n ),\n ]\n outputs = [\n Output(\n display_name=\"Output Message\",\n name=\"message\",\n method=\"message_response\",\n ),\n ]\n\n def _build_source(self, id_: str | None, display_name: str | None, source: str | None) -> Source:\n source_dict = {}\n if id_:\n source_dict[\"id\"] = id_\n if display_name:\n source_dict[\"display_name\"] = display_name\n if source:\n # Handle case where source is a ChatOpenAI object\n if hasattr(source, \"model_name\"):\n source_dict[\"source\"] = source.model_name\n elif hasattr(source, \"model\"):\n source_dict[\"source\"] = str(source.model)\n else:\n source_dict[\"source\"] = str(source)\n return Source(**source_dict)\n\n async def message_response(self) -> Message:\n # First convert the input to string if needed\n text = self.convert_to_string()\n\n # Get source properties\n source, icon, display_name, source_id = self.get_properties_from_source_component()\n\n # Create or use existing Message object\n if isinstance(self.input_value, Message):\n message = self.input_value\n # Update message properties\n message.text = text\n else:\n message = Message(text=text)\n\n # Set message properties\n message.sender = self.sender\n message.sender_name = self.sender_name\n message.session_id = self.session_id\n message.flow_id = self.graph.flow_id if hasattr(self, \"graph\") else None\n message.properties.source = self._build_source(source_id, display_name, source)\n\n # Store message if needed\n if self.session_id and self.should_store_message:\n stored_message = await self.send_message(message)\n self.message.value = stored_message\n message = stored_message\n\n self.status = message\n return message\n\n def _serialize_data(self, 
data: Data) -> str:\n \"\"\"Serialize Data object to JSON string.\"\"\"\n # Convert data.data to JSON-serializable format\n serializable_data = jsonable_encoder(data.data)\n # Serialize with orjson, enabling pretty printing with indentation\n json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)\n # Convert bytes to string and wrap in Markdown code blocks\n return \"```json\\n\" + json_bytes.decode(\"utf-8\") + \"\\n```\"\n\n def _validate_input(self) -> None:\n \"\"\"Validate the input data and raise ValueError if invalid.\"\"\"\n if self.input_value is None:\n msg = \"Input data cannot be None\"\n raise ValueError(msg)\n if isinstance(self.input_value, list) and not all(\n isinstance(item, Message | Data | DataFrame | str) for item in self.input_value\n ):\n invalid_types = [\n type(item).__name__\n for item in self.input_value\n if not isinstance(item, Message | Data | DataFrame | str)\n ]\n msg = f\"Expected Data or DataFrame or Message or str, got {invalid_types}\"\n raise TypeError(msg)\n if not isinstance(\n self.input_value,\n Message | Data | DataFrame | str | list | Generator | type(None),\n ):\n type_name = type(self.input_value).__name__\n msg = f\"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}\"\n raise TypeError(msg)\n\n def convert_to_string(self) -> str | Generator[Any, None, None]:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n self._validate_input()\n if isinstance(self.input_value, list):\n clean_data: bool = getattr(self, \"clean_data\", False)\n return \"\\n\".join([safe_convert(item, clean_data=clean_data) for item in self.input_value])\n if isinstance(self.input_value, Generator):\n return self.input_value\n return safe_convert(self.input_value)\n"

⚠️ Potential issue | 🔴 Critical

Replace PEP 604 unions in isinstance with tuples (incompatible before Python 3.10).

isinstance(..., A | B) raises TypeError prior to Python 3.10, where PEP 604 union support in isinstance landed. Use a tuple to ensure runtime compatibility across all supported versions.

Apply this diff inside the ChatOutput code block:

-        if isinstance(self.input_value, list) and not all(
-            isinstance(item, Message | Data | DataFrame | str) for item in self.input_value
+        if isinstance(self.input_value, list) and not all(
+            isinstance(item, (Message, Data, DataFrame, str)) for item in self.input_value
         ):
...
-        if not isinstance(
-            self.input_value,
-            Message | Data | DataFrame | str | list | Generator | type(None),
-        ):
-            type_name = type(self.input_value).__name__
-            msg = f"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}"
+        if not isinstance(self.input_value, (Message, Data, DataFrame, str, list, Generator)):
+            type_name = type(self.input_value).__name__
+            msg = f"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}"
             raise TypeError(msg)
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around line 410,
replace PEP 604 union types used inside isinstance calls with tuples for
compatibility with Python versions before 3.10: change isinstance(item, Message | Data | DataFrame |
str) to isinstance(item, (Message, Data, DataFrame, str)) and change
isinstance(self.input_value, Message | Data | DataFrame | str | list | Generator
| type(None)) to isinstance(self.input_value, (Message, Data, DataFrame, str,
list, Generator, type(None))). Ensure both isinstance usages are updated to use
tuple syntax without modifying other logic.
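As an illustration of the tuple form, here is a minimal, self-contained sketch using stand-in classes (not the real lfx Message/Data/DataFrame types):

```python
# Stand-in classes for illustration only; the flow's real types live in lfx.
class Message: ...
class Data: ...

def all_supported(items: list) -> bool:
    """Return True when every item is one of the accepted input types."""
    # The tuple form works on every supported Python version, whereas
    # isinstance(item, Message | Data | str) needs a PEP 604 union object,
    # which isinstance only accepts on Python 3.10+.
    return all(isinstance(item, (Message, Data, str)) for item in items)

print(all_supported([Message(), Data(), "hello"]))  # True
print(all_supported([Message(), 42]))               # False
```

The same substitution applies to the longer type check in `_validate_input`.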

Comment on lines +1410 to +1411
"last_tested_version": "1.4.3",
"name": "SaaS Pricing",
⚠️ Potential issue | 🟡 Minor

Update last_tested_version to match the 1.6.0 suite.

File is under starter_projects_1_6_0 but shows "1.4.3".

Apply this diff:

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around lines
1410-1411, the "last_tested_version" field incorrectly shows "1.4.3" while the
file is in the starter_projects_1_6_0 suite; update the value to "1.6.0" so it
matches the suite naming and commit the change.

Comment on lines +108 to +126
"outputs": [
{
"allows_loop": false,
"cache": true,
"display_name": "Toolset",
"group_outputs": false,
"hidden": null,
"method": "to_toolkit",
"name": "component_as_tool",
"options": null,
"required_inputs": null,
"selected": "Tool",
"tool_mode": true,
"types": [
"Tool"
],
"value": "__UNDEFINED__"
}
],
⚠️ Potential issue | 🔴 Critical

Make ScrapeGraphSearchApi expose a Tool output and fix return type.

Graph wiring expects component_as_tool (Tool), but the class only exposes a Data output via search(). In addition, search() is annotated to return list[Data] yet actually returns a single Data.

Expose both outputs and wrap the return to match type hints:

 class ScrapeGraphSearchApi(Component):
@@
-    outputs = [
-        Output(display_name="Data", name="data", method="search"),
-    ]
+    outputs = [
+        Output(display_name="Toolset", name="component_as_tool", method="to_toolkit"),
+        Output(display_name="Data", name="data", method="search"),
+    ]
@@
-    def search(self) -> list[Data]:
+    def search(self) -> list[Data]:
@@
-            return Data(data=response)
+            return [Data(data=response)]

This aligns runtime behavior with edges that connect to Agent.tools (Tool).

Also applies to: 147-166, 163-227

🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/Search agent.json around lines
108-126 (and similarly update ranges 147-166, 163-227), the graph expects a Tool
output named component_as_tool but the implementation currently only exposes a
Data output via search() and the search return type is incorrect (declares
list[Data] but returns a single Data). Modify the ScrapeGraphSearchApi class to
expose both outputs: keep the Data output and add a Tool output with name
component_as_tool (method to_toolkit/tool mapping), and update the search method
signature to return a list of Data or wrap its single Data return value into a
list to match the declared type; ensure the runtime output selection uses the
Tool output when selected by graph wiring.
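A small sketch of the annotation fix, with a dataclass standing in for lfx's Data (the names here are illustrative, not the real API):

```python
from dataclasses import dataclass


@dataclass
class Data:
    """Illustrative stand-in for lfx's Data container."""
    data: dict


def search(response: dict) -> list[Data]:
    """Return search results wrapped to match the declared list[Data]."""
    # Wrapping the single Data keeps the runtime value consistent with the
    # annotation, so callers that iterate over the result do not break.
    return [Data(data=response)]


results = search({"query": "python", "hits": 3})
print(len(results), results[0].data["hits"])  # 1 3
```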

Comment on lines +1081 to +1086
"description": "Generates targeted SEO keywords based on product information, pain points, and customer profiles for strategic marketing.",
"endpoint_name": null,
"id": "77064cce-d437-42be-a7bb-f0dedf93a75a",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "SEO Keyword Generator",
⚠️ Potential issue | 🟠 Major

Bring last_tested_version up to 1.6.0.

The metadata still claims 1.4.3 even though this JSON lives in the 1.6.0 starter catalog. Update the field so tests and tooling accurately reflect the version we are validating.

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SEO Keyword Generator.json around
lines 1081 to 1086, the last_tested_version is incorrectly set to "1.4.3";
update that JSON field to "1.6.0" (preserve JSON formatting and quotes) so the
metadata matches the 1.6.0 starter catalog used by tests and tooling.
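One way to keep these fields honest is a suite-wide check. The sketch below assumes a directory of starter-project JSON files; it is illustrative, not part of the actual test suite:

```python
import json
import tempfile
from pathlib import Path


def find_version_mismatches(suite_dir: Path, expected: str) -> list[str]:
    """Return filenames whose last_tested_version differs from expected."""
    return [
        path.name
        for path in sorted(suite_dir.glob("*.json"))
        if json.loads(path.read_text(encoding="utf-8")).get("last_tested_version") != expected
    ]


# Demo against a throwaway suite directory.
with tempfile.TemporaryDirectory() as tmp:
    suite = Path(tmp)
    (suite / "SaaS Pricing.json").write_text(json.dumps({"last_tested_version": "1.4.3"}))
    (suite / "Blog Writer.json").write_text(json.dumps({"last_tested_version": "1.6.0"}))
    mismatches = find_version_mismatches(suite, "1.6.0")
    print(mismatches)  # ['SaaS Pricing.json']
```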

@carlosrcoelho carlosrcoelho requested a review from erichare October 3, 2025 19:38
@ogabrielluiz ogabrielluiz requested a review from mpawlow October 8, 2025 16:31
@mpawlow mpawlow left a comment

@ogabrielluiz

Code Review 1

  • See PR comments: (a) to (h)
  • My comments are mainly questions, suggestions and areas for potential improvement
  • Ready to approve based on your feedback.

fixed in later versions. These are marked as expected failures.
"""
# Known failing starter projects due to component-level import bugs in 1.6.0
known_failing_projects = {
Contributor

(c) [Minor] Suggest just removing News Aggregator from source control

  • e.g. Rely only on the source control files rather than git retrieval of the projects
  • Also, would be great to remove the known failure (pytest.xfail)

Contributor Author

I don't want to remove it yet because it shouldn't fail. We need to build a way for this to pass. I think we'll probably have to add a stub API module in lfx. I'm not sure. wdyt?
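For reference, a stub/shim module of the kind mentioned here usually just re-exports the real implementation under the old import path. A self-contained sketch, using made-up pkg.* module names and stdlib logging rather than the actual lfx layout:

```python
import importlib
import logging
import sys
import types

# The "new" module that owns the real implementation.
new_mod = types.ModuleType("pkg.log.logger")
new_mod.logger = logging.getLogger("pkg")
sys.modules["pkg.log.logger"] = new_mod

# The shim preserves the old import path by re-exporting the same objects.
shim = types.ModuleType("pkg.logging.logger")
shim.logger = new_mod.logger
shim.__all__ = ["logger"]
sys.modules["pkg.logging.logger"] = shim

# Old-style imports keep resolving to the same logger object.
old_style = importlib.import_module("pkg.logging.logger").logger
print(old_style is new_mod.logger)  # True
```

Registering the dotted name in sys.modules is enough for the import system to resolve it, which is what makes such shims cheap to add and easy to deprecate later.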

Contributor

  • Question: Is it possible to detect the starter project version (for News Aggregator) and automatically upgrade it to the fixed version in the customer's flow on load?
    • OR indicate to the customer, that the current starter project is deprecated (disabled from running) and needs to be manually upgraded to the fixed version?
    • This sort of solution / strategy is usually applicable to artifacts that are versioned plug-ins or extensions
    • Please completely ignore if this concept is not applicable, feasible or practical to starter projects (I'd have to research further to understand whether what I'm suggesting is nonsense in this context)
  • Otherwise, it sounds like using temporary stubs (that can be deprecated and eventually removed in later releases) is the only way to go (and likely the easiest tactical solution)
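A load-time version gate like the one floated above could be as simple as comparing last_tested_version against the current release. Purely a sketch of the idea, not existing lfx behavior:

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple that compares numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(project_meta: dict, current: str = "1.6.0") -> bool:
    """Flag a starter project tested against an older release so the UI
    could warn the user or trigger an automatic upgrade on load."""
    tested = project_meta.get("last_tested_version", "0.0.0")
    return parse_version(tested) < parse_version(current)


print(needs_upgrade({"last_tested_version": "1.4.3"}))  # True
print(needs_upgrade({"last_tested_version": "1.6.0"}))  # False
```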

…nused async fetching

- Refactored the test suite to streamline the retrieval of starter project paths, eliminating the need for asynchronous fetching from GitHub.
- Updated the `get_starter_projects_path` function to directly return the local path for cached starter projects.
- Removed redundant code related to GitHub API calls and caching logic, enhancing test clarity and performance.
- Adjusted test cases to ensure they continue to validate the existence and format of starter project files effectively.
- Introduced two new tests to validate the application of the --format option to error messages in the CLI.
- The first test checks that error messages are returned as plain text when the --format option is set to "text".
- The second test ensures that even with the --verbose flag, the output remains valid JSON when the --format option is set to "json".
- Both tests are marked as expected failures to document current limitations and guide future fixes.
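The behavior those two xfail tests pin down can be sketched as a tiny formatter (a hypothetical helper, not the real CLI code):

```python
import json


def format_error(message: str, fmt: str, verbose: bool = False) -> str:
    """Render a CLI error under --format: plain text or a JSON document."""
    if fmt == "json":
        payload = {"error": message}
        if verbose:
            payload["detail"] = "stack trace and context would go here"
        # Even with --verbose, the whole output must stay parseable JSON.
        return json.dumps(payload)
    return f"Error: {message}"


print(format_error("flow not found", "text"))  # Error: flow not found
print(json.loads(format_error("flow not found", "json", verbose=True))["error"])  # flow not found
```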
sonarqubecloud bot commented Oct 9, 2025

@mpawlow mpawlow left a comment

@ogabrielluiz

Code Review 2


@github-actions github-actions bot added the lgtm This PR has been approved by a maintainer label Oct 9, 2025
@ogabrielluiz ogabrielluiz added this pull request to the merge queue Oct 10, 2025
Merged via the queue into main with commit a3f7f91 Oct 10, 2025
21 checks passed
@ogabrielluiz ogabrielluiz deleted the add-1.6-lfx-tests branch October 10, 2025 13:23