Conversation

@ogabrielluiz

@ogabrielluiz ogabrielluiz commented Sep 26, 2025

Summary by CodeRabbit

  • New Features
    • Added starter projects: Basic Prompting, Blog Writer, Document Q&A, Financial Report Parser, Knowledge Ingestion, Knowledge Retrieval, Memory Chatbot, Research Translation Loop, SEO Keyword Generator, SaaS Pricing, Search Agent, and Twitter Thread Generator. These ready-to-run flows cover chat, retrieval, parsing, agents, pricing, SEO, and social content generation.
  • Refactor
    • Improved logging import compatibility so existing imports continue to work without changes.

- Introduced multiple starter project JSON files, including "Basic Prompt Chaining", "Blog Writer", "Document Q&A", and others, to provide diverse templates for chatbot functionalities.
- Each project includes detailed descriptions and structured data to facilitate easy integration and usage.
- Enhanced the overall framework for building chatbots, content generation, and data processing tasks, improving user experience and flexibility in application development.
…ion 1.6.0

- Introduced a new test suite to validate that starter project templates from version 1.6.0 can be loaded without import errors.
- Implemented asynchronous fetching of starter project JSON files from GitHub and caching them locally for testing.
- Added multiple test cases to ensure the existence, validity, and proper execution of starter projects, while checking for import errors related to langflow and lfx modules.
- Enhanced overall test coverage for backward compatibility, ensuring existing projects remain functional with the latest codebase.
- Introduced a new module to maintain compatibility for imports from lfx.logging.logger, redirecting functionality to lfx.log.logger.
- Ensured all original exports are preserved to facilitate a smooth transition for existing codebases relying on the previous import path.
…n backward compatibility tests

- Updated the test suite to utilize httpx for asynchronous HTTP requests instead of aiohttp, enhancing performance and compatibility.
- Improved error handling by checking response status codes and ensuring proper JSON parsing.
- Added handling for known failing starter projects due to import issues in version 1.6.0, marking them as expected failures in tests.
- Enhanced documentation within tests to clarify known issues with specific starter projects.
@coderabbitai

coderabbitai bot commented Sep 26, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a backward-compatibility module lfx.logging.logger that re-exports symbols from lfx.log.logger. Introduces multiple new starter project JSON templates under tests/data/starter_projects_1_6_0 for various workflows (e.g., Document Q&A, Memory Chatbot, Research Translation Loop), containing component configurations and embedded template code.

Changes

  • Logging back-compat re-export (src/lfx/src/lfx/logging/logger.py): New shim module re-exporting InterceptHandler, LogConfig, configure, logger, setup_gunicorn_logger, setup_uvicorn_logger from lfx.log.logger; defines __all__ accordingly.
  • Starter project templates, additions (src/lfx/tests/data/starter_projects_1_6_0/*): Added multiple JSON starter flows: Basic Prompting, Blog Writer, Document Q&A, Financial Report Parser, Knowledge Ingestion, Knowledge Retrieval, Memory Chatbot, Research Translation Loop, SEO Keyword Generator, SaaS Pricing, Search agent, Twitter Thread Generator. Each includes nodes, edges, UI metadata, and embedded component templates/config.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

enhancement, size:XXL, lgtm

Suggested reviewers

  • jordanrfrazier
  • pedrocassalpacheco
  • Cristhianzl

Pre-merge checks and finishing touches

❌ Failed checks (1 error, 1 warning, 1 inconclusive)
  • Test Coverage For New Implementations (❌ Error): The PR introduces a new compatibility module lfx.logging.logger without adding any tests to validate its behavior, so no regression or unit test covers that code; the remaining changes are JSON fixtures under tests/data, which provide resources but do not themselves exercise new or existing logic. Because the newly added Python module is untested, the PR fails the requirement that new functionality be accompanied by tests. Resolution: add tests that import lfx.logging.logger and confirm it re-exports the expected symbols, or otherwise supply coverage for the new functionality, before merging.
  • Test Quality And Coverage (⚠️ Warning): The PR only adds fixture-like JSON definitions under src/lfx/tests/data/starter_projects_1_6_0/ plus a compatibility re-export module; it does not introduce new test cases or adjust existing suites to exercise these additions. There is therefore no validation of the new starter project behaviors, no assertions around loading or executing them, and no verification of the logging re-export. Resolution: add targeted tests that load and exercise each new starter project configuration (including error paths where applicable) and cover the new logging re-export before re-running this check.
  • Test File Naming And Structure (❓ Inconclusive): No new or modified executable test files were introduced in this pull request, only JSON fixture data under src/lfx/tests/data/... and a Python re-export module, so there are no backend, frontend, or integration test files whose naming conventions, structure, or coverage can be evaluated. Resolution: point to the actual test files affected by this change so they can be assessed; if no test code is being added, the check may not apply.
✅ Passed checks (4 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly indicates that the pull request adds tests for the 1.6.0 starter projects in the lfx module, which aligns with the bulk of the changes. It is concise, specific, and free of vague terms.
  • Docstring Coverage (✅ Passed): No functions found in the changes; docstring coverage check skipped.
  • Excessive Mock Usage Warning (✅ Passed): No test files in this pull request introduce or rely on mocks; the changes add data fixtures and a compatibility re-export module without any mocking logic.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🧹 Nitpick comments (13)
src/lfx/src/lfx/logging/logger.py (1)

7-15: Good compatibility shim; consider emitting a deprecation warning.

Re-export looks correct. To guide migrations to lfx.log.logger, optionally emit a DeprecationWarning on import.

 """Backwards compatibility module for lfx.logging.logger.
 
 This module provides backwards compatibility for code that imports from lfx.logging.logger.
 All functionality has been moved to lfx.log.logger.
 """
 
+import warnings
+warnings.warn(
+    "lfx.logging.logger is deprecated; use lfx.log.logger instead.",
+    DeprecationWarning,
+    stacklevel=2,
+)
 
 # Ensure we maintain all the original exports
 from lfx.log.logger import (
     InterceptHandler,
     LogConfig,
     configure,
     logger,
     setup_gunicorn_logger,
     setup_uvicorn_logger,
 )
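As a rough, self-contained sketch of how such a deprecation shim behaves (the pkg.* module names below are stand-ins registered in sys.modules, not the real lfx package):

```python
import sys
import types
import warnings

# Stand-in modules (pkg.* names are illustrative, not the real lfx package).
sys.modules["pkg"] = types.ModuleType("pkg")
sys.modules["pkg.log"] = types.ModuleType("pkg.log")
canonical = types.ModuleType("pkg.log.logger")
canonical.logger = object()  # the canonical export
sys.modules["pkg.log.logger"] = canonical

# The shim module body: warn once on import, then re-export.
shim_body = (
    "import warnings\n"
    "warnings.warn(\n"
    "    'pkg.logging.logger is deprecated; use pkg.log.logger instead.',\n"
    "    DeprecationWarning,\n"
    "    stacklevel=2,\n"
    ")\n"
    "from pkg.log.logger import logger\n"
    "__all__ = ['logger']\n"
)
shim = types.ModuleType("pkg.logging.logger")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    exec(shim_body, shim.__dict__)  # simulate importing the shim module

# Old import path yields the very same object, plus a DeprecationWarning.
assert shim.logger is canonical.logger
assert any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Because the shim re-exports the identical objects, existing call sites keep working while the warning nudges them toward the new path.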
src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (2)

610-650: Headers default diverges from settings; ensure consistency or clarify.

Template sets User-Agent to a literal "langflow" while code defaults to get_settings_service().settings.user_agent. Consider aligning to settings for consistency.


147-164: SplitText defaults in code vs JSON template mismatch.

Code uses chunk_size=1000, chunk_overlap=200 while the template sets 100 and 0. Verify intended defaults or update to avoid confusion in tests.

Based on learnings

src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1)

1743-1779: Minor: prefer explicit asyncio.TimeoutError and verify provider options.

Same Language Model component as other flows. Ensure API key resolution works; consider aligning exception type if adding timeouts later.

Based on learnings

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (2)

479-496: Avoid using private Chroma client internals for embeddings retrieval.

collection = chroma._client.get_collection(...) relies on a private attribute; this is brittle across versions. Prefer public APIs if available, or gate by hasattr and fall back safely.

-                collection = chroma._client.get_collection(name=self.knowledge_base)
-                embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])
+                collection = getattr(chroma, "_collection", None)
+                if collection is None and hasattr(chroma, "_client"):
+                    collection = chroma._client.get_collection(name=self.knowledge_base)  # last-resort fallback
+                if collection is not None:
+                    embeddings_result = collection.get(where={"_id": {"$in": doc_ids}}, include=["metadatas", "embeddings"])
+                else:
+                    embeddings_result = {"metadatas": [], "embeddings": []}

Also consider exposing embeddings only when explicitly requested to minimize payload size.


193-213: ChatOutput includes 'clean_data' in field_order but not in template.

Code guards with getattr, so it won’t crash, but UI won’t expose it. Either add the input or remove from field_order.

src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (3)

137-145: Align field name in Chat Input: field_order uses store_message but input is should_store_message.

This mismatch can confuse the UI ordering and bindings.

-              "store_message",
+              "should_store_message",

Verify no other references rely on the old name.

Also applies to: 307-321


246-248: Fix prompt typo.

Minor copy tweak for clarity.

-                "value": "What is this document is about?"
+                "value": "What is this document about?"

1187-1195: Reduce sensitive logging of args.

Logging the full args dict includes file_path and processing settings. Consider lowering verbosity or omitting the path to avoid leaking server paths in logs.

-        self.log(args)
+        self.log({k: v for k, v in args.items() if k != "file_path"})
src/lfx/tests/data/starter_projects_1_6_0/Financial Report Parser.json (1)

742-776: Handle missing keys when formatting with pattern to avoid KeyError.

Formatting rows/data with .format(**…) will throw if a key is missing. Safer to use a dict that returns empty strings for missing keys.

-                formatted_text = self.pattern.format(**row.to_dict())
+                formatted_text = self.pattern.format_map(defaultdict(str, row.to_dict()))
...
-            formatted_text = self.pattern.format(**data.data)
+            formatted_text = self.pattern.format_map(defaultdict(str, data.data or {}))

Add from collections import defaultdict to the component's imports; format_map then substitutes an empty string for any key the pattern references but the data lacks.
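A minimal standalone sketch of the missing-key-safe formatting (the pattern and row values are made up for illustration):

```python
from collections import defaultdict

# Hypothetical pattern referencing a key ("revenue") absent from the row.
pattern = "Company: {company} | Revenue: {revenue}"
row = {"company": "Acme"}

# pattern.format(**row) would raise KeyError: 'revenue'.
# format_map with a defaultdict(str) substitutes "" for missing keys instead.
safe = pattern.format_map(defaultdict(str, row))
print(safe)  # Company: Acme | Revenue:
```

Unlike a plain dict comprehension over the existing keys, the defaultdict covers keys that appear only in the pattern, which is exactly where the KeyError originates.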

src/lfx/tests/data/starter_projects_1_6_0/Search agent.json (2)

503-513: Chat Output field_order lists ‘clean_data’ but no input is defined.

UI lists “clean_data” but the template doesn’t define it here (unlike other components). convert_to_string already guards with getattr(..., False). Either add the input or remove it from the order to avoid UI confusion.

-              "text_color",
-              "clean_data"
+              "text_color"

Or add a BoolInput named clean_data if you intend to expose it.

Also applies to: 556-556


1129-1169: Avoid listing non-existent/future model names by default.

Options include speculative entries like gpt-5* and o4-*. They can confuse users and may break dynamic fetch flows.

Trim to known-good defaults and let dynamic refresh populate the rest.

src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (1)

1100-1131: Validate the static model_name list and default.

The list includes speculative entries (e.g., gpt-5*, o4-*). Consider deriving options dynamically from the selected provider or confirm they exist in 1.6.0 to avoid user confusion.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0ac23a and 098da15.

📒 Files selected for processing (13)
  • src/lfx/src/lfx/logging/logger.py (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Blog Writer.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Financial Report Parser.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Memory Chatbot.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Research Translation Loop.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/SEO Keyword Generator.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Search agent.json (1 hunks)
  • src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-18T18:25:54.486Z
Learnt from: CR
PR: langflow-ai/langflow#0
File: .cursor/rules/backend_development.mdc:0-0
Timestamp: 2025-07-18T18:25:54.486Z
Learning: Starter project files auto-format after langflow run; these formatting changes can be committed or ignored

Applied to files:

  • src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json
  • src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Update Starter Projects
🔇 Additional comments (12)
src/lfx/src/lfx/logging/logger.py (1)

17-24: __all__ matches the re-exported API. LGTM.

src/lfx/tests/data/starter_projects_1_6_0/Basic Prompting.json (2)

570-589: ChatOutput type validation looks good and robust.


835-852: Verify API key resolution pipeline
The api_key input defaults to "OPENAI_API_KEY" with load_from_db=True and is resolved at runtime via update_params_with_load_from_db_fields, which calls VariableService.get_variable("OPENAI_API_KEY", …). Ensure your settings enable store_environment_variables=True and include "OPENAI_API_KEY" in variables_to_get_from_environment so that initialize_user_variables imports it into the database before build_model runs.

src/lfx/tests/data/starter_projects_1_6_0/Twitter Thread Generator.json (1)

1401-1672: Prompt template variables and wiring look consistent. LGTM.

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Retrieval.json (2)

1-86: Starter JSON content looks consistent; formatting diffs are acceptable.

Based on learnings


462-478: Confirm OpenAIEmbeddings parameter name: Versions of LangChain expect openai_api_key instead of api_key; verify against your installed version to avoid runtime errors.

src/lfx/tests/data/starter_projects_1_6_0/Knowledge Ingestion.json (1)

409-426: Validate RecursiveUrlLoader kwargs
_create_loader passes base_url, autoset_encoding, exclude_dirs, link_regex to RecursiveUrlLoader. Manually verify these parameters against your installed langchain_community.document_loaders.RecursiveUrlLoader version and drop or guard any unsupported kwargs to prevent runtime errors.

src/lfx/tests/data/starter_projects_1_6_0/Document Q&A.json (2)

1397-1418: Add a timeout and handle hangs in subprocess.run.

Docling conversion is executed via subprocess.run without a timeout. If the child process hangs, the component will stall indefinitely.

Add a timeout and handle TimeoutExpired to surface a clean error:

-        proc = subprocess.run(  # noqa: S603
+        try:
+            proc = subprocess.run(  # noqa: S603
             [sys.executable, "-u", "-c", child_script],
             input=json.dumps(args).encode("utf-8"),
             capture_output=True,
-            check=False,
-        )
+            check=False,
+            timeout=300,
+        )
+        except subprocess.TimeoutExpired as e:
+            return Data(data={"error": f"Docling subprocess timeout: {e}", "file_path": file_path})

Likely an incorrect or invalid review comment.
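A standalone sketch of the timeout pattern (the sleeping child script and the 1-second limit are illustrative, not values from the PR):

```python
import subprocess
import sys

# A child that hangs; the timeout turns an indefinite stall into a clean error.
child_script = "import time; time.sleep(60)"
try:
    proc = subprocess.run(
        [sys.executable, "-u", "-c", child_script],
        input=b"{}",
        capture_output=True,
        check=False,
        timeout=1,
    )
    result = {"returncode": proc.returncode}
except subprocess.TimeoutExpired as exc:
    # subprocess.run kills the child on timeout, then raises TimeoutExpired,
    # so we can surface a structured error instead of blocking forever.
    result = {"error": f"subprocess timed out after {exc.timeout}s"}

print(result)  # {'error': 'subprocess timed out after 1s'}
```

Choosing the timeout is a policy decision; for document conversion a generous limit (minutes, as in the suggested diff) is reasonable, while the sketch uses one second only to terminate quickly.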


1200-1236: Fix NameError: VlmPipelineOptions is referenced but never imported/defined in child script.

Inside the Docling subprocess child script, VlmPipelineOptions is used but not imported or defined. This will raise a NameError whenever pipeline == "vlm".

Apply this minimal fix (the variable isn’t used; safest is to remove the line):

-                        vl_pipe = VlmPipelineOptions()

Alternatively, if you intended to use options, import the correct symbol and pass it where needed:

-                        from docling.pipeline.vlm_pipeline import VlmPipeline
+                        from docling.pipeline.vlm_pipeline import VlmPipeline
+                        from docling.pipeline.vlm_pipeline import VlmPipelineOptions  # if truly needed

Likely an incorrect or invalid review comment.

src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json (3)

669-693: Calculator component looks solid.

Safe AST evaluation restricted to arithmetic operators, clear error handling, and Data-wrapped result. Good for tool mode.


929-945: Agent error handling and logging are reasonable.

Granular exception handling with async logging and re-raise preserves traceability without swallowing errors.


395-410: Ensure test/runtime deps include orjson and fastapi.

ChatOutput code imports orjson and fastapi.encoders. Confirm these are present in the test environment to prevent import errors when loading the component.

Comment on lines +1408 to +1413
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "Memory Chatbot",

⚠️ Potential issue | 🟠 Major

Correct the recorded last_tested_version.

Although this template is part of the 1.6.0 suite, the metadata still states 1.4.3. That undermines the new verification coverage and any automation that keys off the tested version. Please align it with 1.6.0.

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "Memory Chatbot",
"description": "Create a chatbot that saves and references previous messages, enabling the model to maintain context throughout the conversation.",
"endpoint_name": null,
"id": "abdc8216-e400-48c4-8d11-0f21441b50ea",
"is_component": false,
"last_tested_version": "1.6.0",
"name": "Memory Chatbot",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/Memory Chatbot.json around lines
1408 to 1413 the metadata field "last_tested_version" is still set to "1.4.3";
update that value to "1.6.0" so the template correctly reflects the 1.6.0 test
suite; ensure the value is quoted exactly as "1.6.0" and preserve JSON
formatting and trailing commas as appropriate.

"tool_mode": false,
"trace_as_input": true,
"type": "prompt",
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: ${monthly_infrastructure_costs}\nCustomer support: ${customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"

⚠️ Potential issue | 🟠 Major

Fix mixed placeholder syntax in Prompt template.

The template mixes ${var} and {var}. Message.from_template won’t interpolate ${...}. Use {var} consistently to avoid unresolved placeholders.

Apply these replacements:

-...Monthly infrastructure costs: ${monthly_infrastructure_costs}
+...Monthly infrastructure costs: {monthly_infrastructure_costs}
- Customer support: ${customer_support_cost}
+ Customer support: {customer_support_cost}
📝 Committable suggestion


Suggested change
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: ${monthly_infrastructure_costs}\nCustomer support: ${customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"
"value": "To calculate the monthly subscription price of the software based on the following data:\n\nMonthly infrastructure costs: {monthly_infrastructure_costs}\nCustomer support: {customer_support_cost}\nContinuous development: {continuous_development_cost}\nDesired profit margin: {desired_profit_margin}%\nEstimated number of subscribers: {estimated_subscribers}\n\nFollow the step to formulate the answer:\nFixed costs:\nProfit margin:\nTotal amount needed:\nPrice per subscriber:\nThe minimum subscription price per subscriber is:"
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around line 291,
the prompt mixes ${var} and {var} placeholders so Message.from_template will not
interpolate the ${...} form; change all ${monthly_infrastructure_costs} and
${customer_support_cost} to {monthly_infrastructure_costs} and
{customer_support_cost} respectively, and ensure {continuous_development_cost},
{desired_profit_margin}, and {estimated_subscribers} are used consistently
throughout the template so all placeholders are {var} style.
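The underlying confusion is that {var} belongs to str.format-style templating while ${var} is string.Template syntax; a formatter built on one will not resolve the other's placeholders. A quick sketch (variable names are made up):

```python
from string import Template

text = "Infra: ${infra} | Support: {support}"

# str.format resolves {var}; the "$" is just a literal character,
# so "${infra}" renders as "$" followed by the substituted value.
formatted = text.format(infra=100, support=50)
print(formatted)   # Infra: $100 | Support: 50

# string.Template resolves ${var} and leaves {var} untouched.
templated = Template(text).safe_substitute(infra=100)
print(templated)   # Infra: 100 | Support: {support}
```

Whichever engine the prompt component uses, mixing the two styles guarantees that some placeholders survive unresolved or pick up stray "$" characters, hence the recommendation to standardize on {var}.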

"show": true,
"title_case": false,
"type": "code",
"value": "from collections.abc import Generator\nfrom typing import Any\n\nimport orjson\nfrom fastapi.encoders import jsonable_encoder\n\nfrom langflow.base.io.chat import ChatComponent\nfrom langflow.helpers.data import safe_convert\nfrom langflow.inputs.inputs import BoolInput, DropdownInput, HandleInput, MessageTextInput\nfrom langflow.schema.data import Data\nfrom langflow.schema.dataframe import DataFrame\nfrom langflow.schema.message import Message\nfrom langflow.schema.properties import Source\nfrom langflow.template.field.base import Output\nfrom langflow.utils.constants import (\n MESSAGE_SENDER_AI,\n MESSAGE_SENDER_NAME_AI,\n MESSAGE_SENDER_USER,\n)\n\n\nclass ChatOutput(ChatComponent):\n display_name = \"Chat Output\"\n description = \"Display a chat message in the Playground.\"\n documentation: str = \"https://docs.langflow.org/components-io#chat-output\"\n icon = \"MessagesSquare\"\n name = \"ChatOutput\"\n minimized = True\n\n inputs = [\n HandleInput(\n name=\"input_value\",\n display_name=\"Inputs\",\n info=\"Message to be passed as output.\",\n input_types=[\"Data\", \"DataFrame\", \"Message\"],\n required=True,\n ),\n BoolInput(\n name=\"should_store_message\",\n display_name=\"Store Messages\",\n info=\"Store the message in the history.\",\n value=True,\n advanced=True,\n ),\n DropdownInput(\n name=\"sender\",\n display_name=\"Sender Type\",\n options=[MESSAGE_SENDER_AI, MESSAGE_SENDER_USER],\n value=MESSAGE_SENDER_AI,\n advanced=True,\n info=\"Type of sender.\",\n ),\n MessageTextInput(\n name=\"sender_name\",\n display_name=\"Sender Name\",\n info=\"Name of the sender.\",\n value=MESSAGE_SENDER_NAME_AI,\n advanced=True,\n ),\n MessageTextInput(\n name=\"session_id\",\n display_name=\"Session ID\",\n info=\"The session ID of the chat. 
If empty, the current session ID parameter will be used.\",\n advanced=True,\n ),\n MessageTextInput(\n name=\"data_template\",\n display_name=\"Data Template\",\n value=\"{text}\",\n advanced=True,\n info=\"Template to convert Data to Text. If left empty, it will be dynamically set to the Data's text key.\",\n ),\n ]\n outputs = [\n Output(\n display_name=\"Output Message\",\n name=\"message\",\n method=\"message_response\",\n ),\n ]\n\n def _build_source(self, id_: str | None, display_name: str | None, source: str | None) -> Source:\n source_dict = {}\n if id_:\n source_dict[\"id\"] = id_\n if display_name:\n source_dict[\"display_name\"] = display_name\n if source:\n # Handle case where source is a ChatOpenAI object\n if hasattr(source, \"model_name\"):\n source_dict[\"source\"] = source.model_name\n elif hasattr(source, \"model\"):\n source_dict[\"source\"] = str(source.model)\n else:\n source_dict[\"source\"] = str(source)\n return Source(**source_dict)\n\n async def message_response(self) -> Message:\n # First convert the input to string if needed\n text = self.convert_to_string()\n\n # Get source properties\n source, icon, display_name, source_id = self.get_properties_from_source_component()\n\n # Create or use existing Message object\n if isinstance(self.input_value, Message):\n message = self.input_value\n # Update message properties\n message.text = text\n else:\n message = Message(text=text)\n\n # Set message properties\n message.sender = self.sender\n message.sender_name = self.sender_name\n message.session_id = self.session_id\n message.flow_id = self.graph.flow_id if hasattr(self, \"graph\") else None\n message.properties.source = self._build_source(source_id, display_name, source)\n\n # Store message if needed\n if self.session_id and self.should_store_message:\n stored_message = await self.send_message(message)\n self.message.value = stored_message\n message = stored_message\n\n self.status = message\n return message\n\n def _serialize_data(self, 
data: Data) -> str:\n \"\"\"Serialize Data object to JSON string.\"\"\"\n # Convert data.data to JSON-serializable format\n serializable_data = jsonable_encoder(data.data)\n # Serialize with orjson, enabling pretty printing with indentation\n json_bytes = orjson.dumps(serializable_data, option=orjson.OPT_INDENT_2)\n # Convert bytes to string and wrap in Markdown code blocks\n return \"```json\\n\" + json_bytes.decode(\"utf-8\") + \"\\n```\"\n\n def _validate_input(self) -> None:\n \"\"\"Validate the input data and raise ValueError if invalid.\"\"\"\n if self.input_value is None:\n msg = \"Input data cannot be None\"\n raise ValueError(msg)\n if isinstance(self.input_value, list) and not all(\n isinstance(item, Message | Data | DataFrame | str) for item in self.input_value\n ):\n invalid_types = [\n type(item).__name__\n for item in self.input_value\n if not isinstance(item, Message | Data | DataFrame | str)\n ]\n msg = f\"Expected Data or DataFrame or Message or str, got {invalid_types}\"\n raise TypeError(msg)\n if not isinstance(\n self.input_value,\n Message | Data | DataFrame | str | list | Generator | type(None),\n ):\n type_name = type(self.input_value).__name__\n msg = f\"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}\"\n raise TypeError(msg)\n\n def convert_to_string(self) -> str | Generator[Any, None, None]:\n \"\"\"Convert input data to string with proper error handling.\"\"\"\n self._validate_input()\n if isinstance(self.input_value, list):\n clean_data: bool = getattr(self, \"clean_data\", False)\n return \"\\n\".join([safe_convert(item, clean_data=clean_data) for item in self.input_value])\n if isinstance(self.input_value, Generator):\n return self.input_value\n return safe_convert(self.input_value)\n"

⚠️ Potential issue | 🔴 Critical

Replace PEP 604 unions in isinstance with tuples (incompatible before Python 3.10).

isinstance(..., A | B) raises TypeError prior to Python 3.10, where PEP 604 union support in isinstance landed. Use a tuple to ensure runtime compatibility across all supported versions.

Apply this diff inside the ChatOutput code block:

-        if isinstance(self.input_value, list) and not all(
-            isinstance(item, Message | Data | DataFrame | str) for item in self.input_value
+        if isinstance(self.input_value, list) and not all(
+            isinstance(item, (Message, Data, DataFrame, str)) for item in self.input_value
         ):
...
-        if not isinstance(
-            self.input_value,
-            Message | Data | DataFrame | str | list | Generator | type(None),
-        ):
-            type_name = type(self.input_value).__name__
-            msg = f"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}"
+        if not isinstance(self.input_value, (Message, Data, DataFrame, str, list, Generator)):
+            type_name = type(self.input_value).__name__
+            msg = f"Expected Data or DataFrame or Message or str, Generator or None, got {type_name}"
             raise TypeError(msg)
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around line 410,
replace PEP 604 union types used inside isinstance calls with tuples for
compatibility with Python versions before 3.10: change isinstance(item, Message | Data | DataFrame |
str) to isinstance(item, (Message, Data, DataFrame, str)) and change
isinstance(self.input_value, Message | Data | DataFrame | str | list | Generator
| type(None)) to isinstance(self.input_value, (Message, Data, DataFrame, str,
list, Generator, type(None))). Ensure both isinstance usages are updated to use
tuple syntax without modifying other logic.
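As an illustration of the tuple form, here is a minimal, self-contained sketch using stand-in classes (not the real lfx Message/Data/DataFrame types):

```python
# Stand-in classes for illustration only; the flow's real types live in lfx.
class Message: ...
class Data: ...

def all_supported(items: list) -> bool:
    """Return True when every item is one of the accepted input types."""
    # The tuple form works on every supported Python version, whereas
    # isinstance(item, Message | Data | str) needs a PEP 604 union object,
    # which isinstance only accepts on Python 3.10+.
    return all(isinstance(item, (Message, Data, str)) for item in items)

print(all_supported([Message(), Data(), "hello"]))  # True
print(all_supported([Message(), 42]))               # False
```

The same substitution applies to the longer type check in `_validate_input`.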

Comment on lines +1410 to +1411
"last_tested_version": "1.4.3",
"name": "SaaS Pricing",
⚠️ Potential issue | 🟡 Minor

Update last_tested_version to match the 1.6.0 suite.

File is under starter_projects_1_6_0 but shows "1.4.3".

Apply this diff:

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SaaS Pricing.json around lines
1410-1411, the "last_tested_version" field incorrectly shows "1.4.3" while the
file is in the starter_projects_1_6_0 suite; update the value to "1.6.0" so it
matches the suite naming and commit the change.

Comment on lines +108 to +126
"outputs": [
{
"allows_loop": false,
"cache": true,
"display_name": "Toolset",
"group_outputs": false,
"hidden": null,
"method": "to_toolkit",
"name": "component_as_tool",
"options": null,
"required_inputs": null,
"selected": "Tool",
"tool_mode": true,
"types": [
"Tool"
],
"value": "__UNDEFINED__"
}
],
⚠️ Potential issue | 🔴 Critical

Make ScrapeGraphSearchApi expose a Tool output and fix return type.

Graph wiring expects component_as_tool (Tool), but the class only exposes a Data output via search(). In addition, search() is annotated to return list[Data] yet actually returns a single Data.

Expose both outputs and wrap the return to match type hints:

 class ScrapeGraphSearchApi(Component):
@@
-    outputs = [
-        Output(display_name="Data", name="data", method="search"),
-    ]
+    outputs = [
+        Output(display_name="Toolset", name="component_as_tool", method="to_toolkit"),
+        Output(display_name="Data", name="data", method="search"),
+    ]
@@
-    def search(self) -> list[Data]:
+    def search(self) -> list[Data]:
@@
-            return Data(data=response)
+            return [Data(data=response)]

This aligns runtime behavior with edges that connect to Agent.tools (Tool).

Also applies to: 147-166, 163-227

🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/Search agent.json around lines
108-126 (and similarly update ranges 147-166, 163-227), the graph expects a Tool
output named component_as_tool but the implementation currently only exposes a
Data output via search() and the search return type is incorrect (declares
list[Data] but returns a single Data). Modify the ScrapeGraphSearchApi class to
expose both outputs: keep the Data output and add a Tool output with name
component_as_tool (method to_toolkit/tool mapping), and update the search method
signature to return a list of Data or wrap its single Data return value into a
list to match the declared type; ensure the runtime output selection uses the
Tool output when selected by graph wiring.
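A small sketch of the annotation fix, with a dataclass standing in for lfx's Data (the names here are illustrative, not the real API):

```python
from dataclasses import dataclass


@dataclass
class Data:
    """Illustrative stand-in for lfx's Data container."""
    data: dict


def search(response: dict) -> list[Data]:
    """Return search results wrapped to match the declared list[Data]."""
    # Wrapping the single Data keeps the runtime value consistent with the
    # annotation, so callers that iterate over the result do not break.
    return [Data(data=response)]


results = search({"query": "python", "hits": 3})
print(len(results), results[0].data["hits"])  # 1 3
```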

Comment on lines +1081 to +1086
"description": "Generates targeted SEO keywords based on product information, pain points, and customer profiles for strategic marketing.",
"endpoint_name": null,
"id": "77064cce-d437-42be-a7bb-f0dedf93a75a",
"is_component": false,
"last_tested_version": "1.4.3",
"name": "SEO Keyword Generator",
⚠️ Potential issue | 🟠 Major

Bring last_tested_version up to 1.6.0.

The metadata still claims 1.4.3 even though this JSON lives in the 1.6.0 starter catalog. Update the field so tests and tooling accurately reflect the version we are validating.

-  "last_tested_version": "1.4.3",
+  "last_tested_version": "1.6.0",
🤖 Prompt for AI Agents
In src/lfx/tests/data/starter_projects_1_6_0/SEO Keyword Generator.json around
lines 1081 to 1086, the last_tested_version is incorrectly set to "1.4.3";
update that JSON field to "1.6.0" (preserve JSON formatting and quotes) so the
metadata matches the 1.6.0 starter catalog used by tests and tooling.
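One way to keep these fields honest is a suite-wide check. The sketch below assumes a directory of starter-project JSON files; it is illustrative, not part of the actual test suite:

```python
import json
import tempfile
from pathlib import Path


def find_version_mismatches(suite_dir: Path, expected: str) -> list[str]:
    """Return filenames whose last_tested_version differs from expected."""
    return [
        path.name
        for path in sorted(suite_dir.glob("*.json"))
        if json.loads(path.read_text(encoding="utf-8")).get("last_tested_version") != expected
    ]


# Demo against a throwaway suite directory.
with tempfile.TemporaryDirectory() as tmp:
    suite = Path(tmp)
    (suite / "SaaS Pricing.json").write_text(json.dumps({"last_tested_version": "1.4.3"}))
    (suite / "Blog Writer.json").write_text(json.dumps({"last_tested_version": "1.6.0"}))
    mismatches = find_version_mismatches(suite, "1.6.0")
    print(mismatches)  # ['SaaS Pricing.json']
```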

@carlosrcoelho carlosrcoelho requested a review from erichare October 3, 2025 19:38
@ogabrielluiz ogabrielluiz requested a review from mpawlow October 8, 2025 16:31
@mpawlow mpawlow left a comment

@ogabrielluiz

Code Review 1

  • See PR comments: (a) to (h)
  • My comments are mainly questions, suggestions and areas for potential improvement
  • Ready to approve based on your feedback.

fixed in later versions. These are marked as expected failures.
"""
# Known failing starter projects due to component-level import bugs in 1.6.0
known_failing_projects = {
Contributor

(c) [Minor] Suggest just removing News Aggregator from source control

  • e.g. Rely only on the source control files rather than git retrieval of the projects
  • Also, would be great to remove the known failure (pytest.xfail)

Contributor Author

I don't want to remove it yet because it shouldn't fail. We need to build a way for this to pass. I think we'll probably have to add a stub API module in lfx. I'm not sure. wdyt?
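For reference, a stub/shim module of the kind mentioned here usually just re-exports the real implementation under the old import path. A self-contained sketch, using made-up pkg.* module names and stdlib logging rather than the actual lfx layout:

```python
import importlib
import logging
import sys
import types

# The "new" module that owns the real implementation.
new_mod = types.ModuleType("pkg.log.logger")
new_mod.logger = logging.getLogger("pkg")
sys.modules["pkg.log.logger"] = new_mod

# The shim preserves the old import path by re-exporting the same objects.
shim = types.ModuleType("pkg.logging.logger")
shim.logger = new_mod.logger
shim.__all__ = ["logger"]
sys.modules["pkg.logging.logger"] = shim

# Old-style imports keep resolving to the same logger object.
old_style = importlib.import_module("pkg.logging.logger").logger
print(old_style is new_mod.logger)  # True
```

Registering the dotted name in sys.modules is enough for the import system to resolve it, which is what makes such shims cheap to add and easy to deprecate later.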

Contributor

  • Question: Is it possible to detect the starter project version (for News Aggregator) and automatically upgrade it to the fixed version in the customer's flow on load?
    • OR indicate to the customer, that the current starter project is deprecated (disabled from running) and needs to be manually upgraded to the fixed version?
    • This sort of solution / strategy is usually applicable to artifacts that are versioned plug-ins or extensions
    • Please completely ignore if this concept is not applicable, feasible or practical to starter projects (I'd have to research further to understand whether what I'm suggesting is nonsense in this context)
  • Otherwise, it sounds like using temporary stubs (that can be deprecated and eventually removed in later releases) is the only way to go (and likely the easiest tactical solution)
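A load-time version gate like the one floated above could be as simple as comparing last_tested_version against the current release. Purely a sketch of the idea, not existing lfx behavior:

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple that compares numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(project_meta: dict, current: str = "1.6.0") -> bool:
    """Flag a starter project tested against an older release so the UI
    could warn the user or trigger an automatic upgrade on load."""
    tested = project_meta.get("last_tested_version", "0.0.0")
    return parse_version(tested) < parse_version(current)


print(needs_upgrade({"last_tested_version": "1.4.3"}))  # True
print(needs_upgrade({"last_tested_version": "1.6.0"}))  # False
```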

…nused async fetching

- Refactored the test suite to streamline the retrieval of starter project paths, eliminating the need for asynchronous fetching from GitHub.
- Updated the `get_starter_projects_path` function to directly return the local path for cached starter projects.
- Removed redundant code related to GitHub API calls and caching logic, enhancing test clarity and performance.
- Adjusted test cases to ensure they continue to validate the existence and format of starter project files effectively.
- Introduced two new tests to validate the application of the --format option to error messages in the CLI.
- The first test checks that error messages are returned as plain text when the --format option is set to "text".
- The second test ensures that even with the --verbose flag, the output remains valid JSON when the --format option is set to "json".
- Both tests are marked as expected failures to document current limitations and guide future fixes.
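The behavior those two xfail tests pin down can be sketched as a tiny formatter (a hypothetical helper, not the real CLI code):

```python
import json


def format_error(message: str, fmt: str, verbose: bool = False) -> str:
    """Render a CLI error under --format: plain text or a JSON document."""
    if fmt == "json":
        payload = {"error": message}
        if verbose:
            payload["detail"] = "stack trace and context would go here"
        # Even with --verbose, the whole output must stay parseable JSON.
        return json.dumps(payload)
    return f"Error: {message}"


print(format_error("flow not found", "text"))  # Error: flow not found
print(json.loads(format_error("flow not found", "json", verbose=True))["error"])  # flow not found
```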
sonarqubecloud bot commented Oct 9, 2025

@mpawlow mpawlow left a comment

@ogabrielluiz

Code Review 2


@github-actions github-actions bot added the lgtm This PR has been approved by a maintainer label Oct 9, 2025
@ogabrielluiz ogabrielluiz added this pull request to the merge queue Oct 10, 2025
Merged via the queue into main with commit a3f7f91 Oct 10, 2025
21 checks passed
@ogabrielluiz ogabrielluiz deleted the add-1.6-lfx-tests branch October 10, 2025 13:23