Conversation

@zach-shu zach-shu commented Oct 14, 2025

This PR migrates Elasticsearch guides and docs from Assistant Builder to Agent Builder.

Source docs: https://github.com/watson-developer-cloud/assistant-toolkit/tree/master/integrations/extensions/docs/elasticsearch-install-and-setup

Signed off by: [email protected]

@zach-shu zach-shu self-assigned this Oct 21, 2025
@@ -0,0 +1,26 @@
# Elasticsearch Installation and Setup Documentation

This directory contains documentation for installing and setting up Elasticsearch along with related guides and integrations.

@kndeepa-ibm kndeepa-ibm Oct 31, 2025


This document explains how to install and set up Elasticsearch, along with related guides and integrations.

## Elasticsearch Setup
- [Install Docker or Docker alternatives](how_to_install_docker.md): A guide explaining Docker and Docker Compose installation options, essential for running Elasticsearch-related applications.
- [Set up Elasticsearch from IBM Cloud and integrate it with watsonx Orchestrate](ICD_Elasticsearch_install_and_setup.md): Instructions for provisioning an Elasticsearch instance on IBM Cloud and setting up Agent Knowledge in watsonx Orchestrate.
- [Set up watsonx Discovery (aka Elasticsearch on-prem) and integrate it with watsonx Orchestrate on-prem](watsonx_discovery_install_and_setup.md): Documentation for setting up watsonx Discovery (aka Elasticsearch on-prem) and integrating it with watsonx Orchestrate on-prem.


Set up watsonx Discovery (also called Elasticsearch on-prem)

### Option 1: Add Knowledge to your agents in the Agent Builder UI
See [Connecting to an Elasticsearch content repository](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-connecting-elasticsearch-content-repository) in watsonx Orchestrate documentation for more details.

### Option 2: Create Knowledge bases via watsonx Orchestrate ADK (Agent Development Kit)


Create Knowledge bases through watsonx Orchestrate Agent Development Kit (ADK)

See [Creating external knowledge bases with Elasticsearch](https://developer.watson-orchestrate.ibm.com/knowledge_base/build_kb#elasticsearch) in ADK documentation for more details.

### Configure the Advanced Elasticsearch Settings
There are two settings under `Advanced Elasticsearch Settings`, a custom query body and custom filters, for advanced search use cases. See the guide [How to configure Advanced Elasticsearch Settings](./how_to_configure_advanced_elasticsearch_settings.md) for more details.


To achieve advanced search results, use a custom query body and custom filters in Advanced Elasticsearch Settings. For more details, see How to configure Advanced Elasticsearch Settings.

There are two settings under `Advanced Elasticsearch Settings`, a custom query body and custom filters, for advanced search use cases. See the guide [How to configure Advanced Elasticsearch Settings](./how_to_configure_advanced_elasticsearch_settings.md) for more details.
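As an illustration, a custom query body in these settings typically contains a placeholder that is replaced with the user's query at search time. This is only a sketch: the `ml.tokens` field name, the `.elser_model_1` model ID, and the `$QUERY` placeholder are assumptions based on a typical ELSER setup, so check the linked guide for the exact variables your deployment expects:

```json
{
  "query": {
    "text_expansion": {
      "ml.tokens": {
        "model_id": ".elser_model_1",
        "model_text": "$QUERY"
      }
    }
  }
}
```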

### Federated search
You can follow the guide [here](federated_search.md) to run queries across multiple indexes within your Elasticsearch cluster.


Follow the guidance in Federated Search in Elasticsearch to run queries across multiple indexes within your Elasticsearch cluster.


You can now run the fscrawler to ingest your documents.

NOTE: If the updated dates of the documents are older than the current date, you must follow the instructions in [Moving files to a "watched" directory](https://fscrawler.readthedocs.io/en/latest/user/tips.html#moving-files-to-a-watched-directory) to ensure that fscrawler can pick them up for indexing.
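For example, a quick way to make existing documents look new to fscrawler is to update their modification time. The path below is a hypothetical example; point it at the directory your fscrawler job actually watches:

```shell
# DOCS_DIR is a hypothetical example path; point it at the directory
# that your fscrawler job is configured to watch.
DOCS_DIR="/tmp/fscrawler-docs"
mkdir -p "$DOCS_DIR"  # no-op if the directory already exists

# Reset every document's modification time to "now" so that fscrawler
# considers the files new and picks them up on its next scan.
find "$DOCS_DIR" -type f -exec touch {} +
```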


NOTE: If the updated dates of the documents are older than the current date, you must follow the instructions in Moving files to a “watched” directory to ensure that fscrawler can pick them up for indexing.

-H "Content-Type: application/json" --cacert "${ES_CACERT}"
```

OPTIONAL: Once all documents are indexed, you can stop the `fscrawler` app, or you can leave it running to keep the filesystem in sync as new documents are added or old ones removed. To stop the app, run the following:


OPTIONAL: Once all documents are indexed, you can stop the fscrawler app. Otherwise, you can leave it running to keep the file system in sync as new documents are added or old ones removed.
To stop the app, run the following:

docker-compose down
```

Your documents are now available in the index, ready for searching and querying. Follow the steps outlined below to use this index for Agent Knowledge in watsonx Orchestrate.

@kndeepa-ibm kndeepa-ibm Oct 31, 2025


Please remove "Follow the steps outlined below to use this index for Agent Knowledge in watsonx Orchestrate."
It's understood automatically.


### Step 5: Connecting to Agent Knowledge in watsonx Orchestrate

To configure your index for Agent Knowledge in watsonx Orchestrate, you need to follow the documentation for [Connecting to an Elasticsearch content repository](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-connecting-elasticsearch-content-repository).


To configure your index for Agent Knowledge in watsonx Orchestrate, refer to Connecting to an Elasticsearch content repository.


To configure your index for Agent Knowledge in watsonx Orchestrate, you need to follow the documentation for [Connecting to an Elasticsearch content repository](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-connecting-elasticsearch-content-repository).

Importantly, you need to use the right fields to configure your result content (in this guide, use `title` for Title and `text` for Body). You also need to use the right query body to make Knowledge work with your web crawler index. Here is a screenshot of the configuration:


Ensure that you use the right fields to configure your result content, that is, title for Title and text for Body. You must use the right query format so that Knowledge works properly with your web crawler index. Refer to the following configuration image:


## What is Docker?

Docker is a software platform that allows you to build, test, and deploy applications quickly. Docker packages software into standardized units called containers that have everything the software needs to run including libraries, system tools, code, and runtime. Using Docker, you can quickly deploy and scale applications into any environment and know your code will run.


Docker is a software platform designed to streamline the process of building, testing, and deploying applications. It packages software into standardized units called containers that include all necessary components such as libraries, system tools, code, and runtime. With Docker, you can efficiently deploy and scale applications in any environment, ensuring consistent and reliable execution of your code.


Docker is a software platform that allows you to build, test, and deploy applications quickly. Docker packages software into standardized units called containers that have everything the software needs to run including libraries, system tools, code, and runtime. Using Docker, you can quickly deploy and scale applications into any environment and know your code will run.

And Docker Compose is a tool for defining and running multi-container applications. It is the key to unlocking a streamlined and efficient development and deployment experience.


Docker Compose is a tool for defining and managing multi-container applications. It plays a crucial role in simplifying and optimizing both the development and deployment experience.


And Docker Compose is a tool for defining and running multi-container applications. It is the key to unlocking a streamlined and efficient development and deployment experience.

You will see references to `docker` and `docker-compose` as you work through some of our guides and this document serves to guide anyone who needs a starting point to install that software or its alternatives.


As you go through the guides, you will see references to docker and docker-compose. This document is intended to help anyone who needs a starting point for installing these tools or exploring alternative solutions.


1. [Docker Compose Overview](https://docs.docker.com/compose/)
2. [Docker Overview](https://docs.docker.com/get-docker/)


@kndeepa-ibm kndeepa-ibm Nov 4, 2025


Please add a sentence like
"You can install Docker in many ways. The following table serves as a quick guide to choose the method of installing Docker."

| Install method | Who can use it | Maintenance |
| --- | --- | --- |
| Docker Desktop | Small organizations | Maintained regularly by Docker |
| Rancher Desktop | New users to Docker who prefer an easy one-click install of basic functionality | |
| Podman | Windows or Mac users who have a Linux distribution/subsystem or Linux in a virtual machine | Manual update |
| Colima | Users who prefer a CLI to the Docker Desktop GUI | |

Please give links to all the install options so that users can go to the correct option easily.

1. [Docker Compose Overview](https://docs.docker.com/compose/)
2. [Docker Overview](https://docs.docker.com/get-docker/)

## Option 1: Docker Desktop


Using Docker Desktop


To use ELSER for text expansion queries on chunked texts, you need to build a pipeline with an inference processor that uses the ELSER model.

NOTE: The ELSER model is not enabled by default; you can enable it in Kibana by following the [download-deploy-elser instructions](https://www.elastic.co/guide/en/machine-learning/8.11/ml-nlp-elser.html#download-deploy-elser).


Follow NOTE or Note throughout.

Note: The ELSER model is not enabled by default. You can enable it in Kibana by following the download-deploy-elser instructions.


Depending on your Elasticsearch version, you can choose to deploy either the ELSER v1 or v2 model. The following steps and commands are based on the ELSER v1 model, but you can find the changes needed for ELSER v2 in the notes of each step.

You will be able to reference this pipeline in the next few steps as a part of indexing the documents of choice. It transforms the "text" field using the ELSER model and produces the terms along with weights as a sparse vector in the "ml" field at index time.
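As a sketch of what such a pipeline can look like (assuming ELSER v1, the default `.elser_model_1` model ID, and a hypothetical pipeline name of `elser-v1-test`; adjust these to your deployment):

```json
PUT _ingest/pipeline/elser-v1-test
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_1",
        "target_field": "ml",
        "field_map": {
          "text": "text_field"
        },
        "inference_config": {
          "text_expansion": {
            "results_field": "tokens"
          }
        }
      }
    }
  ]
}
```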


You can reference this pipeline in the upcoming steps as part of the document indexing process. It applies the ELSER model to transform the text field, generating the terms along with weights as a sparse vector in the ml field at index time.


Learn more about [inference-ingest-pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/8.11/semantic-search-elser.html#inference-ingest-pipeline) in the tutorial.

Create the pipeline using the command below:


Create the pipeline using the following command:


You will be able to reference this pipeline in the next few steps as a part of indexing the documents of choice. It transforms the "text" field using the ELSER model and produces the terms along with weights as a sparse vector in the "ml" field at index time.

Learn more about [inference-ingest-pipeline](https://www.elastic.co/guide/en/elasticsearch/reference/8.11/semantic-search-elser.html#inference-ingest-pipeline) in the tutorial.


Learn more about inference-ingest-pipeline from the tutorial.


Your documents are now available in the index, ready for searching and querying. Follow the steps outlined below to use this index for Agent Knowledge in watsonx Orchestrate.

**NOTE**: There are some example documents available [here](../assets/sample_pdf_docs), if you would like to test the setup.


Refer to Example documents to test the setup.

@@ -0,0 +1,506 @@
# How to set up and use the web crawler in Elasticsearch
This documentation explains how to set up and use the web crawler in Elasticsearch and connect it to Agent Knowledge in watsonx Orchestrate.


This documentation explains how to set up and use the web crawler in Elasticsearch and connect it to Agent Knowledge in watsonx Orchestrate.

# How to set up and use the web crawler in Elasticsearch
This documentation explains how to set up and use the web crawler in Elasticsearch and connect it to Agent Knowledge in watsonx Orchestrate.

## Table of contents:


Table of contents

* [Step 4: Connect a web crawler index to Agent Knowledge in watsonx Orchestrate](#step-4-connect-a-web-crawler-index-to-agent-knowledge-in-watsonx-orchestrate)

## Step 1: Set up Enterprise Search to enable the web crawler in Elasticsearch
Before you start, you will need to install and set up your Elasticsearch cluster.


Before you start, you must install and set up your Elasticsearch cluster.

### Set up Enterprise Search for Elasticsearch on IBM Cloud
Assuming you have installed Kibana locally following [ICD-elasticsearch-install-and-setup](../../docs/elasticsearch-install-and-setup/ICD_Elasticsearch_install_and_setup.md),
follow these steps to set up Enterprise Search in Elasticsearch:
**NOTE: Enterprise Search requires at least 4GB of memory, so please make sure you have enough memory allocated to your Docker Engine.**


NOTE: Enterprise Search requires a minimum of 4GB of memory. Ensure that your Docker Engine has sufficient memory allocated to meet this requirement.

```shell
docker network create elastic
```
NOTE: `elastic` will be the name of your docker network.


NOTE: elastic is the name of your docker network.

In the ingest pipeline page, click on `Add a processor`, choose `Script` processor, and then add [a painless script](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-scripting-painless.html) to the `Source` field.
For example,
<img src="assets/web_crawler_script_processor.png" width="577" height="718" />


Examples for painless scripts

(or any other suitable heading)

i = j;
}
```
This script splits the `body_content` into sentences using regex and combines them into `passages`. The maximum number of characters in each passage is controlled by the `model_limit` parameter. There is an overlap between two adjacent passages, controlled by the `overlap_percentage` parameter. So, `model_limit` and `overlap_percentage` need to be configured in the `Parameters` field, for example,


The overlap between two adjacent passages is controlled by the overlap_percentage parameter. So, model_limit and overlap_percentage need to be configured in the Parameters field. For example,

NOTES:
* `.elser_model_2_linux-x86_64` is an optimized version of the ELSER v2 model and is preferred if it is available. Otherwise, use `.elser_model_2` for the regular ELSER v2 model or `.elser_model_1` for ELSER v1.
* `inference_config.text_expansion` is required in the config to tell the Foreach processor to use `text_expansion` and store the results in the `tokens` field for each chunked text.
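Putting the notes above together, the Foreach processor configuration might look like the following (a sketch assuming the chunked texts live in a `passages` field produced by the script processor, each with a `text` sub-field; adjust the model ID and field names to your setup):

```json
{
  "foreach": {
    "field": "passages",
    "processor": {
      "inference": {
        "model_id": ".elser_model_2_linux-x86_64",
        "target_field": "_ingest._value.ml",
        "field_map": {
          "_ingest._value.text": "text_field"
        },
        "inference_config": {
          "text_expansion": {
            "results_field": "tokens"
          }
        }
      }
    }
  }
}
```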


align the sentence properly



> ⛔️
> **Caution**


Click Save pipeline to save your changes.


To configure your web crawler index for Agent Knowledge in watsonx Orchestrate, you need to follow the documentation for [Connecting to an Elasticsearch content repository](https://www.ibm.com/docs/en/watsonx/watson-orchestrate/base?topic=agents-connecting-elasticsearch-content-repository).

Importantly, you need to use the right fields to configure your result content (in this guide, use `title` for Title and `text` for Body). You also need to use the right query body to make Knowledge work with your web crawler index. Here is a screenshot of the configuration:


IMPORTANT: You must use the right fields to configure your result content (in this guide, use title for Title and text for Body). You must use the right query body to make Knowledge work with your web crawler index. Refer to the following example configuration image:

@@ -0,0 +1,248 @@
# How to set up and use 3rd-party text embeddings for dense vector search in Elasticsearch


How to set up and use third-party text embeddings for dense vector search in Elasticsearch

@@ -0,0 +1,248 @@
# How to set up and use 3rd-party text embeddings for dense vector search in Elasticsearch
This guide demonstrates how to deploy and use a text embedding model in Elasticsearch. The model will generate vector representations for text, enabling vector similarity (k-nearest neighbours) search.


This guide outlines the deployment and usage of a text embedding model within Elasticsearch. The model generates vector representations for text, enabling k-nearest neighbors (KNN) search based on vector similarity.


## Set up Elasticsearch

### Elasticsearch from IBM Cloud


Set up Elasticsearch on IBM Cloud

## Set up Elasticsearch

### Elasticsearch from IBM Cloud
If you are using Elasticsearch from IBM Cloud, please refer to [this guide](./ICD_Elasticsearch_install_and_setup.md) first to create an Elasticsearch instance and set up Kibana if you haven't already.


If you are using Elasticsearch from IBM Cloud, refer to the Install guide to create an Elasticsearch instance and set up Kibana.

This guide demonstrates how to deploy and use a text embedding model in Elasticsearch. The model will generate vector representations for text, enabling vector similarity (k-nearest neighbours) search.

## Set up Elasticsearch


Write one sentence about setting ES. You can't have H2 & H3 together.

```
You can find the credentials in the service credentials of your Elasticsearch instance.
## Pull and deploy an embedding model
Run the command below to pull your desired model from the [Huggingface Models Hub](https://huggingface.co/models) and deploy it on your Elasticsearch instance:


Run the following command to pull your desired model

--start
```

In this example, we are using the `multilingual-e5-small` model, a multilingual model that supports text embeddings in 100 languages. You can read more about this model [here](https://huggingface.co/intfloat/multilingual-e5-small).


The above example uses the multilingual-e5-small model, a multilingual model that supports text embeddings in 100 languages. For more information on this model, see Multilingual-E5-small.

In this example, we are using the `multilingual-e5-small` model, a multilingual model that supports text embeddings in 100 languages. You can read more about this model [here](https://huggingface.co/intfloat/multilingual-e5-small).

## Synchronize your deployed model
Go to the **Machine Learning > Trained Models** page http://localhost:5601/app/ml/trained_models and synchronize your trained models. A warning message is displayed at the top of the page that says "ML job and trained model synchronization required". Follow the link to "Synchronize your jobs and trained models", then click Synchronize.


To synchronize your deployed model:

  1. Go to http://localhost:5601/app/ml/trained_models page.
  2. Click Machine Learning > Trained Models.
  3. A warning message, "ML job and trained model synchronization required" is displayed at the top of the page.
  4. Click the link "Synchronize your jobs and trained models".
  5. Click Synchronize.

}'
```

You can verify that the ingest pipeline was created by locating it in the list of your ingest pipelines in Kibana at http://localhost:5601/app/management/ingest/ingest_pipelines.


Go to http://localhost:5601/app/management/ingest/ingest_pipelines page and verify that the ingest pipeline was created by locating it in the list of your ingest pipelines on Kibana.

}'
```

## What's next?


What to do next

@@ -0,0 +1,312 @@
# How to set up watsonx Discovery (Elasticsearch) and integrate it with watsonx Orchestrate in CloudPak


How to set up watsonx Discovery (Elasticsearch) and integrate it with watsonx Orchestrate in CloudPak for Data

@@ -0,0 +1,312 @@
# How to set up watsonx Discovery (Elasticsearch) and integrate it with watsonx Orchestrate in CloudPak
This documentation explains how to set up watsonx Discovery (aka Elasticsearch on-prem) and integrate it with watsonx Orchestrate in CloudPak.


This document provides guidance on setting up watsonx Discovery (also known as Elasticsearch on-prem) and integrating it with watsonx Orchestrate in CloudPak for Data.



## Step 1: Install Elastic Cloud on Kubernetes (ECK) on CloudPak
This step is about installing Elastic Cloud on Kubernetes (ECK) in CloudPak.


This sentence is not needed. You can remove. Title is self explanatory

## Step 1: Install Elastic Cloud on Kubernetes (ECK) on CloudPak
This step is about installing Elastic Cloud on Kubernetes (ECK) in CloudPak.

Before you begin, you will need:


Before you begin

This step is about installing Elastic Cloud on Kubernetes (ECK) in CloudPak.

Before you begin, you will need:
* Access to a CloudPak cluster


You must have access to the CloudPak cluster

cpu: 2
EOF
```
NOTE: The container resources are configurable.


NOTE: The container resources are configurable.

```

* Add an ECK enterprise license
When you install the default distribution of ECK, you receive a Basic license. If you have a valid Enterprise


Check the sentence alignment

You have obtained `ES_USER` and `ES_PASSWORD` from [obtain-the-elasticsearch-credentials](#verify-the-installation) step.

### Enable ELSER model (v2)
The ELSER model is not enabled by default, but you can enable it in Kibana. Follow the [download-deploy-elser instructions](https://www.elastic.co/guide/en/machine-learning/8.11/ml-nlp-elser.html#download-deploy-elser) to do so.


ELSER model is not enabled by default. To enable it in Kibana, follow the Download deploy ELSER instructions.

Notes:
* `search-wa-docs` will be your index name
* `text_embedding` is the field that will keep ELSER output when data is ingested, and `sparse_vector` type is required for ELSER output field
* `text` is the input field for the inference processor. In the example dataset, the name of the input field is `text`, which the ELSER model will process.


text is the input field for the inference processor.
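Based on these notes, a minimal index mapping for `search-wa-docs` could look like the following (a sketch; extend it with whatever other fields your dataset has):

```json
PUT search-wa-docs
{
  "mappings": {
    "properties": {
      "text_embedding": {
        "type": "sparse_vector"
      },
      "text": {
        "type": "text"
      }
    }
  }
}
```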


### Semantic search by using the text_expansion query
To perform semantic search, use the `text_expansion` query, and provide the query text and the ELSER model ID.
The example below uses the query text "How to set up custom extension?", the `text_embedding` field contains


The following example uses the query text "How to set up custom extension?" and the text_embedding field contains the generated ELSER output:
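Such a query might look like the following (a sketch assuming the ELSER v2 model ID and the `text_embedding` output field described above):

```json
GET search-wa-docs/_search
{
  "query": {
    "text_expansion": {
      "text_embedding": {
        "model_id": ".elser_model_2_linux-x86_64",
        "model_text": "How to set up custom extension?"
      }
    }
  }
}
```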
