
Commit 880b8de

Merge pull request #1135 from janmatzek/jmat-SVS-1199-public-facing-documentation-for-gooddata-pipelines
2 parents b0af95d + 5df217c commit 880b8de

9 files changed: +959 −1 lines changed

CONTRIBUTING.md

Lines changed: 2 additions & 0 deletions
@@ -60,6 +60,8 @@ The project documentation is done in Hugo. To contribute:
 2. Run `make new-docs`
+3. Open [http://localhost:1313/latest/](http://localhost:1313/latest/) in your browser to see the preview.
+
 The documentation is deployed using manually triggered GitHub workflows.

 Make one logical change per commit.

docs/content/en/latest/api-reference/_index.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 ---
 title: "API Reference"
 linkTitle: "API Reference"
-weight: 60
+weight: 99
 navigationLabel: true
 ---

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
---
title: "Pipelines Overview"
linkTitle: "Pipelines Overview"
weight: 14
---

GoodData Pipelines contains tools for automating GoodData lifecycle management. Built on top of the [GoodData Python SDK](https://www.gooddata.com/docs/python-sdk/latest/), it enables you to programmatically provision and manage workspaces, users, user groups, and their permissions.

For further information, refer to the PIPELINES section in the left navigation menu.

## Installation

Run the following command to install the `gooddata-pipelines` package on your system:

```bash
pip install gooddata-pipelines
```

### Requirements

- Python 3.10 or newer
- GoodData.CN or GoodData Cloud

## Examples

Here is an introductory example of how to manage GoodData resources using GoodData Pipelines:

### Provision Child Workspaces

```python
from gooddata_pipelines import WorkspaceFullLoad, WorkspaceProvisioner

# GoodData.CN host URI (e.g., "http://localhost:3000")
host = "http://localhost:3000"

# GoodData.CN user token
token = "some_user_token"

# Initialize the provisioner
provisioner = WorkspaceProvisioner.create(host=host, token=token)

# Gather the definitions of the workspaces you want to create
raw_data: list[dict] = [
    {
        "parent_id": "demo_parent_workspace",
        "workspace_id": "sales_team_workspace",
        "workspace_name": "Sales Team Workspace",
        "workspace_data_filter_id": "region_filter",
        "workspace_data_filter_values": ["north_america"],
    },
]

# Validate the data
validated_data = [WorkspaceFullLoad(**item) for item in raw_data]

# Run the provisioning
provisioner.full_load(validated_data)
```
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
---
title: "GOODDATA PIPELINES"
linkTitle: "GOODDATA PIPELINES"
weight: 60
navigationLabel: true
---
Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
---
title: "Provisioning"
linkTitle: "Provisioning"
weight: 1
no_list: true
---

Programmatically manage and provision resources in your GoodData environment.

## Supported Resources

Resources you can provision using GoodData Pipelines:

- [Workspaces](workspaces/)
- [Users](users/)
- [User Groups](user_groups/)
- [Workspace Permissions](workspace-permissions/)

## Workflow Types

There are two types of provisioning supported by GoodData Pipelines:

- [Full load](#full-load)
- [Incremental load](#incremental-load)

The two provisioning types employ different algorithms and expect different structures of input data. For details about the expected inputs, see the documentation page for each individual resource.

### Full Load

Full load provisioning aims to fully synchronize the state of your GoodData instance with the provided input. This workflow creates new resources and updates existing ones based on the input. Any resources that exist in GoodData Cloud but are not included in the input will be deleted.
{{% alert color="warning" title="Full loads are destructive" %}}
Full load provisioning will delete any existing resources that are not included in your input data. Test in a non-production environment.
{{% /alert %}}
### Incremental Load

During incremental provisioning, the algorithm interacts only with the resources specified in the input. The input data expects an extra parameter: `is_active`. Resources with the value set to `True` will be updated; by setting it to `False`, you mark resources for deletion. Any other resources already existing in GoodData will not be altered.
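
For illustration, here is a minimal sketch of an incremental input, reusing the `UserGroupIncrementalLoad` model and `UserGroupProvisioner` documented on the [User Groups](user_groups/) page; the group IDs are made up for the example:

```python
from gooddata_pipelines import UserGroupIncrementalLoad, UserGroupProvisioner

provisioner = UserGroupProvisioner.create(
    host="http://localhost:3000", token="some_user_token"
)

raw_data = [
    # is_active=True: this user group will be updated
    {
        "user_group_id": "analysts",
        "user_group_name": "Analysts",
        "parent_user_groups": [],
        "is_active": True,
    },
    # is_active=False: this user group is marked for deletion
    {
        "user_group_id": "contractors",
        "user_group_name": "Contractors",
        "parent_user_groups": [],
        "is_active": False,
    },
]

# Validate the input and run the incremental provisioning
validated_data = [UserGroupIncrementalLoad(**item) for item in raw_data]
provisioner.incremental_load(validated_data)
```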
### Workflow Comparison

| **Aspect** | **Full Load** | **Incremental Load** |
|------------|---------------|----------------------|
| **Scope** | Synchronizes the entire state | Only specified resources |
| **Deletion** | Deletes unspecified resources | Deletes only resources marked `is_active: False` |
| **Use Case** | Complete environment setup | Targeted updates |
## Usage

Regardless of workflow type or resource being provisioned, the typical usage follows these steps:

1. Initialize the provisioner.
2. Validate your data using an input model.
3. Run the selected provisioning method (`.full_load()` or `.incremental_load()`) with your validated data.
Check the [resource pages](#supported-resources) for detailed instructions and examples of workflow implementations.
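
For illustration, here is how the three steps fit together in a minimal full load sketch, reusing the workspace example from the Pipelines Overview:

```python
from gooddata_pipelines import WorkspaceFullLoad, WorkspaceProvisioner

# 1. Initialize the provisioner
provisioner = WorkspaceProvisioner.create(
    host="http://localhost:3000", token="some_user_token"
)

# 2. Validate your data using an input model
raw_data: list[dict] = [
    {
        "parent_id": "demo_parent_workspace",
        "workspace_id": "sales_team_workspace",
        "workspace_name": "Sales Team Workspace",
        "workspace_data_filter_id": "region_filter",
        "workspace_data_filter_values": ["north_america"],
    },
]
validated_data = [WorkspaceFullLoad(**item) for item in raw_data]

# 3. Run the selected provisioning method with the validated data
provisioner.full_load(validated_data)
```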

## Logs

By default, the provisioners operate silently. To monitor progress and troubleshoot issues, you can subscribe to the emitted logs using the `.subscribe()` method on the `logger` property of the provisioner instance.

```python
# Import and set up your logger
import logging

# Import the provisioner
from gooddata_pipelines import WorkspaceProvisioner

host = "http://localhost:3000"
token = "some_user_token"

# In this example, we use the Python standard logging library.
# However, you can use any logger conforming to the LoggerLike protocol
# defined in gooddata_pipelines.logger.logger
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Initialize the provisioner
provisioner = WorkspaceProvisioner.create(host=host, token=token)

# Subscribe to the logging service
provisioner.logger.subscribe(logger)

# Continue with the provisioning
...
```
Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
---
title: "User Groups"
linkTitle: "User Groups"
weight: 3
---

User group provisioning allows you to create, update, or delete user groups.

User groups enable you to organize users and manage permissions at scale by assigning permissions to groups rather than to individual users.

You can provision user groups using full or incremental load methods. Each method requires a specific input type.

## Usage

Start by importing and initializing the `UserGroupProvisioner`:
```python
from gooddata_pipelines import UserGroupProvisioner

host = "http://localhost:3000"
token = "some_user_token"

# Initialize the provisioner with GoodData credentials
provisioner = UserGroupProvisioner.create(host=host, token=token)
```

Then validate your data using the input model that corresponds to the provisioned resource and the selected workflow type: `UserGroupFullLoad` if you intend to run the provisioning in full load mode, or `UserGroupIncrementalLoad` if you want to provision incrementally.

The models expect the following fields:

- **user_group_id**: ID of the user group.
- **user_group_name**: Name of the user group.
- **parent_user_groups**: A list of parent user group IDs.
- _**is_active**_: Deletion flag. Present only in the IncrementalLoad models.

{{% alert color="info" title="Note on IDs" %}}
Each ID can only contain allowed characters. See [Workspace Object Identification](https://www.gooddata.com/docs/cloud/create-workspaces/objects-identification/) to learn more about object identifiers.
{{% /alert %}}
Use the appropriate model to validate your data:

```python
# Add the model to the imports
from gooddata_pipelines import UserGroupFullLoad, UserGroupProvisioner

host = "http://localhost:3000"
token = "some_user_token"

# Initialize the provisioner with GoodData credentials
provisioner = UserGroupProvisioner.create(host=host, token=token)

# Load your data
raw_data = [
    {
        "user_group_id": "user_group_1",
        "user_group_name": "User Group 1",
        "parent_user_groups": [],
    },
]

# Validate the data
validated_data = [
    UserGroupFullLoad(
        user_group_id=item["user_group_id"],
        user_group_name=item["user_group_name"],
        parent_user_groups=item["parent_user_groups"],
    )
    for item in raw_data
]
```

With the provisioner initialized and your data validated, you can run the provisioning:

```python
# Import, initialize, validate...
...

# Run the provisioning method
provisioner.full_load(validated_data)
```

## Examples

Here are complete examples of full load and incremental load user group provisioning workflows:
### Full Load

```python
import logging

from gooddata_pipelines import UserGroupFullLoad, UserGroupProvisioner

host = "http://localhost:3000"
token = "some_user_token"

# Initialize the provisioner
provisioner = UserGroupProvisioner.create(host=host, token=token)

# Optional: set up logging and subscribe to logs emitted by the provisioner
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

provisioner.logger.subscribe(logger)

# Prepare your data
raw_data = [
    {
        "user_group_id": "user_group_1",
        "user_group_name": "User Group 1",
        "parent_user_groups": [],
    },
]

# Validate the data
validated_data = [
    UserGroupFullLoad(
        user_group_id=item["user_group_id"],
        user_group_name=item["user_group_name"],
        parent_user_groups=item["parent_user_groups"],
    )
    for item in raw_data
]

# Run the provisioning with the validated data
provisioner.full_load(validated_data)
```
### Incremental Load

```python
import logging

from gooddata_pipelines import UserGroupIncrementalLoad, UserGroupProvisioner

host = "http://localhost:3000"
token = "some_user_token"

# Initialize the provisioner
provisioner = UserGroupProvisioner.create(host=host, token=token)

# Optional: set up logging and subscribe to logs emitted by the provisioner
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

provisioner.logger.subscribe(logger)

# Prepare your data
raw_data = [
    {
        "user_group_id": "user_group_1",
        "user_group_name": "User Group 1",
        "parent_user_groups": [],
        "is_active": True,
    },
]

# Validate the data
validated_data = [
    UserGroupIncrementalLoad(
        user_group_id=item["user_group_id"],
        user_group_name=item["user_group_name"],
        parent_user_groups=item["parent_user_groups"],
        is_active=item["is_active"],
    )
    for item in raw_data
]

# Run the provisioning with the validated data
provisioner.incremental_load(validated_data)
```
