
πŸ“Š Relayboard - Data Pipeline Platform

Transform CSV data into actionable insights delivered via Slack


Created by AJAL ODORA JONATHAN

Relayboard is a modern data pipeline automation platform that transforms CSV/Google Sheets data into actionable insights delivered directly to your team's Slack channels. It automates the entire data processing workflow from ingestion to delivery.

🎬 See It In Action

Watch Relayboard transform CSV data into Slack notifications in real-time:

Demo video: Screen.Recording.Oct.16.2025.1.1.mov

What You'll See:

  • πŸ“Š CSV dataset registration via web interface
  • ⚑ One-click pipeline execution
  • πŸ”„ Real-time processing status updates
  • πŸ“± Instant Slack notifications with data previews
  • 🎯 Smart data sampling for large datasets

🎯 What Problem Does It Solve?

Many teams struggle with:

  • Manual data processing that's time-consuming and error-prone
  • Data silos where insights don't reach the right people
  • Complex data pipelines that require technical expertise
  • Delayed insights that arrive too late to be actionable

Relayboard solves this by providing a "data-to-notification" system that automates the entire workflow.

πŸš€ Core Features

Automated Data Pipeline

CSV URL β†’ MinIO Storage β†’ PostgreSQL Staging β†’ dbt Transform β†’ PostgreSQL Warehouse β†’ Slack

One-Click Execution

  • Register CSV datasets via web interface
  • Configure Slack webhook destinations
  • Execute complete pipeline with single click
  • Real-time feedback and error handling

Modern Tech Stack

  • Frontend: Next.js 15 with Tailwind CSS
  • API: NestJS with TypeScript
  • Worker: Python/FastAPI for data processing
  • Database: PostgreSQL with staging/warehouse schemas
  • Storage: MinIO (S3-compatible)
  • Transformations: dbt for data modeling

πŸ—οΈ Architecture Overview

graph TB
    A[Web UI - Next.js] --> B[API - NestJS]
    B --> C[Worker - Python/FastAPI]
    B --> D[PostgreSQL Database]
    B --> E[MinIO Storage]
    C --> D
    C --> E
    C --> F[dbt Transformations]
    F --> D
    C --> G[Slack Webhook]

    H[CSV URL] --> C
    I[User] --> A

Component Details

Frontend (Next.js) 🎨

  • Location: apps/web/
  • Port: 3000
  • Features:
    • Modern, responsive UI with Tailwind CSS
    • Step-by-step pipeline configuration
    • Real-time loading states and feedback
    • Service status monitoring

API (NestJS) πŸ”Œ

  • Location: apps/api/
  • Port: 4000
  • Features:
    • RESTful API endpoints
    • Database connection management
    • File storage integration
    • Pipeline orchestration

Worker (Python/FastAPI) 🐍

  • Location: apps/worker/
  • Port: 5055
  • Features:
    • CSV processing and validation
    • PostgreSQL data loading
    • dbt model generation and execution
    • Slack webhook integration

Infrastructure πŸ—οΈ

  • PostgreSQL: Port 5433 (staging + warehouse schemas)
  • MinIO: Port 9000 (storage) + 9001 (console)
  • Redis: Port 6379 (caching/queuing); a quick connectivity check is sketched below
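To sanity-check that these services are reachable before starting the app, a small Python connectivity check can help. This is a minimal sketch, assuming the default dev credentials from the Configuration section and the psycopg2, minio, and redis packages; it is not part of the repository.

# check_services.py - hedged sketch: verifies the infrastructure ports listed above
# using the default dev credentials from the Configuration section (assumptions).
import psycopg2
import redis
from minio import Minio

def check_postgres():
    conn = psycopg2.connect(host="127.0.0.1", port=5433, user="relayboard",
                            password="relayboard", dbname="relayboard")
    conn.close()
    print("PostgreSQL (5433): OK")

def check_minio():
    client = Minio("127.0.0.1:9000", access_key="relayboard",
                   secret_key="relayboard123", secure=False)
    print(f"MinIO (9000): OK, bucket exists = {client.bucket_exists('relayboard')}")

def check_redis():
    redis.Redis(host="127.0.0.1", port=6379).ping()
    print("Redis (6379): OK")

if __name__ == "__main__":
    for check in (check_postgres, check_minio, check_redis):
        try:
            check()
        except Exception as exc:
            print(f"{check.__name__} failed: {exc}")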

πŸ“‹ Quick Start

Prerequisites

  • Node.js 18+ and pnpm
  • Python 3.11+
  • Docker and Docker Compose
  • dbt CLI (optional, for local development)

1. Start Infrastructure Services

# Start PostgreSQL, MinIO, and Redis
docker compose -f infra/docker/docker-compose.dev.yml up -d

2. Install Dependencies

# Install all workspace dependencies
pnpm install

3. Start Development Services

Terminal 1 - API Server:

pnpm --filter @relayboard/api dev

Terminal 2 - Web Interface:

pnpm --filter @relayboard/web dev

Terminal 3 - Worker Service:

cd apps/worker
pip install -r requirements.txt
./start.sh

4. Access the Application

  • Web UI: http://localhost:3000
  • API: http://localhost:4000
  • Worker: http://localhost:5055
  • MinIO Console: http://localhost:9001

πŸ”§ API Endpoints

Dataset Management

# Register CSV dataset
POST /v1/datasets/csv
{
  "name": "sales_data",
  "csvUrl": "https://example.com/data.csv"
}

Destination Configuration

# Configure Slack webhook
POST /v1/destinations/slack
{
  "webhookUrl": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK"
}

Pipeline Execution

# Run complete pipeline
POST /v1/pipelines/run
{
  "datasetName": "sales_data"
}

Health Check

GET /health
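For reference, the endpoints above can be scripted end-to-end. The sketch below is an illustrative Python client using the requests library against a local API on port 4000; the paths and payloads come from this README, while the response handling is an assumption rather than documented behavior.

# pipeline_client.py - hedged sketch: drives the documented endpoints in order.
import requests

API = "http://localhost:4000"

# 1. Register a CSV dataset
requests.post(f"{API}/v1/datasets/csv", json={
    "name": "sales_data",
    "csvUrl": "https://example.com/data.csv",
}).raise_for_status()

# 2. Configure the Slack destination
requests.post(f"{API}/v1/destinations/slack", json={
    "webhookUrl": "https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK",
}).raise_for_status()

# 3. Run the pipeline and print whatever the API returns
run = requests.post(f"{API}/v1/pipelines/run", json={"datasetName": "sales_data"})
run.raise_for_status()
print(run.json())

# Health check
print(requests.get(f"{API}/health").json())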

πŸ“Š Data Flow Process

1. Data Ingestion

  • User provides CSV URL via web interface
  • API downloads CSV and stores in MinIO
  • Dataset metadata saved to PostgreSQL (see the sketch below)
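A minimal sketch of this ingestion step, shown in Python for illustration even though the API itself is NestJS; it assumes boto3 and psycopg2, the default dev credentials, and an illustrative S3 key layout rather than the actual implementation.

# ingest_csv.py - hedged sketch: download a CSV, store it in MinIO, record the
# dataset row. Function and key names are illustrative, not the real API code.
import boto3
import psycopg2
import requests

def ingest_csv(name: str, csv_url: str) -> str:
    body = requests.get(csv_url, timeout=30).content

    # Store the raw file in MinIO via its S3-compatible endpoint
    s3 = boto3.client("s3", endpoint_url="http://127.0.0.1:9000",
                      aws_access_key_id="relayboard",
                      aws_secret_access_key="relayboard123")
    s3_key = f"datasets/{name}.csv"
    s3.put_object(Bucket="relayboard", Key=s3_key, Body=body)

    # Record dataset metadata (matches the dataset table in Database Schema)
    with psycopg2.connect(host="127.0.0.1", port=5433, user="relayboard",
                          password="relayboard", dbname="relayboard") as conn:
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO dataset (name, source_kind, s3_key) VALUES (%s, 'csv', %s)",
                (name, s3_key),
            )
    return s3_key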

2. Pipeline Execution

  • API triggers worker with pipeline parameters
  • Worker downloads CSV from MinIO
  • Data loaded into PostgreSQL staging schema
  • dbt models auto-generated based on CSV schema
  • dbt transformations executed
  • Results loaded into warehouse schema (see the sketch below)
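A condensed sketch of the worker side of this step, assuming pandas, SQLAlchemy, and the dbt CLI are installed; table and model names follow the conventions described in this README but are otherwise illustrative.

# run_pipeline.py - hedged sketch: load the CSV into the staging schema, then
# run the auto-generated dbt model so results land in the warehouse schema.
import subprocess
import pandas as pd
from sqlalchemy import create_engine

PG_URL = "postgresql://relayboard:relayboard@127.0.0.1:5433/relayboard"

def run_pipeline(dataset_name: str, local_csv_path: str) -> None:
    # Load the raw CSV into staging
    df = pd.read_csv(local_csv_path)
    engine = create_engine(PG_URL)
    df.to_sql(dataset_name, engine, schema="staging",
              if_exists="replace", index=False)

    # Execute the generated model, e.g. models/generated/sales_data_clean.sql
    subprocess.run(
        ["dbt", "run", "--select", f"{dataset_name}_clean",
         "--project-dir", "dbt/relayboard", "--profiles-dir", "dbt/relayboard"],
        check=True,
    )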

3. Delivery

  • Worker queries transformed data
  • Results formatted and sent to Slack
  • Pipeline status updated in database (see the sketch below)
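A hedged sketch of the delivery step, assuming psycopg2 and requests; the message format below is illustrative rather than the exact payload Relayboard sends.

# deliver_to_slack.py - hedged sketch: query the transformed table in the
# warehouse schema and post a small preview to the Slack incoming webhook.
import psycopg2
import requests

def deliver(dataset_name: str, webhook_url: str, limit: int = 5) -> None:
    with psycopg2.connect(host="127.0.0.1", port=5433, user="relayboard",
                          password="relayboard", dbname="relayboard") as conn:
        with conn.cursor() as cur:
            cur.execute(
                f'SELECT * FROM warehouse."{dataset_name}_clean" LIMIT %s', (limit,))
            rows = cur.fetchall()
            columns = [col.name for col in cur.description]

    header = " | ".join(columns)
    preview = "\n".join(" | ".join(str(v) for v in row) for row in rows)
    text = f"*{dataset_name}* pipeline finished.\n{header}\n{preview}"
    requests.post(webhook_url, json={"text": text}, timeout=10).raise_for_status()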

πŸ—„οΈ Database Schema

Core Tables

-- Dataset registry
CREATE TABLE dataset (
  id SERIAL PRIMARY KEY,
  name TEXT UNIQUE NOT NULL,
  source_kind TEXT NOT NULL, -- 'csv'
  s3_key TEXT NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Destination configuration
CREATE TABLE destination (
  id SERIAL PRIMARY KEY,
  kind TEXT NOT NULL, -- 'slack'
  config_json JSONB NOT NULL,
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Pipeline run tracking
CREATE TABLE run (
  id SERIAL PRIMARY KEY,
  dataset_id INT REFERENCES dataset(id),
  status TEXT NOT NULL,
  started_at TIMESTAMPTZ DEFAULT NOW(),
  finished_at TIMESTAMPTZ,
  error TEXT
);

Data Schemas

  • staging: Raw CSV data loaded from MinIO
  • warehouse: Transformed data from dbt models

πŸ”„ dbt Integration

Auto-Generated Models

The worker automatically generates dbt models based on the CSV schema (a generation sketch follows the project layout below):

-- Example: sales_data_clean.sql
select
  "date",
  "product_name",
  "quantity",
  "price"
from staging."sales_data"

dbt Project Structure

dbt/relayboard/
β”œβ”€β”€ dbt_project.yml
β”œβ”€β”€ profiles.yml
└── models/
    β”œβ”€β”€ example.sql
    └── generated/          # Auto-generated models
        └── {dataset}_clean.sql
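A hedged sketch of how such a {dataset}_clean.sql model could be produced from a CSV header; the real worker's generation logic may differ, and the output path follows the layout above.

# generate_dbt_model.py - hedged sketch: build a {dataset}_clean.sql model from a
# CSV header and write it into models/generated/, matching the example model above.
import csv
from pathlib import Path

def generate_clean_model(dataset_name: str, csv_path: str) -> Path:
    with open(csv_path, newline="") as f:
        columns = next(csv.reader(f))  # first row is the header

    select_list = ",\n".join(f'  "{col}"' for col in columns)
    sql = f'select\n{select_list}\nfrom staging."{dataset_name}"\n'

    model_path = Path("dbt/relayboard/models/generated") / f"{dataset_name}_clean.sql"
    model_path.parent.mkdir(parents=True, exist_ok=True)
    model_path.write_text(sql)
    return model_path

Running generate_clean_model("sales_data", "sales_data.csv") would produce a file equivalent to the sales_data_clean.sql example shown earlier.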

🎨 Web Interface

Step 1: Register CSV Dataset

  • Enter dataset name
  • Provide CSV URL
  • Click "Register CSV"

Step 2: Configure Slack Destination

  • Enter Slack webhook URL
  • Click "Save Slack Destination"

Step 3: Run Pipeline

  • Click "Run Pipeline"
  • Monitor progress with loading indicators
  • View success/error feedback

Service Status

  • Real-time status of all services
  • Connection indicators
  • Service URLs and ports

πŸš€ Deployment

Development Environment

# Start all services
docker compose -f infra/docker/docker-compose.dev.yml up -d
pnpm --filter @relayboard/api dev
pnpm --filter @relayboard/web dev
cd apps/worker && ./start.sh

Production Considerations

  • Use environment variables for configuration
  • Set up proper SSL certificates
  • Configure production PostgreSQL
  • Use managed MinIO or AWS S3
  • Set up monitoring and logging
  • Implement proper security measures

πŸ”§ Configuration

Environment Variables

API (.env):

# Database
PG_HOST=127.0.0.1
PG_PORT=5433
PG_USER=relayboard
PG_PASSWORD=relayboard
PG_DATABASE=relayboard

# Storage
S3_ENDPOINT=http://127.0.0.1:9000
S3_ACCESS_KEY=relayboard
S3_SECRET_KEY=relayboard123
S3_BUCKET=relayboard

# Services
WORKER_BASE_URL=http://127.0.0.1:5055
SLACK_WEBHOOK=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK

Web (.env.local):

NEXT_PUBLIC_API_BASE=http://localhost:4000

πŸ› Troubleshooting

Common Issues

PostgreSQL Connection Error:

# Check if PostgreSQL is running
docker ps | grep postgres

# Check port conflicts
lsof -i :5433

MinIO Connection Error:

# Check MinIO status
docker logs docker-minio-1

# Access MinIO console
open http://localhost:9001

Worker Service Error:

# Check Python dependencies
cd apps/worker
pip install -r requirements.txt

# Check dbt installation
dbt --version

πŸ“ˆ Roadmap

Phase 1: Core Features βœ…

  • CSV data ingestion
  • PostgreSQL integration
  • dbt transformations
  • Slack delivery
  • Web interface

Phase 2: Enhanced Features 🚧

  • Google Sheets integration with OAuth
  • Advanced dbt models with business logic
  • Data preview with DuckDB
  • Pipeline scheduling
  • Error handling and retry logic

Phase 3: Enterprise Features πŸ“‹

  • User management and RBAC
  • Audit logs and data lineage
  • Advanced analytics and dashboards
  • API rate limiting and security
  • Multi-tenant support

Phase 4: Scale & Performance πŸš€

  • Horizontal scaling
  • Advanced caching strategies
  • Performance monitoring
  • CI/CD pipelines
  • Kubernetes deployment

🀝 Contributing

Development Setup

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

Code Style

  • Frontend: ESLint + Prettier
  • API: NestJS conventions
  • Worker: Python PEP 8
  • Database: PostgreSQL best practices

πŸ‘¨β€πŸ’» Author

AJAL ODORA JONATHAN - @ODORA0

  • 🌐 GitHub: https://github.com/ODORA0
  • πŸ’Ό LinkedIn: Available on GitHub profile
  • 🎯 Tech Stack: Java, TypeScript, JavaScript, Python, React, Node.js, Firebase, AWS

About the Developer

Experienced full-stack developer with expertise in:

  • Backend: Java, Python, Node.js, NestJS
  • Frontend: React, TypeScript, Next.js, Tailwind CSS
  • Cloud: AWS, Firebase, Docker
  • Data: PostgreSQL, dbt, data pipelines
  • Healthcare: OpenMRS contributor and billing systems expert

πŸ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.

πŸ™ Acknowledgments

  • Built with modern web technologies
  • Inspired by data engineering best practices
  • Designed for developer experience and ease of use
  • Special thanks to the open-source community

Ready to transform your data into actionable insights? Start with Relayboard today! πŸš€

Created with ❀️ by AJAL ODORA JONATHAN
