🧠 AI Support Ticket Triage System

An AI-powered app that classifies IT support tickets (e.g., software issues, hardware failures, network errors) into predefined categories. It exposes a REST API and includes a frontend built with Next.js and TailwindCSS.

✅ Requirements Gathering & Planning

📌 Problem Statement

Customer support teams receive hundreds or thousands of tickets daily. These tickets vary in type (bug reports, feature requests, billing issues, etc.) and urgency. Manually triaging them is time-consuming and error-prone.

🫡 Objective

Build a system that automatically classifies incoming support tickets by category and priority level using machine learning.

🎯 Goals & Deliverables

Goal	Metric	Target Value
Accurate ticket categorization	F1-Score, Accuracy	≥ 90% Accuracy
Detect priority level (High/Med/Low)	Precision/Recall for High	≥ 85% Precision
Fast inference	API Response Time	≤ 300ms
Usable API	REST API with `/predict` route	✅
Scalable system	Docker + CI/CD + Deployment	✅

🗂️ Ticket Types & Labels (Multi-Class)

We need a dataset with these types of tickets:

bug
billing
feature_request
technical_issue
account_problem
security
maintenance
general_inquiry
other

And with the following set of priorities:

high
medium
low

🧠 Model Plan

Start with:

TF-IDF + RandomForestClassifier (Baseline)
Move to better methods for better metrics as required

🏗️ Architecture Overview

The planned architecture is shown in the diagram below:

The steps in the planned architecture are:

User / System Submits Ticket →
/predict route of the API (FastAPI) receives the request →
Preprocessing + Inference Pipeline →
Response with Category & Priority
The response is displayed in the UI

🧱 Non-Functional Requirements

Category	Requirement
Reliability	Must not crash on malformed input
Latency	Must respond within 300ms
Scalability	Dockerized, CI/CD enabled
Security	Validate inputs, no shell evals
Maintainability	Testable, modular code

🛠️ Tools & Stack (Preview)

Phase	Tool Stack
Data preprocessing (clean, visualize, extract features)	pandas, numpy, matplotlib, seaborn, scikit-learn, scipy
Model Training	scikit-learn
Experiment Tracking	MLflow
API	FastAPI, Pydantic
Testing	pytest
Deployment	Docker, GitHub Actions, Google Cloud Platform
Monitoring	`logging` module, Google Cloud Platform
Frontend	Next.js, TailwindCSS, ShadcnUI, `lucide` for icons

✅ Data Collection & Preprocessing

📦 Dataset

Since we need support tickets with categories and priorities:

The Customer IT Support - Ticket Dataset has been selected. It provides email tickets categorized into departments and priority levels.
The dataset contains:
1. Category values: Specifies the department to which the email ticket is categorized. This helps in routing the ticket to the appropriate support team for resolution.
  - 💻 Technical Support: Technical issues and support requests.
  - 🈂️ Customer Service: Customer inquiries and service requests.
  - 💰 Billing and Payments: Billing issues and payment processing.
  - 🖥️ Product Support: Support for product-related issues.
  - 🌐 IT Support: Internal IT support and infrastructure issues.
  - 🔄 Returns and Exchanges: Product returns and exchanges.
  - 📞 Sales and Pre-Sales: Sales inquiries and pre-sales questions.
  - 🧑‍💻 Human Resources: Employee inquiries and HR-related issues.
  - ❌ Service Outages and Maintenance: Service interruptions and maintenance.
  - 📮 General Inquiry: General inquiries and information requests.
2. Priority values: Indicates the urgency and importance of the issue. Helps in managing the workflow by prioritizing tickets that need immediate attention.
  - 🟢 1 (Low): Non-urgent issues that do not require immediate attention. Examples: general inquiries, minor inconveniences, routine updates, and feature requests.
  - 🟠 2 (Medium): Moderately urgent issues that need timely resolution but are not critical. Examples: performance issues, intermittent errors, and detailed user questions.
  - 🔴 3 (Critical): Urgent issues that require immediate attention and quick resolution. Examples: system outages, security breaches, data loss, and major malfunctions.
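
The dataset encodes priority as 1 to 3, while the labels planned earlier are `low`, `medium`, and `high`. A minimal sketch of one way to bridge the two is shown here. The names `PRIORITY_LABELS` and `to_priority_label` are hypothetical, and treating 3 (Critical) as `high` is an assumption rather than something the dataset states.

```python
# Hypothetical mapping from the dataset's numeric priorities to the
# project's label names. Treating 3 (Critical) as "high" is an assumption.
PRIORITY_LABELS = {1: "low", 2: "medium", 3: "high"}


def to_priority_label(value) -> str:
    """Translate a numeric dataset priority into a label name."""
    try:
        return PRIORITY_LABELS[int(value)]
    except (KeyError, ValueError, TypeError) as exc:
        # Fail loudly on anything outside the three documented levels.
        raise ValueError(f"Unknown priority value: {value!r}") from exc


print([to_priority_label(p) for p in (1, 2, 3)])
```

Raising on unknown values, rather than defaulting to a label, keeps unexpected data from silently skewing the priority distribution.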

🔄 Preprocessing

Steps we have applied:

Step	Tool	Purpose
Text cleaning	spaCy	Remove non-alphanumeric characters
Feature extraction	sklearn.feature_extraction.text	Extract TF-IDF vectors
Label encoding	sklearn.preprocessing.OrdinalEncoder	Encode category
Train/val/test split	sklearn.model_selection.train_test_split	Create reusable split
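
The text cleaning step can be illustrated with a small sketch. It uses the standard-library `re` module instead of spaCy so that it runs without a language model, and it is not the project's actual cleaning code. The function name `clean_text`, the lowercasing, and the ASCII-only character class are all assumptions.

```python
import re

# Any run of characters that is not a lowercase letter, digit, or whitespace.
_NON_ALNUM = re.compile(r"[^a-z0-9\s]+")
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    lowered = text.lower()  # lowercasing is an assumption, not stated in the README
    stripped = _NON_ALNUM.sub(" ", lowered)
    return _WHITESPACE.sub(" ", stripped).strip()


print(clean_text("My card was charged TWICE!!  (invoice #4521)"))
# my card was charged twice invoice 4521
```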

📊 Exploratory Data Analysis (EDA)

We have performed:

Distribution of ticket categories
Distribution of priorities
Ticket count by category and priority
Distribution of ticket length
Word cloud

✅ Model Training & Evaluation

We aim to build a model that classifies:

Ticket category (bug, billing, feature_request, etc.)
Ticket priority (high, medium, low)

This is a multi-class classification problem for both labels.

🧪 Data Split

Split the dataset for reliable training and evaluation. 80% of the dataset is used for training and 20% is used for evaluation.
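
A minimal sketch of the 80/20 split with scikit-learn's `train_test_split` follows. The toy data, the `stratify` argument, and `random_state=42` are assumptions added for illustration; the README only states the 80/20 ratio.

```python
from sklearn.model_selection import train_test_split

# Toy stand-in data; the real project would load the ticket dataset here.
texts = [f"ticket {i}" for i in range(20)]
categories = ["billing", "bug"] * 10

# 80/20 split. stratify keeps class proportions equal in both parts,
# and random_state makes the split reproducible (both are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    texts, categories, test_size=0.2, stratify=categories, random_state=42
)

print(len(X_train), len(X_test))  # 16 4
```

Stratifying is worth considering when some departments are rare, since a purely random split could leave a class almost absent from the evaluation set.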

🧠 Modeling Strategy

We'll build two independent classifiers:

For category
For priority

🧱 Baseline Model (Traditional ML)

We have used TF-IDF + Random Forest to build a fast, reliable baseline.
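
The baseline can be sketched roughly as below: one TF-IDF vectorizer feeding two independent Random Forest classifiers. The toy tickets, variable names, and hyperparameters (`n_estimators=100`, `random_state=42`) are assumptions for illustration and are not taken from the project's training code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in tickets (hypothetical); the real project trains on the dataset.
texts = [
    "I was charged twice for my subscription",
    "Refund has not arrived on my card",
    "The app crashes when I open settings",
    "Login page throws an error after the update",
]
categories = ["billing", "billing", "bug", "bug"]
priorities = ["high", "medium", "high", "medium"]

# One shared TF-IDF vectorizer turns text into sparse feature vectors.
vectorizer = TfidfVectorizer()
features = vectorizer.fit_transform(texts)

# Two independent classifiers, one per label. Hyperparameters are assumptions.
category_model = RandomForestClassifier(n_estimators=100, random_state=42)
category_model.fit(features, categories)
priority_model = RandomForestClassifier(n_estimators=100, random_state=42)
priority_model.fit(features, priorities)

# New text must go through the same fitted vectorizer before prediction.
new_ticket = vectorizer.transform(["My card was charged twice"])
predicted_category = category_model.predict(new_ticket)[0]
predicted_priority = priority_model.predict(new_ticket)[0]
print(predicted_category, predicted_priority)
```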

📈 Metrics to Track

Task	Metrics	Threshold
Category	Accuracy, Precision, Recall, F1-Score, Confusion Matrix	≥ 80% Accuracy
Priority	Accuracy, Precision, Recall, F1-Score, Confusion Matrix	≥ 75% Precision
Inference Speed	Avg. response time (ms)	≤ 300 ms
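
These metrics can all be computed with `sklearn.metrics`. The sketch below uses made-up priority labels, not project results. Macro averaging is an assumption, since the README does not say which averaging strategy was used.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
    precision_score,
)

labels = ["high", "medium", "low"]
# Made-up labels for illustration only.
y_true = ["high", "high", "low", "medium", "low", "high"]
y_pred = ["high", "low", "low", "medium", "low", "high"]

accuracy = accuracy_score(y_true, y_pred)

# Macro average: every class counts equally regardless of its frequency.
macro_precision, macro_recall, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, average="macro", zero_division=0
)

# Precision for the "high" class alone, as targeted in the goals table.
high_precision = precision_score(
    y_true, y_pred, labels=["high"], average=None, zero_division=0
)[0]

# Rows are true labels, columns are predicted labels, in `labels` order.
matrix = confusion_matrix(y_true, y_pred, labels=labels)

print(f"accuracy={accuracy:.3f} macro_f1={macro_f1:.3f} high_precision={high_precision:.3f}")
print(matrix)
```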

📂 Outputs to Save

Artifact	Purpose
Trained models (.joblib)	Reuse in API
Vectorizer	Needed for consistent predictions
Category Encoder	Reuse in API
Classification reports	For model documentation
Confusion matrices	Visual diagnostics
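
Saving and reloading the model and vectorizer with `joblib` might look like the sketch below. The file names, the temporary directory, and the toy training data are hypothetical. The point is that the fitted vectorizer has to be saved with the model so that inference uses the same vocabulary as training.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy training data (hypothetical).
texts = ["charged twice on my card", "app crashes on login"]
labels = ["billing", "bug"]

vectorizer = TfidfVectorizer()
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(vectorizer.fit_transform(texts), labels)

# Hypothetical artifact location and file names.
artifact_dir = Path(tempfile.mkdtemp())
model_path = str(artifact_dir / "category_model.joblib")
vectorizer_path = str(artifact_dir / "vectorizer.joblib")
joblib.dump(model, model_path)
joblib.dump(vectorizer, vectorizer_path)

# Reload both artifacts, as the API would do at startup.
loaded_model = joblib.load(model_path)
loaded_vectorizer = joblib.load(vectorizer_path)

sample = ["refund for a double charge"]
original = model.predict(vectorizer.transform(sample))
reloaded = loaded_model.predict(loaded_vectorizer.transform(sample))
predictions_match = bool((original == reloaded).all())
print(predictions_match)
```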

🧪 Experiment Tracking

Used MLflow to track training experiments

Logged:

Model version
Hyperparameters
Accuracy/F1
Confusion matrix
Time taken

📅 Results

Model	Accuracy	Precision	Recall	F1-score
Category classifier	66.86%	0.6066	0.8214	0.6781
Priority classifier	70.65%	0.6498	0.7812	0.6673

✅ API Development & Integration

🎯 Goal

Build a REST API that takes a customer support ticket and returns:

The predicted category (e.g., bug, billing)
The predicted priority (e.g., high, medium, low)

🧰 Tech Stack

Component	Tool	Reason
Web Framework	FastAPI	Modern, async, Swagger support
Model Serving	joblib, Hugging Face Hub	Load the .gzip model (from Hugging Face Hub), vectorizer, and encoder
Input Validation	Pydantic	Ensure clean user inputs
Testing	pytest, httpx	API unit/integration testing

📁 Folder Structure

app/
├── api/
|      └── v1
|           └── endpoints.py # Routes
├── ml/
|      ├── encoders/ # Encoder file to encode category values
|      ├── models/ # Model files loaded during runtime
|      ├── vectorizers/ # Vectorizer file
|      ├── category_encoder.py # Category encoder class
|      └── models_classes.py # Model classes
├── utils/ # Helper scripts (data cleaning, model loading, etc.)
├── config.py # Global configuration file
└── main.py # FastAPI app

✅ Testing and Validation

This is where we test and validate that everything behaves as expected, under both normal and edge-case scenarios. It covers:

Unit Testing
- Test individual components
  - Data cleaning functions
  - Category encoders
  - Model classes
Integration Testing
- Test full API routes
  - /predict
  - /health
Error & Edge Case Testing
- Test valid, tricky and invalid inputs
  - Input within the allowed length
  - Input outside the allowed length
  - Invalid input field
  - Empty text field
  - Missing text field
  - Non-text input
  - List input
  - Input with special characters
  - Input with malformed json
  - Response time for maximum length input

🔄 Request-Response Sample

Method: POST, Endpoint: /predict

Request:

{
    "text": "My card was charged twice but the invoice shows one payment"
}

Response:

{
    "category": "billing",
    "priority": "high",
    "response_time": 0.245159
}

✅ Deployment

🎯 Goal

Make the application publicly accessible, scalable, and production-ready.

🚀 Deployment Checklist

Task	Tool/Service
✅ Containerization	Docker
✅ Environment Configuration	.env
✅ Model Hosting	Hugging Face Hub
✅ Web API Hosting	GCP (Cloud Run, for autoscaling, pay-per-use pricing, container support, and easy integration with the frontend)
✅ Logging & Monitoring	Logging module, GCP
✅ Health Check Endpoint	/health
✅ CI/CD Pipeline	GitHub Actions
✅ CORS	FastAPI.middleware.cors

✅ UI / Frontend

In this phase, a user-friendly frontend was developed to interact with the ML ticket triage API. The interface allows users to submit new support tickets and view the predicted department and priority in real time.

🛠️ Tech Stack

Layer	Technology
Structure	Next.js
Styling	Tailwind CSS
UI Components	Shadcn UI
Icons	Lucide
HTTP Requests	Native fetch API
Build Tool	Next.js built-in tooling (Webpack/Turbopack)
Deployment	Vercel

🧱 Features Implemented

📝 Ticket Submission Form
- A clean form UI built using Tailwind CSS. Includes input fields for describing the issue.
- A Submit button that triggers the API call.
🔁 Real-time department and priority prediction

On submit, the form:
- Sends a POST request to the deployed GCP Cloud Run API.
- Displays the predicted department and priority level returned by the model.
🚨 Error Handling

Basic error messaging for failed predictions (e.g., server unreachable or invalid input).
🧠 Dynamic UX Feedback
- Button shows loading state during API call.
- Input fields are disabled while waiting for the response to prevent multiple submissions.
Output Visualization
- The predictions are shown in an animated pie chart.

🎨 Design Highlights

Mobile-responsive layout using Tailwind's utility classes
Clean, modern typography
Form-centric UI focused on clarity

Documentation & Reporting

Task	Tool
Code Docs	Docstrings, Swagger(FastAPI)
Project Docs	README.md
Visuals	tldraw, matplotlib, seaborn

🔗 Live Site

You can interact with the deployed app here: https://ml-ticket-routing.vercel.app/

🧪 How to Use

Click the following image for a video demonstration:

Go to the live frontend
Fill out the form:
- Write a brief message describing the issue
- Or select from the sample messages
Click Submit

The app will display the predicted department and priority level based on your input.

📌 Conclusion

⚠️ Limitations

The model is trained on a limited dataset and may not generalize well to real-world tickets.

It doesn't handle:

Multi-label classification (only one department is predicted).
Complex sentences or industry-specific jargon.

It also lacks:

User authentication and rate limiting.
Spell-check or typo correction of the input.

🚫 Out of Scope

Ticket assignment to specific agents or queues.
Multi-language support
Real-time updates or web sockets
A full-fledged ticket management system (CRUD, status tracking, etc.)

✅ Possible Future Improvements

🧠 ML/AI Enhancements
- A stronger model than the TF-IDF + Random Forest baseline, to close the gap to the accuracy and precision targets.
- Support multi-label classification where tickets may belong to multiple departments.
- Integrate confidence scores and show them in the UI.
- Add explainability features like highlighting important words.
📦 API Enhancements
- Add rate limiting and error monitoring.

This project demonstrates a complete end-to-end ML application workflow — from model development to API deployment and UI integration. It simulates how machine learning can enhance support operations by automatically routing tickets, improving efficiency and customer experience.

Although simplified, the system sets the groundwork for a production-ready pipeline. With more real-world data and further enhancements, this can evolve into a robust smart ticket triaging solution.

🙏 Acknowledgements

This project uses the Email Ticket Classification Dataset made available by Tobias Bueck (Kaggle).

The dataset contains labeled customer support tickets across multiple categories and was instrumental in training the machine learning classifiers.
It is used strictly for educational and non-commercial purposes under the licence CC BY 4.0.
Full credit goes to the dataset creators for compiling and sharing this resource.
