IP Intelligence Platform

Overview

The IP Intelligence Platform is a web-based application that enriches IP address data with domain information, company details, and geolocation data. It supports both batch processing through CSV file uploads and single IP lookups via a REST API.

System Architecture

The application follows a client-server architecture with a React frontend, Express backend, and PostgreSQL database for data persistence. The system is designed to handle large datasets efficiently with features like auto-saving and partial result downloads.

┌─────────────────────────────────────┐
│           Client Browser            │
└───────────────────┬─────────────────┘
                    │
                    ▼
┌─────────────────────────────────────┐
│            React Frontend           │
│  ┌─────────────┐    ┌─────────────┐ │
│  │   Upload    │    │   Results   │ │
│  │  Component  │    │  Component  │ │
│  └─────────────┘    └─────────────┘ │
└───────────────────┬─────────────────┘
                    │
                    ▼
┌─────────────────────────────────────┐
│           Express Backend           │
│  ┌─────────────┐    ┌─────────────┐ │
│  │  REST API   │    │IP Enrichment│ │
│  │  Endpoints  │◄───►   Service   │ │
│  └─────────────┘    └─────────────┘ │
│  ┌─────────────┐    ┌─────────────┐ │
│  │ CSV Parser  │    │ Job Manager │ │
│  └─────────────┘    └─────────────┘ │
└───────────────────┬─────────────────┘
                    │
                    ▼
┌─────────────────────────────────────┐
│         PostgreSQL Database         │
│  ┌─────────────┐    ┌─────────────┐ │
│  │    Users    │    │ Enrichment  │ │
│  │    Table    │    │  Jobs Table │ │
│  └─────────────┘    └─────────────┘ │
│  ┌───────────────────────────────┐  │
│  │    Enrichment Results Table   │  │
│  └───────────────────────────────┘  │
└─────────────────────────────────────┘

Core Components

Frontend - React-based SPA with progressive workflow steps
- Upload Section: Handles file validation and upload
- Config Section: Configures enrichment options
- Processing Section: Shows real-time progress
- Results Section: Displays and exports enriched data
Backend - Express server with RESTful API endpoints
- REST API Layer: Handles HTTP requests and responses
- IP Enrichment Service: Core intelligence engine
- CSV Processing: Specialized CSV handling
- Job Manager: Orchestrates long-running jobs
Database - PostgreSQL for storing job information and enrichment results
- Users: Authentication and user management
- IP Enrichment Jobs: Metadata about processing jobs
- IP Enrichment Results: Individual result records

Technical Stack

Frontend

React with TypeScript
TanStack Query for data fetching
ShadCN/UI components with Tailwind CSS for styling
Wouter for routing
React Hook Form for form handling

Backend

Node.js with Express
TypeScript for type safety
Multer for file uploads
Drizzle ORM with PostgreSQL

Data Processing

CSV parsing with papaparse
CSV generation with csv-stringify
Custom IP validation and enrichment logic

Database Schema

The application uses three main data tables:

users - Stores user information
- id (primary key)
- username
- password (hashed)
- email
- createdAt
ip_enrichment_jobs - Stores job metadata
- id (primary key)
- fileName
- originalFileName
- totalIPs
- processedIPs
- successfulIPs
- failedIPs
- filteredIPs
- status ('pending', 'processing', 'completed', 'failed')
- createdAt
- completedAt
- userId (foreign key to users)
- error
- ipColumnName
- includeGeolocation
- includeDomain
- includeCompany
- includeNetwork
- partialResultsAvailable
- processingCheckpoint
ip_enrichment_results - Stores individual results
- id (primary key)
- jobId (foreign key to ip_enrichment_jobs)
- rowIndex
- originalData (JSON)
- enrichmentData (JSON)
- processed
- success
- error
- createdAt
- updatedAt

Core Workflows

1. Batch IP Enrichment Process

The following diagram illustrates the end-to-end data flow for batch processing:

┌──────────┐     ┌──────────────┐     ┌───────────────┐
│  Upload  │     │  Configure   │     │   Process     │
│   CSV    │────►│  Enrichment  │────►│    Data       │
│          │     │   Options    │     │               │
└──────────┘     └──────────────┘     └───────┬───────┘
                                              │
                                              ▼
┌──────────┐     ┌──────────────┐     ┌───────────────┐
│ Download │     │   Preview    │     │ Store Results │
│ Results  │◄────│   Results    │◄────│  in Database  │
│          │     │              │     │               │
└──────────┘     └──────────────┘     └───────────────┘

Detailed process steps:

File Upload:
- User uploads a CSV file through the frontend
- File is validated for proper format
- Headers are extracted and presented to the user
- CSV row count is estimated for job sizing
Enrichment Configuration:
- User selects which column contains IP addresses
- User configures enrichment options (geolocation, domain, company info)
- Job is created with 'pending' status
Processing:
- Backend parses the CSV file row-by-row
- For each row, IP data is enriched through external IP intelligence services
- Common ISP IPs are identified and flagged (Telstra, Comcast, etc.)
- Results are saved to the database in batches of 100 records
- Real-time progress updates are sent to the frontend
Results Handling:
- Processed data is available for preview in the UI
- Full results can be downloaded as a CSV file
- Filtered version (excluding common ISP IPs) is available

2. Single IP Lookup API

The following diagram illustrates the API lookup flow:

┌───────────┐     ┌──────────────┐     ┌───────────────┐
│ API       │     │ Validate     │     │ Perform       │
│ Request   │────►│ IP Address   │────►│ IP Lookup     │
│           │     │              │     │               │
└───────────┘     └──────────────┘     └───────┬───────┘
                                               │
                                               ▼
                  ┌──────────────┐     ┌───────────────┐
                  │ Format JSON  │     │ Enrich with   │
                  │ Response     │◄────│ Additional    │
                  │              │     │ Data          │
                  └──────────────┘     └───────────────┘

Detailed process steps:

Request Handling:
- API endpoint /api/lookup accepts IP address parameter
- Validation checks for proper IP format
IP Enrichment:
- Geolocation data is fetched (country, city, coordinates)
- Domain name is resolved through reverse DNS lookup
- Organization data is determined (ISP, ASN, company)
- ISP filtering logic is applied to identify common consumer IPs
Response Formatting:
- JSON response with all enrichment data
- Error handling for invalid IPs or service issues

Example API Response:

{
  "ip": "8.8.8.8",
  "domain": "dns.google",
  "company": "Google LLC",
  "country": "United States",
  "city": "Mountain View",
  "region": "California",
  "latitude": 37.4056,
  "longitude": -122.0775,
  "isp": "Google LLC",
  "asn": "AS15169",
  "ispFiltered": false,
  "success": true
}

IP Enrichment Details

The IP enrichment service performs several key functions:

IP Validation - Ensures the input is a valid IPv4 or IPv6 address
Geolocation Lookup - Determines country, city, region, and coordinates
Domain Resolution - Performs reverse DNS lookup to find domain names
Organization Identification - Determines the company or ISP that owns the IP
ISP Filtering - Identifies and flags common consumer ISP IPs

ISP Filtering Logic

The application uses a combination of ISP name matching and network range checks to identify common consumer ISPs. This filtering helps focus analysis on business and organizational IPs rather than residential connections.

Common ISPs that are flagged include:

Comcast
Verizon
AT&T
Time Warner
Cox Communications
CenturyLink
Telstra
BT
Other major residential providers

Auto-Save Functionality

To ensure data resilience during long-running jobs, the application implements an auto-save mechanism:

Results are saved in batches of 100 records
A processing checkpoint is maintained in the job record
If processing fails, it can be resumed from the last checkpoint
Partial results are available for download at any time

API Integration

REST API Endpoints

POST /api/upload - Upload a CSV file for processing
POST /api/enrich - Create and start an enrichment job
GET /api/jobs/:id - Get job status and metadata
GET /api/jobs/:id/recent-results - Get results since a specific index
GET /api/jobs - List all enrichment jobs
GET /api/jobs/:id/preview - Get a paginated preview of results
GET /api/jobs/:id/download - Download complete or partial results
GET /api/lookup - Single IP lookup endpoint

BigQuery Integration

The IP Intelligence Platform can be integrated with BigQuery using the BigQuery Connection feature. This allows SQL queries to directly call the IP enrichment API and incorporate the results into analytical workflows.

BigQuery Remote Function Integration

┌─────────────┐     ┌────────────────┐     ┌────────────────┐
│  BigQuery   │     │   BigQuery     │     │  IP Enrichment │
│   Query     │────►│   Connection   │────►│     API        │
│             │     │                │     │                │
└─────────────┘     └────────────────┘     └────────┬───────┘
                                                    │
                    ┌────────────────┐     ┌────────▼───────┐
                    │  SQL Results   │     │    JSON        │
                    │  with IP Data  │◄────│   Response     │
                    │                │     │                │
                    └────────────────┘     └────────────────┘

To implement the integration:

Create a BigQuery Connection to Cloud Resource
Create a remote function that uses the connection to call the IP API
Invoke the function in SQL queries to enrich IP data

Example SQL for setting up the connection and function:

-- Step 1: Create a connection (one-time setup)
CREATE OR REPLACE CONNECTION `your_project.your_region.your_connection`
OPTIONS(
  location = 'your_region',
  connection_type = 'CLOUD_RESOURCE'
);

-- Step 2: Create the remote function using the connection
CREATE OR REPLACE FUNCTION `your_project.your_dataset.get_ip_intelligence`(ip STRING)
RETURNS STRUCT<
  country STRING,
  city STRING,
  region STRING,
  latitude FLOAT64,
  longitude FLOAT64,
  company STRING,
  isp STRING,
  asn STRING,
  ispFiltered BOOL,
  domain STRING,
  success BOOL
>
REMOTE WITH CONNECTION `your_project.your_region.your_connection`
OPTIONS(
  endpoint = 'https://ip-enrich.rittmananalytics.com/api/lookup',
  user_defined_context = [("ip", ip)],
  max_batching_rows = 50
);

Important Note: JavaScript UDFs in BigQuery cannot make external network requests due to sandbox restrictions. The BigQuery Connection approach is the only supported method for calling external APIs.

Performance Considerations

The application is designed for efficient handling of large datasets:

Stream Processing - CSV files are processed as streams rather than loaded entirely in memory
Batch Database Operations - Results are inserted in batches to reduce database load
Asynchronous Processing - Long-running jobs run asynchronously with progress tracking
Pagination - Result previews use pagination to handle large result sets
Partial Downloads - Support for downloading partial results during processing

Error Handling

The application implements comprehensive error handling:

Job-level Errors - Captured in the job record with meaningful messages
Row-level Errors - Individual row failures don't stop the entire job
API Error Responses - Structured error responses with appropriate HTTP status codes
Client-side Validation - Prevents invalid inputs before server requests
Service Resilience - External service failures are gracefully handled

Security Considerations

Input Validation - All user inputs are validated before processing
Rate Limiting - API endpoints are protected against abuse
Data Sanitization - User-provided data is sanitized before database storage
Error Handling - Errors are logged without exposing sensitive information

Deployment

The application is designed to be deployed to production using Replit Deployments, making it accessible at the domain: ip-enrich.rittmananalytics.com

Development and Extension

The codebase is structured to facilitate extension and modification:

Modular Architecture - Core functions are separated into distinct modules
Shared Type Definitions - Types are shared between frontend and backend
Storage Interface Abstraction - Database operations are abstracted through an interface
Configuration Options - Key parameters can be adjusted through environment variables

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
attached_assets		attached_assets
client		client
server		server
shared		shared
.gitignore		.gitignore
.replit		.replit
API-README.md		API-README.md
drizzle.config.ts		drizzle.config.ts
generated-icon.png		generated-icon.png
package-lock.json		package-lock.json
package.json		package.json
postcss.config.js		postcss.config.js
readme.md		readme.md
replit.nix		replit.nix
tailwind.config.ts		tailwind.config.ts
theme.json		theme.json
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

IP Intelligence Platform

Overview

System Architecture

Core Components

Technical Stack

Frontend

Backend

Data Processing

Database Schema

Core Workflows

1. Batch IP Enrichment Process

2. Single IP Lookup API

IP Enrichment Details

ISP Filtering Logic

Auto-Save Functionality

API Integration

REST API Endpoints

BigQuery Integration

BigQuery Remote Function Integration

Performance Considerations

Error Handling

Security Considerations

Deployment

Development and Extension

About

Uh oh!

Releases

Packages

Uh oh!

Languages

rittmananalytics/ip_intelligence_platform

Folders and files

Latest commit

History

Repository files navigation

IP Intelligence Platform

Overview

System Architecture

Core Components

Technical Stack

Frontend

Backend

Data Processing

Database Schema

Core Workflows

1. Batch IP Enrichment Process

2. Single IP Lookup API

IP Enrichment Details

ISP Filtering Logic

Auto-Save Functionality

API Integration

REST API Endpoints

BigQuery Integration

BigQuery Remote Function Integration

Performance Considerations

Error Handling

Security Considerations

Deployment

Development and Extension

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages