Changes from 2 commits
48 changes: 47 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,52 @@
# Change Log

Resources for generating a changelog:
## [Unreleased]

### Added
- **AnyCost Stream API Compliance**: Updated `upload_to_anycost()` function to include required `month` parameter in ISO 8601 format (e.g., "2024-08")
- **Batch Processing**: Added support for uploading data to multiple months in a single session
- Single month: `2024-08`
- Month range: `2024-08:2024-10` (uploads to Aug, Sep, Oct)
- Comma-separated: `2024-08,2024-09,2024-11`
- Progress tracking and error resilience for batch uploads
- **Operation Type Support**: Added support for operation types when uploading to AnyCost Stream:
- `replace_drop` (default): Replace all existing data for the month
- `replace_hourly`: Replace data with overlapping hours
- `sum`: Append data to existing records
- **Rich Error Handling**: Helpful, specific error messages at every failure point
- Input validation with retry logic (3 attempts)
- Month format validation with specific error messages
- File processing errors with row-by-row reporting
- Network timeout and connection error handling
- API response validation and error reporting
- **Interactive Prompts**: Added user prompts for processing mode, month selection, and operation type during upload
- **Comprehensive Test Suite**: Added 20 unit tests covering all functions
- Tests for CSV processing, data transformation, and API upload functionality
- Tests for month range parsing and batch processing functionality
- Mocked external dependencies for reliable testing
- Located in `tests/` directory with pytest framework
- **Developer Experience**: Enhanced documentation and code comments for easy customization
- Step-by-step customization guide for different cloud providers
- Field mapping examples for AWS, Azure, and GCP
- Troubleshooting section with common issues and solutions
- Inline code comments marking customization points

### Changed
- Enhanced function documentation to explain all required and optional parameters for AnyCost Stream uploads
- Updated file header comments to document month and operation requirements
- Removed beta warning from README as AnyCost Stream is now generally available
- Improved README structure with Quick Start guide and detailed customization instructions

### Technical Details
- JSON payload now includes `month`, `operation`, and `data` fields as per AnyCost Stream API specification
- Added `parse_month_range()` function to handle different month input formats
- Batch processing makes sequential API calls with error handling and progress tracking
- Maintains backward compatibility while adding new required functionality
- All 20 tests pass successfully with proper mocking of external dependencies

---

## Resources for generating a changelog:

[skywinder/Github-Changelog-Generator](https://github.com/skywinder/Github-Changelog-Generator) - generates a full changelog that overwrites the existing CHANGELOG.md.

2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -1,6 +1,6 @@
# Contribution

Please read [CloudZero contribution guidelines](https://github.com/cloudzero/open-source-template/blob/master/GENERAL-CONTRIBUTING.md).
Please read [CloudZero contribution guidelines](https://github.com/Cloudzero/template-cloudzero-open-source/blob/main/GENERAL-CONTRIBUTING.md).

## Documentation

183 changes: 175 additions & 8 deletions README.md
@@ -8,7 +8,6 @@ This repository contains a Python script that serves as an example of an Adaptor

You can use this Adaptor as a model for structuring your own AnyCost Stream Adaptor, modifying it to fit your use case.

**Note:** The AnyCost Stream feature is in beta. Contact your CloudZero representative to request access.

## Table of Contents

@@ -64,25 +63,48 @@ An [AnyCost Stream connection](https://docs.cloudzero.com/docs/anycost-stream-ge

An [AnyCost Stream Adaptor](https://docs.cloudzero.com/docs/anycost-custom-adaptors) is the code that queries data from the provider, transforms it to fit the required format, and sends the transformed data to CloudZero.

### Quick Start for New Users

1. **Prerequisites**: Ensure you have Python 3.9+ installed and access to your cost data in CSV format
2. **Setup**: Clone this repository and install dependencies ([Installation](#installation))
3. **Prepare Data**: Format your CSV files or use the provided examples
4. **Run Script**: Execute with your data files and follow the interactive prompts
5. **Upload**: Choose single month or batch processing to upload to CloudZero

### Three Core Steps

An AnyCost Stream Adaptor typically performs three actions:

1. [Retrieve data from a cloud provider for a billing month.](#step-1-retrieve-cost-data-from-cloud-provider)
2. [Transform the data into the Common Bill Format (CBF).](#step-2-transform-cost-data-to-cbf)
3. [Send the CBF data to the CloudZero API.](#step-3-send-the-cbf-data-to-cloudzero)

You can write an Adaptor in any language, but this example uses Python.
You can write an Adaptor in any language, but this example uses Python and can be easily customized for different cloud providers.
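The three steps above can be condensed into a minimal Python pipeline. This is a sketch, not the repository's actual script; the function names and CSV column names (`resource_id`, `cost`) are illustrative:

```python
import csv

def retrieve(path):
    """Step 1: stand-in for provider retrieval -- here, just read a local CSV."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Step 2: map provider rows to CBF rows (column names are illustrative)."""
    return [
        {
            "lineitem/type": "Usage",
            "resource/id": row["resource_id"],
            "cost/cost": row["cost"],
        }
        for row in rows
    ]

def send(cbf_rows):
    """Step 3: POST the CBF rows to the CloudZero API (stubbed out here)."""
    print(f"would upload {len(cbf_rows)} CBF rows")
```

A real Adaptor replaces `retrieve` with provider API calls and `send` with an authenticated request to CloudZero.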

### Step 1: Retrieve Cost Data From Cloud Provider

Your Adaptor should start by retrieving cost data from your cloud provider. Follow your provider's instructions to retrieve the data you need. For example, this could involve sending requests to the provider's APIs to retrieve billing records for one or more accounts, or downloading a CSV of all cost data from the provider.
Your Adaptor should start by retrieving cost data from your cloud provider. This step varies by provider:

Because every provider makes its cost data available in a different way, the example Adaptor skips this step. Instead, we've provided you with three CSVs representing the data your Adaptor could retrieve from this step:
**Common Data Sources:**
- **AWS**: Cost and Usage Reports (CUR), billing CSV exports
- **Azure**: Cost Management exports, billing data APIs
- **GCP**: Billing export to BigQuery, Cloud Billing API
- **Other Clouds**: Billing APIs, cost management dashboards, CSV exports

- `cloud_usage.csv`: Data related to cloud resource usage
- `cloud_purchase_commitments.csv`: Data for discounts related to committed-use contracts
- `cloud_discounts.csv`: Data for other discounts received
**For This Example:**
Because every provider makes cost data available differently, this example uses three sample CSV files:

The dummy data is taken from the [CBF example](https://docs.cloudzero.com/docs/anycost-common-bill-format-cbf#examples) in the CloudZero documentation.
- `cloud_usage.csv`: Resource usage and compute costs
- `cloud_purchase_commitments.csv`: Reserved instances, savings plans
- `cloud_discounts.csv`: Volume discounts, credits, promotions

**Customizing for Your Provider:**
To adapt this script for your cloud provider:
1. Replace the CSV reading logic with API calls to your provider
2. Modify the data processing functions to match your provider's data structure
3. Update the column mappings in the transformation functions

See [Customization Guide](#customizing-for-different-cloud-providers) below for detailed instructions.

### Step 2: Transform Cost Data to CBF

@@ -143,6 +165,46 @@ After processing the data, the script will prompt you to upload the CBF data to
1. Enter `y` if you want to upload the data.
2. Provide your AnyCost Stream Connection ID.
3. Enter your CloudZero API key when prompted.
4. Choose processing mode:
- **Single month**: Upload data for one billing month
- **Batch processing**: Upload data for multiple months
5. Specify the billing month(s):
- **Single month**: `2024-08`
- **Month range**: `2024-08:2024-10` (uploads to Aug, Sep, Oct)
- **Comma-separated**: `2024-08,2024-09,2024-11`
6. Choose an operation type:
- **replace_drop** (default): Replace all existing data for the month
- **replace_hourly**: Replace data with overlapping hours
- **sum**: Append data to existing records
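The month inputs above could be expanded along these lines; this is a sketch of the range parsing, not necessarily the script's exact `parse_month_range()` implementation:

```python
from datetime import datetime

def parse_month_range(spec):
    """Expand '2024-08', '2024-08:2024-10', or '2024-08,2024-11' into a month list."""
    months = []
    for part in spec.split(","):
        part = part.strip()
        if ":" in part:
            start_s, end_s = part.split(":", 1)
            cur = datetime.strptime(start_s, "%Y-%m")
            end = datetime.strptime(end_s, "%Y-%m")
            while cur <= end:
                months.append(cur.strftime("%Y-%m"))
                # advance one month, rolling over the year when needed
                year_carry, month_zero = divmod(cur.month, 12)
                cur = cur.replace(year=cur.year + year_carry, month=month_zero + 1)
        else:
            # strptime doubles as YYYY-MM format validation
            datetime.strptime(part, "%Y-%m")
            months.append(part)
    return months
```

For example, `parse_month_range("2024-11:2025-02")` yields the four months from November 2024 through February 2025, crossing the year boundary.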

#### Batch Processing Benefits

- **Time-saving**: Upload historical data for multiple months in one session
- **Progress tracking**: See upload progress and success/failure status for each month
- **Error resilience**: Failed uploads for individual months won't stop the entire process
- **Flexible input**: Support for ranges, lists, or individual months
- **Input validation**: Comprehensive error checking with helpful suggestions
- **Retry logic**: Multiple attempts for invalid input with clear error messages

#### Error Handling

The script provides comprehensive error handling and validation:

**Month Format Validation**:
- Validates YYYY-MM format (e.g., "2024-08")
- Checks for valid date ranges in batch mode
- Provides specific error messages for invalid formats
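The YYYY-MM check could look like the following; the exact message wording is illustrative, not the script's actual output:

```python
import re

def validate_month(value):
    """Return (ok, message) for a YYYY-MM month string."""
    if not re.fullmatch(r"\d{4}-\d{2}", value):
        return False, f"'{value}' is not in YYYY-MM format (e.g. 2024-08)"
    month = int(value[5:7])
    if not 1 <= month <= 12:
        return False, f"month must be between 01 and 12, got '{value[5:7]}'"
    return True, "ok"
```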

**File Processing Errors**:
- Clear messages for missing or inaccessible CSV files
- Validation of required CSV columns
- Row-by-row error reporting with line numbers

**Network and API Errors**:
- Timeout handling (30-second limit per request)
- Connection error detection
- HTTP status code reporting with error details
- JSON parsing error handling
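The status-code and JSON-parsing behavior described above can be sketched as a pure helper; the error-body shape (a `message` field) is an assumption for illustration, not the CloudZero API's documented schema:

```python
import json

def summarize_response(status_code, body_text):
    """Classify an API response into a short, user-facing message."""
    if 200 <= status_code < 300:
        return "success"
    try:
        detail = json.loads(body_text)
        message = detail.get("message", body_text)
    except json.JSONDecodeError:
        # body was not valid JSON; fall back to the raw text
        message = body_text
    return f"HTTP {status_code}: {message}"
```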

### Viewing Results

@@ -152,6 +214,111 @@ Once uploaded, you can view the processed data within the CloudZero platform. Na

To use the `anycost_example.py` script to transform the cost data to CBF, run the command as described in the [Running the Script](#running-the-script) section.

## Testing

This repository includes a comprehensive test suite to ensure code quality and reliability.

### Running Tests

1. Create and activate a virtual environment:
```bash
python3 -m venv venv
source venv/bin/activate
```

2. Install test dependencies:
```bash
pip install -r tests/requirements-dev.txt
```

3. Run the test suite:
```bash
python -m pytest tests/ -v
```

### Test Coverage

The test suite includes 20 test cases covering:
- CSV reading and processing functions
- Data transformation for usage, commitments, and discounts
- CBF output generation
- AnyCost Stream API upload functionality with mocked requests
- All operation types (replace_drop, replace_hourly, sum)

All tests use proper mocking to isolate functionality and avoid external dependencies.
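As a flavor of that mocking approach, a thin upload wrapper can be exercised without any network access using `unittest.mock`. The `upload_month` wrapper below is hypothetical, not the repository's actual function:

```python
from unittest import mock

def upload_month(post, url, payload):
    """Hypothetical wrapper: `post` is the HTTP callable (e.g. requests.post)."""
    response = post(url, json=payload, timeout=30)
    return response.status_code

# Simulate a successful upload without touching the network
fake_post = mock.Mock(return_value=mock.Mock(status_code=200))
status = upload_month(fake_post, "https://api.example.com/upload", {"month": "2024-08"})
fake_post.assert_called_once()
```

Injecting the HTTP callable this way keeps the function trivially testable, which is the same isolation idea the test suite relies on.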

## Customizing for Different Cloud Providers

This script can be easily adapted for different cloud providers by modifying the data processing functions:

### Step-by-Step Customization

1. **Identify Your Data Source**
```python
# Replace CSV reading with API calls (provider_client is a placeholder)
def get_provider_data(start_date, end_date):
    """Fetch raw billing records from your provider for the given period."""
    # Example: response = provider_client.get_billing_data(start=start_date, end=end_date)
    # return response.data
    raise NotImplementedError("Replace with your provider's billing API call")
```

2. **Update Data Processing Functions**
```python
def process_usage_data(raw_data):
# Map your provider's fields to CBF format
cbf_rows = []
for item in raw_data:
cbf_rows.append({
"lineitem/type": "Usage",
"resource/service": item["service_name"], # Your field
"resource/id": item["resource_identifier"], # Your field
"time/usage_start": item["billing_period"], # Your field
"cost/cost": str(item["total_cost"]), # Your field
"cost/discounted_cost": str(item["net_cost"]), # Your field
})
return cbf_rows
```

3. **Common Provider Mappings**

**AWS CUR Fields:**
- `lineItem/LineItemType` β†’ `lineitem/type`
- `product/ProductName` β†’ `resource/service`
- `lineItem/ResourceId` β†’ `resource/id`
- `lineItem/UsageStartDate` β†’ `time/usage_start`
- `lineItem/UnblendedCost` β†’ `cost/cost`

**Azure Billing Fields:**
- `MeterCategory` β†’ `resource/service`
- `InstanceId` β†’ `resource/id`
- `UsageDateTime` β†’ `time/usage_start`
- `ExtendedCost` β†’ `cost/cost`

**GCP Billing Fields:**
- `service.description` β†’ `resource/service`
- `resource.name` β†’ `resource/id`
- `usage_start_time` β†’ `time/usage_start`
- `cost` β†’ `cost/cost`

4. **Test Your Changes**
```bash
python -m pytest tests/ -v
```

### Common Troubleshooting

**Issue: "Missing required columns in CSV"**
- Solution: Update the `required_columns` list in processing functions to match your data

**Issue: "Invalid cost/discount value"**
- Solution: Check your provider's number format (currency symbols, decimals)

**Issue: "Invalid month format"**
- Solution: Ensure dates are in YYYY-MM format, convert if needed

**Issue: "Connection timeout"**
- Solution: Increase timeout in upload function or implement retry logic
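The retry suggestion above could be sketched as a small wrapper; the backoff strategy and the injectable `call` parameter are illustrative, not part of the script:

```python
import time

def with_retries(call, attempts=3, delay=1.0):
    """Invoke `call()`, retrying on exception with a simple fixed delay."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as exc:
            if attempt == attempts:
                raise
            print(f"attempt {attempt} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)
```

Wrapping the upload call, e.g. `with_retries(lambda: upload(...))`, lets transient timeouts recover without restarting a whole batch.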

## Contributing

We appreciate feedback and contributions to this repo! Before you get started, see [this repo's contribution guide](CONTRIBUTING.md).