diff --git a/docs/hooks.md b/docs/hooks.md
new file mode 100644
index 0000000..e5075df
--- /dev/null
+++ b/docs/hooks.md
@@ -0,0 +1,422 @@
+# Hooks System Documentation
+
+This hooks system provides flexible control over API calls by allowing you to define hooks at both the **client level** (applied to all calls from a client) and **per-create level** (applied to specific calls). The system supports additive composition, meaning both types of hooks can work together to provide maximum flexibility for logging, metrics, debugging, and other cross-cutting concerns.
+
+## Overview
+
+The hooks system is built around two key concepts:
+
+- **Client-level hooks**: Applied to ALL calls made by a client
+- **Call-level hooks**: Applied to SPECIFIC calls  
+- **Additive composition**: When both are present, they combine (you get the union of both sets)
+
+### Key Components
+
+- **`BaseHook`**: Base class for creating custom hooks with priority system
+- **`HookContext`**: Contains request/response data and metadata
+- **`InstructorHookableClient`**: Client that supports both global and call-specific hooks
+- **Hook Phases**: Different execution points (PRE_REQUEST, POST_REQUEST, ON_ERROR, ON_RETRY)
+
+## Basic Usage
+
+### 1. Client-Level Hooks (Global)
+
+Client-level hooks are executed for every API call made by that client:
+
+```python
+from core.client import InstructorHookableClient
+from core.hooks import LoggingHook, MetricsHook, RateLimitHook
+
+# Define client-level hooks (applied to ALL calls)
+client_hooks = [
+    LoggingHook(name="global_logging"),
+    MetricsHook(name="global_metrics"), 
+    RateLimitHook(calls_per_second=10.0)
+]
+
+# Create client with hooks
+client = InstructorHookableClient(
+    provider="openai/gpt-4.1-nano",
+    async_client=True,
+    client_hooks=client_hooks
+)
+
+# All calls will use these hooks
+result1 = await client.create(messages=messages1)  # Uses client hooks
+result2 = await client.create(messages=messages2)  # Uses client hooks
+```
+
+### 2. Call-Level Hooks (Per-Create)
+
+Call-level hooks are applied to specific API calls:
+
+```python
+from core.hooks import RetryHook, ContentValidationHook
+
+# Define call-specific hooks
+call_hooks = [
+    RetryHook(max_retries=5),
+    ContentValidationHook(min_queries=4)
+]
+
+# Use in specific generation calls
+result = await generate_questions_pipeline(
+    conversation_hashes=hashes,
+    version="v3",
+    db_path=db_path,
+    call_hooks=call_hooks  # Applied only to this pipeline
+)
+```
+
+### 3. Additive Behavior (Combined Hooks)
+
+When you have both client-level and call-level hooks, they combine additively:
+
+```python
+# Client has these hooks:
+client_hooks = [
+    LoggingHook(priority=100),
+    MetricsHook(priority=90),
+    RateLimitHook(priority=200)
+]
+
+# Call adds these hooks:  
+call_hooks = [
+    RetryHook(priority=50),
+    ValidationHook(priority=60)
+]
+
+# Total hooks executed (in priority order):
+# 1. RateLimitHook (priority 200) - client-level
+# 2. LoggingHook (priority 100) - client-level  
+# 3. MetricsHook (priority 90) - client-level
+# 4. ValidationHook (priority 60) - call-level
+# 5. RetryHook (priority 50) - call-level
+```
+
+## Hook Phases
+
+Hooks can execute at different phases of an API call:
+
+- **`PRE_REQUEST`**: Before the API call is made
+- **`POST_REQUEST`**: After successful API response  
+- **`ON_ERROR`**: When an error occurs
+- **`ON_RETRY`**: When a retry is attempted
+
+```python
+class CustomHook(BaseHook):
+    def __init__(self):
+        super().__init__("custom_hook", priority=100)
+    
+    async def execute(self, context: HookContext) -> Optional[Dict[str, Any]]:
+        if context.phase == HookPhase.PRE_REQUEST:
+            # Pre-processing logic
+            pass
+        elif context.phase == HookPhase.POST_REQUEST:
+            # Post-processing logic
+            pass
+        elif context.phase == HookPhase.ON_ERROR:
+            # Error handling logic
+            pass
+        return None
+```
+
+## Built-in Hooks
+
+### LoggingHook
+Logs API requests and responses at configurable levels:
+
+```python
+LoggingHook(name="my_logger", log_level="INFO", priority=100)
+```
+
+### MetricsHook
+Collects metrics on API calls (count, duration, errors):
+
+```python
+metrics_hook = MetricsHook(name="my_metrics", priority=90)
+# Later: metrics = metrics_hook.get_metrics()
+```
+
+### RateLimitHook
+Implements rate limiting with configurable calls per second:
+
+```python
+RateLimitHook(calls_per_second=5.0, priority=200)
+```
+
+### RetryHook
+Implements retry logic with exponential backoff:
+
+```python
+RetryHook(max_retries=3, backoff_factor=1.5, priority=50)
+```
+
+## Custom Hooks
+
+Create custom hooks by extending `BaseHook`:
+
+```python
+from core.hooks import BaseHook, HookContext, HookPhase
+
+class CustomValidationHook(BaseHook):
+    def __init__(self, min_length: int = 100):
+        super().__init__("custom_validation", priority=70)
+        self.min_length = min_length
+    
+    async def execute(self, context: HookContext) -> Optional[Dict[str, Any]]:
+        if context.phase == HookPhase.POST_REQUEST:
+            # Validate response
+            if context.response_data:
+                result = context.response_data.get("result")
+                if hasattr(result, "queries"):
+                    # Custom validation logic here
+                    if len(result.queries) < self.min_length:
+                        return {"should_retry": True}
+        return None
+```
+
+## Integration with Generation Pipelines
+
+The generation pipelines support both client-level and call-level hooks:
+
+```python
+# Question generation with both hook types
+await generate_questions_pipeline(
+    conversation_hashes=hashes,
+    version="v3", 
+    db_path=db_path,
+    client_hooks=[LoggingHook(), MetricsHook()],     # Applied to all calls
+    call_hooks=[ValidationHook(), FilterHook()]      # Applied to each call
+)
+
+# Summary generation with different call hooks
+await generate_summaries_pipeline(
+    conversation_hashes=hashes,
+    version="v2",
+    db_path=db_path, 
+    client_hooks=[LoggingHook(), MetricsHook()],     # Same client hooks
+    call_hooks=[DifferentValidationHook()]           # Different call hooks
+)
+```
+
+## Integration with Braintrust-Style Evaluations
+
+This hooks system works seamlessly with existing Braintrust evaluation patterns:
+
+```python
+# Your existing task function pattern
+async def task(query, hooks):
+    # Process the query
+    result = await process_query(query)
+    
+    # Call hooks.meta() as before
+    hooks.meta(input=query, output=result)
+    
+    return result
+
+# Now you can also use the client pattern with additive hooks
+client = InstructorHookableClient(
+    provider="openai/gpt-4.1-nano",
+    async_client=True,
+    client_hooks=[LoggingHook(), MetricsHook()]  # Global hooks
+)
+
+# Add call-specific hooks for this evaluation
+call_hooks = [RetryHook(max_retries=5)]
+
+# Both client and call hooks will be executed
+result = await client.create(query, hooks=call_hooks)
+```
+
+## Factory Patterns
+
+Use factory functions for common hook combinations:
+
+```python
+from core.client import ClientFactory
+
+# Pre-configured clients
+basic_client = ClientFactory.create_basic_client()
+monitored_client = ClientFactory.create_monitored_client() 
+reliable_client = ClientFactory.create_reliable_client(calls_per_second=5.0)
+full_client = ClientFactory.create_full_featured_client()
+```
+
+## Hook Priority System
+
+Hooks execute in priority order (higher numbers execute first):
+
+- **`200+`**: Critical hooks (rate limiting, authentication)
+- **`100-199`**: Monitoring and logging hooks
+- **`50-99`**: Business logic hooks
+- **`0-49`**: Cleanup and finalization hooks
+
+```python
+# Example priority ordering
+hooks = [
+    RateLimitHook(priority=200),      # Executes first
+    LoggingHook(priority=100),        # Executes second
+    ValidationHook(priority=60),      # Executes third
+    CleanupHook(priority=10)          # Executes last
+]
+```
+
+## Advanced Features
+
+### Hook Communication
+Hooks can communicate via the context metadata:
+
+```python
+# Hook A sets metadata
+context.metadata["custom_flag"] = True
+
+# Hook B reads metadata  
+if context.metadata.get("custom_flag"):
+    # Do something special
+    pass
+```
+
+### Conditional Execution
+Control when hooks execute:
+
+```python
+def should_execute(self, context: HookContext) -> bool:
+    # Only execute for specific request types
+    return context.request_data.get("method") == "chat.completions.create"
+```
+
+### Flow Control
+Hooks can control execution flow:
+
+```python
+return {
+    "skip_remaining_hooks": True,  # Stop executing other hooks
+    "should_retry": True,          # Trigger retry logic
+    "request_data": {...}          # Modify request
+}
+```
+
+## Best Practices
+
+### 1. Use Appropriate Priorities
+- Critical functionality (auth, rate limiting): 200+
+- Monitoring and logging: 100-199  
+- Business logic: 50-99
+- Cleanup: 0-49
+
+### 2. Client vs Call Level Hooks
+- **Client-level**: Cross-cutting concerns (logging, metrics, rate limiting)
+- **Call-level**: Specific requirements (validation, filtering, special retry logic)
+
+### 3. Hook Naming
+Use descriptive names that indicate purpose and scope:
+
+```python
+LoggingHook("experiment_001_logging")
+MetricsHook("production_metrics") 
+RetryHook("high_priority_retry")
+```
+
+### 4. Error Handling
+Hooks should handle their own errors gracefully:
+
+```python
+async def execute(self, context: HookContext) -> Optional[Dict[str, Any]]:
+    try:
+        # Hook logic here
+        pass
+    except Exception as e:
+        logger.error(f"Hook {self.name} failed: {e}")
+        # Don't re-raise unless critical
+    return None
+```
+
+## Migration Guide
+
+### From Basic Instructor Client
+
+```python
+# Before
+client = instructor.from_provider("openai/gpt-4.1-nano", async_client=True)
+
+# After  
+client = InstructorHookableClient(
+    provider="openai/gpt-4.1-nano",
+    async_client=True,
+    client_hooks=[LoggingHook(), MetricsHook()]
+)
+```
+
+### Updating Generation Pipelines
+
+```python
+# Before
+results = await generate_questions_pipeline(
+    conversation_hashes=hashes,
+    version="v3",
+    db_path=db_path
+)
+
+# After
+results = await generate_questions_pipeline(
+    conversation_hashes=hashes, 
+    version="v3",
+    db_path=db_path,
+    client_hooks=[LoggingHook(), MetricsHook()],  # Applied to all calls
+    call_hooks=[ValidationHook()]                 # Applied to each call
+)
+```
+
+## Example: Complete Usage
+
+```python
+from core.client import InstructorHookableClient
+from core.hooks import LoggingHook, MetricsHook, RetryHook, RateLimitHook
+
+# Create client with global hooks
+client_hooks = [
+    LoggingHook("global_logging", priority=100),
+    MetricsHook("global_metrics", priority=90),
+    RateLimitHook(calls_per_second=10.0, priority=200)
+]
+
+client = InstructorHookableClient(
+    provider="openai/gpt-4.1-nano",
+    async_client=True,
+    client_hooks=client_hooks
+)
+
+# Create call-specific hooks for special cases
+special_call_hooks = [
+    RetryHook(max_retries=5, priority=50),
+    CustomValidationHook(min_length=100, priority=60)
+]
+
+# Regular call - only client hooks execute
+result1 = await client.create(messages=messages1)
+
+# Special call - both client and call hooks execute
+result2 = await client.create(messages=messages2, hooks=special_call_hooks)
+
+# Pipeline usage with both hook types
+results = await generate_questions_pipeline(
+    conversation_hashes=hashes,
+    version="v3",
+    db_path=db_path,
+    client_hooks=client_hooks,      # Applied to all calls in pipeline
+    call_hooks=special_call_hooks   # Applied additionally to each call
+)
+```
+
+## Benefits
+
+1. **Flexible**: Support both global and per-call hooks
+2. **Additive**: Combine different hook types seamlessly
+3. **Priority-based**: Control execution order with priority system
+4. **Phase-aware**: Execute hooks at different API call phases
+5. **Compatible**: Works with existing Braintrust evaluation patterns
+6. **Extensible**: Easy to add new hook types and functionality
+7. **Composable**: Hooks can be combined and reused across different contexts
+
+This additive hooks system provides the fine-grained control needed for complex evaluation and generation pipelines while maintaining clean separation of concerns between global and call-specific functionality.
\ No newline at end of file
diff --git a/mkdocs.yml b/mkdocs.yml
index e10f163..979cbc1 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -109,6 +109,8 @@ nav:
           - "Why Most Document Parsing Sucks (Adit, Reducto)": "talks/reducto-docs-adit.md"
           - "Encoder Stacking and Multi-Modal Retrieval (Daniel, Superlinked)": "talks/superlinked-encoder-stacking.md"
           - "How Extend Achieves 95%+ Document Automation (Eli Badgio)": "talks/extend-document-automation.md"
+  - "Technical Documentation":
+      - "Hooks System": "hooks.md"
 
 # --- Enhancements for mkdocs.yml ---
 # 1. Add recommended plugins for better UX, SEO, and maintainability