@LifeJiggy

Summary

This PR adds a BatchProcessor utility class that groups multiple requests for efficient processing, helping developers reduce per-request overhead and make better use of the API.

Problem

When making multiple API calls, developers often need to batch requests for efficiency, but currently have no built-in way to do this. This leads to:

  • Inefficient API usage with many small requests
  • Manual batching logic scattered throughout applications
  • Difficulty managing timeouts and batch sizes
  • Poor performance for bulk operations

Solution

Add BatchProcessor class with:

  • Configurable batch size limits
  • Timeout-based automatic processing
  • Simple API for checking if requests can be made
  • Force processing capability for immediate batch handling
  • Callback-based batch processing
  • Thread-safe implementation using standard library

Key Features

  • Configurable Batch Size: Set maximum items per batch
  • Timeout Processing: Automatic processing after timeout
  • Force Processing: Immediate batch processing when needed
  • Callback Integration: Custom processing logic via callbacks
  • Thread Safe: Uses standard library only, no external dependencies
  • Simple API: Easy to integrate into existing workflows
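The feature set above can be sketched as a minimal, thread-safe implementation. This is an illustrative sketch based on the description only; internal names such as `_flush_locked` are assumptions, and the actual `gradient._utils.BatchProcessor` may differ:

```python
import threading

class BatchProcessor:
    """Illustrative sketch: buffers items and flushes them to a callback
    when the batch fills or a timeout elapses (stdlib only)."""

    def __init__(self, batch_size=10, timeout_seconds=5.0):
        self.batch_size = batch_size
        self.timeout_seconds = timeout_seconds
        self._items = []
        self._callback = None
        self._lock = threading.Lock()
        self._timer = None

    def set_callback(self, callback):
        # Callback receives the full list of buffered items.
        self._callback = callback

    def add(self, item):
        with self._lock:
            self._items.append(item)
            if len(self._items) >= self.batch_size:
                # Size limit reached: flush immediately.
                self._flush_locked()
            elif self._timer is None:
                # First item of a new batch: arm the timeout.
                self._timer = threading.Timer(
                    self.timeout_seconds, self.force_process
                )
                self._timer.daemon = True
                self._timer.start()

    def force_process(self):
        # Flush whatever is buffered right now.
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold self._lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self._items and self._callback:
            batch, self._items = self._items, []
            self._callback(batch)
```

The lock serializes `add` and timer-driven flushes, so the callback always sees each item exactly once even when the timeout fires concurrently with new additions.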

Benefits

  • Reduces API overhead for bulk operations
  • Improves request efficiency and throughput
  • Simplifies batch processing logic
  • Better resource utilization
  • Automatic timeout handling

Testing

Added comprehensive test suite covering:

  • Basic batch processing operations
  • Size-based automatic processing
  • Timeout-based processing
  • Force processing functionality
  • Multiple batch scenarios
  • Edge cases and error conditions

All tests pass with full coverage of batch processing functionality.

Usage Examples

from gradient._utils import BatchProcessor

# Create batch processor for API requests
processor = BatchProcessor(batch_size=10, timeout_seconds=5.0)

def process_batch(requests):
    # Process a batch of requests; `client` is an existing Gradient
    # client instance configured elsewhere.
    responses = []
    for req in requests:
        response = client.chat.completions.create(**req)
        responses.append(response)
    return responses

processor.set_callback(process_batch)

# Add requests to batch
for i in range(15):
    request = {
        "messages": [{"role": "user", "content": f"Question {i}"}],
        "model": "llama3.3-70b-instruct"
    }
    processor.add(request)
    # A full batch is processed automatically once 10 requests accumulate

# Flush the 5 requests still buffered after the loop
processor.force_process()
