A high-performance, asynchronous data extraction tool for ManyChat that processes user data in parallel while respecting API rate limits.
- ⚡ High Performance: Process up to 10 requests per second with async operations
- 📊 Smart Rate Limiting: Automatic handling of API rate limits
- 💾 Auto-Save Progress: Saves data after every batch
- 🔄 Resume Capability: Continue from where you left off
- 📝 Detailed Logging: Comprehensive logs with rich formatting
- 🎯 Error Handling: Robust error recovery with backup saves
- 📈 Live Progress: Real-time progress tracking with status bar
- 📋 Summary Stats: Detailed extraction statistics and error reporting
- Clone the repository:
git clone https://github.com/victoryudi/manychat-data-extractor.git
cd manychat-data-extractor
- Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate # For Unix/macOS
venv\Scripts\activate # For Windows
- Install dependencies:
pip install -r requirements.txt
- Create a .env file in the project root:
MANYCHAT_API_TOKEN=your_api_token_here
- Make sure your input CSV file has an email column containing the addresses you want to process.
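As a rough illustration of the input requirement, the sketch below reads the email column with the standard library and fails early if the column is missing. The function name load_emails, the lowercasing, and the de-duplication are assumptions made for this example, not necessarily what manychat_extractor.py does.

```python
import csv
import io


def load_emails(csv_file):
    """Return the non-empty, de-duplicated values of the 'email' column."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "email" not in reader.fieldnames:
        raise ValueError("input CSV must contain an 'email' column")
    seen = set()
    emails = []
    for row in reader:
        # Normalising case is an assumption; drop .lower() if case matters.
        email = (row.get("email") or "").strip().lower()
        if email and email not in seen:
            seen.add(email)
            emails.append(email)
    return emails


# In-memory stand-in for a file opened with open("users.csv", newline="")
sample = io.StringIO("name,email\nAna,ana@example.com\nBob,\nAna 2,ANA@example.com\n")
print(load_emails(sample))
```

Rows with an empty email cell are skipped rather than treated as errors, so a partially filled spreadsheet can still be processed.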
Run the extractor:
python manychat_extractor.py
The script will prompt you for:
- Input CSV file path
- Output CSV file path (optional)
- Whether to resume a previous run (if applicable)
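One plausible way to implement the resume prompt is to compare the input emails against the rows already written to a previous output file. The sketch below assumes the output CSV has an email column; the helper name remaining_emails is hypothetical and the real script may track progress differently.

```python
import csv
import io


def remaining_emails(all_emails, previous_output):
    """Return the emails that do not yet appear in a previous output CSV."""
    done = {
        row["email"]
        for row in csv.DictReader(previous_output)
        if row.get("email")
    }
    # Keep the original order so the run continues where it stopped.
    return [email for email in all_emails if email not in done]


# In-memory stand-in for a partially written output file
previous = io.StringIO(
    "email,manychat_id,processed_at\n"
    "a@example.com,123,2024-10-28T15:30:00\n"
)
print(remaining_emails(["a@example.com", "b@example.com"], previous))
```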
The extractor currently fetches these custom fields from ManyChat:
shopify_domain
telefone
To modify which fields are extracted:
- Update the ManyChatData class in manychat_extractor.py:
@dataclass
class ManyChatData:
    email: str
    manychat_id: Optional[str] = None
    # Add your custom fields here
    your_field_name: Optional[str] = None
    another_field: Optional[str] = None
    processed_at: Optional[str] = None
- Modify the field extraction in the fetch_manychat_data_async method:
manychat_data = ManyChatData(
    email=email,
    manychat_id=result_data.get('id'),
    # Add your custom field extraction here
    your_field_name=next((field['value'] for field in custom_fields
                          if field['name'] == 'your_field_name'), None),
    another_field=next((field['value'] for field in custom_fields
                        if field['name'] == 'another_field'), None),
    processed_at=datetime.now().isoformat()
)
# In ManyChatData class:
@dataclass
class ManyChatData:
    email: str
    manychat_id: Optional[str] = None
    phone: Optional[str] = None  # Add new field
    processed_at: Optional[str] = None

# In fetch_manychat_data_async method:
manychat_data = ManyChatData(
    email=email,
    manychat_id=result_data.get('id'),
    phone=next((field['value'] for field in custom_fields
                if field['name'] == 'phone'), None),  # Extract phone field
    processed_at=datetime.now().isoformat()
)
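If you extract several fields, the repeated next(...) expression can be factored into a small helper. This is an optional sketch: get_custom_field is a hypothetical name that does not exist in the script, and the shape of custom_fields (a list of dicts with name and value keys) is taken from the snippets above.

```python
from typing import Any, Dict, List, Optional


def get_custom_field(custom_fields: List[Dict[str, Any]], name: str) -> Optional[Any]:
    """Return the value of the first custom field called `name`, or None."""
    return next(
        (field.get("value") for field in custom_fields if field.get("name") == name),
        None,
    )


# Sample payload shaped like the custom_fields list used above
custom_fields = [
    {"name": "telefone", "value": "+1234567890"},
    {"name": "shopify_domain", "value": None},
]
print(get_custom_field(custom_fields, "telefone"))
print(get_custom_field(custom_fields, "missing_field"))
```

Using field.get(...) instead of field[...] means a malformed entry without a name or value key is skipped instead of raising KeyError.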
To see all available custom fields for a subscriber:
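- Add this debug log in fetch_manychat_data_async (it needs import json at the top of the file):
if data.get('status') == 'success':
    result_data = data.get('data', {})
    custom_fields = result_data.get('custom_fields', [])
    # Add this debug log. The list is built first because a multi-line
    # expression inside a single-quoted f-string fails before Python 3.12.
    available_fields = [
        {'name': field['name'], 'value': field['value']}
        for field in custom_fields
    ]
    log.debug("Available custom fields: %s", json.dumps(available_fields, indent=2))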
- Set logging level to DEBUG in your .env:
MANYCHAT_API_TOKEN=your_api_token_here
LOG_LEVEL=DEBUG
This will show all available custom fields in your logs for reference.
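For reference, a LOG_LEVEL setting like this is typically turned into a logging level along the following lines. This is a sketch under assumptions: it expects the variable to already be present in the process environment (however the script loads .env), and resolve_log_level and the logger name are hypothetical.

```python
import logging
import os


def resolve_log_level(default="INFO"):
    """Map the LOG_LEVEL environment variable to a numeric logging level."""
    name = os.getenv("LOG_LEVEL", default).upper()
    # getLevelName returns an int for known names and a string otherwise.
    level = logging.getLevelName(name)
    return level if isinstance(level, int) else logging.INFO


os.environ["LOG_LEVEL"] = "DEBUG"  # stand-in for the value read from .env
logging.basicConfig(level=resolve_log_level())
log = logging.getLogger("manychat_extractor")  # hypothetical logger name
log.debug("Debug logging is enabled")
```

Unknown values fall back to INFO here, so a typo in .env does not stop the run.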
The extractor retrieves the following data for each email:
{
  "email": "user@example.com",
  "manychat_id": "123456",
  "telefone": "+1234567890",
  "processed_at": "2024-10-28T15:30:00"
}
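To show how a record like this could end up as a row in the output CSV, here is a minimal sketch using dataclasses.asdict and csv.DictWriter. The write_batch helper is hypothetical, and the field list mirrors the sample record above rather than the script's actual class.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class ManyChatData:
    email: str
    manychat_id: Optional[str] = None
    telefone: Optional[str] = None
    processed_at: Optional[str] = None


def write_batch(records, out_file, write_header):
    """Append one batch of records to an open CSV file object."""
    writer = csv.DictWriter(out_file, fieldnames=[f.name for f in fields(ManyChatData)])
    if write_header:
        writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))  # None values are written as empty cells


buffer = io.StringIO()  # stand-in for open("output.csv", "a", newline="")
record = ManyChatData("user@example.com", "123456", "+1234567890", "2024-10-28T15:30:00")
write_batch([record], buffer, write_header=True)
print(buffer.getvalue())
```

Deriving the column names from the dataclass keeps the CSV header in sync when you add custom fields.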
ManyChat Data Extractor
==================================================
Enter the path to your input CSV file: users.csv
Using auto-generated output file: manychat_data_20241028_153000.csv
⠋ Processing emails... ━━━━━━━━━━━━━━━━━━━━━━ 45% 0:01:23
Extraction Summary
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Metric ┃ Value ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ Total Processed │ 100 │
│ Successful │ 98 │
│ Failed │ 0 │
│ Empty Responses │ 2 │
│ Rate Limited │ 1 │
│ Duration │ 0:02:15 │
└─────────────────┴─────────┘
The extractor uses:
- Async/await with aiohttp for concurrent requests
- Thread-safe rate limiting
- Automatic retry when the rate limit is exceeded
- Progress saving after each batch
- Comprehensive error handling
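The rate limiting and retry behaviour listed above can be sketched with asyncio alone. This is an illustrative outline, not the script's implementation: RateLimiter, RateLimited, and call_with_retry are hypothetical names, the HTTP call is replaced by a fake coroutine, and the limiter is only safe within a single event loop.

```python
import asyncio
import time


class RateLimited(Exception):
    """Hypothetical signal that the API reported a rate limit (e.g. HTTP 429)."""


class RateLimiter:
    """Spaces out calls so that at most `rate` of them start per second."""

    def __init__(self, rate):
        self._interval = 1.0 / rate
        self._next_slot = 0.0

    async def wait(self):
        # No await happens while the slot is reserved, so concurrent tasks
        # on the same event loop cannot claim the same slot.
        now = time.monotonic()
        delay = max(0.0, self._next_slot - now)
        self._next_slot = max(now, self._next_slot) + self._interval
        if delay:
            await asyncio.sleep(delay)


async def call_with_retry(limiter, request, max_attempts=3, backoff=1.0):
    """Run `request()` under the limiter, retrying when it raises RateLimited."""
    for attempt in range(1, max_attempts + 1):
        await limiter.wait()
        try:
            return await request()
        except RateLimited:
            if attempt == max_attempts:
                raise
            await asyncio.sleep(backoff * attempt)  # linear backoff


async def main():
    limiter = RateLimiter(rate=10)  # 10 requests per second, as in the feature list
    calls = {"count": 0}

    async def fake_request():
        # Stand-in for the real HTTP call: fails once, then succeeds.
        calls["count"] += 1
        if calls["count"] == 1:
            raise RateLimited()
        return {"status": "success"}

    return await call_with_retry(limiter, fake_request, backoff=0.01)


print(asyncio.run(main()))
```

Reserving time slots instead of counting requests per window keeps the traffic evenly spread, which tends to avoid bursts at the start of each second.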
manychat-data-extractor/
├── manychat_extractor.py # Main script
├── requirements.txt # Dependencies
├── .env # Configuration
└── logs/ # Log files
Contributions are welcome! Feel free to:
- Fork the repository
- Create a new branch (git checkout -b feature/improvement)
- Make your changes
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature/improvement)
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- ManyChat API for the platform
- aiohttp for async HTTP requests
- Rich for beautiful terminal formatting
Created with ❤️ by victoryudi