This project provides a framework for analyzing Ethereum blockchain transactions to identify end users (individuals) and distinguish them from institutional accounts, exchanges, and smart contracts. The system uses transaction graph analysis, machine learning, and heuristic-based approaches to classify addresses and predict user behaviors.
- Address Classification: Categorizes Ethereum addresses into different user types (individual, institutional, exchange, DeFi user, NFT trader)
- Transaction Graph Analysis: Builds and analyzes networks of transactions to reveal patterns
- Cluster Detection: Groups similar addresses based on transaction behaviors
- End User Identification: Applies multiple heuristics to identify individual end users with confidence scores
- Suspicious Activity Detection: Flags potentially suspicious transaction patterns
- Investment Analysis: Identifies where users have invested their crypto assets
- XGBoost Model Integration: Uses XGBoost for improved prediction accuracy
- Comprehensive User Profiling: Generates detailed user profiles with likelihood scores and confidence metrics
- Detailed Event Outputs: Provides transaction patterns and cluster analysis for each address
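As a rough illustration of the transaction graph idea, the sketch below builds a directed graph from (sender, receiver) pairs using plain dictionaries. The function names are hypothetical and not taken from enduser.py; the project lists networkx as a dependency, so its own graph code likely looks different.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Graph = Dict[str, Dict[str, int]]


def build_transaction_graph(transfers: Iterable[Tuple[str, str]]) -> Graph:
    """Return graph[sender][receiver] = number of transfers (hypothetical helper)."""
    graph = defaultdict(lambda: defaultdict(int))
    for sender, receiver in transfers:
        # Addresses are lowercased so differently cased copies count as one node.
        graph[sender.lower()][receiver.lower()] += 1
    return {sender: dict(receivers) for sender, receivers in graph.items()}


def out_degree(graph: Graph, address: str) -> int:
    """Number of distinct addresses this address has sent to."""
    return len(graph.get(address.lower(), {}))


def in_degree(graph: Graph, address: str) -> int:
    """Number of distinct addresses that have sent to this address."""
    address = address.lower()
    return sum(1 for receivers in graph.values() if address in receivers)
```

Degree counts like these are one simple way to feed graph structure into the later classification steps.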
The system uses several complementary approaches to identify end users:
End users (individuals) typically exhibit distinct transaction behaviors compared to institutions or exchanges:
- Lower Transaction Volume: End users typically make fewer transactions (under 50) than exchanges or institutions
- Fewer Unique Counterparties: Individual users interact with fewer addresses
- Transaction Frequency: End users often show less regular transaction patterns
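The volume and counterparty heuristics above could be computed along these lines. This is a minimal sketch, assuming each transaction is a dictionary with "from" and "to" address fields; the function name is hypothetical, and the 50-transaction cut-off is the figure quoted in the list above.

```python
from typing import Dict, List


def transaction_pattern_features(address: str, txs: List[Dict[str, str]]) -> Dict[str, object]:
    """Summarize volume and counterparty diversity for one address (hypothetical helper)."""
    address = address.lower()
    counterparties = set()
    for tx in txs:
        sender = tx["from"].lower()
        receiver = tx["to"].lower()
        # The counterparty is whichever side of the transfer is not this address.
        other = receiver if sender == address else sender
        if other and other != address:
            counterparties.add(other)
    tx_count = len(txs)
    return {
        "tx_count": tx_count,
        "unique_counterparties": len(counterparties),
        "low_volume": tx_count < 50,  # cut-off quoted in the text above
    }
```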
The system detects specific behavioral patterns associated with individual users:
- Gas Optimization: End users tend to be more sensitive to gas prices
- Small Token Diversity: Individual users typically hold fewer different tokens
- Exchange Interaction: End users usually interact with exchanges for fiat on/off ramping
- Time-Based Patterns: Individual users often show distinctive transaction timing (weekly patterns, working hours activity)
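The time-based pattern can be illustrated with a small feature extractor. This is a sketch under stated assumptions: timestamps are Unix seconds, "working hours" means 09:00 to 17:00 UTC on any day, and the function name is hypothetical. A user's real time zone is unknown, so any fixed window is only an approximation.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable


def timing_features(timestamps: Iterable[int]) -> Dict[str, float]:
    """Share of transactions in working hours and on weekdays (hypothetical helper)."""
    times = [datetime.fromtimestamp(int(ts), tz=timezone.utc) for ts in timestamps]
    if not times:
        return {"working_hours_share": 0.0, "weekday_share": 0.0}
    # Assumed window: 09:00 <= hour < 17:00 in UTC.
    in_hours = sum(1 for t in times if 9 <= t.hour < 17)
    # weekday() is 0 for Monday through 4 for Friday.
    on_weekday = sum(1 for t in times if t.weekday() < 5)
    return {
        "working_hours_share": in_hours / len(times),
        "weekday_share": on_weekday / len(times),
    }
```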
The system examines how addresses interact with known entity types:
- Exchange Interactions: End users typically interact with exchanges to buy/sell crypto
- DeFi Protocol Usage: End users engage with fewer DeFi protocols than institutional users
- NFT Marketplaces: Individual collectors exhibit specific patterns in NFT trading
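One way to picture the entity interaction analysis is a lookup against a table of labeled addresses. The addresses and labels below are placeholders invented for illustration, not real contracts, and the function name is hypothetical; an actual run would need a curated label list.

```python
from collections import Counter
from typing import Dict, Iterable

# Placeholder labels for illustration only; these are not real addresses.
KNOWN_ENTITIES: Dict[str, str] = {
    "0xexchange1": "exchange",
    "0xdefi1": "defi",
    "0xdefi2": "defi",
    "0xnft1": "nft_marketplace",
}


def entity_interaction_counts(
    counterparties: Iterable[str],
    labels: Dict[str, str] = KNOWN_ENTITIES,
) -> Dict[str, int]:
    """Count interactions per entity type; unlabeled addresses count as 'unknown'."""
    counts = Counter()
    for addr in counterparties:
        counts[labels.get(addr.lower(), "unknown")] += 1
    return dict(counts)
```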
Multiple ML models are used to improve classification accuracy:
- Isolation Forest: Detects anomalous addresses with unusual transaction patterns
- Random Forest Classifier: Predicts the most likely user category based on transaction features
- XGBoost Model: Provides enhanced prediction capabilities with gradient boosting
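To make the roles of the first two models concrete, here is a minimal scikit-learn sketch on synthetic data. The two features, the class means, and all hyperparameters are assumptions chosen for illustration; they are not the project's actual feature set or settings, and XGBoost is left out to keep the example short.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic features per address: [tx_count, unique_counterparties].
individuals = rng.normal(loc=[20, 8], scale=[5, 2], size=(100, 2))
exchanges = rng.normal(loc=[5000, 2000], scale=[500, 200], size=(100, 2))
X = np.vstack([individuals, exchanges])
y = np.array(["individual"] * 100 + ["exchange"] * 100)

# Supervised model: predicts a user category from the features.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Unsupervised model: flags addresses whose features look unusual.
iso = IsolationForest(random_state=0).fit(X)

sample = np.array([[25, 10], [4800, 2100]])
predicted = clf.predict(sample)
anomaly_flags = iso.predict(sample)  # +1 means inlier, -1 means outlier
```

The classifier needs labeled training addresses, while the Isolation Forest does not, which is why the two are complementary.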
A comprehensive scoring system calculates the probability (0-1) that an address belongs to an individual end user by combining multiple signals:
End User Likelihood = f(transaction_volume, network_diversity, exchange_interactions,
defi_interactions, contract_status, known_patterns)
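The form of f is not specified above. One simple possibility is a weighted sum of signals that have each been normalized to the range 0 to 1. The weights below, and the rule that a contract address scores 0, are assumptions made for this sketch rather than the project's actual formula.

```python
from typing import Dict

# Hypothetical weights, chosen only so that they sum to 1.
WEIGHTS: Dict[str, float] = {
    "transaction_volume": 0.25,
    "network_diversity": 0.20,
    "exchange_interactions": 0.20,
    "defi_interactions": 0.10,
    "known_patterns": 0.25,
}


def _clip01(value: float) -> float:
    """Clamp a value into the closed interval [0, 1]."""
    return min(max(value, 0.0), 1.0)


def end_user_likelihood(signals: Dict[str, float], is_contract: bool) -> float:
    """Combine normalized signals into a 0-1 score (illustrative only)."""
    if is_contract:
        # Assumption: a smart contract is never an individual end user.
        return 0.0
    score = sum(w * _clip01(signals.get(name, 0.0)) for name, w in WEIGHTS.items())
    return _clip01(score)
```

Missing signals default to 0 here, so an address with little data gets a low score rather than an error.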
Make sure Python 3.6 or newer is installed:
python --version
# On Windows:
python -m venv venv
venv\Scripts\activate
# On macOS/Linux:
python -m venv venv
source venv/bin/activate
If using Git:
git clone https://github.com/Chorko/End_User_of_Crypto_Transaction.git
cd End_User_of_Crypto_Transaction
Or simply download and extract the ZIP file to a folder.
Install the required dependencies:
pip install -r requirements.txt
The requirements.txt file includes:
requests
numpy
scikit-learn
python-dotenv
networkx
matplotlib
pandas
scipy
xgboost
- Go to https://etherscan.io/
- Register for an account if you don't have one
- Once logged in, go to https://etherscan.io/myapikey
- Click "Add" to create a new API key
- Give it a name (e.g., "Transaction Analysis")
- Copy your new API key
Create a file named .env in the project root directory:
ETHERSCAN_API_KEY=your_api_key_here
Replace "your_api_key_here" with the API key from Etherscan.
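A script can check the key early and fail with a clear message. The sketch below assumes the .env entries have already been copied into the environment (python-dotenv's `load_dotenv()` does this); the function name is hypothetical and may not match what enduser.py does.

```python
import os
from typing import Mapping


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Etherscan API key or raise a descriptive error (hypothetical helper)."""
    key = env.get("ETHERSCAN_API_KEY", "").strip()
    # Reject both a missing key and the unedited placeholder value.
    if not key or key == "your_api_key_here":
        raise RuntimeError("ETHERSCAN_API_KEY is missing; check the .env file")
    return key
```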
Ensure you have the main file:
- enduser.py
python enduser.py
If you see an error about an invalid API key:
- Double-check that the .env file is in the correct location
- Verify the API key is correct
- Check for any spaces or typos in the .env file
If you see errors like "No module named 'X'":
pip install X
If you see connection errors:
- Check your internet connection
- Etherscan enforces rate limits. The code has rate limiting built in, but you may need to wait and retry if you hit the API limits
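For readers curious what client-side rate limiting can look like, here is a minimal sketch that enforces a minimum interval between calls. The class name is hypothetical, the calls-per-second value is whatever your API plan allows, and this is not necessarily how enduser.py implements its limiter.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Space calls at least 1 / calls_per_second seconds apart (illustrative sketch)."""

    def __init__(
        self,
        calls_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 1.0 / calls_per_second
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        # Record when this call is considered to have happened.
        self._last_call = now + delay
        return delay
```

Calling `limiter.wait()` immediately before each API request keeps the request rate under the configured limit.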
When the code runs successfully, it:
- Fetches Ethereum addresses from different categories
- Builds a transaction graph
- Performs clustering analysis
- Trains machine learning models (including XGBoost)
- Prints detailed information about the analyzed addresses
You should see output showing:
- API key validation
- Address fetching
- Clustering results
- Machine learning model training
- Sample address analysis with likelihood categories (HIGH 🟢, MEDIUM 🟡, LOW 🔴)
The system generates detailed profiles for each analyzed address, including:
- User category and confidence score
- End user likelihood score with visual indicators (HIGH 🟢, MEDIUM 🟡, LOW 🔴)
- Profile ID and transaction patterns
- Investment destinations
- Suspicious activities (if any)
- Behavioral patterns
- Transaction statistics
- Cluster membership information
A summary report provides aggregated statistics on identified end users and saves the results to JSON files for further analysis.
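The sketch below shows one way such a summary could be assembled and written to JSON. The score cut-offs (0.7 and 0.4), the field names, and the function names are assumptions for illustration; the real report's thresholds and schema may differ.

```python
import json
from typing import Dict


def likelihood_label(score: float) -> str:
    """Map a 0-1 score to a category using hypothetical cut-offs."""
    if score >= 0.7:
        return "HIGH"
    if score >= 0.4:
        return "MEDIUM"
    return "LOW"


def build_summary(scores: Dict[str, float]) -> Dict[str, object]:
    """Aggregate per-address scores into counts per likelihood category."""
    labels = {addr: likelihood_label(score) for addr, score in scores.items()}
    counts = {"HIGH": 0, "MEDIUM": 0, "LOW": 0}
    for label in labels.values():
        counts[label] += 1
    return {"total_addresses": len(scores), "counts": counts, "labels": labels}


def save_summary(summary: Dict[str, object], path: str) -> None:
    """Write the summary as indented JSON so it is easy to inspect by hand."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(summary, fh, indent=2)
```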
- Integration with more blockchain data sources
- Transaction value analysis
- Gas price sensitivity analysis
- Temporal pattern analysis with transaction timestamps
- Cross-chain identity linking
- Web dashboard integration for interactive analysis