Advanced AI-powered email security for phishing detection
PhishGuard AI is a phishing detection system that combines machine learning with psychological intent analysis. Unlike traditional keyword-based filters, it analyzes the psychological manipulation techniques used in phishing attacks, which is intended to make it more effective against novel threats.
- 🧠 Intent-Based Detection: Analyzes psychological manipulation techniques beyond simple keywords
- 🔗 Ensemble Learning: Combines multiple ML models (Naive Bayes, Random Forest, SVM) with soft voting
- 📊 Advanced Features: TF-IDF with n-grams (1,2,3,4) + psycholinguistic analysis
- ⚡ Real-time Analysis: Instant email security assessment with confidence scores
- 🎨 Modern UI: Beautiful, responsive web interface with detailed result visualization
- 🔒 Privacy-First: Email content processed locally, not stored
- Test Accuracy: 96.77%
- Training Samples: 3,865 emails across 3 classes
- False Negative Reduction: Confidence on critical cases improved from 0.26 to 0.94+
- Intent Signal Detection: 15+ psychological manipulation features
Email Text → TF-IDF Vectorization → Intent Feature Extraction → Ensemble Model → Prediction + Confidence
- AI-Generated Phishing: Machine-generated phishing attempts
- Human-Written Phishing: Real-world phishing campaigns
- Legitimate Emails: Normal business communications
- Authority Signals: Official/government language detection
- Pressure Tactics: Urgency and deadline analysis
- Scarcity Signals: Limited time offer detection
- URL Analysis: Suspicious link identification
- Readability Metrics: Text complexity analysis
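The signal categories above can be sketched as simple keyword and pattern counts. This is only an illustration: the term lists and the function names `count_terms` and `extract_intent_signals` are hypothetical, the project's real vocabulary and its 15+ features are not shown in this README, and the readability score is left out.

```python
import re

# Hypothetical keyword lists, chosen only for illustration; the project's
# actual vocabulary is not shown in this README.
AUTHORITY_TERMS = ("official", "government", "compliance", "security team")
PRESSURE_TERMS = ("urgent", "immediately", "deadline", "within 24 hours")
SCARCITY_TERMS = ("limited time", "last chance", "expires", "only today")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def count_terms(text, terms):
    """Count case-insensitive occurrences of each term in the text."""
    lowered = text.lower()
    return sum(lowered.count(term) for term in terms)


def extract_intent_signals(email_text):
    """Return signal counts named after the intent_analysis response fields."""
    authority = count_terms(email_text, AUTHORITY_TERMS)
    pressure = count_terms(email_text, PRESSURE_TERMS)
    scarcity = count_terms(email_text, SCARCITY_TERMS)
    return {
        "authority_signals": authority,
        "pressure_signals": pressure,
        "scarcity_signals": scarcity,
        "total_manipulation_signals": authority + pressure + scarcity,
        "urls_detected": bool(URL_PATTERN.search(email_text)),
    }
```

Plain substring counting is crude (it would match "expires" inside an ordinary renewal notice, for example), so counts like these are better treated as model inputs than as verdicts on their own.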
- Backend: Python, Flask, Scikit-learn
- Frontend: HTML5, CSS3, JavaScript, Bootstrap 5
- ML Libraries: Pandas, NumPy, SciPy
- Features: TF-IDF, N-grams, Ensemble Methods
- Python 3.8 or higher
- pip package manager
- Clone the repository:
  git clone https://github.com/EclipseAditya/phishguard-ai.git
  cd phishguard-ai
- Install dependencies:
  pip install -r requirements.txt
- Ensure model files exist. Make sure these files are in the models_and_dataset/ folder:
  - Advanced_enhanced_model.pkl
  - enhanced_tfidf_vectorizer.pkl
- Run the application:
  python app.py
- Open your browser and navigate to http://localhost:5000
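If the application fails to start, a missing model file is a likely cause. The check below is a small sketch, not part of the project: the helper name `missing_model_files` is hypothetical, and it assumes the script is run from the repository root so that `models_and_dataset` resolves correctly.

```python
from pathlib import Path

# File names as listed in the setup steps; the directory is resolved
# relative to wherever the script is run from (assumed: repository root).
MODEL_DIR = Path("models_and_dataset")
REQUIRED_FILES = ("Advanced_enhanced_model.pkl", "enhanced_tfidf_vectorizer.pkl")


def missing_model_files(model_dir=MODEL_DIR):
    """Return the required file names that are absent from model_dir."""
    return [name for name in REQUIRED_FILES if not (Path(model_dir) / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("All model files found.")
```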
- Open PhishGuard AI in your browser
- Paste email content into the analysis textarea
- Click "Analyze Email Security"
- View detailed results, including:
  - Classification (AI Phishing / Human Phishing / Legitimate)
  - Confidence scores for each class
  - Intent analysis with manipulation signals
  - Risk level assessment
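Besides the web form, the same analysis can be requested from a script through the project's JSON endpoint (`POST /api/analyze` with an `email_text` field). The sketch below uses only the Python standard library and assumes the server is running locally on port 5000; the helper names `build_request` and `analyze_email` are hypothetical.

```python
import json
import urllib.request

# Assumes the Flask app is running locally on port 5000.
API_URL = "http://localhost:5000/api/analyze"


def build_request(email_text, url=API_URL):
    """Build the JSON POST request for the /api/analyze endpoint."""
    payload = json.dumps({"email_text": email_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze_email(email_text, url=API_URL, timeout=10):
    """Send the email text to the server and return the parsed JSON response."""
    request = build_request(email_text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `analyze_email("...")` should return a dictionary containing fields such as `prediction`, `confidence_scores`, and `risk_level`.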
POST /api/analyze
Content-Type: application/json
{
  "email_text": "Your email content here..."
}
Response:
{
  "prediction": "ai_phishing",
  "confidence_scores": {
    "ai_phishing": 0.89,
    "human_phishing": 0.08,
    "legitimate": 0.03
  },
  "intent_analysis": {
    "authority_signals": 2,
    "pressure_signals": 3,
    "scarcity_signals": 1,
    "total_manipulation_signals": 6,
    "readability_score": 65.2,
    "urls_detected": true
  },
  "risk_level": "HIGH",
  "risk_color": "danger"
}
- TF-IDF Vectorization: Captures word importance with n-gram context
- Psycholinguistic Features: Authority, pressure, scarcity signal detection
- URL Analysis: Link structure and suspicious domain detection
- Readability Metrics: Text complexity and manipulation indicators
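To make "n-gram context" concrete, the sketch below lists the word n-grams of length 1 to 4 that a vectorizer would weight. It is a simplified illustration with a hypothetical helper name and plain whitespace tokenization; with scikit-learn's `TfidfVectorizer` the equivalent setting would be `ngram_range=(1, 4)`, though the project's actual vectorizer configuration is not shown here.

```python
def word_ngrams(text, min_n=1, max_n=4):
    """Return all word n-grams of length min_n..max_n, grouped by n."""
    tokens = text.lower().split()  # simplification: whitespace tokenization
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of n tokens across the text.
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams
```

Longer n-grams let the model see phrases such as "verify your account now" as a single feature, rather than four unrelated words.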
- Base Models: MultinomialNB, RandomForest, LinearSVM
- Ensemble Method: Soft voting for probability averaging
- Cross-Validation: Stratified K-fold for robust evaluation
- Feature Combination: TF-IDF + Intent features via sparse matrix concatenation
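Soft voting averages each model's class probabilities and picks the class with the highest mean, rather than counting one vote per model. The following is a plain-Python sketch of that idea with a hypothetical `soft_vote` helper and made-up probabilities; it is not the project's training code.

```python
CLASSES = ("ai_phishing", "human_phishing", "legitimate")


def soft_vote(model_probabilities, weights=None):
    """Average per-class probabilities across models and pick the top class.

    model_probabilities: list of dicts mapping class name -> probability,
    one dict per base model.
    """
    if weights is None:
        weights = [1.0] * len(model_probabilities)
    total_weight = sum(weights)
    averaged = {
        cls: sum(w * probs[cls] for w, probs in zip(weights, model_probabilities)) / total_weight
        for cls in CLASSES
    }
    prediction = max(averaged, key=averaged.get)
    return prediction, averaged


# Illustrative probabilities only: two models lean slightly toward
# human_phishing, one is very confident in ai_phishing.
example = [
    {"ai_phishing": 0.90, "human_phishing": 0.05, "legitimate": 0.05},
    {"ai_phishing": 0.40, "human_phishing": 0.45, "legitimate": 0.15},
    {"ai_phishing": 0.40, "human_phishing": 0.45, "legitimate": 0.15},
]
label, scores = soft_vote(example)
```

In this example a majority (hard) vote would choose `human_phishing`, while the averaged probabilities favor `ai_phishing`. In scikit-learn the same scheme is available as `VotingClassifier(voting="soft")`, which requires every base estimator to expose `predict_proba`.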
PhishGuard AI prioritizes false negative reduction over raw accuracy, because missing a phishing email poses a greater security risk than flagging a legitimate one.
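One common way to act on this priority is to flag an email when the combined phishing probability crosses a deliberately low threshold, even if "legitimate" is the single most likely class. The sketch below shows that idea only; the `assess_risk` function and the 0.3 threshold are assumptions, and the project's actual risk rule is not documented here.

```python
PHISHING_CLASSES = ("ai_phishing", "human_phishing")


def assess_risk(confidence_scores, phishing_threshold=0.3):
    """Return "HIGH" when combined phishing probability reaches the threshold.

    phishing_threshold is a hypothetical value; lowering it trades more
    false positives for fewer false negatives.
    """
    phishing_probability = sum(confidence_scores.get(c, 0.0) for c in PHISHING_CLASSES)
    return "HIGH" if phishing_probability >= phishing_threshold else "LOW"
```

With this rule, scores of 0.20 (AI phishing), 0.15 (human phishing), and 0.65 (legitimate) would still be flagged, since the two phishing classes together reach the threshold.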
The web interface provides:
- Risk Level Badges: Visual risk assessment (HIGH/LOW)
- Confidence Bars: Animated confidence scores for each class
- Intent Signals: Color-coded manipulation technique detection
- Warning Alerts: Highlighted threats with signal counts
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
Aditya Pandey
AI/ML Developer & Security Researcher
Passionate about applying artificial intelligence to cybersecurity challenges. Specialized in machine learning, natural language processing, and threat detection systems.
- Advanced machine learning techniques for email security
- Modern web development practices for user experience
- Cybersecurity research community for threat intelligence
- Open source ML libraries that made this possible
PhishGuard AI - Protecting your digital communications with advanced artificial intelligence