A comprehensive news bias detection system that uses advanced AI to analyze political bias in news articles. Built with Hugging Face transformers, LIME explainability, and Streamlit for an interactive web interface.
- Zero-shot Bias Classification: Uses Facebook's BART-large-mnli model for accurate bias detection
- Three Bias Categories: Left, Center, Right with confidence scores
- Explainable AI: LIME integration highlights influential words and phrases
- Dual Input Methods: Analyze text directly or fetch from URLs
- Interactive Web Interface: Clean, responsive Streamlit UI
- Source Analysis: Known news source bias mapping
- Real-time Analysis: Fast inference with caching
news-bias-detector/
├── data/
│ ├── source_bias.csv # Domain → bias mapping
│ └── sample_articles.csv # Sample test data
├── app/
│ ├── streamlit_app.py # Main Streamlit application
│ └── utils.py # Core functionality
├── models/ # Model artifacts (generated)
├── Dockerfile # Container configuration
├── requirements.txt # Python dependencies
└── README.md
- ML Framework: Hugging Face Transformers
- Model: facebook/bart-large-mnli (Zero-shot classification)
- Explainability: LIME (Local Interpretable Model-agnostic Explanations)
- Web Framework: Streamlit
- Text Processing: newspaper3k, BeautifulSoup4
- Visualization: Altair charts
- Containerization: Docker
git clone <your-repo-url>
cd NewsBiasAIpython -m venv venv
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activatepip install -r requirements.txtstreamlit run app/streamlit_app.pyThe app will open in your browser at http://localhost:8501
# Build the image
docker build -t news-bias-detector .
# Run the container
docker run -p 8501:8501 news-bias-detectorAccess the app at http://localhost:8501
-
Choose Input Method:
- URL: Paste a news article URL for automatic text extraction
- Text: Directly paste article content
-
Analyze: Click the analyze button to process the content
-
Review Results:
- Bias Label: Left, Center, or Right classification
- Confidence Score: Model certainty (0-100%)
- Probability Distribution: Breakdown across all categories
- Bias Score: Numeric scale (-1 to +1)
-
Explore Explanations:
- Generate LIME explanations to see influential words
- Adjust explanation parameters for different insights
The system uses zero-shot classification, meaning no fine-tuning on bias-specific data. Performance characteristics:
- Speed: ~1-3 seconds for classification
- Explanation Generation: 10-60 seconds (depending on settings)
- Accuracy: Competitive with supervised methods on balanced datasets
- Robustness: Handles various article lengths and styles
-
LIME Explanation:
num_features: Number of words to highlight (5-20)num_samples: Quality vs speed trade-off (100-1000)
-
Text Processing:
- Automatic truncation for very long articles
- Smart text cleaning and normalization
Edit data/source_bias.csv to add or modify known news source biases:
domain,bias
cnn.com,Left
reuters.com,Center
foxnews.com,RightInput: "The new environmental regulations will help combat climate change and protect future generations."
Output:
- Predicted Bias: Left (78% confidence)
- Key Words: "environmental regulations", "combat climate change", "protect"
- Bias Score: -0.42 (leans left)
- Zero-shot model may not capture nuanced political contexts
- Performance varies with article length and writing style
- LIME explanations are approximations, not definitive attributions
- Source bias mapping is manually curated and may be subjective
- Fine-tune model on labeled bias datasets
- Add sentence-level analysis
- Implement caching for faster repeated analysis
- Add batch processing capabilities
- Multi-language support
- Temporal bias analysis
- Source credibility scoring
- API endpoints for integration
- Advanced visualizations (bias trends, source clustering)
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Hugging Face: For the transformer models and pipeline infrastructure
- Facebook AI: For the BART-large-mnli model
- LIME Team: For the explainability framework
- Streamlit: For the excellent web framework
- Developer: Your Name
- Email: your.email@example.com
- LinkedIn: [Your LinkedIn Profile]
- GitHub: [Your GitHub Profile]
⭐ If you found this project helpful, please give it a star!