A modern web application that automatically fetches, analyzes, and summarizes research papers from arXiv. The system uses GPT-4 to generate detailed summaries and insights from academic papers across multiple scientific domains.
Automated Paper Processing:
- Daily fetching of new papers from arXiv
- Parallel processing of multiple categories
- Smart paper selection based on impact and novelty
- PDF text extraction and analysis
- Detailed summaries using GPT-4
Categories Covered:
- Machine Learning (cs.LG)
- Natural Language Processing (cs.CL)
- Computer Vision (cs.CV)
- Statistical ML (stat.ML)
- Quantum Physics (quant-ph)
- Nuclear Theory (nucl-th)
- Nuclear Experiment (nucl-ex)
- Materials Science (cond-mat.mtrl-sci)
- Galaxy Astrophysics (astro-ph.GA)
- Neurons & Cognition (q-bio.NC)
- Cryptography & Security (cs.CR)
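As a rough illustration of how a category from the list above could be turned into an arXiv API request, the sketch below builds a query URL using only the standard library. The helper name `build_arxiv_query_url` and the default of 20 results are assumptions made for illustration and are not taken from this project; `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` are parameters of the public arXiv API.

```python
from urllib.parse import urlencode

ARXIV_API_URL = "http://export.arxiv.org/api/query"

# Categories listed above, keyed by arXiv category code.
CATEGORIES = {
    "cs.LG": "Machine Learning",
    "cs.CL": "Natural Language Processing",
    "cs.CV": "Computer Vision",
    "stat.ML": "Statistical ML",
    "quant-ph": "Quantum Physics",
    "nucl-th": "Nuclear Theory",
    "nucl-ex": "Nuclear Experiment",
    "cond-mat.mtrl-sci": "Materials Science",
    "astro-ph.GA": "Galaxy Astrophysics",
    "q-bio.NC": "Neurons & Cognition",
    "cs.CR": "Cryptography & Security",
}


def build_arxiv_query_url(category: str, max_results: int = 20, start: int = 0) -> str:
    """Return an arXiv API URL for the newest submissions in one category.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if category not in CATEGORIES:
        raise ValueError(f"Unknown category: {category}")
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API_URL}?{urlencode(params)}"
```

Calling `build_arxiv_query_url("cs.LG")` yields a URL that can be fetched with any HTTP client; the response is an Atom feed that still has to be parsed.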
Modern Web Interface:
- Clean, responsive design
- Category-based browsing
- Paper details with comprehensive summaries
- Real-time generation status updates
- Pagination and sorting options
Backend:
- FastAPI (Python web framework)
- MongoDB (document storage)
- arXiv API (paper fetching)
- OpenAI GPT-4 (paper analysis)
- PyPDF2 (PDF processing)
- Async processing with asyncio
Frontend:
- Next.js 13+ (React framework)
- Tailwind CSS (styling)
- TypeScript
- Axios (API client)
Prerequisites:
# Install Python 3.8+ and Node.js 16+
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
Environment Variables: Create a .env file in the root directory:
OPENAI_API_KEY=your_openai_api_key
MONGODB_URI=your_mongodb_connection_string
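A minimal sketch of how the backend might check these two variables at startup. It assumes the values from .env have already been exported into the process environment (for example by a loader such as python-dotenv); the function name `load_settings` is hypothetical and not taken from this project.

```python
import os
from typing import Dict, Mapping, Optional

# The two variables named in the setup instructions above.
REQUIRED_VARS = ("OPENAI_API_KEY", "MONGODB_URI")


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, failing early if any are missing or empty.

    Hypothetical helper: passing `environ` explicitly makes it easy to test.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing at startup with the names of the missing variables tends to be easier to debug than a connection or authentication error later in the run.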
Backend Setup:
cd backend
pip install -r requirements.txt
python main.py
Frontend Setup:
cd frontend
npm install
npm run dev
- GET /api/generate: Trigger paper fetching and summarization
- GET /api/categories: List all available categories
- GET /api/category/{slug}: Get papers for a specific category
- GET /api/blog/{slug}: Get detailed paper information
- GET /api/papers: Get papers with filtering and pagination
- GET /api/generation-status: Check paper generation status
Query Parameters:
- date: Filter by date (YYYY-MM-DD)
- page: Page number for pagination
- per_page: Items per page
- sort_by: Sort field (published_date, title, generation_date)
- sort_order: Sort direction (asc, desc)
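To make the semantics of these parameters concrete, here is a small sketch that applies them to an in-memory list of papers. It is not the project's actual handler: the default values, the choice to match `date` against `published_date`, and the names `PaperQuery` and `apply_query` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

SORT_FIELDS = {"published_date", "title", "generation_date"}
SORT_ORDERS = {"asc", "desc"}


@dataclass
class PaperQuery:
    """Hypothetical container for the query parameters listed above."""
    date: Optional[str] = None          # YYYY-MM-DD
    page: int = 1                       # assumed to be 1-based
    per_page: int = 10                  # assumed default
    sort_by: str = "published_date"     # assumed default
    sort_order: str = "desc"            # assumed default


def apply_query(papers: List[Dict[str, str]], q: PaperQuery) -> List[Dict[str, str]]:
    """Filter, sort, and paginate a list of paper records."""
    if q.sort_by not in SORT_FIELDS:
        raise ValueError(f"Unsupported sort_by: {q.sort_by}")
    if q.sort_order not in SORT_ORDERS:
        raise ValueError(f"Unsupported sort_order: {q.sort_order}")
    if q.page < 1 or q.per_page < 1:
        raise ValueError("page and per_page must be >= 1")
    if q.date is not None:
        # Assumption: the date filter matches the publication date.
        papers = [p for p in papers if p["published_date"] == q.date]
    ordered = sorted(papers, key=lambda p: p[q.sort_by],
                     reverse=(q.sort_order == "desc"))
    start = (q.page - 1) * q.per_page
    return ordered[start:start + q.per_page]
```

Dates in YYYY-MM-DD form sort correctly as plain strings, which is why no date parsing is needed in this sketch.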
Papers are scored based on:
- Innovation and novelty
- Potential impact
- Technical significance
- Clarity of contribution
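The README does not say how the four criteria are combined, so the following is only one plausible sketch: each criterion gets a numeric score (a 0 to 10 scale is assumed here), the scores are averaged with equal weights, and the top-ranked papers are kept. The names `overall_score` and `select_top`, the criterion keys, and the weighting are all assumptions.

```python
from typing import Dict, List, Optional

# Keys standing in for the four criteria listed above (hypothetical names).
CRITERIA = ("novelty", "impact", "significance", "clarity")


def overall_score(scores: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted mean of the per-criterion scores (equal weights by default)."""
    weights = weights or {name: 1.0 for name in CRITERIA}
    total_weight = sum(weights[name] for name in CRITERIA)
    return sum(scores[name] * weights[name] for name in CRITERIA) / total_weight


def select_top(papers: List[dict], k: int) -> List[dict]:
    """Return the k papers with the highest overall score."""
    ranked = sorted(papers, key=lambda p: overall_score(p["scores"]), reverse=True)
    return ranked[:k]
```

Passing a custom `weights` dictionary lets one criterion, such as novelty, count more than the others without changing the rest of the code.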
Each paper summary includes:
- Main objective and motivation
- Key methodology
- Significant findings
- Technical details
- Potential impact and applications
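One way to ask a language model for a summary with these five sections is to build them into the prompt. The sketch below only constructs the prompt text and does not call the OpenAI API; the wording, the `max_chars` truncation limit, and the function name `build_summary_prompt` are assumptions rather than the project's actual prompt.

```python
# The five sections listed above.
SUMMARY_SECTIONS = (
    "Main objective and motivation",
    "Key methodology",
    "Significant findings",
    "Technical details",
    "Potential impact and applications",
)


def build_summary_prompt(title: str, paper_text: str, max_chars: int = 12000) -> str:
    """Build a prompt asking for one summary section per listed item.

    Hypothetical helper: the text is truncated to `max_chars` characters
    as a crude way to stay within a model's context limit.
    """
    numbered = "\n".join(
        f"{i}. {section}" for i, section in enumerate(SUMMARY_SECTIONS, start=1)
    )
    excerpt = paper_text[:max_chars]
    return (
        f'Summarize the research paper titled "{title}".\n'
        f"Cover each of the following sections:\n{numbered}\n\n"
        f"Paper text:\n{excerpt}"
    )
```

Keeping the section names in one tuple means the prompt and any code that later parses the response can share the same list.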
Performance optimizations:
- Parallel processing of categories
- Concurrent PDF analysis
- Async MongoDB operations
- Efficient caching
- No duplicate paper processing
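The combination of parallel category processing and duplicate avoidance can be sketched with `asyncio.gather` and a shared set of already-seen paper IDs. This is an illustration under stated assumptions: the real system presumably checks MongoDB rather than an in-memory set, `asyncio.sleep(0)` stands in for the actual download and summarization work, and the function names are hypothetical.

```python
import asyncio
from typing import Dict, List, Set


async def process_category(category: str, paper_ids: List[str],
                           seen: Set[str]) -> List[str]:
    """Process the papers of one category, skipping IDs already claimed."""
    processed = []
    for paper_id in paper_ids:
        # Check and add happen without an await in between, so no other
        # coroutine can claim the same ID in the meantime.
        if paper_id in seen:
            continue
        seen.add(paper_id)
        await asyncio.sleep(0)  # placeholder for PDF download and analysis
        processed.append(paper_id)
    return processed


async def process_all(batches: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Run all categories concurrently and collect what each one processed."""
    seen: Set[str] = set()
    results = await asyncio.gather(
        *(process_category(cat, ids, seen) for cat, ids in batches.items())
    )
    return dict(zip(batches.keys(), results))
```

A paper cross-listed in two categories is processed by whichever category task reaches it first, so each ID appears exactly once across the results.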
# Backend tests
cd backend
pytest
# Frontend tests
cd frontend
npm test
# Backend
black .
flake8
# Frontend
npm run lint
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
MIT License - see LICENSE file for details