PDF Dictate

A PDF management and editing application with AI-powered assistance. Upload, edit, and fill PDF forms using voice transcription and AI-generated suggestions.

Features

📄 PDF Management

Upload PDFs: Drag and drop or browse to upload PDF documents
File Management: View all your PDFs with file sizes and modification dates
URL-safe Naming: Automatic renaming for web compatibility
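
The renaming rules are not spelled out here. As a minimal sketch of one plausible approach, the hypothetical helper `toUrlSafeName` below (an assumption, not taken from the project's code) lowercases the name, collapses runs of unsafe characters into hyphens, and keeps a `.pdf` extension:

```typescript
// Hypothetical helper; the project's actual renaming rules may differ.
function toUrlSafeName(fileName: string): string {
  // Strip a trailing ".pdf" (case-insensitive) so it can be re-added in lowercase.
  const base = fileName.replace(/\.pdf$/i, "");
  const safeBase = base
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of unsafe characters into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  // Fall back to a fixed name if nothing usable remains.
  return (safeBase || "document") + ".pdf";
}
```

In this sketch non-ASCII letters are treated as unsafe and replaced, which may be too aggressive for some file names.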

✏️ Advanced PDF Editing

Interactive PDF Editor: Powered by NutrientViewer (PSPDFKit) for professional PDF editing
Form Field Support: Edit and fill PDF forms seamlessly
Real-time Saving: Save your changes directly to the server

🤖 AI-Powered Assistance

Voice Transcription: Real-time speech-to-text transcription while editing
AI Suggestions: Intelligent form field suggestions based on context and voice input
Screen Context: AI analyzes your screen content for better suggestions
Smart Fill: Apply AI suggestions with keyboard shortcuts (Cmd/Ctrl + Enter)

🎯 User Experience

Responsive Design: Works seamlessly across different screen sizes
Real-time Status: Live indicators for recording, AI mode, and connection status
Smooth Transitions: Animated panel transitions for optimal workflow
Visual Feedback: Color-coded form field updates and status indicators

Getting Started

Prerequisites

Node.js 18+
npm, yarn, pnpm, or bun

Installation

Clone the repository:

git clone <your-repo-url>
cd pdf-dictate

Install dependencies:

npm install
# or
yarn install
# or
pnpm install

Set up your environment variables (if any API keys are required for AI features)

Run the development server:

npm run dev
# or
yarn dev
# or
pnpm dev

Open http://localhost:3000 in your browser

Usage

Basic PDF Management

Upload a PDF: Click "Upload PDF" on the homepage and select your file
View PDFs: Browse your uploaded PDFs in the table view
Edit a PDF: Click the "Edit" button next to any PDF

PDF Editing with AI

Open Editor: Click "Edit" on any PDF to open the interactive editor
Enable AI Mode: Click the "AI Mode" button to activate voice transcription and AI assistance
Grant Permissions: Allow screen sharing and microphone access when prompted
Start Editing: Click on any form field in the PDF
Voice Input: Speak naturally; your speech is transcribed in real time
AI Suggestions: The AI will provide intelligent suggestions for form fields
Apply Suggestions: Press Cmd + Enter (Mac) or Ctrl + Enter (Windows/Linux) to apply AI suggestions
Save Changes: Click "Save" to persist your changes

Keyboard Shortcuts

Cmd/Ctrl + Enter: Apply AI suggestion to current form field
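
A shortcut like this is typically detected in a `keydown` handler. The sketch below shows one way to write the check; `ShortcutEvent` and `isApplySuggestionShortcut` are hypothetical names, and the editor's real handler may be structured differently.

```typescript
// Minimal shape of the keyboard event fields this check needs.
// A DOM KeyboardEvent has all three properties.
interface ShortcutEvent {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
}

// Hypothetical helper: true for Cmd + Enter (macOS) or Ctrl + Enter (Windows/Linux).
function isApplySuggestionShortcut(event: ShortcutEvent): boolean {
  return event.key === "Enter" && (event.metaKey || event.ctrlKey);
}
```

In a browser, a `keydown` listener could call this helper with the event it receives and apply the pending suggestion when it returns `true`.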

Technology Stack

Frontend: Next.js 14 with React
PDF Rendering: NutrientViewer (PSPDFKit Web)
UI Components: Custom components with Tailwind CSS
Icons: Lucide React
Real-time Transcription: WebSocket-based transcription service
AI Integration: Custom AI suggestion API
File Management: Server-side PDF storage and processing

Project Structure

src/
├── app/
│   ├── page.tsx              # Homepage with PDF management
│   ├── edit/[name]/page.tsx  # PDF editor with AI features
│   ├── layout.tsx            # Root layout with NutrientViewer scripts
│   └── globals.css           # Global styles
├── components/
│   └── ui/                   # Reusable UI components
└── hooks/
    └── useRealtimeTranscription.ts  # Transcription hook

API Endpoints

GET /api/pdfs - List all uploaded PDFs
GET /api/pdfs/[name] - Serve specific PDF file
POST /api/upload - Upload new PDF
POST /api/pdfs/[name]/save - Save edited PDF
POST /api/suggestions - Get AI suggestions based on context
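
The endpoints above can be captured as a small set of route helpers. This is a sketch only: `ApiRoute`, `pdfPath`, and `routes` are hypothetical names, and because request and response bodies are not documented here, only the method and path are modeled.

```typescript
// Hypothetical route helpers for the endpoints listed above.
type HttpMethod = "GET" | "POST";

interface ApiRoute {
  method: HttpMethod;
  path: string;
}

// Encode the [name] segment so spaces or slashes in a file name
// cannot change the shape of the URL.
function pdfPath(name: string): string {
  return "/api/pdfs/" + encodeURIComponent(name);
}

const routes = {
  listPdfs: (): ApiRoute => ({ method: "GET", path: "/api/pdfs" }),
  getPdf: (name: string): ApiRoute => ({ method: "GET", path: pdfPath(name) }),
  uploadPdf: (): ApiRoute => ({ method: "POST", path: "/api/upload" }),
  savePdf: (name: string): ApiRoute => ({ method: "POST", path: pdfPath(name) + "/save" }),
  getSuggestions: (): ApiRoute => ({ method: "POST", path: "/api/suggestions" }),
};
```

A caller could hand a route to `fetch`, for example `fetch(route.path, { method: route.method })`, adding whatever body the server expects.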

Features in Detail

AI Mode

When AI Mode is activated:

Screen Capture: Captures your screen for visual context
Voice Recording: Continuous voice transcription
Context Analysis: AI analyzes both visual and audio context
Smart Suggestions: Provides relevant suggestions for form fields
Visual Feedback: Form fields flash green when filled with AI suggestions
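
Screen capture and microphone access map onto the standard browser calls `getDisplayMedia` and `getUserMedia`. The sketch below shows one way to request both; the function name `startAiModeCapture`, the small `*Like` interfaces, and the order of the two prompts are assumptions, not the project's code.

```typescript
// Minimal shapes of the browser objects used, so the sketch is self-contained.
interface TrackLike {
  stop(): void;
}
interface StreamLike {
  getTracks(): TrackLike[];
}
interface MediaDevicesLike {
  getDisplayMedia(constraints: { video: boolean }): Promise<StreamLike>;
  getUserMedia(constraints: { audio: boolean }): Promise<StreamLike>;
}
interface AiModeStreams {
  screen: StreamLike;
  microphone: StreamLike;
}

// Hypothetical helper: ask for screen sharing first, then the microphone.
async function startAiModeCapture(devices: MediaDevicesLike): Promise<AiModeStreams> {
  const screen = await devices.getDisplayMedia({ video: true });
  try {
    const microphone = await devices.getUserMedia({ audio: true });
    return { screen, microphone };
  } catch (error) {
    // If the microphone is denied, stop the screen capture so it is not left running.
    screen.getTracks().forEach((track) => track.stop());
    throw error;
  }
}
```

In a browser, `navigator.mediaDevices` can be passed as the `devices` argument. Both calls reject if the user declines the permission prompt.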

PDF Editor Features

Form Field Focus: Automatic detection when form fields are selected
Real-time Updates: Immediate visual feedback for changes
Professional Tools: Full PDF editing capabilities via NutrientViewer
Save Management: Robust saving with error handling and status indicators
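
Save status indicators of this kind are often driven by a small state machine. The following is a sketch under that assumption; the `SaveStatus` and `SaveEvent` types and the `nextSaveStatus` function are hypothetical and not taken from the editor's source.

```typescript
// Hypothetical model of the save indicator's states.
type SaveStatus =
  | { state: "idle" }
  | { state: "saving" }
  | { state: "saved" }
  | { state: "error"; message: string };

type SaveEvent =
  | { type: "start" }
  | { type: "success" }
  | { type: "failure"; message: string };

// Pure transition function: given the current status and an event,
// return the status the UI should show next.
function nextSaveStatus(current: SaveStatus, event: SaveEvent): SaveStatus {
  switch (event.type) {
    case "start":
      return { state: "saving" };
    case "success":
      // Ignore a stray success that arrives when no save is in flight.
      return current.state === "saving" ? { state: "saved" } : current;
    case "failure":
      return current.state === "saving"
        ? { state: "error", message: event.message }
        : current;
  }
}
```

Keeping the transitions in a pure function makes the error path easy to test without a server: dispatch `start`, then `failure`, and check that the message reaches the indicator.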

Browser Requirements

Modern browser with WebRTC support for screen sharing
Microphone access for voice transcription
Camera/screen sharing permissions for AI context

Contributing

Fork the repository
Create a feature branch
Make your changes
Test thoroughly
Submit a pull request

License

[Add your license information here]

Support

For issues and questions, please open an issue on GitHub.
