civictechdc
diff --git a/‎.ai-context/README.md‎
Lines changed: 144 additions & 0 deletions b/‎.ai-context/README.md‎
diff --git a/‎.ai-context/architecture-overview.md‎
Lines changed: 219 additions & 0 deletions b/‎.ai-context/architecture-overview.md‎
@@ -0,0 +1,144 @@
+# Mango Tango CLI - AI Context Documentation
+
+## Repository Overview
+
+**Mango Tango CLI** is a Python terminal-based tool for social media data
+analysis and visualization. It provides a modular, extensible architecture
+that separates core application logic from analysis modules, ensuring
+consistent UX while allowing easy contribution of new analyzers.
+
+### Purpose & Domain
+
+- **Social Media Analytics**: Hashtag analysis, n-gram analysis, temporal
+  patterns, user coordination
+- **Modular Architecture**: Clear separation between data import/export,
+  analysis, and presentation
+- **Interactive Workflows**: Terminal-based UI with web dashboard capabilities
+- **Extensible Design**: Plugin-like analyzer system for easy expansion
+
+### Tech Stack
+
+- **Core**: Python 3.12, Inquirer (CLI), TinyDB (metadata)
+- **Data**: Polars/Pandas, PyArrow, Parquet files
+- **Web**: Dash, Shiny for Python, Plotly
+- **Dev Tools**: Black, isort, pytest, PyInstaller
+
+## Semantic Code Structure
+
+### Entry Points
+
+- `mangotango.py` - Main application bootstrap
+- `python -m mangotango` - Standard execution command
+
+### Core Architecture (MVC-like)
+
+- **Application Layer** (`app/`): Workspace logic, analysis orchestration
+- **View Layer** (`components/`): Terminal UI components using inquirer
+- **Model Layer** (`storage/`): Data persistence, project/analysis models
+
+### Domain Separation
+
+1. **Core Domain**: Application, Terminal Components, Storage IO
+2. **Edge Domain**: Data import/export (`importing/`), preprocessing
+3. **Content Domain**: Analyzers (`analyzers/`), web presenters
+
+### Key Data Flow
+
+1. Import (CSV/Excel) → Parquet → Semantic preprocessing
+2. Primary Analysis → Secondary Analysis → Web Presentation
+3. Export → User-selected formats (XLSX, CSV, etc.)
+
+## Key Concepts
+
+### Analyzer System
+
+- **Primary Analyzers**: Core data processing (hashtags, ngrams, temporal)
+- **Secondary Analyzers**: User-friendly output transformation
+- **Web Presenters**: Interactive dashboards using Dash/Shiny
+- **Interface Pattern**: Declarative input/output schema definitions
+
+### Context Pattern
+
+Dependency injection through context objects:
+
+- `AppContext`: Application-wide dependencies
+- `ViewContext`: UI state and terminal context
+- `AnalysisContext`: Analysis execution environment
+- Analyzer contexts: File paths, preprocessing, app hooks
+
+### Data Semantics
+
+- Column semantic types guide the user in selecting an analysis
+- Preprocessing maps user data to expected analyzer inputs
+- Type-safe data models using Pydantic
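
The mapping step described above can be illustrated with a small sketch. The helper `apply_column_mapping` and the sample column names are hypothetical and are not taken from the Mango Tango codebase, which works on Parquet files rather than in-memory dictionaries; the sketch only shows the idea of renaming user columns to the names an analyzer expects.

```python
def apply_column_mapping(rows, mapping):
    """Rename user columns to analyzer column names and drop unmapped ones.

    `mapping` is {analyzer_column: user_column}. Hypothetical helper.
    """
    # Fail early if the user data lacks a column the mapping refers to.
    missing = [
        user_col for user_col in mapping.values()
        if rows and user_col not in rows[0]
    ]
    if missing:
        raise KeyError(f"columns not found in input: {missing}")
    return [
        {analyzer_col: row[user_col] for analyzer_col, user_col in mapping.items()}
        for row in rows
    ]


# Sample user data with arbitrary column names (illustrative only).
user_rows = [
    {"Handle": "@ana", "Tweet": "hello #dc", "Posted": "2024-01-01"},
    {"Handle": "@bo", "Tweet": "civic tech", "Posted": "2024-01-02"},
]
mapping = {"author_id": "Handle", "message_text": "Tweet"}
mapped_rows = apply_column_mapping(user_rows, mapping)
```

After this step an analyzer can rely on `author_id` and `message_text` regardless of how the user's file named those columns.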
+
+## Development Patterns
+
+### Code Organization
+
+- Domain-driven module structure
+- Interface-first analyzer design
+- Context-based dependency injection
+- Test co-location with implementation
+
+### Key Conventions
+
+- Black + isort formatting (enforced by pre-commit)
+- Type hints throughout (modern Python syntax)
+- Parquet for data persistence
+- Pydantic models for validation
+
+## Getting Started
+
+### For Development
+
+1. **Setup**: See @.ai-context/setup-guide.md
+2. **Architecture**: See @.ai-context/architecture-overview.md
+3. **Symbol Reference**: See @.ai-context/symbol-reference.md
+4. **Development Guide**: See @docs/dev-guide.md
+
+### For AI Assistants
+
+- **Claude Code users**: See @CLAUDE.md (includes Serena integration)
+- **Cursor users**: See @.cursorrules
+- **Deep semantic analysis**: Explore @.serena/memories/
+
+### Quick References
+
+- **Commands**: @.serena/memories/suggested_commands.md
+- **Style Guide**: @.serena/memories/code_style_conventions.md
+- **Task Checklist**: @.serena/memories/task_completion_checklist.md
+
+## External Dependencies
+
+### Data Processing
+
+- `polars` - Primary data processing library
+- `pandas` - Secondary support for Plotly integration
+- `pyarrow` - Parquet file format support
+
+### Web Framework
+
+- `dash` - Interactive web dashboards
+- `shiny` - Shiny for Python, for modern web UIs
+- `plotly` - Visualization library
+
+### CLI & Storage
+
+- `inquirer` - Interactive terminal prompts
+- `tinydb` - Lightweight JSON database
+- `platformdirs` - Cross-platform data directories
+
+### Development
+
+- `black` - Code formatter
+- `isort` - Import organizer
+- `pytest` - Testing framework
+- `pyinstaller` - Executable building
+
+## Project Status
+
+- **License**: PolyForm Noncommercial License 1.0.0
+- **Author**: CIB Mango Tree / Civic Tech DC
+- **Branch Strategy**: feature branches → develop → main
+- **CI/CD**: GitHub Actions for testing, formatting, builds
@@ -0,0 +1,219 @@
+# Architecture Overview
+
+## High-Level Component Diagram
+
+```mermaid
+flowchart TD
+    User[User] --> Terminal[Terminal Interface]
+    Terminal --> App[Application Layer]
+    App --> Storage[Storage Layer]
+
+    App --> Importers[Data Importers]
+    App --> Preprocessing[Semantic Preprocessor]
+    App --> Analyzers[Analyzer System]
+
+    Importers --> Parquet[(Parquet Files)]
+    Preprocessing --> Parquet
+    Analyzers --> Parquet
+
+    Analyzers --> Primary[Primary Analyzers]
+    Analyzers --> Secondary[Secondary Analyzers]
+    Analyzers --> WebPresenters[Web Presenters]
+
+    WebPresenters --> Dash[Dash Apps]
+    WebPresenters --> Shiny[Shiny Apps]
+
+    Storage --> TinyDB[(TinyDB)]
+    Storage --> FileSystem[(File System)]
+```
+
+## Core Abstractions
+
+### Application Layer (`app/`)
+
+Central orchestration and workspace management
+
+Key Classes:
+
+- `App` - Main application controller, orchestrates all operations
+- `AppContext` - Dependency injection container for application-wide services
+- `ProjectContext` - Project-specific operations and column mapping
+- `AnalysisContext` - Analysis execution environment and progress tracking
+- `AnalysisOutputContext` - Handles analysis result management
+- `AnalysisWebServerContext` - Web server lifecycle management
+- `SettingsContext` - Configuration and user preferences
+
+### View Layer (`components/`)
+
+Terminal UI components using inquirer
+
+Key Components:
+
+- `ViewContext` - UI state management and terminal context
+- `main_menu()` - Application entry point menu
+- `splash()` - Application branding and welcome
+- Menu flows: project selection, analysis creation, parameter customization
+- Server management: web server lifecycle, export workflows
+
+### Model Layer (`storage/`)
+
+Data persistence and state management
+
+Key Classes:
+
+- `Storage` - Main storage controller, manages projects and analyses
+- `ProjectModel` - Project metadata and configuration
+- `AnalysisModel` - Analysis metadata, parameters, and state
+- `SettingsModel` - User preferences and application settings
+- `FileSelectionState` - File picker state management
+- `TableStats` - Data statistics and preview information
+
+## Data Flow Architecture
+
+### Import → Analysis → Export Pipeline
+
+```mermaid
+sequenceDiagram
+    participant User
+    participant Terminal
+    participant App
+    participant Importer
+    participant Preprocessor
+    participant Analyzer
+    participant WebServer
+
+    User->>Terminal: Select data file
+    Terminal->>App: Create project
+    App->>Importer: Import CSV/Excel
+    Importer->>App: Parquet file path
+    App->>Preprocessor: Apply column semantics
+    Preprocessor->>App: Processed data path
+    User->>Terminal: Configure analysis
+    Terminal->>App: Run analysis
+    App->>Analyzer: Execute with context
+    Analyzer->>App: Analysis results
+    App->>WebServer: Start dashboard
+    WebServer->>User: Interactive visualization
+```
+
+### Context-Based Dependency Injection
+
+Each layer receives context objects containing exactly what it needs:
+
+```python
+# Simplified sketch: attribute annotations only, no behavior
+from pathlib import Path
+from typing import Callable
+
+import dash
+
+
+# Analyzer Context Pattern
+class AnalysisContext:
+    input_path: Path             # Input parquet file
+    output_path: Path            # Where to write results
+    preprocessing: Callable      # Column mapping function
+    progress_callback: Callable  # Progress reporting
+    parameters: dict             # User-configured parameters
+
+
+class AnalysisWebServerContext:
+    primary_output_path: Path
+    secondary_output_paths: list[Path]
+    dash_app: dash.Dash          # For dashboard creation
+    server_config: dict
+```
+
+## Core Domain Patterns
+
+### Analyzer Interface System
+
+Declarative analysis definition
+
+```python
+# interface.py
+interface = AnalyzerInterface(
+    input=AnalyzerInput(
+        columns=[
+            AnalyzerInputColumn(
+                name="author_id",
+                semantic_type=ColumnSemantic.USER_ID,
+                required=True
+            )
+        ]
+    ),
+    outputs=[
+        AnalyzerOutput(
+            name="hashtag_analysis",
+            columns=[...],
+            internal=False  # User-consumable
+        )
+    ],
+    params=[
+        AnalyzerParam(
+            name="time_window",
+            param_type=ParamType.TIME_BINNING,
+            default="1D"
+        )
+    ]
+)
+```
+
+### Three-Stage Analysis Pipeline
+
+1. **Primary Analyzers** - Raw data processing
+   - Input: Preprocessed parquet files
+   - Output: Normalized analysis results
+   - Examples: hashtag extraction, n-gram generation, temporal aggregation
+
+2. **Secondary Analyzers** - Result transformation
+   - Input: Primary analyzer outputs
+   - Output: User-friendly reports and summaries
+   - Examples: statistics calculation, trend analysis
+
+3. **Web Presenters** - Interactive visualization
+   - Input: Primary + secondary outputs
+   - Output: Dash/Shiny web applications
+   - Examples: interactive charts, data exploration interfaces
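
The three stages above can be sketched as plain functions chained together. This is a minimal illustration, not the project's analyzer API: the function names and the in-memory row format are assumptions, and a plain-text renderer stands in for a real Dash or Shiny presenter.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")


def primary_extract_hashtags(posts):
    """Primary stage: one normalized row per (author, hashtag) occurrence."""
    return [
        {"author_id": post["author_id"], "hashtag": tag.lower()}
        for post in posts
        for tag in HASHTAG_RE.findall(post["text"])
    ]


def secondary_summarize(primary_rows, top_n=3):
    """Secondary stage: user-friendly summary built from the primary output."""
    counts = Counter(row["hashtag"] for row in primary_rows)
    return [{"hashtag": tag, "count": n} for tag, n in counts.most_common(top_n)]


def present_as_text(summary_rows):
    """Stand-in for a web presenter: renders the summary as plain text."""
    return "\n".join(f"#{row['hashtag']}: {row['count']}" for row in summary_rows)


posts = [
    {"author_id": "u1", "text": "Join us #CivicTech #DC"},
    {"author_id": "u2", "text": "More #civictech tonight"},
]
primary = primary_extract_hashtags(posts)
summary = secondary_summarize(primary)
report = present_as_text(summary)
```

Each stage consumes only the previous stage's output, which is what lets secondary analyzers and presenters be added without touching the primary analysis.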
+
+## Integration Points
+
+### External Data Sources
+
+- **CSV Importer**: Handles delimiter detection, encoding issues
+- **Excel Importer**: Multi-sheet support, data type inference
+- **File System**: Project directory structure, workspace management
+
+### Web Framework Integration
+
+- **Dash Integration**: Plotly-based interactive dashboards
+- **Shiny Integration**: Modern Python web UI framework
+- **Server Management**: Background process handling, port management
+
+### Export Capabilities
+
+- **XLSX Export**: Formatted Excel files with multiple sheets
+- **CSV Export**: Standard comma-separated values
+- **Parquet Export**: Native format for data interchange
+
+## Key Architectural Decisions
+
+### Parquet-Centric Data Flow
+
+- All analysis data stored as Parquet files
+- Enables efficient columnar operations with Polars
+- Provides schema validation and compression
+- Facilitates data sharing between analysis stages
+
+### Context Pattern for Decoupling
+
+- Eliminates direct dependencies between layers
+- Enables testing with mock contexts
+- Allows analyzer development without application knowledge
+- Supports different execution environments (CLI, web, testing)
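
To make the testing benefit concrete, here is a hedged sketch of an analyzer exercised through a fake context. `FakeAnalysisContext`, its fields, and `count_posts_per_author` are hypothetical names invented for this example; the real context classes may expose a different surface.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAnalysisContext:
    """In-memory stand-in for an analysis context (hypothetical fields)."""
    rows: list[dict]
    parameters: dict = field(default_factory=dict)
    progress_log: list[float] = field(default_factory=list)
    output: list[dict] = field(default_factory=list)

    def report_progress(self, fraction: float) -> None:
        # Record progress instead of drawing a terminal progress bar.
        self.progress_log.append(fraction)


def count_posts_per_author(context) -> None:
    """Example analyzer: it touches nothing except the context it is given."""
    counts: dict[str, int] = {}
    total = len(context.rows)
    for index, row in enumerate(context.rows, start=1):
        counts[row["author_id"]] = counts.get(row["author_id"], 0) + 1
        context.report_progress(index / total)
    minimum = context.parameters.get("min_posts", 1)
    context.output.extend(
        {"author_id": author, "posts": n}
        for author, n in counts.items()
        if n >= minimum
    )


fake = FakeAnalysisContext(
    rows=[{"author_id": "a"}, {"author_id": "b"}, {"author_id": "a"}],
    parameters={"min_posts": 2},
)
count_posts_per_author(fake)
```

Because the analyzer never imports the application or storage layers, the same function could run against a real context in the CLI and against this fake in a unit test.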
+
+### Domain-Driven Module Organization
+
+- Clear boundaries between core, edge, and content domains
+- Enables independent development of analyzers
+- Supports plugin-like extensibility
+- Facilitates maintenance and testing
+
+### Semantic Type System
+
+- Guides users in column selection for analyses
+- Enables automatic data validation and preprocessing
+- Supports analyzer input requirements
+- Provides consistent UX across different data sources
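
As an illustration of how semantic types can guide column selection, the sketch below suggests candidate user columns for each analyzer input. The `Semantic` enum, its members, and `candidate_columns` are assumptions made for this example; the project's own `ColumnSemantic` definitions may look different.

```python
from enum import Enum


class Semantic(Enum):
    """Hypothetical semantic types; the project's real enum may differ."""
    USER_ID = "user_id"
    TEXT = "text"
    TIMESTAMP = "timestamp"


def candidate_columns(column_semantics, required):
    """For each required analyzer input, list user columns of the matching type."""
    return {
        input_name: [col for col, sem in column_semantics.items() if sem is wanted]
        for input_name, wanted in required.items()
    }


# Semantic types assumed to have been detected for the user's columns.
column_semantics = {
    "Handle": Semantic.USER_ID,
    "Tweet": Semantic.TEXT,
    "Bio": Semantic.TEXT,
    "Posted": Semantic.TIMESTAMP,
}
required_inputs = {"author_id": Semantic.USER_ID, "message_text": Semantic.TEXT}
suggestions = candidate_columns(column_semantics, required_inputs)
```

A terminal prompt could then offer only the suggested columns for each input, which keeps the selection step consistent across differently shaped datasets.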