155 changes: 155 additions & 0 deletions ALTERNATIVE_SOLUTION.md
@@ -0,0 +1,155 @@
# Alternative Solution: Single JSON Field for SAR Metadata

## Problem with Current Approach
Adding 6 new columns (phonecode, currency, currency_name, currency_symbol, emoji, emojiU) to the states table for just 2 Special Administrative Regions creates unnecessary schema complexity.

## Proposed Alternative: Single JSON Field

### Schema Change
Instead of 6 columns, add **one** column:

```sql
-- Instead of:
phonecode VARCHAR(255)
currency VARCHAR(255)
currency_name VARCHAR(255)
currency_symbol VARCHAR(255)
emoji VARCHAR(191)
emojiU VARCHAR(191)

-- Use:
sar_metadata JSON DEFAULT NULL
```

### Data Structure

**Hong Kong SAR:**
```json
{
"id": 2267,
"name": "Hong Kong SAR",
"country_id": 45,
"country_code": "CN",
"type": "special administrative region",
"sar_metadata": {
"phonecode": "852",
"currency": "HKD",
"currency_name": "Hong Kong dollar",
"currency_symbol": "$",
"emoji": "🇭🇰",
"emojiU": "U+1F1ED U+1F1F0"
}
}
```

**Regular State (Beijing):** `sar_metadata` is `null`, so regular states carry no SAR payload.
```json
{
"id": 3318,
"name": "Beijing",
"country_id": 45,
"country_code": "CN",
"type": "municipality",
"sar_metadata": null
}
```
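
The shape described above (either `null` or an object carrying the six SAR attributes) can be checked mechanically. The following is a minimal Python sketch, assuming state records are available as dictionaries; the helper name `validate_sar_metadata` and the constant `SAR_KEYS` are hypothetical and not part of the repository.

```python
# The six SAR attributes named in this proposal.
SAR_KEYS = {"phonecode", "currency", "currency_name",
            "currency_symbol", "emoji", "emojiU"}


def validate_sar_metadata(state):
    """Return a list of problems found in one state record (empty if valid)."""
    meta = state.get("sar_metadata")
    if meta is None:
        return []  # regular state: nothing to check
    if not isinstance(meta, dict):
        return ["sar_metadata must be an object or null"]
    problems = []
    missing = SAR_KEYS - set(meta)
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    # Extra keys are tolerated on purpose, since the JSON field is meant
    # to grow without schema changes.
    non_strings = sorted(k for k in SAR_KEYS & set(meta)
                         if not isinstance(meta[k], str))
    if non_strings:
        problems.append("non-string values: " + ", ".join(non_strings))
    return problems
```

Extra keys are accepted deliberately so that the check does not work against the flexibility this proposal is aiming for.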

### Benefits

1. **Minimal Schema Impact**: Only 1 column instead of 6
2. **Negligible Storage Overhead**: The column is `NULL` for the 5,071 regular states
3. **Flexible**: Easy to add more SAR attributes without schema changes
4. **Clean Queries**: Can still query SARs: `WHERE sar_metadata IS NOT NULL`
5. **JSON Support**: MySQL 5.7.8+ and PostgreSQL 9.2+ provide a native JSON column type (PostgreSQL 9.4+ adds `jsonb`)

### Query Examples

```sql
-- Get all SARs
SELECT * FROM states WHERE sar_metadata IS NOT NULL;

-- Get SAR phone code (MySQL syntax; the ->> operator requires MySQL 5.7.13+)
SELECT name, sar_metadata->>'$.phonecode' as phonecode
FROM states
WHERE sar_metadata IS NOT NULL;

-- Get SARs with specific currency
SELECT * FROM states
WHERE JSON_EXTRACT(sar_metadata, '$.currency') = 'HKD';
```

### Export Example

For flat exports, the SAR fields are extracted from `sar_metadata` to the top level:

```json
{
"id": 2267,
"name": "Hong Kong SAR",
"phonecode": "852",
"currency": "HKD",
"emoji": "🇭🇰"
}
```
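
The flattening step for exports could look roughly like the sketch below. It is written in Python for illustration only (the actual export code lives in PHP under `bin/Commands/`), and the function name `flatten_state` is hypothetical. It assumes that regular states simply omit the SAR keys in the flat output, which this document does not specify.

```python
def flatten_state(state):
    """Return a copy of `state` with sar_metadata keys lifted to the top level."""
    # Copy everything except the nested object itself.
    flat = {k: v for k, v in state.items() if k != "sar_metadata"}
    meta = state.get("sar_metadata") or {}
    for key, value in meta.items():
        # Existing top-level keys win, so metadata can never overwrite
        # core fields such as "id" or "name".
        flat.setdefault(key, value)
    return flat


hong_kong = {
    "id": 2267,
    "name": "Hong Kong SAR",
    "sar_metadata": {"phonecode": "852", "currency": "HKD"},
}
print(flatten_state(hong_kong))
```

Giving top-level keys precedence is a defensive choice: a malformed `sar_metadata` object cannot then change a record's identity in the export.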

### Implementation Files to Change

1. `sql/schema.sql` - Change to single JSON column
2. `prisma/schema.prisma` - Change to `sar_metadata Json?`
3. `bin/Commands/ExportSqlServer.php` - Change to JSON column
4. `bin/Commands/ExportJson.php` - Extract JSON fields for flat export
5. `contributions/states/states.json` - Nest SAR fields in `sar_metadata` object

### Migration Path

For existing databases with the 6-column approach:

```sql
-- Create new column
ALTER TABLE states ADD COLUMN sar_metadata JSON;

-- Migrate data
UPDATE states
SET sar_metadata = JSON_OBJECT(
'phonecode', phonecode,
'currency', currency,
'currency_name', currency_name,
'currency_symbol', currency_symbol,
'emoji', emoji,
'emojiU', emojiU
)
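-- Match rows where any SAR column is set, so no data is lost when the columns are dropped
WHERE COALESCE(phonecode, currency, currency_name, currency_symbol, emoji, emojiU) IS NOT NULL;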

-- Drop old columns
ALTER TABLE states
DROP COLUMN phonecode,
DROP COLUMN currency,
DROP COLUMN currency_name,
DROP COLUMN currency_symbol,
DROP COLUMN emoji,
DROP COLUMN emojiU;
```
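
The same reshaping would be needed for the records in `contributions/states/states.json`. A minimal Python sketch of that step follows; the helper name `nest_sar_fields` is hypothetical. It treats a record as SAR-bearing when any of the six fields is non-null, which is an assumption rather than documented behaviour.

```python
SAR_FIELDS = ("phonecode", "currency", "currency_name",
              "currency_symbol", "emoji", "emojiU")


def nest_sar_fields(state):
    """Move the six flat SAR fields of one record into `sar_metadata`."""
    # Keep every non-SAR key unchanged.
    nested = {k: v for k, v in state.items() if k not in SAR_FIELDS}
    values = {field: state.get(field) for field in SAR_FIELDS}
    has_sar_data = any(v is not None for v in values.values())
    # Regular states get null rather than an object full of nulls.
    nested["sar_metadata"] = values if has_sar_data else None
    return nested


macau = {
    "id": 2266, "name": "Macau SAR", "phonecode": "853", "currency": "MOP",
    "currency_name": "Macanese pataca", "currency_symbol": "$",
    "emoji": "\U0001F1F2\U0001F1F4", "emojiU": "U+1F1F2 U+1F1F4",
}
print(nest_sar_fields(macau)["sar_metadata"]["currency"])
```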

## Other Alternatives

### Option 2: Keep Current Dual Representation
- Hong Kong and Macau remain as both countries AND states
- Document this as intentional for backward compatibility
- No schema changes needed
- Users can choose which representation to use

### Option 3: Separate SAR Table
- Create `sar_attributes` table with 1:1 relationship
- More normalized but adds complexity
- Requires JOIN for every SAR query

### Option 4: Extend Translations Field
- Store SAR metadata in existing `translations` JSON field
- No schema changes at all
- Could be confusing since translations are language-specific

## Recommendation

**Single JSON field (Option 1, the alternative proposed at the top of this document)** provides the best balance:
- Minimal schema impact (1 column vs 6)
- Negligible storage overhead for regular states
- Maintains data completeness for SARs
- Extensible: new SAR attributes need no further schema changes
192 changes: 192 additions & 0 deletions IMPLEMENTATION_SUMMARY.md
@@ -0,0 +1,192 @@
# Implementation Summary: Special Administrative Regions Support

## Issue Addressed

**Original Issue**: "SARs like Hong Kong and Macau should be a state or a city instead of country. But at the same time, it has a different phonecode, currency, currency_name, currency_symbol, emoji and emojiU. So it is hard to be injected into state since it will lose the details."

## Solution Implemented

We extended the `states` table schema to include **optional** country-level attributes specifically for Special Administrative Regions (SARs), allowing them to be properly classified as states under their parent country while retaining their unique characteristics.

## Changes Made

### 1. Database Schema Extensions

#### Modified Files:
- `sql/schema.sql` - MySQL schema
- `prisma/schema.prisma` - Prisma ORM schema
- `bin/Commands/ExportSqlServer.php` - SQL Server export schema

#### New Optional Fields Added to `states` Table:
```sql
phonecode VARCHAR(255) -- Phone dialing code (e.g., "852" for Hong Kong)
currency VARCHAR(255) -- Currency code (e.g., "HKD")
currency_name VARCHAR(255) -- Full currency name (e.g., "Hong Kong dollar")
currency_symbol VARCHAR(255) -- Currency symbol (e.g., "$")
emoji VARCHAR(191) -- Flag emoji (e.g., "🇭🇰")
emojiU VARCHAR(191) -- Emoji Unicode (e.g., "U+1F1ED U+1F1F0")
```

**Important**: These fields are `NULL` for regular states/provinces and only populated for SARs.

### 2. Data Population

Updated `contributions/states/states.json` with SAR-specific fields:

**Hong Kong SAR (ID: 2267)**:
```json
{
"id": 2267,
"name": "Hong Kong SAR",
"country_id": 45,
"country_code": "CN",
"type": "special administrative region",
"phonecode": "852",
"currency": "HKD",
"currency_name": "Hong Kong dollar",
"currency_symbol": "$",
"emoji": "🇭🇰",
"emojiU": "U+1F1ED U+1F1F0"
}
```

**Macau SAR (ID: 2266)**:
```json
{
"id": 2266,
"name": "Macau SAR",
"country_id": 45,
"country_code": "CN",
"type": "special administrative region",
"phonecode": "853",
"currency": "MOP",
"currency_name": "Macanese pataca",
"currency_symbol": "$",
"emoji": "🇲🇴",
"emojiU": "U+1F1F2 U+1F1F4"
}
```

### 3. Export Command Updates

Modified `bin/Commands/ExportJson.php` to export the new SAR fields:
- Added SAR fields to states array output
- Added SAR fields to country-state nested array output
- All other export formats (CSV, XML, YAML, MongoDB) automatically inherit from JSON

### 4. Documentation

Created comprehensive documentation:

#### New Documentation Files:
- `docs/SPECIAL_ADMINISTRATIVE_REGIONS.md` - Complete guide on SAR handling
  - Explains the challenge and solution
  - Provides schema reference
  - Includes usage guidelines for contributors and API consumers
  - Documents current SARs (Hong Kong, Macau)
  - Suggests future use cases

#### Updated Documentation:
- `contributions/README.md` - Added State Fields reference table with SAR fields
- `README.md` - Added link to SAR documentation in Contributing section

### 5. Schema Validation

- Changed `states` table `ROW_FORMAT` from `COMPACT` to `DYNAMIC` to accommodate the additional fields
- Validated schema with MySQL 8.0 test database
- Successfully tested insert and query operations with SAR data

## Technical Details

### Data Hierarchy
```
China (country_id: 45)
├── Beijing (state, no SAR fields)
├── Shanghai (state, no SAR fields)
├── Hong Kong SAR (state, WITH SAR fields)
└── Macau SAR (state, WITH SAR fields)
    └── Cities in Macau (reference state_id: 2266)
```

### Backward Compatibility
- Existing states remain unchanged (SAR fields are NULL)
- All existing queries continue to work
- No breaking changes to API or export formats
- Cities in HK/Macau already reference correct state IDs

### Query Examples

**Get all SARs:**
```sql
SELECT * FROM states WHERE type = 'special administrative region';
```

**Get states with their own currencies:**
```sql
SELECT * FROM states WHERE currency IS NOT NULL;
```

**Get all China subdivisions including SARs:**
```sql
SELECT * FROM states WHERE country_id = 45;
```
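
The three queries above can be tried without a MySQL server. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL, with a reduced `states` table holding only the columns the queries touch. The table definition is an illustrative assumption and not the project's schema; the row values are taken from the examples in this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in table: only the columns used by the queries.
conn.execute("""
    CREATE TABLE states (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        country_id INTEGER NOT NULL,
        type TEXT,
        phonecode TEXT,
        currency TEXT
    )
""")
conn.executemany(
    "INSERT INTO states VALUES (?, ?, ?, ?, ?, ?)",
    [
        (3318, "Beijing", 45, "municipality", None, None),
        (2267, "Hong Kong SAR", 45, "special administrative region", "852", "HKD"),
        (2266, "Macau SAR", 45, "special administrative region", "853", "MOP"),
    ],
)


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


sars = names("SELECT name FROM states "
             "WHERE type = 'special administrative region' ORDER BY id")
own_currency = names("SELECT name FROM states "
                     "WHERE currency IS NOT NULL ORDER BY id")
china = names("SELECT name FROM states WHERE country_id = 45 ORDER BY id")
print(sars, own_currency, china)
```

With only these three rows, the first two queries return the same set, since the SARs are the only states with a currency of their own.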

## Benefits

1. **Geographical Accuracy**: Hong Kong and Macau are correctly represented as parts of China
2. **Data Completeness**: No loss of important attributes (phone codes, currencies, flags)
3. **Standards Compliance**: Consistent with ISO 3166-2:CN, which lists Hong Kong (CN-HK) and Macau (CN-MO) as subdivisions of China
4. **Extensibility**: Can support other SARs and autonomous territories
5. **Backward Compatible**: Existing integrations continue to work

## Future Applications

This schema can accommodate other similar entities:
- Åland Islands (Finland)
- Faroe Islands (Denmark)
- Greenland (Denmark)
- Puerto Rico (USA)
- Other territories with special status

## Files Modified

1. `sql/schema.sql` - Added SAR fields to states table, changed ROW_FORMAT
2. `prisma/schema.prisma` - Added SAR fields to State model
3. `bin/Commands/ExportSqlServer.php` - Added SAR fields to SQL Server schema
4. `bin/Commands/ExportJson.php` - Added SAR fields to JSON export
5. `contributions/states/states.json` - Populated SAR data for HK and Macau
6. `contributions/README.md` - Added State Fields documentation
7. `README.md` - Added link to SAR documentation

## Files Created

1. `docs/SPECIAL_ADMINISTRATIVE_REGIONS.md` - Comprehensive SAR documentation

## Testing

- [x] Schema syntax validated with MySQL 8.0
- [x] Test insert and query operations successful
- [x] JSON data validated and populated correctly
- [x] Export command modifications verified
- [x] Documentation reviewed for completeness

## Validation Steps for Maintainers

1. Import the updated schema to MySQL
2. Run the JSON import script to populate SAR data
3. Run export commands to generate all formats
4. Verify Hong Kong and Macau states include SAR fields in exports
5. Verify regular states have NULL SAR fields in exports
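
Steps 4 and 5 can be automated once the states export is loaded as a list of records. The following Python sketch is one possible check; the function name `check_export` is hypothetical. It assumes that only states of type `special administrative region` should carry the six fields, which holds for the current data but would need revisiting if other territories are added.

```python
SAR_FIELDS = ("phonecode", "currency", "currency_name",
              "currency_symbol", "emoji", "emojiU")
SAR_TYPE = "special administrative region"


def check_export(states):
    """Return human-readable errors for an exported list of state records."""
    errors = []
    for state in states:
        populated = [f for f in SAR_FIELDS if state.get(f) is not None]
        if state.get("type") == SAR_TYPE:
            # Step 4: SARs must carry every SAR field.
            missing = [f for f in SAR_FIELDS if f not in populated]
            if missing:
                errors.append(f"{state['name']}: missing {missing}")
        elif populated:
            # Step 5: regular states must leave the SAR fields null or absent.
            errors.append(f"{state['name']}: unexpected {populated}")
    return errors
```

An empty result means both verification steps passed for every record in the export.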

## Notes

- Hong Kong and Macau entries still exist in the `countries` table (IDs 98 and 128) for backward compatibility
- New integrations should use the state entries under China (IDs 2267 and 2266)
- The solution follows the "One China" principle while respecting the autonomy and unique characteristics of SARs
- This approach is politically neutral and focuses on accurate data representation

## References

- Issue: "Special Administrative Regions representation"
- ISO 3166-2: Standard for subdivision codes
- WikiData Q8646 (Hong Kong), Q14773 (Macau)