
Conversation


@ShiZhongming ShiZhongming commented Sep 14, 2025

Summary by CodeRabbit

  • New Features

    • Automatic migration for missing "void_deck" data with validation warnings.
    • New indoor comfort data generation per building.
    • Architecture metrics are now computed on‑the‑fly in result exports.
  • Bug Fixes

    • More robust results loading with graceful handling of missing time‑series files.
  • Refactor

    • Hourly demand outputs simplified: removed redundant building name column.
    • Area and U‑value usage consolidated under envelope properties for consistent calculations.
    • Improved geometry/area computation and thermal-network sizing data handling.


coderabbitai bot commented Sep 14, 2025

Walkthrough

Adds an indoor comfort mapper; introduces a targeted void_deck migration path and centralizes migrate_void_deck_data in utils; refactors useful_areas and geometry/RC-model pipelines to use GeoDataFrames and envelope-derived area/U-values; updates demand writers' hourly API and result-summary architecture handling; minor import/type-cleanups across modules.

Changes

Cohort / File(s) Summary
Void-deck migration & import updates
cea/datamanagement/format_helper/cea4_migrate.py, cea/datamanagement/utils/__init__.py, cea/demand/demand_main.py, cea/analysis/lca/embodied.py, cea/resources/radiation/main.py, cea/resources/radiationCRAX/main.py
Adds a special-case migration path when only void_deck is missing and delegates migration to migrate_void_deck_data in datamanagement.utils; updates function signature/type hints, saves migrated zone geometry, emits validation warnings, and updates imports across demand/analysis/radiation modules.
Useful areas & geometry / RC-model refactor
cea/demand/building_properties/useful_areas.py, cea/demand/building_properties/building_rc_model.py, cea/demand/building_properties/building_geometry.py, cea/demand/occupancy_helper.py, cea/import_export/result_summary.py, cea/demand/building_properties/building_properties.py
Replaces single-DF calc_useful_areas with calc_useful_areas(zone_df: gpd.GeoDataFrame, architecture_df: pd.DataFrame), projects to CRS and computes footprint/GFA components; updates RC model to consume areas/GeoDataFrame and envelope fields; narrows geometry columns and shifts GFA sourcing to RC-model/envelope flows; result_summary now generates architecture results on demand.
Envelope vs RC-model property moves
cea/demand/building_properties/building_properties_row.py, cea/demand/hourly_procedure_heating_cooling_system_load.py, cea/demand/rc_model_SIA.py, cea/demand/sensible_loads.py, cea/demand/building_properties/building_solar.py
Moves area and U-value fields into envelope (adds envelope fields, removes duplicates from RC model); updates thermal balance, RC model usage, sensible loads and solar calculations to read A*/U* from envelope instead of rc_model.
Demand writers hourly API change
cea/demand/demand_writers.py
Removes building_name parameter from calc_hourly_dataframe(self, date, tsd) and from produced hourly data; updates callers and yearly/aggregate composition (e.g., Aroof source, people0 sourcing).
Indoor comfort mapping
cea/datamanagement/archetypes_mapper.py
Adds indoor_comfort_mapper(list_uses, locator, occupant_densities, building_typology_df) which merges archetype comfort DB with typology, computes multi-use aggregated comfort via calculate_average_multiuse, ensures target folder, and writes building comfort CSV to locator.get_building_comfort().
Visualisation & linting tidy
cea/visualisation/a_data_loader.py, cea/visualisation/plot_main.py
Removes an unused cea.config import, fixes import ordering and a duplicate import, and makes a minor docstring hyphen change (non-functional).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Caller as indoor_comfort_mapper
  participant Locator
  participant DB as comfort DB (locator.get_database_archetypes_use_type)
  participant Typ as building_typology_df
  participant Agg as calculate_average_multiuse
  participant FS as filesystem

  Caller->>DB: read comfort archetypes
  Caller->>Typ: receive building_typology_df, list_uses, occupant_densities
  DB-->>Caller: comfort_DB
  Caller->>Caller: merge Typ left_on='use_type1' with comfort_DB
  Caller->>Agg: calculate_average_multiuse(occupant_densities, list_uses, merged_df)
  Agg-->>Caller: aggregated_comfort_df
  Caller->>FS: ensure target folder exists (locator.get_building_comfort())
  Caller->>FS: write aggregated_comfort_df as CSV to locator.get_building_comfort()
sequenceDiagram
  autonumber
  participant Migrator as migrate_cea3_to_cea4
  participant Files as zone/shapefiles
  participant Locator
  participant Utils as migrate_void_deck_data

  Migrator->>Files: inspect attributes for CEA-3 vs CEA-4
  alt only missing 'void_deck'
    Migrator->>Locator: build InputLocator
    Migrator->>Utils: migrate_void_deck_data(locator)
    Utils-->>Migrator: writes zone, emits warnings if invalid
  else other missing attributes
    Migrator-->>Migrator: raise ValueError (format mismatch)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

feature-data-management, feature-database, feature-demand, feature-new

Suggested reviewers

  • martin-mosteiro

Poem

A nibble of data, a hop through the code,
I map cozy rooms on a sunshine road.
Void decks fixed, areas aligned,
Envelopes sorted, comfort assigned.
With floppy ears I merge and write—hop, ship, delight! 🐇✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Allow running pv plots without running demand" is concise and directly reflects the PR objective to decouple PV plotting from the demand pipeline; the changeset shows related refactors (e.g., moving migrate_void_deck_data to utils and adjustments in radiation/visualisation and demand modules) that support this intent, so the title is relevant and specific enough for a quick scan. It does not attempt to enumerate all auxiliary refactors, which is appropriate for a PR title.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cea/import_export/result_summary.py (2)

311-316: Missing comma concatenates two list items in 'pv' metrics mapping

'PV_electricity_west[kWh]' and 'PV_generation_to_load[-]' are concatenated into one string, breaking matching for 'PV_generation_to_load[-]'.

Apply this diff to fix:

-               'PV_electricity_east[kWh]', 'PV_installed_area_west[m2]', 'PV_electricity_west[kWh]'
-               'PV_generation_to_load[-]', 'PV_self_consumption[-]', 'PV_self_sufficiency[-]',
+               'PV_electricity_east[kWh]', 'PV_installed_area_west[m2]', 'PV_electricity_west[kWh]',
+               'PV_generation_to_load[-]', 'PV_self_consumption[-]', 'PV_self_sufficiency[-]',
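The bug above is worth seeing in isolation: Python silently concatenates adjacent string literals, so a missing comma fuses two list items into one. A minimal, self-contained repro (list entries borrowed from the mapping above):

```python
# Missing comma after 'PV_electricity_west[kWh]' fuses it with the next literal.
metrics = [
    'PV_installed_area_west[m2]', 'PV_electricity_west[kWh]'  # <- missing comma
    'PV_generation_to_load[-]', 'PV_self_consumption[-]',
]

assert len(metrics) == 3  # not 4: two literals became one entry
assert 'PV_generation_to_load[-]' not in metrics
assert 'PV_electricity_west[kWh]PV_generation_to_load[-]' in metrics
```

Because the fused entry never matches the intended column name, any lookup keyed on 'PV_generation_to_load[-]' silently finds nothing.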

1001-1004: Off‑by‑one in hour_end inflates coverage by +1 hour

hour_end is stored as last hour + 1 here, but add_nominal_actual_and_coverage treats hour_end as inclusive, leading to 1-hour overcounts and coverage ratios > 1.0.

Apply this diff to make hour_end inclusive (no +1):

-                grouped_monthly['hour_end'] = df.groupby('period_month', observed=True)['period_hour'].last().values + 1
+                grouped_monthly['hour_end'] = df.groupby('period_month', observed=True)['period_hour'].last().values
@@
-                grouped_seasonally['hour_end'] = df.groupby('period_season', observed=True)['period_hour'].last().values + 1
+                grouped_seasonally['hour_end'] = df.groupby('period_season', observed=True)['period_hour'].last().values
@@
-                row_sum['hour_end'] = df['period_hour'].iloc[-1] + 1
+                row_sum['hour_end'] = df['period_hour'].iloc[-1]

Also applies to: 1010-1013, 1019-1021
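The inclusive/exclusive mismatch can be sketched generically (helper name is hypothetical, not from the codebase): a consumer that treats hour_end as inclusive must be handed the last hour itself, not last hour + 1.

```python
def hours_covered_inclusive(hour_start, hour_end):
    """Number of hours in [hour_start, hour_end], both bounds inclusive."""
    return hour_end - hour_start + 1

first_hour, last_hour = 0, 743  # January spans 744 hours, 0-indexed

# Storing last + 1 (exclusive convention) but reading inclusively
# overcounts the nominal span by one hour:
assert hours_covered_inclusive(first_hour, last_hour + 1) == 745

# Storing the last hour itself matches the 744 hours actually present:
assert hours_covered_inclusive(first_hour, last_hour) == 744
```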

🧹 Nitpick comments (4)
cea/demand/occupancy_helper.py (1)

481-484: Duplicate migrate_void_deck_data call (run twice)

migrate_void_deck_data(locator) is already called at the start of occupancy_helper_main. This second call in main is redundant and can add unnecessary I/O.

Apply this diff to remove the duplicate call:

 def main(config):
     assert os.path.exists(config.scenario), 'Scenario not found: %s' % config.scenario
     print('Running occupancy model for scenario %s' % config.scenario)
     print('Running occupancy model  with schedule model=%s' % config.occupancy_helper.occupancy_model)
     locator = cea.inputlocator.InputLocator(config.scenario)
-    migrate_void_deck_data(locator)
     occupancy_helper_main(locator, config)
cea/import_export/result_summary.py (1)

2146-2151: Don’t break the time‑period loop on first missing file

break exits the loop and silently skips remaining periods, yet later writing still proceeds. Prefer continue to skip just this period.

Apply this diff:

-        if not os.path.exists(df_time_path):
-            print(f"File not found: {df_time_path}.")
-            break
+        if not os.path.exists(df_time_path):
+            print(f"File not found: {df_time_path}.")
+            continue
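A toy illustration of the difference (names are hypothetical, not from result_summary.py): `break` on the first missing period silently drops every later period as well, while `continue` skips only the offender.

```python
def process(periods, missing, skip_only):
    """Collect periods whose data is present; `skip_only` picks continue vs break."""
    handled = []
    for period in periods:
        if period in missing:
            if skip_only:
                continue  # skip just this period
            break         # abandons all remaining periods too
        handled.append(period)
    return handled

periods = ['monthly', 'seasonally', 'annually']
assert process(periods, {'monthly'}, skip_only=False) == []
assert process(periods, {'monthly'}, skip_only=True) == ['seasonally', 'annually']
```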
cea/datamanagement/archetypes_mapper.py (2)

101-104: Confirm intention: architecture CSV is generated unconditionally. Gate behind a flag if unintended.

This now runs regardless of the update flags. If this should be tied to architecture updates, consider guarding with update_architecture_dbf.

Apply this diff if you want it gated:

-    # Generate architecture CSV file with geometric properties
-    generate_architecture_csv(locator, building_typology_df)
+    # Generate architecture CSV file with geometric properties
+    if update_architecture_dbf:
+        generate_architecture_csv(locator, building_typology_df)

407-476: Harden area calculations: guard against NaN/None, clamp negative floors, coerce occupied_bg, and ensure geometry CRS is projected (meters).

  • Coerce numeric inputs robustly (do not rely on "... or 0" — np.nan is truthy). Use pd.to_numeric(..., errors='coerce') + fillna(0) or: val = building.get(...); val = 0.0 if pd.isna(val) else float(val).
  • Coerce Hs/Ns to float with default 0 and occupied_bg to numeric 0/1 (e.g., int(bool(...))) before any arithmetic.
  • Compute effective_floors_ag = max(floors_ag - void_deck, 0.0). Either raise ValueError when floors_ag < void_deck (defensive) or log a warning and clamp — note there are already validations in datamanagement/void_deck_migrator.py and datamanagement/databases_verification.py.
  • Verify the input GeoDataFrame CRS before using geometry.area: require a projected CRS (pyproj.CRS.is_projected) or call .to_crs(...) — zone.shp is external input and must be in metres.
  • Prefer sanitizing the DataFrame (pd.to_numeric + fillna) before iterating (faster, clearer).
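The sanitization suggested above can be sketched as follows (column names mirror the review's examples; the frame itself is hypothetical). Note in particular that `value or 0` is not a NaN guard, because NaN is truthy and survives the `or`:

```python
import numpy as np
import pandas as pd

# The `or 0` fallback never fires for NaN:
assert np.isnan(np.nan or 0)

df = pd.DataFrame({'floors_ag': ['3', None, 'bad'],
                   'void_deck': [1, None, 2]})

# Coerce once up front instead of per-row "... or 0" checks.
for col in ('floors_ag', 'void_deck'):
    df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0)

# Clamp negative effective floor counts to zero.
df['effective_floors_ag'] = (df['floors_ag'] - df['void_deck']).clip(lower=0)

assert df['effective_floors_ag'].tolist() == [2.0, 0.0, 0.0]
```

The same pattern (coerce, fill, clamp) avoids both the NaN-truthiness trap and negative areas from `floors_ag < void_deck`.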
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad31cb0 and e4368a0.

📒 Files selected for processing (7)
  • cea/datamanagement/archetypes_mapper.py (2 hunks)
  • cea/datamanagement/format_helper/cea4_migrate.py (3 hunks)
  • cea/datamanagement/void_deck_migrator.py (1 hunks)
  • cea/demand/demand_writers.py (3 hunks)
  • cea/demand/occupancy_helper.py (1 hunks)
  • cea/import_export/result_summary.py (2 hunks)
  • cea/inputlocator.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
cea/import_export/result_summary.py (2)
cea/config.py (1)
  • locator (574-577)
cea/inputlocator.py (1)
  • get_architecture_csv (1337-1339)
cea/demand/occupancy_helper.py (1)
cea/datamanagement/void_deck_migrator.py (1)
  • migrate_void_deck_data (7-47)
cea/demand/demand_writers.py (2)
cea/inputlocator.py (2)
  • get_temporary_file (1497-1499)
  • get_architecture_csv (1337-1339)
cea/demand/time_series_data.py (1)
  • TimeSeriesData (756-841)
cea/datamanagement/format_helper/cea4_migrate.py (3)
cea/datamanagement/void_deck_migrator.py (1)
  • migrate_void_deck_data (7-47)
cea/config.py (1)
  • Configuration (35-277)
cea/inputlocator.py (1)
  • InputLocator (20-1499)
cea/datamanagement/archetypes_mapper.py (1)
cea/inputlocator.py (3)
  • get_database_archetypes_construction_type (712-714)
  • ensure_parent_folder_exists (104-106)
  • get_architecture_csv (1337-1339)
cea/datamanagement/void_deck_migrator.py (1)
cea/inputlocator.py (1)
  • get_zone_geometry (897-901)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (macos-latest)
🔇 Additional comments (5)
cea/inputlocator.py (1)

1332-1340: Additive API for architecture paths looks good

Methods follow existing locator naming and path conventions. No behavior change risk.

cea/import_export/result_summary.py (1)

181-184: Use of locator.get_architecture_csv() for architecture feature

Correctly aligns architecture feature paths to the new output location.

cea/datamanagement/void_deck_migrator.py (1)

35-41: Good validation: floors_ag must be strictly greater than void_deck

The checks prevent invalid geometry-derived metrics later. Consider guarding for missing or non-numeric columns to avoid KeyError/type issues in older scenarios.

If legacy cases may lack 'floors_ag' or store it as strings, add a quick guard:

+    required_cols = {'floors_ag'}
+    missing = required_cols - set(zone_gdf.columns)
+    if missing:
+        raise KeyError(f"Missing required column(s) in zone.shp: {missing}")
+    zone_gdf['floors_ag'] = pd.to_numeric(zone_gdf['floors_ag'], errors='coerce')
+    zone_gdf['void_deck'] = pd.to_numeric(zone_gdf['void_deck'], errors='coerce').fillna(0)

Also applies to: 43-47

cea/demand/demand_writers.py (2)

91-99: Signature updates propagate correctly

Passing locator into calc_yearly_dataframe from both CSV and HDF5 writers is consistent.

Also applies to: 104-109


128-131: Data dict build is fine

Names and people0 fields are added as expected.

Comment on lines +554 to +559
if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
    config = cea.config.Configuration()
    locator = cea.inputlocator.InputLocator(config.scenario)
    migrate_void_deck_data(locator)
else:
    raise ValueError('! zone.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_zone_3}.'.format(list_missing_attributes_zone_3=list_missing_attributes_zone_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_zone_4}.'.format(list_missing_attributes_zone_4=list_missing_attributes_zone_4)

⚠️ Potential issue

Guard the ‘void_deck-only’ check and build the locator from the passed scenario (also avoids indexing errors).

  • Indexing list_missing_attributes_zone_4[0] will crash when the list is empty.
  • Use an equality check against ['void_deck'] and instantiate InputLocator with the function’s scenario argument (not the global config). Also, cea.inputlocator isn’t imported in this file.

Apply this diff:

-            else:
-                if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
-                    config = cea.config.Configuration()
-                    locator = cea.inputlocator.InputLocator(config.scenario)
-                    migrate_void_deck_data(locator)
-                else:
+            else:
+                if list_missing_attributes_zone_4 == ['void_deck']:
+                    locator = cea.inputlocator.InputLocator(scenario)
+                    migrate_void_deck_data(locator)
+                else:
                     raise ValueError('! zone.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_zone_3}.'.format(list_missing_attributes_zone_3=list_missing_attributes_zone_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_zone_4}.'.format(list_missing_attributes_zone_4=list_missing_attributes_zone_4)
                                  )

Add the missing import near the top of the file:

import cea.inputlocator
🤖 Prompt for AI Agents
In cea/datamanagement/format_helper/cea4_migrate.py around lines 554 to 559,
replace the unsafe indexing check and wrong locator creation: guard against
empty list by checking equality to ['void_deck'] (i.e.
list_missing_attributes_zone_4 == ['void_deck']) instead of
list_missing_attributes_zone_4[0], instantiate the InputLocator using the
function's scenario argument (locator = cea.inputlocator.InputLocator(scenario))
rather than creating a new global config, and add the missing import "import
cea.inputlocator" near the top of the file so the module is available.

Comment on lines +578 to 584
if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
    config = cea.config.Configuration()
    locator = cea.inputlocator.InputLocator(config.scenario)
    migrate_void_deck_data(locator)
else:
    raise ValueError('surroundings.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_surroundings_3}.'.format(list_missing_attributes_surroundings_3=list_missing_attributes_surroundings_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_surroundings_4}.'.format(list_missing_attributes_surroundings_4=list_missing_attributes_surroundings_4)
                     )

⚠️ Potential issue

Wrong list referenced in surroundings branch; remove void_deck migration here.

void_deck applies to zone.shp, not surroundings.shp. This branch incorrectly inspects list_missing_attributes_zone_4 and calls the void_deck migrator. It should just raise the existing error for surroundings.

Apply this diff:

-            else:
-                if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
-                    config = cea.config.Configuration()
-                    locator = cea.inputlocator.InputLocator(config.scenario)
-                    migrate_void_deck_data(locator)
-                else:
-                    raise ValueError('surroundings.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_surroundings_3}.'.format(list_missing_attributes_surroundings_3=list_missing_attributes_surroundings_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_surroundings_4}.'.format(list_missing_attributes_surroundings_4=list_missing_attributes_surroundings_4)
-                                 )
+            else:
+                raise ValueError('surroundings.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_surroundings_3}.'.format(list_missing_attributes_surroundings_3=list_missing_attributes_surroundings_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_surroundings_4}.'.format(list_missing_attributes_surroundings_4=list_missing_attributes_surroundings_4))
🤖 Prompt for AI Agents
In cea/datamanagement/format_helper/cea4_migrate.py around lines 578 to 584, the
code incorrectly checks list_missing_attributes_zone_4 and calls
migrate_void_deck_data in the surroundings.shp branch; remove that conditional
and the migrate_void_deck_data call, and instead always raise the existing
ValueError for surroundings (combine the two formatted messages as currently
written) so surroundings.shp does not trigger void_deck migration.

Comment on lines 116 to 119
keys = data.keys()
columns = self.OTHER_VARS
columns.extend(keys)


⚠️ Potential issue

Mutating shared class list (self.OTHER_VARS) causes column duplication across buildings

columns = self.OTHER_VARS extends the shared list in-place, so subsequent buildings will accumulate duplicate keys.

Apply this diff:

-        columns = self.OTHER_VARS
+        columns = self.OTHER_VARS.copy()
         columns.extend(keys)
🤖 Prompt for AI Agents
In cea/demand/demand_writers.py around lines 116 to 119, the code assigns
columns = self.OTHER_VARS and then extends it, mutating the shared class list
and causing duplicated columns across buildings; instead, create a new list
instance before extending (for example copy self.OTHER_VARS with
list(self.OTHER_VARS) or self.OTHER_VARS[:] and then extend that new list with
the data keys converted to a list), so the class-level OTHER_VARS is never
modified and each building gets its own columns list.
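The aliasing bug described above reduces to a few lines (class name and keys are hypothetical stand-ins for the real writer): assigning a class-level list to a local name and calling extend() mutates shared state across all calls.

```python
class HourlyWriter:  # stand-in for the real demand writer class
    OTHER_VARS = ['date', 'name']

    def columns_buggy(self, keys):
        columns = self.OTHER_VARS        # alias of the class attribute
        columns.extend(keys)             # mutates it for every caller
        return columns

    def columns_fixed(self, keys):
        columns = self.OTHER_VARS.copy() # fresh list per call
        columns.extend(keys)
        return columns

w = HourlyWriter()
w.columns_buggy(['Qhs_kWh'])
assert HourlyWriter.OTHER_VARS == ['date', 'name', 'Qhs_kWh']  # class state polluted

HourlyWriter.OTHER_VARS = ['date', 'name']  # reset for the fixed variant
w.columns_fixed(['Qhs_kWh'])
assert HourlyWriter.OTHER_VARS == ['date', 'name']  # untouched
```

In the buggy variant, every subsequent building re-extends the same list, which is exactly how the duplicate columns accumulate.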

Comment on lines 120 to 126
# get the architecture data
architecture_df = pd.read_csv(locator.get_architecture_csv())
Af_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Af_m2'].iloc[0])
Aroof_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Aroof_m2'].iloc[0])
GFA_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'GFA_m2'].iloc[0])
Aocc_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Aocc_m2'].iloc[0])


🛠️ Refactor suggestion

⚠️ Potential issue

Architecture CSV read lacks guards; crash if file or building row missing

Add explicit checks for file existence and building presence to fail fast with a clear message. Also avoid re-reading the CSV per building if possible.

Apply this diff (minimal guard; optional caching noted in comment):

+        import os
+        arch_path = locator.get_architecture_csv()
+        if not os.path.exists(arch_path):
+            raise FileNotFoundError(f"Missing architecture CSV: {arch_path}. Generate it before running demand.")
-        architecture_df = pd.read_csv(locator.get_architecture_csv())
-        Af_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Af_m2'].iloc[0])
-        Aroof_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Aroof_m2'].iloc[0])
-        GFA_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'GFA_m2'].iloc[0])
-        Aocc_m2 = float(architecture_df.loc[architecture_df['name'] == building_name, 'Aocc_m2'].iloc[0])
+        architecture_df = pd.read_csv(arch_path)
+        row = architecture_df.loc[architecture_df['name'] == building_name]
+        if row.empty:
+            raise KeyError(f"Building '{building_name}' not found in {arch_path}.")
+        Af_m2 = float(row['Af_m2'].iloc[0])
+        Aroof_m2 = float(row['Aroof_m2'].iloc[0])
+        GFA_m2 = float(row['GFA_m2'].iloc[0])
+        Aocc_m2 = float(row['Aocc_m2'].iloc[0])

Optional: cache architecture_df once per run to avoid N file reads (e.g., memoize by path).


ShiZhongming and others added 9 commits September 19, 2025 14:59
Simplifies validation logic to check for buildings where void_deck exceeds floors_ag and issues a warning instead of raising an exception. This improves robustness by allowing migration to continue while notifying users of problematic data.
The call to migrate_void_deck_data was removed from the main function in occupancy_helper.py, since it is already called in occupancy_helper_main
Refactored generate_architecture_csv to use vectorized pandas operations instead of row-wise iteration, improving performance and code clarity for calculating building architectural properties.
Expanded the docstring to list the geometric properties included in the generated architecture CSV file, providing clearer documentation for users of the function.
Added logic to generate the architecture CSV using zone geometry before running the demand calculation. This ensures required input data is available and up-to-date for the demand workflow.
Added logic to plot_main.py to generate the architecture CSV file if it does not exist, ensuring geometric properties are available for GFA normalization. Also removed redundant architecture CSV generation from archetypes_mapper.py.
@reyery reyery changed the title Some fix from idp shanghai - void deck and pv without demand Allow running pv plots without running demand Sep 19, 2025
Refactored the warning to avoid nested quotes and improve readability by assigning invalid building names to a variable before including them in the message.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cea/datamanagement/void_deck_migrator.py (1)

17-29: Data loss risk: zone.shp not persisted while architecture CSV is mutated.

You drop void_deck from the architecture CSV but never write the updated zone_gdf back to zone.shp. On the next run, void_deck will be missing in both places, and buildings will be defaulted to 0 — silent data loss.

Write back to zone.shp whenever you create/populate void_deck:

         if "void_deck" in envelope_df.columns:
             # assign void_deck from envelope.csv to zone.shp and remove it from envelope.csv
             zone_gdf = zone_gdf.merge(
                 envelope_df[["name", "void_deck"]], on="name", how="left"
             )
             zone_gdf["void_deck"] = zone_gdf["void_deck"].fillna(0)
-            print("Migrated void_deck data from envelope.csv to zone.shp.")
+            warnings.warn("Migrated void_deck data from architecture.csv to zone.shp.", category=UserWarning)
+            zone_gdf.to_file(locator.get_zone_geometry(), driver="ESRI Shapefile")
             envelope_df.drop(columns=["void_deck"], inplace=True)
             envelope_df.to_csv(locator.get_building_architecture(), index=False)
 
         else:  # cannot find void_deck anywhere, just initialize it to 0
             zone_gdf["void_deck"] = 0
             warnings.warn(
                 "No void_deck data found in envelope.csv, setting to 0 in zone.shp"
             )
+            zone_gdf.to_file(locator.get_zone_geometry(), driver="ESRI Shapefile")

If persistence was intentionally removed, don’t mutate the CSV either; otherwise the migration isn’t idempotent and still loses data. Happy to provide an alternative “read-only migration” pattern if that’s desired.

🧹 Nitpick comments (8)
cea/demand/demand_main.py (2)

9-12: Avoid top-level geopandas import; lazy-load where needed with a clear error message

This reduces import cost and avoids hard-failing modules that don’t need gpd. Keep the archetype import as-is.

Apply:

-import geopandas as gpd

116-119: Generate architecture.csv only if missing and fail fast with actionable errors

  • Don’t overwrite on every run.
  • Lazy-import geopandas and raise a helpful exception if missing.
-    # Ensure that the architecture csv is generated
-    zone_df = gpd.read_file(locator.get_zone_geometry())
-    generate_architecture_csv(locator, zone_df)
+    # Ensure architecture.csv exists (needed by writers / aggregation)
+    if not os.path.exists(locator.get_architecture_csv()):
+        try:
+            import geopandas as gpd
+        except ImportError as e:
+            raise MissingInputDataException(
+                "geopandas is required to generate architecture.csv. "
+                "Install it (e.g., pip install geopandas) or pre-generate the file."
+            ) from e
+        zone_df = gpd.read_file(locator.get_zone_geometry())
+        generate_architecture_csv(locator, zone_df)

Please also confirm zone.shp contains the expected fields used by generate_architecture_csv: name, const_type, floors_ag, floors_bg, void_deck. Missing fields will raise at runtime.

cea/visualisation/plot_main.py (2)

6-16: Remove top-level geopandas import; import only when needed

Geopandas is a heavy import and plots that don’t need architecture.csv shouldn’t pay for it or fail if gpd isn’t installed.

Apply:

-import geopandas as gpd

95-100: Harden architecture.csv lazy-generation: add lazy import and robust error handling

Keep the exists-check, but add:

  • Lazy gpd import with actionable error.
  • Try/except around read_file to surface path/schema issues clearly.
-    # Ensure that the architecture csv is generated for gfa normalization
-    locator = cea.inputlocator.InputLocator(scenario=scenario)
-    if not os.path.exists(locator.get_architecture_csv()):
-        zone_df = gpd.read_file(locator.get_zone_geometry())
-        generate_architecture_csv(locator, zone_df)
+    # Ensure that the architecture csv is generated for GFA normalization
+    locator = cea.inputlocator.InputLocator(scenario=scenario)
+    if not os.path.exists(locator.get_architecture_csv()):
+        try:
+            import geopandas as gpd
+        except ImportError as e:
+            raise CEAException(
+                "geopandas is required to generate architecture.csv for plotting. "
+                "Install it (e.g., pip install geopandas) or pre-generate the file by running demand."
+            ) from e
+        try:
+            zone_df = gpd.read_file(locator.get_zone_geometry())
+        except Exception as e:
+            raise CEAException(f"Failed to read zone geometry at {locator.get_zone_geometry()}: {e}") from e
+        generate_architecture_csv(locator, zone_df)

Optionally, gate this block to features that actually require GFA normalization (e.g., 'demand', 'pv'/'pvt'/'sc') to avoid unnecessary I/O for unrelated plots.

cea/datamanagement/void_deck_migrator.py (4)

8-12: Docstring and terminology are outdated (“envelope.csv” vs “architecture.csv”).

The implementation reads via locator.get_building_architecture() (architecture CSV) but the docstring still references envelope.csv. Align wording to avoid confusion.

-"""Check if void_deck exists in zone.shp and copy it from envelope.csv if necessary.
+"""Ensure 'void_deck' exists in zone.shp. If missing, copy from architecture.csv (formerly envelope.csv) or initialize to 0, and persist updates to zone.shp.

26-26: Avoid print in library code; use logging or warnings.

print is hard to manage in pipelines/CLI. You already import warnings; use that (as in the diff above) or the project’s logger.
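As a small illustration of the split between the two mechanisms (a sketch, not the project's actual logger setup): use `logging` for routine progress and `warnings` for actionable data issues, so pipelines can filter each independently.

```python
import logging
import warnings

logger = logging.getLogger(__name__)

def report_migration(building: str) -> None:
    # logger.info for routine progress; warnings.warn for data issues the user should act on
    logger.info("Migrated void_deck for %s", building)
    warnings.warn(f"void_deck defaulted to 0 for {building}", RuntimeWarning)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_migration("B1")
print(len(caught))  # 1
```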


14-21: Handle partial/missing values when void_deck column exists.

If void_deck exists in zone_gdf but has NaNs for some buildings, we currently do nothing. Consider backfilling from architecture CSV for those rows to make the migration idempotent and robust.

Happy to sketch a minimal patch using combine_first on a left-join if you want this behavior.
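For illustration, the backfill alluded to could be sketched with `combine_first` on name-indexed series (plain-pandas stand-ins for the zone and architecture frames; column names assumed from the surrounding diffs):

```python
import pandas as pd

zone = pd.DataFrame({"name": ["B1", "B2", "B3"],
                     "void_deck": [1.0, None, None]})  # partially populated
arch = pd.DataFrame({"name": ["B2", "B3"], "void_deck": [2.0, 0.0]})

# Backfill only the missing rows: existing zone values win, the CSV fills the gaps
filled = (zone.set_index("name")["void_deck"]
              .combine_first(arch.set_index("name")["void_deck"]))
zone["void_deck"] = zone["name"].map(filled).fillna(0)
print(zone["void_deck"].tolist())  # [1.0, 2.0, 0.0]
```

Running the snippet a second time leaves the values unchanged, which gives the idempotency the comment asks for.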


36-39: Guard against missing columns (floors_ag, name).

If floors_ag or name is absent in zone_gdf, this will raise KeyError. Add upfront validation with a clear error/warning.

Example:

+required_zone_cols = {"name", "floors_ag"}
+missing = required_zone_cols - set(zone_gdf.columns)
+if missing:
+    warnings.warn(f"Missing expected columns in zone.shp: {sorted(missing)}", category=RuntimeWarning)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d94b8cf and 9555831.

📒 Files selected for processing (5)
  • cea/datamanagement/archetypes_mapper.py (2 hunks)
  • cea/datamanagement/void_deck_migrator.py (1 hunks)
  • cea/demand/demand_main.py (2 hunks)
  • cea/visualisation/a_data_loader.py (1 hunks)
  • cea/visualisation/plot_main.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • cea/visualisation/a_data_loader.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • cea/datamanagement/archetypes_mapper.py
🧰 Additional context used
🧬 Code graph analysis (2)
cea/demand/demand_main.py (2)
cea/datamanagement/archetypes_mapper.py (2)
  • archetypes_mapper (27-100)
  • generate_architecture_csv (405-472)
cea/inputlocator.py (1)
  • get_zone_geometry (897-901)
cea/visualisation/plot_main.py (2)
cea/datamanagement/archetypes_mapper.py (2)
  • archetypes_mapper (27-100)
  • generate_architecture_csv (405-472)
cea/inputlocator.py (3)
  • InputLocator (20-1499)
  • get_architecture_csv (1337-1339)
  • get_zone_geometry (897-901)
🪛 GitHub Actions: Ruff
cea/datamanagement/void_deck_migrator.py

[error] 40-40: Ruff: invalid-syntax: Cannot reuse outer quote character in f-strings on Python 3.10 (syntax was added in Python 3.12).

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (macos-latest)
🔇 Additional comments (1)
cea/datamanagement/void_deck_migrator.py (1)

20-28: Ignore — both locator methods exist; no AttributeError risk.

cea/inputlocator.py defines get_building_architecture (≈line 940) and get_architecture_csv (≈line 1337). Both variants are referenced across the repo (e.g., datamanagement/void_deck_migrator.py and resources/radiation*.py use get_building_architecture; datamanagement/archetypes_mapper.py and demand/demand_writers.py use get_architecture_csv). No change required; optional: unify naming if you want a single API.

Likely an incorrect or invalid review comment.

Moved generate_architecture_csv and migrate_void_deck_data functions to cea/datamanagement/utils/__init__.py for better modularity. Removed void_deck_migrator.py and cleaned up archetypes_mapper.py by deleting the duplicate generate_architecture_csv implementation.
Updated multiple modules to import migrate_void_deck_data and generate_architecture_csv from cea.datamanagement.utils instead of their previous locations. This change centralizes utility functions and improves code maintainability.
Added a call to generate_architecture_csv in the test setup to ensure the architecture file is present for the reference case. This addresses a missing dependency required for the thermal loads calculation tests.
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cea/resources/radiationCRAX/main.py (1)

458-458: Persist migrated void_deck to zone.shp (critical)

migrate_void_deck_data (cea/datamanagement/utils/__init__.py — def at line 12) merges/initializes void_deck into the in-memory zone_gdf but never writes zone.shp; meanwhile it drops void_deck from envelope.csv — this risks deleting the only source of void_deck values and breaking CRAX and other workflows that read zone.shp after migration.

  • After merging or setting zone_gdf["void_deck"], persist it to disk (e.g., zone_gdf.to_file(locator.get_zone_geometry())).
  • Only drop/write envelope.csv after confirming zone.shp was successfully written.
  • Also persist the branch that initializes void_deck = 0 and preserve shapefile schema/CRS when writing.
♻️ Duplicate comments (2)
cea/datamanagement/format_helper/cea4_migrate.py (2)

554-558: Unsafe list indexing and incorrect locator instantiation.

The code has two issues:

  1. Accessing list_missing_attributes_zone_4[0] will fail when the list is empty
  2. Creating a new Configuration() instance and using config.scenario instead of the function's scenario parameter

Apply this diff to fix both issues:

-                if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
-                    config = cea.config.Configuration()
-                    locator = cea.inputlocator.InputLocator(config.scenario)
+                if list_missing_attributes_zone_4 == ['void_deck']:
+                    locator = cea.inputlocator.InputLocator(scenario)
                     migrate_void_deck_data(locator)

Add the missing import at the top of the file:

import cea.inputlocator

578-583: Incorrect logic for surroundings.shp migration.

This branch incorrectly checks list_missing_attributes_zone_4 and calls migrate_void_deck_data for surroundings.shp. The void_deck column is only relevant for zone.shp, not surroundings.shp.

Apply this diff to remove the incorrect migration logic:

             else:
-                if list_missing_attributes_zone_4[0] == 'void_deck' and len(list_missing_attributes_zone_4) == 1:
-                    config = cea.config.Configuration()
-                    locator = cea.inputlocator.InputLocator(config.scenario)
-                    migrate_void_deck_data(locator)
-                else:
-                    raise ValueError('surroundings.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_surroundings_3}.'.format(list_missing_attributes_surroundings_3=list_missing_attributes_surroundings_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_surroundings_4}.'.format(list_missing_attributes_surroundings_4=list_missing_attributes_surroundings_4)
-                                 )
+                raise ValueError('surroundings.shp exists but follows neither the CEA-3 nor CEA-4 format. CEA cannot proceed with the data migration. Check the following column(s) for CEA-3 format: {list_missing_attributes_surroundings_3}.'.format(list_missing_attributes_surroundings_3=list_missing_attributes_surroundings_3), 'Check the following column(s) for CEA-4 format: {list_missing_attributes_surroundings_4}.'.format(list_missing_attributes_surroundings_4=list_missing_attributes_surroundings_4))
🧹 Nitpick comments (2)
cea/datamanagement/utils/__init__.py (2)

46-47: Consider more descriptive warning message for invalid floors_ag configuration.

When floors_ag - void_deck <= 0, the warning message could be more actionable by explaining the implications and suggesting remediation steps.

Apply this diff to improve the warning message:

-        warnings.warn(f"Some buildings have void_deck greater than floors_ag: {invalid_buildings}",
+        warnings.warn(f"Buildings {invalid_buildings} have void_deck >= floors_ag, resulting in zero or negative effective floors. This may cause incorrect GFA calculations. Please verify the void_deck and floors_ag values in zone.shp.",
                       RuntimeWarning)

89-90: Potential division by zero when all floors are void decks.

When effective_floors_ag = 0 and floors_bg = 0, the denominator becomes zero. While fillna(1.0).where(denominator > 0, 1.0) handles this, the logic could be clearer.

Consider extracting the division logic into a helper function for clarity:

-    share_ag = effective_floors_ag / denominator
-    # Handle division by zero case
-    share_ag = share_ag.fillna(1.0).where(denominator > 0, 1.0)
+    # Calculate share_ag with safe division
+    share_ag = pd.Series(
+        np.where(denominator > 0, effective_floors_ag / denominator, 1.0),
+        index=effective_floors_ag.index
+    )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9555831 and 0f7f254.

📒 Files selected for processing (11)
  • cea/analysis/lca/embodied.py (1 hunks)
  • cea/datamanagement/archetypes_mapper.py (1 hunks)
  • cea/datamanagement/format_helper/cea4_migrate.py (3 hunks)
  • cea/datamanagement/utils/__init__.py (1 hunks)
  • cea/datamanagement/void_deck_migrator.py (0 hunks)
  • cea/demand/demand_main.py (3 hunks)
  • cea/demand/occupancy_helper.py (1 hunks)
  • cea/resources/radiation/main.py (1 hunks)
  • cea/resources/radiationCRAX/main.py (1 hunks)
  • cea/tests/test_calc_thermal_loads.py (2 hunks)
  • cea/visualisation/plot_main.py (2 hunks)
💤 Files with no reviewable changes (1)
  • cea/datamanagement/void_deck_migrator.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • cea/visualisation/plot_main.py
  • cea/demand/demand_main.py
  • cea/datamanagement/archetypes_mapper.py
🧰 Additional context used
🧬 Code graph analysis (7)
cea/demand/occupancy_helper.py (1)
cea/datamanagement/utils/__init__.py (1)
  • migrate_void_deck_data (12-47)
cea/tests/test_calc_thermal_loads.py (2)
cea/datamanagement/utils/__init__.py (1)
  • generate_architecture_csv (50-117)
cea/inputlocator.py (1)
  • get_zone_geometry (897-901)
cea/analysis/lca/embodied.py (1)
cea/datamanagement/utils/__init__.py (1)
  • migrate_void_deck_data (12-47)
cea/datamanagement/utils/__init__.py (1)
cea/inputlocator.py (5)
  • get_zone_geometry (897-901)
  • get_building_architecture (940-945)
  • get_database_archetypes_construction_type (712-714)
  • ensure_parent_folder_exists (104-106)
  • get_architecture_csv (1337-1339)
cea/resources/radiationCRAX/main.py (1)
cea/datamanagement/utils/__init__.py (1)
  • migrate_void_deck_data (12-47)
cea/datamanagement/format_helper/cea4_migrate.py (3)
cea/datamanagement/utils/__init__.py (1)
  • migrate_void_deck_data (12-47)
cea/config.py (1)
  • Configuration (35-277)
cea/inputlocator.py (1)
  • InputLocator (20-1499)
cea/resources/radiation/main.py (1)
cea/datamanagement/utils/__init__.py (1)
  • migrate_void_deck_data (12-47)
🪛 GitHub Actions: CI
cea/tests/test_calc_thermal_loads.py

[error] 46-46: Unit test setup failed due to error in architecture CSV generation (KeyError 'void_deck').

cea/datamanagement/utils/__init__.py

[error] 72-72: KeyError: 'void_deck' while generating architecture CSV in generate_architecture_csv (column 'void_deck' missing).

🔇 Additional comments (6)
cea/analysis/lca/embodied.py (1)

18-18: LGTM!

The import change correctly reflects the module reorganization, moving migrate_void_deck_data to the centralized utils module.

cea/demand/occupancy_helper.py (1)

16-16: LGTM!

The import change correctly updates the path for migrate_void_deck_data following the module reorganization.

cea/tests/test_calc_thermal_loads.py (1)

44-46: Ensure 'void_deck' exists in reference-case zone.shp before generating architecture CSV

Sandbox verification failed: ModuleNotFoundError: No module named 'geopandas' — run the included script locally to confirm zone.shp columns (required: 'name', 'floors_ag', 'floors_bg', 'void_deck', 'const_type'). If 'void_deck' is missing, add the migrate step before generating the architecture CSV.

File: cea/tests/test_calc_thermal_loads.py Lines: 44-46

     # FIXME: Update reference case to have architecture file
+    # Ensure void_deck column exists
+    from cea.datamanagement.utils import migrate_void_deck_data
+    migrate_void_deck_data(cls.locator)
     # Ensure that the architecture csv is generated
     generate_architecture_csv(cls.locator, gpd.read_file(cls.locator.get_zone_geometry()))
cea/resources/radiationCRAX/main.py (1)

24-24: LGTM: Import refactoring completed successfully.

The import statement has been properly updated to reflect the new location of migrate_void_deck_data in the consolidated utils module. The function call on line 458 remains unchanged, maintaining functional consistency.

cea/resources/radiation/main.py (2)

17-17: LGTM: Import refactoring completed successfully.

The import statement has been properly updated from the deprecated cea.datamanagement.void_deck_migrator to the consolidated cea.datamanagement.utils module. The function call on line 135 remains unchanged, maintaining functional consistency.


135-135: Ensure void‑deck migration runs for radiation‑only PV workflows.

migrate_void_deck_data(locator) at cea/resources/radiation/main.py:135 — confirm it executes when demand calculations are skipped and that it populates the geometry/fields radiation code requires (sky‑view factor, per‑surface shading masks, ground‑reflected irradiance). If it does not, add an explicit migration/fallback in the radiation/PV path and add tests exercising PV plots with and without existing void‑deck data to prevent runtime errors and incorrect irradiance.

footprint = prop_architecture_df.geometry.area # building footprint area
floors_ag = prop_architecture_df['floors_ag'] # above-ground floors
floors_bg = prop_architecture_df['floors_bg'] # below-ground floors
void_deck = prop_architecture_df['void_deck'] # void deck floors
Contributor

⚠️ Potential issue

Missing void_deck column in building_typology_df causes pipeline failure.

The pipeline error indicates that void_deck is not present in building_typology_df. The function attempts to access prop_architecture_df['void_deck'] without checking if it exists, causing a KeyError when the column is missing.

Apply this diff to handle the missing column:

 floors_ag = prop_architecture_df['floors_ag']  # above-ground floors
 floors_bg = prop_architecture_df['floors_bg']  # below-ground floors
-void_deck = prop_architecture_df['void_deck']  # void deck floors
+void_deck = prop_architecture_df['void_deck'] if 'void_deck' in prop_architecture_df.columns else 0  # void deck floors

Alternatively, ensure migrate_void_deck_data is called before generate_architecture_csv to guarantee the column exists:

# Add at the beginning of generate_architecture_csv function
migrate_void_deck_data(locator)
building_typology_df = gpd.read_file(locator.get_zone_geometry())  # Re-read after migration
🧰 Tools
🪛 GitHub Actions: CI

[error] 72-72: KeyError: 'void_deck' while generating architecture CSV in generate_architecture_csv (column 'void_deck' missing).

The test setup now calls migrate_void_deck_data to ensure void deck data is present before generating the architecture CSV. This prepares the reference case for updated data requirements.
Adds suffixes to the DataFrame merge in generate_architecture_csv to avoid column name conflicts, such as 'void_deck', and ensures left DataFrame columns are preserved.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
cea/datamanagement/utils/__init__.py (1)

75-75: Address the void_deck column availability issue.

Based on past review feedback, there's a pipeline failure where void_deck column is missing from building_typology_df, causing a KeyError. The function needs to handle this case gracefully.

The previous review correctly identified this issue. Apply the suggested fix to handle the missing column:

-void_deck = prop_architecture_df['void_deck']  # void deck floors
+void_deck = prop_architecture_df['void_deck'] if 'void_deck' in prop_architecture_df.columns else 0  # void deck floors

Or ensure migrate_void_deck_data is always called before this function to guarantee the column exists.

🧹 Nitpick comments (1)
cea/tests/test_calc_thermal_loads.py (1)

44-44: Consider updating the FIXME comment to be more actionable.

The FIXME comment could be more specific about what updating the reference case entails.

Apply this diff to make the comment more actionable:

-        # FIXME: Update reference case to have void deck and architecture file
+        # FIXME: Update reference case to include void_deck in zone.shp and pre-generated architecture.csv
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f7f254 and 6b4c12a.

📒 Files selected for processing (2)
  • cea/datamanagement/utils/__init__.py (1 hunks)
  • cea/tests/test_calc_thermal_loads.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
cea/tests/test_calc_thermal_loads.py (2)
cea/datamanagement/utils/__init__.py (2)
  • generate_architecture_csv (50-120)
  • migrate_void_deck_data (12-47)
cea/inputlocator.py (1)
  • get_zone_geometry (897-901)
cea/datamanagement/utils/__init__.py (1)
cea/inputlocator.py (5)
  • get_zone_geometry (897-901)
  • get_building_architecture (940-945)
  • get_database_archetypes_construction_type (712-714)
  • ensure_parent_folder_exists (104-106)
  • get_architecture_csv (1337-1339)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (windows-latest)
🔇 Additional comments (5)
cea/tests/test_calc_thermal_loads.py (3)

11-11: LGTM! Proper import added.

The GeoPandas import is correctly added to support reading geometry files in the test setup.


14-14: LGTM! Proper utilities imported.

The utilities are correctly imported from the new datamanagement.utils module.


44-47: LGTM! Proper test setup ensures architecture data availability.

The setup correctly calls migrate_void_deck_data followed by generate_architecture_csv to ensure the required architecture data exists for the reference case. This follows the proper dependency order where void deck migration must precede architecture CSV generation.

cea/datamanagement/utils/__init__.py (2)

12-48: LGTM! Well-implemented void deck migration with proper validation.

The function correctly handles migration from envelope.csv to zone.shp, includes proper error handling for missing data, and validates business logic constraints (floors_ag > void_deck). The implementation handles edge cases appropriately and provides useful feedback to users.


50-121: LGTM! Comprehensive architecture CSV generation with robust calculations.

The function implements proper architectural area calculations following established patterns, handles division by zero cases, and includes comprehensive documentation. The vectorized operations and proper data handling make this a solid implementation.

:type locator: cea.inputlocator.InputLocator
"""

zone_gdf = gpd.read_file(locator.get_zone_geometry())
Contributor

⚠️ Potential issue

Missing zone geometry update after void deck migration.

The migrate_void_deck_data function modifies the zone geometry by adding the void_deck column, but it doesn't save the updated GeoDataFrame back to the shapefile. This means subsequent calls to read the zone geometry won't see the migrated void_deck data.

Apply this diff to save the updated zone geometry:

 zone_gdf["void_deck"] = zone_gdf["void_deck"].fillna(0)
 print("Migrated void_deck data from envelope.csv to zone.shp.")
+zone_gdf.to_file(locator.get_zone_geometry())
 envelope_df.drop(columns=["void_deck"], inplace=True)
 envelope_df.to_csv(locator.get_building_architecture(), index=False)

And for the case where void_deck is initialized to 0:

 zone_gdf["void_deck"] = 0
 warnings.warn(
     "No void_deck data found in envelope.csv, setting to 0 in zone.shp"
 )
+zone_gdf.to_file(locator.get_zone_geometry())

Also applies to: 27-27

🤖 Prompt for AI Agents
In cea/datamanagement/utils/__init__.py around lines 19 and 27, after reading
zone_gdf and after migrate_void_deck_data modifies or initializes the void_deck
column you must write the updated GeoDataFrame back to the zone geometry
shapefile; call zone_gdf.to_file(locator.get_zone_geometry()) (preserve
CRS/index as needed) so subsequent reads see the migrated data, and do the same
when void_deck is initialized to 0.

Renamed function parameter from building_typology_df to building_zone_df for clarity and updated references within the function to use building_zone_df. This improves code readability and consistency with the data being processed.
Adds a call to save the updated zone_gdf to the zone geometry file after migrating void_deck data from envelope.csv to zone.shp, ensuring changes are persisted.
Moved the call to migrate_void_deck_data into generate_architecture_csv to ensure void_deck data is present before architectural calculations. Updated test_calc_thermal_loads to remove redundant migrate_void_deck_data call.
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
cea/datamanagement/utils/__init__.py (2)

69-77: KeyError risk: building_zone_df may not have 'void_deck' even after migration.

migrate_void_deck_data(locator) updates zone.shp on disk, but the in‑memory building_zone_df isn’t refreshed. Accessing building_zone_df['void_deck'] can raise KeyError.

Apply:

     # Ensure void_deck data is present in building_zone_df
-    migrate_void_deck_data(locator)
+    migrate_void_deck_data(locator)
+    if 'void_deck' not in building_zone_df.columns:
+        zone_void = gpd.read_file(locator.get_zone_geometry())[["name", "void_deck"]]
+        building_zone_df = building_zone_df.merge(zone_void, on="name", how="left")
+        building_zone_df["void_deck"] = building_zone_df["void_deck"].fillna(0)

37-42: Persist zone.shp after initializing void_deck to 0.

You add the column but don’t write the updated GeoDataFrame back, so downstream reads won’t see it.

         else:  # cannot find void_deck anywhere, just initialize it to 0
             zone_gdf["void_deck"] = 0
             warnings.warn(
                 "No void_deck data found in envelope.csv, setting to 0 in zone.shp"
             )
+            zone_gdf.to_file(locator.get_zone_geometry())
🧹 Nitpick comments (5)
cea/datamanagement/utils/__init__.py (5)

109-115: Ensure consistent indexing; take 'name' from the same frame as computed Series.

Computed Series (af_m2, aocc_m2, etc.) are derived from building_zone_df. Use building_zone_df['name'] to avoid subtle misalignment if merges reorder rows.

-        'name': prop_architecture_df['name'],
+        'name': building_zone_df['name'],

80-83: Normalize occupied_bg to numeric 0/1.

CSV content may be strings (e.g., "True"/"False"/"YES"/"NO"). Non‑numeric values will yield unexpected broadcasting in floors_bg * occupied_bg.

-    occupied_bg = prop_architecture_df['occupied_bg']  # Whether basement is occupied
+    occupied_bg = pd.to_numeric(prop_architecture_df['occupied_bg'], errors='coerce')
+    # fallback for common string encodings
+    if occupied_bg.isna().any():
+        mapping = {'true':1, 'false':0, 'yes':1, 'no':0, 'y':1, 'n':0}
+        occupied_bg = prop_architecture_df['occupied_bg'].astype(str).str.strip().str.lower().map(mapping)
+    occupied_bg = occupied_bg.fillna(0).astype(int).clip(0,1)

89-97: Clamp shares to [0, 1] and guard negative denominators.

If void_deck > floors_ag (already warned), effective_floors_ag becomes negative. Clamp to keep shares physically meaningful.

-    effective_floors_ag = floors_ag - void_deck
-    denominator = effective_floors_ag + floors_bg * occupied_bg
-    share_ag = effective_floors_ag / denominator
-    # Handle division by zero case
-    share_ag = share_ag.fillna(1.0).where(denominator > 0, 1.0)
+    effective_floors_ag = (floors_ag - void_deck).clip(lower=0)
+    denominator = (effective_floors_ag + floors_bg.clip(lower=0) * occupied_bg)
+    share_ag = (effective_floors_ag / denominator.where(denominator > 0)).fillna(1.0).clip(0.0, 1.0)

33-35: Use logging instead of print for library code.

Favor logging to integrate with CEA’s logging/CLI.

+import logging
@@
+logger = logging.getLogger(__name__)
@@
-            print("Migrated void_deck data from envelope.csv to zone.shp.")
+            logger.info("Migrated void_deck data from envelope.csv to zone.shp.")
@@
-    print(f"Architecture data generated and saved to: {locator.get_architecture_csv()}")
+    logger.info("Architecture data generated and saved to: %s", locator.get_architecture_csv())

Also applies to: 121-122


43-50: Consider hard‑failing on invalid floors rather than only warning.

Negative/zero effective floors will still produce outputs (after clamping), which may hide data issues.

Option: raise ValueError if any invalid buildings are found, or gate behind a setting/flag.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b4c12a and 15d3817.

📒 Files selected for processing (2)
  • cea/datamanagement/utils/__init__.py (1 hunks)
  • cea/tests/test_calc_thermal_loads.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • cea/tests/test_calc_thermal_loads.py
🧰 Additional context used
🧬 Code graph analysis (1)
cea/datamanagement/utils/__init__.py (1)
cea/inputlocator.py (5)
  • get_zone_geometry (897-901)
  • get_building_architecture (940-945)
  • get_database_archetypes_construction_type (712-714)
  • ensure_parent_folder_exists (104-106)
  • get_architecture_csv (1337-1339)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (windows-latest)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (macos-latest)
🔇 Additional comments (1)
cea/datamanagement/utils/__init__.py (1)

74-74: Verify CRS is metric before using geometry.area.

GeoSeries.area assumes CRS units. If zone.shp is in degrees, all areas are wrong.

Please confirm building_zone_df.crs is projected (meters). If not, reproject:

-    footprint = building_zone_df.geometry.area  # building footprint area
+    if building_zone_df.crs is None or not building_zone_df.crs.is_projected:
+        building_zone_df = building_zone_df.to_crs(epsg=3857)  # or your project CRS
+    footprint = building_zone_df.geometry.area  # building footprint area

The function now takes an 'envelope' DataFrame as input and updates it directly with calculated geometry values instead of creating a new DataFrame. Reverts change of `2eb9c7be6`
Updated BuildingGeometry to read only specified columns from zone geometry using COLUMNS_ZONE_GEOMETRY. This ensures only relevant data is loaded and improves data handling consistency.
Updated the BuildingRCModel to separate calculation of useful floor areas into areas_df and consistently use envelope and areas_df for property calculations. This improves clarity and correctness in how building geometry and envelope properties are handled.
Replaced usage of RC model U-values with envelope U-values for wall and roof solar calculations in calc_Isol_daysim.
Eliminated U-value attributes from the RCModelProperties class to avoid duplication, as these properties are already present in the envelope model.
Transferred roof, underside, above ground wall, above ground window, and below ground opaque area fields from RCModelProperties to EnvelopeProperties for better data organization and clarity.
Replaced areas_df['void_deck'] with geometry['void_deck'] to correctly determine floating buildings in the BuildingRCModel class.
Deleted the 'footprint' field from the RCModelProperties class as it is found in geometry properties.
Replaces the calculation of gross floor area (GFA) using geometry footprint and floors with the value from the RC model, since it has already been calculated.
Replaced references to bpr.rc_model attributes with bpr.envelope attributes in heating/cooling system load, RC model, and sensible loads calculations.
Replaces reading building architecture data from CSV with direct access to attributes from the BuildingPropertiesRow (bpr) object in calc_yearly_dataframe. Also removes the unused locator parameter from the method signature and its calls.
Eliminated the generation of the architecture CSV in the test setup for calc_thermal_loads.
Adds logic to specify suffixes when merging zone_df and architecture_df if both contain a 'void_deck' column, preventing column name conflicts during the merge operation.
Cleaned up plot_main.py by removing unused imports and the code that generates the architecture CSV, which is no longer required for GFA normalization.
Introduces special handling for the 'architecture' feature in exec_read_and_slice by loading zone and architecture data, calculating useful areas, and extracting relevant metrics. Updates get_results_path and load_cea_results_from_csv_files signatures for improved type clarity.
Deleted the generate_architecture_csv function from utils/__init__.py as it is no longer needed. Use calc_useful_areas instead since it is the same logic.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
cea/demand/demand_writers.py (1)

165-167: KeyError risk: dropping non-existent 'name' column in HDF5 path.

hourly_data no longer contains a name column; drop('name', ...) will raise.

Apply:

-        hourly_data.drop('name', inplace=True, axis=1)
+        hourly_data.drop('name', inplace=True, axis=1, errors='ignore')
cea/demand/building_properties/building_properties.py (1)

111-116: Guard for missing 'GFA_m2' and use min_gfa in warning

Direct indexing of self.rc_model._prop_rc_model['GFA_m2'] can raise a KeyError if architecture data isn't loaded; also the warning hardcodes 100 m2 instead of using min_gfa.

File: cea/demand/building_properties/building_properties.py (around lines 111-116)

-        GFA_m2 = self.rc_model._prop_rc_model['GFA_m2']
+        if 'GFA_m2' not in self.rc_model._prop_rc_model.columns:
+            raise KeyError("RC model properties missing 'GFA_m2'. Generate architecture.csv and ensure it is loaded.")
+        GFA_m2 = self.rc_model._prop_rc_model['GFA_m2']
@@
-        if len(small_buildings) > 0:
-            print(f'Warning! The following list of buildings have less than 100 m2 of gross floor area, '
-                  f'this might cause potential issues with demand calculation: {small_buildings.index.tolist()}')
+        if len(small_buildings) > 0:
+            print(f'Warning! The following buildings have less than {min_gfa} m2 of gross floor area; '
+                  f'this might cause potential issues with demand calculation: {small_buildings.index.tolist()}')
cea/demand/building_properties/useful_areas.py (1)

47-55: Zero‑division risk in share splitting (Hs/Ns)

When both above‑ and below‑ground occupied floors are zero, split_above_and_below_ground_shares divides by 0. Make the helper robust.

Apply in the helper below:

 def split_above_and_below_ground_shares(Hs, Ns, occupied_bg, floors_ag, floors_bg):
@@
-    share_ag = floors_ag / (floors_ag + floors_bg * occupied_bg)
-    share_bg = 1 - share_ag
+    denom = (floors_ag + floors_bg * occupied_bg)
+    share_ag = (floors_ag / denom.where(denom != 0)).fillna(0)
+    share_bg = 1 - share_ag
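A zero-safe version of the share split can be checked in isolation (the floor counts below are made-up values):

```python
import pandas as pd

floors_ag = pd.Series([3.0, 0.0])     # above-ground floors per building
floors_bg = pd.Series([1.0, 0.0])     # below-ground floors per building
occupied_bg = pd.Series([1.0, 0.5])   # occupied share of below-ground floors

denom = floors_ag + floors_bg * occupied_bg
# NaN out the zero denominators, divide, then map NaN back to 0
share_ag = (floors_ag / denom.where(denom != 0)).fillna(0)
share_bg = 1 - share_ag
```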
cea/import_export/result_summary.py (1)

298-314: Syntax bug: missing comma in PV metrics list

Between 'PV_electricity_west[kWh]' and 'PV_generation_to_load[-]' there is no comma, merging two strings and breaking mapping. Add a comma.

Apply:

-               'PV_electricity_east[kWh]', 'PV_installed_area_west[m2]', 'PV_electricity_west[kWh]'
-               'PV_generation_to_load[-]', 'PV_self_consumption[-]', 'PV_self_sufficiency[-]',
+               'PV_electricity_east[kWh]', 'PV_installed_area_west[m2]', 'PV_electricity_west[kWh]',
+               'PV_generation_to_load[-]', 'PV_self_consumption[-]', 'PV_self_sufficiency[-]',
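This kind of bug comes from Python's implicit concatenation of adjacent string literals: a missing comma silently fuses two list items instead of raising an error.

```python
pv_columns = [
    'PV_electricity_east[kWh]',
    'PV_installed_area_west[m2]',
    'PV_electricity_west[kWh]'   # missing comma here...
    'PV_generation_to_load[-]',  # ...fuses this literal onto the previous one
    'PV_self_consumption[-]',
]
# Four items instead of five; the third is the two strings glued together.
```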
cea/demand/building_properties/building_properties_row.py (1)

251-256: A_op and win_wall are computed incorrectly

A_op is labeled “Opaque area above ground” but currently sums windows+walls. win_wall should be windows/walls, not windows/(windows+walls).

Apply:

-        A_op = envelope['Awin_ag'] + envelope['Awall_ag']
-        envelope['A_op'] = A_op
-        envelope['win_wall'] = envelope['Awin_ag'] / A_op if A_op != 0 else 0.0
+        A_op = envelope['Awall_ag']
+        envelope['A_op'] = A_op
+        envelope['win_wall'] = (envelope['Awin_ag'] / A_op) if A_op != 0 else 0.0
cea/demand/building_properties/building_rc_model.py (1)

213-227: Radiation CSV reads: add existence/column guards

Reading every building CSV without checks can crash the whole run. Guard file existence and missing columns; continue gracefully.

Apply:

-            geometry_data = pd.read_csv(locator.get_radiation_building(building_name))
+            path = locator.get_radiation_building(building_name)
+            if not os.path.exists(path):
+                print(f"Radiation file not found for {building_name}: {path}")
+                continue
+            geometry_data = pd.read_csv(path)
@@
-            envelope.loc[building_name, 'Aroof'] = geometry_data['roofs_top_m2'][0]
+            envelope.loc[building_name, 'Aroof'] = geometry_data.get('roofs_top_m2', pd.Series([0])).iloc[0]
@@
-            envelope.loc[building_name, 'Aunderside'] = geometry_data['undersides_bottom_m2'][0]
+            envelope.loc[building_name, 'Aunderside'] = geometry_data.get('undersides_bottom_m2', pd.Series([0])).iloc[0]
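`DataFrame.get` follows the dict protocol and returns the supplied default when the column is absent, which is what makes the guarded reads above safe (column names and values here are illustrative):

```python
import pandas as pd

geometry_data = pd.DataFrame({'roofs_top_m2': [12.5]})

# Column present: the real value is returned
aroof = geometry_data.get('roofs_top_m2', pd.Series([0])).iloc[0]

# Column missing: falls back to the default Series, i.e. 0
aunderside = geometry_data.get('undersides_bottom_m2', pd.Series([0])).iloc[0]
```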
🧹 Nitpick comments (12)
cea/datamanagement/utils/__init__.py (1)

66-73: Strengthen merge semantics and diagnostics.

Use validate='m:1' and explicit NA checks right after merge to catch DB issues deterministically.

Apply:

-    prop_architecture_df = building_zone_df.merge(architecture_DB, how="left", on="const_type")
+    prop_architecture_df = building_zone_df.merge(
+        architecture_DB, how="left", on="const_type", validate="m:1"
+    )

Optionally, log the rows with a missing const_type (including building names) before raising.

cea/demand/building_properties/building_geometry.py (1)

29-35: Be defensive on required zone columns or explain failure clearly.

Indexing COLUMNS_ZONE_GEOMETRY + ['geometry'] will KeyError if scenarios are missing a column. Either validate and raise a clear error or intersect with available columns and warn.

Example guard:

-        prop_geometry = Gdf.from_file(locator.get_zone_geometry())[COLUMNS_ZONE_GEOMETRY + ['geometry']].set_index('name').loc[building_names]
+        zone = Gdf.from_file(locator.get_zone_geometry())
+        missing = [c for c in (COLUMNS_ZONE_GEOMETRY + ['geometry']) if c not in zone.columns]
+        if missing:
+            raise KeyError(f"Zone shapefile missing required columns: {missing}")
+        prop_geometry = zone[COLUMNS_ZONE_GEOMETRY + ['geometry']].set_index('name').loc[building_names]
cea/demand/demand_writers.py (1)

252-253: Aggregate-hourly: consider dropping non-additive columns before summation.

You drop x_int post-sum; if name is reintroduced to hourly CSVs, add it to the drop list pre-sum to avoid type errors.

-        aggregated_hourly_results_df = aggregated_hourly_results_df.drop(columns=['x_int'])
+        aggregated_hourly_results_df = aggregated_hourly_results_df.drop(columns=['x_int'], errors='ignore')

Also drop any non-numeric columns, if present, before the += accumulation loop.
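The behaviour of `errors='ignore'` can be seen in a two-line example: the drop becomes a no-op when the column is absent instead of raising `KeyError` (column names here are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'Qhs_kWh': [1.0], 'Qcs_kWh': [2.0]})

# Without errors='ignore' this would raise KeyError: "['x_int'] not found in axis"
out = df.drop(columns=['x_int'], errors='ignore')
```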

cea/demand/building_properties/useful_areas.py (2)

41-45: Clamp effective above‑ground floors and null-safety

Subtracting void_deck can go negative; also floors_* may contain NaNs. Clamp at ≥0 and fill NaNs.

Apply:

-    df['GFA_ag_m2'] = df['footprint'] * (df['floors_ag'] - df['void_deck'])
-    df['GFA_bg_m2'] = df['footprint'] * df['floors_bg']
+    floors_ag_eff = (df['floors_ag'].fillna(0) - df['void_deck'].fillna(0)).clip(lower=0)
+    floors_bg = df['floors_bg'].fillna(0)
+    df['GFA_ag_m2'] = df['footprint'] * floors_ag_eff
+    df['GFA_bg_m2'] = df['footprint'] * floors_bg

57-58: Return a trimmed schema (or make it configurable)

Downstream only needs a subset (Af, Aef, footprint, GFA_m2, Aocc, Hs_ag, geometry). Returning the full merged frame increases coupling and I/O.

Apply:

-    # TODO: Return only relevant columns
-    return df
+    cols = [c for c in ['Af','Aef','footprint','GFA_m2','Aocc','Hs_ag','Hs_bg','Ns_ag','Ns_bg','geometry'] if c in df.columns]
+    return df[cols]
cea/import_export/result_summary.py (4)

177-179: Return typing doesn’t match actual structure (lists of paths vs nested lists)

list_paths can contain nested lists (PV, PVT, SC), but the signature says tuple[list[str], list[str]]. Adjust the type to reflect reality to avoid misleading callers and type checkers.

Apply:

-from typing import TYPE_CHECKING
+from typing import TYPE_CHECKING, Union
@@
-def get_results_path(locator: cea.inputlocator.InputLocator, cea_feature: str, list_buildings: list)-> tuple[list[str], list[str]]:
+def get_results_path(locator: cea.inputlocator.InputLocator, cea_feature: str, list_buildings: list) -> tuple[list[Union[str, list[str]]], list[str]]:

645-689: CSV loader: tiny doc/behavior mismatch and robustness nit

Docstring mentions renaming to 'DATE' but code standardizes to 'date'. Also, consider using parse_dates in read_csv to avoid double conversions.

Apply:

-    Standardizes the date column name in the DataFrame.
-    If the DataFrame contains 'DATE', 'date', or 'Date', renames it to 'DATE'.
+    Standardizes the date column name in the DataFrame.
+    If the DataFrame contains 'DATE', 'date', or 'Date', renames it to 'date'.

Optional:

-                df = pd.read_csv(path)  # Load the CSV file into a DataFrame
+                header_cols = pd.read_csv(path, nrows=0).columns
+                date_cols = [c for c in ('DATE', 'Date', 'date') if c in header_cols]
+                df = pd.read_csv(path, parse_dates=date_cols)

871-889: Ensure 'name' column exists after reset_index and clarify roof mapping

reset_index will only produce a 'name' column if the index is named 'name'. Guard against misnamed indices; also add a comment that footprint≈roof may deviate for pitched roofs.

Apply:

-        architecture_data = result_df[['Af', 'footprint', 'GFA_m2', 'Aocc']].rename(columns={
+        architecture_data = result_df[['Af', 'footprint', 'GFA_m2', 'Aocc']].rename(columns={
             'Af': 'Af_m2',
             'footprint': 'Aroof_m2',  # Assuming footprint corresponds to roof area
             'Aocc': 'Aocc_m2'
-        }).reset_index()
+        }).reset_index()
+        # Ensure 'name' is present even if the index name was lost
+        if 'name' not in architecture_data.columns:
+            architecture_data = architecture_data.rename(columns={architecture_data.columns[0]: 'name'})

Please verify whether you prefer using radiation-derived roofs_top_m2 when available instead of footprint for Aroof_m2.


2162-2167: Skip missing time-period files instead of breaking the whole loop

Using break stops processing subsequent periods; use continue so available periods still get written.

Apply:

-        if not os.path.exists(df_time_path):
-            print(f"File not found: {df_time_path}.")
-            break
+        if not os.path.exists(df_time_path):
+            print(f"File not found: {df_time_path}. Skipping this period.")
+            continue
cea/demand/building_properties/building_rc_model.py (3)

121-127: Hs correction side‑effect on envelope

Mutating input envelope in place can surprise callers if reused. Consider copying envelope before mutation.

Apply:

-        for building in self.building_names:
+        envelope = envelope.copy()
+        for building in self.building_names:

149-152: HD/Hg: NaN propagation and boolean semantics

  • Hs_ag/Hs_bg can be NaN; fillna(0) before multiplying to avoid propagating NaNs into HD/Hg.
  • 'is_floating' derived from void_deck > 0 is a heuristic; verify it against your definition of floating buildings.

Apply:

-        df['HD'] = (envelope['Awall_ag'] * envelope['U_wall'] * np.sqrt(areas_df['Hs_ag'])
-                    + envelope['Aroof'] * envelope['U_roof'] * areas_df['Hs_ag']
-                    + is_floating * envelope['Aunderside'] * envelope['U_base'] * areas_df['Hs_ag']
-                    )
+        hs_ag = areas_df['Hs_ag'].fillna(0)
+        df['HD'] = (envelope['Awall_ag'] * envelope['U_wall'] * np.sqrt(hs_ag)
+                    + envelope['Aroof'] * envelope['U_roof'] * hs_ag
+                    + is_floating * envelope['Aunderside'] * envelope['U_base'] * hs_ag)
@@
-        df['Hg'] = B_F * envelope['Aop_bg'] * envelope['U_base'] * areas_df['Hs_bg']
+        df['Hg'] = B_F * envelope['Aop_bg'] * envelope['U_base'] * areas_df['Hs_bg'].fillna(0)

229-232: Aop_bg formula: handle missing geometry fields

Ensure height_bg/perimeter/footprint exist; otherwise fill defaults to avoid NaNs.

Apply:

-        envelope['Aop_bg'] = geometry['height_bg'] * geometry['perimeter'] + geometry['footprint']
+        gb = geometry.reindex(envelope.index)
+        envelope['Aop_bg'] = gb['height_bg'].fillna(0) * gb['perimeter'].fillna(0) + gb['footprint'].fillna(0)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 15d3817 and f9890f6.

📒 Files selected for processing (15)
  • cea/datamanagement/utils/__init__.py (1 hunks)
  • cea/demand/building_properties/building_geometry.py (2 hunks)
  • cea/demand/building_properties/building_properties.py (1 hunks)
  • cea/demand/building_properties/building_properties_row.py (2 hunks)
  • cea/demand/building_properties/building_rc_model.py (6 hunks)
  • cea/demand/building_properties/building_solar.py (2 hunks)
  • cea/demand/building_properties/useful_areas.py (2 hunks)
  • cea/demand/demand_main.py (2 hunks)
  • cea/demand/demand_writers.py (5 hunks)
  • cea/demand/hourly_procedure_heating_cooling_system_load.py (1 hunks)
  • cea/demand/occupancy_helper.py (2 hunks)
  • cea/demand/rc_model_SIA.py (1 hunks)
  • cea/demand/sensible_loads.py (1 hunks)
  • cea/import_export/result_summary.py (5 hunks)
  • cea/visualisation/plot_main.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • cea/demand/demand_main.py
  • cea/visualisation/plot_main.py
  • cea/demand/occupancy_helper.py
🧰 Additional context used
🧬 Code graph analysis (6)
cea/demand/building_properties/building_geometry.py (2)
cea/inputlocator.py (1)
  • get_zone_geometry (897-901)
cea/utilities/standardize_coordinates.py (2)
  • get_lat_lon_projected_shapefile (75-90)
  • get_projected_coordinate_system (55-65)
cea/demand/building_properties/useful_areas.py (1)
cea/utilities/standardize_coordinates.py (2)
  • get_lat_lon_projected_shapefile (75-90)
  • get_projected_coordinate_system (55-65)
cea/import_export/result_summary.py (2)
cea/demand/building_properties/useful_areas.py (1)
  • calc_useful_areas (11-58)
cea/inputlocator.py (2)
  • get_zone_geometry (897-901)
  • get_building_architecture (940-945)
cea/demand/building_properties/building_rc_model.py (2)
cea/demand/building_properties/useful_areas.py (1)
  • calc_useful_areas (11-58)
cea/demand/control_heating_cooling_systems.py (2)
  • has_heating_system (22-39)
  • has_cooling_system (42-58)
cea/demand/demand_writers.py (2)
cea/plots/demand/__init__.py (1)
  • data (151-154)
cea/plots/demand/heating_reset_schedule.py (1)
  • data (38-39)
cea/datamanagement/utils/__init__.py (1)
cea/inputlocator.py (5)
  • get_zone_geometry (897-901)
  • get_building_architecture (940-945)
  • get_database_archetypes_construction_type (712-714)
  • ensure_parent_folder_exists (104-106)
  • get_architecture_csv (1337-1339)
🔇 Additional comments (9)
cea/demand/building_properties/building_solar.py (1)

99-99: LGTM: switch to envelope U-values is consistent.

Using prop_envelope['U_wall'] and ['U_roof'] aligns with the envelope-centric refactor.

Please confirm prop_envelope['a_wall'] and ['a_roof'] are absorptance (not area), as assumed here.

Also applies to: 109-109

cea/demand/hourly_procedure_heating_cooling_system_load.py (1)

763-766: LGTM: envelope-driven opaque H terms.

Switching to bpr.envelope.(Awall_ag|Aop_bg|Aroof) and corresponding U-values is consistent with the refactor.

cea/demand/demand_writers.py (1)

116-118: Mutating shared OTHER_VARS produces duplicated columns across buildings.

You extend the class list in-place.

Apply:

-        columns = self.OTHER_VARS
+        columns = self.OTHER_VARS.copy()
         columns.extend(keys)
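The underlying bug is the shared class-level list: extending it in place leaks columns into every later call. A minimal reproduction (the class and column names are illustrative, not the actual writer):

```python
class HourlyWriter:
    """Illustrative stand-in for the demand writer; only column handling matters."""
    OTHER_VARS = ['name', 'date']  # class-level list, shared by every instance

    def columns_buggy(self, keys):
        columns = self.OTHER_VARS   # aliases the shared class attribute
        columns.extend(keys)        # mutates it: every later call sees the extras
        return columns

    def columns_fixed(self, keys):
        columns = self.OTHER_VARS.copy()  # per-call copy; class list untouched
        columns.extend(keys)
        return columns
```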
cea/datamanagement/utils/__init__.py (2)

37-41: Persistence bug: void_deck initialization not saved to zone.shp.

When envelope.csv lacks void_deck, you set it to 0 but don’t write back to the shapefile.

Apply:

         else:  # cannot find void_deck anywhere, just initialize it to 0
             zone_gdf["void_deck"] = 0
             warnings.warn(
                 "No void_deck data found in envelope.csv, setting to 0 in zone.shp"
             )
+            zone_gdf.to_file(locator.get_zone_geometry())

74-83: Dataflow bug: building_zone_df might still miss void_deck after migration.

You call migrate_void_deck_data(locator) but continue using the original building_zone_df. If it was loaded before migration, void_deck is absent → KeyError.

Apply:

     # Ensure void_deck data is present in building_zone_df
-    migrate_void_deck_data(locator)
+    migrate_void_deck_data(locator)
+    # Re-read to ensure the migrated column is present
+    building_zone_df = gpd.read_file(locator.get_zone_geometry())
cea/demand/rc_model_SIA.py (1)

710-711: Awin_ag confirmed as above-ground window area (m²); no change required. It is declared as "Above ground window area [m2]" in cea/demand/building_properties/building_properties_row.py:241, assigned as the sum of orientation windows in cea/demand/building_properties/building_rc_model.py:219–221, and used by cea/demand/sensible_loads.py:103 and cea/demand/rc_model_SIA.py:710.

cea/demand/building_properties/building_geometry.py (1)

41-41: Drop or serialize Shapely geometry before storing in _prop_geometry

Keeping Shapely geometry objects increases memory use and embeds non-JSON-serializable objects when rows are converted to dict/JSON. Evidence: the assignment self._prop_geometry = prop_geometry (cea/demand/building_properties/building_geometry.py, ~line 41) and __getitem__ returning self._prop_geometry.loc[building_name].to_dict() (building_geometry.py:79); CEADatabase.to_dict() is returned by the dashboard API (cea/interfaces/dashboard/api/inputs.py:401), and tests/callers use .to_dict()/json.dumps elsewhere, so geometry can propagate into JSON. Action: either drop the geometry column here (df = df.drop(columns='geometry')) or convert geometries to a serializable form (WKT/GeoJSON strings) before exposing them to downstream consumers.

cea/import_export/result_summary.py (1)

15-16: Import looks good

Bringing calc_useful_areas in here aligns with the on‑demand architecture flow.

cea/demand/building_properties/building_rc_model.py (1)

112-117: Data flow is solid; two follow-ups

Good: areas via calc_useful_areas then geometry-derived envelope areas.

Please confirm areas_df retains 'height_bg' and 'perimeter' from zone for Aop_bg calculation downstream.

Comment on lines 78 to 83
# Calculate areas based on geometry
footprint = building_zone_df.geometry.area # building footprint area
floors_ag = building_zone_df['floors_ag'] # above-ground floors
floors_bg = building_zone_df['floors_bg'] # below-ground floors
void_deck = building_zone_df['void_deck'] # void deck floors


⚠️ Potential issue

Area units risk: ensure metric CRS before computing areas.

geometry.area on geographic CRS (degrees) produces wrong m². Reproject to a projected CRS first.

Apply:

+    from cea.utilities.standardize_coordinates import (
+        get_lat_lon_projected_shapefile, get_projected_coordinate_system
+    )
@@
-    footprint = building_zone_df.geometry.area  # building footprint area
+    lat, lon = get_lat_lon_projected_shapefile(building_zone_df)
+    target_crs = get_projected_coordinate_system(float(lat), float(lon))
+    zone_df_proj = building_zone_df.to_crs(target_crs)
+    footprint = zone_df_proj.geometry.area  # building footprint area [m2]
-    floors_ag = building_zone_df['floors_ag']  # above-ground floors
-    floors_bg = building_zone_df['floors_bg']  # below-ground floors
-    void_deck = building_zone_df['void_deck']  # void deck floors
+    floors_ag = building_zone_df['floors_ag']
+    floors_bg = building_zone_df['floors_bg']
+    void_deck = building_zone_df['void_deck']

Also applies to: 89-113

Comment on lines +237 to +243
# Area fields added from BuildingRCModel
Aroof: Annotated[float, "Roof area [m2]"]
Aunderside: Annotated[float, "Underside area [m2]"]
Awall_ag: Annotated[float, "Above ground wall area [m2]"]
Awin_ag: Annotated[float, "Above ground window area [m2]"]
Aop_bg: Annotated[float, "Area of opaque surfaces below ground [m2]"]


⚠️ Potential issue

New area fields should have safe defaults

Envelope CSVs in the wild may not include these new columns yet. Without defaults, from_dict will raise. Provide 0.0 defaults to maintain backward compatibility.

Apply:

-    Aroof: Annotated[float, "Roof area [m2]"]
-    Aunderside: Annotated[float, "Underside area [m2]"]
-    Awall_ag: Annotated[float, "Above ground wall area [m2]"]
-    Awin_ag: Annotated[float, "Above ground window area [m2]"]
-    Aop_bg: Annotated[float, "Area of opaque surfaces below ground [m2]"]
+    Aroof: Annotated[float, "Roof area [m2]"] = 0.0
+    Aunderside: Annotated[float, "Underside area [m2]"] = 0.0
+    Awall_ag: Annotated[float, "Above ground wall area [m2]"] = 0.0
+    Awin_ag: Annotated[float, "Above ground window area [m2]"] = 0.0
+    Aop_bg: Annotated[float, "Area of opaque surfaces below ground [m2]"] = 0.0
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Area fields added from BuildingRCModel
Aroof: Annotated[float, "Roof area [m2]"]
Aunderside: Annotated[float, "Underside area [m2]"]
Awall_ag: Annotated[float, "Above ground wall area [m2]"]
Awin_ag: Annotated[float, "Above ground window area [m2]"]
Aop_bg: Annotated[float, "Area of opaque surfaces below ground [m2]"]
# Area fields added from BuildingRCModel
Aroof: Annotated[float, "Roof area [m2]"] = 0.0
Aunderside: Annotated[float, "Underside area [m2]"] = 0.0
Awall_ag: Annotated[float, "Above ground wall area [m2]"] = 0.0
Awin_ag: Annotated[float, "Above ground window area [m2]"] = 0.0
Aop_bg: Annotated[float, "Area of opaque surfaces below ground [m2]"] = 0.0
🤖 Prompt for AI Agents
In cea/demand/building_properties/building_properties_row.py around lines 237 to
243, the new area fields Aroof, Aunderside, Awall_ag, Awin_ag and Aop_bg lack
defaults so from_dict will raise when CSVs omit them; add safe defaults of 0.0
to each field declaration (e.g. set each Annotated field to = 0.0) so missing
columns map to 0.0 and maintain backward compatibility.

Comment on lines +273 to +276
footprint: Annotated[float, "Building footprint area [m2]"]
Atot: Annotated[float, "Area of all surfaces facing the building zone [m2]"]
Am: Annotated[float, "Effective mass area [m2]"]
# TODO: Determine if these fields belong here (calculated using calc_useful_areas)

⚠️ Potential issue

RCModelProperties: add defaults for newly required fields

If the upstream dict lacks 'footprint' or 'Am' (computed later), construction will fail. Provide safe defaults here too. Note that 'Atot' then also needs a default: a dataclass field without a default cannot follow one that has a default.

Apply:

-    footprint: Annotated[float, "Building footprint area [m2]"]
+    footprint: Annotated[float, "Building footprint area [m2]"] = 0.0
-    Atot: Annotated[float, "Area of all surfaces facing the building zone [m2]"]
+    Atot: Annotated[float, "Area of all surfaces facing the building zone [m2]"] = 0.0
-    Am: Annotated[float, "Effective mass area [m2]"]
+    Am: Annotated[float, "Effective mass area [m2]"] = 0.0
📝 Committable suggestion

Suggested change
footprint: Annotated[float, "Building footprint area [m2]"]
Atot: Annotated[float, "Area of all surfaces facing the building zone [m2]"]
Am: Annotated[float, "Effective mass area [m2]"]
# TODO: Determine if these fields belong here (calculated using calc_useful_areas)
footprint: Annotated[float, "Building footprint area [m2]"] = 0.0
Atot: Annotated[float, "Area of all surfaces facing the building zone [m2]"] = 0.0
Am: Annotated[float, "Effective mass area [m2]"] = 0.0
# TODO: Determine if these fields belong here (calculated using calc_useful_areas)
🤖 Prompt for AI Agents
In cea/demand/building_properties/building_properties_row.py around lines
273-276, RCModelProperties now requires 'footprint' and 'Am' but these keys may
be missing from upstream dict causing construction to fail; add safe default
values (e.g., footprint=0.0 and Am=0.0) to the dataclass/constructor/field
definitions so instances can be created when those fields are absent, and ensure
the defaults match the annotated types and do not break later computations that
will overwrite them.

Comment on lines 161 to 164
df['Htr_ms'] = H_MS * df['Am'] # Coupling conductance 1 in W/K
df['Htr_em'] = 1 / (1 / df['Htr_op'] - 1 / df['Htr_ms']) # Coupling conductance 2 in W/K
df['Htr_is'] = H_IS * df['Atot']


⚠️ Potential issue

Htr_em can explode or go negative

When Htr_op ≈ Htr_ms the denominator approaches 0. Guard numerically to avoid inf/negatives.

Apply:

-        df['Htr_em'] = 1 / (1 / df['Htr_op'] - 1 / df['Htr_ms'])  # Coupling conductance 2 in W/K
+        denom = (1 / df['Htr_op']) - (1 / df['Htr_ms'])
+        eps = 1e-9
+        df['Htr_em'] = 1 / denom.clip(lower=eps)
📝 Committable suggestion


Suggested change
df['Htr_ms'] = H_MS * df['Am'] # Coupling conductance 1 in W/K
df['Htr_em'] = 1 / (1 / df['Htr_op'] - 1 / df['Htr_ms']) # Coupling conductance 2 in W/K
df['Htr_is'] = H_IS * df['Atot']
df['Htr_ms'] = H_MS * df['Am'] # Coupling conductance 1 in W/K
denom = (1 / df['Htr_op']) - (1 / df['Htr_ms'])
eps = 1e-9
df['Htr_em'] = 1 / denom.clip(lower=eps)
df['Htr_is'] = H_IS * df['Atot']
🤖 Prompt for AI Agents
In cea/demand/building_properties/building_rc_model.py around lines 161-164, the
computation of df['Htr_em'] = 1 / (1 / df['Htr_op'] - 1 / df['Htr_ms']) can blow
up or become negative when 1/Htr_op ≈ 1/Htr_ms; guard numerically by computing
the denominator explicitly, apply a small positive tolerance (e.g. eps = 1e-6)
and use a vectorized conditional to replace any denominator values with eps when
they are near zero or non-positive, then compute Htr_em from that safe
denominator and ensure the final Htr_em is non-negative (e.g. clip to a minimum
of eps) to avoid inf/NaN/negative values.
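The clamping behaviour of `Series.clip` on a near-singular denominator can be sketched as follows (the values are made up to cover the healthy, near-zero, and negative cases):

```python
import pandas as pd

denom = pd.Series([0.5, 1e-12, -0.2])  # healthy, near-zero, negative
eps = 1e-9

# Near-zero and negative entries are clamped up to eps before inverting,
# so the result is large but finite and positive instead of inf or negative.
safe = 1 / denom.clip(lower=eps)
```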

Comment on lines +31 to +35
if "void_deck" in zone_df.columns and "void_deck" in architecture_df.columns:
props['suffixes'] = ('', '_arch')

# Merge zone data with architecture data to get building properties
df = zone_df.merge(architecture_df, how='left', left_index=True, right_index=True, **props)

⚠️ Potential issue

Guard merge against duplicate/missing 'void_deck' and enforce zone-owned value

If both inputs contain 'void_deck' you set suffixes but still later read df['void_deck'] unconditionally. This risks KeyError or picking the wrong column. Drop it from architecture_df and default to 0 if entirely absent.

Apply:

-    props = {}
-    if "void_deck" in zone_df.columns and "void_deck" in architecture_df.columns:
-        props['suffixes'] = ('', '_arch')
+    # Ensure 'void_deck' comes from zone only
+    architecture_df = architecture_df.drop(columns=['void_deck'], errors='ignore')
+    if 'void_deck' not in zone_df.columns:
+        zone_df = zone_df.copy()
+        zone_df['void_deck'] = 0
+    props = {}
📝 Committable suggestion


Suggested change
if "void_deck" in zone_df.columns and "void_deck" in architecture_df.columns:
props['suffixes'] = ('', '_arch')
# Merge zone data with architecture data to get building properties
df = zone_df.merge(architecture_df, how='left', left_index=True, right_index=True, **props)
# Ensure 'void_deck' comes from zone only
architecture_df = architecture_df.drop(columns=['void_deck'], errors='ignore')
if 'void_deck' not in zone_df.columns:
zone_df = zone_df.copy()
zone_df['void_deck'] = 0
props = {}
# Merge zone data with architecture data to get building properties
df = zone_df.merge(architecture_df, how='left', left_index=True, right_index=True, **props)
🤖 Prompt for AI Agents
In cea/demand/building_properties/useful_areas.py around lines 31 to 35, the
merge can end up with duplicated or missing 'void_deck' and the code later reads
df['void_deck'] unconditionally; before merging, drop 'void_deck' from
architecture_df so the zone_df value always wins, and if neither dataframe has
'void_deck' add it to zone_df (or to df after merge) with a default of 0; then
perform the merge (no need for suffixes for void_deck), and ensure post-merge
df['void_deck'] is present and sourced from the zone-owned column (or defaulted
to 0).
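Dropping the column from one side before the merge keeps the zone-owned value unambiguous; a small sketch (building names and values are illustrative):

```python
import pandas as pd

zone_df = pd.DataFrame({'void_deck': [1, 0], 'height_ag': [12.0, 9.0]},
                       index=['B1001', 'B1002'])
architecture_df = pd.DataFrame({'void_deck': [9, 9], 'Hs_ag': [0.8, 0.9]},
                               index=['B1001', 'B1002'])

# Zone owns void_deck: remove any copy on the architecture side first,
# so the merge never needs suffixes and df['void_deck'] is always the zone value.
architecture_df = architecture_df.drop(columns=['void_deck'], errors='ignore')
df = zone_df.merge(architecture_df, how='left', left_index=True, right_index=True)
```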

Comment on lines +37 to +40
# reproject to projected coordinate system (in meters) to calculate area
lat, lon = get_lat_lon_projected_shapefile(zone_df)
df = df.to_crs(get_projected_coordinate_system(float(lat), float(lon)))


⚠️ Potential issue

❓ Verification inconclusive

CRS precondition before to_crs

to_crs will fail if zone_df.crs is None. Add an explicit check with a clear error to aid users with malformed shapefiles.

Apply:

-    lat, lon = get_lat_lon_projected_shapefile(zone_df)
+    if zone_df.crs is None:
+        raise ValueError("zone_df has no CRS. Ensure zone.shp contains a valid CRS before calling calc_useful_areas.")
+    lat, lon = get_lat_lon_projected_shapefile(zone_df)

Fail fast when zone_df has no CRS

File: cea/demand/building_properties/useful_areas.py Lines: 37-40

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    # reproject to projected coordinate system (in meters) to calculate area
-    lat, lon = get_lat_lon_projected_shapefile(zone_df)
-    df = df.to_crs(get_projected_coordinate_system(float(lat), float(lon)))
+    # reproject to projected coordinate system (in meters) to calculate area
+    if zone_df.crs is None:
+        raise ValueError("zone_df has no CRS. Ensure zone.shp contains a valid CRS before calling calc_useful_areas.")
+    lat, lon = get_lat_lon_projected_shapefile(zone_df)
+    df = df.to_crs(get_projected_coordinate_system(float(lat), float(lon)))
🤖 Prompt for AI Agents
In cea/demand/building_properties/useful_areas.py around lines 37 to 40, add an
explicit check for zone_df.crs being None before reprojecting: if zone_df.crs is
None raise a clear ValueError (or custom exception) with a message like "zone_df
has no CRS; cannot reproject — set zone_df.crs or use .set_crs() before calling
this function"; perform this check before calling
get_lat_lon_projected_shapefile() and before df.to_crs(...) so the function
fails fast with an actionable error.
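The fail-fast guard can be exercised in isolation with a stub object standing in for a real GeoDataFrame, since only the `.crs` attribute matters here. `FakeGeoDataFrame` and `require_crs` are illustrative names, not CEA code:

```python
class FakeGeoDataFrame:
    """Minimal stand-in for a GeoDataFrame: only the .crs attribute is used."""
    def __init__(self, crs=None):
        self.crs = crs

def require_crs(zone_df) -> None:
    # Fail fast with an actionable message instead of letting to_crs() raise later
    if zone_df.crs is None:
        raise ValueError("zone_df has no CRS. Ensure zone.shp contains a valid CRS "
                         "before calling calc_useful_areas.")

try:
    require_crs(FakeGeoDataFrame(crs=None))
    failed = False
except ValueError as exc:
    failed = True
    message = str(exc)

require_crs(FakeGeoDataFrame(crs="EPSG:4326"))  # passes silently
```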

Comment on lines +86 to 87
columns, hourly_data = self.calc_hourly_dataframe(date, tsd)
self.write_to_hdf5(building_name, columns, hourly_data, locator)

⚠️ Potential issue

🧩 Analysis chain

API change removes 'name' from hourly CSV; check plot compatibility.

Many plots select by hourly_loads['name']. Removing the column likely breaks them.

If needed, add name back without changing the calc signature:

     def results_to_csv(self, tsd: TimeSeriesData, bpr: BuildingPropertiesRow, locator, date, building_name):
         # save hourly data
-        columns, hourly_data = self.calc_hourly_dataframe(date, tsd)
+        columns, hourly_data = self.calc_hourly_dataframe(date, tsd)
+        hourly_data.insert(0, 'name', building_name)
+        columns = ['name'] + columns
         self.write_to_csv(building_name, columns, hourly_data, locator)

And keep the HDF5 path robust via the earlier errors='ignore' change.

Also applies to: 100-101


🏁 Script executed:

#!/bin/bash
# Find consumers expecting 'name' in hourly loads
rg -nP --type=py -C2 "\[\'name\'\]"

Length of output: 28039


Reintroduce 'name' column into hourly outputs — breaks many consumers.

Many modules filter hourly_loads by hourly_loads['name'] and will break if the column is removed. Add the column back before writing hourly CSV/HDF5 in cea/demand/demand_writers.py (around the write calls at ~lines 86–87 and 100–101): hourly_data.insert(0, 'name', building_name) and columns = ['name'] + columns. Apply the same for the HDF5 writer.

Key consumers to check: cea/visualisation/special/comfort_chart.py:54, cea/plots/demand/energy_balance.py:65, cea/plots/demand/__init__.py:153, cea/plots/demand/heating_reset_schedule.py:39.

🤖 Prompt for AI Agents
In cea/demand/demand_writers.py around lines 86–87 (and likewise around lines
100–101), the hourly outputs no longer include the 'name' column which breaks
downstream consumers; before writing the CSV and HDF5 add the building name as
the first column with hourly_data.insert(0, 'name', building_name) and prepend
the columns list with columns = ['name'] + columns, and apply the identical
change for the HDF5 writer so both CSV and HDF5 outputs include the 'name'
column.
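The two-line fix reads, in isolation (assuming `hourly_data` is a plain DataFrame; `add_name_column` is an illustrative wrapper, not the writer's API):

```python
import pandas as pd

def add_name_column(hourly_data: pd.DataFrame, columns: list, building_name: str):
    """Prepend the building name so downstream plots can filter hourly_loads['name']."""
    hourly_data = hourly_data.copy()
    hourly_data.insert(0, 'name', building_name)  # 'name' becomes the first column
    return ['name'] + columns, hourly_data

cols = ['Qhs_sys_kWh', 'Qcs_sys_kWh']
hourly = pd.DataFrame({'Qhs_sys_kWh': [1.0, 2.0], 'Qcs_sys_kWh': [0.5, 0.0]})
cols, hourly = add_name_column(hourly, cols, 'B1001')
```

The same insert/prepend pair applied before the HDF5 write keeps both output formats consistent.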

Comment on lines +103 to 107
I_rad_win = tsd.thermal_resistance.RSE_win[t] * bpr.envelope.U_win * calc_hr(bpr.envelope.e_win, theta_ss) * bpr.envelope.Awin_ag * delta_theta_er
I_rad_roof = tsd.thermal_resistance.RSE_roof[t] * bpr.envelope.U_roof * calc_hr(bpr.envelope.e_roof, theta_ss) * bpr.envelope.Aroof * delta_theta_er
I_rad_wall = tsd.thermal_resistance.RSE_wall[t] * bpr.envelope.U_wall * calc_hr(bpr.envelope.e_wall, theta_ss) * bpr.envelope.Awall_ag * delta_theta_er
I_rad_underside = tsd.thermal_resistance.RSE_underside[t] * bpr.envelope.U_base * calc_hr(bpr.envelope.e_underside, theta_ss) * bpr.envelope.Aunderside * delta_theta_er
I_rad = Fform_wall * I_rad_wall + Fform_win * I_rad_win + Fform_roof * I_rad_roof + Fform_underside * I_rad_underside

⚠️ Potential issue

🧩 Analysis chain

Inconsistent envelope field names: likely AttributeError at runtime.

Aunderside and e_underside don’t appear elsewhere; other modules use Aop_bg and often e_base. Align naming to the envelope schema.

Apply:

-    I_rad_underside = tsd.thermal_resistance.RSE_underside[t] * bpr.envelope.U_base * calc_hr(bpr.envelope.e_underside, theta_ss) * bpr.envelope.Aunderside * delta_theta_er
+    I_rad_underside = (
+        tsd.thermal_resistance.RSE_underside[t]
+        * bpr.envelope.U_base
+        * calc_hr(bpr.envelope.e_base, theta_ss)
+        * bpr.envelope.Aop_bg
+        * delta_theta_er
+    )

🏁 Script executed:

#!/bin/bash
# Verify which envelope attributes exist across the codebase
rg -nP --type=py -C2 '\benvelope\.(Aunderside|Aop_bg|e_underside|e_base)\b'

Length of output: 1339


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== envelope.e_* occurrences ==="
rg -nP --type=py '\benvelope\.e_[A-Za-z0-9_]+' -S -n -C2 || true

echo
echo "=== envelope.A* occurrences ==="
rg -nP --type=py '\benvelope\.A[A-Za-z0-9_]+' -S -n -C2 || true

echo
echo "=== Envelope class / constructor matches ==="
rg -nP --type=py 'class\s+\w*Envelope\b|@dataclass\s*\n\s*class\s+\w*Envelope\b|\bEnvelope\(' -S -n -C2 || true

echo
echo "=== assignments / constructions of envelope ==="
rg -nP --type=py '\benvelope\s*=|=\s*Envelope\(' -S -n -C2 || true

echo
echo "=== sensible_loads.py context ==="
sed -n '1,220p' cea/demand/sensible_loads.py || true

Length of output: 19228


Replace undefined envelope attributes on underside radiation
In cea/demand/sensible_loads.py line 106, use bpr.envelope.e_base and bpr.envelope.Aop_bg instead of bpr.envelope.e_underside and bpr.envelope.Aunderside to match the existing envelope schema.

🤖 Prompt for AI Agents
In cea/demand/sensible_loads.py around lines 103 to 107, the underside radiation
calculation uses undefined envelope attributes e_underside and Aunderside;
replace those with the correct schema attributes bpr.envelope.e_base and
bpr.envelope.Aop_bg respectively so the I_rad_underside line uses
RSE_underside[t] * bpr.envelope.U_base * calc_hr(bpr.envelope.e_base, theta_ss)
* bpr.envelope.Aop_bg * delta_theta_er and leave the rest of the I_rad
composition unchanged.

Replaced DataFrame usage with pandas Series for HEX areas, UA, and Q_nom data in substation_HEX_sizing. Updated substation_HEX_design_main to aggregate Q_nom data as a DataFrame from Series. Simplified access to HEX_UA values in substation return functions by removing DataFrame indexing and using Series directly.
Updated substation_HEX_sizing to return multiple values instead of a list. Adjusted substation_HEX_design_main to unpack these values and store them correctly in the DataFrame, improving code clarity and data handling.
Replaces repeated access to buildings_demands[name] with a local demand_df variable for clarity and efficiency in determine_building_supply_temperatures. This improves code readability and maintainability by reducing redundant dictionary lookups.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
cea/technologies/thermal_network/substation_matrix.py (6)

15-25: Include DATE and set a time index when loading demand CSVs.

Downstream uses .loc[[t]], but DATE isn’t loaded, so index is a RangeIndex. Load DATE and set it as index to avoid silent misalignment.

Apply:

-        demand_df = pd.read_csv(locator.get_demand_results_file(name),
-                                usecols=(BUILDINGS_DEMANDS_COLUMNS))
+        demand_df = (
+            pd.read_csv(
+                locator.get_demand_results_file(name),
+                usecols=['DATE', *BUILDINGS_DEMANDS_COLUMNS]
+            )
+            .set_index('DATE')
+        )

Also applies to: 86-88


329-349: Bug: Attribute access on Series; use indexing for HEX_UA.

substation_HEX_specs at this point is a row Series whose 'HEX_UA' value is another Series. Attribute access (substation_HEX_specs.HEX_UA.UA_…) will raise AttributeError. Use bracket access and guard by index membership + non‑NaN.

Apply:

-    if 'UA_heating_hs_ahu' in substation_HEX_specs.HEX_UA:
+    UA_series = substation_HEX_specs["HEX_UA"]
+    if 'UA_heating_hs_ahu' in UA_series.index and pd.notna(UA_series['UA_heating_hs_ahu']):
@@
-                                                                                    substation_HEX_specs.HEX_UA.UA_heating_hs_ahu,
+                                                                                    UA_series['UA_heating_hs_ahu'],

Do the same for hs_aru, hs_shu, and hs_ww below.

Also applies to: 361-379


413-472: Same bug in DC path: attribute access on Series.

Use bracket access with index/NaN checks for all UA_cooling_cs_*.

Apply:

-    if 'UA_cooling_cs_ahu' in substation_HEX_specs.HEX_UA:
+    UA_series = substation_HEX_specs["HEX_UA"]
+    if 'UA_cooling_cs_ahu' in UA_series.index and pd.notna(UA_series['UA_cooling_cs_ahu']):
@@
-                                                                                    substation_HEX_specs.HEX_UA.UA_cooling_cs_ahu,
+                                                                                    UA_series['UA_cooling_cs_ahu'],

Repeat for cs_aru, cs_scu, cs_data, cs_re.


854-856: Update main() calls to new signatures.

main still calls determine_building_supply_temperatures and substation_HEX_design_main with old signatures; it will crash.

Apply:

-    thermal_network.buildings_demands = determine_building_supply_temperatures(thermal_network.building_names, locator)
-    thermal_network.substations_HEX_specs = substation_HEX_design_main(thermal_network.buildings_demands)
+    thermal_network.buildings_demands = determine_building_supply_temperatures(
+        thermal_network.building_names, locator, thermal_network.substation_systems
+    )
+    thermal_network.substations_HEX_specs, _ = substation_HEX_design_main(
+        thermal_network.buildings_demands, thermal_network.substation_systems, thermal_network
+    )

329-480: Replace attribute-style access on substation_HEX_specs.HEX_UA with dict-style indexing (or ensure HEX_UA exposes attributes) to avoid AttributeError.

Change instances of substation_HEX_specs.HEX_UA.UA_* to substation_HEX_specs.HEX_UA['UA_*'] (or use getattr) — you currently check membership with "in" which implies HEX_UA is a mapping.

Occurrences to fix:

  • cea/technologies/thermal_network/substation_matrix.py:332 — UA_heating_hs_ahu
  • cea/technologies/thermal_network/substation_matrix.py:348 — UA_heating_hs_aru
  • cea/technologies/thermal_network/substation_matrix.py:364 — UA_heating_hs_shu
  • cea/technologies/thermal_network/substation_matrix.py:378 — UA_heating_hs_ww
  • cea/technologies/thermal_network/substation_matrix.py:416 — UA_cooling_cs_ahu
  • cea/technologies/thermal_network/substation_matrix.py:431 — UA_cooling_cs_aru
  • cea/technologies/thermal_network/substation_matrix.py:446 — UA_cooling_cs_scu
  • cea/technologies/thermal_network/substation_matrix.py:459 — UA_cooling_cs_data
  • cea/technologies/thermal_network/substation_matrix.py:471 — UA_cooling_cs_re

578-618: Fix ambiguous NumPy truth-value comparisons in HEX solver

  • In cea/technologies/thermal_network/substation_matrix.py (lines 578–618; also 707–747) expressions like if ch > 0, if cmin < ch, and if cc_old.any() > 0 compare NumPy arrays and will raise or behave incorrectly.
  • Replace with explicit scalar/reduction checks: coerce single-element arrays to scalars (e.g., ch = ch.item() or float(np.asarray(ch)[0])) or use .any()/.all()/.max() consistently according to the intended logic; use if cc_old.any(): instead of if cc_old.any() > 0.
  • I can submit a small patch to coerce single-row arrays to scalars and update the comparisons.
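The ambiguity described above is easy to reproduce; a minimal sketch (the names `ch` and `cc_old` echo the solver's variables, the values are made up):

```python
import numpy as np

ch = np.array([5.0])            # single-element array, as in the HEX solver
cc_old = np.array([0.0, 1.0])   # multi-element array

# 'if arr > 0' on a multi-element array raises ValueError (truth value ambiguous)
try:
    bool(np.array([1.0, 2.0]) > 0)
    ambiguous = False
except ValueError:
    ambiguous = True

# Coerce a known single-element array to a scalar before comparing
ch_scalar = ch.item()
positive = ch_scalar > 0

# For element-wise conditions, state the reduction explicitly:
any_flow = cc_old.any()         # instead of 'cc_old.any() > 0'
```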
🧹 Nitpick comments (7)
cea/technologies/thermal_network/substation_matrix.py (7)

51-54: Fix return docstring and type mismatch.

Docstring says “(substations_HEX_specs, buildings_demands)” but the function returns “(substations_HEX_specs, substations_Q)”. Update the docstring accordingly.

Apply:

-    :return: ``(substations_HEX_specs, buildings_demands)`` - substations_HEX_specs: dataframe with substation heat
-        exchanger specs at each building,  buildings_demands: lists of heating demand/flowrate/supply temperature of all
-        buildings connected to the network.
+    :return: (substations_HEX_specs, substations_Q)
+        - substations_HEX_specs: DataFrame with per-building Series for HEX_areas and HEX_UA
+        - substations_Q: DataFrame with nominal loads (kW) per building/system.

Also applies to: 69-73


92-104: Replace np.vectorize with NaN-aware aggregation.

np.vectorize is Python-looped and slow. Prefer numpy nan‑aware reductions.

Apply:

-                T_supply_heating_C = np.vectorize(calc_DH_supply)(T_supply_heating_C,
-                                                                  np.where(demand_df['Qhs_sys_' + system + '_kWh'] > 0,
-                                                                           demand_df['Ths_sys_sup_' + system + '_C'],
-                                                                           np.nan))
+                T_supply_heating_C = np.nanmax(
+                    np.vstack([
+                        np.asarray(T_supply_heating_C),
+                        np.where(demand_df['Qhs_sys_' + system + '_kWh'] > 0,
+                                 demand_df['Ths_sys_sup_' + system + '_C'],
+                                 np.nan)
+                    ]),
+                    axis=0
+                )

110-127: Same for DC: prefer nan-aware vector ops over np.vectorize.

Consistent with the heating path.

Apply similarly for each DC branch:

-                T_supply_cooling_C = np.vectorize(calc_DC_supply)(T_supply_cooling_C,
-                                                                  np.where(
-                                                                      abs(demand_df.Qcdata_sys_kWh) > 0,
-                                                                      demand_df.Tcdata_sys_sup_C,
-                                                                      np.nan))
+                T_supply_cooling_C = np.nanmin(
+                    np.vstack([
+                        np.asarray(T_supply_cooling_C),
+                        np.where(abs(demand_df.Qcdata_sys_kWh) > 0, demand_df.Tcdata_sys_sup_C, np.nan)
+                    ]),
+                    axis=0
+                )

Repeat the same pattern for the 're' and generic cooling systems.
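The pattern in the diffs above — mask the candidate temperatures to NaN where there is no load, then take an element-wise NaN-aware extremum — can be sketched in isolation (all values invented; DH keeps the hotter source, so nanmax):

```python
import numpy as np

# Running supply temperature per timestep (may contain NaN early on)
T_supply = np.array([60.0, np.nan, 58.0])

# Candidate temperatures, masked to NaN in timesteps with no load
Q = np.array([0.0, 5.0, 3.0])
T_cand = np.where(Q > 0, np.array([65.0, 62.0, 55.0]), np.nan)

# Element-wise NaN-aware max across the two rows, one vectorized call
T_supply = np.nanmax(np.vstack([T_supply, T_cand]), axis=0)
```

Timestep 0 keeps 60.0 (no load, candidate is NaN), timestep 1 takes 62.0 (running value was NaN), timestep 2 keeps 58.0 (hotter than the 55.0 candidate). The DC path is identical with `np.nanmin`.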


171-175: Set Series names for clarity.

Naming the returned Series improves downstream debugging/printing.

Apply:

-    hex_areas = pd.Series(index=area_columns, dtype=float)      # Series 1
-    UA_data = pd.Series(index=UA_columns, dtype=float)          # Series 2
-    Q_nom_data = pd.Series(index=Q_columns, dtype=float)        # Series 3
+    hex_areas = pd.Series(index=area_columns, dtype=float, name='HEX_areas')
+    UA_data = pd.Series(index=UA_columns, dtype=float, name='HEX_UA')
+    Q_nom_data = pd.Series(index=Q_columns, dtype=float, name='Q_nom')

Also applies to: 207-207


76-83: Doc: describe substation_systems and delta_T_supply_return.

Add brief param doc to avoid ambiguity (keys expected in substation_systems['heating'/'cooling'] and units for delta_T).


141-142: Units: confirm Q_nom_data rounding and flow units.

Q_nom_data rounds Qnom/1000 (kW). Ensure consumers of substations_Q expect kW, not W. Add a note or rename columns to include _kW suffix.

Also applies to: 249-249


58-68: Consider normalizing HEX specs to a flat, typed table.

Storing Series inside DataFrame cells complicates serialization and type checking. A tidy alternative is a MultiIndex DataFrame: index=building, columns=spec ('A'/'UA') x system.

Do you want a follow-up patch to migrate to a flat table? It simplifies IO and downstream lookups.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f6d3239 and ab9746a.

📒 Files selected for processing (1)
  • cea/technologies/thermal_network/substation_matrix.py (14 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
cea/technologies/thermal_network/substation_matrix.py (2)
cea/inputlocator.py (1)
  • get_demand_results_file (1432-1434)
cea/technologies/substation.py (4)
  • calc_DH_supply (887-897)
  • calc_DC_supply (866-884)
  • calc_HEX_heating (769-820)
  • calc_HEX_cooling (665-716)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: tests (ubuntu-latest)
  • GitHub Check: tests (macos-latest)
  • GitHub Check: tests (windows-latest)
🔇 Additional comments (4)
cea/technologies/thermal_network/substation_matrix.py (4)

141-142: Confirm timestep basis for flow calculation (kWh → W).

Multiplying kWh by 1000 assumes 1-hour timesteps (kWh/h = kW). If timestep differs, flows will be wrong. Consider dividing by timestep_hours and converting to W via ×1000.

Apply:

-        demand_df['V_substation_heating_m3pers'] = (demand_df['Q_substation_heating'] * 1000 / (delta_T_supply_return  * HEAT_CAPACITY_OF_WATER_JPERKGK)) / P_WATER_KGPERM3
-        demand_df['V_substation_cooling_m3pers'] = (demand_df['Q_substation_cooling'] * 1000 / (delta_T_supply_return  * HEAT_CAPACITY_OF_WATER_JPERKGK)) / P_WATER_KGPERM3
+        timestep_h = getattr(getattr(locator, 'config', None), 'time_resolution_hours', 1) or 1
+        factor_W = 1000 / timestep_h
+        demand_df['V_substation_heating_m3pers'] = (demand_df['Q_substation_heating'] * factor_W / (delta_T_supply_return * HEAT_CAPACITY_OF_WATER_JPERKGK)) / P_WATER_KGPERM3
+        demand_df['V_substation_cooling_m3pers'] = (demand_df['Q_substation_cooling'] * factor_W / (delta_T_supply_return * HEAT_CAPACITY_OF_WATER_JPERKGK)) / P_WATER_KGPERM3

If locator doesn’t expose a timestep, wire it via function args.


799-823: calc_DC_supply/calc_DH_supply: align semantics with callers.

Given callers pass NaN when no load, using nanmin/nanmax is correct. Consider noting this in docstrings to avoid future reintroduction of 0.0 sentinel logic.


149-207: Signature change ripple-check — no external callers found

Only call site is cea/technologies/thermal_network/substation_matrix.py:64 and it already unpacks three returned Series; no other callers detected.


15-25: Verify BUILDINGS_DEMANDS_COLUMNS against demand CSV schema

rg matched these column names only in cea/plots/naming.csv; no scenario demand CSVs were found in the repo, so I cannot confirm the actual demand CSV schema. read_csv(..., usecols=BUILDINGS_DEMANDS_COLUMNS) will raise if any column is missing.

File: cea/technologies/thermal_network/substation_matrix.py (lines 15–25)

BUILDINGS_DEMANDS_COLUMNS = ['Ths_sys_sup_aru_C', 'Ths_sys_sup_ahu_C', 'Ths_sys_sup_shu_C',
                             'Qww_sys_kWh', 'Tww_sys_sup_C', 'Tww_sys_re_C', 'mcpww_sys_kWperC',
                             'Qcdata_sys_kWh', 'Tcdata_sys_sup_C', 'Tcdata_sys_re_C', 'mcpcdata_sys_kWperC',
                             'Qcre_sys_kWh', 'Tcre_sys_sup_C', 'Tcre_sys_re_C', 'mcpcre_sys_kWperC',
                             'Ths_sys_re_aru_C', 'Ths_sys_re_ahu_C', 'Ths_sys_re_shu_C',
                             'Tcs_sys_sup_ahu_C', 'Tcs_sys_sup_aru_C',
                             'Tcs_sys_sup_scu_C', 'Tcs_sys_re_ahu_C', 'Tcs_sys_re_aru_C', 'Tcs_sys_re_scu_C',
                             'Qhs_sys_aru_kWh', 'Qhs_sys_ahu_kWh', 'Qhs_sys_shu_kWh',
                             'Qcs_sys_ahu_kWh', 'Qcs_sys_aru_kWh', 'Qcs_sys_scu_kWh', 'mcphs_sys_aru_kWperC',
                             'mcphs_sys_ahu_kWperC', 'mcphs_sys_shu_kWperC', 'mcpcs_sys_ahu_kWperC',
                             'mcpcs_sys_aru_kWperC', 'mcpcs_sys_scu_kWperC', 'E_sys_kWh']
  • Confirm every name in BUILDINGS_DEMANDS_COLUMNS exists in the demand CSV(s) used at runtime.
  • If not, either add the missing columns to those CSVs or change the code to pass only existing columns to read_csv (e.g., intersect BUILDINGS_DEMANDS_COLUMNS with the CSV's columns before using usecols).
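The intersection suggested in the second bullet can be implemented by reading the header first, sketched here with an in-memory CSV (column names borrowed from the list above, data invented):

```python
import io
import pandas as pd

BUILDINGS_DEMANDS_COLUMNS = ['Qww_sys_kWh', 'Tww_sys_sup_C', 'E_sys_kWh', 'Qcre_sys_kWh']

# A demand CSV that is missing two of the expected columns
csv_text = "DATE,Qww_sys_kWh,E_sys_kWh\n2025-01-01,1.2,3.4\n2025-01-02,0.8,2.9\n"

# Read only the header row, then intersect so read_csv never raises on missing columns
header = pd.read_csv(io.StringIO(csv_text), nrows=0).columns
available = [c for c in BUILDINGS_DEMANDS_COLUMNS if c in header]
demand_df = pd.read_csv(io.StringIO(csv_text), usecols=['DATE', *available]).set_index('DATE')
```

This also folds in the earlier suggestion to load DATE and use it as the index, so downstream `.loc[[t]]` lookups align with timestamps rather than a RangeIndex.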

Comment on lines 58 to 71
# Calculate disconnected buildings_demands files and substation operation.
substations_HEX_specs = pd.DataFrame(columns=['HEX_areas', 'HEX_UA', 'HEX_Q'])
substations_Q = pd.DataFrame()
for name in buildings_demands.keys():
substations_HEX_specs = pd.DataFrame(columns=['HEX_areas', 'HEX_UA'])
Q_nom_list = []
for name, demand in buildings_demands.items():
print(name)
# calculate substation parameters (A,UA) per building and store to .csv (target)
substation_HEX = substation_HEX_sizing(buildings_demands[name], substation_systems, thermal_network)
hex_areas, UA_data, Q_nom_data = substation_HEX_sizing(demand, substation_systems, thermal_network)
# write into dataframe
substations_HEX_specs.loc[name] = substation_HEX
if substations_Q.empty:
substations_Q = pd.DataFrame(substation_HEX[2])
else:
substations_Q = pd.concat([substations_Q, substation_HEX[2]])
substations_HEX_specs.loc[name] = {'HEX_areas': hex_areas, 'HEX_UA': UA_data}
Q_nom_list.append(Q_nom_data)

# Convert list of Series to DataFrame
substations_Q = pd.DataFrame(Q_nom_list, index=buildings_demands.keys())


⚠️ Potential issue

Preserve build order for substations_Q and remove stray print.

  • Indexing substations_Q with buildings_demands.keys() assumes dict order matches Q_nom_list append order. Capture names explicitly.
  • Replace print with logger or drop it.

Apply:

-    substations_HEX_specs = pd.DataFrame(columns=['HEX_areas', 'HEX_UA'])
-    Q_nom_list = []
-    for name, demand in buildings_demands.items():
-        print(name)
+    substations_HEX_specs = pd.DataFrame(columns=['HEX_areas', 'HEX_UA'])
+    Q_nom_list: list[pd.Series] = []
+    names: list[str] = []
+    for name, demand in buildings_demands.items():
         # calculate substation parameters (A,UA) per building and store to .csv (target)
         hex_areas, UA_data, Q_nom_data = substation_HEX_sizing(demand, substation_systems, thermal_network)
         # write into dataframe
         substations_HEX_specs.loc[name] = {'HEX_areas': hex_areas, 'HEX_UA': UA_data}
         Q_nom_list.append(Q_nom_data)
+        names.append(name)
 
-    # Convert list of Series to DataFrame
-    substations_Q = pd.DataFrame(Q_nom_list, index=buildings_demands.keys())
+    # Convert list of Series to DataFrame
+    substations_Q = pd.DataFrame(Q_nom_list, index=names)
🤖 Prompt for AI Agents
In cea/technologies/thermal_network/substation_matrix.py around lines 58 to 71,
remove the stray print(name) and preserve the exact build order when creating
substations_Q by collecting the building names into a list as you iterate and
then use that list as the DataFrame index (instead of relying on dict keys
order); specifically, during the loop append each name to a names_list, append
Q_nom_data to Q_nom_list, and after the loop construct substations_Q from
Q_nom_list with index=names_list (or replace print with appropriate logger.debug
if logging is needed).

@reyery reyery merged commit c2fbdef into master Sep 22, 2025
8 checks passed
@reyery reyery deleted the some-fix-from-idp-shanghai---void-deck-and-pv-without-demand branch September 22, 2025 10:08