This notebook presents lca-algebraic, a small library built on top of brightway2, designed for the definition of parametric inventories with fast computation of LCA impacts, suitable for Monte Carlo analysis.
lca-algebraic provides a set of helper functions for:
- compact & human-readable definition of activities:
    - search background (tech and biosphere) activities
    - create new foreground activities with parametrized amounts
    - parametrize / update existing background activities (extending the class Activity)
- definition of parameters
- computation of LCAs
- computation of statistics (including Sobol indices)
- computation of simplified parametric models by fixing minor input parameters
The main idea of this library is to move from a procedural definition of models (slow and error-prone) to a declarative, purely functional definition of parametric models (models as pure functions).
This enables fast computation of LCA impacts, useful for Monte Carlo methods and statistical analysis. We leverage the power of symbolic calculus provided by the library SymPy.
We define our model in a separate database, as a nested combination of:
- other foreground activities
- background activities:
    - technosphere, referring to the ecoinvent DB
    - biosphere, referring to brightway2 biosphere activities
The amounts in exchanges are expressed either as static amounts, or symbolic expressions of pre-defined parameters.
Each activity of our root model is defined as a parametrized combination of the foreground activities, which can themselves be expressed by the background activities.
When computing LCAs for foreground models, the library expands the model into a combination of background activities only. It computes the impacts of all required background activities once and for all, and compiles a fast numpy (vectorized) function for each impact, replacing each background activity by the static value of the corresponding impact.
By providing large vectors of parameter values to those numpy functions, we can compute LCAs for thousands of parameter values at a time.
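As a rough illustration of this mechanism (with a made-up impact expression, not output of the library), SymPy's `lambdify` compiles a symbolic expression into a vectorized numpy function:

```python
import numpy as np
from sympy import lambdify, symbols

# Hypothetical "impact" expression: background impacts already folded into constants
a, b = symbols("a b")
impact_expr = 2.5 * a + 0.8 * b + 1.0

# Compile the symbolic expression into a fast numpy function
impact_fn = lambdify([a, b], impact_expr, "numpy")

# Evaluate the impact for one million parameter values in a single vector call
a_values = np.linspace(0.2, 2.0, 1_000_000)
impacts = impact_fn(a_values, 0.5)
```

This is the kind of compiled function the library produces for each impact method.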
Under the hood, the activities we define with lca-algebraic are standard brightway2 activities. The amounts of exchanges are stored as float values or serialized as strings in the property formula.
Parameters are also stored in the brightway2 project, making it fully compatible with brightway.
Thus, a model defined with lca-algebraic is stored as a regular bw2 project. We can use bw2's native support for parametrized datasets to compute LCAs, even though it is much slower than the method explained here.
The following notebook explores the main functions. Full documentation of the functions is available here.
# Those two lines are for dev only : they watch imported libraries for changes
# %load_ext autoreload
# %autoreload 2
import pandas as pd
import time
import matplotlib.pyplot as plt
import numpy as np
import brightway2 as bw
# Custom utils defined for inter-acv
from lca_algebraic import *
from lca_algebraic.stats import *
import lca_algebraic
from sympy import init_printing, simplify
init_printing()
### Init the brightway2 project: choose any project name
initProject('MyProject')
# Import Ecoinvent DB (if not already done)
# Update the name and path to the location of the ecoinvent database
importDb("ecoinvent 3.4", './ecoinvent 3.4_cutoff_ecoSpold02/datasets')
# We use a separate DB for defining our foreground model / activities
# Choose any name
USER_DB = 'Foreground DB'
# It is better to clean up the whole foreground model each time and redefine it in the notebook,
# instead of relying on the state of a previous run.
# Any persistent state is prone to errors.
resetDb(USER_DB)
# Parameters are stored at project level :
# Reset them also
# You may remove this line if you import a project and parameters from an external source (see loadParam(..))
resetParams()
# Overview of the databases
list_databases()
SymPy is a Python library for symbolic calculus.
You write SymPy expressions as you write standard Python expressions, using SymPy symbols in them.
The result is a symbolic expression that can be manipulated, instead of a numeric value.
from sympy import symbols
# create sympy symbol
a = symbols("a")
# Expressions are not directly evaluated
f = a * 2 + 4
f
# symbols can be replaced by values afterwards
f.subs(dict(a=3))
In practice, you don't need to care much about SymPy. Just remember that:
- the parameters defined below are instances of SymPy Symbol
- any valid Python expression containing a SymPy symbol will create a SymPy symbolic expression
First, we define the input parameters of the model together with their distribution.
The numeric parameters are instances of sympy 'Symbol'.
Thus, any python arithmetic expression composed of parameters will result in a symbolic expression to be used later in the definition of the model, rather than a static numeric result.
# Example of 'float' parameters
a = newFloatParam(
'a',
default=0.5, min=0.2, max=2, distrib=DistributionType.TRIANGLE, # Distribution type, linear by default
description="hello world",
label="extended label for a")
b = newFloatParam(
'b',
default=0.5, # Fixed if no min /max provided
description="foo bar")
# Parameters can be attached to a specific DB instead of global project
# We recommend this for teamwork / working with several DBs
# lca_algebraic supports resolution of parameters with duplicate names attached to different scopes
b2 = newFloatParam(
'b',
default=0.5, # Fixed if no min /max provided
description="foo bar",
dbname=USER_DB)
share_recycled_aluminium = newFloatParam(
'share_recycled_aluminium',
default=0.6, min=0, max=1, std=0.2, distrib=DistributionType.NORMAL, # Normal distrib, with std dev
description="Share of recycled aluminium")
c = newFloatParam(
'c',
default=0.6, std=0.2, distrib=DistributionType.LOGNORMAL)
beta = newFloatParam(
'beta',
default=0.6, std=0.2, a=2, b=5, distrib=DistributionType.BETA)
# You can define boolean parameters, taking only discrete values 0 or 1
bool_param = newBoolParam(
'bool_param',
default=1)
# Example 'enum' parameter, acting like a switch between several possibilities
# Enum parameters are not Symbol themselves
# They are a facility to represent many boolean parameters at once '<paramName>_<enumValue>'
# and should be used with the 'newSwitchAct' method
elec_switch_param = newEnumParam(
'elec_switch_param',
values=["us", "eu"], # If provided as a list, all possibilities have the same probability
default="us",
description="Switch on electricity mix")
# Another example enum param
techno_param = newEnumParam(
'techno_param',
values={
"technoA":0.4,
"technoB":0.1,
"technoC":0.5}, # You can provide a statistical weight for each value, by using a dict
default="technoA",
description="Choice of technology")
# List of parameters
list_parameters()
By default, new parameters are kept in memory and also persisted in the project (unless save=False).
You can persist parameters afterwards with persistParams().
You can also load parameters from an existing database with loadParams().
The persistence of parameters and their distributions is compatible with Brightway2 and Activity Browser: see the documentation of stat_arrays.
# Load parameters previously persisted in the database.
loadParams()
lca_algebraic supports several foreground / background databases. Background databases are considered static / non-parametrized by the library: they use the standard LCA method of Brightway2.
Foreground databases are considered parametric, and their activities are developed as functions of parameters and background activities.
The functions setForeground(...) and setBackground(...) change the status of a database.
setBackground(USER_DB)
list_databases()
setForeground(USER_DB)
list_databases()
lca_algebraic extends BW2Package, adding persistence of parameters.
# Save database and parameters as Bzipped JSON
export_db(USER_DB, "db.bw2")
# Reimport DB
import_db("db.bw2")
A foreground database can be "frozen" to be used as a background database for a specific scenario: the parametrized amounts in the exchanges are computed for a given configuration of the parameters and replaced by their values. The formulas are still stored in the database and not lost: the database can still be used as a foreground database until its status is changed with setBackground(...).
This feature is useful for studies requiring several databases to be used as background by other ones. It also enables the use of standard Brightway2 tools that are not aware of parametrization.
freezeParams(
USER_DB, # Name of database to freeze
a=1, b=2) # custom parameter values
We provide two functions for easy and fast (indexed) search of activities in reference databases:
- findBioAct: search an activity in the biosphere3 DB
- findTechAct: search an activity in the ecoinvent DB
These methods are faster and safer than a traditional "list-comprehension" search: they fail with an error if more than one activity matches, preventing the model from being based on an arbitrary selection of one activity.
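The uniqueness check can be illustrated with a minimal sketch in plain Python (a toy helper and toy database, not the library's actual implementation):

```python
def find_unique(db, name):
    """Return the single activity matching `name`; fail loudly otherwise."""
    matches = [act for act in db if act["name"] == name]
    if len(matches) != 1:
        raise ValueError(
            f"Expected exactly one match for {name!r}, found {len(matches)}")
    return matches[0]

# Toy database with a duplicated name
toy_db = [
    {"name": "aluminium scrap"},
    {"name": "Heat, waste"},
    {"name": "Heat, waste"},
]

act = find_unique(toy_db, "aluminium scrap")   # single match: returned
# find_unique(toy_db, "Heat, waste")           # would raise: two matches
```

A plain list comprehension would silently return both `"Heat, waste"` entries, leaving the model builder to pick one at random.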
# Biosphere activities
ground_occupuation = findBioAct('Occupation, industrial area') # Search by name
heat = findBioAct('Heat, waste', categories=['air']) # Add category selector
# Technosphere activities
# You can add an optional location selector
alu = findTechAct("aluminium alloy production, AlMg3", loc="RER")
alu_scrap = findTechAct('aluminium scrap, new, Recycled Content cut-off')
# Elec
eu_elec = findTechAct("market group for electricity, medium voltage", 'ENTSO-E')
us_elec = findTechAct("market group for electricity, medium voltage", 'US')
The model is defined as a nested combination of background activities with amounts.
Amounts are defined either as constant float values or as algebraic formulas involving the parameters defined above.
# Create a new activity
activity1 = newActivity(USER_DB, # We define foreground activities in our own DB
"first foreground activity", # Name of the activity
"kg", # Unit
exchanges= { # We define exchanges as a dictionary of 'activity : amount'
    ground_occupuation: 3 * b, # Amount can be a fixed value or a Sympy expression
    heat: b + 0.2 # Any arithmetic expression of Parameters is valid
})
# You can create a virtual "switch" activity combining several activities with a switch parameter
elec_mix = newSwitchAct(USER_DB,
    "elect mix", # Name
    elec_switch_param, # Switch parameter
    { # Dictionary of enum values / activities
        "us" : us_elec, # By default the associated amount is 1
        "eu" : (eu_elec, 0.8) # You can also provide a custom amount or formula with a tuple
    })
You can copy and update an existing background activity.
Several new helper methods have been added to the class Activity for easy update of exchanges.
alu2 = copyActivity(
USER_DB, # The copy of a background activity is done in our own DB, so that we can safely update it
alu, # Initial activity : won't be altered
"Aluminium 2") # New name
# Update exchanges by their name
alu2.updateExchanges({
# Update amount : the special symbol *old_amount* references the previous amount of this exchange
"aluminium, cast alloy": old_amount * (1 - share_recycled_aluminium),
# Update input activity. Note also that you can use '*' wildcard in exchange name
"electricity*": elec_mix,
# Update both input activity and amount.
# Note that you can use '#' for specifying the location of exchange (useful for duplicate exchange names)
"chromium#GLO" : dict(amount=4.0, input=activity1)
})
# Add exchanges
alu2.addExchanges({alu_scrap : 12})
Usually, we normalize the final model by dividing the whole LCI by the functional value (production of the system):
functional_value = a + 5
model = newActivity(USER_DB, "model", "kg", {
activity1 : b * 5 + a + 1, # Reference the activity we just created
alu2: 3 * share_recycled_aluminium,
alu:0.4 * a})
normalized_model = newActivity(USER_DB, "normalized model", "kg", {
model : 1 / functional_value})
Alternatively, you may skip redefining the model and load it from the USER DB instead. We don't recommend this while developing the model, as it may lead to unstable code / an uncontrolled state of the DB.
activity1 = findActivity("first foreground activity", db_name=USER_DB)
model = findActivity("model", db_name=USER_DB)
normalized_model = findActivity("normalized model", db_name=USER_DB)
alu2 = findActivity("Aluminium 2", db_name=USER_DB)
printAct displays the list of all exchanges of an activity.
Note that symbolic expressions have not been evaluated at this stage
# Print_act displays activities as tables
printAct(activity1)
printAct(model)
printAct(normalized_model)
# You can also compute amounts by replacing parameters with a float value
printAct(activity1, b=1.5)
# You can print several activities at once to compare them
printAct(alu, alu2)
# List of impacts to consider
impacts = [m for m in bw.methods if 'ILCD 1.0.8 2016' in str(m) and 'no LT' in str(m)]
# You can provide alternate names for some methods
set_custom_impact_labels({
impacts[0] : 'Resource usage',
impacts[1]: 'Climate change'})
We provide two methods for computing LCA:
- multiLCA: uses brightway2's native parametric support. It is much slower and is kept for comparing results.
- multiLCAAlgebric: computes an algebraic expression of the model and computes LCA once and for all for the background activities. It then expresses each impact as a function of the parameters. This expression is compiled into numpy native code for fast computation on vectors of samples. This version is around a million times faster.
# Uses brightway2 parameters
multiLCA(
normalized_model,
impacts,
# Parameters of the model
a=1,
elec_switch_param="us",
share_recycled_aluminium=0.4)
# Compute with the algebraic implementation
# The values should be the same as above
multiLCAAlgebric(
normalized_model, # The model
impacts, # Impacts
# Parameters of the model
a=1,
elec_switch_param="us",
share_recycled_aluminium=0.4)
# You can compute several LCAs at a time and compare them:
multiLCAAlgebric(
[alu, alu2], # The models
impacts, # Impacts
# Parameters of the model
share_recycled_aluminium=0.3,
elec_switch_param="us")
You can extract the contribution of a subset of activities to the global impact of the model.
# Contribution of impact of aluminium
multiLCAAlgebric(
normalized_model, # The model
impacts, # Impacts
# List of sub activites to consider
extract_activities = [alu2],
# Parameters of the model
a=1,
elec_switch_param="us",
share_recycled_aluminium=0.4)
# Fast computation for millions of separate samples
multiLCAAlgebric(
model, # The model
impacts, # Impacts
# Parameters of the model
a=list(range(1, 100000)), # All lists should have the same size
share_recycled_aluminium=1, # Those parameters are fixed
elec_switch_param="eu")
We provide several functions for computing statistics for local variations of parameters (one at a time).
oat_matrix shows a matrix of impacts × parameters, colored according to the variation of the impact within the bounds of the parameter.
oat_matrix(model, impacts)
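The one-at-a-time principle behind this matrix can be sketched with a toy impact function (hypothetical coefficients, not computed by the library): each parameter is swept over its bounds while the others stay at their defaults, and the relative spread of the impact is recorded.

```python
import numpy as np

# Toy impact function standing in for a compiled LCA model
def impact(a, b):
    return 2.5 * a + 0.8 * b + 1.0

defaults = {"a": 0.5, "b": 0.5}
bounds = {"a": (0.2, 2.0), "b": (0.0, 1.0)}

# Sweep each parameter over its bounds while the others stay at default
relative_variation = {}
for name, (lo, hi) in bounds.items():
    sweep = np.linspace(lo, hi, 101)
    values = impact(**{**defaults, name: sweep})
    relative_variation[name] = (values.max() - values.min()) / values.mean()
```

Here `a` drives a larger relative variation than `b`, which is the kind of information the colored matrix conveys at a glance.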
This function draws a dashboard showing:
- a dropdown list for choosing a parameter
- several graphs of the evolution of impacts for this parameter
- a full table of data
- a "bars" graph representing the variation of each impact for this parameter (similar to the information given in oat_matrix)
oat_dashboard_interact(
model, impacts,
# Optional layout parameters
figspace=(0.5,0.5),
figsize=(15, 15),
sharex=True)
Here we leverage the fast computation for Monte Carlo approaches.
We compute global sensitivity analysis (GSA), not only local ones.
Similar to the OAT matrix, we compute Sobol indices: they represent the ratio between the variance due to a given parameter and the total variance.
For easier comparison, we translate those relative Sobol indices into "deviation / mean" importance:
# Show sobol indices
incer_stochastic_matrix(model, impacts)
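To build intuition for first-order Sobol indices, here is a toy Monte Carlo estimate on a simple additive model (unrelated to the LCA model above): for an additive model the index of each parameter reduces to its share of the total variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
A = rng.uniform(0.0, 1.0, n)
B = rng.uniform(0.0, 1.0, n)

# Toy additive model: Y = 2A + B
Y = 2 * A + B

# For an additive model, first-order Sobol indices are simple variance ratios
S_A = np.var(2 * A) / np.var(Y)
S_B = np.var(B) / np.var(Y)
# Analytically: S_A = 4/5, S_B = 1/5
```

For non-additive models the indices also capture interaction terms, which is what the Sobol decomposition computed by the library accounts for.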
We provide a dashboard showing violin graphs: the exact probabilistic distribution of each impact, together with the medians of the impacts.
incer_stochastic_violin(
model, impacts,
# Optional layout parameters
figspace=(0.5,0.5),
figsize=(15, 15),
sharex=True,
nb_cols=3)
##### Alternatively, graphs can be shown horizontally, together with a box of statistical outcomes
distrib(
model, impacts,
# Optional layout parameters
height=7, width=15,
nb_cols=2,
percentiles=[5, 95])
A dashboard groups all this information in a single interface with tabs.
It also shows total variation of impacts. This last graph could be improved by showing stacked colored bars with the contribution of each parameter to this variation, according to Sobol indices.
incer_stochastic_dashboard(model, impacts)
One of the outcomes of the statistical analysis above is to identify the main input parameters and produce simplified models, fixing the minor ones.
We provide several functions for doing this.
# First, let's look at the full expression defining our model
expr, _ = actToExpression(normalized_model)
expr
We provide a method to automatically select a subset of parameters based on the Sobol indices, and then compute simplified models for it.
We also round numerical values to 3 digits and remove terms in sums that contribute less than 1% of the total.
simplified = sobol_simplify_model(
normalized_model, # The model
impacts, # Impacts to consider
n=10000, # For large model, you may test other value and ensure ST and sum(S1) are close to 1.0
fixed_mode = FixedParamMode.MEDIAN, # We replace minor parameters by median by default,
min_ratio=0.8, # Min ratio of variability to explain
num_digits=3)
# Let's look at the expression for first impact again
# much simpler !
simplified[0].expr
Finally, we can compare the distributions of those simplified models against the full model. We provide a function for graphical display and for computation of the R² score.
compare_simplified(normalized_model, impacts, simplified)