Add minimal pymc example #7281

Closed
wants to merge 5 commits into from

Conversation

HarshvirSandhu
Contributor

@HarshvirSandhu HarshvirSandhu commented Apr 26, 2024

Description

Example is taken from this notebook and made to look like the readme example of sunode
Not sure if the example is placed correctly.

Related Issue

Checklist

Type of change

  • Documentation

📚 Documentation preview 📚: https://pymc--7281.org.readthedocs.build/en/7281/

welcome bot commented Apr 26, 2024

💖 Thanks for opening this pull request! 💖 The PyMC community really appreciates your time and effort to contribute to the project. Please make sure you have read our Contributing Guidelines and filled in our pull request template to the best of your ability.

README.rst Outdated
mu = x @ betas

# Likelihood
y = pm.Normal("y", mu, sigma, dims=["trial"])
Member

@ricardoV94 what might the output be? I think we should at least have a short story like "imagine we ran an experiment where, for different levels of hardness, conductivity and temperature, we measured Y".

Member

Sounds good

Member

No, I'm asking what we're measuring in this hypothetical experiment. Just to round out the example.

Member

Pick one from ChatGPT:

  1. Material Strength: This could include tensile strength, yield strength, or fatigue strength. These measures could be affected by changes in material hardness and temperature.
  2. Electrical Properties: Besides conductivity, which is already a variable, Y might represent electrical resistance or capacitance, which can change with temperature and material properties.
  3. Thermal Properties: Such as thermal expansion or specific heat capacity, which could be influenced by the material's hardness and its conductivity.
  4. Optical Properties: Such as reflectivity or light absorption, which might change with temperature and the physical state of the material.
  5. Chemical Reactivity: Rate of a chemical reaction that could be influenced by the material's properties and temperature.
  6. Phase Changes: The point at which a material changes from solid to liquid (melting point) or from liquid to gas (boiling point), which can be influenced by the material's composition and environmental conditions.
  7. Durability or Wear Resistance: How well a material can withstand wear or degradation over time, potentially influenced by its hardness and operating temperature.

Member

Here's a different example from GPT that I think works better:

Objective: Investigate the effects of sunlight exposure, water amount, and soil nitrogen content on plant growth.

Background: Plant growth can be influenced by multiple factors, and understanding these relationships is crucial for optimizing agricultural practices. In this experiment, we aim to predict the growth of a plant based on three different environmental variables.

Experiment Setup:

Variables:

Independent Variables:
  • Sunlight Hours (X1): Number of hours the plant is exposed to sunlight daily.
  • Water Amount (X2): Daily water amount given to the plant (in milliliters).
  • Soil Nitrogen Content (X3): Percentage of nitrogen content in the soil.

Dependent Variable:
  • Plant Growth (Y): Measured as the increase in plant height (in centimeters) over a certain period.

Member

@ricardoV94 ricardoV94 May 1, 2024

Sounds good, just say data is standardized so the normal draws and likelihood make sense
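
A minimal sketch of what "standardized" means in this context, written in plain Python rather than PyMC: each predictor column is centered on its mean and divided by its standard deviation, so its values sit on roughly the same scale as draws from a standard normal. The `standardize` helper and the raw numbers are assumptions made up for illustration; they are not part of the PR.

```python
import statistics


def standardize(values):
    """Return z-scores: subtract the mean, divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / sd for v in values]


# Hypothetical raw measurements (made-up numbers, for illustration only)
sunlight_hours = [4.0, 6.0, 8.0, 10.0, 12.0]
sunlight_z = standardize(sunlight_hours)

# After standardizing, the column has mean 0 and standard deviation 1
print(sunlight_z)
```

With predictors on this scale, simulating them with `pm.Normal.dist(shape=(100, 3))`, as the README diff does, is a plausible stand-in for real data.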

Member

@HarshvirSandhu Can you make these changes to the example?

Contributor Author

Yes

README.rst Outdated

  # Define coordinate values for all dimensions of the data
  coords={
      "trial": range(100),
-     "features": ["hardness", "conductivity", "temperature"],
+     "features": ["X1", "X2", "X3"],
Member

Suggested change
- "features": ["X1", "X2", "X3"],
+ "features": ["sunlight hours", "water amount", "soil nitrogen"],

README.rst Outdated
@@ -45,11 +52,20 @@ Linear Regression Example
x_dist = pm.Normal.dist(shape=(100, 3))
x_data = pm.draw(x_dist, random_seed=seed)

# Independent Variables:
# Sunlight Hours (X1): Number of hours the plant is exposed to sunlight daily.
Member

Suggested change
- # Sunlight Hours (X1): Number of hours the plant is exposed to sunlight daily.
+ # Sunlight Hours: Number of hours the plant is exposed to sunlight daily.

README.rst Outdated

Plant growth can be influenced by multiple factors, and understanding these relationships is crucial for optimizing agricultural practices.

In this experiment, we aim to predict the growth of a plant based on three different environmental variables.
Member

Suggested change
- In this experiment, we aim to predict the growth of a plant based on three different environmental variables.
+ Imagine we conduct an experiment to predict the growth of a plant based on three different environmental variables.

README.rst Outdated
@@ -36,6 +36,13 @@ Features

Linear Regression Example
==========================

**Background**
Member

Suggested change
- **Background**

README.rst Outdated
mu = x @ betas

# Likelihood
y = pm.Normal("y", mu, sigma, dims=["trial"])
Member

Suggested change
- y = pm.Normal("y", mu, sigma, dims=["trial"])
+ plant_growth = pm.Normal("plant growth (z-scored)", mu, sigma, dims="trial")

README.rst Outdated
@@ -137,7 +135,7 @@ sigma 0.511 0.037 0.438 0.575 0.001 0
  with pm.do(
      inference_model,
      {inference_model["betas"]: inference_model["betas"] * [1, 1, 0]},
- ) as heat_death_model:
+ ) as new_model:
Member

@twiecki twiecki May 6, 2024

Suggested change
- ) as new_model:
+ ) as plant_growth_model:
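
The `pm.do` call in this hunk replaces `betas` with `betas * [1, 1, 0]`. As a rough sketch of the arithmetic only (plain Python, not the PyMC API), the elementwise product leaves the first two coefficients alone and zeroes the third, so the third feature no longer contributes to the prediction. The coefficient values, the feature row, and the `mask_betas` helper are hypothetical.

```python
def mask_betas(betas, mask):
    """Elementwise product: keeps a coefficient where mask is 1, zeroes it where mask is 0."""
    return [b * m for b, m in zip(betas, mask)]


def predict(row, betas):
    """Linear predictor for one trial: the weighted sum of its feature values."""
    return sum(x * b for x, b in zip(row, betas))


# Hypothetical coefficients for sunlight hours, water amount, soil nitrogen
betas = [0.5, -0.2, 0.1]
masked = mask_betas(betas, [1, 1, 0])  # third coefficient set to zero

# One hypothetical trial (standardized feature values)
row = [1.0, 2.0, 3.0]
mu_before = predict(row, betas)   # all three features contribute
mu_after = predict(row, masked)   # soil nitrogen no longer contributes

print(masked, mu_before, mu_after)
```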

README.rst Outdated
@@ -80,7 +79,7 @@ In this experiment, we aim to predict the growth of a plant based on three diffe
mu = x @ betas

# Likelihood
- y = pm.Normal("y", mu, sigma, dims=["trial"])
+ plant_growth = pm.Normal("plant growth (z-scored)", mu, sigma, dims="trial")
Member

@twiecki twiecki May 6, 2024

Suggested change
- plant_growth = pm.Normal("plant growth (z-scored)", mu, sigma, dims="trial")
+ # Assuming we measure deviation of each plant from baseline
+ plant_growth = pm.Normal("plant growth", mu, sigma, dims="trial")
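
For context on the two lines being renamed, here is a rough plain-Python sketch of what `mu = x @ betas` followed by a normal likelihood over the `trial` dimension computes: one weighted sum per trial, then one noisy observation around it. This is not PyMC code; the coefficients, noise scale, seed, and `linear_predictor` helper are assumptions chosen for illustration.

```python
import random

features = ["sunlight hours", "water amount", "soil nitrogen"]


def linear_predictor(x, betas):
    """Compute mu = x @ betas: one weighted sum of the features per trial."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, betas)) for row in x]


rng = random.Random(0)  # arbitrary seed, for reproducibility only

# Hypothetical standardized predictors: 100 trials x 3 features
x = [[rng.gauss(0.0, 1.0) for _ in features] for _ in range(100)]
betas = [0.5, -0.2, 0.1]  # made-up coefficients, one per feature
sigma = 0.5               # made-up noise scale

mu = linear_predictor(x, betas)

# Normal likelihood: each observation is its trial's mu plus Gaussian noise
plant_growth = [rng.gauss(m, sigma) for m in mu]

print(len(mu), len(plant_growth))
```

Both `mu` and `plant_growth` have one entry per trial, which is what `dims="trial"` declares in the model.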

random_seed=seed,
)

pm.stats.summary(idata.predictions, kind="stats")
Member

Shouldn't we show all the summary outputs? Why only the first?

Member

I think for starters it's TMI and can scare people off. Convergence diagnostics is more advanced than what we want to demo here.

Member

Oh, I thought you meant more columns, but you meant more rows?

Member

@ricardoV94 ricardoV94 May 13, 2024

I meant that every time we have pm.stats.summary, we should show the output. I already removed the extra convergence columns with kind="stats". Right now it's only showing for the first usage.
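
To make the four `kind="stats"` columns concrete, here is a rough plain-Python sketch of how `mean`, `sd`, `hdi_3%` and `hdi_97%` could be computed from a list of draws. It assumes a 94% highest-density interval taken as the narrowest window of sorted draws; the library's own estimator may differ in detail, so treat the `hdi` helper as an illustration, not as its implementation. The draws are simulated with made-up parameters.

```python
import math
import random
import statistics


def hdi(samples, prob=0.94):
    """Narrowest interval that contains at least `prob` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(prob * n))  # number of draws the interval must cover
    # Slide a window of k consecutive sorted draws and keep the narrowest one
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


rng = random.Random(0)  # arbitrary seed
# Hypothetical predictive draws for one trial (made-up location and scale)
draws = [rng.gauss(14.2, 0.5) for _ in range(4000)]

low, high = hdi(draws)
summary = {
    "mean": statistics.fmean(draws),
    "sd": statistics.stdev(draws),
    "hdi_3%": low,
    "hdi_97%": high,
}
print(summary)
```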

========================== ====== ===== ======== =========
Output                     mean   sd    hdi_3%   hdi_97%
========================== ====== ===== ======== =========
plant growth (z-scored)[0] 14.21  0.509 13.232   15.144
Member

this is still the old name ("z-scored").

========================== ====== ===== ======== =========
Output                     mean   sd    hdi_3%   hdi_97%
========================== ====== ===== ======== =========
plant growth (z-scored)[0] 14.153 0.509 13.181   15.096
Member

Also needs updated name.

@kirahowe
Contributor

Having a look at this at the PyData London 2024 hackathon 😄

@kirahowe
Contributor

kirahowe commented Jun 15, 2024

Couldn't commit to this branch, so there's a PR here on another fork: #7358. As of today, the example works on my machine and the outputs reflect the names used in the example.

@twiecki
Member

twiecki commented Jun 15, 2024

Closing in favor of #7358.

@twiecki twiecki closed this Jun 15, 2024
Development

Successfully merging this pull request may close these issues.

Include minimal pymc example in README
4 participants