
Configure day-specific Kd feeder data for GLM #3

Open · jordansread opened this issue Mar 24, 2017 · 34 comments
@jordansread (Member)

No description provided.

@gjahansen2 (Collaborator)

Seasonal pattern plus annual variability. How do we balance being true to the observed data with tracking noise?

@jordansread (Member Author)

My suggestion is that we create a characteristic seasonal curve for the lake and do a simple parameterization of that curve that stretches or amplifies it to fit the year. Maybe we could end up parameterizing it with just the annual median/mean, or perhaps we would need a second parameter for timing.

@lawinslow (Member)

Ok, quick crack at this. I'm not seeing any crazy seasonal variability. It looks like Secchi generally declines throughout the year, but I don't see the big clear-water phase "bump" you sometimes see. We also see a bit of an increase towards the end of the year.

Looked at seasonal variability for pre- and post-1997 (the year that splits the data into equal halves).
[image]

[image]

My vote is to use the empirical seasonal variability. Basically, use the median of each bi-weekly bin for the overall pattern, then use a multiplier to bump it down early in the season and bump it up later in the season (tie it together with a winter linear interp; it doesn't matter much in the winter). This will amplify the annual change a bit, which lines up with the trends we see (steeper in the clearer periods, muted in the less clear periods). A sketch of this follows the figure below.

[image]
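For concreteness, a minimal sketch of that empirical approach (the data frame `obs` and its columns `year`, `doy`, `secchi_m` are assumptions, echoing the data used later in this thread):

```r
library(dplyr)

# Assumed input: data frame `obs` with columns year, doy, secchi_m
obs <- obs %>% mutate(biweek = floor(doy / 14))

# Overall seasonal pattern: median Secchi within each bi-weekly bin
pattern <- obs %>%
  group_by(biweek) %>%
  summarise(pattern_secchi = median(secchi_m, na.rm = TRUE))

# Year-specific multiplier: that year's median relative to the long-term
# median, bumping the pattern down in less-clear years and up in clearer ones
overall <- median(obs$secchi_m, na.rm = TRUE)
mult <- obs %>%
  group_by(year) %>%
  summarise(mult = median(secchi_m, na.rm = TRUE) / overall)

# A given year's curve is then pattern_secchi * mult for that year, with the
# winter gap filled by linear interpolation (it matters little under ice)
```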

The trend is somewhat harder. Do we want to capture the big bump we see in 1997? If so, we can't just apply a monotonic trend. We could use a more empirical approach where we try to reconstruct the overall long-term variability and use the seasonal variability reconstruction as above.

[image]

Thoughts?

@gjahansen2 (Collaborator)

I don't want to impose a linear trend over time (year to year). The peak in 1997 is real, I think (many reports from that time reference the unusually high water clarity). There also seems to be a peak in 2013.

As for seasonal patterns, I agree that the general trend appears to be the same from year to year (declines later in the year). I am not sure how much credence we should put in the seasonal patterns early and late in the year, since they are so infrequently sampled. So I don't want to tie it too strongly to the bi-weekly data in those parts of the year where we only have one or two sampling dates.

So Jordan's suggestion of a single seasonal curve, parameterized by the annual median or mean seems reasonable to me. But then how do we link Dec 31 of one year to Jan 1 of the next? Or does it not matter in ice-covered periods, and we can just have clarity jump up or down to the year-specific pattern after the ice goes off?

@lawinslow (Member)

> I am not sure how much credence we should put in the seasonal patterns early and late in the year since they are so infrequently sampled.

Yeah, sure, the far edges are uncertain, but the lake seems to have reasonable coverage much of the open-water season.

> So Jordan's suggestion of a single seasonal curve, parameterized by the annual median or mean seems reasonable to me.

Question here though is what is the parameterizable "single seasonal curve"?

> But then how do we link Dec 31 of one year to Jan 1 of the next? Or does it not matter in ice-covered periods, and we can just have clarity jump up or down to the year-specific pattern after the ice goes off?

I'm not worried about a Dec 31 to Jan 1 jump; it won't affect much that time of year. We can pick something reasonable for when there is no observation coverage. I think a high value during that time makes sense, a Secchi of 5 m or so.

@jordansread (Member Author)

The curve I was referring to is a somewhat complex shape (maybe some loess or something) but boiled down to two parameters: 1) the mean, and 2) the amplitude. A third would be stretching horizontally, but that seems harder since you need to know how long the season would be a priori. For mean and amplitude, it is akin to stretching a shape vertically or translating it vertically (amplitude and mean, respectively).
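In code, that two-parameter idea might look like this sketch (the loess base shape and the input data frame `obs` are assumptions):

```r
# Base shape: loess fit of Secchi vs DOY pooled across years, normalized to
# mean 0 and unit amplitude so two parameters can re-dress it per year
base_fit <- loess(secchi_m ~ doy, data = obs)  # `obs` is hypothetical
doy_grid <- 1:365
base <- predict(base_fit, newdata = data.frame(doy = doy_grid))
base <- (base - mean(base, na.rm = TRUE)) / diff(range(base, na.rm = TRUE))

# mean_secchi translates the shape vertically; amplitude stretches it vertically
year_curve <- function(mean_secchi, amplitude) mean_secchi + amplitude * base
```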

@lawinslow (Member)

Ok, you want to take first crack at that? Seems like you have a clear vision.

@gjahansen2 (Collaborator)

What about some kind of moving average?

@jordansread (Member Author)

A moving average of that year's data, or the whole thing flattened to DOY?

@jordansread (Member Author)

Was thinking something simple like this that we can adjust w/ two params:
[image]

@lawinslow (Member)

Hmmm, do we not want to fit to that portion of the year? There are very few obs then.

@jordansread (Member Author)

or something like this that flattens out to an early spring mean and flattens to a fall mean:
[image]

The difference being what I can quickly code vs. what I can draw in ppt...

@lawinslow (Member)

Yeah, got an equation for that? The challenging part there can be finding something well-behaved with the right number of parameters (hence my advocacy for a more empirical approach).

@jordansread (Member Author)

Empirical sounds good here, but often we may not have enough Secchi data and may need to pick a curve off the shelf (e.g., dystrophic, eutrophic) and apply it.

@jordansread (Member Author)

The first plot is just a Weibull distribution.
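For reference, a Weibull density rescaled over day of year gives exactly that kind of single-peaked, asymmetric shape; the shape/scale values here are illustrative, not fitted:

```r
# Illustrative only: a Weibull density as a seasonal shape over DOY
doy <- 1:365
shape_k <- 2     # hypothetical shape parameter
scale_l <- 180   # hypothetical scale parameter (peak before midsummer)
plot(doy, dweibull(doy, shape = shape_k, scale = scale_l), type = "l",
     xlab = "Day of year", ylab = "Relative value")
```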

@gjahansen2 (Collaborator)

gjahansen2 commented Mar 30, 2017 via email

@lawinslow (Member)

For future reference, here is the best simple model I could come up with for seasonal variability of Secchi in the LTER lakes.
[image]

ratio is defined as median(cw_phase_secchi)/median(summer_secchi), though the windows aren't defined by season; I just eyeballed the time window of the clear-water phase in those lakes (cw_phase: weeks 22-23, summer: weeks 27-42). Adding DOC to the linear model improves it (R^2 ~ 0.8, lower AIC), so there's that.
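Roughly, as a sketch (the input data frame `lter` and its columns are assumptions; the full predictor set of the fitted model above isn't reproduced here):

```r
library(dplyr)

# Assumed input: `lter` with columns lake, week, secchi_m, doc
ratios <- lter %>%
  group_by(lake) %>%
  summarise(
    ratio = median(secchi_m[week %in% 22:23], na.rm = TRUE) /  # clear-water phase
            median(secchi_m[week %in% 27:42], na.rm = TRUE),   # summer window
    doc   = first(doc)
  )

# Adding DOC as a predictor improved the fit (R^2 ~ 0.8, lower AIC)
summary(lm(ratio ~ doc, data = ratios))
```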

@jordansread (Member Author)

interesting...cool

@gjahansen2 (Collaborator)

gjahansen2 commented Mar 31, 2017

I am reading some old (early 1990s) PCA reports about Mille Lacs. They describe Secchi as follows: "Under ice readings range from 4.5-5.5m with very low levels of algae and suspended solids. Following ice out, Secchi transparency declines quickly to 3 m or less. Transparency continues to decline, coincident with peak algae concentrations, until late August with a minimum reading of 1.3 m".

Given the paucity of under-ice data, I think the shape should be something like a flat line (with high Secchi) until ice-out, then a fairly steady decline from right around ice-out through the end of August, then flat until ice-on. The actual values of the starting and ending points could be parameterized based on observed open-water medians for the year (where available). When we look at annual means/medians, it is probably important to remove samples from winter, since they only exist for a few years and will skew those years to look clearer than if only the open-water season was sampled. This figure has data only from April-Oct; a sketch of the shape follows it.

[image: mille_lacs_secchi_timeseries_april_oct]
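A piecewise-linear version of that shape could look like this sketch (ice dates and endpoint values are placeholders; 5 m and 1.3 m echo the PCA report quoted above):

```r
# Flat under ice, steady decline from ice-out through end of August (~DOY 243),
# then flat until ice-on; endpoints would come from year-specific medians
seasonal_secchi <- function(doy, ice_off = 110, start_secchi = 5, end_secchi = 1.3) {
  decline_end <- 243  # ~Aug 31
  ifelse(doy < ice_off, start_secchi,
         ifelse(doy < decline_end,
                start_secchi + (end_secchi - start_secchi) *
                  (doy - ice_off) / (decline_end - ice_off),
                end_secchi))
}
plot(1:365, seasonal_secchi(1:365), type = "l", xlab = "DOY", ylab = "Secchi (m)")
```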

@gjahansen2 (Collaborator)

OK, I played around with some simple logistic functions, parameterized with year-specific highs and lows.
[image: mille_lacs_secchi_seasonal_april_oct_logistic]

This might be close to what we want, but I don't like how (potentially) bad measurements (or few measurements) in a year can drag the values up or down. I think a better way would be to construct a mixed-effects non-linear model of this same form, with a random year effect that lets the max and min thresholds vary by year. BUT those year-specific estimates will be influenced by all measurements, so if a year is not well sampled (or not sampled at all), its estimates will be pulled closer to the overall mean.
What do you think?

@gjahansen2 (Collaborator)

The logistic function is just `seasonal = function(x, max, min) { min + (max - min) / (1 + exp(0.05 * (x - 200))) }`, where I chose the rate of change as 0.05 and the midpoint of the transition period as DOY 200. We could fit those to data as well if we wanted to.
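E.g., to see the shape over the open-water window (the max/min values here are just placeholders):

```r
seasonal <- function(x, max, min) { min + (max - min) / (1 + exp(0.05 * (x - 200))) }
doy <- 100:320
plot(doy, seasonal(doy, max = 4, min = 1.5), type = "l",
     xlab = "DOY", ylab = "Secchi (m)")  # high early, declining through midseason
```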

@lawinslow (Member)

Yeah, I think that would be a decent approach, though my only concern would be the big jump in clarity at the new year transition. It would be a nice fusion of model and empiricism.

@jordansread (Member Author)

Maybe we apply this to a certain DOY range (100-320?) and then interpolate between the years so that the transition changes aren't abrupt?

@gjahansen2 (Collaborator)

^^ This is what I was thinking too: apply the modeled curve throughout the open-water season and then interpolate between ice-on and ice-off to get from the ending value of year x to the starting value of year x+1.
I fit the nonlinear mixed-effects model and estimated year-specific maxima and minima. Looks like some of the years are kind of funky. The black line is the generic fit.

[image: secchi_function_mixed_model]
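The between-year stitch could be as simple as this sketch (assuming a hypothetical daily data frame `daily`, sorted by date, with Secchi filled only for the open-water DOY window and NA over the ice-covered gap):

```r
library(zoo)

# Linearly interpolate each NA run, i.e., from the last open-water value of
# year x to the first open-water value of year x + 1
daily$Secchi <- na.approx(daily$Secchi, na.rm = FALSE)
```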

@gjahansen2 (Collaborator)

gjahansen2 commented Apr 4, 2017

`customLogitModel = function(x, max, min) { min + (max - min) / (1 + exp(0.046 * (x - 192))) }`

where 0.046 and 192 were estimated from a simple nonlinear model, and max and min are estimated as year-specific random effects.

@jordansread (Member Author)

yes, this seems doable.

@lawinslow (Member)

Can someone create a daily Kd series based on that? I could start running the model then.

@gjahansen2 (Collaborator)

gjahansen2 commented Apr 4, 2017

Are we just doing kd=secchi/something? What is the 'something' again?

Here is what optical habitat looks like (3-year moving average) with the new seasonal pattern included. (Note: generic parameters were used for much of the 1980s, but word on the street is that the Mille Lacs Band of Ojibwe might have data from that time.)
[image: mille_lacs_oha_timeseries_seasonal]

@jordansread (Member Author)

I think it is Secchi = 1.7/Kd, i.e., Kd = 1.7/Secchi.
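In other words (the 1.7 coefficient is the one this thread settles on; conversion coefficients in the literature vary):

```r
# Kd = 1.7 / Secchi, so e.g. a 2.5 m Secchi reading gives Kd = 0.68 m^-1
secchi_to_kd <- function(secchi_m) 1.7 / secchi_m
secchi_to_kd(2.5)
```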

@gjahansen2 (Collaborator)

Ok, so we want daily values for all years, with linear interpolation between the end of the season in year x and the beginning of the season in year x+1. I can create this.

Question:
Do we want the end and beginning of years to be based on some arbitrary day numbers, or on ice-on and ice-off? I just used DOY 110-310 in the figure shown above, but it seems reasonable to use the modeled ice-on/ice-off dates as those cutoffs. That is not something I can incorporate at this point, though. It is also a bit of a chicken-and-egg situation: I need modeled outputs to calculate daily Kd, and you need daily Kd to calculate modeled outputs...

@lawinslow (Member)

I'd just say arbitrary. You can be conservative if you want. I really don't think the model will be terribly sensitive to these numbers. You're right that we don't have ice on/off so it would be hard to pick them now.

@jordansread (Member Author)

I think we should start out with the system you have above, with an easy-to-change set of parameters that we can use to tell us how sensitive the model is to that time range (if we choose to look into it).

@gjahansen2 (Collaborator)

gjahansen2 commented Apr 4, 2017

Hmm... okay, here is a stab at it. I modeled Secchi as a function of DOY from DOY 100-320, then linearly interpolated across the remaining gaps.
[attachment: daily_kd_interpolated.txt]

@lawinslow data are there ^^^

[image: secchi_seasonal_interpolated]

[image: kd_time_series_interpolated]

@gjahansen2 (Collaborator)

```r
library(ggplot2)
library(dplyr)
library(lme4)
library(reshape2)
library(readr)
library(ggrepel)
library(RColorBrewer)
library(zoo)  # na.approx, used for the ice-period interpolation below

# data from Heidi
mydata = read_csv("ML_secchi.csv")
mydata$date = as.Date(mydata$date, format="%m/%d/%Y")
mydata$doy = as.numeric(format(mydata$date, format="%j"))
mydata$year = as.numeric(format(mydata$date, "%Y"))  # year is grouped on below

year.medians = summarise(group_by(mydata, year), median.secchi=median(secchi_m))

# exclude data outside window used for TOHA
toha.secchi = subset(mydata, doy>100 & doy<320)
year.medians.toha = summarise(group_by(toha.secchi, year), median.secchi=median(secchi_m))
year.medians.toha$date = as.Date(paste(year.medians.toha$year, "-07-01", sep=""))

# create function for seasonal pattern
toha.secchi$week = as.numeric(format(toha.secchi$date, "%U"))
toha.secchi$year.f = factor(toha.secchi$year)

# k=0.046, midseason=192 as estimated from nls
customLogitModel = function(x, max, min){ min + ((max-min)/(1+exp(.046*(x-192)))) }

# nlmer needs a model function that carries a gradient attribute;
# deriv() builds one from the body of customLogitModel
customLogitModelGradient <- deriv(
  body(customLogitModel)[[2]],
  namevec = c("max", "min"),
  function.arg = customLogitModel
)

# starting guesses
# secchi.nls = nls(secchi_m~customLogitModel(doy, max, min, k, mid.season), data=toha.secchi, start = c(max=4, min=1.5, mid.season=200, k=1))

# fit the model: year-specific random effects on max and min
model <- nlmer(
  secchi_m ~ customLogitModelGradient(x=doy, max, min) ~
    # random effects after the second ~
    (max | year.f) + (min | year.f),
  data = toha.secchi,
  start = c(max=4, min=2.3)
)

summary(model)
year.estimates = coef(model)$year.f
general.estimates = fixef(model)
write.csv(year.estimates, "year_secchi_max_min_estimates.csv", row.names=T)
write.csv(data.frame(general.estimates), "generic_secchi_max_min_estimates.csv", row.names=F)

# create time series - if year is missing or estimate is bad, use generic
yearly.secchi = data.frame(year.estimates)
yearly.secchi$year = row.names(yearly.secchi)

secchi.params = data.frame(year=seq(1974, 2016))
secchi.params = merge(secchi.params, yearly.secchi, by="year", all=T)
# bad estimate if min is bigger than max; flag rows before blanking either
# column so the second assignment isn't tripped up by the fresh NAs
bad = which(secchi.params$max < secchi.params$min)
secchi.params$max[bad] = NA
secchi.params$min[bad] = NA
secchi.params$max[is.na(secchi.params$max)] = general.estimates[1]
secchi.params$min[is.na(secchi.params$min)] = general.estimates[2]

# create data frame of secchi depths
day.list = data.frame(date=seq(as.Date("1974-03-31"), as.Date("2016-12-31"), by="1 day"))
day.list$doy = as.numeric(format(day.list$date, "%j"))
day.list$year = as.numeric(format(day.list$date, "%Y"))
# dummy first row; dropped after the loop
daily.kd = data.frame("doy"=0, "year"=0, "date"=as.Date("1970-01-01"), "Secchi"=0)

for(i in 1:nrow(secchi.params))
{
  this.year = subset(day.list, year==secchi.params$year[i])
  open.water = data.frame(doy=seq(100,320),
                          Secchi=customLogitModel(x=seq(100,320),
                                                  max=secchi.params$max[i],
                                                  min=secchi.params$min[i]),
                          year=as.numeric(secchi.params$year[i]))
  this.year = merge(this.year, open.water, by=c("doy", "year"), all=T)
  daily.kd = rbind(daily.kd, this.year)
}
daily.kd = daily.kd[2:nrow(daily.kd), ]

# interpolate across the ice-covered gaps, then convert Secchi -> Kd
daily.kd$Secchi = na.approx(daily.kd$Secchi, na.rm=F)
daily.kd$kd = 1.7/daily.kd$Secchi
write_tsv(daily.kd, "daily_kd_interpolated.txt")

ggplot() + geom_path(data=daily.kd, aes(date, kd)) + theme_classic()
ggsave("kd_time_series_interpolated.png", height=5, width=8, units="in")

ggplot() + geom_path(data=daily.kd, aes(doy, Secchi, group=year, colour=factor(year))) + theme_classic()
ggsave("secchi_seasonal_interpolated.png", height=5, width=8, units="in")
```

gjahansen2 self-assigned this Apr 6, 2017