[v1.1] Add support for errors-in-variables #19
base: main
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master      #19      +/-   ##
==========================================
- Coverage   98.63%   91.79%    -6.84%
==========================================
  Files           3        3
  Lines          73      134      +61
  Branches       20       29       +9
==========================================
+ Hits           72      123      +51
- Misses          1       10       +9
- Partials        0        1       +1
```
Continue to review full report at Codecov.
Hello @m93a
@jobo322 Hey, if you have doubts about any specific claim, I'll try to provide more specific sources 🙂
Hello @m93a, I will check the code as soon as possible, and I'll also review the test cases. Again, many thanks for your collaboration.
Now the only thing left is to add new tests and evaluate why the old ones fail.
Hello @m93a, I'm looking for an explanation of why the parameters in the `sinFunction` test case always go down. I have been trying different `dampingBoost`, `dampingDrop`, and `damping` values, but the result is the minimum values defined by `minValues` unless you set initial values close to the real parameters. Do you have an idea how to control it?
Hey @jobo322, EDIT: It would be really helpful to make a plot with the algorithm's “trajectory” in the parameter space, rendered on top of a heatmap of that space.
Hello @m93a, and thanks a lot for your work! I hope you don't mind, I pushed a change to your branch to fix the conflicts with master and outstanding issues with eslint. In case you haven't seen it, we have an …
I don't think we need to support legacy options. Let's design a good API and get rid of the inconsistencies. |
```diff
@@ -12,7 +12,7 @@
   "scripts": {
     "eslint": "eslint src",
     "eslint-fix": "npm run eslint -- --fix",
-    "prepare": "rollup -c",
+    "preparea": "rollup -c",
```
Was this intended? I admittedly have no idea what it should do, but it looks a lot like a typo to me 😀
Haha, it was a temporary rename because `npm install` was failing 🙃
So yes, typo. You can push up a fix.
```diff
@@ -1,6 +1,6 @@
 {
   "name": "ml-levenberg-marquardt",
-  "version": "1.0.3",
+  "version": "1.1.0",
```
I think it's customary to separate release bump commits from other work (some package maintainers get really peeved about this). @targos Are there strong opinions about this for this project?
Yes, we use a tool that bumps the version number automatically (based on the commit messages) when we publish the release.
```js
      Math.max(minValues[k], parameters[k]),
      maxValues[k]
    );
    params2[k] = Math.min(Math.max(minValues[k], params2[k]), maxValues[k]);
```
`params2` seems like a name that isn't highly expressive. Am I understanding correctly that what you're doing here is avoiding overwriting the previous parameters in case they produce better results? If so, perhaps it'd be better to use names like `prevParams` and `updatedParams`, or `lastParams` and `nextParams`, etc.
```js
var converged = error <= errorTolerance;
var residuals = sumOfSquaredResiduals(data, params, paramFunction);
```
We seem to be mixing `var` and `let`/`const`. I don't know if the switch belongs in this PR, but I think we should migrate away from using `var` unless we're trying to be compatible with really old engines. (And with the move to TS, that wouldn't be a problem anyway, since it could just be translated down at the end.)
We can migrate to `let` and `const` with TypeScript.
I'm used to `let` and `const` from TS, so it's possible that I accidentally used them instead of `var` in some places. I was trying to use `var` as the old code was written that way and I didn't want to decrease compatibility with older engines. Moving to TS would be really cool.
```js
 * @prop {number} [residualEpsilon = 1e-6] - Minimum change of the sum of residuals per step – if the sum of residuals changes less than this number, the algorithm will stop
 * @prop {Array<number>} [maxValues] - Maximum values for the parameters
 * @prop {Array<number>} [minValues] - Minimum values for the parameters
 * @prop {{rough: number, fine: number}} [errorPropagation] - How many evaluations (per point per step) of the fitted function to use to approximate the error propagation through it
```
Where are these `rough` and `fine` values used? (Are these just leftovers from fc79bc6?)
Yes, I apparently forgot to remove them. In the best-case scenario, the algorithm would interpolate between those two as it approached the minimum, but I didn't manage to do it reliably, so it's best to remove those options.
FYI I am taking a stab at this also: https://github.com/jacobq/levenberg-marquardt/tree/support-error
```js
initializeErrorPropagation(errorPropagation);

/** @type Array<number> */
var residualDifferences = Array(10).fill(NaN);
```
What is the significance of `10` here? It seems like a "magic number".
Am I correct in interpreting the intention here as: (1) require at least 10 iterations, and (2) stop once the largest change in the last 10 residuals is less than `residualEpsilon`?
That's correct. The algorithm just “takes a break” sometimes and other times it starts slow and gets up to speed after some iterations. The number 10 worked best for me, but you're right that it should be configurable.
Well, I'm not saying that it should be configurable (maybe it should, maybe not; I don't know yet). I'm saying that the code should be more expressive (Note: this is just a free article I found from searching the web; there are much better resources on this topic, such as the book "Clean Code" by Robert C. Martin). In other words, when a developer looks at this code, he or she should not have to ask "What does this number mean?" -- the answer should be already provided by giving it a name.
In fact, I would even suggest taking it a step further to cover the question of "Why is this array being filled with `NaN`?" However, I don't think the solution to that problem lies in improving the name but rather in refactoring the code. It seems like we are only concerned about (1) whether or not we have taken at least 10 steps and (2) whether or not the biggest change in the last 10 steps was smaller than some given threshold. So, for example, we could do something like this instead of having an array, filling it with `NaN`, shifting it, scanning it, etc.:
```js
// Outside of loop
const minIterations = 10; // We won't use the `residualEpsilon` stopping condition until it's been stable for this many steps
let maxEpsilon = -Infinity; // This will get replaced during the first step because the new values must be > -Infinity (alternatively we could set it to null or undefined and handle that as a special case)
let iterationsRemaining; // This count-down will get initialized during the first step because we initialize maxEpsilon to its minimum

// Inside of loop
// ... algorithm takes a step ...
let currEpsilon = prevResiduals - currResiduals; // (may need to ensure that this value is >= 0)
if (currEpsilon > maxEpsilon) {
  // We just found a new maximum, so we reset the count-down before allowing this to be used as a stopping condition
  // This eliminates the need to maintain an array of previous epsilon values
  maxEpsilon = currEpsilon;
  iterationsRemaining = minIterations;
} else {
  // maxEpsilon did not change, so we can decrement our counter (limiting to zero)
  iterationsRemaining--;
  if (iterationsRemaining < 0) {
    iterationsRemaining = 0;
  }
}
const isChangeTooSmall = maxEpsilon <= residualEpsilon;
if (isChangeTooSmall && iterationsRemaining === 0) {
  // ... stop ...
}
```
```js
  });
});

describe('error propagation estimator', () => {
```
@m93a Could you please help me understand these test values? When I look at them, it is not at all obvious to me what the correct values should be. (Partly because I am not sure how `errorPropagation` works, other than that it seems to take pseudo-random steps around a point and estimate the slope there.)
The `errorPropagation` method tries to approximate the propagation of uncertainty through the function, given a variable with known error. Since we know the mean as well as the standard deviation of the variable, we can assume it's normally distributed and choose equiprobable intervals where the variable's measured value might actually be.
This means that if we slice the interval of the variable's possible values into e.g. 10 equiprobable intervals, the measured value is equally likely to fall in the first interval as in the second, third, etc. For practical purposes, I don't actually work with all of the possible values; I only use values between x̅ − 2.25σ and x̅ + 2.25σ. (There's a 97.6 % probability that the measured value will be between those values.) This process is not pseudo-random but completely deterministic – the intervals are computed using the CDF of the normal distribution.
Then, the functional values on the boundaries of the intervals are computed. This way the function can be approximated as a linear function on each interval. When a variable “passes through” a linear function, its standard error gets “stretched” (i.e. multiplied) by the slope of the function. If a variable has the same probability of passing through multiple linear functions, its standard error (on average) gets stretched by the average of the slopes of those linear functions.
Of course, the error distribution of the functional value most probably won't be a Gaussian curve, even if the input variable was normally distributed in the first place. But as our fitting algorithm can't deal with those either (i.e. it's not robust), it's not a big deal. We can simply pretend that all of the resulting distributions are close enough to the normal distribution (which is mostly true as long as the errors are small).
(An uncanny fact as a bonus: if the error is large enough and the function is very unsymmetrical, the mean of the functional value won't be equal to the functional value of the mean. For now let's just pretend those cases never occur.)
Now about the test values. You're right that the values are not exactly obvious. The results should match the standard deviation of normally distributed data that passed through the function. One could use Monte Carlo to get the right value.
```js
// Pseudo-code to validate the results of errorPropagation
const mean = 42;
const sigma = 0.1;
let data = Array(BILLIONS_AND_BILLIONS).fill(0); // fill so that map visits every slot
data = data.map(() => normal.random(mean, sigma));
data = data.map((x) => fn(x));
const propagatedSigma = stdev(data);
expect(errCalc.errorPropagation(fn, mean, sigma))
  .toBeCloseTo(propagatedSigma, DEPENDS_ON_NUMBER_OF_INTERVALS);
```
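And for comparison, here is a minimal sketch of the deterministic interval method itself. This is not the actual `errorPropagation` implementation from this PR; `phi`/`probit` are stand-in helpers for the normal CDF and its inverse, and the interval count of 10 is just an illustration.

```js
// Standard normal CDF (Abramowitz–Stegun approximation 26.2.17)
function phi(x) {
  const t = 1 / (1 + 0.2316419 * Math.abs(x));
  const d = 0.3989423 * Math.exp((-x * x) / 2);
  const p = d * t * (0.3193815 + t * (-0.3565638 + t * (1.7814779 + t * (-1.821256 + t * 1.3302744))));
  return x > 0 ? 1 - p : p;
}

// Inverse CDF via bisection (phi is monotonic)
function probit(p) {
  let lo = -8;
  let hi = 8;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (phi(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

function propagateError(fn, mean, sigma, intervals = 10) {
  // Only use values between x̅ − 2.25σ and x̅ + 2.25σ (≈ 97.6 % of the mass)
  const pLo = phi(-2.25);
  const pHi = phi(2.25);
  // Boundaries of equiprobable intervals, computed from the normal CDF
  const bounds = [];
  for (let i = 0; i <= intervals; i++) {
    const p = pLo + ((pHi - pLo) * i) / intervals;
    bounds.push(mean + sigma * probit(p));
  }
  // Approximate fn as linear on each interval and average the |slopes|
  let slopeSum = 0;
  for (let i = 0; i < intervals; i++) {
    const dy = fn(bounds[i + 1]) - fn(bounds[i]);
    const dx = bounds[i + 1] - bounds[i];
    slopeSum += Math.abs(dy / dx);
  }
  // The standard error gets "stretched" by the average slope
  return sigma * (slopeSum / intervals);
}

// Sanity check with a linear function, where the exact answer is |slope|·σ
console.log(propagateError((x) => 3 * x + 1, 42, 0.1)); // ≈ 0.3
```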
Wow, this was really time-consuming, I'm taking a break for today. I've got other things to do 😉
Thanks for the detailed explanation! I plan to read through it a few times and attempt to make the test cases more expressive once I understand it.
```js
function sinFunction([a, b]) {
  return (t) => a * Math.sin(b * t);
}

describe('errorCalculation test', () => {
  describe('sum of residuals', () => {
    it('Simple case', () => {
```
Could we simplify this test by making the `sampleFunction` always evaluate to a constant value (perhaps `0`)? I don't see why it should matter whether we use `sin` or any other particular function, since `sumOfResiduals` should be purely a function of the difference between the function's output and the corresponding value in the set of data points, e.g. something like this:
```js
const sampleFunction = ([k]) => ((x) => k); // resulting function returns a constant
const n = 3;
const xs = new Array(n).fill(0).map((zero, i) => i); // [0, 1, ..., n-1] but doesn't matter because f(x; [k]) -> k
const data = {
  x: xs,
  y: xs.map(sampleFunction([0])) // all zeros
};
expect(errCalc.sumOfResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfResiduals(data, [2], sampleFunction)).toBeCloseTo(2 * n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfSquaredResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [2], sampleFunction)).toBeCloseTo(4 * n, 1); // each residual is 2, squared is 4, summed over n points
```
Yes, a simpler function could be used.
I realize now that all your change really does here is update the name. The simplification of the test function is outside the scope of this PR. I've attempted to address this in #26, so after that merges this PR can be rebased to resolve this.
```js
 * @param {number} x
 * @return {number}
 */
const sq = (x) => x * x;
```
FYI, this isn't really any faster than `Math.pow(x, 2)`, which has been part of the ES standard since the early days. (You can also use `x ** 2` if you don't care about IE.)
I doubt that the code in the perf actually does anything. Since you're effectively throwing away the result right after the computation, the interpreter probably optimizes that part out and just iterates `i` (probably by more than `1`).

You're probably right about this function not adding very much to performance. If you don't like it, just remove it, but it seems quite readable to me.
I guess what I'm trying to say is that if you're doing it for performance reasons then it's probably better to remove it. If you think it is more readable I can see why that might be a good reason. However, I think it's hard to argue that a custom function that is identical to a built-in one is easier to read (because one has to look it up to see what it does rather than just recognize it), though in this case it matters little either way since the function is so simple.
You are probably right about jsPerf...sometimes it's too easy to fool oneself. Here's an updated example adapted to reduce the likelihood of this kind of mistake. It still shows less than 1% difference between the two methods of interest.
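Here's a rough sketch of the kind of harness I mean, with the results accumulated into a checksum so the JIT can't eliminate the computation as dead code. (This is illustrative only, not a substitute for a proper benchmark tool.)

```js
// Accumulate results so the computation can't be optimized away as dead code.
function bench(label, f, iterations = 1e7) {
  let acc = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    acc += f(i % 1000);
  }
  const elapsed = Date.now() - start;
  console.log(`${label}: ${elapsed} ms (checksum ${acc})`); // printing the checksum defeats dead-code elimination
}

bench('x * x', (x) => x * x);
bench('Math.pow(x, 2)', (x) => Math.pow(x, 2));
bench('x ** 2', (x) => x ** 2);
```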
You're right, the performance differences are negligible. I would choose `x ** 2` then, as it seems to have the best readability. And when we move to TypeScript, it will be compiled to `Math.pow`, so we won't lose compatibility with old browsers.
@jobo322 @m93a Regarding the question about divergence in the `sinFunction` test case: here's a 2D plot showing how SSR varies with the frequency parameter when the amplitude is held correct, and here's a surface plot you can explore on jsbin.com. (In these examples I used …)
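To make that concrete, here's a small sketch (my own illustration, with assumed "true" parameters – the actual test may differ) that reproduces such a 1-D slice by scanning the frequency `b` while holding the amplitude `a` at its true value:

```js
// Assumed "true" parameters for illustration
const fn = ([a, b]) => (t) => a * Math.sin(b * t);
const [aTrue, bTrue] = [2, 2];

const xs = Array.from({ length: 100 }, (_, i) => i / 10); // t ∈ [0, 9.9]
const ys = xs.map(fn([aTrue, bTrue]));

// Sum of squared residuals for a given parameter vector
function ssr(params) {
  const f = fn(params);
  return xs.reduce((sum, x, i) => sum + (ys[i] - f(x)) ** 2, 0);
}

// Scan b with a held at its true value: the printed curve oscillates and is
// full of local minima, which is why a start far from bTrue can slide toward minValues.
for (let b = 0; b <= 4; b += 0.05) {
  console.log(b.toFixed(2), ssr([aTrue, b]).toFixed(2));
}
```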
Since there is a lot of work in this PR I'd like to break it down into smaller bites (each as their own commit), and possibly even merge them separately. Thoughts? (I don't mind doing the rebasing myself, if it's helpful.)
I don't have an opinion on this, we'd better ask @targos 🙂 Should I push all the minor fixes we talked about before you start rebasing? Or will you take care of it yourself? I'm talking specifically about:

- fixing the `preparea` typo in package.json,
- removing the leftover `rough`/`fine` jsDoc for `errorPropagation`,
- replacing `sq(x)` with `x ** 2`.
Yes, please. I plan to work on other branches and cherry-pick your commits, so feel free to keep working here. (I'm also not sure how much time I'll have available to work on this over the next 2 weeks.)
@m93a just checking to see if you have done any more work related to this. I am planning to get back to this in the next week or two.
Sorry, I didn't have much time because of school and the other project that I'm working on (which will need this repo as a dependency, but we're not quite there yet). I'm not sure if I'll have spare time sooner than a month from now, so feel free to work on this until then 🙂
I'm very curious about any progress on this proposed upgrade of the algorithm. Maybe you can get some help figuring out your problem by looking into this implementation.
You know, I meant to get around to this but never did. Thanks for the reminder and the link. @m93a did you / do you plan to work more on this any time soon?
Hey, …
This pull request resolves #17 by adding support for regression with known standard deviations of data. This can be done with the optional parameters `xError` and `yError` on the `data` object passed to the algorithm. Furthermore, I implemented adaptive fitting by changing the lambda parameter. I tried as hard as I could to maintain the good taste of the code, but I had to disable some overly restrictive eslint rules.

I've done these changes:

- Renamed `minValue` and `maxValue` to `minValues` and `maxValues` respectively. This way they're more consistent with the `initialValues` option. I've exposed these options in jsDoc, which fixes "Constraint parameter values (feature request)" #15.
- Renamed `parameterError` on the returned object to `residuals`, which is a much better name, as I describe in "Option `errorTolerance` is misnamed and theoretically wrong" #18. However, that's a breaking change – this needs to be taken care of.
- Changed `errorTolerance`, which fixes "Option `errorTolerance` is misnamed and theoretically wrong" #18.
- Made the `damping` parameter adaptive (see the sketch below). The new options are:
  - `dampingDrop` – if the fit got better, multiply `damping` by this factor, default value: `0.1`
  - `dampingBoost` – if the fit got worse, multiply `damping` by this factor, default value: `1.5`
  - `minDamping` and `maxDamping` – limits of the `damping` parameter
- Added support for passing `data.xError` and `data.yError` to the algorithm. The new options are:
  - `errorPropagation` – how many iterations are used to approximate the error propagation through the parametrized function, default value: `50`
  - There used to be `rough` and `fine` options to distinguish between rough guesses far from the minimum and more precise estimations near the minimum, but because I wasn't able to avoid getting stuck in local minima because of a suboptimal "switching strategy", I decided to drop this distinction.

Both the old and my improved version of this algorithm are quite prone to local minima. My thoughts were that if we did a rough check of the parameter space (computing the residuals for various parameters) before the actual fitting, it could improve the results. But I'd better find some paper on this beforehand.
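As a rough sketch of how the adaptive damping options interact (the option names follow the description above; the default limits for `minDamping`/`maxDamping` are placeholders, since they aren't stated here):

```js
// Sketch only – option names from the PR description; limit defaults are assumed.
function updateDamping(damping, fitImproved, options) {
  const {
    dampingDrop = 0.1,  // fit got better → decrease damping
    dampingBoost = 1.5, // fit got worse → increase damping
    minDamping = Number.EPSILON, // placeholder default
    maxDamping = 1e9,            // placeholder default
  } = options;
  const next = damping * (fitImproved ? dampingDrop : dampingBoost);
  return Math.min(Math.max(next, minDamping), maxDamping);
}

// Inside the fitting loop, after evaluating a trial step:
// damping = updateDamping(damping, newResiduals < prevResiduals, options);
```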
A few things that need to be done before merging:

- [ ] … the `damping` parameter
- [ ] … `errorTolerance`
- [x] ~~Find a way to smoothly transition between `errorPropagation.rough` and `errorPropagation.fine`~~ (I decided to drop this API)
- [x] ~~Find a way to avoid local minima~~ (Probably not in the scope of this PR)
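For completeness, a hypothetical usage sketch of the API described above. The data shape and option names follow this description; the return shape assumes the `parameterError` → `residuals` rename discussed in this PR.

```js
import levenbergMarquardt from 'ml-levenberg-marquardt';

// Known standard deviations of the measured data (the new optional fields)
const data = {
  x: [0, 1, 2, 3, 4],
  y: [0.2, 2.1, 3.9, 6.2, 7.8],
  xError: [0.05, 0.05, 0.05, 0.05, 0.05],
  yError: [0.2, 0.2, 0.2, 0.2, 0.2],
};

const line = ([a, b]) => (x) => a * x + b;

const result = levenbergMarquardt(data, line, {
  initialValues: [1, 0],
  minValues: [-10, -10],
  maxValues: [10, 10],
  dampingDrop: 0.1,
  dampingBoost: 1.5,
  errorPropagation: 50, // evaluations used to approximate error propagation
});

console.log(result.parameterValues); // fitted [a, b]
console.log(result.residuals);       // renamed from parameterError in this PR
```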