[v1.1] Add support for errors-in-variables #19
base: main
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master      #19      +/-   ##
==========================================
- Coverage   98.63%   91.79%    -6.84%
==========================================
  Files           3        3
  Lines          73      134      +61
  Branches       20       29       +9
==========================================
+ Hits           72      123      +51
- Misses          1       10       +9
- Partials        0        1       +1
```
Continue to review full report at Codecov.
Hello @m93a
@jobo322 Hey, if you have doubts about any specific claim, I'll try to provide more specific sources 🙂
Hello @m93a, I will check the code as soon as possible, and I'll also review the test cases. Again, many thanks for your collaboration.
Now the only thing left is to add new tests and evaluate why the old ones fail.
Hello @m93a, I'm looking for an explanation of why the parameters in the `sinFunction` test case always go down. I have been trying different `dampingBoost`, `dampingDrop`, and `damping` values, but the result is the minimum values defined by `minValues` unless you set initial values close to the real parameters. Do you have an idea how to control it?
Hey @jobo322, EDIT: It would be really helpful to make a plot with the algorithm's “trajectory” in the parameter space, rendered on top of a heatmap of that space.
Hello @m93a, and thanks a lot for your work! I hope you don't mind, I pushed a change to your branch to fix the conflicts with master and outstanding issues with eslint. In case you haven't seen it, we have an …
I don't think we need to support legacy options. Let's design a good API and get rid of the inconsistencies. |
```diff
@@ -12,7 +12,7 @@
   "scripts": {
     "eslint": "eslint src",
     "eslint-fix": "npm run eslint -- --fix",
-    "prepare": "rollup -c",
+    "preparea": "rollup -c",
```
Was this intended? I admittedly have no idea what it should do, but it looks a lot like a typo to me 😀
Haha, it was a temporary rename because `npm install` was failing 🙃
So yes, typo. You can push up a fix.
```diff
@@ -1,6 +1,6 @@
 {
   "name": "ml-levenberg-marquardt",
-  "version": "1.0.3",
+  "version": "1.1.0",
```
I think it's customary to separate release bump commits from other work (some package maintainers get really peeved about this). @targos Are there strong opinions about this for this project?
Yes, we use a tool that bumps the version number automatically (based on the commit messages) when we publish the release.
```js
      Math.max(minValues[k], parameters[k]),
      maxValues[k]
    );
    params2[k] = Math.min(Math.max(minValues[k], params2[k]), maxValues[k]);
```
`params2` seems like a name that isn't highly expressive. Am I understanding correctly that what you're doing here is avoiding overwriting the previous parameters in case they produce better results? If so, perhaps it'd be better to use names like `prevParams` and `updatedParams`, or `lastParams` and `nextParams`, etc.
```js
var converged = error <= errorTolerance;
var residuals = sumOfSquaredResiduals(data, params, paramFunction);
```
We seem to be mixing `var` and `let`/`const`. I don't know if the switch belongs in this PR, but I think we should migrate away from using `var` unless we're trying to be compatible with really old engines. (And with the move to TS, that wouldn't be a problem anyway, since it could just be translated down at the end.)
We can migrate to `let` and `const` with TypeScript.
I'm used to `let` and `const` from TS, so it's possible that I accidentally used them instead of `var` in some places. I was trying to use `var` as the old code was written that way and I didn't want to decrease compatibility with older engines. Moving to TS would be really cool.
```js
 * @prop {number} [residualEpsilon = 1e-6] - Minimum change of the sum of residuals per step – if the sum of residuals changes less than this number, the algorithm will stop
 * @prop {Array<number>} [maxValues] - Maximum values for the parameters
 * @prop {Array<number>} [minValues] - Minimum values for the parameters
 * @prop {{rough: number, fine: number}} [errorPropagation] - How many evaluations (per point per step) of the fitted function to use to approximate the error propagation through it
```
Where are these `rough` and `fine` values used? (Are these just leftovers from fc79bc6?)
Yes, I apparently forgot to remove them. In the best-case scenario, the algorithm would interpolate between those two as it approached the minimum, but I didn't manage to do it reliably, so it's best to remove those options.
FYI I am taking a stab at this also: https://github.com/jacobq/levenberg-marquardt/tree/support-error
```js
initializeErrorPropagation(errorPropagation);

/** @type Array<number> */
var residualDifferences = Array(10).fill(NaN);
```
What is the significance of `10` here? It seems like a "magic number".
Am I correct in interpreting the intention here as: (1) require at least 10 iterations, and (2) stop once the largest change in the last 10 residuals is less than `residualEpsilon`?
That's correct. The algorithm just “takes a break” sometimes and other times it starts slow and gets up to speed after some iterations. The number 10 worked best for me, but you're right that it should be configurable.
Well, I'm not saying that it should be configurable (maybe it should, maybe not; I don't know yet). I'm saying that the code should be more expressive (Note: this is just a free article I found from searching the web; there are much better resources on this topic, such as the book "Clean Code" by Robert C. Martin). In other words, when a developer looks at this code, he or she should not have to ask "What does this number mean?" -- the answer should be already provided by giving it a name.
In fact, I would even suggest taking it a step further to cover the question of "Why is this array being filled with `NaN`?" However, I don't think the solution to that problem lies in improving the name but rather in refactoring the code. It seems like we are only concerned about (1) whether or not we have taken at least 10 steps and (2) whether or not the biggest change in the last 10 steps was smaller than some given threshold. So, for example, we could do something like this instead of having an array, filling it with `NaN`, shifting it, scanning it, etc.:
```js
// Outside of loop
const minIterations = 10; // We won't use the `residualEpsilon` stopping condition until it's been stable for this many steps
let maxEpsilon = -Infinity; // This will get replaced during the first step because the new values must be > -Infinity (alternatively we could set it to null or undefined and handle that as a special case)
let iterationsRemaining; // This count-down will get initialized during the first step because we initialize maxEpsilon to its minimum

// Inside of loop
// ... algorithm takes a step ...
let currEpsilon = prevResiduals - currResiduals; // (may need to ensure that this value is >= 0)
if (currEpsilon > maxEpsilon) {
  // We just found a new maximum, so we reset the count-down before allowing this to be used as a stopping condition
  // This eliminates the need to maintain an array of previous epsilon values
  maxEpsilon = currEpsilon;
  iterationsRemaining = minIterations;
} else {
  // maxEpsilon did not change, so we can decrement our counter (limiting to zero)
  iterationsRemaining--;
  if (iterationsRemaining < 0) {
    iterationsRemaining = 0;
  }
}
const isChangeTooSmall = maxEpsilon <= residualEpsilon;
if (isChangeTooSmall && iterationsRemaining === 0) {
  // ... stop ...
}
```
```js
  });
});

describe('error propagation estimator', () => {
```
@m93a Could you please help me understand these test values? When I look at them, it is not at all obvious to me what the correct values should be. (Partly because I am not sure how `errorPropagation` works, other than that it seems to take pseudo-random steps around a point and estimate the slope there.)
The `errorPropagation` method tries to approximate the propagation of uncertainty through the function, given a variable with known error. Since we know the mean as well as the standard deviation of the variable, we can assume it's normally distributed and choose equiprobable intervals where the variable's measured value might actually be.
This means that if we slice the interval of the variable's possible values into e.g. 10 equiprobable intervals, the measured value is equally likely to fall in the first interval as in the second, third, etc. For practical purposes, I don't actually work with all of the possible values; I only use values between x̅ − 2.25σ and x̅ + 2.25σ. (There's a 97.6 % probability that the measured value will be between those values.) This process is not pseudo-random but completely deterministic – the intervals are computed using the CDF of the normal distribution.
Then, the functional values on the boundaries of the intervals are computed. This way the function can be approximated as a linear function on each interval. When a variable “passes through” a linear function, its standard error gets “stretched” (i.e. multiplied) by the slope of the function. If a variable has the same probability of passing through multiple linear functions, its standard error (on average) gets stretched by the average of the slopes of those linear functions.
Of course, the error distribution of the functional value most probably won't be a Gaussian curve, even if the input variable was normally distributed in the first place. But as our fitting algorithm can't deal with those either (i.e. it's not robust), it's not a big deal. We can simply pretend that all of the resulting distributions are close enough to the normal distribution (which is mostly true as long as the errors are small).
(An uncanny fact as a bonus: if the error is large enough and the function is very unsymmetrical, the mean of the functional value won't be equal to the functional value of the mean. For now let's just pretend those cases never occur.)
Now about the test values. You're right that the values are not exactly obvious. The results should match the standard deviation of normally distributed data that passed through the function. One could use Monte Carlo to get the right value.
```js
// Pseudo-code to validate the results of errorPropagation
const mean = 42;
const sigma = 0.1;
let data = Array(BILLIONS_AND_BILLIONS).fill(0); // fill so that map visits every slot
data = data.map(() => normal.random(mean, sigma));
data = data.map((x) => fn(x));
const propagatedSigma = stdev(data);
expect(errCalc.errorPropagation(fn, mean, sigma))
  .toBeCloseTo(propagatedSigma, DEPENDS_ON_NUMBER_OF_INTERVALS);
```
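And for comparison, here is a minimal sketch of the deterministic interval method itself. This is not the actual `errorPropagation` implementation from this PR; `phi`/`probit` are stand-in helpers for the normal CDF and its inverse, and the interval count of 10 is just an illustration.

```js
// Standard normal CDF (Abramowitz–Stegun approximation 26.2.17)
function phi(x) {
  const t = 1 / (1 + 0.2316419 * Math.abs(x));
  const d = 0.3989423 * Math.exp((-x * x) / 2);
  const p = d * t * (0.3193815 + t * (-0.3565638 + t * (1.7814779 + t * (-1.821256 + t * 1.3302744))));
  return x > 0 ? 1 - p : p;
}

// Inverse CDF via bisection (phi is monotonic)
function probit(p) {
  let lo = -8;
  let hi = 8;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (phi(mid) < p) lo = mid;
    else hi = mid;
  }
  return (lo + hi) / 2;
}

function propagateError(fn, mean, sigma, intervals = 10) {
  // Only use values between x̅ − 2.25σ and x̅ + 2.25σ (≈ 97.6 % of the mass)
  const pLo = phi(-2.25);
  const pHi = phi(2.25);
  // Boundaries of equiprobable intervals, computed from the normal CDF
  const bounds = [];
  for (let i = 0; i <= intervals; i++) {
    const p = pLo + ((pHi - pLo) * i) / intervals;
    bounds.push(mean + sigma * probit(p));
  }
  // Approximate fn as linear on each interval and average the |slopes|
  let slopeSum = 0;
  for (let i = 0; i < intervals; i++) {
    const dy = fn(bounds[i + 1]) - fn(bounds[i]);
    const dx = bounds[i + 1] - bounds[i];
    slopeSum += Math.abs(dy / dx);
  }
  // The standard error gets "stretched" by the average slope
  return sigma * (slopeSum / intervals);
}

// Sanity check with a linear function, where the exact answer is |slope|·σ
console.log(propagateError((x) => 3 * x + 1, 42, 0.1)); // ≈ 0.3
```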
Wow, this was really time-consuming, I'm taking a break for today. I've got other things to do 😉
Thanks for the detailed explanation! I plan to read through it a few times and attempt to make the test cases more expressive once I understand it.
```js
function sinFunction([a, b]) {
  return (t) => a * Math.sin(b * t);
}

describe('errorCalculation test', () => {
  describe('sum of residuals', () => {
    it('Simple case', () => {
```
Could we simplify this test by making the `sampleFunction` always evaluate to a constant value (perhaps `0`)? I don't see why it should matter whether we use `sin` or any other particular function, since `sumOfResiduals` should be purely a function of the difference between the function's output and the corresponding value in the set of data points, e.g. something like this:
```js
const sampleFunction = ([k]) => ((x) => k); // resulting function returns a constant
const n = 3;
const xs = new Array(n).fill(0).map((zero, i) => i); // [0, 1, ..., n-1] but doesn't matter because f(x; [k]) -> k
const data = {
  x: xs,
  y: xs.map(sampleFunction([0])) // all zeros
};
expect(errCalc.sumOfResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfResiduals(data, [2], sampleFunction)).toBeCloseTo(2 * n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [0], sampleFunction)).toBeCloseTo(0, 1);
expect(errCalc.sumOfSquaredResiduals(data, [1], sampleFunction)).toBeCloseTo(n, 1);
expect(errCalc.sumOfSquaredResiduals(data, [2], sampleFunction)).toBeCloseTo(4 * n, 1); // each residual is 2, squared is 4, summed over n points
```
Yes, a simpler function could be used.
I realize now that all your change really does here is update the name. The simplification of the test function is outside the scope of this PR. I've attempted to address this in #26, so after that merges this PR can be rebased to resolve this.
```js
 * @param {number} x
 * @return {number}
 */
const sq = (x) => x * x;
```
FYI, this isn't really any faster than `Math.pow(x, 2)`, which has been part of the ES standard since the early days. (You can also use `x ** 2` if you don't care about IE.)
I doubt that the code in the perf actually does anything. Since you're effectively throwing away the result right after the computation, the interpreter probably optimizes that part out and just iterates `i` (probably by more than `1`).

You're probably right about this function not adding very much to performance. If you don't like it, just remove it, but it seems quite readable to me.
I guess what I'm trying to say is that if you're doing it for performance reasons then it's probably better to remove it. If you think it is more readable I can see why that might be a good reason. However, I think it's hard to argue that a custom function that is identical to a built-in one is easier to read (because one has to look it up to see what it does rather than just recognize it), though in this case it matters little either way since the function is so simple.
You are probably right about jsPerf...sometimes it's too easy to fool oneself. Here's an updated example adapted to reduce the likelihood of this kind of mistake. It still shows less than 1% difference between the two methods of interest.
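Here's a rough sketch of the kind of harness I mean, with the results accumulated into a checksum so the JIT can't eliminate the computation as dead code. (This is illustrative only, not a substitute for a proper benchmark tool.)

```js
// Accumulate results so the computation can't be optimized away as dead code.
function bench(label, f, iterations = 1e7) {
  let acc = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    acc += f(i % 1000);
  }
  const elapsed = Date.now() - start;
  console.log(`${label}: ${elapsed} ms (checksum ${acc})`); // printing the checksum defeats dead-code elimination
}

bench('x * x', (x) => x * x);
bench('Math.pow(x, 2)', (x) => Math.pow(x, 2));
bench('x ** 2', (x) => x ** 2);
```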
You're right, the performance differences are negligible. I would choose `x ** 2` then, as it seems to have the best readability. And when we move to TypeScript, it will be compiled to `Math.pow`, so we won't lose compatibility with old browsers.
@jobo322 @m93a Regarding the question about divergence in the `sinFunction` test case: here's a 2D plot showing how SSR varies with the frequency parameter when the amplitude is held correct, and here's a surface plot you can explore on jsbin.com. (In these examples I used …)
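To make that concrete, here's a small sketch (my own illustration, with assumed "true" parameters – the actual test may differ) that reproduces such a 1-D slice by scanning the frequency `b` while holding the amplitude `a` at its true value:

```js
// Assumed "true" parameters for illustration
const fn = ([a, b]) => (t) => a * Math.sin(b * t);
const [aTrue, bTrue] = [2, 2];

const xs = Array.from({ length: 100 }, (_, i) => i / 10); // t ∈ [0, 9.9]
const ys = xs.map(fn([aTrue, bTrue]));

// Sum of squared residuals for a given parameter vector
function ssr(params) {
  const f = fn(params);
  return xs.reduce((sum, x, i) => sum + (ys[i] - f(x)) ** 2, 0);
}

// Scan b with a held at its true value: the printed curve oscillates and is
// full of local minima, which is why a start far from bTrue can slide toward minValues.
for (let b = 0; b <= 4; b += 0.05) {
  console.log(b.toFixed(2), ssr([aTrue, b]).toFixed(2));
}
```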
Since there is a lot of work in this PR I'd like to break it down into smaller bites (each as their own commit), and possibly even merge them separately. Thoughts? (I don't mind doing the rebasing myself, if it's helpful.)
I don't have an opinion on this, we'd better ask @targos 🙂 Should I push all the minor fixes we talked about before you start rebasing? Or will you take care of it yourself? I'm talking specifically about:

- fixing the `preparea` typo in package.json,
- removing the leftover `rough`/`fine` jsDoc for `errorPropagation`,
- replacing `sq(x)` with `x ** 2`.
Yes, please. I plan to work on other branches and cherry-pick your commits, so feel free to keep working here. (I'm also not sure how much time I'll have available to work on this over the next 2 weeks.)
@m93a just checking to see if you have done any more work related to this. I am planning to get back to this in the next week or two.
Sorry, I didn't have much time because of school and the other project that I'm working on (which will need this repo as a dependency, but we're not quite there yet). I'm not sure if I'll have spare time sooner than a month from now, so feel free to work on this until then 🙂
I'm very curious about any progress on this proposed upgrade of the algorithm. Maybe you can get some help figuring out your problem by looking into this implementation.
You know, I meant to get around to this but never did. Thanks for the reminder and the link. @m93a did you / do you plan to work more on this any time soon?
Hey, …
This pull request resolves #17 by adding support for regression with known standard deviations of data. This can be done with the optional parameters `xError` and `yError` on the `data` object passed to the algorithm. Furthermore, I implemented adaptive fitting by changing the lambda parameter. I tried as hard as I could to maintain the good taste of the code, but I had to disable some overly restrictive eslint rules.

I've done these changes:

- Renamed `minValue` and `maxValue` to `minValues` and `maxValues` respectively. This way they're more consistent with the `initialValues` option. I've exposed these options in jsDoc, which fixes "Constraint parameter values (feature request)" #15.
- Renamed `parameterError` on the returned object to `residuals`, which is a much better name, as I describe in "Option `errorTolerance` is misnamed and theoretically wrong" #18. However, that's a breaking change – this needs to be taken care of.
- Changed `errorTolerance`, which fixes "Option `errorTolerance` is misnamed and theoretically wrong" #18.
- Made the `damping` parameter adaptive (see the sketch below). The new options are:
  - `dampingDrop` – if the fit got better, multiply `damping` by this factor, default value: `0.1`
  - `dampingBoost` – if the fit got worse, multiply `damping` by this factor, default value: `1.5`
  - `minDamping` and `maxDamping` – limits of the `damping` parameter
- Added support for passing `data.xError` and `data.yError` to the algorithm. The new options are:
  - `errorPropagation` – how many iterations are used to approximate the error propagation through the parametrized function, default value: `50`
  - There used to be `rough` and `fine` options to distinguish between rough guesses far from the minimum and more precise estimations near the minimum, but because I wasn't able to avoid getting stuck in local minima because of a suboptimal "switching strategy", I decided to drop this distinction.

Both the old and my improved version of this algorithm are quite prone to local minima. My thoughts were that if we did a rough check of the parameter space (computing the residuals for various parameters) before the actual fitting, it could improve the results. But I'd better find some paper on this beforehand.
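As a rough sketch of how the adaptive damping options interact (the option names follow the description above; the default limits for `minDamping`/`maxDamping` are placeholders, since they aren't stated here):

```js
// Sketch only – option names from the PR description; limit defaults are assumed.
function updateDamping(damping, fitImproved, options) {
  const {
    dampingDrop = 0.1,  // fit got better → decrease damping
    dampingBoost = 1.5, // fit got worse → increase damping
    minDamping = Number.EPSILON, // placeholder default
    maxDamping = 1e9,            // placeholder default
  } = options;
  const next = damping * (fitImproved ? dampingDrop : dampingBoost);
  return Math.min(Math.max(next, minDamping), maxDamping);
}

// Inside the fitting loop, after evaluating a trial step:
// damping = updateDamping(damping, newResiduals < prevResiduals, options);
```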
A few things that need to be done before merging:

- [ ] … the `damping` parameter
- [ ] … `errorTolerance`
- [x] ~~Find a way to smoothly transition between `errorPropagation.rough` and `errorPropagation.fine`~~ (I decided to drop this API)
- [x] ~~Find a way to avoid local minima~~ (Probably not in the scope of this PR)
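For completeness, a hypothetical usage sketch of the API described above. The data shape and option names follow this description; the return shape assumes the `parameterError` → `residuals` rename discussed in this PR.

```js
import levenbergMarquardt from 'ml-levenberg-marquardt';

// Known standard deviations of the measured data (the new optional fields)
const data = {
  x: [0, 1, 2, 3, 4],
  y: [0.2, 2.1, 3.9, 6.2, 7.8],
  xError: [0.05, 0.05, 0.05, 0.05, 0.05],
  yError: [0.2, 0.2, 0.2, 0.2, 0.2],
};

const line = ([a, b]) => (x) => a * x + b;

const result = levenbergMarquardt(data, line, {
  initialValues: [1, 0],
  minValues: [-10, -10],
  maxValues: [10, 10],
  dampingDrop: 0.1,
  dampingBoost: 1.5,
  errorPropagation: 50, // evaluations used to approximate error propagation
});

console.log(result.parameterValues); // fitted [a, b]
console.log(result.residuals);       // renamed from parameterError in this PR
```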