Accuracies lower than 50% if the random seed is unlucky #207
The main issue is that you have too little data. If -b 1 is used, which is for probabilistic outputs, we internally conduct a cross-validation process, so there is some randomness. To have deterministic results, either fix the seed or, if probability outputs are not needed, remove -b 1.
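For readers who want the deterministic route: libsvm's internal cross validation for Platt scaling shuffles with the C library's rand(), so seeding it once before training pins the result. A minimal sketch against libsvm's public API (the wrapper name and the seed value 0 are just illustrative):

```c
/* Minimal sketch: making probability training reproducible.
 * libsvm's Platt-scaling step shuffles data with rand(), so
 * fixing the seed before svm_train() makes -b 1 deterministic. */
#include <stdlib.h>
#include "svm.h"

struct svm_model *train_reproducibly(const struct svm_problem *prob,
                                     struct svm_parameter *param)
{
	param->probability = 1;  /* same effect as the -b 1 command-line flag */
	srand(0);                /* fixed seed -> deterministic internal CV split */
	return svm_train(prob, param);
}
```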
@cjlin1 thank you very much for your reply. In my case I need probabilistic results, so the -b 1 option cannot be removed. Yes, I could change the seed to make this particular example work, but in general I cannot consider this a solution, since for other input data I could fall into the same low-accuracy problem. Why do you think that constraining the sigmoid function to be always increasing (given that we expect high probabilities for samples labeled +1 and low probabilities for those labeled -1) could not be a solution? Are there any drawbacks in constraining it in some way?
I think we do have that the sigmoid is always increasing, so I don't understand your question.
I'm sorry: I probably made some confusion with the sign of probA (I edited the above messages to fix them). I'll try to explain better in other words.

Consider the input data attached to my first message. Working on this data, I found that when setting (at startup) the random generator seed to some integer, normally (i.e., for about 96% of these seeds) the resulting trained probabilistic model has 100% accuracy. However, for some seeds (only about 3.5%; an example is srand(42) on my machine) the trained model has an accuracy of 0%.

I noticed that 0%-accuracy models have a positive value for probA, while 100%-accuracy models have a negative one. The sigmoid function is defined as SF(x) = 1/(1+exp(probA*x+probB)), where x is the decision value. I stated that 0%-accuracy models are associated with a decreasing sigmoid function because, for x tending to +infinity, SF(x) tends to 1 if probA < 0 and to 0 if probA > 0.

Considering the fact that we expect high probabilities (i.e., high values of SF(x)) for samples labeled +1 and low probabilities for those labeled -1, I suspect there is room for improvement if we constrain probA to always be lower than 0.

I attach the adaptation of svm-train.c that I used to make the experiments (https://github.com/cjlin1/libsvm/files/13238410/svm-train-adapted.zip), hoping it can help.
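For reference, the probability libsvm reports is exactly this sigmoid applied to the decision value. Below is a minimal standalone sketch of that mapping (mirroring the formula above, in a numerically stable form; the function and variable names are illustrative), showing how the sign of A flips the prediction:

```c
#include <math.h>
#include <stdio.h>

/* Platt sigmoid: maps a decision value x to P(y = +1) = 1/(1 + exp(A*x + B)),
 * computed in a form that avoids exp() overflow for large |A*x + B|. */
double platt_sigmoid(double x, double A, double B)
{
	double fApB = A * x + B;
	if (fApB >= 0)
		return exp(-fApB) / (1.0 + exp(-fApB));
	else
		return 1.0 / (1.0 + exp(fApB));
}

int main(void)
{
	/* A < 0: increasing sigmoid; a large decision value gets p near 1. */
	printf("A=-2: p(x=3) = %f\n", platt_sigmoid(3.0, -2.0, 0.0)); /* ~0.9975 */
	/* A > 0: decreasing sigmoid; the same sample gets p near 0. */
	printf("A=+2: p(x=3) = %f\n", platt_sigmoid(3.0, 2.0, 0.0));  /* ~0.0025 */
	return 0;
}
```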
What if you put 10 copies of the same data together as input? I suspect the situation may be improved.
Yes, you are correct. By simply replicating the input data 10 times (so the input file becomes 10 concatenated copies of the original), all 1000 experimented seeds lead to a 100%-accuracy model. I think, however, that there is no reason not to try to directly improve the algorithm so that it also works well for lower-cardinality datasets, as is often the case.
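For completeness, a trivial sketch of that replication step (the file names are placeholders; any equivalent concatenation works):

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch: write 10 concatenated copies of a libsvm-format data file. */
int main(void)
{
	const int copies = 10;
	FILE *out = fopen("data_x10.txt", "w");
	if (!out) return EXIT_FAILURE;
	for (int i = 0; i < copies; i++) {
		FILE *in = fopen("data.txt", "r");
		if (!in) return EXIT_FAILURE;
		int c;
		while ((c = fgetc(in)) != EOF)
			fputc(c, out);   /* copy the file byte by byte */
		fclose(in);
	}
	fclose(out);
	return EXIT_SUCCESS;
}
```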
I tried to impose the constraint probA < 0 by adding the line

> newA = newA > -eps ? -2 * eps - newA : newA;

immediately after the

> newA = A + stepsize * dA;

in the backtracking loop of the sigmoid_train function. Furthermore, I set the initial value to A = 1 instead of A = 0.

In practice, I implemented the constraint by reflecting, at every iteration, the point (A, B) of the parameters' search space around the line A = -eps (see the sketch below).

With these changes, even if the random seed choice is unfortunate, the accuracies of the trained models never fall below 50%. This happens because, in the worst case, the samples are all classified into the same class by a very flat sigmoid (when A is near 0); but, unlike before, the classification can no longer be opposite to the labeling.

Are there any disadvantages I didn't foresee in these modifications?
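To make the proposed change concrete, here is a self-contained sketch of just the reflection step (the helper name is illustrative; in the actual patch this logic sits inline in sigmoid_train's line search, as quoted above):

```c
#include <stdio.h>

/* Reflect a candidate A across the line A = -eps so the fitted slope
 * stays strictly negative (the proposed probA < 0 constraint). */
static double reflect_A(double newA, double eps)
{
	return (newA > -eps) ? -2.0 * eps - newA : newA;
}

int main(void)
{
	double eps = 1e-5;
	/* A candidate step that would make the sigmoid decreasing... */
	double newA = 0.3;
	/* ...is mapped back into the feasible half-plane A <= -eps. */
	printf("%g -> %g\n", newA, reflect_A(newA, eps)); /* 0.3 -> -0.30002 */
	return 0;
}
```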
It's OK to impose such a constraint, but then this is a constrained optimization problem: either a constrained optimization algorithm must be used, or you need to prove the convergence of your setting.
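For comparison, the textbook constrained-optimization handle here would be a projection rather than a reflection: clamp the step back onto the feasible set {A <= -eps}. A minimal sketch (project_A is a hypothetical helper, not libsvm code); projections onto convex sets are non-expansive, which is the usual ingredient in convergence arguments for projected first-order methods:

```c
#include <stdio.h>

/* Project a candidate A onto the feasible half-line A <= -eps:
 * the update a projected-gradient method would use instead of
 * the reflection proposed above. */
static double project_A(double newA, double eps)
{
	return (newA > -eps) ? -eps : newA;
}

int main(void)
{
	printf("%g\n", project_A(0.3, 1e-5));  /* -1e-05: clamped to boundary */
	printf("%g\n", project_A(-0.7, 1e-5)); /* -0.7: already feasible */
	return 0;
}
```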
In the context of binary SVM classification problems, while I was doing experiments on this input data (consisting of 18 samples, divided into two classes of equal cardinality), I ran into a model with 0% accuracy. The parameters I used for training are: -g 16 -c 32 -b 1.
While attempting to understand the reasons for this 0% accuracy, I concluded that it is a consequence of the choice of the random seed.
Hence, I ran multiple tests with different seeds. Using the first 1000 integers as seeds, I obtained results that can be summarized as follows: for about 96% of the seeds the trained model has 100% accuracy, while for the remaining ones (about 3.5%; an example is srand(42) on my machine) it has 0% accuracy.
I noticed that the following holds for all 1000 seeds: the trained model has 100% accuracy if and only if probA < 0. In fact, the condition

probA < 0

corresponds to having an increasing sigmoid function for modelling the decision values' response (a short derivation is sketched below).
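A short derivation of that equivalence, in the thread's notation:

```latex
% Sign of probA vs. monotonicity of the Platt sigmoid.
% Writing a = probA, b = probB:
\[
  \mathrm{SF}(x) = \frac{1}{1 + e^{ax+b}},
  \qquad
  \mathrm{SF}'(x) = \frac{-a\,e^{ax+b}}{\bigl(1 + e^{ax+b}\bigr)^{2}}.
\]
% The exponential and the squared denominator are positive, so
% sign(SF') = sign(-a): SF is increasing iff a < 0, in which case
% SF(x) -> 1 as x -> +infinity and SF(x) -> 0 as x -> -infinity.
```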
Given this long premise, my questions are: is this seed-dependent 0% accuracy expected behavior, and could constraining probA to be negative avoid it? Thank you very much to anyone who will contribute.
PS: looking for similar questions, I didn't find an answer to mine, but I suspect the questions here are related to issues #152, #153 and #155.