
About the rocauc value #52

Open
IItaly opened this issue Jun 9, 2021 · 16 comments

IItaly commented Jun 9, 2021

Hi, @CrohnEngineer
I found the way the AUC value is calculated a bit strange. The raw score is used here without any normalization, and some of the values are greater than 1. Is this the right way to use this function?
rocauc = M.roc_auc_score(df_res['label'],df_res['score'])


IItaly commented Jun 9, 2021

It seems that the value is calculated after taking the mean, so maybe there is no issue after all.

CrohnEngineer (Collaborator) commented

Hey @IItaly ,

if you are referring to the Analyze results notebook, you are right: in the compute_metrics function, before computing the ROC curve, we should apply a sigmoid to normalize the scores.
We do this in the Analyze results net fusion notebook, where we report the results presented in the paper.
If you look at cell 23, we compute the ROC curve after applying a sigmoid (the expit function) to normalize the scores between 0 and 1.
Thank you for pointing this out, I'll fix it in the next commit :)
Bests,

Edoardo

P.s. In any case, please be aware that it is not strictly necessary to have the scores normalized between 0 and 1, as sklearn's ROC computation automatically finds the appropriate thresholds independently of the numeric range of the scores. It is fairer though, as this is the way we presented the results in the paper, so we will fix this as soon as possible :)
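The threshold-invariance mentioned here is easy to check: expit is strictly monotonic, so it preserves the ranking of the scores, and roc_auc_score depends only on that ranking. A minimal sketch with made-up labels and raw logits (not from the repository):

```python
import numpy as np
from scipy.special import expit  # the logistic sigmoid
from sklearn.metrics import roc_auc_score

# Made-up ground-truth labels and raw (unbounded) network scores.
labels = np.array([0, 0, 0, 1, 1, 1])
raw_scores = np.array([-2.3, 0.4, 0.9, 1.7, 0.1, 2.2])

auc_raw = roc_auc_score(labels, raw_scores)
auc_norm = roc_auc_score(labels, expit(raw_scores))  # scores now in (0, 1)

# The sigmoid keeps the ranking, so the AUC is unchanged.
print(auc_raw, auc_norm)
```

Both calls give the same AUC; normalizing only matters for threshold-based metrics (e.g. accuracy at 0.5) and for comparability with the paper.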

@CrohnEngineer CrohnEngineer self-assigned this Jun 9, 2021
@CrohnEngineer CrohnEngineer added the enhancement New feature or request label Jun 9, 2021

IItaly commented Jun 9, 2021

Thank you for your reply. Maybe that's why I got a higher score than the one reported in your paper?


IItaly commented Jun 9, 2021

It seems that when calculating the video-level score, the average value is taken here, and the score is not normalized between 0 and 1. Is this an inaccurate calculation of AUC?
df_videos = df_frames[['video','label','score']].groupby('video').mean()
df_videos['label'] = df_videos['label'].astype(np.bool)
results_video_list.append(compute_metrics(df_videos,train_model_tag))
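For reference, that aggregation boils down to averaging the frame scores per video. A self-contained sketch with invented toy data (the column names follow the snippet above; note that np.bool is deprecated in recent NumPy, so plain bool is used here instead):

```python
import pandas as pd

# Toy frame-level results: two frames per video, raw (unnormalized) scores.
df_frames = pd.DataFrame({
    'video': ['a', 'a', 'b', 'b'],
    'label': [True, True, False, False],
    'score': [2.1, 1.5, -0.7, -1.9],
})

# Average label and score over the frames of each video.
df_videos = df_frames[['video', 'label', 'score']].groupby('video').mean()
# The mean turns the boolean labels into floats; cast them back.
df_videos['label'] = df_videos['label'].astype(bool)
print(df_videos)
```

The resulting per-video score is still on the raw, unbounded scale; the normalization question discussed below is about what happens after this step.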


IItaly commented Jun 9, 2021

Is it right to put it between 0 and 1?
(screenshot)

Looking forward to your reply

CrohnEngineer (Collaborator) commented

Hey @IItaly ,

> Thank you for your reply. Maybe that's why I got a higher score than the one reported in your paper?

it might be. Did you manage to re-run the pipeline? In #47 we found out that you used a higher number of iterations than those used in the paper, right?
Unfortunately, on our side we have had a very busy period in the lab and haven't found the time to re-run the experiments, sorry :(

> It seems that when calculating the video-level score, the average value is taken here, and the score is not normalized between 0 and 1. Is this an inaccurate calculation of AUC?
> df_videos = df_frames[['video','label','score']].groupby('video').mean()
> df_videos['label'] = df_videos['label'].astype(np.bool)
> results_video_list.append(compute_metrics(df_videos,train_model_tag))

We compute the average of the non-sigmoid scores for all frames. That will be our raw score for the video, that then must be normalized between 0 and 1 for computing the ROC curve. Is that what you were asking?

> Is it right to put it between 0 and 1?

Instead of looking for where the score is > or < than 0, you can directly compute the normalized score with something along these lines: df_videos['norm_score'] = df_videos['score'].apply(expit). I'm not sure about the syntax; in any case I suggest using the apply or map function of Pandas.
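The suggested syntax does work as written, and since expit is vectorized it can also be applied to the whole column directly. A small sketch with invented per-video scores:

```python
import pandas as pd
from scipy.special import expit

# Invented per-video raw scores (what groupby('video').mean() would produce).
df_videos = pd.DataFrame({'score': [1.8, -1.3, 0.0]}, index=['a', 'b', 'c'])

# Element-wise, via pandas apply ...
df_videos['norm_score'] = df_videos['score'].apply(expit)
# ... or vectorized, which is equivalent and faster:
# df_videos['norm_score'] = expit(df_videos['score'])
print(df_videos['norm_score'])
```

Either way every normalized score lands in the open interval (0, 1), with raw score 0 mapping to exactly 0.5.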
Hope these answers clarify things up!
Bests,

Edoardo


IItaly commented Jun 10, 2021

The score is still higher with 20000 iterations.
So I guess maybe that is the reason.

> We compute the average of the non-sigmoid scores for all frames. That will be our raw score for the video, that then must be normalized between 0 and 1 for computing the ROC curve. Is that what you were asking?

Yes. I want to get the correct AUC value.

> Instead of looking for where the score is > or < than 0, you can directly compute the normalized score with something along these lines: df_videos['norm_score'] = df_videos['score'].apply(expit). I'm not sure about the syntax; in any case I suggest using the apply or map function of Pandas.
> Hope these answers clarify things up!

I did it like in the picture, but the final value didn't change.

CrohnEngineer (Collaborator) commented

> Yes. I want to get the correct AUC value.

Then yes, you should average the score over all frames for each video, and then normalize between 0 and 1 with the expit function (or any sigmoid function you prefer).
If you look at cell 23 of the notebook I showed you in the previous comment, you could compute the AUC like this:

df = df.groupby('video')
df = df.mean()
results_df['loss'] = log_loss(df['label'], expit(np.mean(df['score'], axis=1)))
results_df['auc'] = M.roc_auc_score(df['label'], expit(np.mean(df['score'], axis=1)))

> I did it like in the picture, but the final value didn't change.

I'm sorry, I'm not sure I understood: computing the score the way you did in the picture, did you obtain a similar value to the one you had without normalizing?


IItaly commented Jun 10, 2021

> I'm sorry, I'm not sure I understood: computing the score the way you did in the picture, did you obtain a similar value to the one you had without normalizing?

Yes, you're right. I did as shown in the picture, but the final value didn't change. I think I can try it your way.


IItaly commented Jun 11, 2021

Sorry, I don't quite understand the role of np.mean() here, but it seems that it can't be used, because it raises:
ValueError: No axis named 1 for object type Series

(screenshot)

CrohnEngineer (Collaborator) commented

Hey @IItaly ,

you're right, you shouldn't need it.
Just try doing expit(df['score']), as you should already have the mean of the scores for each video thanks to the groupby above.


IItaly commented Jun 11, 2021

The result of not using expit() is the same as using it. Maybe there's no need to normalize the score between 0 and 1?
Thank you for your timely reply, which has solved many of my problems.


IItaly commented Jun 12, 2021

Hey, I want to know: what is accbal?

CrohnEngineer (Collaborator) commented

Hey @IItaly ,

sorry for the late reply.
I'll try to address each question separately.

> The result of not using expit() is the same as using it. Maybe there's no need to normalize the score between 0 and 1?

From a high-level perspective, the results should not be too dissimilar, as the sigmoid simply maps the scores onto a 0-1 scale; but if the network behaves well, we should still see a clear distinction between FAKE and REAL raw scores (i.e., not normalized with the sigmoid).
Also, it's OK if with 20000 iterations your results differ from ours: training a network is not a deterministic process, so repeating the training with the same configuration might still give slightly different results.
That said, can you report the values you found with and without sigmoid normalization at 20000 iterations?
I am curious to see how different they are from ours.

> Hey, I want to know: what is accbal?

It's the balanced accuracy from scikit-learn; you can find the explanation here: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html
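To make the metric concrete: balanced accuracy is the mean of the per-class recalls, so a classifier that ignores the minority class is penalized even when plain accuracy looks good. A toy sketch with invented, deliberately imbalanced labels:

```python
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Imbalanced toy labels: four REAL (0) samples and one FAKE (1).
y_true = [0, 0, 0, 0, 1]
# A degenerate classifier that always predicts REAL.
y_pred = [0, 0, 0, 0, 0]

acc = accuracy_score(y_true, y_pred)               # 4/5 = 0.8, looks decent
bal_acc = balanced_accuracy_score(y_true, y_pred)  # (4/4 + 0/1) / 2 = 0.5
print(acc, bal_acc)
```

The balanced score of 0.5 (chance level) exposes that the classifier never detects a fake, which the plain accuracy of 0.8 hides.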
Bests,

Edoardo


IItaly commented Jun 23, 2021

Thank you very much. I'll sort out my results for you to compare~

zhuzhen1996 commented

Are you also doing research on deepfakes, @IItaly?
