This study focuses on the replicability of identifying relevant predictors for lie detection in various psychometric tests concerning medicine, behavioral science and data science. Each test was completed twice, once honestly and once dishonestly. More precisely, the goal is to develop a feature selection framework that yields good and similar results across the different models used to discriminate honest from dishonest test responses. Accuracy, Top-5 stability and Accuracy Standard Deviation are the metrics used to evaluate the results.
The approaches developed in this project to select the features are the following:
- PCA: 20% of the total number of features are selected using principal component analysis.
- Permutation importance: computed on a fitted random forest, with features selected based on a t-test.
- Mutual Information: the features selected by the Joint Mutual Information Maximization (JMIM) algorithm with an importance score of at least 0.8 out of 1 are used.
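As an illustration of the permutation importance approach, here is a minimal sketch of one possible reading of "features selected based on a t-test". It is not the project's actual procedure. It assumes scikit-learn and SciPy, synthetic stand-in data, a one-sided one-sample t-test on the repeated importance scores, a held-out split for computing the importances, and a significance level of 0.05. None of these choices is stated in the source.

```python
import numpy as np
from scipy.stats import ttest_1samp
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; with shuffle=False the informative columns come first.
X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# result.importances has shape (n_features, n_repeats):
# one accuracy drop per feature and per random shuffle of that feature.
result = permutation_importance(forest, X_val, y_val,
                                n_repeats=30, random_state=0)

# One-sided t-test per feature: is the mean accuracy drop greater than zero?
_, p_values = ttest_1samp(result.importances, 0.0, axis=1,
                          alternative="greater")

# Keep features whose importance is significantly positive.
# alpha is an assumed threshold; a NaN p-value never passes the comparison.
alpha = 0.05
selected = [i for i, p in enumerate(p_values) if p < alpha]
```

The list `selected` holds column indices that could then be passed to the downstream models.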
Before applying the methods, each dataset is split into training and test sets (70%-30%). For every feature, the mean and the standard deviation are computed and used to scale that feature: the mean is subtracted and the result is divided by the standard deviation.
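The split and scaling step can be sketched with the Python standard library alone. The helper names and toy data below are hypothetical. The sketch also assumes that the mean and standard deviation are estimated on the training portion and reused for the test portion, which the source does not specify.

```python
import random
import statistics

def split_70_30(rows, seed=0):
    """Shuffle the rows and return (train, test) with a 70%/30% split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def fit_scaler(train_rows):
    """Compute the per-feature mean and standard deviation on the training rows."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation; fall back to 1.0 for constant features.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def scale(rows, means, stds):
    """Apply z = (x - mean) / std feature by feature."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

# Toy data: 10 respondents, 3 items (made-up values, not from the study).
data = [[i, (i * i) % 7, 10 - i] for i in range(10)]

train, test = split_70_30(data)
means, stds = fit_scaler(train)
train_scaled = scale(train, means, stds)
test_scaled = scale(test, means, stds)
```

After scaling, every training feature has mean 0 and standard deviation 1. The test features are only approximately standardized, because they reuse the training statistics.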
As mentioned above, each approach selects a number of features from the corresponding original dataset. These selected features are then used to train different models and observe their performance. The models trained in this project are:
- Logistic regression model on all the features (Full LR)
- Logistic regression model on selected features (LR)
- Support vector machine (SVM)
- Random forest (RF)
- Multi-layer perceptron classifier (MLP)
The accuracy of each of these models is also computed, first to see how well the model performs with the selected features, and second to compare the models with one another and determine whether the selected features yield similar performance across all of them. A logistic regression on all the features is trained first. It provides a baseline against which the results obtained with the selected features can be compared.
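The benchmarking loop described above might look roughly like the following sketch. It assumes scikit-learn with mostly default hyperparameters, synthetic stand-in data, and a hypothetical `selected` list standing in for the output of a feature selection step. The project's actual model settings are not given in the source. The spread across the four models is computed here as a population standard deviation, which is also an assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for one questionnaire dataset (not the study's data).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Hypothetical output of a feature selection step: column indices to keep.
selected = [0, 1, 2, 3, 4]

def fit_and_score(model, columns):
    """Fit on the training split and return accuracy on the test split."""
    model.fit(X_train[:, columns], y_train)
    return model.score(X_test[:, columns], y_test)

# Baseline: logistic regression on all the features (Full LR).
all_columns = list(range(X.shape[1]))
accuracies = {
    "Full LR": fit_and_score(LogisticRegression(max_iter=1000), all_columns)
}

# The four benchmark models, trained only on the selected features.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "RF": RandomForestClassifier(random_state=0),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
}
for name, model in models.items():
    accuracies[name] = fit_and_score(model, selected)

# Spread of the accuracy across the four models (population std, an assumption).
accuracy_std = float(np.std([accuracies[name] for name in models]))
```

A small `accuracy_std` would indicate that the selected features give similar performance regardless of the model, while the gap between `accuracies["Full LR"]` and the other entries shows how much accuracy is lost or gained by restricting the feature set.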
The metrics used to evaluate the results are:
- Accuracy: ratio of correct predictions to the total number of instances. It was chosen because all the datasets have a fairly balanced number of examples per class (all are binary classification tasks). Accuracy is computed for the full model (Full LR) as well as for the four other models used for benchmarking, which are trained only on the subset of features selected by each of the procedures in scope.
- Accuracy Standard Deviation: standard deviation of the accuracies of the four models (i.e. LR, SVM, RF, MLP) fitted on the subset of selected features. It measures the consistency of the classification performance across different models, so the lower the better.
- Top-5 stability: a more specific metric for assessing consistency across models (i.e. LR, SVM, RF, MLP). It takes into account the five most important features used by each of the models. The formula developed is:
where
Name | Topic | Faking good/faking bad | Number of samples | Number of features |
---|---|---|---|---|
DT_df_CC | Short Dark Triad 3 for child custody | Faking good | 482 | 27 |
DT_df_JI | Short Dark Triad 3 for a job interview | Faking good | 864 | 27 |
PRMQ_df | Identify memory difficulties | Faking bad | 1404 | 16 |
PCL5_df | Identify victims of PTSD | Faking bad | 402 | 20 |
NAQ_R_df | Identify possible victims of mobbing | Faking bad | 712 | 22 |
PHQ9_GAD7_df | Identify possible victims of anxious-depressive syndrome | Faking bad | 1118 | 16 |
PID5_df | Identify mental disorders | Faking bad | 824 | 220 |
sPID5_df | Identify mental disorders | Faking bad | 1038 | 25 |
PRFQ_df | Specific caregivers' ability to mentalize with their children | Faking good | 678 | 18 |
IESR_df | Identify possible victims of PTSD | Faking bad | 358 | 22 |
R_NEO_PI_df | Personality questionnaire (Big5) | Faking good | 77687 | 30 |
RAW_DDDT_df | Identify Dark Triad personality | Faking bad | 986 | 12 |
IADQ_df | Identify adjustment disorder (stress response syndrome) | Faking bad | 450 | 9 |
BF_df_CTU | Job interview for a salesperson position | Faking good | 442 | 10 |
BF_df_OU | Job interview in a humanitarian organization | Faking good | 460 | 10 |
BF_df_V | Obtain child custody | Faking good | 486 | 10 |