
LightGBM on dimension-reduced dataset

Kamil A. Kaczmarek edited this page Jul 10, 2018 · 4 revisions

whale 🐳

Feature Extraction

  • truncated svd projection
  truncated_svd__n_components: 50
  truncated_svd__n_iter: 10
  • pca projection
  pca__n_components: 100
  • fast ica projection
  fast_ica__n_components: 15
  • factor analysis
  factor_analysis__n_components: 50
  • gaussian random projection
  gaussian_random_projection__n_components: 50
  gaussian_projection__eps: 0.1

Note: the eps parameter turned out not to matter. Values of 0.01, 0.1 and 1.0 gave exactly the same results.
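
A possible explanation for the eps observation, assuming the projection is scikit-learn's GaussianRandomProjection (the page does not say which implementation was used): in that class, eps is only consulted when n_components is set to 'auto'. With a fixed n_components such as 50, the value of eps has no effect. The sketch below checks this on synthetic data; the array shape and random_state are illustrative assumptions.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

# Synthetic stand-in for the real dataset (shape chosen arbitrarily).
rng = np.random.RandomState(0)
X = rng.rand(200, 1000)

outputs = []
for eps in (0.01, 0.1, 1.0):
    # n_components is fixed, so eps is not used to size the embedding.
    proj = GaussianRandomProjection(n_components=50, eps=eps, random_state=0)
    outputs.append(proj.fit_transform(X))

# True when every eps value produced the identical projection.
same = all(np.array_equal(outputs[0], out) for out in outputs[1:])
print("identical projections:", same)
```

If the target dimension were left at 'auto', eps would instead control how many components the Johnson-Lindenstrauss bound requires.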

  • sparse random projection
  sparse_random_projection__n_components: 50
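
The parameter names above follow the step__parameter pattern, so the settings can be sketched with scikit-learn estimators as follows. This is a minimal illustration, assuming scikit-learn classes stand behind each step; the dictionary keys, random_state values, and synthetic data are assumptions, not the repository's actual code.

```python
import numpy as np
from sklearn.decomposition import PCA, FactorAnalysis, FastICA, TruncatedSVD
from sklearn.random_projection import (
    GaussianRandomProjection,
    SparseRandomProjection,
)

# One estimator per projection, using the component counts listed above.
projections = {
    "truncated_svd": TruncatedSVD(n_components=50, n_iter=10, random_state=0),
    "pca": PCA(n_components=100, random_state=0),
    "fast_ica": FastICA(n_components=15, random_state=0),
    "factor_analysis": FactorAnalysis(n_components=50, random_state=0),
    "gaussian_random_projection": GaussianRandomProjection(
        n_components=50, eps=0.1, random_state=0
    ),
    "sparse_random_projection": SparseRandomProjection(
        n_components=50, random_state=0
    ),
}

# Synthetic stand-in data; PCA with 100 components needs at least
# 100 samples and 100 features.
rng = np.random.RandomState(0)
X = rng.rand(200, 300)

# Fit each projection and record the shape of the reduced matrix.
projected = {name: est.fit_transform(X) for name, est in projections.items()}
shapes = {name: Z.shape for name, Z in projected.items()}
for name, shape in shapes.items():
    print(name, shape)
```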

Model and results

| model | CV | LB 🏆 |
|---|---|---|
| lightGBM truncated svd | 1.56 | |
| lightGBM pca | 1.55 | |
| lightGBM fast ica | 1.57 | |
| lightGBM factor analysis | 1.51 | |
| lightGBM gaussian random projection | 1.63 | |
| lightGBM sparse random projection | 1.47 | |
| lightGBM projections (all) | 1.47 | |
| lightGBM projections best (sparse random projection + factor analysis + truncated svd + fast-ica) | 1.448 | |
| lightGBM projections second best (sparse random projection) | 1.452 | |
| lightGBM raw + projections (second best) | 1.393 | |
| lightGBM projections (second best) + aggregations | 1.345 | |
| lightGBM raw + projections (second best) + aggregations | 1.3416 | 1.41 🚀 |
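
The best-scoring row combines raw features, the second-best projection (sparse random projection), and aggregations. A rough sketch of how such a feature matrix could be assembled is shown below. The helper names row_aggregations and build_features are hypothetical, and the page does not say which aggregations were used, so the row-wise statistics here are purely an illustrative assumption.

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection


def row_aggregations(X):
    """Per-row summary statistics (illustrative choice, not from the page)."""
    return np.column_stack([
        X.mean(axis=1),
        X.std(axis=1),
        X.min(axis=1),
        X.max(axis=1),
        (X != 0).sum(axis=1),  # count of non-zero entries per row
    ])


def build_features(X_raw, n_components=50, random_state=0):
    """Stack raw columns, a sparse random projection, and row aggregations."""
    proj = SparseRandomProjection(
        n_components=n_components, random_state=random_state
    )
    X_proj = proj.fit_transform(X_raw)
    X_agg = row_aggregations(X_raw)
    return np.hstack([X_raw, X_proj, X_agg])


# Synthetic stand-in for the raw dataset.
rng = np.random.RandomState(0)
X_raw = rng.rand(100, 200)

X_full = build_features(X_raw)
print(X_full.shape)  # 200 raw + 50 projected + 5 aggregated columns
```

In a train/test setting the projection would normally be fitted on the training rows and applied to the test rows with transform, so that both share the same projection matrix.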

Pipeline diagram

[Image: pipeline-solution-4]