This repository was archived by the owner on Jun 22, 2022. It is now read-only.
  
  
 
Exploring various dimension reduction techniques
Kamil A. Kaczmarek edited this page Jul 10, 2018 · 2 revisions
      
- factor analysis

  factor_analysis__n_components: 50

- sparse random projection

  sparse_random_projection__n_components: 50

- more row-wise aggregations
 
import pandas as pd


def aggregate_row(row):
    # Keep only the non-zero entries of the row.
    # Series.nonzero() was removed in pandas 1.0, so a boolean mask is used instead.
    non_zero_values = row[row != 0]
    aggs = {'non_zero_mean': non_zero_values.mean(),
            'non_zero_max': non_zero_values.max(),
            'non_zero_min': non_zero_values.min(),
            'non_zero_std': non_zero_values.std(),
            'non_zero_sum': non_zero_values.sum(),
            'non_zero_count': non_zero_values.count(),
            # Share of the row's non-missing entries that are non-zero.
            'non_zero_fraction': non_zero_values.count() / row.count()
            }
    return pd.Series(aggs)

- not using raw features
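
The row-wise aggregations listed above can also be computed for all rows at once instead of one row at a time. The sketch below is a vectorized variant and not the code used in the original pipeline: the function name `aggregate_rows` and the toy frame `X_demo` are made up for illustration. It relies on the fact that pandas reductions skip NaN by default, so masking zeros as NaN restricts every statistic to the non-zero entries.

```python
import pandas as pd


def aggregate_rows(X):
    """Vectorized non-zero row statistics (hypothetical helper, not from the repo)."""
    # Replace zeros with NaN; pandas reductions ignore NaN by default.
    masked = X.where(X != 0)
    non_zero_count = masked.count(axis=1)
    return pd.DataFrame({
        'non_zero_mean': masked.mean(axis=1),
        'non_zero_max': masked.max(axis=1),
        'non_zero_min': masked.min(axis=1),
        'non_zero_std': masked.std(axis=1),
        'non_zero_sum': masked.sum(axis=1),
        'non_zero_count': non_zero_count,
        # Share of each row's non-missing entries that are non-zero.
        'non_zero_fraction': non_zero_count / X.count(axis=1),
    })


# Toy data, invented purely to show the call.
X_demo = pd.DataFrame([[0.0, 2.0, 4.0, 0.0],
                       [1.0, 0.0, 0.0, 0.0],
                       [3.0, 3.0, 3.0, 3.0]],
                      columns=['f0', 'f1', 'f2', 'f3'])
row_aggs = aggregate_rows(X_demo)
print(row_aggs)
```

A row with a single non-zero value gets NaN for `non_zero_std`, since the sample standard deviation is undefined for one observation. LightGBM accepts NaN inputs, but other models may need these filled.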
 
LightGBM on the new aggregations + projections (second best): 1.336 CV, 1.39 LB
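
As a rough illustration of the projection step, the sketch below builds a 50-component factor analysis and a 50-component sparse random projection with scikit-learn and stacks their outputs into one feature matrix, in place of the raw features. Only the two `n_components: 50` settings come from this page. The helper name `project_features`, the random seed, and the synthetic data are assumptions, and the LightGBM training step is left out.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.random_projection import SparseRandomProjection


def project_features(X, n_components=50, seed=0):
    """Stack factor-analysis and sparse-random-projection features.

    Hypothetical helper: only n_components=50 is taken from the page.
    """
    fa = FactorAnalysis(n_components=n_components, random_state=seed)
    srp = SparseRandomProjection(n_components=n_components, random_state=seed)
    X_fa = fa.fit_transform(X)    # shape: (n_samples, n_components)
    X_srp = srp.fit_transform(X)  # shape: (n_samples, n_components)
    # The raw columns are not included in the output.
    return np.hstack([X_fa, X_srp])


# Synthetic, mostly-zero data invented for the example.
rng = np.random.RandomState(0)
X_demo = rng.rand(200, 300) * (rng.rand(200, 300) < 0.1)

features = project_features(X_demo)
print(features.shape)
```

In a cross-validated setup the two transformers would normally be fitted on the training fold only and then applied to the validation fold with `transform`, so that no information leaks from held-out rows.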

Check our GitHub organization https://github.com/neptune-ml for more cool stuff 😃

Kamil & Kuba, core contributors
- honey bee 🐝 LightGBM and 5fold CV
- beetle 🪲 LightGBM on binarized dataset
- dromedary camel 🐪 LightGBM with row aggregations
- whale 🐳 LightGBM on dimension reduced dataset
- water buffalo 🐃 Exploring various dimension reduction techniques
- blowfish 🐡 bucketing row aggregations