[Question] Feature agglomeration n_clusters is bigger than number of features. #1753

I am trying to build a Domain-Adversarial Neural Network. To do this, I first want to fit an auto-sklearn MLP model, then extract the pipeline configuration with its hyperparameters and build a neural network in Keras from that configuration, adding the extra layer responsible for the domain shift. The problem is that the pipeline description appears to be incomplete: my training data has 32 features, yet the pipeline applies FeatureAgglomeration with 338 clusters. I suspect auto-sklearn adds dimensions in an earlier step, but this is not specified in the pipeline. How can I access that step?
The model predicts on my test data without errors, so the fitted pipeline itself works. The problem is that I cannot rebuild the pipeline manually, because I cannot find where this change in dimensionality happens.
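
One way to locate where the feature count changes is to pass a small batch through the fitted steps one at a time and record the number of columns after each. The following is a minimal sketch, assuming the fitted pipeline exposes a scikit-learn-style `steps` list of `(name, transformer)` pairs whose transformers implement `transform`. The helper names and the three toy classes are hypothetical stand-ins (not auto-sklearn components) so that the snippet runs on its own.

```python
def n_columns(X):
    """Number of columns of a 2-D array-like (NumPy array or list of rows)."""
    shape = getattr(X, "shape", None)
    return shape[1] if shape is not None else len(X[0])


def trace_feature_counts(steps, X):
    """Apply each step's transform in order and record the column count after it."""
    counts = [("input", n_columns(X))]
    for name, step in steps:
        if not hasattr(step, "transform"):
            break  # reached the final estimator, which only predicts
        X = step.transform(X)
        counts.append((name, n_columns(X)))
    return counts


# Hypothetical stand-ins for fitted steps; NOT auto-sklearn components.
class DuplicateColumns:
    def transform(self, X):
        return [row + row for row in X]  # doubles the number of columns


class KeepFirstColumn:
    def transform(self, X):
        return [row[:1] for row in X]  # reduces to a single column


class DummyClassifier:
    def predict(self, X):
        return [0 for _ in X]


steps = [
    ("expand", DuplicateColumns()),
    ("reduce", KeepFirstColumn()),
    ("classifier", DummyClassifier()),
]
counts = trace_feature_counts(steps, [[1.0, 2.0], [3.0, 4.0]])
for name, n in counts:
    print(name, n)
```

With a real fitted pipeline, `steps` would come from the pipeline object and `X` from a few rows of the training data. Whether every auto-sklearn step supports a standalone `transform` call is an assumption to verify against the installed version.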

This is the pipeline:
```
SimpleClassificationPipeline({'balancing:strategy': 'none', 'classifier:__choice__': 'mlp', 'data_preprocessor:__choice__': 'feature_type', 'feature_preprocessor:__choice__': 'feature_agglomeration', 'classifier:mlp:activation': 'tanh', 'classifier:mlp:alpha': 0.007501808719126309, 'classifier:mlp:batch_size': 'auto', 'classifier:mlp:beta_1': 0.9, 'classifier:mlp:beta_2': 0.999, 'classifier:mlp:early_stopping': 'train', 'classifier:mlp:epsilon': 1e-08, 'classifier:mlp:hidden_layer_depth': 1, 'classifier:mlp:learning_rate_init': 0.0014320876811932824, 'classifier:mlp:n_iter_no_change': 32, 'classifier:mlp:num_nodes_per_layer': 235, 'classifier:mlp:shuffle': 'True', 'classifier:mlp:solver': 'adam', 'classifier:mlp:tol': 0.0001, 'data_preprocessor:feature_type:numerical_transformer:imputation:strategy': 'most_frequent', 'data_preprocessor:feature_type:numerical_transformer:rescaling:__choice__': 'standardize', 'feature_preprocessor:feature_agglomeration:affinity': 'manhattan', 'feature_preprocessor:feature_agglomeration:linkage': 'average', 'feature_preprocessor:feature_agglomeration:n_clusters': 338, 'feature_preprocessor:feature_agglomeration:pooling_func': 'median'},
dataset_properties={
  'task': 1,
  'sparse': False,
  'multilabel': False,
  'multiclass': False,
  'target_type': 'classification',
  'signed': False})
```
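
To rebuild the network by hand, it can help to regroup the flat, colon-separated configuration keys by pipeline stage first. The sketch below does this with plain Python. The function name `group_by_stage` is hypothetical, and the dictionary is a subset of the configuration printed above.

```python
def group_by_stage(config):
    """Turn {'a:b:c': v} style keys into a nested dict {'a': {'b': {'c': v}}}."""
    nested = {}
    for key, value in config.items():
        parts = key.split(":")
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


# Subset of the configuration shown in the pipeline above.
config = {
    "classifier:__choice__": "mlp",
    "classifier:mlp:activation": "tanh",
    "classifier:mlp:alpha": 0.007501808719126309,
    "classifier:mlp:hidden_layer_depth": 1,
    "classifier:mlp:learning_rate_init": 0.0014320876811932824,
    "classifier:mlp:num_nodes_per_layer": 235,
    "classifier:mlp:solver": "adam",
    "feature_preprocessor:__choice__": "feature_agglomeration",
    "feature_preprocessor:feature_agglomeration:n_clusters": 338,
    "feature_preprocessor:feature_agglomeration:pooling_func": "median",
}

nested = group_by_stage(config)
mlp = nested["classifier"]["mlp"]
print(mlp["hidden_layer_depth"], mlp["num_nodes_per_layer"], mlp["activation"])
print(nested["feature_preprocessor"]["feature_agglomeration"]["n_clusters"])
```

The grouped `mlp` entry contains the values needed for the hidden layers (here one layer of 235 units with `tanh`). How these map onto Keras layers and optimizer arguments is left to the manual model.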
