Open
Description
Data ingestion is the first step of the machine learning pipeline. It acquires data from one or more sources and imports it into the pipeline. In this project, the data source is MongoDB. The schema.yaml file lists the column names that should be dropped from the data; we chose these columns during the EDA carried out before building the pipeline. Once the data has been ingested, the component splits it into training and testing sets. These sets are stored as artifacts so that they are available to the other components of the pipeline.
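The steps described above (drop columns, split, store artifacts) can be sketched roughly as follows. This is a minimal illustration and not the project's actual component. It assumes the documents have already been read from MongoDB into a list of dicts that all share the same fields, and that the parsed schema file exposes the columns to drop under a key named `drop_columns`. The function names, the `drop_columns` key, the 80/20 split, and the `train.csv` / `test.csv` file names are all hypothetical.

```python
import csv
import random
from pathlib import Path


def drop_columns(records, columns_to_drop):
    """Return copies of the records without the listed columns.

    MongoDB's "_id" field is dropped as well (an assumption of this sketch).
    """
    unwanted = set(columns_to_drop) | {"_id"}
    return [{k: v for k, v in rec.items() if k not in unwanted} for rec in records]


def split_train_test(records, test_ratio=0.2, seed=42):
    """Shuffle with a fixed seed and split into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(records, path):
    """Write the records to a CSV file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    fieldnames = list(records[0].keys()) if records else []
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)


def ingest(records, schema, artifact_dir, test_ratio=0.2, seed=42):
    """Drop unwanted columns, split the data, and store both sets as artifacts.

    `schema` stands in for the parsed schema file; the "drop_columns" key
    is a hypothetical name, not taken from the project.
    """
    cleaned = drop_columns(records, schema.get("drop_columns", []))
    train, test = split_train_test(cleaned, test_ratio, seed)
    train_path = Path(artifact_dir) / "train.csv"
    test_path = Path(artifact_dir) / "test.csv"
    write_csv(train, train_path)
    write_csv(test, test_path)
    return train_path, test_path
```

Returning the two artifact paths lets a later component (for example, validation or transformation) locate the files without knowing how ingestion produced them.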
Status
Done
