Data Ingestion Component



![Image](https://user-images.githubusercontent.com/92209437/212228795-86d9e5f1-eb5f-40dd-b6ce-757804eb0395.png)

Data ingestion is the first step of the machine learning pipeline. It is responsible for acquiring and importing data from various sources into the pipeline. In this project, MongoDB is the data source. The `schema.yaml` file contains a list of column names that should be dropped from the data; we chose these columns during the EDA we did before building the pipeline. Once the data has been ingested, the component splits it into training and testing sets. Both sets are stored as artifacts so that they are available to the other components of the pipeline.
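The flow described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `drop_columns` key in `schema.yaml`, the artifact file names `train.csv` and `test.csv`, and the 80/20 split are all assumptions. The split uses a plain seeded shuffle from the standard library so the sketch stays self-contained; `pymongo` and PyYAML are imported only inside the helpers that need them.

```python
import csv
import random
from pathlib import Path


def load_records_from_mongodb(uri, database, collection):
    """Fetch every document of a MongoDB collection as a list of dicts."""
    from pymongo import MongoClient  # lazy import; requires pymongo

    client = MongoClient(uri)
    try:
        # The projection {"_id": 0} leaves out MongoDB's internal id field.
        return list(client[database][collection].find({}, {"_id": 0}))
    finally:
        client.close()


def load_drop_columns(schema_path):
    """Read the columns to drop from the schema file.

    Assumes a top-level `drop_columns` list (hypothetical key name).
    """
    import yaml  # lazy import; requires PyYAML

    with open(schema_path) as f:
        schema = yaml.safe_load(f) or {}
    return schema.get("drop_columns", [])


def drop_columns(records, columns):
    """Return copies of the records without the unwanted columns."""
    unwanted = set(columns)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in records]


def split_records(records, test_size=0.2, seed=42):
    """Shuffle reproducibly and split into (train, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]


def write_csv(records, path):
    """Write records to a CSV file, creating parent directories if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Ordered union of keys, since MongoDB documents may differ in fields.
    fieldnames = list(dict.fromkeys(k for row in records for k in row))
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)


def ingest(records, columns_to_drop, artifact_dir, test_size=0.2, seed=42):
    """Drop columns, split, and store both sets as artifacts."""
    cleaned = drop_columns(records, columns_to_drop)
    train, test = split_records(cleaned, test_size, seed)
    train_path = Path(artifact_dir) / "train.csv"  # hypothetical file name
    test_path = Path(artifact_dir) / "test.csv"    # hypothetical file name
    write_csv(train, train_path)
    write_csv(test, test_path)
    return train_path, test_path
```

In use, the records returned by `load_records_from_mongodb(...)` and the list returned by `load_drop_columns("schema.yaml")` would be passed to `ingest(...)`, and the two returned paths would be handed to the next component. Keeping the MongoDB read separate from `ingest` means the drop-and-split logic can be exercised on in-memory records without a database.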
