Home
Bwuljqh edited this page Feb 24, 2022 · 30 revisions

A listing of projects that get data streams out of MySQL. Most of them replicate from MySQL to Kafka.

| Project name | Site | Description | 
|---|---|---|
| debezium | https://debezium.io | Debezium is an open-source distributed platform for change data capture. Replicates from MySQL to Kafka as a Kafka Connect connector. Uses mysql-binlog-connector-java. A funded project supported by Red Hat, with employees working on it full-time. | 
| Airbyte | https://docs.airbyte.io/integrations/sources/mysql#change-data-capture-cdc | Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes, and databases. | 
| aesop | https://github.com/Flipkart/aesop | Built on top of Databus. In production use at http://www.flipkart.com/. Allows you to plug in your own code to transform/process the MySQL events. | 
| databus | https://github.com/linkedin/databus | Precursor to Kafka. Reads from MySQL and Oracle, and replicates to its own log structure. In production use at LinkedIn. No Kafka integration. Uses Open Replicator. | 
| FlexCDC | http://github.com/greenlion/swanhart-tools/ | FlexCDC is a daemon that reads a MySQL replication stream and sends records to log tables or plugins. Supports transactions and ALTER TABLE to keep the log tables in DDL sync with the MySQL server. | 
| Lapidus | https://github.com/JarvusInnovations/lapidus | Streams data from MySQL, PostgreSQL, and MongoDB as newline delimited JSON. Can be run as a daemon or included as a Node.js module. | 
| Maxwell | https://github.com/zendesk/maxwell | Reads the MySQL event stream and outputs events as JSON. Parses ALTER/CREATE TABLE/etc. statements to keep its schema in sync. Written in Java. Well maintained. | 
| mypipe | https://github.com/mardambey/mypipe | Reads MySQL event stream, and emits events corresponding to INSERTs, DELETEs, UPDATEs. Written in Scala. Emits Avro to Kafka. | 
| mysql_cdc | https://github.com/rusuly/mysql_cdc | MySQL binlog Change Data Capture (CDC) connector for Rust. | 
| MySqlCdc | https://github.com/rusuly/MySqlCdc | MySQL binlog Change Data Capture (CDC) connector for .NET. The lib is based on mysql-binlog-connector-java. | 
| mysql-binlog-connector-java | https://github.com/shyiko/mysql-binlog-connector-java | Library that parses MySQL binary logs and calls your code to process them. Fork/rewrite of Open Replicator. Has tests. | 
| mysql_streamer | https://github.com/Yelp/mysql_streamer | MySQLStreamer is a database change data capture and publish system. It’s responsible for capturing each individual database change, enveloping them into messages, and publishing to Kafka. | 
| oltp-cdc-olap | https://github.com/xmlking/nifi-examples/tree/master/oltp-cdc-olap | Uses Maxwell to replicate to Apache Nifi. | 
| Open Replicator | https://code.google.com/p/open-replicator/ | Library that parses MySQL binary logs and calls your code to process them. Does not seem to be maintained. | 
| Canal | https://github.com/alibaba/canal | Alibaba's open-source solution for parsing and consuming MySQL binlog events. | 
| python-mysql-replication | https://github.com/noplay/python-mysql-replication | Pure Python library that parses MySQL binary logs and lets you process the replication events. Basically, the Python equivalent of mysql-binlog-connector-java. | 
| recordbus | https://github.com/pyr/recordbus | Directly maps MySQL events to JSON, with no interpretation. Written in Java. Replicates to Kafka. | 
| wombat | https://github.com/TiVo/wombat | Uses mysql-binlog-connector-java, outputs JSON to Kafka. | 
| php-mysql-replication | https://github.com/krowinski/php-mysql-replication | Pure PHP implementation of the MySQL replication protocol. This allows you to receive events such as insert, update, and delete, along with their data and raw SQL queries. | 
| StreamSets Data Collector | https://streamsets.com/products/sdc/ | Pipelines that can be configured to continuously ingest data from any number of tables in a relational database (using JDBC). |
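
Several of the projects above (Maxwell, Lapidus, recordbus, wombat) emit each row change as one JSON object per line. As a rough illustration of what a consumer of such a stream might do, here is a minimal Python sketch that replays Maxwell-style events into an in-memory copy of the tables. The field names `database`, `table`, `type`, `data`, and `old` follow Maxwell's documented output format; the sample events, the `shop.users` table, and the assumption that every table has a primary key column named `id` are hypothetical and only serve the example.

```python
import json

# Hypothetical sample of newline-delimited change events in Maxwell's style.
SAMPLE_STREAM = """\
{"database": "shop", "table": "users", "type": "insert", "ts": 1449786310, "data": {"id": 1, "name": "ann"}}
{"database": "shop", "table": "users", "type": "update", "ts": 1449786341, "data": {"id": 1, "name": "anne"}, "old": {"name": "ann"}}
{"database": "shop", "table": "users", "type": "insert", "ts": 1449786350, "data": {"id": 2, "name": "bob"}}
{"database": "shop", "table": "users", "type": "delete", "ts": 1449786362, "data": {"id": 2, "name": "bob"}}
"""


def apply_event(state, line):
    """Apply one JSON change event to the in-memory replica in `state`."""
    event = json.loads(line)
    kind = event.get("type")
    if kind not in ("insert", "update", "delete"):
        return state  # ignore anything that is not a row change (e.g. DDL)

    rows = state.setdefault((event["database"], event["table"]), {})
    # Assumption: the primary key is a single column called "id".
    pk = event["data"]["id"]
    if kind == "delete":
        rows.pop(pk, None)
    else:
        # For inserts and updates, "data" holds the full row after the change.
        rows[pk] = event["data"]
    return state


def replay(stream):
    """Replay a newline-delimited JSON stream and return the resulting state."""
    state = {}
    for line in stream.splitlines():
        if line.strip():
            apply_event(state, line)
    return state


if __name__ == "__main__":
    print(replay(SAMPLE_STREAM))
```

In a real deployment the lines would come from a Kafka consumer or from the tool's stdout rather than from a string, and the primary key would have to be taken from the table's schema rather than assumed.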