To obtain accurate data insights, analysts often spend as much as 80% of their effort on data preprocessing.
DataSpring is an ETL tool built on a streaming architecture. It uses log-based Change Data Capture (CDC), supports rich, automatic, and accurate semantic mapping between heterogeneous data sources, and handles both real-time and batch data processing. It supports incremental synchronization and transformation for mainstream databases such as Oracle, MySQL, SQL Server, and PostgreSQL, as well as for API data. It is simple to operate and supports private deployment.
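To make the idea of log-based incremental synchronization concrete, here is a minimal sketch of replaying change events in log order against a target. The event shape (`op`, `key`, `row`) and the `apply_change` helper are assumptions made for illustration; they are not DataSpring's actual event format or API.

```python
from typing import Any, Dict

# Hypothetical change-event shape:
#   {"op": "insert" | "update" | "delete", "key": <primary key>, "row": {...}}
def apply_change(target: Dict[Any, dict], event: dict) -> None:
    """Apply one change event to an in-memory replica keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        # Merge so that a partial update keeps the columns it does not mention.
        target[key] = {**target.get(key, {}), **event["row"]}
    elif op == "delete":
        target.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")

# A tiny change log, in the order the source database committed the changes.
change_log = [
    {"op": "insert", "key": 1, "row": {"name": "alice", "city": "Paris"}},
    {"op": "insert", "key": 2, "row": {"name": "bob", "city": "Rome"}},
    {"op": "update", "key": 1, "row": {"city": "Berlin"}},
    {"op": "delete", "key": 2},
]

replica: Dict[int, dict] = {}
for change in change_log:
    apply_change(replica, change)
```

Because only the changes are shipped, the target converges to the source state without rescanning whole tables; the sketch assumes events arrive in commit order.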
In traditional architectures, applications read from and write to a remote transactional database. In event-driven applications, data and computation are co-located, so an application only needs local access to obtain its data, which yields higher throughput and lower latency.
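The difference can be sketched with a handler that keeps its working state in local memory instead of querying a remote database for every event. The class and field names below are hypothetical, and the sketch leaves out durability, which a production engine would also have to provide.

```python
from collections import defaultdict

class OrderTotals:
    """Event handler whose state lives next to the computation."""

    def __init__(self):
        # Local state: customer id -> running total of order amounts.
        self.totals = defaultdict(float)

    def on_event(self, event):
        # Read and update happen in local memory; there is no remote round trip.
        self.totals[event["customer"]] += event["amount"]
        return event["customer"], self.totals[event["customer"]]

handler = OrderTotals()
outputs = [
    handler.on_event(event)
    for event in [
        {"customer": "c1", "amount": 10.0},
        {"customer": "c2", "amount": 5.0},
        {"customer": "c1", "amount": 2.5},
    ]
]
```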
Data access from common relational databases as well as from APIs
Scheduled tasks for batch processing
Real-time streaming data access based on CDC
Excel-like data transformations through preset formulas
For complex data processing logic, custom UDF operators written in Python are supported
Configurable scheduled task flows: run at a fixed interval, at a specified time, or on a recurring cycle
The ETL management interface provides common modules such as operation log query and user management
As a member of the DFC product series, it supports single sign-on through the DFC member center, and deploying it together with DFC provides a seamless product experience
Real-time ingestion of live-stream, sensor, and Black Friday event data to power a real-time monitoring dashboard
Extract, clean, and transform business system data, then load it into the data warehouse
Extract the CPU, MEM, and LOAD information from messages reported by servers for analysis, then trigger alerts based on custom rules
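As an illustration of the server-monitoring use case, the following is a minimal sketch of the kind of logic a custom Python operator might carry: pull the CPU, MEM, and LOAD fields out of a reported message, then evaluate custom alert rules. The message format, function names, rule names, and thresholds are all assumptions; DataSpring's actual UDF interface is not described here.

```python
import re
from typing import Callable, Dict, List, Tuple

# Assumed (hypothetical) message format: "host=web-01 CPU=93.5 MEM=71.2 LOAD=4.80"
METRIC_PATTERN = re.compile(r"\b(CPU|MEM|LOAD)=([0-9]+(?:\.[0-9]+)?)")

def parse_metrics(message: str) -> Dict[str, float]:
    """Extract the CPU, MEM, and LOAD fields from one reported message."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(message)}

# A custom rule is an alert name plus a predicate over the parsed metrics.
# The thresholds are illustrative only.
Rule = Tuple[str, Callable[[Dict[str, float]], bool]]
RULES: List[Rule] = [
    ("high_cpu", lambda m: m.get("CPU", 0.0) > 90.0),
    ("high_mem", lambda m: m.get("MEM", 0.0) > 85.0),
    ("high_load", lambda m: m.get("LOAD", 0.0) > 8.0),
]

def evaluate(message: str, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules triggered by the message."""
    metrics = parse_metrics(message)
    return [name for name, predicate in rules if predicate(metrics)]

alerts = evaluate("host=web-01 CPU=93.5 MEM=71.2 LOAD=4.80")
```

Keeping rules as plain data (a name plus a predicate) means new alert conditions can be added without touching the parsing step.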