This article provides a simple overview of how data flows through the Cascade Debt system for customers using a Fivetran data connection.
The flow begins with ingestion: Cascade Debt uses Fivetran, a third-party ETL tool, to bring customer data into its environment.
Fivetran connects directly to a customer's database environment and continuously syncs data by detecting and replicating additions, changes, and deletions. Please note that Fivetran operates on a permission-based model: it can only access the specific schemas and tables that customers explicitly grant it access to.
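To make the sync behavior concrete, here is a minimal conceptual sketch in Python of replicating additions, changes, and deletions while honoring a table allowlist. It is not Fivetran's implementation or API. The names ChangeRecord, apply_changes, and allowed_tables, as well as the "loans" and "payroll" tables, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional, Set

# Hypothetical shape of one detected change in the source database.
@dataclass(frozen=True)
class ChangeRecord:
    table: str
    op: str  # "insert", "update", or "delete"
    key: int  # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row contents; unused for deletes

Replica = Dict[str, Dict[int, Dict[str, Any]]]

def apply_changes(
    replica: Replica,
    changes: Iterable[ChangeRecord],
    allowed_tables: Set[str],
) -> Replica:
    """Apply changes in order, skipping tables the customer has not granted."""
    for change in changes:
        if change.table not in allowed_tables:
            # Permission-based model: tables outside the grant are never read.
            continue
        table = replica.setdefault(change.table, {})
        if change.op in ("insert", "update"):
            table[change.key] = dict(change.row or {})
        elif change.op == "delete":
            table.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return replica

# Illustrative usage with made-up data.
changes = [
    ChangeRecord("loans", "insert", 1, {"balance": 100}),
    ChangeRecord("loans", "update", 1, {"balance": 80}),
    ChangeRecord("loans", "insert", 2, {"balance": 50}),
    ChangeRecord("loans", "delete", 2),
    ChangeRecord("payroll", "insert", 9, {"salary": 1}),  # not granted
]
replica = apply_changes({}, changes, allowed_tables={"loans"})
```

In this sketch the "payroll" change is ignored because that table is not in the granted set, and the replica ends up holding only the latest version of loan 1.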
Once the data lands in the Cascade Debt environment, it is processed using Databricks, which serves as Cascade Debt's primary data platform.
Within Databricks, the data moves through several transformation layers that help structure and optimize it for Cascade Debt's internal systems. These transformations are handled using a combination of Python scripts and dbt (data build tool) models. This approach allows Cascade Debt to perform complex, large-scale transformations efficiently and with maintainable code.
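The idea of layered transformations can be sketched in plain Python. This is an illustration only, not Cascade Debt's actual pipeline: the three layer names (stage, enrich, summarize), the loan fields, and the "delinquent" status value are all assumptions made for the example. In practice, steps like these would typically live in dbt models or Python jobs running on Databricks.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

def stage_loans(raw_rows: List[Row]) -> List[Row]:
    """Layer 1 (hypothetical): cast types and normalize text fields."""
    return [
        {
            "loan_id": int(row["loan_id"]),
            "status": str(row["status"]).strip().lower(),
            "balance": float(row["balance"]),
        }
        for row in raw_rows
    ]

def enrich_loans(staged: List[Row]) -> List[Row]:
    """Layer 2 (hypothetical): add derived fields used downstream."""
    return [{**row, "is_delinquent": row["status"] == "delinquent"} for row in staged]

def summarize_portfolio(enriched: List[Row]) -> Row:
    """Layer 3 (hypothetical): aggregate into an application-ready summary."""
    total = sum(row["balance"] for row in enriched)
    delinquent = sum(row["balance"] for row in enriched if row["is_delinquent"])
    return {
        "total_balance": total,
        "delinquent_balance": delinquent,
        "delinquency_rate": delinquent / total if total else 0.0,
    }

# Illustrative usage with made-up rows as they might arrive from ingestion.
raw = [
    {"loan_id": "1", "status": " Current ", "balance": "300"},
    {"loan_id": "2", "status": "DELINQUENT", "balance": "100"},
]
summary = summarize_portfolio(enrich_loans(stage_loans(raw)))
```

Keeping each layer as a small function with a single responsibility mirrors the reason layered models are used: each step can be tested and changed without touching the others.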
After the data has been fully transformed and enriched, Cascade Debt runs a set of validations and quality checks to confirm that it is accurate and reliable. Once verified, the data is surfaced in the Cascade Debt application, which is built using React.js on the frontend and Nest.js on the backend.
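As a rough illustration of what a validation step can look like, the sketch below checks transformed rows before they are surfaced. The specific rules (a required and unique loan_id, a non-negative balance) and the function name validate_loans are assumptions for the example; the article does not specify which checks Cascade Debt actually runs.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

def validate_loans(rows: List[Row]) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    errors: List[str] = []
    seen_ids = set()
    for index, row in enumerate(rows):
        loan_id = row.get("loan_id")
        if loan_id is None:
            errors.append(f"row {index}: missing loan_id")
        elif loan_id in seen_ids:
            errors.append(f"row {index}: duplicate loan_id {loan_id}")
        else:
            seen_ids.add(loan_id)

        balance = row.get("balance")
        if balance is None or balance < 0:
            errors.append(f"row {index}: invalid balance {balance!r}")
    return errors

# Illustrative usage: only publish when no problems are found.
problems = validate_loans([
    {"loan_id": 1, "balance": 10.0},
    {"loan_id": 1, "balance": -5.0},
])
ready_to_publish = not problems
```

Returning a list of problems rather than raising on the first failure lets a single run report every issue in a batch, which is usually more useful when reviewing data quality.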
This entire flow ensures that customer data is securely ingested, reliably transformed, and presented accurately through Cascade Debt's user interface.