All of your queries are running, and your SQL-defined charts are returning results, but when maintenance time comes, you don’t have a clear picture of the cascade of views, tables, and CSV files that your analysis depends on. At least we’re good analysts, and we don’t make mistakes like circular references … right? How many of you ran straight for the Git repo?

You could sit down and start coding directed acyclic graphs (DAGs) in something like Airflow, but that’s a laborious task that involves reading through tons of SQL files and recording every dependency by hand. There must be a better way! I searched far and wide and found nothing, so I decided to dust off my Python skills from a previous life.

This blog post will walk you through what I wrote and leave you with a (mostly) working mapper, so you too can identify, clean up, and maintain your SQL ETL. If you want a copy of the script to run on your own set of SQL, email us at [email protected] and we’ll send it right o