A few programs to profile tables in a database, mostly to demonstrate my knowledge of, and a simple use case for, the mysql.connector Python library. I will be adding some more advanced code and queries to this repo over the coming days.
The programs were converted from SAS code run on Hadoop to Python querying a MySQL database (a Hadoop version may follow at a later date).
The programs are designed to give a quick overview of the data in a database. The summary information includes the most common values for every field in a table with their number of occurrences, referential integrity checks, and missing-value counts, all of which could feed into a Data Quality Analysis ahead of conducting a full analysis on new data.
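
As a rough illustration of the three kinds of summary described above, here is a minimal sketch. It is not the code in this repo: the function names (`top_values`, `missing_count`, `orphan_count`) and all table and column names are hypothetical. Each function only assumes a DB-API style connection object with a `cursor()` method, which is what `mysql.connector.connect(...)` returns.

```python
def _fetch(conn, query):
    """Run a query on a DB-API connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        return cur.fetchall()
    finally:
        cur.close()


# NOTE: table and column names cannot be bound as query parameters, so they
# are interpolated with backtick quoting. Only pass trusted identifiers.

def top_values(conn, table, column, limit=5):
    """Most common values in a column, with the number of occurrences."""
    query = (
        f"SELECT `{column}`, COUNT(*) AS n FROM `{table}` "
        f"GROUP BY `{column}` ORDER BY n DESC, `{column}` LIMIT {int(limit)}"
    )
    return [tuple(row) for row in _fetch(conn, query)]


def missing_count(conn, table, column):
    """Number of rows where the column is NULL."""
    query = f"SELECT COUNT(*) FROM `{table}` WHERE `{column}` IS NULL"
    return _fetch(conn, query)[0][0]


def orphan_count(conn, child, fk, parent, pk):
    """Referential integrity check: child rows whose foreign key value
    has no matching row in the parent table."""
    query = (
        f"SELECT COUNT(*) FROM `{child}` AS c "
        f"LEFT JOIN `{parent}` AS p ON c.`{fk}` = p.`{pk}` "
        f"WHERE c.`{fk}` IS NOT NULL AND p.`{pk}` IS NULL"
    )
    return _fetch(conn, query)[0][0]
```

With mysql.connector this could be used as `conn = mysql.connector.connect(host=..., user=..., password=..., database=...)` followed by, for example, `top_values(conn, "orders", "status")`, where `orders` and `status` are placeholder names. Note that NULL is counted as a value in `top_values`, so a column with many missing entries will show NULL among its most common values.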