Monitoring and Security of Amazon Redshift DataShares
Introduction
What is Amazon Redshift Data Sharing?
- Amazon Redshift Data Sharing is a feature that enables Redshift cluster owners to share live data across different Amazon Redshift clusters, accounts, and regions.
- With data sharing, users can easily and securely share data with other Redshift users, without having to copy or move the data.
What are Amazon Redshift Data Sharing Benefits?
- Data sharing is a great way to improve your security and performance, as well as save money.
- Enhanced Security - By using Amazon Redshift data sharing, you can ensure that only authorized users have access to sensitive data by limiting the number of tables they can access. This helps prevent accidental or malicious disclosure of confidential information during queries and analysis.
- Improved Performance - Data sharing improves query performance by allowing multiple users with different privileges to access the same table simultaneously without affecting one another’s results or causing errors in the query plan generated by SQL queries run against it (for example, if two people were trying to run a join on two distinct tables).
- Cost savings - Sharing data across accounts and regions can help to reduce data storage costs and eliminate the need to move data.
- Data sharing is a great way to improve your security and performance, as well as save money.
Query to track all changes to the datashares: We can use the svl_datashare_change_log view for tracking changes to datashares on both producer and consumer redshift clusters.
Query to track datashare access activity: We can use svl_datashare_usage_producer view and svl_datashare_usage_consumer view on producer redshift cluster and consumer redshift clusters to track datashare access activity accordingly.
Demo
Objective:
- This lab will see useful queries and system tables for auditing and tracking shared data.
Prerequisite
- Complete Workload Isolation demo using Amazon Redshift Data Sharing lab
- Ensure both the clusters etl-redshift-cluster and bi-redshift-cluster are in a start state.
Tracking changes & Datashare access activity on Producer Redshift Cluster
Go to Amazon Redshift Console by clicking here:link
Select etl-redshift-cluster under Provisioned Clusters from the Redshift console
- Select Query Editor V2 to query etl-redshift-cluster. Configure Query Editor v2, leave settings as default, and press Configure Account
- Let’s connect to our database by selecting etl-redshift-cluster. Select temporary credentials on the list of options and connect using these credentials
- Database name: dev
- Database user: awsuser
Once connected, execute the following query on etl-redshift-cluster
select username, share_name, recordtime, action, share_object_type, share_object_name from svl_datashare_change_log order by recordtime desc;
`select * from svl_datashare_usage_producer;`
Tracking changes & Datashare access activity on Consumer Redshift Cluster
Go to Amazon Redshift Console by clicking here:link
Select bi-redshift-cluster under Provisioned Clusters from the Redshift console
- Let’s connect to our database by selecting bi-redshift-cluster. Select temporary credentials on the list of options and connect using these credentials
- Database name: dev
- Database user: awsuser
Once connected, execute the following query on bi-redshift-cluster
select username, share_name, recordtime, action, share_object_type, share_object_name from svl_datashare_change_log order by recordtime desc;
`select * from svl_datashare_usage_consumer;`
Resources
- Visit this page to find the latest documentation.