Sanchit Dilip Jain/Integrate AWS Secrets Manager with Amazon Managed Workflows for Apache Airflow 🔐

Created Tue, 18 Jun 2024 12:00:00 +0000 Modified Mon, 12 Aug 2024 09:03:58 +0000
715 Words 3 min

Integrate AWS Secrets Manager with Amazon Managed Workflows for Apache Airflow

Overview

  1. What is Amazon Managed Workflows for Apache Airflow?

    • Amazon Managed Workflows for Apache Airflow (MWAA) is a managed service that makes it easier to run and operate workflows built with Apache Airflow on AWS. Apache Airflow is an open-source tool used to programmatically author, schedule, and monitor workflows.
    • With MWAA, AWS takes care of operational aspects such as scaling, patching, and availability, allowing you to focus on building and managing your workflows.
  2. What is AWS Secrets Manager?

    • AWS Secrets Manager is a service that helps you protect the secrets needed to access your applications, services, and IT resources, without the upfront cost and complexity of operating your own secrets-management infrastructure.
    • Secrets Manager enables you to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. Users and applications retrieve secrets with a call to the Secrets Manager API, eliminating the need to hard-code sensitive information in plaintext.
  3. Importance of Securing Credentials:

    • Securing credentials is critical for the following reasons:
      • Prevent Unauthorized Access
      • Maintain Data Integrity
      • Compliance and Auditing
      • Reduce Risk of Breaches
    • By integrating AWS Secrets Manager with Amazon MWAA, you can securely manage and retrieve secrets, ensuring that your workflows operate securely and efficiently.

Demo

  • Managing secrets securely is a crucial aspect of deploying workflows on Amazon MWAA. This blog walks you through the steps required to securely retrieve secrets at the start of your workflows using AWS Secrets Manager and AWS IAM.

    • Prerequisites

      • AWS Account: Ensure you have an AWS account with the necessary permissions.

      • AWS CLI: Install and configure the AWS CLI on your local machine.

      • Apache Airflow Environment: An existing Amazon MWAA environment.

    • Step 1: Create a Secret in AWS Secrets Manager

      • Navigate to AWS Secrets Manager in the AWS Management Console.

      • Click on “Store a new secret”.

      • Select the type of secret you want to store (e.g., “Other type of secret”).

      • Enter the key-value pairs for your secret.

      • Name your secret and add an optional description.

      • Configure rotation settings if needed (optional).

      • Click “Next” and review your secret before clicking “Store”.

    • Step 2: To allow your Airflow environment to access the secret, create an IAM policy and attach it to the environment's execution role.

      • Create a new IAM policy with the following JSON, replacing YOUR_SECRET_ARN with the ARN of your secret:

        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": [
                        "secretsmanager:GetSecretValue"
                    ],
                    "Resource": "YOUR_SECRET_ARN"
                }
            ]
        }
        
      • Attach the policy to the IAM execution role of your Amazon MWAA environment:

      • Navigate to the Amazon MWAA environment in the AWS Management Console.

      • Under the environment's “Permissions” section, note the execution role, then open that role in the IAM console and attach the policy created above.

    • Step 3: Update Airflow DAG to Retrieve Secrets

      • Modify your Airflow DAG to retrieve the secret from AWS Secrets Manager at the start of the workflow.

      • Ensure boto3 is available. Amazon MWAA environments ship with boto3 preinstalled, so installing it is only needed if you want to test the DAG locally:

        pip install boto3
        
      • Update your DAG file to include the following script for fetching secrets:

        import base64

        import boto3
        import pendulum
        from airflow import DAG
        from airflow.operators.python import PythonOperator
        from botocore.exceptions import ClientError

        def get_secret():
            secret_name = "YOUR_SECRET_NAME"
            region_name = "YOUR_AWS_REGION"

            # Create a Secrets Manager client
            session = boto3.session.Session()
            client = session.client(
                service_name='secretsmanager',
                region_name=region_name
            )

            try:
                get_secret_value_response = client.get_secret_value(
                    SecretId=secret_name
                )
            except ClientError as e:
                # Re-raise so the task is marked as failed in the Airflow UI
                raise e

            # Secrets Manager decrypts the secret using the associated KMS key;
            # string secrets arrive in SecretString, binary secrets in SecretBinary.
            if 'SecretString' in get_secret_value_response:
                secret = get_secret_value_response['SecretString']
            else:
                secret = base64.b64decode(get_secret_value_response['SecretBinary'])

            # Caution: a PythonOperator's return value is stored in XCom, which is
            # visible in the Airflow UI. Avoid returning raw secrets in production.
            return secret

        default_args = {
            'owner': 'airflow',
            'start_date': pendulum.datetime(2024, 1, 1, tz="UTC"),
        }

        dag = DAG(
            'retrieve_secret_dag',
            default_args=default_args,
            description='A simple DAG to retrieve secrets from AWS Secrets Manager',
            schedule_interval=None,
        )

        retrieve_secret = PythonOperator(
            task_id='retrieve_secret',
            python_callable=get_secret,
            dag=dag,
        )
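      • The DAG above returns the raw SecretString. Secrets are typically stored as JSON key-value pairs (as in Step 1), which downstream tasks can parse into a dict. A minimal sketch, using a hypothetical payload:

```python
import json


def parse_secret_string(secret_string):
    """Turn the JSON SecretString returned by get_secret() into a dict."""
    return json.loads(secret_string)


# Example with the kind of payload stored in Step 1 (placeholder values):
credentials = parse_secret_string('{"username": "demo_user", "password": "demo_password"}')
# Individual fields can then be passed to downstream operators, e.g.
# credentials["username"] and credentials["password"].
```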
        
    • Step 4: Deploy and Run Your DAG

      • Upload the DAG file to your S3 bucket linked to the Amazon MWAA environment.
      • Navigate to the Airflow UI and trigger the DAG.
      • Verify the logs to ensure the secret was retrieved successfully.

Resources

  • By following these steps, you can securely manage and access secrets in your Amazon MWAA environment using AWS Secrets Manager and IAM. This setup ensures that your sensitive data remains secure and is accessible only to authorized workflows.
  • For more detailed information, you can refer to the AWS Secrets Manager documentation and the Amazon Managed Workflows for Apache Airflow documentation.