Best practices for securing Identity and Access Management on Amazon Web Services
The premise is simple. You have users (person to machine) and roles (machine to machine) that need controlled access to certain services. Using IAM, you assign policies that determine whether each user and role can access certain services or not.
A strong IAM system maintains the principle of least privilege (PoLP), which grants users and roles only the permissions they explicitly need for specific resources. Maintaining secure IAM, however, is often at odds with building fast, experimenting with new technologies, and reducing friction across teams. Much like other DevOps challenges, IAM has only increased in complexity as we’ve moved away from monoliths to microservices and function computing.
As the number of IAM objects and their scope of reach in an environment grows, it can become difficult to answer simple questions such as:
- What policies are assigned to a user?
- Are policies assigned to groups? Are they inline policies?
- Can anyone else assume the roles of other users and principals?
There are also features in the IAM authorization flow that complicate this scoping exercise, such as implicit denies and service control policies (SCPs), which are applied at the organization level, outside the scope of your individual AWS account.
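To make the interaction between explicit denies, SCPs, and implicit denies concrete, here is a minimal sketch of the evaluation order. It is a deliberately simplified model, not AWS's implementation: it matches on actions only and ignores resources, conditions, permission boundaries, session policies, and resource-based policies. The function name `is_allowed` and the flat list-of-statements input are assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def _matches(statement, action):
    """Return True if any action pattern in the statement matches the request."""
    patterns = statement["Action"]
    if isinstance(patterns, str):
        patterns = [patterns]
    # IAM action names are case-insensitive and support * and ? wildcards.
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def _has(statements, effect, action):
    """Return True if some statement with the given effect covers the action."""
    return any(s["Effect"] == effect and _matches(s, action) for s in statements)


def is_allowed(action, identity_statements, scp_statements=None):
    """Simplified decision: explicit deny wins, then SCPs, then identity policy."""
    all_statements = list(identity_statements) + list(scp_statements or [])
    # 1. An explicit Deny anywhere overrides every Allow.
    if _has(all_statements, "Deny", action):
        return False
    # 2. If SCPs apply, they must also allow the action (otherwise implicit deny).
    if scp_statements is not None and not _has(scp_statements, "Allow", action):
        return False
    # 3. Without an Allow in the identity policy, the request is implicitly denied.
    return _has(identity_statements, "Allow", action)
```

In this model, an identity policy that allows `s3:*` still cannot call `s3:PutObject` if the SCP only allows `s3:GetObject`, which is why looking at a user's attached policies alone does not tell you what that user can actually do.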
There are several approaches out there to help developers keep IAM configuration tidy, auditable and right-sized, but automation is a key component. In this post, we’ll look at some of the common themes in approaching IAM automation.
Visibility is key
In the IAM right-sizing space (sometimes referred to in analyst circles as Cloud Infrastructure Entitlement Management, or CIEM for short), one of the biggest challenges is getting insight into your existing IAM configuration. Tools like CloudMapper by Scott Piper help create instant visibility of the current IAM state within your AWS environment. By querying APIs, it builds up a picture of your current state and validates security parameters to determine things like:
- How old are users? How old are users’ keys?
- Do particular roles ever use the APIs that the attached policy provides access to?
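As an illustration of the first question, the sketch below flags active access keys older than a threshold. It is not how CloudMapper works internally; it simply assumes input shaped like the `AccessKeyMetadata` list that boto3's `list_access_keys` returns (`AccessKeyId`, `Status`, and a timezone-aware `CreateDate`). The function name `stale_access_keys` and the 90-day default are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def stale_access_keys(key_metadata, max_age_days=90, now=None):
    """Return the IDs of active access keys created more than max_age_days ago.

    key_metadata: list of dicts with AccessKeyId, Status, and CreateDate
    (assumed to mirror the AccessKeyMetadata entries returned by IAM).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]
```

In a real audit you would feed this from the IAM API for every user; passing `now` explicitly keeps the function easy to test.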
Starting with that visibility is the only way to understand where you have overly permissive definitions and default configuration that you can then right-size. It’s also important to continuously monitor as your environment grows and underlying IAM permissions change to meet evolving needs. To do that, you can either patch issues at runtime or prevent them before deployment.
Runtime detection is suitable for identifying issues in what is already deployed. You can use event- or activity-based detections, such as searching for over-privileged IAM users and roles, manual changes, and unused network configurations. You can also check API-based criteria, such as key rotation policies, encryption settings, and the status of query logging services. Tools like Repokid and Aardvark by Netflix take a different approach, using dynamic IAM analysis to continuously right-size policies.
Identifying and patching things in runtime is great, until the next deployment. To maintain secure IAM constantly and prevent issues from resurfacing, you may want to consider an immutable approach.
Infrastructure as code and IAM
Infrastructure as code (IaC) frameworks, such as Terraform and CloudFormation, make cloud provisioning a more controllable and repeatable endeavor. They also make it easier to embed security controls and policy enforcement earlier, via continuous scanning.
IaC may also be important to right-sizing IAM and maintaining it over time. Tools like Policy Sentry by Kinnaird McQuade introduce automated policy generation, and Bridgecrew’s open source tool, AirIAM, helps transform dynamic IAM into machine-readable files.
AirIAM transforms AWS IAM configuration into instantly right-sized Terraform code using just an IAM Read-Only permission for any given AWS account. By leveraging IAM Access Advisor data, AirIAM rapidly produces a list of unused keys, old accounts and unbound roles within your IAM configuration. It also goes a step further and provides recommendations for reducing role permissions based on actual API usage.
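The usage-based part of that idea can be sketched independently of any tool. The function below is not AirIAM's implementation; it assumes input shaped like the `ServicesLastAccessed` list from IAM Access Advisor (as returned by boto3's `get_service_last_accessed_details`), where each entry has a `ServiceNamespace` and, if the service was ever used, a timezone-aware `LastAuthenticated`. The name `unused_services` and the 90-day window are assumptions.

```python
from datetime import datetime, timedelta, timezone


def unused_services(services_last_accessed, unused_after_days=90, now=None):
    """Return service namespaces a principal is allowed to use but has not
    used recently (or ever), as candidates for removal from its policies."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=unused_after_days)
    unused = []
    for entry in services_last_accessed:
        last_used = entry.get("LastAuthenticated")  # absent if never used
        if last_used is None or last_used < cutoff:
            unused.append(entry["ServiceNamespace"])
    return sorted(unused)
```

The result is a list of candidates, not a verdict: a service used only once a year (for example, during disaster-recovery drills) would show up here and still be needed.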
Transforming IAM into IaC unlocks incredible opportunities for IAM automation. It provides a foundation for testing, in the same way you automate unit and integration tests. It also enables you to add workflows and guardrails on top of IAM, via code review processes and automated CI/CD pipelines (as we’ll show below).
Enforcing IAM best practices with policy-as-code
Static analysis of IAM configuration at build time stops misconfigurations from getting into the cloud in the first place. A tool like Bridgecrew’s Checkov can fulfill this “static analysis” requirement with an extendable and customizable framework.
Along with its 450+ policies preventing common security misconfigurations, Checkov also provides early warning detection on several IAM configuration issues, such as:
- Ensuring no IAM policy documents allow * as a statement’s actions.
- Ensuring IAM policies that allow full "*-*" administrative privileges are not created.
- Ensuring the IAM role allows only specific services or principals to assume it.
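To show what checks of this kind look at, here is a minimal sketch that inspects IAM policy documents expressed as Python dicts. It is not Checkov's implementation or API; the function names `find_wildcard_statements` and `trust_policy_is_open` and the finding labels are assumptions made for illustration.

```python
def _as_list(value):
    """IAM JSON allows a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy_document):
    """Return (statement_index, kind) for Allow statements with Action "*".

    kind is "full-admin" when the resource is also "*", else "wildcard-action".
    """
    findings = []
    for index, stmt in enumerate(_as_list(policy_document.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        if "*" in _as_list(stmt.get("Action")):
            is_admin = "*" in _as_list(stmt.get("Resource"))
            findings.append((index, "full-admin" if is_admin else "wildcard-action"))
    return findings


def trust_policy_is_open(trust_policy):
    """Return True if a role trust policy lets any AWS principal assume it."""
    for stmt in _as_list(trust_policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        if principal == "*":
            return True
        if isinstance(principal, dict) and "*" in _as_list(principal.get("AWS")):
            return True
    return False
```

A check like this runs against the policy JSON in your IaC templates before anything is deployed, so a failing result can block the merge rather than trigger a cleanup.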
With continuous IAM policy enforcement and periodic IAM right-sizing, you can get a relatively high level of confidence in your IAM’s security posture. Because IAM is incredibly context-specific, however, you’re still missing the whole picture.
Tools that derive insights from AWS APIs only capture what is already deployed, so by the time they flag an issue your environment is potentially already exposed. And scanning for out-of-the-box best practices lacks awareness of your unique permissions needs and benchmarks.
One way to govern future IAM usage is to build attribute- or policy-based access control, using your policy engine of choice. This defines how access is granted through the use of policies combined with different attributes (user, resource, environment, etc.).
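As a rough sketch of the attribute-based idea, the function below grants access only when selected attributes on the principal match those on the resource. It is not tied to any particular policy engine; the name `abac_allows` and the `team` and `environment` attribute keys are hypothetical.

```python
def abac_allows(principal_tags, resource_tags, required_matches=("team", "environment")):
    """Allow access only if every required attribute is present on the
    principal and has the same value on the resource."""
    return all(
        key in principal_tags and principal_tags[key] == resource_tags.get(key)
        for key in required_matches
    )
```

The appeal of this model is that adding a new resource with the right tags requires no policy change, whereas a missing or mismatched attribute results in a denial by default.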
Another way to prevent access issues from making their way into a live deployment, and to track IAM drift, is to compare the pre-deploy IAM configuration with the changed one to determine whether anything has changed. For example, you could use AirIAM, GitHub Actions, Slack notifications, and some if-then logic to diff right-sized baselines with a commit’s new IAM configuration.
This tutorial goes into much greater depth on leveraging this approach as part of your automated build pipeline. The same approach can also detect manual IAM configuration changes by periodically comparing the requested state with the actual objects in the AWS environment.
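The comparison step itself can be sketched in a few lines. The example below is not taken from the tutorial or from AirIAM; it assumes both the baseline and the proposed configuration have already been reduced to a mapping of role name to allowed actions, and the function name `iam_drift` is hypothetical.

```python
def iam_drift(baseline, proposed):
    """Compare two {role_name: iterable_of_actions} mappings.

    Returns {role_name: {"added": [...], "removed": [...]}} containing only
    the roles whose permissions differ between baseline and proposed.
    """
    drift = {}
    for role in sorted(set(baseline) | set(proposed)):
        before = set(baseline.get(role, ()))
        after = set(proposed.get(role, ()))
        added = sorted(after - before)
        removed = sorted(before - after)
        if added or removed:
            drift[role] = {"added": added, "removed": removed}
    return drift
```

In a pipeline, a non-empty `added` list for any role is the natural trigger for the if-then logic: fail the build, request a review, or send a notification.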
Human access to AWS resources
The practices above will allow you to spot and remediate old, unused, or potentially compromising IAM users and roles. Our last best practice goes in a different direction, taking into account that IAM does not exist in a bubble.
There are several security risks related to IAM access keys without proper controls in place, including:
- Cryptojacking: Usage of access keys to utilize cloud compute resources to mine cryptocurrencies.
- Data exfiltration: Once access keys are compromised, any data or compute resources those keys can reach might be compromised as well, depending on the scope of the permissions attached to them. For example, if compromised keys allow editing an AWS Lambda function that has access to an RDS instance, the data in that instance can be compromised.
- Malicious activity: An overly permissive access key can grant access to shut down instances and cause downtime to an application.
This process not only mitigates access risk, but also simplifies the number of user objects managed, ensures that access is terminated when users leave, and reduces onboarding steps.
At the end of the day, IAM is just one of several pieces of the cloud security puzzle — although an extremely important one. At Bridgecrew, we are excited to make IAM least-privilege automation easier and more effective to help teams develop and maintain secure IAM.