aws-samples/sample-lambda-auditor

Lambda Auditor

Audit AWS Lambda functions for internet connectivity posture. Discovers all Lambda functions across your AWS account, classifies them by VPC attachment, analyzes VPC route tables for internet reachability, detects external endpoints in non-VPC Lambda code/configuration, and produces a comprehensive audit report with an optional geo-map visualization.

Installation

pip install -r requirements.txt

Usage

Run the auditor as a Python module:

# Scan all regions (default)
python -m lambda_auditor.cli

# Scan specific regions
python -m lambda_auditor.cli --regions us-east-1,eu-west-1

# Custom output directory
python -m lambda_auditor.cli --output-dir ./my-reports

# Use a specific AWS profile
python -m lambda_auditor.cli --profile my-profile

# Skip geo-map rendering
python -m lambda_auditor.cli --no-map

# Env-var scanning only (skip deployment package downloads — faster, lower cost)
python -m lambda_auditor.cli --skip-code-download

# Only scan Python and JavaScript files in deployment packages
python -m lambda_auditor.cli --include-extensions .py,.js

# Scan all default file types except JSON and YAML
python -m lambda_auditor.cli --exclude-extensions .json,.yaml

# Parallel scanning with 20 threads (default: 10)
python -m lambda_auditor.cli --max-workers 20

# Filter functions by name pattern (glob-style)
python -m lambda_auditor.cli --filter "prod-*"

# Filter by exact names (comma-separated)
python -m lambda_auditor.cli --filter "payment-handler,auth-service"

# Mix of exact names and patterns
python -m lambda_auditor.cli --filter "payment-handler,prod-*"

# Exclude functions by name or pattern
python -m lambda_auditor.cli --exclude "test-*,dev-*"

# Combine include and exclude
python -m lambda_auditor.cli --filter "prod-*" --exclude "prod-legacy-*"

# Summary-only HTML report (no per-function cards — faster for large fleets)
python -m lambda_auditor.cli --summary-only

Options can be combined:

python -m lambda_auditor.cli --regions us-east-1 --profile my-profile --output-dir ./reports --no-map

Minimum IAM Policy

The auditor uses only read-only API calls. Attach the following policy to the IAM user or role running the audit:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "lambda:ListFunctions",
                "lambda:GetFunction",
                "ec2:DescribeRouteTables",
                "ec2:DescribeRegions"
            ],
            "Resource": "*"
        }
    ]
}

The lambda:GetFunction call returns a pre-signed S3 URL for the deployment package download — no additional S3 or IAM permissions are needed. If your Lambda functions use a customer-managed KMS key for encryption, the auditor's role also needs kms:Decrypt on that key.

The tool never modifies, creates, or deletes any AWS resources.

Authentication

Lambda Auditor uses boto3's standard credential provider chain. It does not implement any custom authentication — it relies entirely on the credentials already configured in your environment. boto3 checks the following sources in order:

  1. Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN
  2. Shared credentials file: ~/.aws/credentials
  3. AWS config file: ~/.aws/config
  4. IAM role: used when running on an EC2 instance, ECS task, or Lambda function with an attached IAM role

Recommended: Use aws login (not long-term access keys)

Long-term IAM access keys (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) are a security risk — they don't expire and can be leaked. Use one of these temporary credential methods instead.

Option 1: aws login (simplest — works with console credentials)

The aws login command lets you authenticate using your existing AWS console credentials (root, IAM user, or federated). It opens a browser, you sign in, and you get temporary credentials valid for up to 12 hours. Requires AWS CLI v2.32.0+.

# Authenticate (opens browser)
aws login

# Or authenticate with a specific profile
aws login --profile my-audit-profile

# Run the auditor
python -m lambda_auditor.cli --profile my-audit-profile

# End session when done
aws logout

Prerequisites:

  • AWS CLI v2.32.0+ installed
  • The SignInLocalDevelopmentAccess managed policy attached to your IAM user/role/group (not needed for root)

For remote/headless environments without a browser, use aws login --remote for cross-device authentication.

See the AWS documentation for full details.

Option 2: aws sso login (for IAM Identity Center / SSO users)

  1. Configure SSO:
aws configure sso
# Follow the prompts: SSO start URL, region, account, role, profile name

This creates a named profile in ~/.aws/config like:

[profile my-audit-profile]
sso_start_url = https://my-org.awsapps.com/start
sso_region = us-east-1
sso_account_id = 123456789012
sso_role_name = ReadOnlyAccess
region = us-east-1

  2. Log in before running the auditor:
aws sso login --profile my-audit-profile

  3. Run the auditor with that profile:
python -m lambda_auditor.cli --profile my-audit-profile

SSO credentials are temporary (typically 1-12 hours), automatically managed, and don't require storing secrets on disk.

If you must use access keys (least preferred)

If SSO isn't available, configure credentials via the AWS CLI:

aws configure --profile lambda-audit
# Enter your Access Key ID, Secret Access Key, region, output format

Then use:

python -m lambda_auditor.cli --profile lambda-audit

Without --profile, boto3 uses the [default] profile or whatever the AWS_PROFILE environment variable points to.

No credentials are stored, logged, or transmitted by the tool. All API calls are read-only (list, describe, get).

Output

Reports are written to the output directory (default: ./audit_output):

File | Description
audit_report.json | Machine-readable JSON report with full findings, summary counts, and errors
audit_report.html | Human-readable HTML report with summary, per-function details, and embedded geo-map
geo_map.png | World map showing plot lines from Lambda regions to detected external endpoint locations (omitted with --no-map or when no endpoints are found)

How It Works

Lambda Auditor follows a five-stage pipeline: discover → classify → analyze → detect → report. Here's exactly what happens at each step.

Step 1: Discover Lambda Functions

The scanner determines which regions to scan using a three-tier approach:

  1. If you pass --regions us-east-1,eu-west-1, it uses exactly those regions
  2. If you don't pass --regions, it calls ec2:DescribeRegions filtered by opt-in-status to discover every region your account has access to (default regions plus any you've opted into)
  3. If that API call fails, it falls back to a hardcoded list of 10 common regions so the scan can still proceed

For each region, the scanner paginates through lambda:ListFunctions to capture the name, ARN, runtime, VPC configuration, region, and last modified date for every function. If a region fails (permissions, throttling), the error is logged and the scan continues with the remaining regions.
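
The three-tier region selection can be sketched as follows. This is a minimal illustration, not the tool's actual code: resolve_regions, FALLBACK_REGIONS, and the injected list_enabled_regions callable are hypothetical names, and the fallback list is a shortened placeholder rather than the tool's real list of 10 regions.

```python
from typing import Callable, List, Optional

# Placeholder fallback; the tool's actual hardcoded list is not shown in this README.
FALLBACK_REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]


def resolve_regions(
    requested: Optional[str],
    list_enabled_regions: Callable[[], List[str]],
) -> List[str]:
    """Pick regions: explicit list, then account discovery, then fallback."""
    if requested:
        # Tier 1: value passed as --regions us-east-1,eu-west-1
        return [r.strip() for r in requested.split(",") if r.strip()]
    try:
        # Tier 2: for example, a wrapper around ec2:DescribeRegions
        return list_enabled_regions()
    except Exception:
        # Tier 3: discovery failed, so keep the scan going with a fixed list
        return list(FALLBACK_REGIONS)


def broken_discovery() -> List[str]:
    raise RuntimeError("simulated API failure")


print(resolve_regions("us-east-1, eu-west-1", broken_discovery))
print(resolve_regions(None, lambda: ["ap-south-1"]))
print(resolve_regions(None, broken_discovery))
```

Passing the discovery call in as a function keeps the selection logic testable without AWS credentials.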

Step 2: Classify by VPC Attachment and Package Type

Each discovered function is classified along two dimensions:

  • VPC attachment: if the function's VpcConfig.SubnetIds list is non-empty, it's VPC-attached. Otherwise it's non-VPC (and has direct internet access by default).
  • Package type: Zip (code scan eligible — deployment package can be downloaded and scanned) or Container Image (env var scanning only — code is in a container registry, not a downloadable zip).

The report summary itemizes both: total functions, zip-packaged count, container-image count, VPC-attached count, and non-VPC count.
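
As a rough sketch of the two-dimensional classification, the function below works on a dictionary shaped like one entry of the lambda:ListFunctions response (VpcConfig.SubnetIds and PackageType). The function name classify and the label strings are illustrative assumptions, as is treating a missing PackageType as Zip.

```python
from typing import Dict


def classify(function_config: Dict) -> Dict[str, str]:
    """Classify one ListFunctions entry by VPC attachment and package type."""
    subnet_ids = (function_config.get("VpcConfig") or {}).get("SubnetIds") or []
    # Assumption: a missing PackageType is treated as a zip deployment.
    package_type = function_config.get("PackageType", "Zip")
    return {
        "vpc": "vpc-attached" if subnet_ids else "non-vpc",
        "package": "container-image" if package_type == "Image" else "zip",
    }


print(classify({"VpcConfig": {"SubnetIds": ["subnet-0abc"]}, "PackageType": "Zip"}))
print(classify({"VpcConfig": {"SubnetIds": []}, "PackageType": "Image"}))
```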

Step 3: Analyze VPC Routes (VPC-attached functions only)

For each VPC-attached function, the tool calls ec2:DescribeRouteTables filtered by the function's subnet IDs. It inspects the routes looking for a 0.0.0.0/0 destination:

  • Target starts with nat- → has internet route via NAT Gateway
  • Target starts with igw- → has internet route via Internet Gateway
  • No such route → no internet route (isolated subnet)
  • API error → marked as unknown

This tells you which VPC Lambdas can reach the internet and through what gateway type.
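
The route check can be illustrated with a small function over data shaped like the RouteTables list returned by ec2:DescribeRouteTables. This is a sketch under assumptions: classify_internet_route is a hypothetical name, the first default route found decides the result, and the "unknown" case for API errors is left to the caller.

```python
from typing import Dict, List


def classify_internet_route(route_tables: List[Dict]) -> str:
    """Report how a default route (0.0.0.0/0) leaves the subnet, if at all."""
    for table in route_tables:
        for route in table.get("Routes", []):
            if route.get("DestinationCidrBlock") != "0.0.0.0/0":
                continue
            # NAT gateways appear under NatGatewayId, internet gateways under GatewayId.
            target = route.get("NatGatewayId") or route.get("GatewayId") or ""
            if target.startswith("nat-"):
                return "internet via NAT Gateway"
            if target.startswith("igw-"):
                return "internet via Internet Gateway"
    return "no internet route"


tables = [{"Routes": [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc"},
]}]
print(classify_internet_route(tables))  # internet via NAT Gateway
```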

Step 4: Detect External Endpoints (non-VPC functions only)

For each non-VPC function, the tool calls lambda:GetFunction to retrieve the function's configuration and deployment package URL. It then:

  1. Scans environment variables for URLs, IP addresses, hostnames, and database connection strings
  2. Downloads the deployment package (zip) and scans source code files for the same patterns. By default it scans .py, .js, .ts, .java, .go, .rb, .cs, .json, .yaml, .yml, .env, .txt, .cfg, .ini, and .toml files. You can customize this with --include-extensions or --exclude-extensions.
  3. Filters out AWS-internal endpoints (*.amazonaws.com, *.amazonwebservices.com)
  4. Deduplicates the results
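
To make the scan, filter, and deduplicate steps concrete, here is a minimal sketch for environment variables. It is not the tool's detection logic: URL_PATTERN and external_hosts are hypothetical, and the regular expression only covers scheme://host style values (URLs and connection strings), not bare IP addresses or hostnames.

```python
import re
from typing import Dict, List
from urllib.parse import urlparse

# Illustrative pattern: any scheme://... token up to whitespace or a quote.
URL_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://[^\s'\"<>]+")
AWS_INTERNAL_SUFFIXES = (".amazonaws.com", ".amazonwebservices.com")


def external_hosts(env_vars: Dict[str, str]) -> List[str]:
    """Return unique non-AWS hostnames found in environment variable values."""
    hosts = set()
    for value in env_vars.values():
        for url in URL_PATTERN.findall(value):
            host = urlparse(url).hostname
            if host and not host.endswith(AWS_INTERNAL_SUFFIXES):
                hosts.add(host)  # the set handles deduplication
    return sorted(hosts)


env = {
    "API_URL": "https://api.example.com/v1",
    "API_URL_V2": "https://api.example.com/v2",
    "DATABASE_URL": "postgres://user:pw@db.example.org:5432/app",
    "QUEUE_URL": "https://sqs.us-east-1.amazonaws.com/123456789012/jobs",
}
print(external_hosts(env))  # ['api.example.com', 'db.example.org']
```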

Use --skip-code-download to skip the zip download entirely and scan only environment variables — this is faster and cheaper for large fleets, catching ~70-80% of endpoints in well-architected environments.

If the deployment package can't be downloaded (timeout, permissions), the tool still reports whatever it found in environment variables and marks detection as "unavailable" for that function.
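
The extension filter applied to a downloaded package might look like the sketch below. It assumes the zip is already in memory as bytes; iter_scannable_files is a hypothetical helper, and the default extension list is copied from the list above.

```python
import io
import zipfile
from typing import Iterable, Iterator, Tuple

DEFAULT_EXTENSIONS = (
    ".py", ".js", ".ts", ".java", ".go", ".rb", ".cs", ".json",
    ".yaml", ".yml", ".env", ".txt", ".cfg", ".ini", ".toml",
)


def iter_scannable_files(
    zip_bytes: bytes, extensions: Iterable[str] = DEFAULT_EXTENSIONS
) -> Iterator[Tuple[str, str]]:
    """Yield (name, text) for archive members whose extension is selected."""
    wanted = tuple(ext.lower() for ext in extensions)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or not info.filename.lower().endswith(wanted):
                continue
            # Decode leniently so one odd byte does not abort the scan.
            yield info.filename, archive.read(info).decode("utf-8", errors="replace")


# Build a tiny in-memory package to exercise the filter.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("handler.py", "URL = 'https://api.example.com'")
    archive.writestr("logo.png", "not really an image")

names = [name for name, _ in iter_scannable_files(buffer.getvalue(), [".py", ".js"])]
print(names)  # ['handler.py']
```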

Note: If an endpoint URL is stored in AWS Secrets Manager or SSM Parameter Store and the Lambda function retrieves it at runtime, the tool will not detect it. Static analysis can only find endpoints that are directly present in code or environment variables — values resolved dynamically from external secret stores at execution time are invisible to this scan.

Step 5: Generate Reports

The tool computes summary counts and generates two reports:

  • JSON report (audit_report.json): Full machine-readable output with all findings, summary statistics, per-function details, and an errors section listing any regions or functions where analysis was incomplete.

  • HTML report (audit_report.html): Styled human-readable report with a summary table, per-function cards showing VPC status/route analysis or detected endpoints, and an errors section. If a geo-map was rendered, it's embedded as a base64 image.
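
As an illustration of how the summary counts could be derived, the sketch below builds a report dictionary from per-function findings. The key names (total_functions, zip_packaged, and so on) are assumptions chosen for readability; the actual schema of audit_report.json is not documented in this README.

```python
import json
from collections import Counter
from typing import Dict, List


def build_report(findings: List[Dict], errors: List[Dict]) -> Dict:
    """Assemble summary counts plus per-function details and errors."""
    vpc = Counter(f["vpc"] for f in findings)
    package = Counter(f["package"] for f in findings)
    return {
        "summary": {
            "total_functions": len(findings),
            "zip_packaged": package["zip"],
            "container_image": package["container-image"],
            "vpc_attached": vpc["vpc-attached"],
            "non_vpc": vpc["non-vpc"],
        },
        "functions": findings,
        "errors": errors,
    }


findings = [
    {"name": "orders", "vpc": "vpc-attached", "package": "zip"},
    {"name": "mailer", "vpc": "non-vpc", "package": "container-image"},
    {"name": "thumbs", "vpc": "non-vpc", "package": "zip"},
]
report = build_report(findings, errors=[{"region": "ap-east-1", "error": "AccessDenied"}])
print(json.dumps(report["summary"], indent=2))
```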

Optional: Geo-Map Visualization

Unless --no-map is specified, the tool also:

  1. Resolves each detected endpoint hostname to an IP address via DNS
  2. Geolocates each IP using the free ip-api.com service (latitude, longitude, city, country)
  3. Renders a world map using cartopy/matplotlib with arc lines drawn from each Lambda function's AWS region to the geographic location of each external endpoint, color-coded by function name

This gives you a visual overview of where your Lambda functions are communicating geographically.
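
The DNS step can be sketched with the standard library alone. This is an assumption-laden illustration, not the tool's code: resolve_hosts is a hypothetical name, each unique hostname is resolved once, and hostnames that fail to resolve are skipped. The resolver is injectable so the example runs without network access.

```python
import socket
from typing import Callable, Dict, Iterable


def resolve_hosts(
    hostnames: Iterable[str],
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> Dict[str, str]:
    """Resolve each unique hostname once, skipping names that fail."""
    resolved = {}
    for host in sorted(set(hostnames)):
        try:
            resolved[host] = resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return resolved


calls = []


def fake_resolver(host: str) -> str:
    """Offline stand-in for DNS, using a documentation-range IP."""
    calls.append(host)
    table = {"api.example.com": "203.0.113.10"}
    if host not in table:
        raise socket.gaierror("name not known")
    return table[host]


resolved = resolve_hosts(
    ["api.example.com", "api.example.com", "gone.example.net"], fake_resolver
)
print(resolved)  # {'api.example.com': '203.0.113.10'}
```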

Error Handling

The tool is designed to never stop on a single failure:

  • Region scan failures → logged, other regions continue
  • Route table lookup failures → marked "unknown", scan continues
  • Deployment package download failures → marked "unavailable", env var results still reported
  • DNS resolution failures → endpoint skipped on geo-map
  • Geolocation rate limits (429) → automatic backoff and retry
  • All errors are collected and included in the final report's errors section

Limitations

Lambda Auditor performs static analysis of code and configuration — it finds potential endpoints, not confirmed runtime traffic. Keep these limitations in mind:

  • Endpoint detection may produce false positives (URLs in comments, unused code paths, test fixtures)
  • Dynamically constructed URLs (e.g., f"https://{host}/api") won't be detected if the hostname comes from a runtime variable or external config service
  • DNS resolution runs from your machine, which may return different IPs than what the Lambda would resolve to (e.g., CDN edge nodes vary by location)
  • Deployment packages using container images (Lambda container format) are not scanned for code — only environment variables are analyzed. The report identifies these as "Container-Image (env var scan only)" so you know which functions have limited coverage.

For complete coverage, use Lambda Auditor alongside runtime monitoring (VPC Flow Logs) — static analysis catches potential endpoints in dormant functions, while flow logs capture actual traffic from active ones.

Cost Considerations

The main cost driver is downloading Lambda deployment packages for code scanning. The GetFunction API call itself is free, but the zip download transfers data from S3.

Cost by scale (assuming ~10 MB average package size)

Scale | Data Transfer | Cost (from internet) | Cost (same-region EC2)
100 functions | 1 GB | ~$0.09 | $0.00
500 functions | 5 GB | ~$0.45 | $0.00
1,000 functions | 10 GB | ~$0.90 | $0.00

For daily scans of large fleets, costs compound: 1,000 functions at 25 MB average is 25 GB per scan, which at ~$0.09/GB comes to roughly $820/year when downloading from the internet.
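
The arithmetic behind these estimates is simple enough to script. The sketch below assumes the ~$0.09/GB internet egress rate implied by the table and 1 GB = 1,000 MB; transfer_cost_usd is a hypothetical helper, and actual pricing varies by region and volume tier.

```python
def transfer_cost_usd(
    functions: int,
    avg_package_mb: float,
    scans: int = 1,
    price_per_gb: float = 0.09,  # assumed rate, inferred from the table
) -> float:
    """Estimate data transfer cost for downloading every package."""
    gigabytes = functions * avg_package_mb / 1000 * scans
    return round(gigabytes * price_per_gb, 2)


print(transfer_cost_usd(100, 10))    # 0.09
print(transfer_cost_usd(1000, 10))   # 0.9
# One scan per day for a year, 1,000 functions at 25 MB average:
print(transfer_cost_usd(1000, 25, scans=365))
```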

Cost optimization tips

  1. Run from within AWS in the same region as your Lambda functions — S3 to same-region compute is free data transfer
  2. Use an S3 Gateway VPC Endpoint (free) if running on EC2/ECS — eliminates NAT Gateway data processing charges ($0.045/GB)
  3. Use --no-map to skip geolocation API calls when you only need the VPC/endpoint analysis
  4. For large fleets (500+ functions), consider that environment variable scanning alone (no zip download) catches 70-80% of external endpoints in well-architected environments. The GetFunction API response includes env vars without any download. Code scanning adds coverage for hardcoded URLs, config files bundled in the zip, and SDK endpoint overrides — but at the cost of downloading every package.

Hidden costs to watch for

  • NAT Gateway data processing if running on EC2 without an S3 VPC endpoint
  • Cross-region data transfer when scanning functions across many regions from a single location
  • Repeated scans (CI/CD pipelines, daily audits) multiply all transfer costs

API Rate Limits

The tool makes sequential AWS API calls, which keeps it well under rate limits for most workloads. However, large accounts should be aware of these service quotas:

API Call | Service Limit | How the tool uses it | Throttle risk
lambda:GetFunction | 100 req/s per region | 1 call per non-VPC function, sequential | Medium for 100+ functions with --skip-code-download
lambda:ListFunctions | 15 req/s (control plane) | 1 paginated call per region | Low
ec2:DescribeRouteTables | 100 bucket, 20/s refill | 1 call per subnet per VPC function, filtered | Low
ec2:DescribeRegions | 100 bucket, 20/s refill | 1 call total | Negligible

The GetFunction limit (100 req/s per account per region) is the most relevant. In normal mode (with code download), each call is naturally paced by the zip download time (~1-5 calls/s). With --skip-code-download, calls are much faster and could approach the limit in regions with hundreds of non-VPC functions.

The ListFunctions 15 req/s limit is shared across all Lambda control plane APIs (excluding GetFunction and invocations). If other processes in the same account are making Lambda API calls concurrently, the combined rate could trigger throttling.

All API calls use exponential backoff with jitter on throttle responses (Throttling, TooManyRequestsException, RequestLimitExceeded), so hitting a limit causes slower scans rather than failures. Results are not lost — the tool retries up to 3 times before recording an error and continuing.
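
A retry loop of this kind can be sketched as follows. This is a generic illustration of exponential backoff with full jitter, not the tool's implementation: ApiError is a stand-in exception (real code would inspect the error code on botocore's ClientError), and base_delay and the injectable sleep are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")
THROTTLE_CODES = {"Throttling", "TooManyRequestsException", "RequestLimitExceeded"}


class ApiError(Exception):
    """Stand-in for an AWS API error that carries an error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry throttled calls, waiting a random time that grows per attempt."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ApiError as exc:
            if exc.code not in THROTTLE_CODES or attempt == max_retries:
                raise  # not a throttle, or retries exhausted: let the caller record it
            # Full jitter: anywhere between 0 and base_delay * 2**attempt seconds.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise AssertionError("unreachable")


delays = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ApiError("Throttling")
    return "ok"


print(call_with_backoff(flaky, sleep=delays.append))  # ok
```

Injecting sleep keeps the example instant to run; in real use the default time.sleep applies.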

Scaling for Large Accounts

The tool includes several features designed for accounts with hundreds or thousands of Lambda functions:

  • Parallel region scanning — regions are scanned concurrently using a thread pool (default: 10 threads, configurable with --max-workers)
  • Parallel VPC analysis and endpoint detection — functions are processed concurrently within each stage
  • Hostname deduplication — if the same endpoint (e.g., api.stripe.com) appears across 50 functions, DNS resolution and geolocation happen once, not 50 times
  • Summary-only HTML mode — --summary-only skips per-function cards in the HTML report, keeping the file small and fast to generate for large fleets
  • Env-var-only mode — --skip-code-download avoids downloading deployment packages entirely, reducing scan time and data transfer by ~95%
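
Parallel region scanning with a thread pool can be sketched with concurrent.futures. The names scan_regions and fake_scan are hypothetical, and the per-region scan function is passed in so the example runs without AWS access; a failing region is recorded and the others continue.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Tuple


def scan_regions(
    regions: List[str],
    scan_region: Callable[[str], List[str]],
    max_workers: int = 10,
) -> Tuple[Dict[str, List[str]], List[Dict[str, str]]]:
    """Scan regions concurrently, collecting results and per-region errors."""
    results: Dict[str, List[str]] = {}
    errors: List[Dict[str, str]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scan_region, region): region for region in regions}
        for future in as_completed(futures):
            region = futures[future]
            try:
                results[region] = future.result()
            except Exception as exc:
                # One failing region must not stop the rest of the scan.
                errors.append({"region": region, "error": str(exc)})
    return results, errors


def fake_scan(region: str) -> List[str]:
    """Offline stand-in for paginating lambda:ListFunctions in one region."""
    if region == "eu-west-1":
        raise RuntimeError("AccessDenied")
    return [f"{region}-fn"]


results, errors = scan_regions(["us-east-1", "eu-west-1", "ap-south-1"], fake_scan, max_workers=20)
print(sorted(results), errors)
```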

For a fleet of 1,000+ functions across 20+ regions, a recommended fast-scan command:

python -m lambda_auditor.cli --skip-code-download --no-map --summary-only --max-workers 20

This scans env vars only, skips geo-map rendering, produces a summary HTML, and uses 20 parallel threads. All data is ephemeral — nothing is persisted to disk beyond the final report files in the output directory.

Two-Pass Workflow: Summary Then Deep Scan

For large accounts, a two-pass approach gives you the best of both worlds — fast overview first, then targeted deep analysis:

# Pass 1: Fast summary of everything (seconds, minimal API calls)
python -m lambda_auditor.cli --skip-code-download --no-map --summary-only --max-workers 20

# Pass 2: Deep scan of specific functions by name pattern
python -m lambda_auditor.cli --filter "prod-payment-*" --max-workers 20

The --filter flag accepts comma-separated names or glob patterns (prod-*, *-api-*, payment-handler). The --exclude flag removes matching functions after filtering. Both can be combined — include first, then exclude. The tool lists all functions first (cheap API pagination), then filters by name before running the expensive analysis stages.
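
The include-then-exclude behavior can be approximated with the standard library's fnmatch module. This is a sketch under assumptions: select_functions is a hypothetical name, and matching is treated as case-sensitive, which the README does not specify.

```python
from fnmatch import fnmatchcase
from typing import List, Optional


def _matches(name: str, patterns: str) -> bool:
    """True if name matches any comma-separated exact name or glob pattern."""
    return any(fnmatchcase(name, p.strip()) for p in patterns.split(",") if p.strip())


def select_functions(
    names: List[str], include: Optional[str] = None, exclude: Optional[str] = None
) -> List[str]:
    """Apply the include filter first, then remove anything excluded."""
    selected = [n for n in names if include is None or _matches(n, include)]
    return [n for n in selected if exclude is None or not _matches(n, exclude)]


names = ["prod-api", "prod-legacy-batch", "payment-handler", "test-api"]
print(select_functions(names, include="payment-handler,prod-*", exclude="prod-legacy-*"))
# ['prod-api', 'payment-handler']
```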

Example Output

You can generate example reports with randomized synthetic data (no AWS credentials needed):

python generate_example.py

This produces the following artifacts in the examples/ directory:

File | Description
audit_report.html | Styled HTML report with summary, per-function cards, embedded geo-map, and errors
audit_report.json | Machine-readable JSON with full findings
audit_report.pdf | Multi-page PDF with summary, function details, geo-map, and errors
example_geo_map.png | World map with arc lines from Lambda regions to endpoint locations

Report Preview

[Image: Report Preview]

Geo-Map Visualization

The geo-map shows arc lines from each Lambda function's AWS region to the geographic location of each detected external endpoint, color-coded by function name:

[Image: Geo-Map Example]

Sample Data

The example generates 20 synthetic Lambda functions (~45% VPC-attached, ~55% non-VPC) with randomized VPC route analysis results, external endpoint detections, and geolocation data. The HTML report includes:

  • A summary table with total counts by classification and route status
  • Per-function cards showing VPC details (VPC ID, subnets, route status, gateway type) or detected external endpoints
  • An embedded geo-map visualization showing communication paths from AWS regions to endpoint locations worldwide
  • An errors section listing any simulated scan failures

To specify a custom output directory:

python generate_example.py ./my-examples
