Lambda Auditor

Audit AWS Lambda functions for internet connectivity posture. Discovers all Lambda functions across your AWS account, classifies them by VPC attachment, analyzes VPC route tables for internet reachability, detects external endpoints in non-VPC Lambda code/configuration, and produces a comprehensive audit report with an optional geo-map visualization.

Installation

pip install -r requirements.txt

Usage

Run the auditor as a Python module:

# Scan all regions (default)
python -m lambda_auditor.cli

# Scan specific regions
python -m lambda_auditor.cli --regions us-east-1,eu-west-1

# Custom output directory
python -m lambda_auditor.cli --output-dir ./my-reports

# Use a specific AWS profile
python -m lambda_auditor.cli --profile my-profile

# Skip geo-map rendering
python -m lambda_auditor.cli --no-map

# Env-var scanning only (skip deployment package downloads — faster, lower cost)
python -m lambda_auditor.cli --skip-code-download

# Only scan Python and JavaScript files in deployment packages
python -m lambda_auditor.cli --include-extensions .py,.js

# Scan all default file types except JSON and YAML
python -m lambda_auditor.cli --exclude-extensions .json,.yaml

# Parallel scanning with 20 threads (default: 10)
python -m lambda_auditor.cli --max-workers 20

# Filter functions by name pattern (glob-style)
python -m lambda_auditor.cli --filter "prod-*"

# Filter by exact names (comma-separated)
python -m lambda_auditor.cli --filter "payment-handler,auth-service"

# Mix of exact names and patterns
python -m lambda_auditor.cli --filter "payment-handler,prod-*"

# Exclude functions by name or pattern
python -m lambda_auditor.cli --exclude "test-*,dev-*"

# Combine include and exclude
python -m lambda_auditor.cli --filter "prod-*" --exclude "prod-legacy-*"

# Summary-only HTML report (no per-function cards — faster for large fleets)
python -m lambda_auditor.cli --summary-only

Options can be combined:

python -m lambda_auditor.cli --regions us-east-1 --profile my-profile --output-dir ./reports --no-map

Minimum IAM Policy

The auditor uses only read-only API calls. Attach the following policy to the IAM user or role running the audit:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "lambda:ListFunctions",
                "lambda:GetFunction",
                "ec2:DescribeRouteTables",
                "ec2:DescribeRegions"
            ],
            "Resource": "*"
        }
    ]
}

The lambda:GetFunction call returns a pre-signed S3 URL for the deployment package download — no additional S3 or IAM permissions are needed. If your Lambda functions use a customer-managed KMS key for encryption, the auditor's role also needs kms:Decrypt on that key.

The tool never modifies, creates, or deletes any AWS resources.

Authentication

Lambda Auditor uses boto3's standard credential provider chain. It does not implement any custom authentication — it relies entirely on the credentials already configured in your environment. boto3 checks the following sources in order:

1. Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN
2. Shared credentials file: ~/.aws/credentials
3. AWS config file: ~/.aws/config
4. IAM role: used when running on an EC2 instance, ECS task, or Lambda function with an attached IAM role

Recommended: Use `aws login` (not long-term access keys)

Long-term IAM access keys (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) are a security risk — they don't expire and can be leaked. Use one of these temporary credential methods instead.

Option 1: `aws login` (simplest — works with console credentials)

The aws login command lets you authenticate using your existing AWS console credentials (root, IAM user, or federated). It opens a browser, you sign in, and you get temporary credentials valid for up to 12 hours. Requires AWS CLI v2.32.0+.

# Authenticate (opens browser)
aws login

# Or authenticate with a specific profile
aws login --profile my-audit-profile

# Run the auditor
python -m lambda_auditor.cli --profile my-audit-profile

# End session when done
aws logout

Prerequisites:

AWS CLI v2.32.0+ installed
The SignInLocalDevelopmentAccess managed policy attached to your IAM user/role/group (not needed for root)

For remote/headless environments without a browser, use aws login --remote for cross-device authentication.

See the AWS documentation for full details.

Option 2: `aws sso login` (for IAM Identity Center / SSO users)

Configure SSO:

aws configure sso
# Follow the prompts: SSO start URL, region, account, role, profile name

This creates a named profile in ~/.aws/config like:

[profile my-audit-profile]
sso_start_url = https://my-org.awsapps.com/start
sso_region = us-east-1
sso_account_id = 123456789012
sso_role_name = ReadOnlyAccess
region = us-east-1

Log in before running the auditor:

aws sso login --profile my-audit-profile

Run the auditor with that profile:

python -m lambda_auditor.cli --profile my-audit-profile

SSO credentials are temporary (typically 1-12 hours), automatically managed, and don't require storing long-term secrets on disk.

If you must use access keys (least preferred)

If neither aws login nor SSO is available, configure credentials via the AWS CLI:

aws configure --profile lambda-audit
# Enter your Access Key ID, Secret Access Key, region, output format

Then use:

python -m lambda_auditor.cli --profile lambda-audit

Without --profile, boto3 uses the [default] profile or whatever the AWS_PROFILE environment variable points to.

No credentials are stored, logged, or transmitted by the tool. All API calls are read-only (list, describe, get).

Output

Reports are written to the output directory (default: ./audit_output):

| File | Description |
| --- | --- |
| `audit_report.json` | Machine-readable JSON report with full findings, summary counts, and errors |
| `audit_report.html` | Human-readable HTML report with summary, per-function details, and embedded geo-map |
| `geo_map.png` | World map showing arc lines from Lambda regions to detected external endpoint locations (omitted with `--no-map` or when no endpoints are found) |

How It Works

Lambda Auditor follows a five-stage pipeline: discover → classify → analyze → detect → report. Here's exactly what happens at each step.

Step 1: Discover Lambda Functions

The scanner determines which regions to scan using a three-tier approach:

1. If you pass --regions us-east-1,eu-west-1, it uses exactly those regions.
2. If you don't pass --regions, it calls ec2:DescribeRegions filtered by opt-in-status to discover every region your account has access to (default regions plus any you've opted into).
3. If that API call fails, it falls back to a hardcoded list of 10 common regions so the scan can still proceed.

For each region, the scanner paginates through lambda:ListFunctions to capture the name, ARN, runtime, VPC configuration, region, and last modified date for every function. If a region fails (permissions, throttling), the error is logged and the scan continues with the remaining regions.
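The three-tier region selection described above could be sketched roughly as follows. This is an illustration, not the tool's actual code: `resolve_regions`, the `describe_regions` callable, and the shortened `FALLBACK_REGIONS` list are all hypothetical names chosen for the example.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical, shortened fallback list. The tool's real list of 10 regions
# is not reproduced in this README.
FALLBACK_REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]


def resolve_regions(
    requested: Optional[str],
    describe_regions: Callable[[], Iterable[str]],
    fallback: Iterable[str] = FALLBACK_REGIONS,
) -> List[str]:
    """Return the regions to scan, following the three tiers in order."""
    # Tier 1: an explicit --regions value always wins.
    if requested:
        return [r.strip() for r in requested.split(",") if r.strip()]
    # Tier 2: ask the account which regions are enabled.
    try:
        return sorted(describe_regions())
    except Exception:
        # Tier 3: discovery failed, so fall back to a fixed list.
        return list(fallback)
```

In a real run, `describe_regions` would presumably wrap the ec2:DescribeRegions call; injecting it as a callable keeps the selection logic testable without AWS credentials.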

Step 2: Classify by VPC Attachment and Package Type

Each discovered function is classified along two dimensions:

VPC attachment: if the function's VpcConfig.SubnetIds list is non-empty, it's VPC-attached. Otherwise it's non-VPC (and has direct internet access by default).
Package type: Zip (code scan eligible — deployment package can be downloaded and scanned) or Container Image (env var scanning only — code is in a container registry, not a downloadable zip).

The report summary itemizes both: total functions, zip-packaged count, container-image count, VPC-attached count, and non-VPC count.
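As a minimal sketch of this classification, assuming each function is represented by the dictionary shape that lambda:ListFunctions returns (`VpcConfig.SubnetIds` and `PackageType`), the two dimensions could be derived like this. The function name and the label strings are made up for the example and may not match the tool's internals.

```python
from typing import Any, Dict


def classify_function(config: Dict[str, Any]) -> Dict[str, str]:
    """Classify one ListFunctions entry by VPC attachment and package type."""
    # VpcConfig can be missing, or present with an empty SubnetIds list.
    subnet_ids = (config.get("VpcConfig") or {}).get("SubnetIds") or []
    # Lambda reports PackageType as "Zip" or "Image"; default to "Zip" if absent.
    package_type = config.get("PackageType", "Zip")
    return {
        "name": config.get("FunctionName", ""),
        "vpc": "vpc-attached" if subnet_ids else "non-vpc",
        "package": "container-image" if package_type == "Image" else "zip",
    }
```

Treating an empty `SubnetIds` list the same as a missing `VpcConfig` matters because a function that was detached from a VPC can still carry an empty VPC configuration block.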

Step 3: Analyze VPC Routes (VPC-attached functions only)

For each VPC-attached function, the tool calls ec2:DescribeRouteTables filtered by the function's subnet IDs. It inspects the routes looking for a 0.0.0.0/0 destination:

Target starts with nat- → has internet route via NAT Gateway
Target starts with igw- → has internet route via Internet Gateway
No such route → no internet route (isolated subnet)
API error → marked as unknown

This tells you which VPC Lambdas can reach the internet and through what gateway type.
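The route check can be illustrated with a small function over the `RouteTables` list from an ec2:DescribeRouteTables response. This is a sketch under assumptions: the status strings are hypothetical, and the "unknown" case for API errors is left to the caller.

```python
from typing import Any, Dict, Iterable


def classify_routes(route_tables: Iterable[Dict[str, Any]]) -> str:
    """Return the internet-route status for a function's route tables."""
    for table in route_tables:
        for route in table.get("Routes", []):
            # Only the default IPv4 route is of interest here.
            if route.get("DestinationCidrBlock") != "0.0.0.0/0":
                continue
            # The API puts the target in NatGatewayId or GatewayId,
            # depending on the gateway type.
            target = route.get("NatGatewayId") or route.get("GatewayId") or ""
            if target.startswith("nat-"):
                return "nat-gateway"
            if target.startswith("igw-"):
                return "internet-gateway"
    return "no-internet-route"
```

A caller would wrap the API request in a try/except and record "unknown" when the request itself fails.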

Step 4: Detect External Endpoints (non-VPC functions only)

For each non-VPC function, the tool calls lambda:GetFunction to retrieve the function's configuration and deployment package URL. It then:

Scans environment variables for URLs, IP addresses, hostnames, and database connection strings
Downloads the deployment package (zip) and scans source code files for the same patterns. By default it scans .py, .js, .ts, .java, .go, .rb, .cs, .json, .yaml, .yml, .env, .txt, .cfg, .ini, and .toml files. You can customize this with --include-extensions or --exclude-extensions.
Filters out AWS-internal endpoints (*.amazonaws.com, *.amazonwebservices.com)
Deduplicates the results
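To make the scan, filter, and deduplicate steps concrete, here is a simplified sketch that covers only URL-shaped values in environment variables. The regular expression and the `find_external_hosts` name are assumptions for illustration. The tool's real patterns (which also cover bare IP addresses and hostnames) are not shown in this README.

```python
import re
from typing import Dict, List
from urllib.parse import urlsplit

# Simplified pattern: any scheme://... token up to whitespace or a quote.
URL_RE = re.compile(r"""[a-z][a-z0-9+.-]*://[^\s'"<>]+""", re.IGNORECASE)
# Suffixes treated as AWS-internal, per the list above.
AWS_SUFFIXES = (".amazonaws.com", ".amazonwebservices.com")


def find_external_hosts(env_vars: Dict[str, str]) -> List[str]:
    """Return sorted, deduplicated non-AWS hostnames found in env var values."""
    hosts = set()
    for value in env_vars.values():
        for token in URL_RE.findall(value):
            try:
                # urlsplit strips credentials and ports and lowercases the host.
                host = urlsplit(token).hostname
            except ValueError:
                continue
            if not host or host.endswith(AWS_SUFFIXES):
                continue
            hosts.add(host)
    return sorted(hosts)
```

Using a set makes deduplication implicit, and parsing with `urlsplit` means a connection string such as `postgres://user:pw@db.example.com:5432/app` yields only `db.example.com`.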

Use --skip-code-download to skip the zip download entirely and scan only environment variables — this is faster and cheaper for large fleets, catching ~70-80% of endpoints in well-architected environments.

If the deployment package can't be downloaded (timeout, permissions), the tool still reports whatever it found in environment variables and marks detection as "unavailable" for that function.
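The extension filtering applied to a downloaded package might look like the following sketch. It assumes the zip is already in memory as bytes. `select_extensions`, `file_extension`, and `read_scannable_files` are hypothetical helper names, and treating a bare `.env` file as its own extension is a guess at how such dotfiles are matched.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Dict, Iterable, Optional, Set

# Default extensions as listed in this README.
DEFAULT_EXTENSIONS = {
    ".py", ".js", ".ts", ".java", ".go", ".rb", ".cs", ".json",
    ".yaml", ".yml", ".env", ".txt", ".cfg", ".ini", ".toml",
}


def select_extensions(include: Optional[Iterable[str]] = None,
                      exclude: Optional[Iterable[str]] = None) -> Set[str]:
    """Start from the include list (or the defaults), then drop exclusions."""
    chosen = set(include) if include else set(DEFAULT_EXTENSIONS)
    return chosen - set(exclude or ())


def file_extension(path: str) -> str:
    name = PurePosixPath(path).name.lower()
    # A dotfile such as ".env" has no suffix, so use its whole name instead.
    if name.startswith(".") and name.count(".") == 1:
        return name
    return PurePosixPath(name).suffix


def read_scannable_files(package_bytes: bytes, extensions: Set[str]) -> Dict[str, str]:
    """Return {path: text} for archive members whose extension is selected."""
    files = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir() or file_extension(info.filename) not in extensions:
                continue
            files[info.filename] = archive.read(info).decode("utf-8", errors="replace")
    return files
```

Decoding with `errors="replace"` keeps a single badly encoded file from aborting the scan of the rest of the package.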

Note: If an endpoint URL is stored in AWS Secrets Manager or SSM Parameter Store and the Lambda function retrieves it at runtime, the tool will not detect it. Static analysis can only find endpoints that are directly present in code or environment variables — values resolved dynamically from external secret stores at execution time are invisible to this scan.

Step 5: Generate Reports

The tool computes summary counts and generates two reports:

JSON report (audit_report.json): Full machine-readable output with all findings, summary statistics, per-function details, and an errors section listing any regions or functions where analysis was incomplete.
HTML report (audit_report.html): Styled human-readable report with a summary table, per-function cards showing VPC status/route analysis or detected endpoints, and an errors section. If a geo-map was rendered, it's embedded as a base64 image.
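Embedding the map as a base64 image generally means turning the PNG bytes into a data URI inside an `<img>` tag. The helpers below are a generic sketch of that technique, not the tool's report template. Both function names are hypothetical.

```python
import base64
import html


def png_data_uri(png_bytes: bytes) -> str:
    """Encode PNG bytes as a data URI usable in an HTML src attribute."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def img_tag(png_bytes: bytes, alt: str = "Geo-map") -> str:
    """Build a self-contained <img> element with the image inlined."""
    return f'<img src="{png_data_uri(png_bytes)}" alt="{html.escape(alt)}">'
```

Inlining the image keeps the HTML report a single portable file, at the cost of roughly a one-third size increase over the raw PNG.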

Optional: Geo-Map Visualization

Unless --no-map is specified, the tool also:

Resolves each detected endpoint hostname to an IP address via DNS
Geolocates each IP using the free ip-api.com service (latitude, longitude, city, country)
Renders a world map using cartopy/matplotlib with arc lines drawn from each Lambda function's AWS region to the geographic location of each external endpoint, color-coded by function name

This gives you a visual overview of where your Lambda functions are communicating geographically.

Error Handling

The tool is designed to never stop on a single failure:

Region scan failures → logged, other regions continue
Route table lookup failures → marked "unknown", scan continues
Deployment package download failures → marked "unavailable", env var results still reported
DNS resolution failures → endpoint skipped on geo-map
Geolocation rate limits (429) → automatic backoff and retry
All errors are collected and included in the final report's errors section
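The continue-on-failure pattern above can be sketched with a thread pool that records each failure instead of raising it. This is an assumed structure for illustration: `scan_all_regions` and the injected `scan_region` callable are hypothetical names, and the error record shape is invented.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Dict, Iterable, List, Tuple


def scan_all_regions(
    regions: Iterable[str],
    scan_region: Callable[[str], Any],
    max_workers: int = 10,
) -> Tuple[Dict[str, Any], List[Dict[str, str]]]:
    """Scan regions concurrently; collect errors rather than stopping."""
    results: Dict[str, Any] = {}
    errors: List[Dict[str, str]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scan_region, region): region for region in regions}
        for future in as_completed(futures):
            region = futures[future]
            try:
                results[region] = future.result()
            except Exception as exc:
                # One failed region is recorded and the rest keep going.
                errors.append({"region": region, "error": str(exc)})
    return results, errors
```

The returned `errors` list is what would feed an errors section in the final report.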

Limitations

Lambda Auditor performs static analysis of code and configuration — it finds potential endpoints, not confirmed runtime traffic. Keep these limitations in mind:

Endpoint detection may produce false positives (URLs in comments, unused code paths, test fixtures)
Dynamically constructed URLs (e.g., f"https://{host}/api") won't be detected if the hostname comes from a runtime variable or external config service
DNS resolution runs from your machine, which may return different IPs than what the Lambda would resolve to (e.g., CDN edge nodes vary by location)
Functions packaged as container images are not scanned for code; only their environment variables are analyzed. The report identifies these as "Container-Image (env var scan only)" so you know which functions have limited coverage.

For complete coverage, use Lambda Auditor alongside runtime monitoring (VPC Flow Logs) — static analysis catches potential endpoints in dormant functions, while flow logs capture actual traffic from active ones.

Cost Considerations

The main cost driver is downloading Lambda deployment packages for code scanning. The GetFunction API call itself is free, but the zip download transfers data from S3.

Cost by scale (assuming ~10 MB average package size)

| Scale | Data Transfer | Cost (from internet) | Cost (same-region EC2) |
| --- | --- | --- | --- |
| 100 functions | 1 GB | ~$0.09 | $0.00 |
| 500 functions | 5 GB | ~$0.45 | $0.00 |
| 1,000 functions | 10 GB | ~$0.90 | $0.00 |

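For daily scans of large fleets, costs compound: 1,000 functions at 25 MB average is 25 GB per scan, which at the $0.09/GB rate used in the table comes to about $2.25 per scan, or roughly $820/year from the internet.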
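The arithmetic behind these estimates is simple enough to check with a small helper. The function name is hypothetical, and the default rate of $0.09/GB is the one implied by the table rows, not a quoted AWS price.

```python
def transfer_cost_usd(
    function_count: int,
    avg_package_mb: float,
    usd_per_gb: float = 0.09,  # rate implied by the table above
    scans: int = 1,
) -> float:
    """Estimate data-transfer cost for downloading every deployment package."""
    # The table treats 1 GB as 1,000 MB, so this sketch does the same.
    total_gb = function_count * avg_package_mb / 1000
    return total_gb * usd_per_gb * scans
```

For example, `transfer_cost_usd(1000, 10)` reproduces the last table row (about $0.90), and passing `scans=365` gives an annual figure for a daily audit.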

Cost optimization tips

Run from within AWS in the same region as your Lambda functions — S3 to same-region compute is free data transfer
Use an S3 Gateway VPC Endpoint (free) if running on EC2/ECS — eliminates NAT Gateway data processing charges ($0.045/GB)
Use --no-map to skip geolocation API calls when you only need the VPC/endpoint analysis
For large fleets (500+ functions), consider that environment variable scanning alone (no zip download) catches 70-80% of external endpoints in well-architected environments. The GetFunction API response includes env vars without any download. Code scanning adds coverage for hardcoded URLs, config files bundled in the zip, and SDK endpoint overrides — but at the cost of downloading every package.

Hidden costs to watch for

NAT Gateway data processing if running on EC2 without an S3 VPC endpoint
Cross-region data transfer when scanning functions across many regions from a single location
Repeated scans (CI/CD pipelines, daily audits) multiply all transfer costs

API Rate Limits

The tool bounds its concurrency with --max-workers (default: 10), which keeps it well under rate limits for most workloads. However, large accounts should be aware of these service quotas:

| API Call | Service Limit | How the tool uses it | Throttle risk |
| --- | --- | --- | --- |
| `lambda:GetFunction` | 100 req/s per region | 1 call per non-VPC function, sequential | Medium for 100+ functions with `--skip-code-download` |
| `lambda:ListFunctions` | 15 req/s (control plane) | 1 paginated call per region | Low |
| `ec2:DescribeRouteTables` | Token bucket of 100, refilled at 20/s | 1 call per subnet per VPC function, filtered | Low |
| `ec2:DescribeRegions` | Token bucket of 100, refilled at 20/s | 1 call total | Negligible |

The GetFunction limit (100 req/s per account per region) is the most relevant. In normal mode (with code download), each call is naturally paced by the zip download time (~1-5 calls/s). With --skip-code-download, calls are much faster and could approach the limit in regions with hundreds of non-VPC functions.

The ListFunctions 15 req/s limit is shared across all Lambda control plane APIs (excluding GetFunction and invocations). If other processes in the same account are making Lambda API calls concurrently, the combined rate could trigger throttling.

All API calls use exponential backoff with jitter on throttle responses (Throttling, TooManyRequestsException, RequestLimitExceeded), so hitting a limit causes slower scans rather than failures. Results are not lost — the tool retries up to 3 times before recording an error and continuing.
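A retry loop of this kind could look roughly like the sketch below. It is a generic illustration of exponential backoff with full jitter, not the tool's implementation: `ThrottledError` is a stand-in exception (real code would inspect the error code on botocore's `ClientError`), and the delay values are assumptions.

```python
import random
import time
from typing import Any, Callable

THROTTLE_CODES = {"Throttling", "TooManyRequestsException", "RequestLimitExceeded"}


class ThrottledError(Exception):
    """Stand-in for an AWS error response carrying an error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_backoff(
    call: Callable[[], Any],
    max_retries: int = 3,
    base_delay: float = 0.5,   # assumed value
    max_delay: float = 8.0,    # assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run call(), retrying throttle errors up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ThrottledError as exc:
            # Non-throttle errors and exhausted retries go back to the caller.
            if exc.code not in THROTTLE_CODES or attempt == max_retries:
                raise
            # Full jitter: sleep a random time up to the exponential cap.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

Randomizing the delay spreads retries from concurrent workers so they do not all hit the API again at the same instant.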

Scaling for Large Accounts

The tool includes several features designed for accounts with hundreds or thousands of Lambda functions:

Parallel region scanning — regions are scanned concurrently using a thread pool (default: 10 threads, configurable with --max-workers)
Parallel VPC analysis and endpoint detection — functions are processed concurrently within each stage
Hostname deduplication — if the same endpoint (e.g., api.stripe.com) appears across 50 functions, DNS resolution and geolocation happen once, not 50 times
Summary-only HTML mode — --summary-only skips per-function cards in the HTML report, keeping the file small and fast to generate for large fleets
Env-var-only mode — --skip-code-download avoids downloading deployment packages entirely, reducing scan time and data transfer by ~95%
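Hostname deduplication, as listed above, amounts to resolving each distinct hostname once and sharing the result across functions. A minimal sketch, assuming endpoints are grouped per function name; `resolve_unique_hosts` and `safe_resolve` are hypothetical helpers.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def safe_resolve(host: str) -> Optional[str]:
    """Resolve a hostname to an IPv4 address, or None if DNS lookup fails."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return None


def resolve_unique_hosts(
    endpoints_by_function: Dict[str, Iterable[str]],
    resolve: Callable[[str], Optional[str]] = safe_resolve,
) -> Dict[str, Optional[str]]:
    """Resolve each distinct hostname exactly once across all functions."""
    cache: Dict[str, Optional[str]] = {}
    for hosts in endpoints_by_function.values():
        for host in hosts:
            if host not in cache:
                cache[host] = resolve(host)
    return cache
```

The same cache could then key geolocation lookups, so each hostname costs at most one DNS query and one geolocation request per run.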

For a fleet of 1,000+ functions across 20+ regions, a recommended fast-scan command:

python -m lambda_auditor.cli --skip-code-download --no-map --summary-only --max-workers 20

This scans env vars only, skips geo-map rendering, produces a summary HTML, and uses 20 parallel threads. All data is ephemeral — nothing is persisted to disk beyond the final report files in the output directory.

Two-Pass Workflow: Summary Then Deep Scan

For large accounts, a two-pass approach gives you the best of both worlds — fast overview first, then targeted deep analysis:

# Pass 1: Fast summary of everything (seconds, minimal API calls)
python -m lambda_auditor.cli --skip-code-download --no-map --summary-only --max-workers 20

# Pass 2: Deep scan of specific functions by name pattern
python -m lambda_auditor.cli --filter "prod-payment-*" --max-workers 20

The --filter flag accepts comma-separated names or glob patterns (prod-*, *-api-*, payment-handler). The --exclude flag removes matching functions after filtering. Both can be combined — include first, then exclude. The tool lists all functions first (cheap API pagination), then filters by name before running the expensive analysis stages.
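The include-then-exclude behavior can be sketched with the standard library's `fnmatch` module. This is an illustration only: the helper names are hypothetical, and case-sensitive matching is an assumption about how the tool compares names.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional


def parse_patterns(value: Optional[str]) -> List[str]:
    """Split a comma-separated flag value into trimmed, non-empty patterns."""
    return [p.strip() for p in (value or "").split(",") if p.strip()]


def select_functions(
    names: Iterable[str],
    include: Optional[str] = None,
    exclude: Optional[str] = None,
) -> List[str]:
    """Apply --filter patterns first, then remove --exclude matches."""
    include_patterns = parse_patterns(include)
    exclude_patterns = parse_patterns(exclude)
    selected = []
    for name in names:
        # With no include patterns, every function is a candidate.
        if include_patterns and not any(fnmatchcase(name, p) for p in include_patterns):
            continue
        if any(fnmatchcase(name, p) for p in exclude_patterns):
            continue
        selected.append(name)
    return selected
```

An exact name such as `payment-handler` contains no wildcard characters, so the same matcher handles exact names and glob patterns without a separate code path.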

Example Output

You can generate example reports with randomized synthetic data (no AWS credentials needed):

python generate_example.py

This produces the following artifacts in the examples/ directory:

| File | Description |
| --- | --- |
| `audit_report.html` | Styled HTML report with summary, per-function cards, embedded geo-map, and errors |
| `audit_report.json` | Machine-readable JSON with full findings |
| `audit_report.pdf` | Multi-page PDF with summary, function details, geo-map, and errors |
| `example_geo_map.png` | World map with arc lines from Lambda regions to endpoint locations |

Report Preview

Geo-Map Visualization

The geo-map shows arc lines from each Lambda function's AWS region to the geographic location of each detected external endpoint, color-coded by function name.

Sample Data

The example generates 20 synthetic Lambda functions (~45% VPC-attached, ~55% non-VPC) with randomized VPC route analysis results, external endpoint detections, and geolocation data. The HTML report includes:

A summary table with total counts by classification and route status
Per-function cards showing VPC details (VPC ID, subnets, route status, gateway type) or detected external endpoints
An embedded geo-map visualization showing communication paths from AWS regions to endpoint locations worldwide
An errors section listing any simulated scan failures

To specify a custom output directory:

python generate_example.py ./my-examples
