PyPangolin

Python Client for the Pangolin Data Catalog.

Installation

pip install pypangolin

For Delta Lake support:

pip install "pypangolin[delta]"

Quick Start

Core Client

from pypangolin import PangolinClient

# Connect to Pangolin
client = PangolinClient(uri="http://localhost:8080")
client.login("username", "password")

# Work with catalogs
catalogs = client.catalogs.list()
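
The quick-start snippet above hardcodes the credentials. A minimal sketch of keeping them out of source code follows. The environment variable names (`PANGOLIN_URI`, `PANGOLIN_USERNAME`, `PANGOLIN_PASSWORD`) and the helpers `load_settings` and `connect` are hypothetical and not part of PyPangolin; only `PangolinClient(uri=...)` and `client.login(...)` come from the example above.

```python
import os


def load_settings(env=None):
    """Read connection settings from environment variables.

    The variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    username = env.get("PANGOLIN_USERNAME")
    password = env.get("PANGOLIN_PASSWORD")
    if not username or not password:
        raise RuntimeError("PANGOLIN_USERNAME and PANGOLIN_PASSWORD must be set")
    return {
        # Fall back to the local URI used in the quick start.
        "uri": env.get("PANGOLIN_URI", "http://localhost:8080"),
        "username": username,
        "password": password,
    }


def connect(client_factory, settings):
    """Build a client with client_factory(uri=...) and log in, as in the quick start."""
    client = client_factory(uri=settings["uri"])
    client.login(settings["username"], settings["password"])
    return client
```

With PyPangolin installed, this would be called as `connect(PangolinClient, load_settings())`.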

Warehouse Management

# Create an S3 warehouse with custom configuration (e.g. MinIO)
warehouse = client.warehouses.create_s3(
    name="minio_warehouse",
    bucket="my-bucket",
    endpoint="http://minio:9000",
    access_key="minio",
    secret_key="minio123",
    # Pass extra S3 properties via kwargs
    **{"s3.path-style-access": "true"}
)
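
The extra S3 properties above are plain string key/value pairs passed as keyword arguments. As a sketch, a small helper can build that dictionary from Python values. `s3_properties` is a hypothetical helper, and apart from `s3.path-style-access` (taken from the example above) the naming rule it applies is an assumption, so check any generated key against what your warehouse actually accepts.

```python
def s3_properties(**options):
    """Convert keyword options into 's3.'-prefixed string properties.

    Underscores become hyphens and booleans become 'true'/'false', so
    path_style_access=True yields {'s3.path-style-access': 'true'}.
    """
    properties = {}
    for name, value in options.items():
        key = "s3." + name.replace("_", "-")
        if isinstance(value, bool):
            properties[key] = "true" if value else "false"
        else:
            properties[key] = str(value)
    return properties


extra = s3_properties(path_style_access=True)
print(extra)  # {'s3.path-style-access': 'true'}
```

The result can then be unpacked into the `create_s3` call with `**extra`, in place of the literal dictionary.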

PyIceberg Integration

from pypangolin import get_iceberg_catalog

catalog = get_iceberg_catalog("analytics", uri="http://localhost:8080", token="...")
table = catalog.load_table("sales.transactions")
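
Assuming `get_iceberg_catalog` returns a standard PyIceberg catalog (the section title suggests this, but it is not confirmed here), the loaded table can be read with PyIceberg's own scan API. The sketch below takes the catalog as a parameter. `preview_table` is a hypothetical helper, while `load_table`, `scan(limit=...)`, and `to_arrow()` are PyIceberg calls.

```python
def preview_table(catalog, identifier, limit=5):
    """Load a table and return up to `limit` rows as a PyArrow table.

    Assumes `catalog` behaves like a PyIceberg catalog.
    """
    table = catalog.load_table(identifier)
    # scan() builds a lazy read plan; to_arrow() executes it.
    return table.scan(limit=limit).to_arrow()
```

With the catalog from the example above, this would be called as `preview_table(catalog, "sales.transactions")`.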

Generic Assets (Delta Lake)

from pypangolin.assets import DeltaAsset

# Write data and register it in Pangolin automatically.
# `dataframe` is not defined in this snippet; it must already hold the data to write.
DeltaAsset.write(
    client,
    "analytics", "staging", "my_delta_table",
    dataframe,
    location="s3://bucket/path"
)

Documentation

Getting Started

Core Features

Table Formats & Assets

Advanced Features

Features

Full API Coverage - Complete support for all Pangolin REST API endpoints
PyIceberg Integration - Seamless Apache Iceberg table operations
Multi-Format Support - Delta, Hudi, Paimon, Parquet, CSV, JSON, and more
Git-like Operations - Branching, merging, tagging with conflict resolution
Governance - Role-based access control and business metadata
Federated Catalogs - Connect to remote Iceberg catalogs
Type-Safe - Pydantic models for all API responses
Audit Logging - Comprehensive audit logs with user attribution

Verification

To run the end-to-end verification suite against a local Pangolin + MinIO stack:

# Ensure MinIO is running and buckets exist
python3 scripts/ensure_buckets.py

# Run verification script
python3 scripts/verify_pypangolin_live.py
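
The two steps above can also be chained so that verification is skipped when bucket setup fails. This is a minimal sketch using only the standard library. `run_steps` is a hypothetical helper, and the script paths are the ones listed above, assumed to be relative to the repository root.

```python
import subprocess
import sys

# The two commands from this section, run with the current interpreter.
VERIFICATION_STEPS = [
    [sys.executable, "scripts/ensure_buckets.py"],
    [sys.executable, "scripts/verify_pypangolin_live.py"],
]


def run_steps(steps):
    """Run each command in order and stop at the first failure.

    Returns 0 if every step succeeds, otherwise the failing step's exit code.
    """
    for command in steps:
        completed = subprocess.run(command)
        if completed.returncode != 0:
            return completed.returncode
    return 0
```

Calling `sys.exit(run_steps(VERIFICATION_STEPS))` from the repository root would then propagate the first failing exit code to the shell.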

License

MIT