Merged
1 change: 1 addition & 0 deletions docs/md/SUMMARY.md
@@ -54,6 +54,7 @@
- [Overview](./explanation/python.md)
- [Installation](./how_to/python/installation.md)
- [Loading data into a `Table`](./how_to/python/table.md)
- [`pandas`, `polars` and `pyarrow` integration](./how_to/python/table_data.md)
- [Callbacks and events](./how_to/python/callbacks.md)
- [Multithreading](./how_to/python/multithreading.md)
- [Hosting a WebSocket server](./how_to/python/websocket.md)
81 changes: 81 additions & 0 deletions docs/md/how_to/python/table_data.md
@@ -0,0 +1,81 @@
# DataFrame and Arrow Compatibility

`perspective-python` accepts data from any of the common Python columnar data
libraries as a `Table` constructor argument. In all three cases (`pyarrow`,
`polars` and `pandas`), `perspective.table` (and `Table.update()`) consume the
input directly, so there is no need to serialize to Apache Arrow IPC bytes
yourself. However, note that Arrow is still the most efficient way to bulk
load data into a `Table`.

## PyArrow

```python
import pyarrow as pa
import perspective

arrow_table = pa.table({
    "int": pa.array([1, 2, 3], type=pa.int64()),
    "float": pa.array([1.5, 2.5, 3.5], type=pa.float64()),
    "string": pa.array(["a", "b", "c"], type=pa.string()),
})

table = perspective.table(arrow_table)
```

The same applies to `Table.update()`:

```python
table.update(arrow_table)
```

If you have Arrow data already in IPC format (e.g. read from disk, received
over the wire, or produced by another tool), pass the raw `bytes` directly —
both stream and file formats are auto-detected:

```python
with open("data.arrow", "rb") as f:
    table = perspective.table(f.read())
```
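
How the auto-detection works is not described here. As a rough illustration only, the two IPC formats can be told apart by the `ARROW1` magic string that opens and closes the Arrow IPC file format. The helper below, `guess_ipc_format`, is a hypothetical name and is not part of the `perspective` API; Perspective's own detection logic may differ.

```python
# Hypothetical helper for illustration; not part of the perspective API.
ARROW_FILE_MAGIC = b"ARROW1"  # magic string defined by the Arrow IPC file format


def guess_ipc_format(data: bytes) -> str:
    """Return "file" if `data` looks like an Arrow IPC file, else "stream"."""
    # The IPC file format starts and ends with the magic string;
    # the streaming format carries no such marker.
    if data.startswith(ARROW_FILE_MAGIC) and data.endswith(ARROW_FILE_MAGIC):
        return "file"
    return "stream"
```

A check like this only inspects the framing bytes; it does not validate that the payload is well-formed Arrow data.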

## Polars

```python
import polars as pl
import perspective

df = pl.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": ["x", "y", "z", "x", "y"],
})

table = perspective.table(df)
```

Internally, the `DataFrame` is converted to a `pyarrow.Table` before
ingestion, so Polars columns are typed the same way as the equivalent PyArrow
input above.

See also Perspective's [Virtual Server support for `polars.DataFrame`](./virtual_server/polars.md).

## Pandas

`pandas.DataFrame` is supported via `pyarrow.Table.from_pandas`, which
dictates behavior including type support — see the
[pyarrow pandas docs](https://arrow.apache.org/docs/python/pandas.html) for
details on which pandas dtypes round-trip cleanly.

```python
from datetime import date, datetime
import numpy as np
import pandas as pd
import perspective

data = pd.DataFrame({
    "int": np.arange(100),
    "float": [i * 1.5 for i in range(100)],
    "bool": [True for i in range(100)],
    "date": [date.today() for i in range(100)],
    "datetime": [datetime.now() for i in range(100)],
    "string": [str(i) for i in range(100)],
})

table = perspective.table(data, index="float")
```
12 changes: 11 additions & 1 deletion rust/perspective-client/src/rust/config/expressions.rs
@@ -259,7 +259,7 @@ pub struct CompletionItemSuggestion {
}

#[doc(hidden)]
-pub static COMPLETIONS: [CompletionItemSuggestion; 77] = [
+pub static COMPLETIONS: [CompletionItemSuggestion; 79] = [
CompletionItemSuggestion {
label: "var",
insert_text: "var ${1:x := 1}",
@@ -537,6 +537,16 @@ pub static COMPLETIONS: [CompletionItemSuggestion; 77] = [
insert_text: "is_not_null(${1:x})",
documentation: "Whether x is not a null value",
},
CompletionItemSuggestion {
label: "coalesce",
insert_text: "coalesce(${1:x}, ${2:y})",
documentation: "Returns the first non-null argument.",
},
CompletionItemSuggestion {
label: "contains",
insert_text: "contains(${1:x}, ${2:'substr'})",
documentation: "Whether the string column or value contains the literal substring.",
},
CompletionItemSuggestion {
label: "not",
insert_text: "not(${1:x})",
106 changes: 106 additions & 0 deletions rust/perspective-js/test/js/expressions/numeric.spec.js
@@ -1262,6 +1262,112 @@ function validate_binary_operations(output, expressions, operator) {
await table.delete();
});

test("coalesce returns first non-null arg", async function () {
const table = await perspective.table({
a: "integer",
b: "integer",
});

const view = await table.view({
expressions: {
coalesce_ab: 'coalesce("a", "b")',
coalesce_ab_default: 'coalesce("a", "b", 99)',
},
});

await table.update({
a: [1, null, null, 4],
b: [10, 20, null, 40],
});

const result = await view.to_columns();
expect(result["coalesce_ab"]).toEqual([1, 20, null, 4]);
expect(result["coalesce_ab_default"]).toEqual([1, 20, 99, 4]);
await view.delete();
await table.delete();
});
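
The row-wise behaviour this test checks can be modelled in plain Python. This is an illustrative sketch only, with `None` standing in for null; `coalesce_rows` is a hypothetical name, not a Perspective function, and the trailing literal of the expression is modelled here as a `default` keyword argument.

```python
from typing import Any, List, Optional


def coalesce_rows(*columns: List[Any], default: Optional[Any] = None) -> List[Any]:
    """For each row, return the first non-None value across `columns`."""
    result = []
    for row in zip(*columns):
        # Take the first non-null value in the row, else fall back to `default`.
        result.append(next((v for v in row if v is not None), default))
    return result


a = [1, None, None, 4]
b = [10, 20, None, 40]
print(coalesce_rows(a, b))              # [1, 20, None, 4]
print(coalesce_rows(a, b, default=99))  # [1, 20, 99, 4]
```

The two printed lists match the values asserted for `coalesce_ab` and `coalesce_ab_default` in the test.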

test("coalesce promotes mixed numeric inputs to float", async function () {
const table = await perspective.table({
i: "integer",
f: "float",
});

const view = await table.view({
expressions: {
coalesce_if: 'coalesce("i", "f")',
coalesce_if_default: 'coalesce("i", "f", 0.5)',
},
});

await table.update({
i: [1, null, null, 4],
f: [null, 2.5, null, 4.5],
});

const result = await view.to_columns();
const schema = await view.expression_schema();
expect(schema["coalesce_if"]).toEqual("float");
expect(schema["coalesce_if_default"]).toEqual("float");
expect(result["coalesce_if"]).toEqual([1, 2.5, null, 4]);
expect(result["coalesce_if_default"]).toEqual([1, 2.5, 0.5, 4]);

await view.delete();
await table.delete();
});

test("coalesce with all-null inputs returns null", async function () {
const table = await perspective.table({
a: "integer",
b: "integer",
});

const view = await table.view({
expressions: { coalesce_nulls: 'coalesce("a", "b")' },
});

await table.update({
a: [null, null, null],
b: [null, null, null],
});

const result = await view.to_columns();
expect(result["coalesce_nulls"]).toEqual([null, null, null]);
await view.delete();
await table.delete();
});

test("coalesce fails validation for incompatible types", async function () {
const table = await perspective.table({
a: "integer",
b: "string",
});

const validated = await table.validate_expressions([
'coalesce("a", "b")',
"coalesce(\"a\", 'fallback')",
]);

expect(validated.expression_schema).toEqual({});
expect(validated.errors['coalesce("a", "b")']).toEqual({
column: 0,
line: 0,
error_message:
"Type Error - inputs do not resolve to a valid expression.",
});

expect(validated.errors["coalesce(\"a\", 'fallback')"]).toEqual(
{
column: 0,
line: 0,
error_message:
"Type Error - inputs do not resolve to a valid expression.",
},
);

await table.delete();
});
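
Taken together, the `coalesce` tests suggest a simple rule for the result type over column inputs: mixed `integer` and `float` inputs yield `float`, while mixing a numeric input with a `string` input is rejected at validation. A minimal Python sketch of that rule follows. `coalesce_result_type` is a hypothetical name, and passing identical input types through unchanged is an assumption rather than something these tests assert.

```python
from typing import Sequence

NUMERIC = {"integer", "float"}
TYPE_ERROR = "Type Error - inputs do not resolve to a valid expression."


def coalesce_result_type(arg_types: Sequence[str]) -> str:
    """Model the output type of coalesce for the given input type names."""
    if not arg_types:
        raise ValueError("coalesce needs at least one argument")
    unique = set(arg_types)
    if len(unique) == 1:
        # Assumption: identical input types pass through unchanged.
        return arg_types[0]
    if unique <= NUMERIC:
        # Mixed integer/float inputs are promoted to float.
        return "float"
    # Anything else (e.g. integer with string) fails validation.
    raise TypeError(TYPE_ERROR)
```

For example, `coalesce_result_type(["integer", "float"])` returns `"float"`, and `coalesce_result_type(["integer", "string"])` raises a `TypeError` carrying the same message the test expects.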

test("null", async function () {
const table = await perspective.table({
a: "integer",
24 changes: 24 additions & 0 deletions rust/perspective-js/test/js/expressions/string.spec.js
@@ -371,6 +371,30 @@ const random_string = (
table.delete();
});

test("Coalesce strings", async function () {
const table = await perspective.table({
a: ["ABC", null, null, "HIjK", null],
b: ["xyz", "DEF", null, "stu", null],
});
const view = await table.view({
expressions: {
coalesce_str: 'coalesce("a", "b", \'N/A\')',
},
});
const result = await view.to_columns();
const schema = await view.expression_schema();
expect(schema["coalesce_str"]).toEqual("string");
expect(result["coalesce_str"]).toEqual([
"ABC",
"DEF",
"N/A",
"HIjK",
"N/A",
]);
view.delete();
table.delete();
});

test("Concat", async function () {
const table = await perspective.table({
a: ["abc", "deeeeef", "fg", "hhs", "abcdefghijk"],