Commit 34dbb0f

docs: update Python package README.md
1 parent 4c86cbb commit 34dbb0f

1 file changed: +48 -8 lines changed

packages/backend/README.md

Lines changed: 48 additions & 8 deletions
@@ -1,17 +1,57 @@
1-
# embedding-atlas
1+
# Embedding Atlas
22

3-
A Python package that provides a command line tool to visualize a dataset with embeddings.
3+
A Python package that provides a command line tool to visualize a dataset with embeddings. It also includes a Jupyter widget and a Streamlit widget.
44

5-
## Development Setup
5+
- Documentation: https://apple.github.io/embedding-atlas
6+
- GitHub: https://github.com/apple/embedding-atlas
67

7-
Install the [uv](https://github.com/astral-sh/uv) package manager.
8+
## Installation
89

9-
To launch the command line tool, run it with `uv run`.
10+
```bash
11+
pip install embedding-atlas
12+
```
13+
14+
Then launch the command line tool:
15+
16+
```bash
17+
embedding-atlas [OPTIONS] INPUTS...
18+
```
19+
20+
## Loading Data
21+
22+
You can load your data in two ways: locally or from Hugging Face.
23+
24+
### Loading Local Data
25+
26+
To get started with your own data, run:
27+
28+
```bash
29+
embedding-atlas path_to_dataset.parquet
30+
```
31+
32+
### Loading Hugging Face Data
33+
34+
You can instead load datasets from Hugging Face:
35+
36+
```bash
37+
embedding-atlas huggingface_org/dataset_name
38+
```
39+
40+
## Visualizing Embedding Projections
41+
42+
To visualize embedding projections, pre-compute the X and Y coordinates and specify their column names with `--x` and `--y`, for example:
1043

1144
```bash
12-
uv run embedding-atlas --help
45+
embedding-atlas path_to_dataset.parquet --x projection_x --y projection_y
1346
```
1447

15-
Run `./start.sh` to launch a test server with an online dataset.
48+
You may use the [SentenceTransformers](https://sbert.net/) package to compute high-dimensional embeddings from text data, and then use the [UMAP](https://umap-learn.readthedocs.io/en/latest/index.html) package to compute 2D projections.
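
As a rough sketch of that workflow, the function below embeds a text column with SentenceTransformers, projects the embeddings to 2D with UMAP, and writes the result back to Parquet. The function name `add_projection`, the `text` column name, and the model name `all-MiniLM-L6-v2` are assumptions made for illustration; they are not part of Embedding Atlas itself.

```python
def add_projection(input_path, output_path, text_column="text"):
    """Embed a text column and store 2D UMAP coordinates as projection_x / projection_y."""
    # Imported lazily because these are heavy third-party dependencies
    # (pandas, sentence-transformers, umap-learn, plus a Parquet engine such as pyarrow).
    import pandas as pd
    import umap
    from sentence_transformers import SentenceTransformer

    df = pd.read_parquet(input_path)

    # Assumption: "all-MiniLM-L6-v2" is only one possible model choice;
    # any SentenceTransformers model could be substituted.
    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(df[text_column].tolist())

    # Reduce the high-dimensional embeddings to two dimensions.
    projection = umap.UMAP(n_components=2).fit_transform(embeddings)

    # Column names match the --x / --y values used in the command above.
    df["projection_x"] = projection[:, 0]
    df["projection_y"] = projection[:, 1]
    df.to_parquet(output_path)
    return df
```

Calling `add_projection("path_to_dataset.parquet", "path_to_dataset_projected.parquet")` (hypothetical file names) should produce a file that can be passed to the command line tool with `--x projection_x --y projection_y`.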
49+
50+
You may also specify a column for pre-computed nearest neighbors:
51+
52+
```bash
53+
embedding-atlas path_to_dataset.parquet --x projection_x --y projection_y --neighbors neighbors
54+
```
1655

17-
To build the wheel, run `./build.sh`.
56+
The `neighbors` column should have values in the following format: `{"ids": [id1, id2, ...], "distances": [d1, d2, ...]}`.
57+
If this column is specified, you'll be able to see nearest neighbors for a selected point in the tool.
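
To illustrate the shape of that column, here is a minimal brute-force sketch that builds one `{"ids": [...], "distances": [...]}` value per row from a list of 2D points. It assumes that ids are zero-based row indices, that distances are Euclidean, and that a point is not listed as its own neighbor; the README does not specify any of these, so check them against the documentation before relying on them. The helper name `build_neighbors` is hypothetical.

```python
import math


def build_neighbors(points, k=2):
    """Return one {"ids": [...], "distances": [...]} dict per point."""
    rows = []
    for i, p in enumerate(points):
        # Distance from point i to every other point (self excluded: an assumption).
        candidates = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        candidates.sort()
        nearest = candidates[:k]
        rows.append({
            "ids": [j for _, j in nearest],
            "distances": [d for d, _ in nearest],
        })
    return rows


points = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
neighbors = build_neighbors(points, k=2)
print(neighbors[0])
```

Brute force is quadratic in the number of rows, so for larger datasets an approximate nearest-neighbor library would be the more practical choice.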
