OpenProteinAI
diff --git a/.gitattributes b/.gitattributes
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 15 additions & 18 deletions
diff --git a/openprotein/embeddings/future.py b/openprotein/embeddings/future.py
Lines changed: 4 additions & 1 deletion
diff --git a/openprotein/embeddings/models.py b/openprotein/embeddings/models.py
Lines changed: 3 additions & 9 deletions
diff --git a/openprotein/embeddings/poet.py b/openprotein/embeddings/poet.py
Lines changed: 1 addition & 1 deletion
diff --git a/openprotein/embeddings/schemas.py b/openprotein/embeddings/schemas.py
Lines changed: 5 additions & 3 deletions
diff --git a/openprotein/jobs/schemas.py b/openprotein/jobs/schemas.py
Lines changed: 1 addition & 0 deletions
diff --git a/openprotein/models/__init__.py b/openprotein/models/__init__.py
Lines changed: 2 additions & 0 deletions
diff --git a/openprotein/models/foundation/boltzgen.py b/openprotein/models/foundation/boltzgen.py
Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,2 @@
+# GitHub syntax highlighting
+pixi.lock linguist-language=YAML linguist-generated=true
@@ -1,5 +1,5 @@
 [![PyPI version](https://badge.fury.io/py/openprotein-python.svg)](https://pypi.org/project/openprotein-python/)
-[![Coverage](https://dev.docs.openprotein.ai/api-python/_images/coverage.svg)](https://pypi.org/project/openprotein-python/)
+[![Coverage](https://docs.openprotein.ai/_static/coverage.svg)](https://pypi.org/project/openprotein-python/)
 [![Conda version](https://anaconda.org/openprotein/openprotein-python/badges/version.svg)](https://anaconda.org/openprotein/openprotein-python)
 
 
@@ -10,19 +10,19 @@ The OpenProtein.AI Python Interface provides a user-friendly library to interact
 
 # Table of Contents
 
-|   | Workflow                                           | Description                                          |
-|---|----------------------------------------------------|------------------------------------------------------|
-| 0 | [`Quick start`](#Quick-start)                    | Quick start guide                     |
-| 1 | [`Installation`](https://docs.openprotein.ai/api-python/installation.html)                    | Install guide for pip and conda.                     |
-| 2 | [`Session management`](https://docs.openprotein.ai/api-python/overview.html)        | An overview of the OpenProtein Python Client & the asynchronous jobs system. |
-| 3 | [`Asssay-based Sequence Learning`](https://docs.openprotein.ai/api-python/core_workflow.html) | Covers core tasks such as data upload, model training & prediction, and sequence design. |
-| 4 | [`De Novo prediction & generative models (PoET)`](https://docs.openprotein.ai/api-python/poet_workflow.html) | Covers PoET, a protein LLM for *de novo* scoring, as well as sequence generation. |
-| 5 | [`Protein Language Models & Embeddings`](https://docs.openprotein.ai/api-python/embedding_workflow.html) | Covers methods for creating sequence embeddings with proprietary & open-source models. |
+|   | Workflow                                                                                                     | Description                                                                              |
+|---|--------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|
+| 0 | [`Quick start`](#Quick-start)                                                                                | Quick start guide                                                                        |
+| 1 | [`Installation`](https://docs.openprotein.ai/python-api/installation.html)                                   | Install guide for pip and conda.                                                         |
+| 2 | [`Session management`](https://docs.openprotein.ai/python-api/index.html)                                    | An overview of the OpenProtein Python Client & the asynchronous jobs system.             |
+| 3 | [`Property-Regression-Models`](https://docs.openprotein.ai/python-api/property-regression-models/index.html) | Covers core tasks such as data upload, model training & prediction, and sequence design. |
+| 4 | [`De Novo prediction & generative models (PoET)`](https://docs.openprotein.ai/python-api/poet/index.html)    | Covers PoET, a protein LLM for *de novo* scoring, as well as sequence generation.        |
+| 5 | [`Foundational models`](https://docs.openprotein.ai/python-api/foundation-models/index.html)                 | Covers methods for creating sequence embeddings with proprietary & open-source models.   |
 
 
 # Quick-start
 
-Get started with our quickstart README! You can peruse the [official documentation](https://docs.openprotein.ai/api-python/) for more details!
+Get started with our quickstart README! You can peruse the [official documentation](https://docs.openprotein.ai/python-api/) for more details!
 ## Installation 
 
 To install the python interface using pip, run the following command: 
@@ -37,16 +37,13 @@ conda install -c openprotein openprotein-python
 
 ### Requirements
 
-- Python 3.8 or higher.
-- pydantic version 1.0 or newer.
-- requests version 2.0 or newer.
-- tqdm version 4.0 or newer.
-- pandas version 1.0 or newer.
+- Python 3.10 or higher.
+- Other dependencies as in `pyproject.toml`
 
 # Getting started
 
 
-Read on below for the quick-start guide, or see the [docs](https://docs.openprotein.ai/api-python/) for more information!
+Read on below for the quick-start guide, or see the [docs](https://docs.openprotein.ai/python-api/) for more information!
 
 To begin, create a session using your login credentials.
 ```
@@ -57,10 +54,10 @@ session = openprotein.connect(USERNAME, PASSWORD)
 ```
 ## Job Status
 
-The interface offers `AsyncJobFuture` objects for asynchronous calls, allowing tracking of job status and result retrieval when ready. Given a future, you can check its status and retrieve results.
+The interface offers `Future` objects for asynchronous calls, allowing tracking of job status and result retrieval when ready. Given a future, you can check its status and retrieve results.
 
 ### Checking Job Status
-Check the status of an `AsyncJobFuture` using the following methods:
+Check the status of a `Future` using the following methods:
 ```
 future.refresh()  # call the backend to update the job status
 future.done()     # returns True if the job is done, meaning the status could be SUCCESS, FAILED, or CANCELLED
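The README text above describes checking a future with `refresh()` and `done()`. A minimal polling sketch built on those two calls might look like the following. The `wait_until_done` helper, its `poll_interval` and `timeout` parameters, and the `FakeFuture` stand-in are all hypothetical and are not part of the library; the stand-in exists only so the snippet runs without a backend.

```python
import time


def wait_until_done(future, poll_interval=1.0, timeout=60.0):
    """Poll until future.done() is True or the timeout elapses (hypothetical helper)."""
    deadline = time.monotonic() + timeout
    while True:
        future.refresh()  # ask the backend for the latest job status
        if future.done():  # True for SUCCESS, FAILED, or CANCELLED
            return True
        if time.monotonic() >= deadline:
            return False  # gave up; the job may still be running
        time.sleep(poll_interval)


class FakeFuture:
    """Stand-in for a real future: reports done after a fixed number of refreshes."""

    def __init__(self, refreshes_needed):
        self.refreshes_needed = refreshes_needed
        self.refresh_count = 0

    def refresh(self):
        self.refresh_count += 1

    def done(self):
        return self.refresh_count >= self.refreshes_needed


fake = FakeFuture(refreshes_needed=3)
finished = wait_until_done(fake, poll_interval=0.0, timeout=5.0)
```

Note that `done()` returning True only means the job reached a terminal state, so a caller would still need to inspect the status before fetching results.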
 
@@ -114,7 +114,10 @@ def sequences(self) -> list[bytes] | list[str]:
         return self._sequences
 
     def stream(self) -> Generator:
-        if self.job_type == JobType.poet_generate:
+        if (
+            self.job_type == JobType.poet_generate
+            or self.job_type == JobType.embeddings_generate
+        ):
             stream = api.request_get_generate_result(
                 session=self.session, job_id=self.id
             )
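The chained `or` comparison in this hunk could also be written as a membership test, which keeps the condition on one line if more streamed job types are added later. The sketch below uses a stand-in `JobType` enum; the `embeddings_*` values are copied from the `JobType` hunk in this diff, while the `poet_generate` value is a placeholder assumption, and `uses_generate_stream` is a hypothetical helper name.

```python
from enum import Enum


class JobType(str, Enum):
    """Stand-in enum with only the members needed for this sketch."""

    poet_generate = "/poet/generate"  # placeholder value, not taken from the diff
    embeddings_embed_reduced = "/embeddings/embed_reduced"
    embeddings_generate = "/embeddings/generate"


# Job types whose results are read through the generate-result stream.
STREAMED_JOB_TYPES = (JobType.poet_generate, JobType.embeddings_generate)


def uses_generate_stream(job_type):
    """Return True if the job type is one of the streamed generate types."""
    return job_type in STREAMED_JOB_TYPES
```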
 
@@ -327,19 +327,13 @@ def fit_umap(
                 "Expected either assay or sequences to fit UMAP on!"
             )
         # get assay_id
-        assay_id = (
-            assay.assay_id
-            if isinstance(assay, AssayMetadata)
-            else assay.id if isinstance(assay, AssayDataset) else assay
-        )
-        model_id = self.id
         return umap_api.fit_umap(
-            model_id=model_id,
+            model=self,
+            reduction=reduction,
             feature_type=FeatureType.PLM,
             sequences=sequences,
-            assay_id=assay_id,
+            assay=assay,
             n_components=n_components,
-            reduction=reduction,
             **kwargs,
         )
 
 
@@ -369,7 +369,7 @@ def fit_umap(
         sequences: list[bytes] | list[str] | None = None,
         assay: AssayDataset | None = None,
         n_components: int = 2,
-        reduction: ReductionType | None = ReductionType.MEAN,
+        reduction: ReductionType = ReductionType.MEAN,
         **kwargs,
     ) -> "UMAPModel":
         """
 
@@ -37,8 +37,8 @@ def __getitem__(self, i):
 
 class EmbeddingsJob(Job, BatchJob):
 
-    job_type: Literal[JobType.embeddings_embed, JobType.embeddings_embed_reduced] = Field(
-        default=JobType.embeddings_embed
+    job_type: Literal[JobType.embeddings_embed, JobType.embeddings_embed_reduced] = (
+        Field(default=JobType.embeddings_embed)
     )
 
 
@@ -75,4 +75,6 @@ class ScoreSingleSiteJob(Job, BatchJob):
 
 class GenerateJob(Job, BatchJob):
 
-    job_type: Literal[JobType.poet_generate] = Field(default=JobType.poet_generate)
+    job_type: Literal[JobType.poet_generate, JobType.embeddings_generate] = Field(
+        default=JobType.poet_generate
+    )
@@ -46,6 +46,7 @@ class JobType(str, Enum):
     embeddings_attn = "/embeddings/attn"
     embeddings_logits = "/embeddings/logits"
     embeddings_embed_reduced = "/embeddings/embed_reduced"
+    embeddings_generate = "/embeddings/generate"
 
     svd_fit = "/svd/fit"
     svd_embed = "/svd/embed"
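Because `JobType` mixes in `str`, the new member can be looked up from the raw string a backend would return and compares equal to that string. A small sketch of that behaviour with a stand-in enum (only two members, values copied from the hunk above):

```python
from enum import Enum


class JobType(str, Enum):
    """Stand-in enum with two of the members shown in the hunk above."""

    embeddings_embed_reduced = "/embeddings/embed_reduced"
    embeddings_generate = "/embeddings/generate"


# Look up a member from its raw string value.
parsed = JobType("/embeddings/generate")

# An unknown value raises ValueError instead of returning None.
try:
    JobType("/embeddings/unknown")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```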
 
@@ -1,4 +1,6 @@
 """Make the ModelsAPI class available on the package."""
 
+from .foundation.boltzgen import BoltzGenFuture, BoltzGenJob
+from .foundation.proteinmpnn import ProteinMPNNModel
 from .foundation.rfdiffusion import RFdiffusionFuture, RFdiffusionJob
 from .models import ModelsAPI
@@ -0,0 +1,192 @@
+"""BoltzGen model for protein structure and sequence design."""
+
+from typing import Any, BinaryIO, Literal
+
+from pydantic import BaseModel, Field
+
+from openprotein.base import APISession
+from openprotein.common import ModelMetadata
+from openprotein.common.model_metadata import ModelDescription
+from openprotein.jobs import Future, Job
+from openprotein.models.base import ProteinModel
+from openprotein.protein import Protein
+
+
+class BoltzGenRequest(BaseModel):
+    """Specification for a BoltzGen request."""
+
+    n: int = 1
+    # protein: Protein
+    structure_text: str | None = None
+    design_spec: dict[str, Any]
+    diffusion_batch_size: int | None = None
+    step_scale: float | None = None
+    noise_scale: float | None = None
+
+
+class BoltzGenJob(Job):
+    """Job schema for a BoltzGen request."""
+
+    job_type: Literal["/models/boltzgen"]
+
+
+class BoltzGenFuture(Future):
+    """Future for handling the results of a BoltzGen job."""
+
+    job: BoltzGenJob
+
+    def get_pdb(self, replicate: int = 0) -> str:
+        """
+        Retrieve the PDB file for a specific replicate.
+
+        Args:
+            replicate (int): The 0-based index of the design replicate to retrieve.
+
+        Returns:
+            str: The content of the PDB file as a string.
+        """
+        return _boltzgen_api_result_get(
+            session=self.session, job_id=self.id, replicate=replicate
+        )
+
+    def get(self, replicate: int = 0):
+        """Default result accessor; returns the PDB for `replicate` (the first by default)."""
+        # TODO handle different design index
+        return self.get_pdb(replicate=replicate)
+
+
+def _boltzgen_api_post(
+    session: APISession, request: BoltzGenRequest, **kwargs
+) -> BoltzGenJob:
+    """
+    POST a request for BoltzGen design.
+
+    Returns a Job object that can be used to retrieve results later.
+    """
+    endpoint = "v1/design/models/boltzgen"
+    body = request.model_dump(exclude_none=True)
+    body.update(kwargs)
+    response = session.post(endpoint, json=body)
+    return BoltzGenJob.model_validate(response.json())
+
+
+def _boltzgen_api_get_metadata(session: APISession) -> ModelMetadata:
+    """
+    GET the metadata for the BoltzGen model.
+
+    Returns the ModelMetadata parsed from the response.
+    """
+    endpoint = "v1/design/models/boltzgen"
+    response = session.get(endpoint)
+    return ModelMetadata.model_validate(response.json())
+
+
+def _boltzgen_api_result_get(
+    session: APISession, job_id: str, replicate: int = 0
+) -> str:
+    """
+    GET the result of a BoltzGen design job.
+
+    Returns the response body as text (the PDB content) for the given replicate.
+    """
+    endpoint = f"v1/design/{job_id}/results"
+    response = session.get(endpoint, params={"replicate": replicate})
+    return response.text
+
+
+class BoltzGenModel(ProteinModel):
+    """
+    BoltzGen model for generating de novo protein structures.
+
+    This model supports functionalities like unconditional design, scaffolding,
+    and binder design.
+    """
+
+    model_id: str = "boltzgen"
+
+    def __init__(self, session: APISession, model_id: str = "boltzgen"):
+        # The model_id from the API might be more specific, e.g., "boltzgen-v1.1"
+        super().__init__(session, model_id)
+
+    def get_metadata(self) -> ModelMetadata:
+        return ModelMetadata(
+            model_id="boltzgen",
+            description=ModelDescription(summary="BoltzGen"),
+            dimension=0,
+            output_types=["pdb"],
+            input_tokens=[],
+            token_descriptions=[[]],
+        )
+
+    def generate(
+        self,
+        design_spec: dict[str, Any],
+        structure_file: str | bytes | BinaryIO | None = None,
+        n: int = 1,
+        diffusion_batch_size: int | None = None,
+        step_scale: float | None = None,
+        noise_scale: float | None = None,
+        **kwargs,
+    ) -> BoltzGenFuture:
+        """
+        Run a protein structure generation job using BoltzGen.
+
+        Parameters
+        ----------
+        design_spec : dict[str, Any]
+            The BoltzGen design specification to run. This is the Python representation
+            of the BoltzGen yaml request specification.
+        structure_file : str | bytes | BinaryIO, optional
+            An input PDB file (as text, bytes, or a binary file-like object) used for
+            inpainting or other guided design tasks with a partial existing structure.
+        n : int, optional
+            The number of unique design trajectories to run (default is 1).
+        diffusion_batch_size : int, optional
+            The batch size for diffusion sampling. Controls how many samples are
+            processed in parallel during the diffusion process.
+        step_scale : float, optional
+            Scaling factor for the number of diffusion steps. Higher values may
+            improve quality at the cost of longer generation time.
+        noise_scale : float, optional
+            Scaling factor for the noise schedule during diffusion. Controls the
+            amount of noise added at each step of the reverse diffusion process.
+
+        Other Parameters
+        ----------------
+        **kwargs : dict
+            Additional keyword args that are passed directly to the boltzgen
+            inference script. Overwrites any preceding options.
+
+        Returns
+        -------
+        BoltzGenFuture
+            A future object that can be used to retrieve the results of the design
+            job upon completion.
+        """
+        request = BoltzGenRequest(
+            n=n,
+            design_spec=design_spec,
+            diffusion_batch_size=diffusion_batch_size,
+            step_scale=step_scale,
+            noise_scale=noise_scale,
+        )
+        if structure_file is not None:
+            if isinstance(structure_file, bytes):
+                structure_text = structure_file.decode()
+            elif isinstance(structure_file, str):
+                structure_text = structure_file
+            else:
+                structure_text = structure_file.read().decode()
+            request.structure_text = structure_text
+
+        # Submit the job via the private API function
+        job = _boltzgen_api_post(
+            session=self.session,
+            request=request,
+            **kwargs,
+        )
+
+        # Return the future object
+        return BoltzGenFuture(session=self.session, job=job)
+
+    predict = generate
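In `generate`, the file-like branch calls `.read().decode()`, which would fail if a caller passed a file opened in text mode, since `read()` then returns `str`. A more tolerant normalization could look like the sketch below. `read_structure_text` and its `encoding` parameter are hypothetical and not part of this PR; the UTF-8 default is an assumption.

```python
import io
from typing import BinaryIO, TextIO, Union

StructureInput = Union[str, bytes, BinaryIO, TextIO]


def read_structure_text(structure_file: StructureInput, encoding: str = "utf-8") -> str:
    """Return structure content as text from a str, bytes, or file-like input."""
    if isinstance(structure_file, str):
        return structure_file  # already text
    if isinstance(structure_file, (bytes, bytearray)):
        return bytes(structure_file).decode(encoding)
    data = structure_file.read()  # file-like: may yield bytes or str
    if isinstance(data, (bytes, bytearray)):
        return bytes(data).decode(encoding)
    return data


pdb_line = "ATOM      1  N   MET A   1"
from_text = read_structure_text(pdb_line)
from_bytes = read_structure_text(pdb_line.encode())
from_binary_file = read_structure_text(io.BytesIO(pdb_line.encode()))
from_text_file = read_structure_text(io.StringIO(pdb_line))
```

All four input forms yield the same string, so the request's `structure_text` field would receive identical content regardless of how the caller supplied the structure.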