Skip to content

PCA C and Python API#1987

Merged
rapids-bot[bot] merged 10 commits intorapidsai:mainfrom
aamijar:pca-c-api
Apr 15, 2026
Merged

PCA C and Python API#1987
rapids-bot[bot] merged 10 commits intorapidsai:mainfrom
aamijar:pca-c-api

Conversation

@aamijar
Copy link
Copy Markdown
Member

@aamijar aamijar commented Apr 2, 2026

Resolves #1977 and Resolves #1994

@aamijar aamijar self-assigned this Apr 2, 2026
@aamijar aamijar added non-breaking Introduces a non-breaking change feature request New feature or request labels Apr 2, 2026
@aamijar aamijar moved this to In Progress in Unstructured Data Processing Apr 2, 2026
Comment thread c/include/cuvs/preprocessing/pca.h Outdated
*/
enum cuvsPcaSolver {
/** Covariance + divide-and-conquer eigen decomposition */
PCA_COV_EIG_DQ = 0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enums should be CUVS prefixed

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 27f73ac

Comment thread c/include/cuvs/preprocessing/pca.h
Comment thread c/tests/preprocessing/run_pca_c.c Outdated
output_tensor.manager_ctx = NULL;
output_tensor.deleter = NULL;

cuvsPcaFitTransform(res,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an option to test fit() and predict() independantly too?

Comment thread c/tests/preprocessing/pca_c.cu Outdated
cudaMemcpyDeviceToDevice,
stream));

run_pca(n_rows,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sync stream before calling this function as there are no guarantee that the inputs are populated before pca::fit() is called.
Or you can create the C resource in this function and pass it to run_pca

Comment thread c/tests/preprocessing/pca_c.cu Outdated
raft::random::RngState rng(1234ULL);
raft::random::uniform(handle, rng, input.data(), n_rows * n_cols, -1.0f, 1.0f);

RAFT_CUDA_TRY(cudaMemcpyAsync(input_copy.data(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use C++ wrapper for memcpy and streamsync

Comment thread c/tests/preprocessing/run_pca_c.c Outdated
Comment on lines +33 to +47
DLManagedTensor input_tensor;
input_tensor.dl_tensor.data = input_data;
input_tensor.dl_tensor.device.device_type = kDLCUDA;
input_tensor.dl_tensor.device.device_id = 0;
input_tensor.dl_tensor.ndim = 2;
input_tensor.dl_tensor.dtype.code = kDLFloat;
input_tensor.dl_tensor.dtype.bits = 32;
input_tensor.dl_tensor.dtype.lanes = 1;
int64_t input_shape[2] = {n_rows, n_cols};
int64_t input_strides[2] = {1, n_rows};
input_tensor.dl_tensor.shape = input_shape;
input_tensor.dl_tensor.strides = input_strides;
input_tensor.dl_tensor.byte_offset = 0;
input_tensor.manager_ctx = NULL;
input_tensor.deleter = NULL;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use cuvs::core::detail::to_dlpack() for those? That should reduce a lot the number of lines so this function could fit in the same test file

Comment thread c/tests/preprocessing/run_pca_c.c Outdated
res, params, &trans_tensor, &comp_tensor, &sv_tensor, &mu_tensor, &output_tensor);

cuvsPcaParamsDestroy(params);
cuvsResourcesDestroy(res);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sync stream before destroying the resource, there is no guarantee that the work is complete

Copy link
Copy Markdown
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving CMake changes

@aamijar aamijar requested a review from a team as a code owner April 7, 2026 23:45
@aamijar aamijar requested a review from a team as a code owner April 8, 2026 00:18
@aamijar aamijar changed the title PCA C API PCA C and Python API Apr 8, 2026
@aamijar
Copy link
Copy Markdown
Member Author

aamijar commented Apr 13, 2026

Hi @lowener, I think all your comments have been addressed.

Copy link
Copy Markdown
Contributor

@lowener lowener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aamijar
Copy link
Copy Markdown
Member Author

aamijar commented Apr 15, 2026

/merge

@rapids-bot rapids-bot bot merged commit 81532b6 into rapidsai:main Apr 15, 2026
79 of 80 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Unstructured Data Processing Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C feature request New feature or request non-breaking Introduces a non-breaking change Python

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

[FEA] Python API for PCA [FEA] C API for PCA

4 participants