FIX: Allow sparse data on TSNE #2827

david-cortes-intel · 2025-12-05T16:31:50Z

Description

The docs say that TSNE does not support sparse inputs, and there are checks against them.

But it turns out that TSNE doesn't use the actual data directly. Instead, it has some preliminary stages where it calculates distances, neighbors, or PCA, and the results from those are then used to generate the arrays that are passed to the actual TSNE algorithm. Those arrays are always dense, regardless of how the data comes in originally, so in a way sklearnex already has partial support for sparse data on TSNE.

This PR corrects the docs and removes the checks for sparsity.
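
As a rough illustration of the point above, here is a minimal sketch that uses stock scikit-learn's `TSNE` as a stand-in (the sklearnex code path is not shown and is only assumed to follow the same shape). The toy data, the `perplexity` value, and the choice of `init="random"` are assumptions made for the example; `init="random"` is used because PCA initialization on sparse input depends on the scikit-learn version. The sketch shows that a distance computation on a sparse matrix already yields a dense array, and that the embedding produced from sparse input is dense as well.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.manifold import TSNE
from sklearn.metrics import pairwise_distances

# Toy data (an assumption for this sketch): mostly zeros, stored as CSR.
rng = np.random.default_rng(0)
X_dense = rng.standard_normal((60, 20))
X_dense[X_dense < 0.5] = 0.0  # zero out most entries
X_sparse = sp.csr_matrix(X_dense)

# A preliminary stage such as a pairwise distance computation returns a
# dense array even though the input is sparse.
distances = pairwise_distances(X_sparse, metric="euclidean")

# Fitting on the sparse matrix directly; random init avoids the
# version-dependent PCA initialization path.
tsne = TSNE(n_components=2, init="random", perplexity=10.0, random_state=0)
embedding = tsne.fit_transform(X_sparse)

print(type(distances).__name__, distances.shape)
print(type(embedding).__name__, embedding.shape)
```

Both printed objects are plain NumPy arrays, which is why the core algorithm never needs to handle a sparse container itself.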

Checklist:

Completeness and readability

I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.

codecov · 2025-12-08T11:10:58Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

| Flag | Coverage Δ | |
|---|---|---|
| azure | `81.19% <ø> (+0.75%)` | ⬆️ |
| github | `82.82% <ø> (+0.77%)` | ⬆️ |

Flags with carried forward coverage won't be shown.
see 4 files with indirect coverage changes

david-cortes-intel · 2025-12-08T11:21:33Z

/azp run Nightly

azure-pipelines · 2025-12-08T11:21:43Z

Azure Pipelines successfully started running 1 pipeline(s).

Vika-F · 2025-12-09T10:03:13Z

sklearn's test_t_sne.py doesn't have a case for testing sparse PCA initialization, only for testing sparse precomputed distances.
I think it is worth adding a test on our side to cover this branch.

david-cortes-intel · 2025-12-09T11:30:23Z

> sklearn's test_t_sne.py doesn't have a case for sparse PCA initialization testing, only for sparse precomputed distances testing. I think it is worth adding the test on our side to test this branch.

It does have a test like that as of 1.8: https://github.com/scikit-learn/scikit-learn/blob/bd0dd87f792ee88c8ec70248299aae779b51ac16/sklearn/manifold/tests/test_t_sne.py#L320

david-cortes-intel · 2025-12-09T13:03:41Z

/intelci: run

clarification about sparse handling in tsne

d9951f6

david-cortes-intel added the bug label Dec 5, 2025

david-cortes-intel added 3 commits December 8, 2025 10:13

check for older versions

c7c27f2

fix

efd30f7

better message

18a1634

david-cortes-intel changed the title ~~[Do NOT merge yet] FIX: Allow sparse data on TSNE~~ FIX: Allow sparse data on TSNE Dec 9, 2025

david-cortes-intel marked this pull request as ready for review December 9, 2025 07:48

david-cortes-intel requested review from Vika-F, icfaust, maria-Petrova, syakov-intel and yuejiaointel as code owners December 9, 2025 07:48
