FIX: Allow sparse data on TSNE #2827
base: main
Codecov Report
✅ All modified and coverable lines are covered by tests.
/azp run Nightly
Azure Pipelines successfully started running 1 pipeline(s).
sklearn's
It does have a test like that as of 1.8: https://github.com/scikit-learn/scikit-learn/blob/bd0dd87f792ee88c8ec70248299aae779b51ac16/sklearn/manifold/tests/test_t_sne.py#L320
/intelci: run
Description
The docs say that TSNE does not support sparse inputs, and the code has checks that reject them.
But it turns out that the core TSNE algorithm never uses the input data directly. Instead, preliminary stages calculate distances, neighbors, or PCA, and their results are used to generate the arrays that are passed to the actual TSNE algorithm. Those arrays are always dense, regardless of how the data comes in originally, so in a way sklearnex already has partial support for sparse data on TSNE.
This PR corrects the docs and removes the checks for sparsity.
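To illustrate why the sparsity of the input does not reach the core algorithm, here is a toy sketch in plain Python. It is not the sklearnex or scikit-learn implementation: the function name `pairwise_sq_distances` and the dict-per-row sparse representation are hypothetical stand-ins chosen for illustration. It only shows that a distance stage fed with sparse rows produces a fully dense array.

```python
def pairwise_sq_distances(sparse_rows):
    """Return a dense matrix of squared Euclidean distances.

    Each row is a dict mapping column index -> nonzero value, a
    hypothetical stand-in for a real sparse format such as CSR.
    """
    n = len(sparse_rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Only columns that are nonzero in either row contribute.
            cols = sparse_rows[i].keys() | sparse_rows[j].keys()
            d = sum(
                (sparse_rows[i].get(c, 0.0) - sparse_rows[j].get(c, 0.0)) ** 2
                for c in cols
            )
            dist[i][j] = dist[j][i] = d
    return dist


# Three very sparse samples: most columns are implicitly zero.
rows = [{0: 1.0}, {3: 2.0}, {0: 1.0, 3: 2.0}]
dense = pairwise_sq_distances(rows)
```

Every entry of `dense` is stored explicitly, so a later stage that consumes only this matrix never sees whether the original samples were sparse or dense.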
Checklist:
Completeness and readability
Testing