Use NEMAR as the default dataset download source by Copilot · Pull Request #1085 · NeuroTechX/moabb

Copilot · 2026-06-18T12:06:22Z

MOABB datasets should download from their re-hosted NEMAR copies by default wherever an equivalent NEMAR ID exists, while keeping each dataset’s existing downloader as a fallback.

NEMAR integration
- Adds nemar-py as a runtime dependency.
- Assigns NEMAR IDs directly on each dataset class via nemar_id.
- Adds a nemar_dl() helper that downloads into MOABB’s existing dataset cache layout.

Download behavior
- BaseDataset.download() now attempts NEMAR first when dataset.nemar_id is available.
- Existing data_path() downloaders remain the fallback path on NEMAR download failure.
- Subject IDs are formatted for BIDS/NEMAR using nemar_subject_template.

Coverage
- Adds non-CI download-equivalence coverage gated behind --dl-data.
- Adds CI-safe coverage ensuring all dataset classes expose nemar_id.
- Adds unit coverage for NEMAR IDs, NEMAR-first behavior, and fallback behavior.

Example:

from moabb.datasets import BNCI2014_001

dataset = BNCI2014_001()
print(dataset.nemar_id)  # "nm000139"

dataset.download(subject_list=[1])  # tries NEMAR first, then existing fallback
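
The NEMAR-first, fallback-second control flow described above can be sketched roughly as follows. This is not the code from the PR: `download_subject`, the two fetch callables, and the choice of caught exceptions are hypothetical stand-ins used only to illustrate the ordering, including the case where `nemar_id` is None and NEMAR is skipped entirely.

```python
from typing import Callable, Optional


def download_subject(
    subject: int,
    nemar_id: Optional[str],
    nemar_fetch: Callable[[str, int], str],
    fallback_fetch: Callable[[int], str],
) -> str:
    """Return a path for one subject, preferring NEMAR when an ID is set."""
    if nemar_id is not None:
        try:
            return nemar_fetch(nemar_id, subject)
        except (OSError, RuntimeError):
            # Assumed exception types; the PR only states that the
            # fallback errors were narrowed, not which ones are caught.
            pass
    # No NEMAR ID, or the NEMAR download failed: use the dataset's own path.
    return fallback_fetch(subject)


# Stub downloaders, for illustration only.
def ok_nemar(nemar_id: str, subject: int) -> str:
    return f"nemar/{nemar_id}/sub-{subject}"


def failing_nemar(nemar_id: str, subject: int) -> str:
    raise OSError("NEMAR unreachable")


def legacy(subject: int) -> str:
    return f"legacy/sub-{subject}"


print(download_subject(1, "nm000139", ok_nemar, legacy))
print(download_subject(1, "nm000139", failing_nemar, legacy))
print(download_subject(1, None, ok_nemar, legacy))
```

Passing the downloaders in as callables keeps the sketch testable without network access; the real implementation presumably calls nemar_dl() and data_path() directly.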

bruAristimunha · 2026-06-18T12:17:33Z

Each nemar ID should be in each dataset class, rather than centralized in moabb/datasets/nemar.py. And one test that should be executed in the CI should ensure that all datasets have the nemar_id.

bruAristimunha · 2026-06-18T12:30:51Z

please address my comments and the pre-commit @copilot

Copilot · 2026-06-18T12:43:01Z

Addressed in fc908ee: NEMAR IDs now live on dataset classes, the central mapping module was removed, CI-safe nemar_id coverage was added, and pre-commit now passes.

Set nemar_id (plus subject templates / BIDS filters where needed) on each dataset class and load subjects from NEMAR via mne-bids with session/run normalization. IDs verified against the live NEMAR catalog (https://data.nemar.org). Updates download.py helper and tests accordingly.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8effc0d0be

chatgpt-codex-connector · 2026-06-20T18:05:14Z

+    if force_update and target_dir.exists():
+        try:
+            shutil.rmtree(target_dir)


Avoid deleting earlier subjects during forced NEMAR downloads

When BaseDataset.download() is called with multiple subjects and force_update=True, it invokes nemar_dl() once per subject, so this block removes the whole NEMAR dataset directory before every subject download. The first subject is downloaded, then the second iteration deletes it, and the final cache contains only the last subject instead of the requested list; this breaks the documented “download all data” path whenever users force-refresh a multi-subject dataset.
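
The failure mode can be reproduced in isolation with a small simulation. The function below is a hypothetical stand-in for the reviewed helper (it writes a placeholder file instead of downloading anything), but it keeps the same `rmtree` guard and is called once per subject, as the comment describes.

```python
import shutil
import tempfile
from pathlib import Path


def buggy_nemar_dl(target_dir: Path, subject: int, force_update: bool) -> None:
    """Stand-in helper: wipes the shared dataset root whenever forced."""
    if force_update and target_dir.exists():
        shutil.rmtree(target_dir)
    subject_dir = target_dir / f"sub-{subject}"
    subject_dir.mkdir(parents=True, exist_ok=True)
    # Placeholder for the real download.
    (subject_dir / "data.txt").write_text("fake recording")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "nm000139"
    for subject in [1, 2, 3]:
        buggy_nemar_dl(root, subject, force_update=True)
    remaining = sorted(p.name for p in root.iterdir())

print(remaining)  # ['sub-3']
```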

Keep each NEMAR id as a single class attribute line instead of an attribute plus a trailing blank line; consolidate PhysionetMI's nemar_subject_template next to its nemar_id. No id values changed.

Add nemar_id to the 18 datasets still missing one, with ids verified against the NEMAR catalog (nemar-cli --mine), including the private deposits Beetl2021_B (nm000274), Lee2019_SSVEP (nm000273) and Mainsah2025_A (nm000269), plus Nieto2022 (OpenNeuro ds003626). Exempt the test fixtures and Schrag2026Pediatric (no NEMAR deposit) in test_all_datasets_have_valid_nemar_id.

PierreGtch · 2026-06-22T09:02:13Z

Each nemar ID should be in each dataset class, rather than centralized in moabb/datasets/nemar.py. And one test that should be executed in the CI should ensure that all datasets have the nemar_id.

What should be the workflow when adding a new dataset to MOABB?
As I understand it, you first need to create and merge the MOABB PR to trigger the NEMAR dataset creation and get a NEMAR ID.
So the MOABB PR would have to be merged with a red test, no?

For datasets whose NEMAR deposit is assigned but not yet public, keep the id in a commented-out `# nemar_id = "..."` line (so it is ready to enable on publication) and list them in NEMAR_ID_PENDING. These skip test_all_datasets_have_valid_nemar_id, alongside the no-deposit exempt datasets. Nieto2022 (public OpenNeuro ds003626) stays active.
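
A rough sketch of how such a check could skip pending and exempt datasets is shown below. Only `NEMAR_ID_PENDING` and `nemar_id` come from this PR; the dataset classes, the `NEMAR_ID_EXEMPT` name, and the ID pattern (which assumes `nm` or `ds` followed by six digits) are hypothetical and chosen for illustration.

```python
import re

# Assumed ID shape: NEMAR ("nm") or OpenNeuro ("ds") prefix plus six digits.
ID_PATTERN = re.compile(r"(nm|ds)\d{6}")


class PublicDataset:
    nemar_id = "nm000139"


class PendingDataset:
    # nemar_id = "nm999999"  # made-up ID, ready to enable on publication
    nemar_id = None


class NoDepositDataset:
    nemar_id = None


class ForgottenDataset:
    nemar_id = None


NEMAR_ID_PENDING = {"PendingDataset"}
NEMAR_ID_EXEMPT = {"NoDepositDataset"}  # hypothetical name


def missing_nemar_ids(classes):
    """Return names of classes that should have a valid nemar_id but do not."""
    skipped = NEMAR_ID_PENDING | NEMAR_ID_EXEMPT
    missing = []
    for cls in classes:
        if cls.__name__ in skipped:
            continue
        nemar_id = getattr(cls, "nemar_id", None)
        if not (isinstance(nemar_id, str) and ID_PATTERN.fullmatch(nemar_id)):
            missing.append(cls.__name__)
    return missing


all_classes = [PublicDataset, PendingDataset, NoDepositDataset, ForgottenDataset]
print(missing_nemar_ids(all_classes))  # ['ForgottenDataset']
```

Listing pending datasets explicitly means a dataset that simply forgot its ID still fails the check, while an assigned-but-unpublished deposit does not turn CI red.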

Lock in that download() skips NEMAR entirely and calls the dataset's own data_path when nemar_id is None (e.g. the pending, commented-out deposits).

Add a "Publishing your dataset on NEMAR" section to the adding-a-dataset tutorial: the nemar-cli upload/publish workflow, wiring nemar_id (and nemar_subject_template), and the data_path fallback when nemar_id is unset or a NEMAR download fails.

PierreGtch

LGTM. I made 3 comments, feel free to ignore if you don't agree

PierreGtch · 2026-06-22T12:18:49Z


    _summary_table: dict[str, Any]
+    nemar_id: str | None = None
+    nemar_subject_template: str | None = "{subject}"


nemar_subject_template is not flexible enough. In BIDS, the subject label can be any string: https://bids-specification.readthedocs.io/en/stable/glossary.html#subject-entities
Do we really need nemar_subject_template? Can't we do something smarter to automatically infer the mapping?
The simplest solution would be if subjects in moabb were also strings and not ints.

I think we have some details that need this, like 001, 01, 1... this is why I went with a string, but we can do better... maybe modify it in nemar-py... thinking

And what about using strings as subject IDs in MOABB too?
This way, we don’t need to convert anything. It would be simpler and more aligned with BIDS.

I think I would prefer to address this later.
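
To make the trade-off in this thread concrete, here is a minimal sketch of what a format-string template can and cannot express. It assumes the template is applied with `str.format` and a `subject` keyword, which matches the default `"{subject}"` shown in the diff but is not confirmed by it; `bids_subject_label` and the explicit labels are made up.

```python
def bids_subject_label(subject: int, template: str = "{subject}") -> str:
    """Format a MOABB integer subject as a BIDS label (without 'sub-')."""
    return template.format(subject=subject)


# Zero-padding variants such as 1, 01 and 001 are easy to express.
labels = {
    template: bids_subject_label(1, template)
    for template in ("{subject}", "{subject:02d}", "{subject:03d}")
}
print(labels)

# Arbitrary BIDS labels cannot be derived from an integer by a template;
# they would need an explicit mapping (or string subject IDs in MOABB).
explicit_labels = {1: "ctrl01", 2: "patientA"}  # made-up labels
print(explicit_labels[2])
```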

Co-authored-by: Pierre Guetschel <25532709+PierreGtch@users.noreply.github.com> Signed-off-by: Bru <b.aristimunha@gmail.com>

nemar_dl removed the whole shared BIDS root on force_update, but download() calls it once per subject, so refreshing a multi-subject dataset wiped every subject except the last. Rely on trust_existing=not force_update to re-fetch the requested subject and drop the rmtree (and the now-unused shutil import). Addresses Codex review comment on PR #1085.
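
The per-subject refresh described in this commit message might look roughly like the sketch below. `fetch_subject` is a hypothetical stand-in for the nemar-py call, and the semantics assumed for `trust_existing` (skip the fetch when the subject's files are already cached) are an assumption rather than documented behavior.

```python
import tempfile
from pathlib import Path


def fetch_subject(target_dir: Path, subject: int, trust_existing: bool) -> bool:
    """Stand-in fetch: returns True when the subject was (re-)downloaded."""
    subject_file = target_dir / f"sub-{subject}" / "data.txt"
    if trust_existing and subject_file.exists():
        return False  # cached copy is trusted, nothing to do
    subject_file.parent.mkdir(parents=True, exist_ok=True)
    subject_file.write_text("fresh recording")
    return True


def nemar_dl_sketch(target_dir: Path, subject: int, force_update: bool = False) -> bool:
    # No rmtree of the shared root: only the requested subject is refreshed.
    return fetch_subject(target_dir, subject, trust_existing=not force_update)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "nm000139"
    refetched = [nemar_dl_sketch(root, s, force_update=True) for s in [1, 2, 3]]
    cached = sorted(p.name for p in root.iterdir())
    reused = nemar_dl_sketch(root, 1, force_update=False)

print(refetched, cached, reused)
```

With this shape, a forced refresh of several subjects leaves all of them in the cache, and a later unforced call reuses what is already there.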

Initial plan

45ff1d2

Copilot AI assigned Copilot and bruAristimunha Jun 18, 2026

Copilot started work on behalf of bruAristimunha June 18, 2026 12:06

Copilot AI added 3 commits June 18, 2026 12:14

Add NEMAR download fallback support

65e1af2

Address NEMAR validation feedback

81c9310

Narrow NEMAR fallback errors

5dcac94

Copilot AI added 2 commits June 18, 2026 12:18

Document NEMAR download helper

1588531

Document NEMAR dataset hooks

1635365

Copilot AI changed the title ~~[WIP] Update moabb to use nemar-py for dataset downloads~~ Use NEMAR as the default dataset download source Jun 18, 2026

Copilot finished work on behalf of bruAristimunha June 18, 2026 12:20

Copilot AI requested a review from bruAristimunha June 18, 2026 12:20

Copilot started work on behalf of bruAristimunha June 18, 2026 12:31

Copilot AI added 4 commits June 18, 2026 12:37

Move NEMAR IDs onto dataset classes

9500614

Refine NEMAR dataset tests

b3571e2

Polish NEMAR changelog and tests

8c47b96

Tighten NEMAR changelog wording

fc908ee

Copilot finished work on behalf of bruAristimunha June 18, 2026 12:43

bruAristimunha added 2 commits June 20, 2026 18:34

Merge branch 'develop' into copilot/update-moabb-nemar-py

8effc0d

bruAristimunha marked this pull request as ready for review June 20, 2026 17:59

chatgpt-codex-connector Bot reviewed Jun 20, 2026

bruAristimunha added 2 commits June 20, 2026 20:14

Drop blank line after per-dataset nemar_id

c2b5496

bruAristimunha added 2 commits June 22, 2026 11:04

Test that datasets without a NEMAR id use the fallback downloader

0f0251a

Document publishing a dataset on NEMAR

0468a8a

bruAristimunha requested a review from PierreGtch June 22, 2026 09:52

PierreGtch approved these changes Jun 22, 2026

bruAristimunha and others added 2 commits June 22, 2026 14:42

Update examples/tutorials/tutorial_4_adding_a_dataset.py

46ab30b

bruAristimunha mentioned this pull request Jun 24, 2026

nemar_subject_template is not flexible enough. In BIDS, the subject label can be any string: https://bids-specification.readthedocs.io/en/stable/glossary.html#subject-entities #1098

Open

bruAristimunha merged commit 4a259f5 into develop Jun 24, 2026
14 checks passed

bruAristimunha deleted the copilot/update-moabb-nemar-py branch June 24, 2026 11:36
