GH-49534: [R] Implement dplyr recode_values(), replace_values(), and replace_when() by thisisnic · Pull Request #49536 · apache/arrow

thisisnic · 2026-03-17T15:04:57Z

Rationale for this change

Implement new dplyr functions

What changes are included in this PR?

Implement them

Are these changes tested?

Yeah

Are there any user-facing changes?

Moar functions

AI Use

Code generated using Claude, with plenty of input from me. I've gone through it in detail and refactored lots, but it needs a last pass before it's ready for review.

GitHub Issue: [R] Implement dplyr recode_values(), replace_values(), and replace_when() #49534

jonkeane

The tests look good, and so long as CI doesn't bonk I'm good with shipping this. I do think we should keep the validation error when being given a vectorized .default though.

jonkeane · 2026-04-07T14:10:37Z

r/tests/testthat/test-dplyr-funcs-conditional.R

-    "`.default` must have size 1, not size 2",
-    class = "validation_error"
+    "`case_when\\(\\)` with vectorized `.default` not supported in Arrow",
+    class = "arrow_not_supported"


I'm not totally sure I think we should make this change. The original message is clearer to me about what needs to happen. I also don't mind that it's a validation_error since that is what it is.

Good shout on the contents of the error message, will update to be more recommendationy!

I'm not sure about the type of error though - my reasoning here was that it's a feature which is supported in dplyr but not arrow, which is where we typically use arrow_not_supported, whereas validation_error is more for things that fail in both. I guess it's tricky when size isn't either length 1 or the same length as the input, as then it's wrong in both, but in the bindings for str_sub/substr we use arrow_not_supported.

Ok, that's fair. I'm ok with arrow_not_supported + the clearer message about number of arguments
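
To make the distinction in this thread concrete, here is a minimal base R sketch of how a size check on `.default` could choose between the two condition classes. It is not the code from this PR: `abort_classed()`, `check_default_size()`, and `error_class()` are hypothetical helpers, and the `validation_error` message is invented for illustration. The sketch deliberately separates the "valid in dplyr, unsupported in Arrow" case from the "wrong in both" case. The thread itself settled on `arrow_not_supported` with a clearer message.

```r
# Hypothetical helper: signal an error condition carrying a custom class,
# so that tests can match on the class rather than only on the message.
abort_classed <- function(message, class) {
  stop(structure(
    list(message = message, call = NULL),
    class = c(class, "error", "condition")
  ))
}

# Hypothetical size check for `.default`, where `n` is the input size.
check_default_size <- function(default, n) {
  size <- length(default)
  if (size == 1L) {
    return(invisible(default))
  }
  if (size == n) {
    # Accepted by dplyr, assumed here to be unsupported by the Arrow binding.
    abort_classed(
      "`case_when()` with vectorized `.default` not supported in Arrow",
      "arrow_not_supported"
    )
  }
  # Neither size 1 nor the input size: invalid in both dplyr and Arrow.
  abort_classed(
    sprintf("`.default` must have size 1 or %d, not size %d", n, size),
    "validation_error"
  )
}

# Return the leading class of the signalled error, or "no error".
error_class <- function(expr) {
  tryCatch(
    {
      expr
      "no error"
    },
    error = function(e) class(e)[1]
  )
}

error_class(check_default_size("a", 3))
error_class(check_default_size(c("a", "b", "c"), 3))
error_class(check_default_size(c("a", "b"), 3))
```

Putting the custom class first in the class vector is what lets a test such as `expect_error(..., class = "arrow_not_supported")` tell the two cases apart even if the message wording changes later.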

jonkeane · 2026-04-07T14:17:00Z

@github-actions crossbow submit -g r

github-actions · 2026-04-07T14:19:56Z

Revision: 825940b

Submitted crossbow builds: ursacomputing/crossbow @ actions-40f5a339fc

Task	Status
r-binary-packages
r-recheck-most
test-r-alpine-linux-cran
test-r-arrow-backwards-compatibility
test-r-depsource-system
test-r-dev-duckdb
test-r-devdocs
test-r-extra-packages
test-r-fedora-clang
test-r-gcc-11
test-r-gcc-12
test-r-install-local
test-r-install-local-minsizerel
test-r-linux-as-cran
test-r-linux-rchk
test-r-linux-sanitizers
test-r-linux-valgrind
test-r-m1-san
test-r-macos-as-cran
test-r-offline-maximal
test-r-ubuntu-22.04
test-r-versions

Co-authored-by: Jonathan Keane <jkeane@gmail.com>

thisisnic · 2026-04-07T15:05:12Z

@github-actions crossbow submit -g r

github-actions · 2026-04-07T15:08:18Z

Revision: bbb4113

Submitted crossbow builds: ursacomputing/crossbow @ actions-29b827672d

Task	Status
r-binary-packages
r-recheck-most
test-r-alpine-linux-cran
test-r-arrow-backwards-compatibility
test-r-depsource-system
test-r-dev-duckdb
test-r-devdocs
test-r-extra-packages
test-r-fedora-clang
test-r-gcc-11
test-r-gcc-12
test-r-install-local
test-r-install-local-minsizerel
test-r-linux-as-cran
test-r-linux-rchk
test-r-linux-sanitizers
test-r-linux-valgrind
test-r-m1-san
test-r-macos-as-cran
test-r-offline-maximal
test-r-ubuntu-22.04
test-r-versions

thisisnic · 2026-04-07T19:26:24Z

CI failure appears to be unrelated (duckdb installation), so I'll merge.

conbench-apache-arrow · 2026-04-08T01:50:29Z

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit bb4e492.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them.

github-actions bot added the "awaiting committer review" and "Component: R" labels on Mar 17, 2026

thisisnic force-pushed the GH-49534-dplyr-recode branch 2 times, most recently from f8ecb8f to 880b775, on March 30, 2026 13:21

thisisnic marked this pull request as ready for review April 6, 2026 07:26

thisisnic requested a review from jonkeane as a code owner April 6, 2026 07:26

jonkeane approved these changes Apr 7, 2026

github-actions bot added the "awaiting merge" label and removed the "awaiting committer review" label on Apr 7, 2026

github-actions bot added the "awaiting changes" and "awaiting change review" labels and removed the "awaiting merge", "awaiting changes", and "awaiting change review" labels on Apr 7, 2026

thisisnic added 11 commits April 7, 2026 16:04

add new functions and extract out sections as code

e9ea6f5

add additional tests

074a5aa

throw relevant error if empty ...

4bf0c86

handle NA cases better

5f1a781

simplify code

e0f71ab

fix defautl params

19c6cb7

modularise

958eb87

fix docs

391b74d

add examples back

5bc8b8b

add more tests, fix doc

a553cfb

handle list inputs

71076fb

thisisnic and others added 14 commits April 7, 2026 16:04

Add handling of one-sided formulae

51755ff

compact formulae

dfbb913

appease linter

f72f341

remove redundant function

ac10af1

refactor for simplicity again

a228f93

docs

a7e25bd

Simplify code paths

e09fc00

Better errors for .default vectorised

69f2b4d

Run make doc

7894d67

lower cyclocomp

f9b0e78

Fix tests

b07656e

Use rel roxygen2

e3e472c

Update r/tests/testthat/test-dplyr-funcs-conditional.R

172538c

Co-authored-by: Jonathan Keane <jkeane@gmail.com>

Fix error messages

bbb4113

thisisnic force-pushed the GH-49534-dplyr-recode branch from 707bf62 to bbb4113 on April 7, 2026 15:04

github-actions bot added the "awaiting changes" label and removed the "awaiting change review" label on Apr 7, 2026

thisisnic merged commit bb4e492 into apache:main Apr 7, 2026
16 checks passed

thisisnic removed the "awaiting changes" label on Apr 7, 2026

thisisnic mentioned this pull request Apr 7, 2026

[R] Implement dplyr recode_values(), replace_values(), and replace_when() #49534

Closed

thisisnic merged 25 commits into apache:main from thisisnic:GH-49534-dplyr-recode
