[RFC] securityadmin replacement

# Introduction

The `securityadmin` tool takes a quite minimalist approach to config management for the security plugin and, as a result, has quite a few issues. Its deprecation was announced in #1755; however, no real alternative has been defined so far.

This RFC has several goals:

- Collect requirements for a replacement
- Point out possible options, which may follow different approaches, and compare them

# Situation

The `securityadmin` tool is a very minimalist tool for managing the configuration of the security plugin.

It basically provides the following functionality:

- It has built-in basic structural validation of the configuration, which is done by parsing a YAML file into Jackson-annotated Java beans. This can detect basic structural issues in the configuration. However, this approach is not aware of the individual configuration structures used by the different authentication modules, so these cannot be validated. It is also not aware of any information encoded in string values, such as user attribute references.
- This structural validation is part of the tool itself. If you want to enhance or change the validation, the tool needs to be changed. The tool cannot make use of any cluster-side validation.
- The tool does not use any dedicated API for retrieving configuration from a cluster or writing it to a cluster. It just writes raw JSON data into the `.opendistro_security` index using basic indexing REST API calls.
- The security plugin has a dedicated API for updating its configuration; however, this API can only operate on individual records of individual configuration types. It cannot work on full configuration files or on several configuration types at once, which is a requirement for the securityadmin tool. Additionally, the dedicated API performs only very rudimentary validation, so it is also possible to apply a totally invalid configuration to a cluster with it.

This leads to a few disadvantages:

- Whenever a new version of the OpenSearch security plugin changes the configuration format, an admin needs to update the securityadmin tool to make sure that it is aware of these changes.
- Still, it is easy to apply an invalid configuration to a cluster, and thus break user authentication, without hitting a validation error.

The tool has a few further UX issues:

- Command lines for the securityadmin tool tend to be long and very difficult to understand. There are several reasons for this: First, the securityadmin tool does not support any configuration files in which connection information could be stored in a reusable manner. Additionally, it uses unusual and hard-to-decipher abbreviations for command line options, like `nhnv`, `icl`, `nrhn`, etc.
- Any validation error is just printed in the standard stack trace format of a Java exception. This format is meant for diagnosing Java programs, not for reporting configuration problems to an admin.


# New Approach: Dedicated API plus thin client

This situation can be solved by creating a new dedicated API for configuration updates; a new tool would be just a thin client for that API. Thus, all actual configuration validation and processing would happen cluster-side.

## Dedicated API

A dedicated API for security updates would be able to accept a large JSON object containing complete configurations of different configuration types.

It would perform an in-depth validation of the new configuration; only if this validation is successful would the configuration actually be persisted and applied to the cluster.
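To make the "validate everything first, then persist" idea more concrete, here is a minimal sketch in Python. It is not a proposal for the actual implementation: `ConfigStore`, `validate`, `update_config`, the list of configuration types, the `roles` attribute on users, and the single cross-type rule are all assumptions made up for illustration. An in-memory object stands in for the cluster-side storage.

```python
import copy
from dataclasses import dataclass, field

# Hypothetical configuration type names, for illustration only.
KNOWN_TYPES = {"config", "internalusers", "roles", "rolesmapping"}


@dataclass
class ConfigStore:
    """In-memory stand-in for the cluster-side configuration storage."""
    active: dict = field(default_factory=dict)


def validate(bundle: dict) -> list:
    """Return readable error messages; an empty list means the bundle is valid."""
    errors = []
    for config_type, entries in bundle.items():
        if config_type not in KNOWN_TYPES:
            errors.append(f"{config_type}: unknown configuration type")
        elif not isinstance(entries, dict):
            errors.append(f"{config_type}: expected an object")
    if errors:
        return errors

    # Illustrative cross-type rule (an assumption, not an existing plugin rule):
    # every role referenced by a user must be defined in the same bundle.
    roles = bundle.get("roles", {})
    for user, spec in bundle.get("internalusers", {}).items():
        if not isinstance(spec, dict):
            errors.append(f"internalusers.{user}: expected an object")
            continue
        for role in spec.get("roles", []):
            if role not in roles:
                errors.append(f"internalusers.{user}.roles: unknown role '{role}'")
    return errors


def update_config(store: ConfigStore, bundle: dict) -> dict:
    """Persist the complete bundle only if validation reports no errors."""
    errors = validate(bundle)
    if errors:
        # Nothing is written: the previously active configuration stays in place.
        return {"status": 400, "errors": errors}
    store.active = copy.deepcopy(bundle)
    return {"status": 200, "errors": []}
```

The point of the sketch is that one request carries several configuration types at once, so checks that span types become possible, and a failed validation leaves the active configuration untouched.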

### Optional: Concurrency control

Another common problem with the securityadmin tool is that it is easy to accidentally overwrite changes made by other means, for example using the Dashboards admin tool.

This could be avoided by adding concurrency control support to the REST API. The standard REST approach to optimistic concurrency control, using `ETag` and `If-Match` headers, could then be used.
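As a rough illustration of how this could behave, the following Python sketch simulates a configuration resource that hands out an `ETag` on read and rejects a conditional write with `412 Precondition Failed` when the `If-Match` value is stale. The class `ConfigResource` and the choice to derive the tag from a SHA-256 hash of the canonical JSON are assumptions for this example; a real implementation might use a version counter instead.

```python
import hashlib
import json


class ConfigResource:
    """In-memory stand-in for a configuration endpoint with ETag support."""

    def __init__(self, config: dict):
        self._config = config

    def etag(self) -> str:
        # Assumption: the tag is a hash of the canonical JSON representation.
        canonical = json.dumps(self._config, sort_keys=True).encode("utf-8")
        return '"' + hashlib.sha256(canonical).hexdigest() + '"'

    def get(self):
        """Return (status, headers, body), like a GET on the resource."""
        return 200, {"ETag": self.etag()}, self._config

    def put(self, new_config: dict, if_match=None):
        """Apply the update only if the caller has seen the current version."""
        if if_match is not None and if_match != self.etag():
            # Someone else changed the configuration in the meantime.
            return 412, {"ETag": self.etag()}, None
        self._config = new_config
        return 200, {"ETag": self.etag()}, new_config
```

In this sketch a write without `If-Match` is still accepted unconditionally; whether the real API should allow that, or require the header, would be a design decision.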

## In-depth configuration validation

In order to make sure that it is impossible to break a cluster by applying invalid configuration, a new concept for doing in-depth configuration validation is necessary.

This is a fundamental change. At the moment, the security plugin uses basic `Settings` objects to pass configuration to authentication module implementations. These implementations do not have a defined way to report configuration issues. All they can do is throw an exception in case of invalid configurations; this will just cause the auth module to become unavailable.

To achieve in-depth validation, we need to:

- Differentiate between parsing and validating configuration on the one hand and actually activating it on the other. The configuration must only be activated once the complete configuration has been successfully parsed and validated.
- Define dedicated APIs that allow modules to perform this configuration validation and to report validation errors in a user-readable format.
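The two points above could look roughly like the following Python sketch. Everything in it is hypothetical: `ValidationError`, `ExampleAuthModule`, its `url` and `timeout_ms` attributes, and `apply_auth_config` are invented names that only illustrate the shape of such an API, not an existing interface of the security plugin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationError:
    path: str     # where in the configuration the problem is located
    message: str  # readable description of the problem


class ExampleAuthModule:
    """Hypothetical auth module that validates its own configuration."""

    def parse(self, raw: dict, path: str):
        """Return (parsed_config, errors) without activating anything."""
        errors = []
        url = raw.get("url")
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            errors.append(ValidationError(f"{path}.url", "must be an http(s) URL"))
        timeout = raw.get("timeout_ms", 5000)
        if not isinstance(timeout, int) or timeout <= 0:
            errors.append(ValidationError(f"{path}.timeout_ms", "must be a positive integer"))
        if errors:
            return None, errors
        return {"url": url, "timeout_ms": timeout}, []


def apply_auth_config(raw_domains: dict, modules: dict, activate) -> list:
    """Phase 1: parse and validate everything. Phase 2: activate only if clean."""
    parsed, errors = {}, []
    for name, raw in raw_domains.items():
        module = modules.get(raw.get("type"))
        if module is None:
            errors.append(ValidationError(f"{name}.type", "unknown module type"))
            continue
        result, module_errors = module.parse(raw, name)
        errors.extend(module_errors)
        if result is not None:
            parsed[name] = result
    if errors:
        return errors  # nothing was activated
    activate(parsed)
    return []
```

Because each module returns errors as data instead of throwing, all problems in a configuration can be collected and reported together, each with the location of the offending attribute.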

## New tool

A new tool that replaces `securityadmin` should have the following features:

- It shall be a command-line tool.
- It shall be a thin client for the new configuration API. It shall not define validation logic by itself, but only rely on the API to perform validation and any other configuration-related change.
- It shall enhance the ease of use by managing connection information in configuration files; this would significantly reduce the length of the command line. Ideally, the invocation would look similar to this:

```
./securityconfig.sh update-config path/to/config/
```

- Ideally, the tool can manage several connection configurations for more than one cluster.

- It shall support at least the basic operations `update-config` and `get-config`. Optionally, it might have more convenience operations like "add a single new user".
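To illustrate how connection profiles could shorten the command line, here is a small Python sketch of the argument handling only. It is an assumption-laden example, not a design: the profile file format (JSON), its default location `~/.securityconfig.json`, the `--cluster` and `--config-file` options, and the endpoint path are all hypothetical, and no request is actually sent.

```python
import argparse
import json
from pathlib import Path


def load_profile(config_file: Path, name: str) -> dict:
    """Read the named connection profile from a JSON file of profiles."""
    profiles = json.loads(config_file.read_text(encoding="utf-8"))
    if name not in profiles:
        raise SystemExit(f"Unknown cluster profile '{name}' in {config_file}")
    return profiles[name]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="securityconfig")
    # Hypothetical options: which stored connection profile to use.
    parser.add_argument("--cluster", default="default")
    parser.add_argument("--config-file", type=Path,
                        default=Path.home() / ".securityconfig.json")
    commands = parser.add_subparsers(dest="command", required=True)
    for name in ("update-config", "get-config"):
        command = commands.add_parser(name)
        command.add_argument("path", type=Path, help="local configuration directory")
    return parser


def plan_request(argv: list) -> dict:
    """Translate a command line into a description of the API call to make."""
    args = build_parser().parse_args(argv)
    profile = load_profile(args.config_file, args.cluster)
    method = "PUT" if args.command == "update-config" else "GET"
    return {
        "method": method,
        # Placeholder path; the real endpoint is not defined yet.
        "url": profile["url"].rstrip("/") + "/_hypothetical/securityconfig",
        "local_path": str(args.path),
    }
```

With a profile file in place, `plan_request(["update-config", "path/to/config/"])` would use the `default` profile, and `--cluster` would select another stored cluster.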

# New Approach: Saving security configuration as part of cluster state

Recently, there were a few discussions about changing the fundamental storage concept of the security configuration. Utilizing an index means that the cluster needs to achieve a certain initialization state and health before the configuration can be read. Also, it requires the cluster to accept writes in order to support configuration updates.

It might be possible to solve this by storing the configuration inside the cluster state. This would provide a natural way to make it available to the individual nodes.

However, this needs some more considerations:

- Can the complete configuration be stored as a part of the cluster state? Especially the internal user configuration can grow quite big.
- Can this be achieved in the security plugin or does this need core extensions?


# Questions

Please feel free to comment on the considerations listed above.

In particular, we should discuss:

- Do the described approaches make sense and should we invest work here?
- Are there more requirements which are not yet listed?
- What could be good options for achieving in-depth validation? Are there standard frameworks that could be employed?
- Are there other possible approaches that should be considered?


