Commit 519adaf

manual: More on stores, building and mounting in the file system
Expand the manual on 1. store paths (be more explicit about store path base names) 2. How store objects are exposed on the file system (new page and new section of page) 3. Building derivations (lots of stuff on that page)
1 parent 7448aed commit 519adaf

11 files changed

+309
-101
lines changed

doc/manual/redirects.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -337,7 +337,7 @@
337337
"string-literal": "string-literals.html"
338338
},
339339
"language/derivations.html": {
340-
"builder-execution": "../store/building.html#builder-execution"
340+
"builder-execution": "../store/building.html"
341341
},
342342
"installation/installing-binary.html": {
343343
"linux": "uninstall.html#linux",

doc/manual/source/SUMMARY.md.in

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,9 +19,10 @@
1919
- [Nix Store](store/index.md)
2020
- [File System Object](store/file-system-object.md)
2121
- [Content-Addressing File System Objects](store/file-system-object/content-address.md)
22+
- [Exposing in OS File Systems](store/file-system-object/os-file-system.md)
2223
- [Store Object](store/store-object.md)
2324
- [Content-Addressing Store Objects](store/store-object/content-address.md)
24-
- [Store Path](store/store-path.md)
25+
- [Store Path and Store Directory](store/store-path.md)
2526
- [Store Derivation and Deriving Path](store/derivation/index.md)
2627
- [Derivation Outputs and Types of Derivations](store/derivation/outputs/index.md)
2728
- [Content-addressing derivation outputs](store/derivation/outputs/content-address.md)

doc/manual/source/glossary.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,7 +35,7 @@
3535

3636
A derivation can be thought of as a [pure function](https://en.wikipedia.org/wiki/Pure_function) that produces new [store objects][store object] from existing store objects.
3737

38-
Derivations are implemented as [operating system processes that run in a sandbox](@docroot@/store/building.md#builder-execution).
38+
Derivations are implemented as [operating system processes that run in a sandbox](@docroot@/store/building.md).
3939
This sandbox by default only allows reading from store objects specified as inputs, and only allows writing to designated [outputs][output] to be [captured as store objects](@docroot@/store/building.md#processing-outputs).
4040

4141
A derivation is typically specified as a [derivation expression] in the [Nix language], and [instantiated][instantiate] to a [store derivation].

doc/manual/source/protocols/json/schema/store-object-info-v2.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ $defs:
107107
type: string
108108
title: Store Directory
109109
description: |
110-
The [store directory](@docroot@/store/store-path.md#store-directory) this store object belongs to (e.g. `/nix/store`).
110+
The [path to the store directory](@docroot@/store/store-path.md#store-directory-path) this store object belongs within (e.g. `/nix/store`).
111111
additionalProperties: false
112112

113113
impure:

doc/manual/source/protocols/store-path.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,7 @@ where
1818

1919
- `name` = the name of the store object.
2020

21-
- `store-dir` = the [store directory](@docroot@/store/store-path.md#store-directory)
21+
- `store-dir` = the [path of the store directory](@docroot@/store/store-path.md#store-directory-path)
2222

2323
- `digest` = base-32 representation of the compressed to 160 bits [SHA-256] hash of `fingerprint`
2424

Lines changed: 148 additions & 62 deletions
Original file line numberDiff line numberDiff line change
@@ -1,101 +1,187 @@
11
# Building
22

3-
## Normalizing derivation inputs
3+
As discussed in the [main page on derivations](./derivation/index.md):
44

5-
- Each input must be [realised] prior to building the derivation in question.
5+
> A derivation is a specification for running an executable on precisely defined input to produce one or more [store objects][store object].
66
7-
[realised]: @docroot@/glossary.md#gloss-realise
7+
This page describes *building* a derivation, which is to say following the instructions in the derivation to actually run the executable.
8+
In some cases the derivation is self-explanatory.
9+
For example, the arguments specified in the derivation really are the arguments passed to the executable.
10+
In other cases, however, there is an additional procedure that applies to all derivations and is therefore *not* specified in the derivation.
11+
This page also specifies that invariant procedure, which is the same for every derivation.
12+
13+
The chief design consideration for the building process is *determinism*.
14+
Conventional operating systems are typically not designed with determinism in mind.
15+
But determinism is needed to make Nix's caching a transparent abstraction.
16+
17+
> **Explanation**
18+
>
19+
> For example, no one wants to slightly modify a derivation, and then find that it no longer builds for an unrelated reason, because the original derivation *also* doesn't build anymore, but the cache hit on the original derivation was hiding this.
20+
> We want builds that succeeded once to continue succeeding, to encourage fearless modification of old build recipes.
21+
> Determinism is what enables things that once worked to keep working.
22+
23+
The life cycle of a build can be broken down into 3 parts:
24+
25+
1. Spawn the builder process with the proper environment, including the correct process arguments, environment variables, and file system state.
26+
27+
2. Wait for the standard output and error of the process to be closed and/or the process to exit.
28+
(If the standard streams are closed but the process hasn't exited, Nix will kill the process.)
29+
30+
Nix also logs the standard output and error of the process, but this is just for human convenience and does not influence the behavior of the system.
31+
(Builder processes have no idea what the consumer of their standard output and error does with those streams, only that they are indeed consumed so buffers do not fill up and writes to them will continue to succeed.)
32+
33+
3. Process the outputs after the builder has exited.
834

9-
- Once this is done, the derivation is *normalized*, replacing each input deriving path with its store path, which we now know from realising the input.
35+
The builder process on exit should have left behind files for each output the derivation is supposed to produce.
36+
The files must be processed to turn them into bona fide store objects.
37+
If the processing succeeds, those store objects are associated with the derivation as (the results of) a successful build.
1038

11-
## Builder Execution {#builder-execution}
39+
Step (3) happens externally, with just inert data since the process has exited or been killed by then.
40+
Step (1) however is best described not from Nix's perspective, but from the process's perspective.
41+
42+
> **Explanation**
43+
>
44+
> Ultimately, what matters for determinism is the behavior of the I/O operations that the process attempts (whether these are successes or failures), because of how they affect the output files, and how they affect the further execution of the builder process.
45+
> From Nix (and the operating system)'s perspective, there are many, many different ways --- different implementation strategies --- of effecting the same I/O behavior.
46+
> But from the process's perspective, there is only one correct behavior.
47+
48+
## What derivations can be built
49+
50+
Only some derivations are ready to be built.
51+
In particular, only [*resolved*](./resolution.md) derivations can be built.
52+
That is to say, a derivation that depends on other derivations is not yet ready to be built, because those other derivations might not be built.
53+
If the other derivations are indeed built, we can witness this fact by resolving the derivation, and converting all the derivation's input references into plain store paths.
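
The resolution step described above can be sketched as follows. This is only an illustrative model: the `Derivation` class, its fields, and the `built_outputs` mapping are hypothetical names invented for this sketch, not Nix's actual data model, and the store paths in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    """Hypothetical, simplified derivation (not Nix's real data model)."""
    name: str
    # Plain store paths this derivation depends on.
    input_srcs: frozenset = frozenset()
    # (derivation name, output name) pairs this derivation depends on.
    input_drvs: frozenset = frozenset()


def resolve(drv, built_outputs):
    """Return a resolved copy of `drv`, or None if some input is not built.

    `built_outputs` maps (derivation name, output name) to a store path.
    """
    resolved_paths = set(drv.input_srcs)
    for dep in drv.input_drvs:
        if dep not in built_outputs:
            return None  # an input has not been built, so we cannot resolve
        resolved_paths.add(built_outputs[dep])
    # The resolved derivation only refers to plain store paths.
    return Derivation(drv.name, frozenset(resolved_paths), frozenset())
```

For instance, a derivation depending on `("gcc", "out")` resolves only once `built_outputs` contains a store path for that pair; until then `resolve` returns `None`.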
54+
55+
> **Note**
56+
>
57+
> Note that [input-addressing](derivation/outputs/input-address.md) derivations are improperly resolved.
58+
> As discussed on the linked page, the current input-addressing algorithm does not respect resolution-equivalence of derivations (\\(\\sim_\mathrm{Drv}\\)).
59+
> That means that if Nix properly resolved an input-addressed derivation, the resolved derivation would have different input addresses, violating expectations.
60+
> Nix therefore improperly resolves the derivation, keeping its original input address output paths, creating an invalid derivation that is both resolved and instructed to create the outputs at the originally expected paths.
61+
62+
## Environment of the builder process
1263

1364
The [`builder`](./derivation/index.md#builder) is executed as follows:
1465

15-
- A temporary directory is created where the build will take place. The
16-
current directory is changed to this directory.
66+
### File system
67+
68+
The builder should have access to a limited file system where only certain objects are available.
69+
The most important exposed files are the inputs (other store objects) of the (resolved) derivation.
70+
Additionally, some other files are exposed.
71+
72+
#### Store inputs
73+
74+
The builder will be run against a file system in which the [closure] of the inputs is mounted inside the [store directory][store directory path].
75+
In particular, consider a store that just contains this closure.
76+
That store may be exposed to the file system according to the rules specified in the [Exposing Store Objects in OS File Systems](./store-path.md#exposing) documentation.
77+
This precisely defines the file system layout of the store that should be visible to the builder process.
78+
79+
> **Note**
80+
>
81+
> Historically, Nix exposed *at least* the store contents described above to the builder, but also arbitrary other store objects, due to limitations in operating systems' file system virtualization capabilities and a desire to avoid copying or moving files.
82+
> It still can do this in so-called *unsandboxed* builds.
83+
>
84+
> Such builds should be considered an unsafe extension, but one that works less badly against non-malicious derivations than might be expected.
85+
> This is because store paths are relatively unpredictable, so a well-behaved program is unlikely to stumble upon a store object it wasn't supposed to know about.
86+
>
87+
> As operating systems developed better file system primitives, the need for disabling sandboxing has lessened greatly over the years, and this trend should continue into the future.
88+
89+
[realised]: @docroot@/glossary.md#gloss-realise
90+
[closure]: @docroot@/glossary.md#gloss-closure
91+
[store directory path]: ./store-path.md#store-directory-path
92+
93+
### Other file system state
94+
95+
- The current working directory of the builder process will be a fresh temporary directory that is initially empty.
1796

1897
See the per-store [`build-dir`](@docroot@/store/types/local-store.md#store-local-store-build-dir) setting for more information.
1998

20-
- The environment is cleared and set to the derivation attributes, as
21-
specified above.
99+
- Basic device nodes for essential operations (null device, random number generation, standard streams as a pseudo terminal)
100+
101+
(A pseudo terminal would not be strictly necessary since the standard streams are passively logging, not there to facilitate interaction.
102+
But it is still useful to entice programs to do nicer logging, e.g. with colors.)
103+
104+
- On Linux: Process information via `/proc`
105+
106+
- Minimal user and group identity information
22107

23-
- In addition, the following variables are set:
108+
- A loopback-only network configuration with hostname set to `localhost`
24109

25-
- `NIX_BUILD_TOP` contains the path of the temporary directory for
26-
this build.
110+
> **Note**
111+
>
112+
> Fixed-output derivations have access to additional operating system state to facilitate communication with the outside world, such as network name resolution and TLS certificate verification.
113+
> This is necessary because these derivations are allowed to access the network, unlike regular derivations which are fully sandboxed.
27114
28-
- Also, `TMPDIR`, `TEMPDIR`, `TMP`, `TEMP` are set to point to the
29-
temporary directory. This is to prevent the builder from
30-
accidentally writing temporary files anywhere else. Doing so
31-
might cause interference by other processes.
115+
### Environment variables {#env-vars}
32116

33-
- `PATH` is set to `/path-not-set` to prevent shells from
34-
initialising it to their built-in default value.
117+
The environment is cleared and set to the derivation attributes, as
118+
specified above.
35119

36-
- `HOME` is set to `/homeless-shelter` to prevent programs from
37-
using `/etc/passwd` or the like to find the user's home
38-
directory, which could cause impurity. Usually, when `HOME` is
39-
set, it is used as the location of the home directory, even if
40-
it points to a non-existent path.
120+
For most derivation types this must contain at least:
41121

42-
- `NIX_STORE` is set to the path of the top-level Nix store
43-
directory (typically, `/nix/store`).
122+
- For each output declared in `outputs`, the corresponding environment variable is set to point to the intended path in the Nix store for that output.
123+
Each output path is a concatenation of the cryptographic hash of all build inputs, the `name` attribute and the output name.
124+
(The output name is omitted if it’s `out`.)
44125

45-
- `NIX_ATTRS_JSON_FILE` & `NIX_ATTRS_SH_FILE` if `__structuredAttrs`
46-
is set to `true` for the derivation. A detailed explanation of this
47-
behavior can be found in the
48-
[section about structured attrs](@docroot@/language/advanced-attributes.md#adv-attr-structuredAttrs).
126+
In addition, the following variables are set:
49127

50-
- For each output declared in `outputs`, the corresponding
51-
environment variable is set to point to the intended path in the
52-
Nix store for that output. Each output path is a concatenation
53-
of the cryptographic hash of all build inputs, the `name`
54-
attribute and the output name. (The output name is omitted if
55-
it’s `out`.)
128+
- `NIX_BUILD_TOP` contains the path of the temporary directory for this build.
56129

57-
- If an output path already exists, it is removed. Also, locks are
58-
acquired to prevent multiple [Nix instances][Nix instance] from performing the same
59-
build at the same time.
130+
- Also, `TMPDIR`, `TEMPDIR`, `TMP`, `TEMP` are set to point to the temporary directory.
131+
This is to prevent the builder from accidentally writing temporary files anywhere else.
132+
Doing so might cause interference by other processes.
60133

61-
- A log of the combined standard output and error is written to
62-
`/nix/var/log/nix`.
134+
- `PATH` is set to `/path-not-set` to prevent shells from initialising it to their built-in default value.
63135

64-
- The builder is executed with the arguments specified by the
65-
attribute `args`. If it exits with exit code 0, it is considered to
66-
have succeeded.
136+
- `HOME` is set to `/homeless-shelter`.
137+
(Without sandboxing, this serves as "soft sandboxing" --- it discourages programs from using `/etc/passwd` or the like to find the user's home directory, which could cause impurity.)
138+
Usually, when `HOME` is set, it is used as the location of the home directory, even if it points to a non-existent path.
67139

68-
- The temporary directory is removed (unless the `-K` option was
69-
specified).
140+
- `NIX_STORE` is set to the [store directory path] (typically, `/nix/store`).
141+
142+
- `NIX_ATTRS_JSON_FILE` & `NIX_ATTRS_SH_FILE` if `__structuredAttrs` is set to `true` for the derivation.
143+
A detailed explanation of this behavior can be found in the [section about structured attrs](@docroot@/language/advanced-attributes.md#adv-attr-structuredAttrs).
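
As a rough illustration of the variables listed above, the following sketch assembles such an environment as a plain dictionary. The function name and parameters are hypothetical, structured attributes are left out, and the precedence when a derivation attribute has the same name as a default (`PATH`, `HOME`, `NIX_STORE`) is an assumption of this sketch, since the text does not specify it.

```python
def builder_environment(drv_env, output_paths, build_top, store_dir="/nix/store"):
    """Assemble the environment variables for a builder process.

    drv_env:      the derivation's own attributes, as strings
    output_paths: output name -> intended store path
    build_top:    path of the temporary build directory
    """
    # Start from a fresh mapping: nothing is inherited from the caller.
    env = {
        "PATH": "/path-not-set",
        "HOME": "/homeless-shelter",
        "NIX_STORE": store_dir,
    }
    # Assumption of this sketch: derivation attributes may override the
    # defaults above.
    env.update(drv_env)
    # One variable per declared output, named after the output.
    env.update(output_paths)
    # All temporary-directory variables point at the build directory.
    for name in ("NIX_BUILD_TOP", "TMPDIR", "TEMPDIR", "TMP", "TEMP"):
        env[name] = build_top
    return env
```

Calling `builder_environment({"name": "hello"}, {"out": "/nix/store/...-hello"}, "/build")` would yield a mapping with `out`, `name`, the three defaults, and the five temporary-directory variables; the paths here are placeholders.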
144+
145+
## Builder Execution
146+
147+
- If an output path already exists, it is removed.
148+
Also, locks are acquired to prevent multiple [Nix instances][Nix instance] from performing the same build at the same time.
149+
150+
- A log of the combined standard output and error is written to `/nix/var/log/nix`.
151+
152+
- The builder is executed with the arguments specified by the attribute `args`.
153+
If it exits with exit code 0, it is considered to have succeeded.
154+
155+
- The temporary directory is removed (unless the [`--keep-failed`](@docroot@/command-ref/opt-common.md#opt-keep-failed) option was specified).
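
The execution steps above can be approximated with a small sketch. It is not how Nix is implemented: `run_builder` and its parameters are hypothetical, there is no sandboxing, and lock acquisition and removal of pre-existing output paths are omitted. It only shows the fresh working directory, the combined log, the exit-code check, and the cleanup rule.

```python
import shutil
import subprocess
import tempfile


def run_builder(builder, args, env, log_path, keep_failed=False):
    """Run `builder` with `args`; return True if it exited with code 0."""
    # Fresh, initially empty temporary directory used as working directory.
    build_top = tempfile.mkdtemp(prefix="build-sketch-")
    env = dict(env, NIX_BUILD_TOP=build_top, TMPDIR=build_top)
    with open(log_path, "wb") as log:
        result = subprocess.run(
            [builder, *args],
            env=env,                    # only the given variables, nothing inherited
            cwd=build_top,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,   # combined standard output and error
        )
    succeeded = result.returncode == 0
    # Keep the directory only for a failed build when keep_failed is set.
    if succeeded or not keep_failed:
        shutil.rmtree(build_top)
    return succeeded
```

The log is written to a caller-chosen file here rather than to `/nix/var/log/nix`, so the sketch can be run without special permissions.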
70156

71157
## Processing outputs
72158

73159
If the builder exited successfully, the following steps happen in order to turn the output directories left behind by the builder into proper store objects:
74160

75161
- **Normalize the file permissions**
76162

77-
Nix sets the last-modified timestamp on all files
78-
in the build result to 1 (00:00:01 1/1/1970 UTC), sets the group to
79-
the default group, and sets the mode of the file to 0444 or 0555
80-
(i.e., read-only, with execute permission enabled if the file was
81-
originally executable). Any possible `setuid` and `setgid`
82-
bits are cleared.
83-
84-
> **Note**
85-
>
86-
> Setuid and setgid programs are not currently supported by Nix.
87-
> This is because the Nix archives used in deployment have no concept of ownership information,
88-
> and because it makes the build result dependent on the user performing the build.
163+
The files must conform to the model described in the [Exposing in OS file systems](./file-system-object/os-file-system.md) section.
164+
For example, timestamps and permissions must be forced to sentinel values.
89165

90166
- **Calculate the references**
91167

92-
Nix scans each output path for
93-
references to input paths by looking for the hash parts of the input
94-
paths. Since these are potential runtime dependencies, Nix registers
95-
them as dependencies of the output paths.
168+
Nix scans each output path for references to input store objects by looking for the store path digests of each input.
169+
(The name part is ignored when scanning; an input's hash part that is not followed by a `-` and the correct name part still scans as a reference.
170+
Likewise, a digest not preceded by the [store directory path] also still scans as a reference.)
171+
Since these are potential runtime dependencies, Nix will register them as references of the output store object they occur in.
96172

97-
Nix also scans for references to other outputs' paths in the same way, because outputs are allowed to refer to each other.
173+
Nix also scans for references from one output to another in the same way, because outputs are allowed to refer to each other.
98174
If the outputs' references to each other form a cycle, this is an error, because the references of store objects must be acyclic.
99175

176+
For derivations whose output paths are fixed in advance (i.e. [input-addressing] derivations, or [fixed content-addressing] derivations), the actual final store path of each output is used during the build.
177+
For [floating content-addressing] derivations, however, the final store path is not known in advance by definition.
178+
Scratch store paths must therefore be used instead.
179+
Scanning will use those scratch paths, but then any output-to-be that contains such a scanned scratch path must be rewritten to instead use the final (content-addressed) path of the output in question.
180+
181+
At this point, the file system data is in the proper form, and the valid acyclic reference data for each output is also calculated, so the outputs can be registered as proper store objects, and associated with the derivation in the [build trace] in the record for a successful build.
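
The reference scan and the cycle check described above can be sketched as follows. The helper names are hypothetical and the scan works on an in-memory byte string rather than on files. Treating a self-reference as harmless in the cycle check is an assumption of this sketch.

```python
def digest_of(store_path, store_dir="/nix/store"):
    """Extract the digest part of a store path (the text before the first '-')."""
    base_name = store_path[len(store_dir) + 1:]
    return base_name.split("-", 1)[0]


def scan_references(data, candidates):
    """Return the store paths whose digest occurs anywhere in `data`.

    `candidates` maps digest -> store path. Only the digest is searched for,
    so the name part and the store directory prefix do not need to be present.
    """
    return {path for digest, path in candidates.items() if digest.encode() in data}


def has_cycle(refs):
    """Depth-first search for a cycle in `refs` (node -> set of referenced nodes)."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        for nxt in refs.get(node, ()):
            # Assumption of this sketch: self-references are ignored.
            if nxt != node and visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(node) for node in refs)
```

Given candidates built with `{digest_of(p): p for p in inputs}`, a file containing only the bare digest of an input still yields that input as a reference, matching the behaviour described in the text.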
100182

101183
[Nix instance]: @docroot@/glossary.md#gloss-nix-instance
184+
[input-addressing]: ./derivation/outputs/input-address.md
185+
[fixed content-addressing]: ./derivation/outputs/content-address.md#fixed
186+
[floating content-addressing]: ./derivation/outputs/content-address.md#floating
187+
[build trace]: ./build-trace.md
