Otherwise https://www.shellcheck.net/ would complain:
Line 141:
assert_line --index 0 "~/.bash_profile read"
^------------------^ SC2088 (warning): Tilde
does not expand in quotes.
Use $HOME.
See: https://www.shellcheck.net/wiki/SC2088
This is a false positive. There's no need for the tilde to be expanded
because it's not being used for any file system operation. It's merely
a human-readable string.
However, it's easier to change the string to use $HOME than littering
the file with ShellCheck's inline 'disable' directives.
https://github.com/containers/toolbox/pull/1366
These files aren't marked as executable, and shouldn't be, because they
aren't meant to be standalone executable scripts. They're meant to be
part of a test suite driven by Bats. Therefore, it doesn't make sense
for them to have shebangs, because it gives the opposite impression.
The shebangs were actually being used by external tools like Coverity to
deduce the shell when running shellcheck(1). Shellcheck's inline
'shell' directive is a more obvious way to achieve that.
https://github.com/containers/toolbox/pull/1363
The setup_suite.bash file is meant to be written in Bash, and is not
supposed to have any Bats-specific syntax. That's why it has the *.bash
suffix, not *.bats. If Bats finds a setup_suite.bash file, when running
the test suite, it uses Bash's source(1) builtin to read the file.
This is a cosmetic change. The Bats syntax is a superset of the Bash
syntax. Therefore, it didn't make a difference to external tools like
Coverity that use the shebang to deduce the shell for shellcheck(1).
Secondly setup_suite.bash isn't meant to be an executable script and,
hence, the shebang has no effect on how the file is used. However, it's
still a commonly used hint about the contents of the file, and it's
better to be accurate than misleading.
A subsequent commit will replace the shebangs in the test suite with
ShellCheck's 'shell' directives.
Fallout from 7a387dcc8bhttps://github.com/containers/toolbox/pull/1363
It's one less invocation of an external command, which is good because
spawning a new process is generally expensive.
One positive side-effect of this is that on some Active Directory
set-ups, the entry point no longer fails with:
Error: failed to remove password for user login@company.com: failed
to invoke passwd(1)
... because of:
# passwd --delete login@company.com
passwd: Libuser error at line: 210 - name contains invalid char `@'.
This is purely an accident, and isn't meant to be an intential change to
support Active Directory. Tools like useradd(8) and usermod(8) from
Shadow aren't meant to work with Active Directory users, and, hence, it
can still break in other ways. For that, one option is to expose $USER
from the host operating system to the Toolbx container through a Varlink
interface that can be used by nss-systemd inside the container.
Based on an idea from Si.
https://github.com/containers/toolbox/issues/585
Until now, configureUsers() was pushing the burden of deciding whether
to add a new user or modify an existing one on the callers, even though
it can trivially decide itself. Involving the caller loosens the
encapsulation of the user configuration logic by spreading it across
configureUsers() and it's caller, and adds an extra function parameter
that needs to be carefully set and is vulnerable to programmer errors.
Fallout from 9ea6fe5852https://github.com/containers/toolbox/pull/1356
These tests assume that the group and user information on the host
operating system can be provided by different plugins for the GNU Name
Service Switch (or NSS) functionality of the GNU C Library. eg., on
enterprise FreeIPA set-ups. However, it's expected that everything
inside the Toolbx container will be provided by /etc/group, /etc/passwd,
/etc/shadow, etc..
While /etc/group and /etc/passwd can be read by any user, /etc/shadow
can only be read by root. However, it's awkward to use sudo(8) in the
test cases involving /etc/shadow, because they ensure that root and
$USER don't need passwords to authenticate inside the container, and
sudo(8) itself depends on that. If sudo(8) is used, the test suite can
behave unexpectedly if Toolbx didn't set up the container correctly.
eg., it can get blocked waiting for a password.
Hence, 'podman unshare' is used instead to enter the container's initial
user namespace, where $USER from the host appears as root. This is
sufficient because the test cases only need to read /etc/shadow inside
the Toolbx container.
https://github.com/containers/toolbox/pull/1355
Sometimes locations such as /var/lib/flatpak, /var/lib/systemd/coredump
and /var/log/journal sit on security hardened mount points that are
marked as 'nosuid,nodev,noexec' [1]. In such cases, when Toolbx is used
rootless, an attempt to bind mount these locations read-only at runtime
with mount(8) fails because of permission problems:
# mount --rbind -o ro <source> <containerPath>
mount: <containerPath>: filesystem was mounted, but any subsequent
operation failed: Unknown error 5005.
(Note that the above error message from mount(8) was subsequently
improved to show something more meaningful than 'Unknown error' [2].)
The problem is that 'init-container' is running inside the container's
mount and user namespace, and the source paths were mounted inside the
host's namespace with 'nosuid,nodev,noexec'. The above mount(8) call
tries to remove the 'nosuid,nodev,noexec' flags from the mount point and
replace them with only 'ro', which is something that can't be done from
a child namespace.
Note that this doesn't fail when Toolbx is running as root. This is
because the container uses the host's user namespace and is able to
remove the 'nosuid,nodev,noexec' flags from the mount point and replace
them with only 'ro'. Even though it doesn't fail, the flags shouldn't
get replaced like that inside the container, because it removes the
security hardening of those mount points.
There's actually no benefit in bind mounting these paths as read-only.
It was historically done this way 'just to be safe' because a user isn't
expected to write to these locations from inside a container. However,
Toolbx doesn't intend to provide any heightened security beyond what's
already available on the host.
Hence, it's better to get out of the way and leave it to the permissions
on the source location from the host operating system to guard the
castle. This is accomplished by not passing any file system options to
mount(8) [1].
Based on an idea from Si.
[1] https://man7.org/linux/man-pages/man8/mount.8.html
[2] util-linux commit 9420ca34dc8b6f0f
https://github.com/util-linux/util-linux/commit/9420ca34dc8b6f0fhttps://github.com/util-linux/util-linux/pull/2376https://github.com/containers/toolbox/issues/911
The '[' and 'test' implementations from GNU coreutils don't support '-v'
as a way to check if a shell variable is set [1]. Only Bash's built-in
implementations do.
This is quite confusing and makes it difficult to find out what '-v'
actually does. eg., 'man --all test' only shows the manual for the GNU
coreutils version, which doesn't list '-v' [1], and, 'man --all [' only
shows the manual for Bash's built-ins, which also doesn't list '-v'.
One has to go to the bash(1) manual to find it [2].
Elsewhere in the code base [3], the same thing is accomplished with '-z'
and parameter substitution, which are more widely supported and, hence,
easier to find documentation for.
[1] https://manpages.debian.org/testing/coreutils/test.1.en.html
[2] https://linux.die.net/man/1/bash
[3] Commit 84ae385f33https://github.com/containers/toolbox/pull/1334https://github.com/containers/toolbox/pull/1341
'[' is a command that's the same as 'test' and they might be implemented
as standalone executables or shell built-ins. Therefore, the negation
(ie., '!') has to cover the entire command to operate on its exit code.
Instead, if it's writtten as '[ ! ... ]', then the negation becomes an
argument to '[', which isn't the same thing.
Fallout from 54a2ca1eadhttps://github.com/containers/toolbox/pull/1341
First, it's not a good idea to use awk(1) as a grep(1) replacement.
Unless one really needs the AWK programming language, it's better to
stick to grep(1) because it's simpler.
Secondly, it's better to look for a specific os-release(5) field instead
of looking for the occurrence of 'rawhide' anywhere in the file, because
it lowers the possibility of false positives.
https://github.com/containers/toolbox/pull/1336
The Zuul executor contains Ansible 2.13.7 whose 'dnf' module is not
working as it should with Fedora Rawhide because of the DNF5 Change [1].
Unlike DNF4, DNF5 no longer pulls in the python3-dnf RPM, which causes:
TASK [Install RPM packages]
fedora-rawhide | ERROR
fedora-rawhide | {
fedora-rawhide | "msg": "Could not import the dnf python module
using /usr/bin/python3 (3.12.0b3 (main, Jun 21 2023, 00:00:00)
[GCC 13.1.1 20230614 (Red Hat 13.1.1-4)]). Please install
`python3-dnf` or `python2-dnf` package or ensure you have
specified the correct ansible_python_interpreter. (attempted
['/usr/libexec/platform-python', '/usr/bin/python3',
'/usr/bin/python2', '/usr/bin/python'])",
fedora-rawhide | "results": []
fedora-rawhide | }
This adds a workaround that explicitly installs the python3-dnf RPM
using Ansible's 'command' module. It should be removed after Zuul
contains a newer release of Ansible.
[1] https://fedoraproject.org/wiki/Changes/ReplaceDnfWithDnf5https://github.com/containers/toolbox/pull/1338
Signed-off-by: Daniel Pawlik <dpawlik@redhat.com>
First, it's not a good idea to use awk(1) as a grep(1) replacement.
Unless one really needs the AWK programming language, it's better to
stick to grep(1) because it's simpler.
Secondly, it's better to look for a specific os-release(5) field instead
of looking for the occurrence of 'rawhide' anywhere in the file, because
it lowers the possibility of false positives.
https://github.com/containers/toolbox/pull/1332
The following caveats must be noted:
* Podman sets the Toolbx container's soft limit for the maximum number
of open file descriptors to the host's hard limit, which is often
greater than the host's soft limit [1].
* The ulimit(1) options -P, -T, -b, and -k don't work on Fedora 38
because the corresponding resource arguments for getrlimit(2) are
absent from the operating system. These are RLIMIT_NPTS,
RLIMIT_PTHREAD, RLIMIT_SBSIZE and RLIMIT_KQUEUES respectively.
[1] https://github.com/containers/podman/issues/17681https://github.com/containers/toolbox/issues/213
The current approach of extracting the VERSION_ID field from
os-release(5) assumes that the value is not quoted. There's no
guarantee that this will be the case. It only happens to be so on
Fedora by chance, and is different on Ubuntu:
$ cat /etc/os-release
...
VERSION_ID="22.04"
...
This means that "22.04", including the double quotes, is read as the
value of VERSION_ID on Ubuntu, not 22.04. This is wrong because this
value can't be used as is in image and container names. There's no
image called quay.io/toolbx/ubuntu-toolbox:"22.04" and double quotes are
not allowed in container names.
Instead, use the same approach as profile.d/toolbox.sh and the old POSIX
shell implementation that doesn't rely on the quoting of the
os-release(5) values.
Fallout from b27795a03ehttps://github.com/containers/toolbox/pull/1320
The current approach of selecting all the os-release(5) fields that have
'ID' in their name (eg., ID, VERSION_ID, PLATFORM_ID, VARIANT_ID, etc.)
and then picking the first one, assumes that the ID field will always be
placed above the others in os-release(5). There's no guarantee that
this will be the case. It only happens to be so on Fedora by chance,
and is different on Ubuntu:
$ cat /etc/os-release
...
VERSION_ID="22.04"
...
ID=ubuntu
ID_LIKE=debian
...
This means that "22.04" is read as the value of ID on Ubuntu, which is
clearly wrong.
Instead, use the same approach as profile.d/toolbox.sh and the old POSIX
shell implementation that doesn't rely on the order of the os-release(5)
fields.
Fallout from 54a2ca1eadhttps://github.com/containers/toolbox/pull/1320
Ansible's 'shell' module is almost exactly like the 'command' module,
except that it runs the command through a command line shell so that
environment variables like HOSTNAME and operations like '*', '<' and '>'
work. None of those things are necessary are here. Hence, it's better
to use the 'command' module as elsewhere.
Note that, unlike Ansible's 'shell' module, the 'command' module doesn't
support inline scripts. So, each command needs to be in its own
separate task.
https://github.com/containers/toolbox/pull/1318
We wasted some time trying to get the tests running locally, when all we
were missing were the 'git submodule ...' commands.
Add some more obvious hints about this possible stumbling block.
Note that Bats cautions against printing outside the @test, setup* or
teardown* functions [1]. In this case, doing so leads to the first line
of the error output going missing, when using the pretty formatter for
human consumption:
$ bats --formatter pretty ./test/system
✗ setup_suite
Forgot to run 'git submodule init' and 'git submodule update' ?
bats warning: Executed 1 instead of expected 191 tests
191 tests, 1 failure, 190 not run
[1] https://bats-core.readthedocs.io/en/stable/writing-tests.htmlhttps://github.com/containers/toolbox/pull/1298
Signed-off-by: Matthias Clasen <mclasen@redhat.com>
The 000-setup.bats and 999-teardown.bats files were added [1] at a time
when Bats didn't offer any hooks for suite-wide setup and teardown.
That changed in Bats 1.7.0, which introduced the setup_suite and
teardown_suite hooks. These hooks make it easier to run a subset of the
tests, which is a good thing.
In the past, to run a subset of the tests, one had to do:
$ bats ./test/system/000-setup.bats ./test/system/002-help.bats \
./test/system/999-teardown.bats
Now, one only has to do:
$ bats ./test/system/002-help.bats
Commit e22a82fec8 already added a dependency on Bats >= 1.7.0.
Therefore, it should be exploited wherever possible to simplify things.
[1] Commit 54a2ca1eadhttps://github.com/containers/toolbox/issues/751
[2] Bats commit fb467ec3f04e322a
https://github.com/bats-core/bats-core/issues/39https://bats-core.readthedocs.io/en/stable/writing-tests.htmlhttps://github.com/containers/toolbox/pull/1317
Bats 1.7.0 emits a warning if a feature that is only available starting
from a certain version of Bats onwards is used without specifying that
version [1]:
BW02: Using flags on `run` requires at least BATS_VERSION=1.5.0. Use
`bats_require_minimum_version 1.5.0` to fix this message.
(from function `bats_warn_minimum_guaranteed_version' in file
/usr/lib/bats-core/warnings.bash, line 32,
from function `run' in file
/usr/lib/bats-core/test_functions.bash, line 227,
in test file test/system/001-version.bats, line 27)
Note that bats_require_minimum_version itself is only available from
Bats 1.7.0 [2]. Hence, even though the specific feature here (using
flags on 'run') only requires Bats >= 1.5.0, in practice Bats >= 1.7.0
is needed. Fortunately, commit e22a82fec8 already added a
dependency on Bats >= 1.7.0. So, there's nothing to worry about.
[1] Bats commit 82002bb6c1a5c418
https://github.com/bats-core/bats-core/issues/556https://bats-core.readthedocs.io/en/stable/warnings/BW02.html
[2] Bats commit 71d6b71cebc3d32b
https://github.com/bats-core/bats-core/issues/556https://bats-core.readthedocs.io/en/stable/warnings/BW02.htmlhttps://github.com/containers/toolbox/pull/1315
Commit e22a82fec8 already added a dependency on Bats >= 1.7.0,
which is present on Fedora >= 36. Therefore, it should be exploited
wherever possible to simplify things.
Earlier, when the line counts were checked only with Bats >= 1.7.0,
there was a need to separately check the whole standard error and
output streams with 'assert_output' for the tests to be useful on
Fedora 35, which only had Bats 1.5.0. Now that the line counts are
being checked unconditionally, there's no need for that anymore.
Note that bats_require_minimum_version itself is only available from
Bats 1.7.0 [1].
[1] Bats commit 71d6b71cebc3d32b
https://github.com/bats-core/bats-core/issues/556https://bats-core.readthedocs.io/en/stable/warnings/BW02.htmlhttps://github.com/containers/toolbox/pull/1314
This allows using the 'distro' option to create and enter Arch Linux
containers. Due to Arch's rolling-release model, the 'release' option
isn't required. If 'release' is used, the accepted values are 'latest'
and 'rolling'.
https://github.com/containers/toolbox/pull/1311
Operating system distributions like Arch Linux that follow a
rolling-release model don't have the concept of a release. The latest
snapshot is the only available release.
A subsequent commit will add built-in support for Arch Linux. Hence,
the code can no longer assume that every distribution will have a
matching release.
Note that just because an operating system distribution may not have the
concept of a release, it doesn't mean that it will accept an invalid
'release' option.
https://github.com/containers/toolbox/pull/1311
The VERSION_ID field in os-release(5) is optional [1]. It's absent on
Arch Linux, which follows a rolling-release model and uses the BUILD_ID
field instead:
BUILD_ID=rolling
A subsequent commit will add built-in support for Arch Linux. Hence,
the code to get the default release from the host operating system can
no longer assume the presence of the VERSION_ID field in os-release(5).
Note that the arch-toolbox image is tagged with 'latest', in accordance
with OCI conventions, not 'rolling' [2,3], which is the os-release(5)
BUILD_ID. Therefore, it will be wise to use 'latest' as the default
release on Arch Linux, to simplify how the default release matches with
the default image's tag. This means that a os-release(5) field can't be
used for the default release on Arch.
[1] https://www.freedesktop.org/software/systemd/man/os-release.html
[2] Commit 2568528cb7https://github.com/containers/toolbox/pull/861
[3] Commit a4e5861ae5https://github.com/containers/toolbox/pull/1308https://github.com/containers/toolbox/pull/1303