GNU roff 1.23 stopped remapping unescaped Hyphen-Minus (ie., - or 0x2D)
characters in the input to Hyphen-Minus in the output. Instead, it
follows the specified behaviour of converting unescaped Hyphen-Minus
characters in the input to Hyphen (ie., ‐ or 0x2010) in the output. To
get Hyphen-Minus characters in the output, one needs to escape the
Hyphen-Minus with a backslash (ie., \-) in the input [1].
Therefore, the command line options documented in the manuals are no
longer prefixed with the Hyphen-Minus character that's needed to
ctually use them. This breaks copying and pasting from the manuals and
searching within them.
Unfortunately, escaping the Hyphen-Minus characters in Markdown doesn't
have the intended effect of having Hyphen-Minus in the generated manual
pages [2]. Therefore, this is worked around by having the tests check
for both Hyphen-Minus and Hyphen.
Note that some operating system distributions, like Debian, have
reverted this change from GNU roff, but others haven't. So, unless it
can be guaranteed that the manuals will always have Hyphen-Minus
regardless of which GNU roff version or variant is being used, the tests
need to check for both.
[1] https://lwn.net/Articles/947941/https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00001.htmlhttps://git.savannah.gnu.org/cgit/groff.git/tree/PROBLEMS?h=1.23.0#n82
[2] https://github.com/cpuguy83/go-md2man/issues/101https://github.com/containers/toolbox/pull/1398
This should finally ensure that the fedora-toolbox image doesn't have
any package that had its content, such as documentation or translations,
stripped out by the fedora base image.
Until now, missing-docs had a hand-maintained list of packages that had
their content stripped out by the fedora base image. These packages are
reinstalled when building the fedora-toolbox image to restore the lost
content. Unfortunately, this list was incomplete because it was only
updated when someone noticed that something is missing.
Now, the list is generated with:
$ rpm --all --query --state --queryformat "PACKAGE: %{NAME}\n"
... to ensure that it's always complete.
The existing built-in test to ensure that the desired files are actually
present in the final image was extended to cover some of those that were
absent. A new built-in test, based on the above rpm(1) command, was
added as a fallback to ensure that the final image doesn't have any
package with missing content.
Only the images for currently maintained Fedoras (ie., 37, 38 and 39)
were updated.
As suggested by Brian Campbell.
https://github.com/containers/toolbox/issues/603
The shadow-utils package has always been part of the fedora base image.
It's explicitly listed in extra-packages as a safeguard against losing
useradd(8) and usermod(8) by mistake because they are needed by the
entry point of a Toolbx container [1]. Hence, the need to restore the
shadow-utils documentation that was stripped out in the base image.
Only the images for currently maintained Fedoras (ie., 37, 38 and 39)
were updated.
[1] Commit c6772f0f11https://github.com/containers/toolbox/commit/c6772f0f112e8004https://github.com/containers/toolbox/pull/1394
Ansible's built-in 'package' module doesn't show any details when
installing the RPMs. All that can be seen is:
TASK [Install RPM packages]
fedora-rawhide | changed
Therefore, there's no way to know what version of the packages got
installed.
In this case, not knowing the go-md2man(1) version being used by the CI
makes it difficult to know why the tests are failing on Fedora Rawhide
and Fedora 39 with:
not ok 3 help: Command 'help' in 177ms
# (from function `assert_line' in file
test/system/libs/bats-assert/src/assert.bash, line 479,
# in test file test/system/002-help.bats, line 48)
# `assert_line --index 0 --partial "toolbox(1)"' failed
# /usr/bin/man
#
# -- line does not contain substring --
# index : 0
# substring : toolbox(1)
# line : troff:<standard input>:33: warning: cannot select font
'C'
# --
#
It could be either because the CI is still using an older version of
go-md2man(1) [1,2], or that there's some other problem.
[1] Fedora golang-github-cpuguy83-md2man commit 117806d50e401c19
https://src.fedoraproject.org/rpms/golang-github-cpuguy83-md2man/c/117806d50e401c19https://src.fedoraproject.org/rpms/golang-github-cpuguy83-md2man/pull-request/3
[2] go-md2man commit d85280db9b54b574
https://github.com/cpuguy83/go-md2man/commit/d85280db9b54b574https://github.com/cpuguy83/go-md2man/issues/99https://github.com/containers/toolbox/pull/1386
Currently, some of the names of the tests were too long, and had
inconsistent and verbose wording. This made it difficult to look at
them and get a gist of all the scenarios being tested. The names are
like headings. They shouldn't be too long, should capture the primary
objective of the test and be consistent in their wording.
Note that the term 'usage screen' was particularly confusing. Prior to
commit 3dc106e10a, 'usage screen' in the names of the tests also
referred to the very brief listing of the commands and options that's
shown by 'toolbox help' and 'toolbox --help' in the absence of man(1).
In the context of this change, the term referred to the brief two line
error message that's shown when an unknown command or flag is used. So,
it will be good to not use it anymore.
https://github.com/containers/toolbox/pull/1386
Commit 5e63e9ec9b added a 'help' command to show the toolbox(1)
manual or a manual page for a specific command, and made the --help flag
identical to it. Therefore it's misleading to say that the --help flag
should show the usage screen. The usage screen is a brief listing of
the commands and options, which isn't the same thing as the more
detailed manuals.
Later, after this test was written, commit 40fc1689a3 added a
fallback for host operating systems without man(1), like Fedora CoreOS,
that would show a very brief usage screen with only the most common
commands.
To make it more confusing, the test was checking for a string that's
common to both the toolbox(1) manual and the fallback brief usage screen
that might be shown by 'toolbox --help'. This meant that it was neither
able to distinguish between the code paths nor ensure that they were
working as intended.
This was resolved by adapting the existing 'toolbox --help' test to
strictly ensure that it's showing the toolbox(1) manual when man(1) is
present, and by adding a new test to strictly ensure that it's showing
the fallback brief usage screen when man(1) is absent.
Until Bats 1.10.0, 'run --keep-empty-lines' had a bug where it counted
the trailing newline on the last line as a separate line [1]. However,
Bats 1.10.0 is only available in Fedora >= 39 and is absent from Fedoras
37 and 38.
Fallout from b27795a03e
[1] Bats commit 6648e2143bffb933
https://github.com/bats-core/bats-core/commit/6648e2143bffb933https://github.com/bats-core/bats-core/issues/708https://github.com/containers/toolbox/pull/1386
Until now, only the packages that are present in the fedora base image,
and had their documentation stripped out, were being tested for the
availability of documentation. There were no tests for the extra
packages that get added to the base image to form the fedora-toolbox
image.
The util-linux and xz packages were picked as examples for these new
tests. The xz package is a particularly good example because it has
translations for its manuals. It can help test that the fedora-toolbox
image is localized just like Fedora Silverblue and Workstation.
Only the images for currently maintained Fedoras (ie., 37, 38 and 39)
were updated.
https://github.com/containers/toolbox/pull/1384
Any system-wide customization to Bash's history facilities done through
a custom /etc/profile.d configuration snippet on the host operating
system gets lost inside the Toolbx container.
This is because Toolbx doesn't know what name to expect for the custom
/etc/profile.d snippet on the host, and, hence, can't give access to it
through a bind mount or symbolic link inside the container. The user
can definitely set up their own symbolic link inside the container to a
snippet inside /run/host/etc/profile.d. However, it's tedious to do
that for all containers, and the user may not even know that they are
missing the customization until they notice something wrong with the
history, which is shared across all containers and the host, and at that
point they might have already lost commands that they can't easily
reconstruct.
Therefore, it's worth trying to improve the situation by default.
This tries to preserve the environment variables used to customize
Bash's history facilities [1] across the host operating system and
Toolbx container. It assumes that the Bash start-up scripts inside the
container won't overwrite any of the propagated variables, which might
not always be the case [2].
[1] https://www.gnu.org/software/bash/manual/html_node/Bash-History-Facilities.htmlhttps://www.gnu.org/software/bash/manual/html_node/Bash-Variables.html
[2] https://pagure.io/setup/pull-request/48https://github.com/containers/toolbox/issues/1359
Currently, if 'skopeo copy ...' fails to download and cache an OCI image
during setup_suite(), the test suite doesn't immediately fail, but
continues. It only fails later when trying to set up the Docker
registry and contains a lot of noise:
not ok 1 setup_suite
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# from function `_setup_docker_registry' in file
test/system/libs/helpers.bash, line 211,
# from function `setup_suite' in test file
test/system/setup_suite.bash, line 59)
# `_setup_docker_registry' failed
# Failed to cache image registry.fedoraproject.org/fedora-toolbox:38
to /tmp/bats-run-GyTP7A/image-cache/fedora-toolbox-38
#
# -- command failed --
# status : 1
# output : time="2023-09-25T12:19:52+02:00" level=fatal
msg="initializing source
docker://registry.fedoraproject.org/fedora-toolbox:38-foo:
reading manifest 38-foo in
registry.fedoraproject.org/fedora-toolbox: manifest unknown"
# --
#
# Failed to cache image quay.io/toolbx/arch-toolbox:latest to
/tmp/bats-run-GyTP7A/image-cache/arch-toolbox-latest
#
# -- command failed --
# status : 1
# output : time="2023-09-25T12:20:48+02:00" level=fatal
msg="initializing source
docker://quay.io/toolbx/arch-toolbox:latest-foo: reading
manifest latest-foo in quay.io/toolbx/arch-toolbox: manifest
unknown"
# --
#
# Failed to cache image registry.fedoraproject.org/fedora-toolbox:34
to /tmp/bats-run-GyTP7A/image-cache/fedora-toolbox-34
#
# -- command failed --
# status : 1
# output : time="2023-09-25T12:21:42+02:00" level=fatal
msg="initializing source
docker://registry.fedoraproject.org/fedora-toolbox:34-foo:
reading manifest 34-foo in
registry.fedoraproject.org/fedora-toolbox: manifest unknown"
# --
#
# ...
#
# -- command failed --
# status : 1
# output : time="2023-09-25T12:26:33+02:00" level=fatal
msg="determining manifest MIME type for
dir:/tmp/bats-run-GyTP7A/image-cache/fedora-toolbox-34: open
/tmp/bats-run-GyTP7A/image-cache/fedora-toolbox-34/manifest.json:
no such file or directory"
# --
#
# docker-registry
# 27fa141e291e64e4c7a148c88ddab219ff2bfb5802a2982dc4188dc11f41692d
# Untagged: quay.io/toolbox_tests/registry:latest
# Deleted: fea5a12cde107bb407bc44ede6dd9edea1d2b4171cd8e52b0cb330bf45e517e1
It makes it look as if the root cause of the failure is related to
setting up the Docker registry, which it isn't, and all that noise makes
it difficult to spot the actual problem.
Instead, from now on, it will be more obvious:
not ok 1 setup_suite
# (from function `setup_suite' in test file
test/system/setup_suite.bash, line 44)
# `_pull_and_cache_distro_image "$system_id" "$system_version" ||
false' failed
# Failed to cache image registry.fedoraproject.org/fedora-toolbox:38
to /tmp/bats-run-62b8CU/image-cache/fedora-toolbox-38
# time="2023-09-25T13:55:42+02:00" level=fatal msg="initializing
source docker://registry.fedoraproject.org/fedora-toolbox:38-foo:
reading manifest 38-foo in
registry.fedoraproject.org/fedora-toolbox: manifest unknown"
Note that Bats' 'run' helper [1] isn't designed to work inside
setup_suite(). eg., 'run --separate-stderr' doesn't work because
BATS_TEST_TMPDIR isn't defined.
[1] https://bats-core.readthedocs.io/en/stable/writing-tests.htmlhttps://github.com/containers/toolbox/pull/1377
If setup_suite() fails for some reason, then an unrelated message from
'podman system reset' would show up:
not ok 1 setup_suite
# (from function `setup_suite' in test file
test/system/setup_suite.bash, line 43)
# `_pull_and_cache_distro_image foo || false' failed
# Requested distro (foo) does not have a matching image
# A "/home/rishi/.cache/toolbox/system-test-storage/storage.conf"
config file exists.
# Remove this file if you did not modify the configuration.
This extra error message from 'podman system reset' serves no purpose
because it's not related to the cause of the setup_suite() failure.
It's just noise and it's better to silence it.
https://github.com/containers/toolbox/pull/1375
If setup_suite() fails for some reason, causing the Docker registry to
not be created, then an unrelated message from 'podman stop' would show
up:
not ok 1 setup_suite
# (from function `setup_suite' in test file
test/system/setup_suite.bash, line 43)
# `_pull_and_cache_distro_image foo || false' failed
# Requested distro (foo) does not have a matching image
# Error: no container with name or ID "docker-registry" found: no such
container
# ...
# ...
This extra error message from 'podman stop' serves no purpose because
it's not related to the cause of the setup_suite() failure. It's just
noise and it's better to silence it.
https://github.com/containers/toolbox/pull/1375
Contrary to what the documentation might seem to imply [1], Bats' 'fail'
helper only aborts a test case under certain circumstances. eg., when
called from setup_suite(), but not from within a child function, and a
@test case, but not from within the 'run' helper.
If 'fail' is called from within 'run', then the code after it will
continue to execute. The test case will only fail if 'run' eventually
catches a non-zero exit code that's caught by 'assert_success' [2].
Similarly, it doesn't abort if called from within a child function in
setup_suite().
Currently, _pull_and_cache_distro_image() is a child function called
from setup_suite(). So 'fail' won't abort if an invalid distribution is
specified.
Fortunately, pull_distro_image() is being called from within @test
cases, but outside 'run'. So, there's no problem with it now. However,
some future code changes can unknowingly alter this reality and it too
can run into unexpected behaviour.
Therefore, it's better to be safe, and explicitly specify a non-zero
exit code after 'fail'. It will ensure that it works as expected under
all circumstances.
[1] https://github.com/bats-core/bats-support
[2] https://github.com/bats-core/bats-asserthttps://github.com/containers/toolbox/pull/1375
Currently, if a Toolbx container's entry point fails to initialize the
container, there's no way to see the debug logs and error messages from
the entry point:
not ok 106 container: Check container starts without issues
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# in test file test/system/103-container.bats, line 39)
# `assert_success' failed
#
# -- command failed --
# status : 1
# output :
# --
#
Instead, from now on, they will be visible:
not ok 106 container: Check container starts without issues
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# in test file test/system/103-container.bats, line 39)
# `assert_success' failed
#
# -- command failed --
# status : 1
# output (90 lines):
# Failed to initialize container fedora-toolbox-38
# level=debug msg="Running as real user ID 0"
# level=debug msg="Resolved absolute path to the executable as
/usr/bin/toolbox"
# level=debug msg="TOOLBOX_PATH is /opt/bin/toolbox"
# level=debug msg="Migrating to newer Podman"
# level=debug msg="Migration not needed: running inside a container"
# level=debug msg="Setting up configuration"
# ...
# --
#
https://github.com/containers/toolbox/pull/1374
Bats' 'run' helper returns with an exit code of 0 even when the command
that it was given to run failed with a non-zero exit code [1]. This is
to enable making further assertions about the command after 'run' has
finished. If there's nothing that checks for failures, then it will
continue as if everything is alright.
Therefore, currently, if 'podman logs' fails, there's no indication of
it and the test only fails later because it thinks that the container
failed to initialize:
not ok 106 container: Check container starts without issues
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# in test file test/system/103-container.bats, line 39)
# `assert_success' failed
#
# -- command failed --
# status : 1
# output :
# --
#
Instead, from now on, it will be more obvious:
not ok 106 container: Check container starts without issues
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# in test file test/system/103-container.bats, line 39)
# `assert_success' failed
#
# -- command failed --
# status : 125
# output (2 lines):
# Failed to invoke '/usr/bin/podman logs'
# Error: no container with name or ID "foo" found: no such container
# --
#
One alternative was to use 'assert_success' [2] to assert that the
command given to 'run' succeeded. That would show the 'podman logs'
failure as:
not ok 106 container: Check container starts without issues
# (from function `assert_success' in file
test/system/libs/bats-assert/src/assert.bash, line 114,
# in test file test/system/103-container.bats, line 39)
# `assert_success' failed
#
# -- command failed --
# status : 1
# output (29 lines):
#
# -- command failed --
# status : 125
# output : Error: no container with name or ID "foo" found: no such
container
# --
#
# ...
#
# -- command failed --
# status : 125
# output : Error: no container with name or ID "foo" found: no such
container
# --
# --
#
However, it's a bit too noisy because of the 'assert_success' not
terminating container_started() and continuing to loop for the remaining
attempts.
[1] https://bats-core.readthedocs.io/en/stable/writing-tests.html
[2] https://github.com/bats-core/bats-asserthttps://github.com/containers/toolbox/pull/1372
A subsequent commit will use this variable to set the return value for a
different condition. Therefore, the name needs to be changed to suit
the purpose.
https://github.com/containers/toolbox/pull/1372
Until Bats 1.10.0, 'run' with options had a bug where it would overwrite
the value of the 'i' variable even outside 'run' [1].
In these particular instances, no options are being passed to 'run',
and, hence, currently there's no problem. However, in case a future
commit adds an option, then it could lead to hard-to-debug problems.
eg., --separate-stderr sets 'i' to 1, --show-output-of-passing-tests
sets it to 2, etc.. Therefore, depending on the flag and the loop, the
loop might get terminated prematurely or run infinitely or something
else.
Moreover, Bats 1.10.0 is only available in Fedora >= 39 and is absent
from Fedoras 37 and 38. Therefore, it's not possible to consider this
bug fixed.
Hence, it's better to preemptively work around it to avoid any future
issues.
[1] Bats commit 502dc47dd063c187
https://github.com/bats-core/bats-core/commit/502dc47dd063c187https://github.com/bats-core/bats-core/issues/726https://github.com/containers/toolbox/pull/1373