It turns out that at least since Fedora 30 [1], the gnupg2 package has
been part of the fedora base image, because it's required by the dnf
package:
dnf -> python3-dnf -> python3-libdnf -> libdnf -> gpgme -> gnupg2
Hence, the need to restore the gnupg2 documentation that was stripped
out in the base image.
Only the images for currently maintained Fedoras (ie., 36, 37 and 38)
were updated.
[1] It's difficult to find out if the gnupg2 package wasn't part of the
fedora base image before Fedora 30, because those images are no
longer available from registry.fedoraproject.org.
https://github.com/containers/toolbox/pull/1228
Building an OCI image leads to so much spew that it's hard to notice if
something unexpected happened, and as seen in the previous commit [1],
unexpected things do happen.
Therefore, this adds a built-in test to ensure that the desired files
are actually present in the final image. Right now it only checks the
presence of some representative manuals to ensure that the packages
listed in the 'missing-docs' file really do get reinstalled, and the
documentation that was stripped out in the base image really does get
restored.
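The presence check can be pictured with the sketch below. It is only an illustration written in Go, not the test that was added to the image build, and the manual paths and the 'missingFiles' helper are hypothetical:

```go
package main

import (
	"fmt"
	"os"
)

// missingFiles returns the subset of paths that can't be found on the
// file system. It is a hypothetical helper, not part of Toolbx.
func missingFiles(paths []string) []string {
	var missing []string
	for _, path := range paths {
		if _, err := os.Stat(path); err != nil {
			missing = append(missing, path)
		}
	}
	return missing
}

func main() {
	// Hypothetical representative manuals. The real test may check
	// different files.
	manuals := []string{
		"/usr/share/man/man1/bash.1.gz",
		"/usr/share/man/man1/gpg2.1.gz",
	}
	for _, path := range missingFiles(manuals) {
		fmt.Println("missing:", path)
	}
}
```

A check like this makes the failure stand out, even when the rest of the build output is too noisy to read.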
Only the images for currently maintained Fedoras (ie., 36, 37 and 38)
were updated.
[1] Commit 1fc50176c9
https://github.com/containers/toolbox/pull/1226
The RPM packages in the base 'fedora' image can be older than those
currently available in the DNF 'updates' repository [1], but at the same
time newer than those available in the DNF 'fedora' repository [1]. The
first part happens because the base image isn't updated as often as the
individual packages, so the 'updates' repository can have newer RPMs.
The second part happens because the base image does get updated after a
stable Fedora has been released, and hence can have newer RPMs than the
'fedora' repository.
This is complicated by the fact that packages can get pulled directly
from Fedora's Koji build system into the base 'fedora' image before
they make it to one of the well-known repositories like 'fedora' or
'updates' [1]. These packages are marked as having come from the
koji-override-0 repository.
All that combined can lead to unexpected behaviour when DNF is invoked
to reinstall or swap the RPM packages in the base image. Some examples
below.
The base fedora:36 image contains glibc-minimal-langpack-2.35-20.fc36
that came from koji-override-0, while 'fedora' and 'updates' have
glibc-all-langpacks-2.35-4.fc36 and glibc-all-langpacks-2.35-22.fc36
respectively. This leads to:
STEP 8/15: RUN dnf -y swap glibc-minimal-langpack glibc-all-langpacks
Last metadata expiration check: 0:00:03 ago on Wed Feb 1 12:37:04...
Dependencies resolved.
======================================================================
Package Arch Version Repository
======================================================================
Installing:
glibc-all-langpacks x86_64 2.35-4.fc36 fedora
Removing:
glibc-minimal-langpack x86_64 2.35-20.fc36 @koji-override-0
Downgrading:
glibc x86_64 2.35-4.fc36 fedora
glibc-common x86_64 2.35-4.fc36 fedora
That's unexpected. Instead of upgrading all the glibc sub-packages to
the latest version from 'updates', it's downgrading them to the older
version from 'fedora'.
Similarly, the base fedora:36 image has bash-5.2.9-2.fc36.x86_64 from
koji-override-0, and there is bash-5.2.15-1.fc36.x86_64 in 'updates'.
This leads to:
STEP 10/15: RUN dnf -y reinstall $(<missing-docs)
Last metadata expiration check: 0:00:06 ago on Wed Feb 1 12:37:04...
Package acl available, but not installed.
No match for argument: acl
Installed package bash-5.2.9-2.fc36.x86_64 (from koji-override-0) not
available.
That's unexpected. Instead of upgrading bash to the latest version from
'updates', it's simply skipping the 'reinstall', which means that the
documentation that was stripped out in the base image doesn't get
restored.
Updating all the RPM packages in the base 'fedora' image to match the
contents of the 'updates' repository before making any changes to the
image's package set will avoid such unexpected behaviour.
Only the images for currently maintained Fedoras (ie., 36, 37 and 38)
were updated.
[1] https://docs.fedoraproject.org/en-US/quick-docs/repositories/
https://github.com/containers/toolbox/pull/1226
The URLs for the RHEL Toolbx images based on the Red Hat Universal Base
Images (or UBI) are a bit more complicated to construct, in comparison
to the URLs for Fedora's fedora-toolbox images. It's not enough to just
concatenate the registry, the image's basename and the release. Some
parts of the URL depend on the release's major number, which requires
custom code.
So far, the release's major number was hard coded to 8 since only RHEL 8
Toolbx containers were supported.
To support other RHEL major releases, it's necessary to have custom code
to construct the URLs for the Toolbx images.
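A minimal sketch of such custom code follows. It assumes that the image lives at registry.access.redhat.com/ubi<major>/toolbox, which matches the RHEL 8 image name, but extending that layout to other major releases is an assumption, and 'rhelImageURL' and 'isNumber' are hypothetical names:

```go
package main

import (
	"fmt"
	"strings"
)

// isNumber reports whether s is a non-empty string of ASCII digits.
func isNumber(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// rhelImageURL builds the image URL for a RHEL release of the form
// <major>.<minor>. The major number ends up in the repository part of
// the URL, which is why plain concatenation isn't enough.
func rhelImageURL(release string) (string, error) {
	major, minor, found := strings.Cut(release, ".")
	if !found || !isNumber(major) || !isNumber(minor) {
		return "", fmt.Errorf("release %q is not in the <major>.<minor> format", release)
	}
	// Assumed layout: registry.access.redhat.com/ubi<major>/toolbox:<release>
	return fmt.Sprintf("registry.access.redhat.com/ubi%s/toolbox:%s", major, release), nil
}

func main() {
	for _, release := range []string{"8.7", "9.1"} {
		url, err := rhelImageURL(release)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(url)
	}
}
```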
https://github.com/containers/toolbox/issues/1065
On enterprise FreeIPA set-ups, the subordinate user and group IDs are
provided by SSSD's sss plugin for the GNU Name Service Switch (or NSS)
functionality of the GNU C Library. They are not listed in /etc/subuid
and /etc/subgid. Therefore, it's necessary to use libsubid.so to check
the subordinate ID ranges.
The CGO interaction with libsubid.so is loosely based on 'readSubid' in
github.com/containers/storage/pkg/idtools [1].
However, unlike 'readSubid', this code considers the absence of any
range (ie., nRanges == 0) to be an error as well.
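The stricter handling of the range count can be sketched in plain Go as below. It leaves out the CGO and dlopen(3) parts entirely. The 'checkRangeCount' helper and the error value are hypothetical, and treating a negative count as a failed look-up is an assumption about what the library reports:

```go
package main

import (
	"errors"
	"fmt"
)

// errNoRanges is a hypothetical error for an owner without any
// subordinate ID range.
var errNoRanges = errors.New("no subordinate ID ranges found")

// checkRangeCount interprets the number of ranges reported for an owner.
// Unlike a check that only rejects failures, it also treats zero ranges
// as an error.
func checkRangeCount(owner string, nRanges int) error {
	switch {
	case nRanges < 0:
		// Assumption: a negative count means that the look-up failed.
		return fmt.Errorf("failed to look up subordinate ID ranges for %s", owner)
	case nRanges == 0:
		return fmt.Errorf("%s: %w", owner, errNoRanges)
	}
	return nil
}

func main() {
	for _, nRanges := range []int{1, 0} {
		if err := checkRangeCount("user", nRanges); err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("ok:", nRanges, "range(s)")
	}
}
```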
More importantly, this code uses dlopen(3) and friends to dynamically
load the symbols from libsubid.so, instead of linking to libsubid.so at
build-time and having the dependency noted in the /usr/bin/toolbox
binary. This is done because libsubid.so itself depends on several
other shared libraries, and indirect dependencies can't be influenced
by the RUNPATH [2] embedded in the /usr/bin/toolbox binary [3]. Hence,
when the binary is used inside Toolbx containers (eg., as the entry
point), those indirect dependencies won't be picked from the host's
runtime against which the binary was built. This can render the binary
useless due to ABI compatibility issues. Using dlopen(3) avoids this
problem, especially because libsubid.so is only used when running on the
host.
Care was taken to not load and link libsubid.so twice to separately
validate the subordinate ID ranges for the user and the group. Note
that libsubid_init() must be passed a FILE pointer for logging.
Otherwise, it will create its own for logging, and there's no way to
close it during dlclose(3).
Version 4 of the libsubid.so API/ABI [4] was released in Shadow 4.10,
which is newer than the versions shipped on RHEL 8 and Debian 10 [5],
and even that newer version had some problems [6]. Therefore, support
for older versions, with the relevant workarounds, is necessary.
Fortunately, the oldest that needs to be supported is Shadow 4.9 because
that's when libsubid.so was introduced [7].
Note that SUBID_ABI_VERSION was only introduced with version 4 of the
libsubid.so API/ABI released in Shadow 4.10 [8]. The first release of
libsubid.so in Shadow 4.9 already had an ABI version of 3.0.0 [9], since
it was bumped a few times during development, so that's what's assumed
when SUBID_ABI_VERSION is absent.
This code doesn't set the public variables Prog and shadow_logfd that
older Shadow versions used to expect for logging, because from Shadow
4.9 onwards there's a separate function [4,10] to specify these. This
can be changed if there are libsubid.so versions in the wild that really
do need those public variables to be set.
Finally, ISO C99 is required because of the use of <stdbool.h> in the
libsubid.so API.
Some changes by Debarshi Ray.
[1] https://github.com/containers/storage/blob/main/pkg/idtools/idtools_supported.go
[2] https://man7.org/linux/man-pages/man8/ld.so.8.html
[3] Commit 6063eb27b9
https://github.com/containers/toolbox/issues/821
[4] Shadow commit 32f641b207f6ddff
https://github.com/shadow-maint/shadow/commit/32f641b207f6ddff
https://github.com/shadow-maint/shadow/issues/443
[5] https://packages.debian.org/source/buster/shadow
[6] Shadow commit 79157cbad87f42cd
https://github.com/shadow-maint/shadow/commit/79157cbad87f42cd
https://github.com/shadow-maint/shadow/issues/465
[7] Shadow commit 0a7888b1fad613a0
https://github.com/shadow-maint/shadow/commit/0a7888b1fad613a0
https://github.com/shadow-maint/shadow/issues/154
[8] Shadow commit 0c9f64140852e8d5
https://github.com/shadow-maint/shadow/commit/0c9f64140852e8d5
https://github.com/shadow-maint/shadow/pull/449
[9] Shadow commit 3d670ba7ed58f910
https://github.com/shadow-maint/shadow/commit/3d670ba7ed58f910
https://github.com/shadow-maint/shadow/issues/339
[10] Shadow commit 2b22a6909dba60d
https://github.com/shadow-maint/shadow/commit/2b22a6909dba60d
https://github.com/shadow-maint/shadow/issues/325
https://github.com/containers/toolbox/issues/1074
Signed-off-by: Martin Jackson <martjack@redhat.com>
Building Toolbx requires a C compiler [1], which defaults to GCC on
Fedora and CentOS Stream. It's good to explicitly require it, so that
it doesn't go missing from the build.
Showing the version of the C compiler is a big help when debugging weird
build problems involving the toolchain. A following commit will use CGO
to link to libsubid.so, which will only increase the relevance of the C
compiler.
[1] Commit c8aaed52c5
https://github.com/containers/toolbox/pull/923
https://github.com/containers/toolbox/pull/1218
Ever since commit bafbbe81c9, the shell completions are generated
using the Toolbx binary, and the 'completion' sub-directory no longer
has any source code, but only the build scripts to invoke the Toolbx
binary to generate them. This is a good opportunity to simplify the
layout of this Git repository by reducing the number of sub-directories.
The file containing the Bash completions had to be renamed to avoid
colliding with the name of the Toolbx binary, since they are both
generated in the same sub-directory.
https://github.com/containers/toolbox/pull/1216
The Meson adapter scripts are simple enough that they don't need
detailed descriptions for their command line arguments. The benefits
don't justify the cost of formulating succinct descriptions.
https://github.com/containers/toolbox/pull/1216
The errors should be propagated up the call chain either verbatim or by
wrapping them with all relevant context when necessary (as long as they
don't violate the API boundaries).
The errors should be logged only when there's a break in the upward
propagation, either because they need to be reformatted before being
shown to the user or because they would expose implementation details
that aren't part of the API contract. Not logging the errors in such
cases might make it difficult to debug problems later on.
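The two rules can be sketched as follows. The function names, the sentinel error and the file path are hypothetical, and this is only one way of following the guideline:

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
)

// errConfigUnavailable is a hypothetical error that's part of the API
// contract of loadSettings.
var errConfigUnavailable = errors.New("configuration is unavailable")

// readConfig propagates the underlying error up the call chain, wrapped
// with the relevant context. It doesn't log anything.
func readConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to read %s: %w", path, err)
	}
	return data, nil
}

// loadSettings breaks the upward propagation, because the underlying
// error would expose implementation details. The original error is
// logged before it's replaced, so that it can still be debugged.
func loadSettings(path string) ([]byte, error) {
	data, err := readConfig(path)
	if err != nil {
		log.Printf("debug: %v", err)
		return nil, errConfigUnavailable
	}
	return data, nil
}

func main() {
	if _, err := loadSettings("/nonexistent/hypothetical.conf"); err != nil {
		fmt.Println("Error:", err)
	}
}
```

Wrapping with %w keeps the original error reachable through errors.Is and errors.As for callers inside the API boundary.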
https://github.com/containers/toolbox/pull/1202
Currently, the titles of the manuals are rendered with a pair of empty
parentheses and no section title:
toolbox(1)() toolbox(1)()
NAME
toolbox - Tool for containerized command line environments...
However, they should be:
toolbox(1) General Commands Manual toolbox(1)
NAME
toolbox - Tool for containerized command line environments...
This is because the troff generated by go-md2man from Markdown has a
faulty invocation of the .TH macro [1]:
.nh
.TH toolbox(1)
.SH NAME
.PP
toolbox - Tool for containerized command line environments on Linux
It should be:
.nh
.TH toolbox 1
.SH NAME
.PP
toolbox - Tool for containerized command line environments on Linux
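The difference between the two invocations can be expressed as a small text transformation. The sketch below post-processes the generated troff in Go, purely as an illustration of the faulty and the corrected forms. It is not how the problem was fixed, and 'fixTitleHeading' is a hypothetical name:

```go
package main

import (
	"fmt"
	"regexp"
)

// thMacro matches a .TH line where the title and the section were fused
// into a single argument, as in ".TH toolbox(1)".
var thMacro = regexp.MustCompile(`(?m)^\.TH ([^\s()]+)\((\d[0-9a-z]*)\)[ \t]*$`)

// fixTitleHeading rewrites ".TH name(section)" to ".TH name section",
// and leaves every other line untouched.
func fixTitleHeading(troff string) string {
	return thMacro.ReplaceAllString(troff, ".TH $1 $2")
}

func main() {
	faulty := ".nh\n.TH toolbox(1)\n.SH NAME\n"
	fmt.Print(fixTitleHeading(faulty))
}
```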
Original patch from Andrew Denton for Podman [2].
[1] https://www.gnu.org/software/groff/manual/groff.html
[2] Podman commit 63c779a857b55b00
https://github.com/containers/podman/pull/15621
https://github.com/containers/toolbox/pull/1210
Otherwise https://www.shellcheck.net/ would complain:
Line 2479:
shift
^---^ SC2317 (info): Command appears to be unreachable. Check usage
(or ignore if invoked indirectly).
See: https://www.shellcheck.net/wiki/SC2317
Fedora Rawhide now has ShellCheck-0.9.0, which flags these new problems,
while so far it only had ShellCheck-0.8.0.
ShellCheck is correct that this is unreachable code. However, given the
lack of built-in command line parsing facilities in POSIX shell, this
code pattern has so far turned out to be quite handy. It's flexible
enough to be able to handle different combinations of commands and
options, and is easy to read. Trying to 'fix' the code will likely
cause more problems than it will solve.
Moreover, the POSIX shell implementation was replaced by the Go
implementation quite a long time ago. It's no longer maintained and has
been kept only for historical reasons. Therefore, it's not worth
spending any significant amount of time on it.
https://github.com/containers/toolbox/pull/1211
The name of a node in a nodeset is meant to be a human-readable name. A
name with an obscure prefix like 'ci-node-' makes it look more profound
than it really is.
https://github.com/containers/toolbox/pull/1206
The 'unit tests' are no longer just unit tests. They also run a bunch
of static analysis tools like ShellCheck, codespell, gofmt and 'go vet'.
Since newer versions of these tools are generally better at catching
problems in the codebase, it will be better to run the 'unit tests' on
Fedora Rawhide with the latest versions than on older stable Fedoras.
The timeout for the 'unit tests' needs to be increased because Fedora
Rawhide is slower than stable Fedoras. Currently, the timeout for the
'unit tests' running on Fedora 36 is 10 minutes. Increasing it to 20
minutes when running on Fedora Rawhide wasn't enough, so maybe 30 will
be sufficient.
Note that this is only feasible because the Fedora Rawhide builds are
now more robust against stale DNF caches [1]. Otherwise, it wouldn't
have been wise to use Fedora Rawhide to test anything which isn't also
being tested elsewhere, because the Fedora Rawhide builds might have
stayed broken for extended periods of time due to reasons completely
unrelated to Toolbx.
[1] Commit 995c6d175e
https://github.com/containers/toolbox/pull/1201
https://github.com/containers/toolbox/pull/1206
This will be used by the subsequent commit to have a separate set of
dependencies for CentOS Stream 9 builds. For example, unlike Fedora, CentOS
Stream 9 doesn't have the ShellCheck, bats and fish RPMs.
https://github.com/containers/toolbox/pull/1171
Currently, the standard error and output streams of the child commands
invoked by 'meson test' are redirected to a separate log file. When the
tests fail, it's difficult, or maybe even impossible, to access this
file from the Zuul CI, and all that can be seen is something like:
1/7 shellcheck src/go-build-wrapper OK 0.04s
2/7 shellcheck profile.d/toolbox.sh FAIL 0.06s exit status 1
>>> MALLOC_PERTURB_=241 /usr/bin/shellcheck
--shell=sh
/home/zuul-worker/src/github.com/containers/toolbox/builddir/../profile.d/toolbox.sh
3/7 go fmt FAIL 0.05s exit status 1
>>> MALLOC_PERTURB_=209 /usr/bin/python3
/home/zuul-worker/src/github.com/containers/toolbox/src/meson_go_fmt.py
/home/zuul-worker/src/github.com/containers/toolbox/src
4/7 codespell FAIL 0.31s exit status 65
>>> MALLOC_PERTURB_=180 /usr/bin/codespell
--check-filenames
--check-hidden
--context 3
--exclude-file /home/zuul-worker/src/github.com/containers/toolbox/.codespellexcludefile
--skip /home/zuul-worker/src/github.com/containers/toolbox/builddir
--skip /home/zuul-worker/src/github.com/containers/toolbox/.git
--skip /home/zuul-worker/src/github.com/containers/toolbox/test/system/libs/bats-assert
--skip /home/zuul-worker/src/github.com/containers/toolbox/test/system/libs/bats-support
/home/zuul-worker/src/github.com/containers/toolbox
5/7 shellcheck toolbox (deprecated) FAIL 1.09s exit status 1
>>> MALLOC_PERTURB_=233 /usr/bin/shellcheck
/home/zuul-worker/src/github.com/containers/toolbox/builddir/../toolbox
6/7 go test OK 1.89s
7/7 go vet OK 17.60s
This doesn't have enough information to understand what caused the tests
to fail on non-interactive CI environments.
Not redirecting the standard error and output streams of the child
commands invoked by 'meson test' will readily reveal more details about
the test failures and remove the need to find the log file created by
Meson.
https://github.com/containers/toolbox/pull/1171
Otherwise codespell would complain:
: @test "create: Try to create a container with invalid custom name...
> run $TOOLBOX -y create "ßpeci@l.Nam€"
:
./test/system/101-create.bats:57: Nam ==> Name
CentOS Stream 9 has codespell-2.2.1, while so far the 'unit tests' were
being run on Fedora 36, which only has codespell-2.1.0.
This is a step towards testing on CentOS Stream 9.
https://github.com/containers/toolbox/pull/1200
CentOS Stream 9 has codespell-2.2.1, while so far the 'unit tests' were
being run on Fedora 36, which only has codespell-2.1.0.
This is a step towards testing on CentOS Stream 9.
Fallout from ecd1ced719
https://github.com/containers/toolbox/pull/1200
Otherwise codespell would complain:
: {"/tmp", "/run/host/tmp", "rslave"},
> {"/var/lib/flatpak", "/run/host/var/lib/flatpak", "ro"},
: {"/var/lib/libvirt", "/run/host/var/lib/libvirt", ""},
./src/cmd/initContainer.go:61: ro ==> to, row, rob, rod, roe, rot
CentOS Stream 9 has codespell-2.2.1, while so far the 'unit tests' were
being run on Fedora 36, which only has codespell-2.1.0.
This is a step towards testing on CentOS Stream 9.
https://github.com/containers/toolbox/pull/1200
Otherwise https://www.shellcheck.net/ would complain:
Line 86:
term_just_first_character="${TERM%$term_without_first_character}"
^-- SC2295 (info): Expansions inside
${..} need to be quoted
separately, otherwise they match
as patterns.
See: https://www.shellcheck.net/wiki/SC2295
CentOS Stream 9 has ShellCheck-0.8.0, while so far the 'unit tests' were
being run on Fedora 36, which only has ShellCheck-0.7.2.
This is a step towards testing on CentOS Stream 9.
https://github.com/containers/toolbox/pull/1200
CentOS Stream 9 has golang-1.19.2, while so far the 'unit tests' were
being run on Fedora 36, which only has golang-1.18.8.
This is a step towards testing on CentOS Stream 9.
https://github.com/containers/toolbox/pull/1199
CentOS Stream 9 has codespell-2.2.1, while so far the 'unit tests' were
being run on Fedora 36, which only has codespell-2.1.0.
This is a step towards testing on CentOS Stream 9.
Fallout from 708fa593e2
https://github.com/containers/toolbox/pull/1199
Different versions of ShellCheck and codespell may treat the same code
base differently. For example, these tools are currently being used on Fedora
36 as part of the 'unit tests', but CentOS Stream 9 has newer versions
that are stricter and catch several new problems.
Knowing the versions of the tools used in the tests helps to understand
these differences, and is a step towards testing on CentOS Stream 9.
https://github.com/containers/toolbox/pull/1199
Note that 'run --keep-empty-lines' counts the trailing newline on the
last line as a separate line.
Until Bats 1.7.0, 'run --keep-empty-lines' had a bug where even when a
command produced no output, it would report a line count of one [1] due
to a stray line feed character. This needs to be conditionalized, since
Fedora 35 has Bats 1.5.0.
[1] https://github.com/bats-core/bats-core/issues/573
https://github.com/containers/toolbox/issues/1043
Currently, if an image was copied with:
$ skopeo copy \
containers-storage:registry.fedoraproject.org/fedora-toolbox:36 \
containers-storage:localhost/fedora-toolbox:36
... or:
$ podman tag \
registry.fedoraproject.org/fedora-toolbox:36 \
localhost/fedora-toolbox:36
... then it would show up twice in 'list' with the same name, and in the
wrong order.
Either as:
$ toolbox list --images
IMAGE ID IMAGE NAME CREATED
2110dbbc33d2 localhost/fedora-toolbox:36 1 day...
e085805ade4a registry.access.redhat.com/ubi8/toolbox:latest 1 day...
2110dbbc33d2 localhost/fedora-toolbox:36 1 day...
70cbe2ce60ca registry.fedoraproject.org/fedora-toolbox:34 1 day...
... or as:
$ toolbox list --images
IMAGE ID IMAGE NAME CREATED
2110dbbc33d2 registry.fedoraproject.org/fedora-toolbox:36 1 day...
e085805ade4a registry.access.redhat.com/ubi8/toolbox:latest 1 day...
2110dbbc33d2 registry.fedoraproject.org/fedora-toolbox:36 1 day...
70cbe2ce60ca registry.fedoraproject.org/fedora-toolbox:34 1 day...
The correct output should be similar to 'podman images', and be sorted
in ascending order of the names:
$ toolbox list --images
IMAGE ID IMAGE NAME CREATED
2110dbbc33d2 localhost/fedora-toolbox:36 1 day...
e085805ade4a registry.access.redhat.com/ubi8/toolbox:latest 1 day...
70cbe2ce60ca registry.fedoraproject.org/fedora-toolbox:34 1 day...
2110dbbc33d2 registry.fedoraproject.org/fedora-toolbox:36 1 day...
The problem is that, in these situations, 'podman images --format json'
returns separate identical JSON collections for each copy of the image,
and all of those copies have multiple names:
[
{
"Id": "2110dbbc33d2",
...
"Names": [
"localhost/fedora-toolbox:36",
"registry.fedoraproject.org/fedora-toolbox:36"
],
...
},
{
"Id": "e085805ade4a",
...
"Names": [
"registry.access.redhat.com/ubi8/toolbox:latest"
],
...
},
{
"Id": "2110dbbc33d2",
...
"Names": [
"localhost/fedora-toolbox:36",
"registry.fedoraproject.org/fedora-toolbox:36"
],
...
},
{
"Id": "70cbe2ce60ca",
...
"Names": [
"registry.fedoraproject.org/fedora-toolbox:34"
],
...
}
]
The image objects need to be flattened to have only one unique name per
copy, but with the same ID, and then sorted to ensure the right order.
Note that the ordering was already broken since commit 2369da5d31,
which started using 'podman images --sort repository'. Podman can sort
by either the image's repository or tag, but not by the unified name,
which is what Toolbx needs. Therefore, even without copied images,
Toolbx really does need to sort the images itself.
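A minimal sketch of the flattening and sorting follows. The 'image' type only carries the two JSON fields shown above, and 'flattenAndSort' is a hypothetical name, so this isn't necessarily how Toolbx implements it:

```go
package main

import (
	"fmt"
	"sort"
)

// image mirrors only the Id and Names fields from the JSON above.
type image struct {
	ID    string
	Names []string
}

// flattenAndSort returns one entry per unique combination of ID and
// name, sorted in ascending order of the names. Images without any name
// are dropped by this sketch.
func flattenAndSort(images []image) []image {
	type key struct{ id, name string }
	seen := make(map[key]bool)
	var flattened []image
	for _, img := range images {
		for _, name := range img.Names {
			k := key{img.ID, name}
			if seen[k] {
				// The same copy was already reported by Podman.
				continue
			}
			seen[k] = true
			flattened = append(flattened, image{ID: img.ID, Names: []string{name}})
		}
	}
	sort.SliceStable(flattened, func(i, j int) bool {
		return flattened[i].Names[0] < flattened[j].Names[0]
	})
	return flattened
}

func main() {
	images := []image{
		{ID: "2110dbbc33d2", Names: []string{"localhost/fedora-toolbox:36", "registry.fedoraproject.org/fedora-toolbox:36"}},
		{ID: "e085805ade4a", Names: []string{"registry.access.redhat.com/ubi8/toolbox:latest"}},
		{ID: "2110dbbc33d2", Names: []string{"localhost/fedora-toolbox:36", "registry.fedoraproject.org/fedora-toolbox:36"}},
		{ID: "70cbe2ce60ca", Names: []string{"registry.fedoraproject.org/fedora-toolbox:34"}},
	}
	for _, img := range flattenAndSort(images) {
		fmt.Println(img.ID, img.Names[0])
	}
}
```

With the JSON above as input, each copy shows up once under its own name, in the same order as the expected 'list' output.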
Prior to commit 2369da5d31, the ordering was correct, but copied
images would only show up once.
Fallout from 2369da5d31
This reverts parts of commit 67e210378e.
https://github.com/containers/toolbox/issues/1043