Originally suggested solution for "Return Of the Hidden Number Problem"
is arguably too expensive. While it has marginal impact on slower
curves, none to ~6%, optimized implementations suffer real penalties.
Most notably sign with P-256 went more than 2 times[!] slower. Instead,
just implement constant-time BN_mod_add_quick.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/6664)
It was false positive, but one can as well view it as readability issue.
Switch even to unsigned indices because % BN_BYTES takes 4-6 instructions
with signed dividend vs. 1 (one) with unsigned.
Reviewed-by: Rich Salz <rsalz@openssl.org>
By default `ec_scalar_mul_ladder` (which uses the Lopez-Dahab ladder
implementation) is used only for (k * Generator) or (k * VariablePoint).
ECDSA verification uses (a * Generator + b * VariablePoint): this commit
forces the use of `ec_scalar_mul_ladder` also for the ECDSA verification
path, while using the default wNAF implementation for any other case.
With this commit `ec_scalar_mul_ladder` loses the static attribute, and
is added to ec_lcl.h so EC_METHODs can directly use it.
While working on a new custom EC_POINTs_mul implementation, I realized
that many checks (e.g. all the points being compatible with the given
EC_GROUP, creating a temporary BN_CTX if `ctx == NULL`, check for the
corner case `scalar == NULL && num == 0`) were duplicated again and
again in every single implementation (and actually some
implementations lacked some of the tests).
I thought that it makes way more sense for those checks that are
independent from the actual implementation and should always be done, to
be moved in the EC_POINTs_mul wrapper: so this commit also includes
these changes.
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6690)
This commit uses the new ladder scaffold to implement a specialized
ladder step based on differential addition-and-doubling in mixed
Lopez-Dahab projective coordinates, modified to independently blind the
operands.
The arithmetic in `ladder_pre`, `ladder_step` and `ladder_post` is
auto generated with tooling:
- see, e.g., "Guide to ECC" Alg 3.40 for reference about the
`ladder_pre` implementation;
- see https://www.hyperelliptic.org/EFD/g12o/auto-code/shortw/xz/ladder/mladd-2003-s.op3
for the differential addition-and-doubling formulas implemented in
`ladder_step`;
- see, e.g., "Fast Multiplication on Elliptic Curves over GF(2**m)
without Precomputation" (Lopez and Dahab, CHES 1999) Appendix Alg Mxy
for the `ladder_post` implementation to recover the `(x,y)` result in
affine coordinates.
Co-authored-by: Billy Brumley <bbrumley@gmail.com>
Co-authored-by: Sohaib ul Hassan <soh.19.hassan@gmail.com>
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6690)
for specialized Montgomery ladder implementations
PR #6009 and #6070 replaced the default EC point multiplication path for
prime and binary curves with a unified Montgomery ladder implementation
with various timing attack defenses (for the common paths when a secret
scalar is feed to the point multiplication).
The newly introduced default implementation directly used
EC_POINT_add/dbl in the main loop.
The scaffolding introduced by this commit allows EC_METHODs to define a
specialized `ladder_step` function to improve performances by taking
advantage of efficient formulas for differential addition-and-doubling
and different coordinate systems.
- `ladder_pre` is executed before the main loop of the ladder: by
default it copies the input point P into S, and doubles it into R.
Specialized implementations could, e.g., use this hook to transition
to different coordinate systems before copying and doubling;
- `ladder_step` is the core of the Montgomery ladder loop: by default it
computes `S := R+S; R := 2R;`, but specific implementations could,
e.g., implement a more efficient formula for differential
addition-and-doubling;
- `ladder_post` is executed after the Montgomery ladder loop: by default
it's a noop, but specialized implementations could, e.g., use this
hook to transition back from the coordinate system used for optimizing
the differential addition-and-doubling or recover the y coordinate of
the result point.
This commit also renames `ec_mul_consttime` to `ec_scalar_mul_ladder`,
as it better corresponds to what this function does: nothing can be
truly said about the constant-timeness of the overall execution of this
function, given that the underlying operations are not necessarily
constant-time themselves.
What this implementation ensures is that the same fixed sequence of
operations is executed for each scalar multiplication (for a given
EC_GROUP), with no dependency on the value of the input scalar.
Co-authored-by: Sohaib ul Hassan <soh.19.hassan@gmail.com>
Co-authored-by: Billy Brumley <bbrumley@gmail.com>
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6690)
Run `make update ERROR_REBUILD=-rebuild` to remove some stale error
codes for SM2 (which is now using its own submodule for error codes,
i.e., `SM2_*`).
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6690)
Move base 2^64 code to own #if section. It was nested in base 2^51 section,
which arguably might have been tricky to follow.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6699)
Base 2^64 addition/subtraction and final reduction failed to treat
partially reduced values correctly.
Thanks to Wycheproof Project for vectors and Paul Kehrer for report.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6699)
"Computationally constant-time" means that it might still leak
information about input's length, but only in cases when input
is missing complete BN_ULONG limbs. But even then leak is possible
only if attacker can observe memory access pattern with limb
granularity.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/5254)
Note that exported functions maintain original behaviour, so that
external callers won't observe difference. While internally we can
now perform Montogomery multiplication on fixed-length vectors, fixed
at modulus size. The new functions, bn_to_mont_fixed_top and
bn_mul_mont_fixed_top, are declared in bn_int.h, because one can use
them even outside bn, e.g. in RSA, DSA, ECDSA...
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/6662)
The new flag marks vectors that were not treated with bn_correct_top,
in other words such vectors are permitted to be zero padded. For now
it's BN_DEBUG-only flag, as initial use case for zero-padded vectors
would be controlled Montgomery multiplication/exponentiation, not
general purpose. For general purpose use another type might be more
appropriate. Advantage of this suggestion is that it's possible to
back-port it...
bn/bn_div.c: fix memory sanitizer problem.
bn/bn_sqr.c: harmonize with BN_mul.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/6662)
Trouble is that addition is postponing expansion till carry is
calculated, and if addition carries, top word can be zero, which
triggers assertion in bn_check_top.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: David Benjamin <davidben@google.com>
(Merged from https://github.com/openssl/openssl/pull/6662)
Fix the NULL check lack in a different way that is more compatible with
non-NULL branch. Refer #6632
Also mark and pop the error stack instead of clearing all errors when something
goes awry in CONF_get_number.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6643)
The sense of the check for build-time support for most hashes was inverted.
CLA: trivial
Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6673)
Also avoids calling EVP_MD_size() and a missing negative result check.
Issue found by Coverity.
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6592)
Check for a negative EVP_MD_size().
Don't dereference group until we've checked if it is NULL.
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6592)
Improvement coefficients vary with TLS fragment length and platform, on
most Intel processors maximum improvement is ~50%, while on Ryzen - 80%.
The "secret" is new dedicated ChaCha20_128 code path and vectorized xor
helpers.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6638)
The 128-byte vectors are extensively used in chacha20_poly1305_tls_cipher
and dedicated code path is ~30-50% faster on most platforms.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6626)
The problematic case falls back to a NULL conf which returns the result
of getenv(2). If this returns NULL, everything was good. If this returns
a string an attempt to convert it to a number is made using the function
pointers from conf.
This fix uses the strtol(3) function instead, we don't have the
configuration settings and this behaves as the default would.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6632)
The issue was discovered on the x86/64 when attempting to include
libcrypto inside another shared library. A relocation of type
R_X86_64_PC32 was generated which causes a linker error.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6595)
Occasionally, e.g. when compiling for elderly glibc, you end up passing
-D_GNU_SOURCE on command line, and doing so triggered warning...
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6616)
Inputs not longer than 64 bytes are processed ~10% faster, longer
lengths not divisble by 64, e.g. 255, up to ~20%. Unfortunately it's
impossible to measure with apps/speed.c, -aead benchmarks TLS-like
call sequence, but not exact. It took specially crafted code path...
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6597)
Currently if you encounter application data while waiting for a
close_notify from the peer, and you have called SSL_shutdown() then
you will get a -1 return (fatal error) and SSL_ERROR_SYSCALL from
SSL_get_error(). This isn't accurate (it should be SSL_ERROR_SSL) and
isn't persistent (you can call SSL_shutdown() again and it might then work).
We change this into a proper fatal error that is persistent.
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Reviewed-by: Kurt Roeckx <kurt@roeckx.be>
(Merged from https://github.com/openssl/openssl/pull/6340)
This allows operation inside a chroot environment without having the
random device present.
A new call, RAND_keep_random_devices_open(), has been introduced that can
be used to control file descriptor use by the random seed sources. Some
seed sources maintain open file descriptors by default, which allows
such sources to operate in a chroot(2) jail without the associated device
nodes being available.
Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
(Merged from https://github.com/openssl/openssl/pull/6432)
Implement support for stateful TLSv1.3 tickets, and use them if
SSL_OP_NO_TICKET is set.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Viktor Dukhovni <viktor@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6563)
This happens on systems that perform is* character classifictions as
array lookup, e.g. NetBSD.
Reviewed-by: Rich Salz <rsalz@openssl.org>
Reviewed-by: Paul Dale <paul.dale@oracle.com>
(Merged from https://github.com/openssl/openssl/pull/6584)
Unlike other ELF systems, HP-UX run-time linker fails to detect symbol
availability through weak declaration.
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6583)
Internal submodules of libcrypto may require non-public functions from
the EC submodule.
In preparation to use `ec_group_do_inverse_ord()` (from #6116) inside
the SM2 submodule to apply a SCA mitigation on the modular inversion,
this commit moves the `ec_group_do_inverse_ord()` prototype declaration
from the EC-local `crypto/ec/ec_lcl.h` header to the
`crypto/include/internal/ec_int.h` inter-module private header.
Reviewed-by: Andy Polyakov <appro@openssl.org>
Reviewed-by: Matt Caswell <matt@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6521)
BN_CTX_end() does not handle NULL input, so we must manually check
before calling from the cleanup handler.
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from https://github.com/openssl/openssl/pull/6502)