Improve KDF handling #526

Narrat · 2024-07-24T15:37:55Z

The current handling of the KDF functionality is somewhat suboptimal in my opinion.

two variables: KDF for pbkdf & ARGON2 für argon2
check and setting both variables

Checks on KDF:
https://github.com/dyne/tomb/blob/master/tomb#L843 <-- enables --kdf in output of --help
https://github.com/dyne/tomb/blob/master/tomb#L1149 <-- where it failed with --kdftype=
https://github.com/dyne/tomb/blob/master/tomb#L1592 <-- only $KDF is checked for enabling KDF

Checks on argon2:
argon2 implementation covered behind $KDF. Nothing is done with $ARGON2 itself.

Changing ARGON2 to KDF is not practical as an not installed argon2 will unset $KDF.

Additional thoughts:
The input for --kdf should be optional. For argon2 a safe default will be assumed. That can also be done for pbkdf2.
This can be done via removing checking --kdf from the argon2 case (this option is already checked before entering switch/case).
And setting the default value can be moved from the argon2 case. With 3 as default pbkdf2 results with 3000000, which AFAIR is deemed save.
Although it can of course in every case a separate default value be set.

Maybe a general case should be added to the switch/case. Makes it more recognizable if an unsupported kdf option was supplied instead of rather vaguely failing at line 1149.

And maybe rename the options?
--kdftype -> --kdf
NEW --kdfiter to get declare an optional iteration value
Would make possible to only supply --kdf <kdf-implementation> to get a secured key. If one wants to adjust the default that can be additionally done via --kdfiter and --kdfmem.

How to achieve:

Split switch/case and only check on $ARGON2 or $KDF.
Disadvantage: Leads to duplicated code
additional check on $ARGON2
Nicer, but it uses tomb-kdb-pbkdf2-gensalt to generate a salt. Available via the pbkdf2 implementation from tomb.
Can "tomb-kdb-pbkdf2-gensalt" be replaced?
Maybe: https://stackoverflow.com/questions/23837061/a-random-string-generator-in-a-bash-script-isnt-respecting-the-number-of-given/23837515#23837515
as fallback, if the C-program isn't available

Current changes that are untested. Just a prototype of the thoughts I had on the topic:

diff --git a/tomb b/tomb
index 34b4102..a2cb270 100755
--- a/tomb
+++ b/tomb
@@ -840,8 +840,12 @@ usage() {
 		_print " --sphx-host  host associated with the key (for use with pitchforkedsphinx)"
 	}
 
-	[[ $KDF == 1 ]] && {
-		_print " --kdf        forge keys armored against dictionary attacks"
+	[[ $KDF == 1 ]] || [[ $ARGON2 == 1 ]] && {
+		_print " --kdf        forge keys armored against dictionary attacks" # needs the note, that this also accepts an argument for iteration
+		_print " --kdftype    what KDF function to use (pbkdf2, argon2)"
+	}
+	[[ $ARGON2 == 1 ]] && {
+		_print " --kdfmem     memory to be used for argon2"
 	}
 
 	echo
@@ -1589,11 +1593,18 @@ gen_key() {
 		fi
 
 		header=""
-		[[ $KDF == 1 ]] && {
+		[[ $KDF == 1 ]] || [[ $ARGON2 == 1 ]] && {
 			{ option_is_set --kdf } && {
-				# KDF is a new key strenghtening technique against brute forcing
+				# KDF is a key strengthening technique against brute forcing
 				# see: https://github.com/dyne/Tomb/issues/82
+				# Two algorithm currently supported:
+				# * pbkdf2 (covers against CPU)
+				# * argon2 (covers against CPU, memory and parallelism)
 				itertime="`option_value --kdf`"
+				# Set default (argon2 has a default of 3 iterations; the resulting itertime with this
+				# default is considered safe enough for pbkdf2)
+				itertime=${itertime:-3}
+
 				# removing support of floating points because they can't be type checked well
 				# if [[ "$itertime" != <-> ]]; then
 				# 	unset tombpass
@@ -1602,7 +1613,15 @@ gen_key() {
 				# 	_failure "Depending on the speed of machines using this tomb, use 1 to 10, or more"
 				# 	return 1
 				# fi
-				# # --kdf takes one parameter: iter time (on present machine) in seconds
+				# # --kdf takes one optional parameter: iterations (translated to the respective itertime and iterations)
+
+				# Generating salt (either via tomb-kdb-pbkdf2 or a shell fallback)
+				if $(command -v tomb-kdb-pbkdf2-gensalt 1>/dev/null 2>/dev/null); then
+				    kdfsalt=`tomb-kdb-pbkdf2-gensalt`
+				else
+				    kdfsalt=$(LC_CTYPE=C tr -cd 'a-zA-Z0-9,;.:_#*+~!@$%&()=?{[]}|><-' < /dev/random | head -c 64)
+				fi
+				_message "kdf salt: ::1 kdfsalt::" $kdfsalt
 
 				kdftype="`option_value --kdftype`"
 				kdftype=${kdftype:-pbkdf2}
@@ -1610,26 +1629,19 @@ gen_key() {
 				    pbkdf2)
 					local -i microseconds
 					microseconds=$(( itertime * 1000000 ))
-					_success "Using KDF, iteration time: ::1 microseconds::" $microseconds
-					_message "generating salt"
-					pbkdf2_salt=`tomb-kdb-pbkdf2-gensalt`
+					_success "Using pbkdf2 as KDF, iteration time: ::1 microseconds::" $microseconds
 					_message "calculating iterations"
 					pbkdf2_iter=`tomb-kdb-pbkdf2-getiter $microseconds`
 					_message "encoding the password"
 					# We use a length of 64bytes = 512bits (more than needed!?)
-					tombpass=`tomb-kdb-pbkdf2 $pbkdf2_salt $pbkdf2_iter 64 <<<"${tombpass}"`
-
-					header="_KDF_pbkdf2sha1_${pbkdf2_salt}_${pbkdf2_iter}_64\n"
+					tombpass=`tomb-kdb-pbkdf2 $kdf_salt $pbkdf2_iter 64 <<<"${tombpass}"`
+					header="_KDF_pbkdf2sha1_${kdf_salt}_${pbkdf2_iter}_64\n"
 					;;
 				    argon2)
-					_success "Using KDF Argon2"
+					_success "Using Argon2 as KDF"
 					kdfmem="`option_value --kdfmem`"
 					kdfmem=${kdfmem:-18}
 					_message "memory used: 2^::1 kdfmemory::" $kdfmem
-          itertime="`option_value --kdf`"
-          itertime=${itertime:-3}
-					kdfsalt=`tomb-kdb-pbkdf2-gensalt`
-					_message "kdf salt: ::1 kdfsalt::" $kdfsalt
 					_message "kdf iterations: ::1 kdfiterations::" $itertime
 					tombpass=`argon2 $kdfsalt -m $kdfmem -t $itertime -l 64 -r <<<"${tombpass}"`
 					header="_KDF_argon2_${kdfsalt}_${itertime}_${kdfmem}_64\n"
@@ -2095,7 +2107,7 @@ forge_key() {
 			 $destkey $algo
 
 	[[ $KDF == 1 ]] && { ! option_is_set -g } && {
-		_message "Using KDF to protect the key password (`option_value --kdf` rounds)"
+		_message "Using KDF to protect the key password (`option_value --kdf` rounds)" # something to be done here to see the default
 	}
 
 	TOMBKEYFILE="$destkey"	  # Set global variable

The text was updated successfully, but these errors were encountered:

jaromil · 2024-07-24T17:34:26Z

Definitely a place that needs improvement, thanks for the extensive analysis. Will be watching this PR, according to your intentions we should end up with less code and more readable ✌🏽

Narrat · 2024-07-25T19:41:32Z

Not so sure on the less code part :D I tried at least... But yeah, tomb became a chunky shell script over the time :D

Previously it wasn't possible to use argon2 as KDF function without the tomb tools from extras/kdf-keys being available. To change that behaviour introduce checks on the ARGON2 variable. Additionally add a fallback function to create a salt that is compatible to tomb-kdb-pbkdf2-gensalt. Options specific for the different supported KDF algorithm are reorganized. Some options align between the various KDF and some are unique to them. The output of -h is enhanced with the various --kdf options and depends on the available optional tools. argon2 specific cli arguments won't be displayed if argon2 is not available. Add case for results beside argon2 and pbkdf2. Key creation won't be stopped, just a warning is issued that the resulting key won't be protected via KDF. Regarding the cli options. The argument for the suboption --kdf is made optional. In that regard one needs to make sure, that if --kdf is the last option before an argument one needs to use - to separate or use -k. Example: tomb forge --kdf - testkey.tomb Example: tomb forge --kdf -k testkey.tomb Example: tomb forge -k testkey.tomb --kdf Additonally the kdf options are reorganized, which is a possible breaking change for scripts or GUI helpers. * --kdftype is changed to --kdf * --kdfiter is introduced as replacement the for previous --kdf definition * --kdfpar is introduced to support the parallelism option of argon2 Only --kdf is mandatory to get a key which is protected with KDF. For every other option safe defaults are set and can be optionally adjusted. KDF related subcommand options are removed where they don't come into play. gen_key() is only called in forge and passwd. Closes dyne#526

Previously it wasn't possible to use argon2 as KDF function without the tomb tools from extras/kdf-keys being available. To change that behaviour introduce checks on the ARGON2 variable. Additionally add a fallback function to create a salt that is compatible to tomb-kdb-pbkdf2-gensalt. Options specific for the different supported KDF algorithm are reorganized. Some options align between the various KDF and some are unique to them. The output of -h is enhanced with the various --kdf options and depends on the available optional tools. argon2 specific cli arguments won't be displayed if argon2 is not available. Add case for results beside argon2 and pbkdf2. Key creation won't be stopped, just a warning is issued that the resulting key won't be protected via KDF. Regarding the cli options. The argument for the suboption --kdf is made optional. In that regard one needs to make sure, that if --kdf is the last option before an argument one needs to use - to separate or use -k. Example: tomb forge --kdf - testkey.tomb Example: tomb forge --kdf -k testkey.tomb Example: tomb forge -k testkey.tomb --kdf Additonally the kdf options are reorganized, which is a possible breaking change for scripts or GUI helpers. * --kdftype is changed to --kdf * --kdfiter is introduced as replacement the for previous --kdf definition * --kdfpar is introduced to support the parallelism option of argon2 (nice to have if someone wants to adjust memory or iteration costs without increasing the time that much) Only --kdf is mandatory to get a key which is protected with KDF. For every other option safe defaults are set and can be optionally adjusted. KDF related subcommand options are removed where they don't come into play. gen_key() is only called in forge and passwd. Closes dyne#526

jaromil · 2024-09-08T02:44:56Z

I will look into your PR and try to merge above conflicts as I did merge some duplicate code with mine, apologies for the mess.

to this analysis let me also add the possibility to run a timed benchmark before creating the key, so that a default above the minimum can be set according to the speed of machine. also the iteration setting then can refer to seconds rather than absolute value.

Narrat · 2024-09-08T08:46:55Z

No worries. You could also say what should be dropped or changed and I adjust it :D No problem to adjust the test changes onto the test you added instead of the initial add of my own. If that was what you meant.

to this analysis let me also add the possibility to run a timed benchmark before creating the key, so that a default above the minimum can be set according to the speed of machine. also the iteration setting then can refer to seconds rather than absolute value.

Not sure if I follow completly, but imo it shouldn't be necessary to complicate it that much. Especially since the iteration in argon2 doesn't have the weight as for pbkdf2. 3 vs 800000+ (as suggested by some parties).
Example of adjusting the iteration for argon2:

$  time argon2 holladollahey -m 12 -p 1 -t 3 <<< $(printf "test") 
Type:		Argon2i
Iterations:	3
Memory:		4096 KiB
Parallelism:	1
Hash:		6af9b7dc03d4530ec429b44df9cf7d9cdb46e0b309789199248508a8df064918
Encoded:	$argon2i$v=19$m=4096,t=3,p=1$aG9sbGFkb2xsYWhleQ$avm33APUUw7EKbRN+c99nNtG4LMJeJGZJIUIqN8GSRg
0.010 seconds
Verification ok
argon2 holladollahey -m 12 -p 1 -t 3 <<< $(printf "test")  0,02s user 0,00s system 98% cpu 0,021 total

$  time argon2 holladollahey -m 12 -p 1 -t 30 <<< $(printf "test") 
Type:		Argon2i
Iterations:	30
Memory:		4096 KiB
Parallelism:	1
Hash:		cf9ef6d124c4f31b944a647be7b41eab3ab0603385d88ff7feec826a6854e036
Encoded:	$argon2i$v=19$m=4096,t=30,p=1$aG9sbGFkb2xsYWhleQ$z5720STE8xuUSmR757QeqzqwYDOF2I/3/uyCamhU4DY
0.073 seconds
Verification ok
argon2 holladollahey -m 12 -p 1 -t 30 <<< $(printf "test")  0,15s user 0,00s system 99% cpu 0,148 total

Increasing it tenfold doesn't add that much, whereas adjusting the memory requirement achieves more:

$  time argon2 holladollahey -m 12 -p 1 -t 3 <<< $(printf "test") 
Type:		Argon2i
Iterations:	3
Memory:		4096 KiB
Parallelism:	1
Hash:		6af9b7dc03d4530ec429b44df9cf7d9cdb46e0b309789199248508a8df064918
Encoded:	$argon2i$v=19$m=4096,t=3,p=1$aG9sbGFkb2xsYWhleQ$avm33APUUw7EKbRN+c99nNtG4LMJeJGZJIUIqN8GSRg
0.016 seconds
Verification ok
argon2 holladollahey -m 12 -p 1 -t 3 <<< $(printf "test")  0,03s user 0,00s system 98% cpu 0,032 total

$  time argon2 holladollahey -m 17 -p 1 -t 3 <<< $(printf "test") 
Type:		Argon2i
Iterations:	3
Memory:		131072 KiB
Parallelism:	1
Hash:		a3ded701563d8fa3b5fc68dc905eec06600426a995a3750a03ead1a99160aad0
Encoded:	$argon2i$v=19$m=131072,t=3,p=1$aG9sbGFkb2xsYWhleQ$o97XAVY9j6O1/GjckF7sBmAEJqmVo3UKA+rRqZFgqtA
0.282 seconds
Verification ok
argon2 holladollahey -m 17 -p 1 -t 3 <<< $(printf "test")  0,54s user 0,03s system 99% cpu 0,570 total

The current implementation of my patches implements a default of 3 for --kdfiter because default of argon2. argon2 will use it as an absolute value, but the pbkdf2 implementation is different in that regard since it was introduced. It will take the input as seconds and run it through a command which calculates the number of iterations depending on $HOST. Which was in the millions for the systems I tested on. And doing so is fine for this algo.
But adding the same for argon2 would be contrary to its design. With pbkdf2 the regular user also suffers from the timecost which isn't the case with argon2. Instead it slows down the process for "dedicated" setups to crack a password like GPUs or ASICs where the number of cores is really high, but relatively sparse RAM available for every CPU. A case where pbkdf2 fails spectaculary since it's an algorithm which is really cheap to run and can therefore easily be parallelized. And then the timecost doesn't cost that much anymore. The hashes/second generated is vastly higher for pbkdf2 than for argon2 in that case. Only if $HOST CPU itself is used pbkdf2 will be slower to crack, which is kinda a mood point as most of the home user systems have a GPU available. Taking my relatively old RX 480 into account those are 2000+ Cores compared to 4 CPU Cores.

Which opens up a can of worms for the future :D

Support more? bcrypt, scrypt, catena, Lyra2, yescrypt, Makwa
Which one should be really the default?
- with cryptsetup/LUKS2 the default changed to argon2 (argon2id; Time cost: 13; Memory: 1048576; Threads: 4)
- Tomb uses a custom implementation, which could be seen as problematic
Which argon2 implementation should be supported: The general advice is to use argon2id

Narrat mentioned this issue Jul 24, 2024

Improve KDF handling #527

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve KDF handling #526

Improve KDF handling #526

Narrat commented Jul 24, 2024 •

edited

Loading

jaromil commented Jul 24, 2024

Narrat commented Jul 25, 2024

jaromil commented Sep 8, 2024

Narrat commented Sep 8, 2024 •

edited

Loading

Improve KDF handling #526

Improve KDF handling #526

Comments

Narrat commented Jul 24, 2024 • edited Loading

jaromil commented Jul 24, 2024

Narrat commented Jul 25, 2024

jaromil commented Sep 8, 2024

Narrat commented Sep 8, 2024 • edited Loading

Narrat commented Jul 24, 2024 •

edited

Loading

Narrat commented Sep 8, 2024 •

edited

Loading