-
Notifications
You must be signed in to change notification settings - Fork 669
Store password as hash instead of plaintext #3276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Would you please create a separate pull request with just the fix for #1591? The fix for that is only a few lines ( |
| from .profile_menu import ProfileMenu, select_greeter, select_profile | ||
| from .profiles_handler import profile_handler | ||
|
|
||
| __all__ = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@correctmost I know you just merged this change but I run into a circular dependency when loading profile configuration from a config file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to delete :). The profile handler and profile model code still need more untangling.
|
@Torxed should be ready for a review |
|
Relevant excerpt from
If this is merged, all user passwords created with archinstall will use bcrypt instead of the Arch Linux default of yescrypt. If Arch Linux staff decided on yescrypt it should not be up to archinstall to change this. This also does not fix #1111 if disk encryption passwords will still be in plaintext. Sorry but this does not have my support. |
|
I'm aware of the documentation and that the PR deviates from the default behaviour. The reason default values exist is to provide a single set option for a user without having to let them choose between multiple options that would all be viable choices. Same in this case, Arch had to pick a single option to go with as default, but that doesn't mean that other options are necessarily worse. As stated in the description I couldn't find a way to either use PAM or create Regarding the issue #1111 at hand, it is correct that this PR does not solve the entire problem with plaintext passwords but only addresses the user account ones as also stated in the description. Looking at why archinstall has plaintext passwords stored in a file in the first place the main use case seems to be that users can point the installer to the previously generated configuration files and start the installation without having to manually go through all steps again. A potential option to workaround it completely could be:
|
|
I'm having a look! Overall this is a very welcome change. |
|
So I'm not an expert in C, but perhaps this could be a starting point: import ctypes
import ctypes.util
# Try to locate libxcrypt
lib_path = ctypes.util.find_library("xcrypt") or ctypes.util.find_library("crypt")
if not lib_path:
raise RuntimeError("libxcrypt not found")
libxcrypt = ctypes.CDLL(lib_path)
libxcrypt.crypt.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libxcrypt.crypt.restype = ctypes.c_char_p
password = b"some password"
salt = b"$y$j9T$KbYPzGJXFPvMbHGhKSD8x/"
hashed = libxcrypt.crypt(password, salt)Should generate something along the lines of: Would this help? If so, perhaps something along the lines of: import base64
import ctypes
import ctypes.util
import os
def gen_yescrypt_salt(cost=10):
if not (1 <= cost <= 11):
raise ValueError("Cost must be between 1 and 11.")
param_string = f"j{cost}T"
b64_salt = base64.b64encode(os.urandom(16)).decode("ascii")
# libxcrypt expects a base64 salt without padding and trimmed to 22 chars
b64_salt = b64_salt.replace("+", ".").replace("=", "")[:22]
return f"$y${param_string}${b64_salt}"
def gen_yescrypt_hash(password, salt=None):
if salt is None:
salt = gen_yescrypt_salt()
if type(salt) is str:
salt = salt.encode('UTF-8')
if type(password) is str:
password = password.encode('UTF-8')
lib_path = ctypes.util.find_library("xcrypt") or ctypes.util.find_library("crypt")
libxcrypt = ctypes.CDLL(lib_path)
libxcrypt.crypt.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libxcrypt.crypt.restype = ctypes.c_char_p
return libxcrypt.crypt(password, salt)What I'm mostly unsure about, is how I can verify if the output from |
|
That's some serious black magic :) Given that ur does tge problem of the cleartext disk encryption password remains. Any thoughts on the option I mentioned to pass passwords via the arguments? |
It probably looks more scary than it is hehe. We'll need to update the By doing this approach, we'll import and utilize the same library shadow uses for
This is how it was back in the days, I don't mind reverting to it.
Personal RantI'm very torn here. Because one side of the aisle it makes sense to not write the credentials because it's easy to handle it incorrectly, leaving credentials vulnerable even if we use a hash algorithm. Ultimately I would hope for a passwordless approach using HSM's, but not everyone have access to such devices. Long story short, perhaps asking for a password when loading a configuration is the most sane and secure approach we can support? As long as we don't use
|
|
Or we keep storing all in plaintext passwords and encrypt the credentials file with a master password that needs to be entered on save. And when the credentials file is passed we prompt for it? |
I like that approach! There's a overwhelmingly high chance that people will forget it ^^ The only drawback will be that we'll need to have to have additional handling of the different scenarios. But I like the idea of this :) Basically a vault, and inside hashed credentials :) |
Indeed, or they will potentially use the same password as they gave their user account ;)
I think the friction won't be too much, we will need some form of prompt when a user selects to save the credentials file, and then some handling of decryption when the file gets passed as arg
Will we still need to hash the credentials inside? We can try to hash the user account password with your approach above but the disk encryption one will be in clear text I don't have access to the ISO atm but looking at https://gitlab.archlinux.org/archlinux/archiso/-/blob/4b709bcd5fc5e426ba75f72e1b80e59922d7c175/configs/releng/packages.x86_64 I can't see |
I think we should, in case we want to offer unencrypted JSON files for credentials in the future and while passing to the command-line (remember that the password if using Also, the disk password is fine. But we should perhaps do (a bit of pseudo code): As it would lower the risk of us forgetting it somewhere.
It is, via: base <- shadow <- libxcrypt But just to be sure, for instance in docker containers, we should add it as an explicit dependency. It also allows other packages to explicitly see that we depend on it so any updates to that library will tell them to notify us about any breaking changes :) |
|
Okay sounds good, in that case I'll rework the PR to switch to the yescrypt hash instead of bcrypt. And in a future PR address the encryption of the file and potential arguments. |
|
@Torxed Update the PR with the |
archinstall/lib/models/users.py
Outdated
| @cached_property | ||
| def _libcrypt(self): | ||
| libc = ctypes.CDLL("libcrypt.so") | ||
|
|
||
| @classmethod | ||
| def _parse_backwards_compatible(cls, config_users: dict[str, dict[str, str]], sudo: bool) -> list['User']: | ||
| if len(config_users.keys()) > 0: | ||
| username = list(config_users.keys())[0] | ||
| password = config_users[username]['!password'] | ||
| libc.crypt.argtypes = [ctypes.c_char_p, ctypes.c_char_p] | ||
| libc.crypt.restype = ctypes.c_char_p | ||
|
|
||
| libc.crypt_gensalt.argtypes = [ctypes.c_char_p, ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t] | ||
| libc.crypt_gensalt.restype = ctypes.c_char_p | ||
|
|
||
| return libc | ||
|
|
||
| def _gen_yescrypt_salt(self) -> bytes: | ||
| salt_prefix = b'$y$' | ||
| rounds = 9 | ||
| entropy = os.urandom(64) | ||
|
|
||
| salt = self._libcrypt.crypt_gensalt(salt_prefix, rounds, entropy, 64) | ||
| return salt | ||
|
|
||
| def _gen_yescrypt_hash(self, plaintext: str) -> str: | ||
| """ | ||
| Generation of the yescrypt hash was used from mkpasswd included | ||
| in the whois package which seems to be one of the few tools | ||
| that actually provides the functionality | ||
| https://github.com/rfc1036/whois/blob/next/mkpasswd.c | ||
| """ | ||
| enc_plaintext = plaintext.encode('UTF-8') | ||
| salt = self._gen_yescrypt_salt() | ||
|
|
||
| if password: | ||
| return [User(username, password, sudo)] | ||
| yescrypt_hash = self._libcrypt.crypt(enc_plaintext, salt) | ||
| return yescrypt_hash.decode('UTF-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I have shown from documentation (#3276 (comment)), it is PAM that is responsible for encrypting passwords.
Here is the relevant code:
https://github.com/linux-pam/linux-pam/blob/3da5496be24031a5c806752d9f1892100346d8ff/modules/pam_unix/passverify.c#L494-L514
The rounds variable is handled here:
https://github.com/linux-pam/linux-pam/blob/3da5496be24031a5c806752d9f1892100346d8ff/modules/pam_unix/support.c#L189-L244
So for rounds the YESCRYPT_COST_FACTOR is read in from /etc/login.defs, set to a default of 5 if it is not set, and has enforcement of a sane value.
The code you wrote has rounds set to 9.
This is from just a quick glance over this. If you are going to emulate this, please put in the proper time to understand it so that the implementation is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing that out, I'll update the rounds accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also recheck the rest of this logic against the relevant PAM code. I was only using rounds to demonstrate that the code does not properly emulate the handling done by PAM, it may not be the only discrepancy. I did not check beyond that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be fair, that's on me as well seeing as I laid the foundation for this code.
I believe that if we can set sane defaults that work with the login after an install, then that's good enough for a v1.0 of this.
As the overall benefit from using hashes outweigh 1:1 code logic with PAM or whomever, as long as it works. Separate PR's can be raised for such cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you got #3276 (comment) from an LLM chat bot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, why?
It was mostly copy paste from old garbage I had laying around from 5 years ago plus random googling on this specific hash function parameters.
As I've never worked with yescrypt that was new to me if that's what you're fishing for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it based on the appearance of the first example, the brevity of it and no functions. I took a look at the script you linked and admit I was wrong. Thanks for addressing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend moving this chunk of code to its own module, something like libcrypt.py, and using this python source code as an example of how the code should be structured:
https://github.com/python/cpython/blob/bfc292abc10091a9298027596d107652ee4fe166/Lib/_pyrepl/_minimal_curses.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An implementation of the above recommendation:
libcrypt.py
import ctypes
class error(Exception):
pass
libcrypt = ctypes.CDLL('libcrypt.so')
libcrypt.crypt.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
libcrypt.crypt.restype = ctypes.c_char_p
libcrypt.crypt_gensalt.argtypes = [ctypes.c_char_p, ctypes.c_ulong, ctypes.c_char_p, ctypes.c_int]
libcrypt.crypt_gensalt.restype = ctypes.c_char_p
def crypt(phrase: bytes, setting: bytes) -> bytes:
result = libcrypt.crypt(phrase, setting)
if result is None:
raise error('crypt() returned NULL')
return result
def crypt_gensalt(prefix: bytes, count: int) -> bytes:
setting = libcrypt.crypt_gensalt(prefix, count, None, 0)
if setting is None:
raise error('crypt_gensalt() returned NULL')
return settingSimple use example:
import sys
import libcrypt
def main() -> int:
if len(sys.argv) < 2:
return 2
password = sys.argv[1]
prefix = b'$y$'
rounds = 5
try:
setting = libcrypt.crypt_gensalt(prefix, rounds)
hashed = libcrypt.crypt(password.encode(), setting)
except libcrypt.error as err:
print(err, file=sys.stderr)
return 1
print(hashed.decode())
return 0
if __name__ == '__main__':
sys.exit(main())|
If a user interacting with archinstall through the TUI inputs a password for a root or user account, will the password be encrypted using this code and bypass the process in which PAM handles encryption of the password even if the user does not intend to save and reuse credentials? |
Correct, I opted for consistency to keep the hashing of passwords in a single place. I thought it might confuse users if they specify a password them PAM hashes and sets it in the shadow file and the config contains a different hash generated by archinstall. |
|
The PKGBUILD will need |
|
@codefiles Update the PR.
which means it'll be ignore and the default value of 5 applied. I've hardcoded it now to 5 instead of 9 but did not implement the actual reading of the value from |
|
The ISO just reflects how I would have preferred you to keep |
|
Thanks for the feedback, I've included attempting to read the value from |

Currently the defined user password is stored in plaintext in the
user_credentials.jsonfile if the configuration is saved. A security concern was originally reported in #1111 and another related issue to allow users to define hashes instead of passwords #1029.Based on this announcement in 2023-09-22 archlinux is using
yescryptas by default https://archlinux.org/news/changes-to-default-password-hashing-algorithm-and-umask-settings/.After some research there doesn't seem to be a python package capable of generating a
yescrypthash. But as @Torxed suggested down in the comments we can loadlibcrypt.soand call thecryptfunction. This generally used on Linux systems for encryption and therefore a reliable option.So far the password for a defined user account has been set with
chpasswd username:password, however, the command also allows to pass a password hash withchpasswd --encrypted, which means we can first generate ayescrypthash and then pass it tochpasswd!superuserPreviously
New
Not solved (yet)
The disk encryption password will continue to be stored in cleartext as I haven't seen any obvious way to use a hash in luks nor for HSM devices.