
I'm building a SaaS where some data needs to be stored in an encrypted format.

You should be able to access this data from any device, as long as you remember your account credentials.

Also, if you want it, you should be able to share this encrypted data with another user.

I came up with the following idea:

  • when an account is created, I generate a private key and a public key (on client side)
  • the private key is encrypted using the sign-up password (still, on client side)
  • both the hashed password and the encrypted private key will be sent to the server

After the user logs in, the encrypted private key would be fetched from the server and decrypted on the client using the password (which is known only to the user).

Even if the server database were exposed, there would be no way to decrypt the private keys. To decrypt a private key you would need to know the plain-text user password, which obviously is not stored on the server.

Is this a bad idea? What are the security risks I should be aware of? Do you see any security issue?

  • So the same password is used for login at the server and for decrypting the private key? Assuming that the password is sent to the server when logging in (typical approach and you don't say otherwise) it means that the server also gets at this moment access to the password for the private key (since it is the same) and can decrypt it - clearly insecure. Commented Nov 16, 2024 at 18:17
  • OP, See protonmail.com/blog/encrypted_email_authentication for an interesting read on how ProtonMail uses SRP to derive two keypairs from a user's password - one that is used to authenticate with the server, and another that is used for client-side encryption. Commented Nov 16, 2024 at 18:56
  • @SteffenUllrich I see your point. What if (by using one of the methods suggested by the other users) the authentication key is different from the data encryption one? Commented Nov 16, 2024 at 23:41
  • @lorenzo: as long as it is impossible to derive the encryption key from the authentication key it should be fine. Note that a simple fast hash with just a different prefix might not be sufficient if the input to the hash is only a (possibly simple) password. In this case the server with access to the authentication key could try to brute-force the password and then derive the encryption key from it. Commented Nov 17, 2024 at 6:07

1 Answer


It looks like you're trying to build a zero-knowledge end-to-end encryption system. I commend your consideration of security and privacy. However, be aware that these systems are hard to build right.

Fundamentally, the scheme you describe makes sense. Client-side key generation is of course essential, as is client-side encryption of that key. The catch comes from where the key-encrypting-key (KEK) which encrypts (or "wraps") the user's private key (PvK) comes from. Deriving it from the user's password is an obvious candidate - that's a secret that the user already has, and has already entered into the client - but it comes with a bunch of pitfalls.

  • Typically, the password is sent to the server during login. Obviously, if you don't want the server to be able to re-derive the KEK (or otherwise access the PvK), then the server can never see the password. But that complicates user authentication. One option is to perform extra hashing of the user's password on the client. For example, use a secure key derivation / password hashing function with a known salt (e.g. the username) to derive the KEK, and then hash the KEK itself another time to derive the authentication secret (the value that is sent to the server in place of the password).
  • The server should further hash this authentication secret (password hash) with a server-only random salt before storing into / verifying against the database, to mitigate the risk of database leaks.
  • If you encrypt anything with a key derived from a password, then changing passwords can become expensive. A password-derived key should not be used to encrypt data directly, otherwise in order to change a password you have to re-encrypt everything. However, encrypting just your PvK is generally fine; on password change, you only need to decrypt the PvK with the old password-derived KEK and re-encrypt with the new one.
  • Forgotten passwords mean complete loss of user data. There are various ways to mitigate this, such as generating a recovery key that also wraps the PvK (or even just displaying the PvK in hex format or something) and telling the user to copy it down and store it somewhere safe, but fundamentally this is a hard step. People will forget their passwords, and you must plan for this, though the plan can be as simple as clearly advising "if you forget this you will lose all of your data in the account, forever" if you want.
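
To make the first two bullets concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a reviewed design: the function names, the `APP_LABEL` prefix, the scrypt parameters, and the "login" HMAC label are all hypothetical choices made for this example, and a real system should take its parameters from an existing, audited design.

```python
import hashlib
import hmac
import secrets

# Hypothetical application label, mixed into the salt so that the same
# username on another service does not produce the same salt.
APP_LABEL = b"example-saas/v1/"


def client_derive(username: str, password: str) -> tuple[bytes, bytes]:
    """Client side: derive (kek, auth_secret) from the credentials.

    The KEK never leaves the client; only auth_secret is sent to the server.
    """
    # Known, per-user salt (the username), as suggested in the answer.
    salt = APP_LABEL + username.strip().lower().encode("utf-8")
    # Slow, memory-hard derivation of the key-encrypting key.
    # n, r, p are assumed example values, not a recommendation.
    kek = hashlib.scrypt(
        password.encode("utf-8"), salt=salt, n=2**14, r=8, p=1, dklen=32
    )
    # Hash the KEK once more (one-way) to get the value used for login.
    auth_secret = hmac.new(kek, b"login", hashlib.sha256).digest()
    return kek, auth_secret


def server_register(auth_secret: bytes) -> tuple[bytes, bytes]:
    """Server side: hash the received secret with a random, server-only salt."""
    salt = secrets.token_bytes(16)
    stored = hashlib.sha256(salt + auth_secret).digest()
    return salt, stored


def server_verify(auth_secret: bytes, salt: bytes, stored: bytes) -> bool:
    """Server side: recompute the salted hash and compare in constant time."""
    candidate = hashlib.sha256(salt + auth_secret).digest()
    return hmac.compare_digest(candidate, stored)
```

On sign-up the client would call `client_derive`, wrap the PvK with `kek`, and send `auth_secret` plus the wrapped PvK; on login it repeats the derivation and the server checks the result with `server_verify`. Because `auth_secret` is a one-way function of the KEK, the server cannot recover the KEK from it except by brute-forcing the password through the slow scrypt step.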

An additional consideration for data sharing and asymmetric cryptography: Asymmetric keys shouldn't ever be used to encrypt data directly, at least not of any significant length. They're far too computationally expensive to use for that, and not designed for bulk data encryption operations anyhow. However, they can be used to secure symmetric cryptographic keys that are used for bulk data encryption; this is known as a hybrid cryptosystem and is how ~everything using asymmetric encryption works. Generate (on the client) a data encryption key (DEK) for each datum, encrypt each datum with its DEK, and encrypt each DEK once per user who has access, using that user's public key. Store the collection of encrypted copies of the DEK for each datum along with the encrypted datum itself. This lets you share specific data with other users, without giving them the keys to all of your data.
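
A rough sketch of that hybrid layout, assuming the third-party `cryptography` package is installed and using RSA-OAEP to wrap an AES-256-GCM DEK. The record layout, the function names, and the choice of RSA are assumptions made for illustration only; a higher-level library that hides these primitives would be preferable in practice.

```python
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# OAEP with SHA-256, used to wrap the DEK under each recipient's public key.
OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)


def encrypt_datum(plaintext: bytes, recipients: dict) -> dict:
    """Encrypt one datum with a fresh DEK and wrap the DEK per recipient.

    `recipients` maps a user id to that user's RSA public key.
    """
    dek = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)  # unique per encryption under this DEK
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, None)
    wrapped = {user: pub.encrypt(dek, OAEP) for user, pub in recipients.items()}
    return {"nonce": nonce, "ciphertext": ciphertext, "wrapped_deks": wrapped}


def decrypt_datum(record: dict, user: str, private_key) -> bytes:
    """Unwrap this user's copy of the DEK, then decrypt the datum."""
    dek = private_key.decrypt(record["wrapped_deks"][user], OAEP)
    return AESGCM(dek).decrypt(record["nonce"], record["ciphertext"], None)


def share_datum(record: dict, owner: str, owner_private_key,
                new_user: str, new_public_key) -> None:
    """Grant access: unwrap the DEK as the owner, re-wrap it for new_user."""
    dek = owner_private_key.decrypt(record["wrapped_deks"][owner], OAEP)
    record["wrapped_deks"][new_user] = new_public_key.encrypt(dek, OAEP)
```

Sharing only needs the recipient's public key, which the server can hand out freely; the datum itself is never re-encrypted, and the recipient's password-derived KEK is never involved.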

Finally, as with all cryptosystems, there are a few solid pieces of advice:

  • Never roll your own. Definitely don't create your own primitives (ciphers, hash functions, modes of operation, etc.) unless you have serious backgrounds in both math and cryptography. If at all possible, don't create your own constructions (authenticated encryption schemes, key exchange protocols, etc.). If you can help it, don't even design novel cryptosystems (such as this one); if there isn't an off-the-shelf design you can use, at least look at existing ones that are similar (for example, this system is similar to what zero-knowledge password managers use), copy from them where possible, and where you can't copy, understand why they made the choices they made and take that into consideration.
  • Relatedly, always use well-tested implementations where possible. Don't write a cryptographic primitive yourself, even a well-known one. Use a library for that. Ideally, the library should avoid you even touching the primitives at all, though; you want a library that takes as few inputs from you as possible, to avoid the risk that you generated or handled something insecurely.
  • Don't forget that encryption does not inherently provide authentication (proof that the message was encrypted by who you expect) or even integrity (proof that the message wasn't tampered with; authentication inherently provides this too), and indeed many encryption schemes are trivially vulnerable to bit-flipping attacks. Libraries you should use will only offer constructions that include protections against this; other libraries (looking at you, OpenSSL) will make you jump through a bunch of hoops even if you think you're using authenticated encryption, and should be avoided.
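
The bit-flipping point in the last bullet can be shown in a few lines. This is a small demonstration, assuming the third-party `cryptography` package is installed; the helper names and the "amount=100" message are made up for the example. It encrypts the same message with unauthenticated AES-CTR and with authenticated AES-GCM, then flips one ciphertext byte in each.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def flip(ciphertext: bytes, index: int, old: bytes, new: bytes) -> bytes:
    """XOR one ciphertext byte so the plaintext byte `old` becomes `new`."""
    tampered = bytearray(ciphertext)
    tampered[index] ^= old[0] ^ new[0]
    return bytes(tampered)


def ctr_roundtrip_with_tampering(plaintext: bytes) -> bytes:
    """AES-CTR has no integrity check: the modified message decrypts fine."""
    key, nonce = os.urandom(32), os.urandom(16)
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    ciphertext = enc.update(plaintext) + enc.finalize()
    tampered = flip(ciphertext, 7, b"1", b"9")  # attacker never sees the key
    dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
    return dec.update(tampered) + dec.finalize()


def gcm_detects_tampering(plaintext: bytes) -> bool:
    """AES-GCM is authenticated: the same modification is rejected."""
    key, nonce = AESGCM.generate_key(bit_length=256), os.urandom(12)
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
    try:
        AESGCM(key).decrypt(nonce, flip(ciphertext, 7, b"1", b"9"), None)
    except InvalidTag:
        return True
    return False


print(ctr_roundtrip_with_tampering(b"amount=100"))  # b'amount=900'
print(gcm_detects_tampering(b"amount=100"))         # True
```

The attacker in the CTR case changes "100" to "900" without knowing the key, and the recipient has no way to notice, which is why the wrapped PvK, the wrapped DEKs, and the data itself should all use an authenticated construction.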

There are more considerations - this is a very complicated topic, with a bunch of problems such as "if you're doing this via a webapp, it is ~impossible to make it secure against a malicious/compromised web server" - but that should be enough to give you some ideas of the typical pitfalls here.

  • I am deeply grateful for your exhaustive reply. The only thing I still don't understand is how I'd share an encrypted datum with a different user, since I don't have their KEK for re-encrypting the datum's DEK. Probably I'm missing something; I can't see how to fit asymmetric encryption into your 3rd listed point. Commented Nov 16, 2024 at 23:26
  • Sorry about that, I remembered in the middle of writing that bullet point that you wanted data sharing via asymmetric crypto rather than just using symmetric for everything, and didn't fix everything up properly. I think it's fixed up now. Just beware: this is still way underspecified compared to the documentation I would create if I were designing this system professionally, nor did I take my own advice and reference existing designs (aside from working from memory of those designs) as I wrote it. For highly sensitive stuff, get somebody who thoroughly knows the topic to carefully review it! Commented Nov 17, 2024 at 19:29
  • I would say that the client-side hashing, in the setup you suggest, should not be done with a predictable salt as this opens the door to more precomputation attacks. The salt used in the first hashing should be unique and unpredictable, but the salts used in subsequent hashes can be predictable. Commented Nov 18, 2024 at 14:09
  • @n-l-i: The salt is not meant to be a secret. Its purpose is to prevent precomputed rainbow-table attacks. It just needs to be unique (--> usernames are okay). So, whether you use the encoded username as the salt or some other byte array makes no difference in terms of security, from my perspective. Commented Nov 18, 2024 at 21:12
  • @n-l-i Where / how do you propose storing or deriving an unpredictable salt that must be available on every client (even ones that have never been used), must be unique to every user, and must be available before authentication (since their use occurs as part of deriving the login credential)? I proposed the username because it's a unique value that the user can themselves reliably supply. You could have the server generate and store client-hash salts but they need to be exposed unauthenticated, so an attacker can get them too. Commented Nov 19, 2024 at 8:11
