From b48e844564fc27e6bfdf9d4e5721c3650b31e50c Mon Sep 17 00:00:00 2001
From: Michael Snoyman
Date: Tue, 14 Feb 2017 16:31:54 +0200
Subject: [PATCH] Hackage Security and Stack

---
 posts.yaml                      |   3 +
 posts/hackage-security-stack.md | 136 ++++++++++++++++++++++++++++++++
 2 files changed, 139 insertions(+)
 create mode 100644 posts/hackage-security-stack.md

diff --git a/posts.yaml b/posts.yaml
index c39cd8c..8d0b2f1 100644
--- a/posts.yaml
+++ b/posts.yaml
@@ -1,3 +1,6 @@
+- file: posts/hackage-security-stack.md
+  title: Hackage Security and Stack
+  time: 2017-02-14T15:24:20Z
 - file: posts/stackage-design-choices.md
   title: "Stackage design choices: making Haskell curated package sets"
   day: 2017-01-23
diff --git a/posts/hackage-security-stack.md b/posts/hackage-security-stack.md
new file mode 100644
index 0000000..24c76f7
--- /dev/null
+++ b/posts/hackage-security-stack.md
@@ -0,0 +1,136 @@
+Back in 2015, there were two proposals made for securing package
+distribution in Haskell. The Stackage team proposed and implemented a
+solution using HTTPS and Git, which was then used as the default in
+Stack. Meanwhile, the Hackage team moved ahead with
+hackage-security. Over the past few weeks, I've been working on moving
+Stack over to hackage-security (more on motivation below). The current
+status of the overall hackage-security roll-out is:
+
+* Hackage is now providing the relevant data for hackage-security (the
+  01-index.tar file and signature files)
+* cabal-install will move over to hackage-security in its next release
+* The FP Complete Hackage mirror is using hackage-security (and in
+  particular
+  [Herbert's hackage-mirror-tool](https://github.com/hvr/hackage-mirror-tool))
+  to run its
+  [S3-backed mirror](https://www.fpcomplete.com/blog/2016/09/updated-hackage-mirroring).
+* On the master branch, Stack defaults to using hackage-security for
+  downloading package metadata. We may even
+  [remove support for Git-based indices entirely](https://github.com/commercialhaskell/stack/tree/drop-git-index),
+  but that's a discussion for another day.
+
+One upside to this is more reliable package index download time. We
+have had complaints from some firewalled users of slow Git clone time,
+so this is a good thing. We're still planning on maintaining the
+Git-based package indices for people using them (to my knowledge they
+are still being used by Nix, and all-cabal-metadata is still used to
+power a lot of the information on stackage.org).
+
+However, there's one significant downside I've encountered in the
+current implementation that I want to discuss.
+
+## Background
+
+Quick summary of how hackage-security works: there is a `01-index.tar`
+file, the contents of which I'll discuss momentarily. This is the file
+which is downloaded by Stack/cabal-install when you "update your
+index." It is signed by a cryptographic algorithm specified within the
+hackage-security project, and whenever a client does an update, it
+must verify the signature. In theory, when that signature is verified,
+we know that the contents of the `01-index.tar` file are unmodified.
+
+Within this file are two (relevant) kinds of files: the `.cabal` files
+for every upload to Hackage (including revisions), and `.json` files
+containing metadata about the package tarballs
+themselves. Importantly, this includes a SHA256 checksum and the size
+of the tarball. Using these already-validated-to-be-correct JSON
+files, we can download and verify a package tarball, even over an
+insecure connection.
+
+The alternative Git-based approach that the Stackage team proposed has
+an almost-identical JSON file concept in the all-cabal-hashes
+repo. Originally, these were generated by downloading tarballs from
+https://hackage.haskell.org (note the HTTPS). However, a number of
+months back it became known that the connection between the CDN in
+front of Hackage and Hackage itself was not TLS-secured, and therefore
+reliance on HTTPS was not possible. We now rely on the JSON files
+provided by hackage-security to generate the JSON files used in the
+Git repo.
+
+## The problem
+
+With that background, the bug is easy to describe: sometimes the
+`.json` files are missing from the `01-index.tar` file. This was
+[originally opened in April 2016](https://github.com/haskell/hackage-server/issues/488)
+(for Americans: on tax day no less), and then
+[I rediscovered the issue three weeks ago](https://github.com/haskell/hackage-security/issues/183)
+when working on Stack.
+
+Over the weekend, another `.json` file went missing, resulting in
+[the FP Complete mirror not receiving updates](https://github.com/hvr/hackage-mirror-tool/issues/2)
+until I
+[manually updated the list of missing index files](https://github.com/hvr/hackage-mirror-tool/pull/3).
+Due to the inability to securely generate the `.json` file in the
+`all-cabal-hashes` Git repo without the file existing upstream, that
+file is now missing in `all-cabal-hashes`, causing downstream issues
+to the Nix team.
+
+## How it manifests
+
+There are a number of outcomes to be aware of from this issue:
+
+* The FP Complete mirror, and any other mirror using Herbert's tool,
+  will sometimes stop updating if a new JSON file is missing. This is
+  an annoyance for end users, and a frustration for the mirror
+  maintainers. Fortunately, updating the mirror tool code with the
+  added index isn't too heavy a burden. Unfortunately, due to the lack
+  of HTTPS between Hackage and its CDN, there's no truly secure way to
+  do this update.
+* End users cannot currently use packages securely if they are
+  affected by this bug. You can
+  [see the full list](https://github.com/snoyberg/hackage-mirror-tool/blob/c58d3fe3ab893e57346130bae2e5906c0efedc4a/src/IndexShaSum.hs#L161)
+  at the time of writing this post.
+* Stack has had code in place to reject indices that do not provide
+  complete signature cover for a long while (I _think_ since its
+  initial release). Unfortunately, this code cannot be turned on for
+  hackage-security (which is how I discovered this bug in the first
+  place). We can implement a new functionality with weaker
+  requirements (refuse to download a package that is missing signature
+  information), but ideally we could use the more strict semantics.
+* The Nix team cannot rely on hashes being present in
+  `all-cabal-hashes`. I can't speak to the Nix team internal
+  processes, and cannot therefore assess how big an impact that is.
+
+## Conclusion
+
+Overall, I'm still very happy that we've moved Stack over to
+hackage-security:
+
+* It fixed an immediate problem for users behind a firewall, which we
+  otherwise would have needed to work around with new code
+  (downloading a Git repo snapshot). Avoiding writing new code is
+  always a win :).
+* Layering the HTTPS/Git-based security system on top of
+  hackage-security doesn't make things more secure, it just adds two
+  layers for security holes to exist in instead of one. From a
+  security standpoint, if Hackage is providing a security mechanism,
+  it makes sense to leverage it directly. Said another way: if it
+  turns out that hackage-security is completely insecure, our
+  Git-based layer would have been vulnerable anyway since it relied on
+  hackage-security.
+* By moving both Stack and cabal-install over to hackage-security for
+  client access, we'll be able to test that code more thoroughly,
+  hopefully resulting in a more reliable security mechanism for both
+  projects to share
+  ([small example of such stress-testing](https://github.com/haskell/hackage-security/issues/184)).
+* Stack has always maintained compatibility with some form of non-Git
+  index, so we've always had two code paths for index updates. As
+  hinted at above, this change opens the door to removing the
+  Git-based code path. And removing code is almost as good as avoiding
+  writing new code.
+* I would still feel more comfortable with the security of Hackage if
+  HTTPS was used throughout, if only as a level of sanity in case all
+  else fails. I hope that in the future the connection between Hackage
+  and its CDN switches from insecure to secure. I also hope that
+  cabal-install is still planning on moving over to using HTTPS for
+  its downloads.