Λ ⚬ λ: Functional Lattice Cryptography
There are two ways we can make SymmSHE ciphertexts more type safe: (1) in the implicit power of `g` they carry around, and (2) in the degree of the `Polynomial`. In both cases, the current lack of type safety leads to some implementation holes (calls to `error`) and potential usage errors.
Currently we have

```haskell
data CT m zp r'q = CT Encoding Int zp (Polynomial r'q)
```

where the (runtime) `Int` carries the implicit "power of `g`" that is removed upon decryption.

However, this power of `g` can actually be known at compile time: when multiplying two ciphertexts, the result's power is one more than the sum of the operands' powers. When adding ciphertexts, we do some work to equalize their powers to the larger of the two, which is also the power of the output. There's also `absorbGFactors`, the helper function for `embedCT` and `twaceCT`, which makes the power equal to zero. These latter functions even have some holes that would be removed by putting the power into the type.
The proposal is to instead have:

```haskell
data CT k m zp r'q = CT Encoding zp (Polynomial r'q)
```

where typically `k :: Nat` (but that doesn't have to be enforced in the type declaration, just as `m :: Fact` isn't). We could then restrict `embedCT` and `twaceCT` to work only for `k ~ Z`, and make `absorbGFactors` go from arbitrary `k :: Nat` to `Z`. (Also, `embedCT` and `twaceCT` should first just call `absorbGFactors` themselves -- there are already comments saying that this is how they should work, and I'm not sure why it isn't already done.)
The slight drawback is that `CT k m zp r'q` wouldn't be an instance of `Additive` or `Ring` anymore, because the `k` types of the inputs and outputs don't always match. (This is why, long ago, I didn't put `k` into the type in the first place.) But this is hardly a loss; we can just define specialized `addCT` and `mulCT` functions (and/or operators) that have the right types. We never really use the polymorphism of `+` and `*` to work with a type that could be a `CT` or something else.
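A rough sketch of the types these specialized operations would have, following the power-of-`g` rules above. `Max`, `S`, and `+` are assumed type-level naturals/operations, and all names are illustrative, not existing API:

```haskell
-- Addition equalizes powers, so the result carries the larger one;
-- multiplication yields one more than the sum of the operands' powers.
addCT :: CT j m zp r'q -> CT k m zp r'q -> CT (Max j k) m zp r'q
mulCT :: CT j m zp r'q -> CT k m zp r'q -> CT (S (j + k)) m zp r'q

-- absorbGFactors goes from an arbitrary power to zero, and
-- embedCT/twaceCT would then require a power-zero input (k ~ Z).
absorbGFactors :: CT k m zp r'q -> CT Z m zp r'q
```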
`Polynomial r'q` implicitly carries its degree as a runtime value. But this degree can actually be known at compile time: a sum of ciphertexts has the maximum degree of the operands, a product of ciphertexts has the sum of the degrees of the operands, a key-switched ciphertext has degree one, etc. Moreover, the key-switching functions have holes where the input ciphertext has too large a degree.

The proposal is to include the degree as a type argument of `CT`:

```haskell
data CT d k m zp r'q = CT Encoding zp (Polynomial d r'q)
```

where we'd have a home-grown `Polynomial` implementation indexed by `d :: Nat`. (We aren't using much from numeric-prelude's `Polynomial`, so a new implementation should be straightforward.) We could then give all the functions the appropriate types, restricting the input types as necessary.
I'm not 100% sure what the downstream effects of this change would be in ALCHEMY. Currently we always have explicit `CT` types anyway, so those could be changed as needed without difficulty. But are there any effects on the type families that map plaintext types to ciphertext types? How would these families choose `k` and/or `d`? The degree may not be a problem because we always relinearize after multiplication, but `k` seems like it can grow without bound in general. So this issue requires some more thought.
I read your paper (https://web.eecs.umich.edu/~cpeikert/pubs/multikey.pdf); do you plan on including such a scheme in Lol?
We must be careful when any moduli in a product are "bad" (i.e., not both prime and 1 mod `m`). This is because our current representation forces all components to work over `Complex Double`, thus limiting their size to ~25 bits, rather than the expected ~31 bits.

It would be nice if the "good" components of the modulus could be represented with their integer CRT, while the bad components would continue to be represented with `Complex Double`. This would allow the good moduli to be a full 31 bits (bad moduli are still restricted to 25). To achieve this, we probably need to represent a `Cyc` over a ring product as a product of `Cyc`s (using a data family?).
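A possible shape for such a data family, purely as a sketch (`CycProd` and its constructors are illustrative names, not existing API):

```haskell
-- Each component of a product modulus picks its own representation:
-- integer CRT for good moduli, Complex Double for bad ones.
data family CycProd t m zq
newtype instance CycProd t m (ZqBasic q z) = CycSingle (Cyc t m (ZqBasic q z))
data    instance CycProd t m (a, b)        = CycPair !(CycProd t m a) !(CycProd t m b)
```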
Currently, a client can remove the `Tagged gad` annotation from any of the types related to the gadget methods, and use the underlying lists in unsafe ways. It would be safer to have an annotation that can be removed only via *our* interfaces. (This annotation could just be a `newtype` for `Tagged`.) For example:

- `correct` would remove the tag, as it does currently.
- The `knapsack` method would compute the dot-product of two tagged quantities, namely, `gadget` (or something derived from it) and an output of `decompose`. In fact, these quantities perhaps should have *different kinds of tags*, to reflect their qualitatively different nature.

Citing from http://hydra.nixos.org/build/49722958/nixlog/1/raw:
```
running tests
Running 1 test suites...
Test suite test-lol-cpp: RUNNING...
Test suite test-lol-cpp: FAIL
Test suite logged to: dist/test/lol-cpp-0.0.0.3-test-lol-cpp.log
0 of 1 test suites (0 of 1 test cases) passed.
```
Currently, `class CRTrans r` takes just one argument (the base ring `r`), and its method `crtInfo :: Int -> Maybe (CRTInfo r)` takes the index of the CRT. At some point there may have been a good reason for having this argument, but currently I don't see it.

We should consider changing the class to `CRTrans m r`, possibly with a `Reflects m Int` superclass (but maybe not). Then `crtInfo` would have type `TaggedT m Maybe (CRTInfo r)`, or perhaps `Tagged m (Maybe (CRTInfo r))`. This tagging seems much more natural and consistent with what we do elsewhere in the code.
Before attempting this change, we should check the existing `CRTrans` instances and usages. For example, we'll need instances where the argument `m` may have kind `Factored`, `PrimePower`, or `Nat`. We already have similar (non-overlapping) instances for `Reflects`, so I don't think this presents any problem.
`RescaleCyc` nominally lets us supply the rescaling basis, but this can be done with `Rescale` by rescaling tagged `Cyc` elements. (Note: this should allow a `Rescale(Cyc)` instance for `UCyc C`.)

The main challenge in removing the class is that we currently have overlapping instances, which seem inherent due to having a different algorithm for product rings. This means that the compiler can't tell that we have a `Rescale` instance if `m` changes. If we did not have overlapping instances, `m` could appear in the instance head and GHC could still deduce the constraint.

It's unclear how to move forward on this issue, but we all agree that it would be better if `RescaleCyc` didn't have to exist.
The current GPL license for this type of project (a library) is somewhat limiting: derived works cannot easily be used in products that may become commercial (which very often requires proprietary parts to be closed).

Would you consider changing the license to LGPL, or even Apache-2.0, to preserve the openness of the project but not limit the creation of commercial products on top of it?
Are the two files challenges.proto and RLWE.proto all one needs to read the challenges? Please also specify which should be used to read the .challenge file and which for the .instance file.

Could you give example code for reading an instance, in C or C++ (or actually in any language)? It would be super helpful. Thanks!
The current definition of (say) `rescaleCyc Pow` for a tuple in the Dec basis may convert the entire tuple to the Dec basis. The offending code is `(b - reduce z)`: `reduce z` would be in the Dec basis, while each `b` is in the Pow basis. It is up to `(-)` to determine which term gets converted. To ensure the most efficient rescaling, we should "adviseBasis" to attempt to put `z` into the basis of `b` (before reducing). This will make the case above more efficient, while not impacting the efficiency of rescaling a CRT input.
The `Additive` and `Ring` instances for `CycRep` currently define `(*) = fmap (*)`. This could be (is?) much slower than letting individual `Tensor` instances define their own instances. Instead, we should add an entailment for `Additive` and `Ring` (any others?) so that `CycRep`'s instances simply call the corresponding tensor instance.
We have logic for tensoring prime transforms to prime-power transforms, and for tensoring prime-power transforms, in both Tensor.hs and RTCommon.hs.
In a personal communication, cpeikert also noted:
All the reindexing stuff (extIndices* and baseIndices*) between Zm^* and [n] has duplication. I toyed with the idea of having a special DSL for binary or ternary (0, +-1) matrices that have a single nonzero entry per row or column, which I think is general enough to capture all the two-index transformations without killing efficiency. We could even preserve the ext/baseIndices vectors if desired, but compute them in a nicer way.
Currently it's unclear, without inspecting our code, which way to associate product rings for optimal efficiency. This makes it difficult for users to write the best code. Is there some way we can improve this situation?

For example, if I do a key switch where `zq ~ b` and `zq' ~ (a,b)`, we end up calling the optimized `Cyc.RescaleCyc` instance. However, if I choose `zq' ~ (b,a)`, I end up with the generic, less efficient `Cyc.RescaleCyc` instance. One way to resolve this particular idiosyncrasy is to add a third `Cyc.RescaleCyc` instance, from `(a,b)` to `a`.

However, another overlapping instance only compounds this problem: the overlapping `RescaleCyc` instances mean that some valid rescales are not allowed. Take, for example, rescaling from three moduli to one, like this: `((a,b), c) -> c`. This matches the instance head for the optimized instance, but not the instance constraints, since we don't have, e.g., `Unbox (LiftOf (a,b)) ~ Unbox Integer`. Thus the compiler throws an error. If we instead remove the optimized instance altogether, the compiler picks the generic instance and accepts the code, since the generic instance doesn't require `Unbox Integer`.
Consider making `CTensor` into a data family, much like `Data.Vector.Unboxed.Vector` is. This would give a nicer data layout for pair element types, and would allow the C code not to have to worry about pairs at all. It might even lead to better performance due to memory layout.
Issues we identified:

- In the `Tensor` interface, because the element type isn't an argument, we have no way to be polymorphic over that element type in our (one) `Tensor` instance for `CT`. This is a problem for `scalarPow`, e.g.: we can't differentiate primitive inputs from pair inputs.
- We could add `r` to the `Tensor` class. But this is a big change, and we don't see how to implement functions like `fmapT` that change the element type.
- We could add a new class, in `CT` land, that has the element type `r` as an argument. This would let us be polymorphic on the element type. The new class might be an almost-duplicate of `Tensor` but with this extra argument, and (probably) without the troublesome `fmapT`, which would be implemented in the `Tensor` instance.

Long term, the right way to go here may be to make `Tensor` more like the two-argument `Vector` class. It seems like `Vector`'s class methods are somehow a minimal DSL for vector/stream operations. We would be aiming for some minimal DSL that lets us express decomposable linear transforms. Though it's not clear how that would interact with C code.
Currently `UCyc` uses `Double`s to sample Gaussians, but this is likely not enough precision for security proofs. We need a way to let users choose what level of precision they want.
ALCHEMY's type families for tracking noise levels and choosing the moduli from a pool are really complicated, and tax GHC so badly that we often can't even compile.

So, an idea is to make noise-tracking and modulus selection dynamic (at runtime) rather than static, and to do this within an SHE implementation. (This should probably sit as an alternative SHE alongside the existing SymmSHE, since it's a very different approach, and there are merits to both.)

The basic idea is that `CT` would be parameterized by a list of moduli that it could use, though the ciphertext might use only a prefix of that list for its actual modulus. Additionally, a `CT` value would internally carry a noise estimate (essentially, its pNoise), which would be used to decide when to mod-switch. When two ciphertexts (with the same type-level list of moduli) are combined in some way, their actual moduli might not match and would need to be reconciled, probably by switching down to match the one with the smaller pNoise.

I'm not sure about the exact form of the `CT` data type and its implementation. It would probably need to be a GADT: when you match on the constructor, you would discover the type of the actual modulus being used, and it should give you some guarantee (in the form of a constraint) that it's a prefix of the full modulus list. When combining two ciphertexts of the same type, their respective constraints should allow for determining the largest common prefix (or whatever), and getting a `Rescale[Cyc]` constraint, and anything else that would be needed. This all seems doable in principle, but is nontrivial to work out properly.
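The shape of that GADT might be roughly as follows; `Prefix` and `Body` are placeholders for a prefix-witness constraint and the actual ciphertext payload, neither of which exists yet:

```haskell
-- 'used' is existentially quantified; matching on the constructor
-- reveals it, along with a witness that it is a prefix of qs.
data CT (qs :: [k]) m zp where
  CT :: (used `Prefix` qs)
     => Double           -- runtime noise estimate (pNoise)
     -> Body used m zp   -- payload over the actual (prefix) modulus
     -> CT qs m zp
```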
Tracking pNoise and selecting moduli dynamically has some advantages. Obviously, the main one is the removal of many complex type families, including ones like `PreMul`. Less significantly, the noise estimates can be made somewhat more accurate, because they can use the actual values of the moduli and degrees (n' = phi(m')) and so forth, rather than a "one size fits all" formula.
The main disadvantage is that we'd have runtime `error`s (or incorrect decryption) in cases where the moduli in the `CT` type aren't big enough to handle the results of a homomorphic computation. This is not the end of the world, as one can just add more moduli at one place in the code.
What would be really cool is to get something like the best of both worlds, where the SHE implementation is able to do some global (but runtime) analysis to figure out how many moduli of a huge given list are needed, and starts with those for fresh ciphertexts.
Currently, `CElt` is a catch-all constraint synonym. The result is a dishonest interface that overly restricts functions and instances (e.g., requiring `NFData` to get a random sample or to add two ring elements). We fixed up `TElt`; `CElt` should be analogous for `UCyc`/`Cyc`: it should contain the constraints needed to hold the tensor (`Tensor t`, `TElt t r`, and `TElt t (CRTExt r)`), as well as to change bases (`CRTrans r`), but probably not `Random r`, `Eq r`, `NFData r`, etc. Unfortunately, due to `RT`'s implementation of `divGCRT`, changing bases requires `ID r` and `ZT r`.
criterion-measurement-0.1.2.0 does not compile on Apple Silicon (#238). The problem is fixed as of criterion-measurement-0.1.3.0; raising the version requirement to criterion-measurement-0.1.3.0 solves the problem.
We discussed adding to `decrypt` (and `absorbGFactors`?) the constraint that the PT modulus `p` must be coprime with the odd factors of `m`. Adding this constraint limits what types can be used for the plaintext modulus, because the constraint only works if `p :: Factored`. But this restriction is already in place for, e.g., ring tunneling, where we have the same constraint.
Our current representation of ring products uses Haskell tuples, which are lazy. It would probably be better to use a strict data structure.
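The minimal such replacement would be a strict pair; this is only a sketch of the obvious candidate:

```haskell
-- Strict in both components, unlike the lazy (,).
data Pair a b = Pair !a !b
  deriving (Show, Eq)
```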
The paper assumes wlog that
Once we get TravisCI set up, we can add Lol to the nightly build on Stackage: https://github.com/fpco/stackage/#get-your-package-included.
We currently need a bogus instance of `IntegralDomain` for product rings in order to be able to call `divG`. This instance fails at runtime when `oddRadical m` is not invertible mod `p`, for example if we try to use a "bad" plaintext modulus with SHE. The error in that case would be a runtime error about a failure of `modInv`, but we would instead like a more meaningful error saying something about the necessary conditions that are violated. We proposed a special class, like `CRTrans`, that just supplies `divModOddRad`.