Λ ⚬ λ: Functional Lattice Cryptography
There are two ways we can make SymmSHE ciphertexts more type safe: (1) in the implicit power of `g` they carry around, and (2) in the degree of the `Polynomial`. In both cases, the current lack of type safety leads to some implementation holes (calls to `error`) and potential usage errors.
Currently we have

```haskell
data CT m zp r'q = CT Encoding Int zp (Polynomial r'q)
```

where the (runtime) `Int` carries the implicit "power of `g`" that is removed upon decryption.

However, this power of `g` can actually be known at compile time: when multiplying two ciphertexts, the result's power is one more than the sum of the operands' powers. When adding ciphertexts, we do some work to equalize their powers to the larger of the two, which is also the power of the output. There's also `absorbGFactors`, the helper function for `embedCT` and `twaceCT`, which makes the power equal to zero. These latter functions even have some holes that would be removed by putting the power into the type.
The proposal is to instead have:

```haskell
data CT k m zp r'q = CT Encoding zp (Polynomial r'q)
```

where typically `k :: Nat` (but that doesn't have to be enforced in the type declaration, just as `m :: Fact` isn't). We could then restrict `embedCT` and `twaceCT` to work only for `k ~ Z`, and make `absorbGFactors` go from arbitrary `k :: Nat` to `Z`. (Also, `embedCT` and `twaceCT` should first just call `absorbGFactors` themselves -- there are already comments saying that this is how they should work, and I'm not sure why it isn't already done.)
The slight drawback is that `CT k m zp r'q` wouldn't be an instance of `Additive` or `Ring` anymore, because the `k` types of the inputs and outputs don't always match. (This is why, long ago, I didn't put `k` into the type in the first place.) But this is hardly a loss; we can just define specialized `addCT` and `mulCT` functions (and/or operators) that have the right types. We never really use the polymorphism of `+` and `*` to work with a type that could be a `CT` or something else.
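A rough sketch of the types these specialized operations would have, following the power-of-`g` rules above. `Max`, `S`, and `+` are assumed type-level naturals/operations, and all names are illustrative, not existing API:

```haskell
-- Addition equalizes powers, so the result carries the larger one;
-- multiplication yields one more than the sum of the operands' powers.
addCT :: CT j m zp r'q -> CT k m zp r'q -> CT (Max j k) m zp r'q
mulCT :: CT j m zp r'q -> CT k m zp r'q -> CT (S (j + k)) m zp r'q

-- absorbGFactors goes from an arbitrary power to zero, and
-- embedCT/twaceCT would then require a power-zero input (k ~ Z).
absorbGFactors :: CT k m zp r'q -> CT Z m zp r'q
```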
`Polynomial r'q` implicitly carries its degree as a runtime value. But this degree can actually be known at compile time: a sum of ciphertexts has the maximum degree of the operands, a product of ciphertexts has the sum of the degrees of the operands, a key-switched ciphertext has degree one, etc. Moreover, the key-switching functions have holes where the input ciphertext has too large a degree.

The proposal is to include the degree as a type argument of `CT`:

```haskell
data CT d k m zp r'q = CT Encoding zp (Polynomial d r'q)
```

where we'd have a home-grown `Polynomial` implementation indexed by `d :: Nat`. (We aren't using much from numeric-prelude's `Polynomial`, so a new implementation should be straightforward.) We could then give all the functions the appropriate types, restricting the input types as necessary.
I'm not 100% sure what the downstream effects of this change would be in ALCHEMY. Currently we always have explicit `CT` types anyway, so those could be changed as needed without difficulty. But are there any effects on the type families that map plaintext types to ciphertext types? How would these families choose `k` and/or `d`? The degree may not be a problem because we always relinearize after multiplication, but `k` seems like it can grow without bound in general. So this issue requires some more thought.
I read your paper (https://web.eecs.umich.edu/~cpeikert/pubs/multikey.pdf); do you plan on including such a scheme in Lol?
We must be careful when any moduli in a product are "bad" (i.e., not both prime and 1 mod `m`). This is because our current representation forces all components to work over `Complex Double`, thus limiting their size to ~25 bits, rather than the expected ~31 bits.

It would be nice if the "good" components of the modulus could be represented with their integer CRT, while the bad components would continue to be represented with `Complex Double`. This would allow the good moduli to be a full 31 bits (bad moduli are still restricted to 25). To achieve this, we probably need to represent a `Cyc` over a ring product as a product of `Cyc`s (using a data family?).
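A possible shape for such a data family, purely as a sketch (`CycProd` and its constructors are illustrative names, not existing API):

```haskell
-- Each component of a product modulus picks its own representation:
-- integer CRT for good moduli, Complex Double for bad ones.
data family CycProd t m zq
newtype instance CycProd t m (ZqBasic q z) = CycSingle (Cyc t m (ZqBasic q z))
data    instance CycProd t m (a, b)        = CycPair !(CycProd t m a) !(CycProd t m b)
```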
Currently, a client can remove the `Tagged gad` annotation from any of the types related to the gadget methods, and use the underlying lists in unsafe ways. It would be safer to have an annotation that can be removed only via *our* interfaces. (This annotation could just be a `newtype` for `Tagged`.) For example:

- `correct` would remove the tag, as it does currently.
- The `knapsack` method would compute the dot-product of two tagged quantities, namely, `gadget` (or something derived from it) and an output of `decompose`. In fact, these quantities perhaps should have *different kinds of tags*, to reflect their qualitatively different nature.

Citing from http://hydra.nixos.org/build/49722958/nixlog/1/raw:
```
running tests
Running 1 test suites...
Test suite test-lol-cpp: RUNNING...
Test suite test-lol-cpp: FAIL
Test suite logged to: dist/test/lol-cpp-0.0.0.3-test-lol-cpp.log
0 of 1 test suites (0 of 1 test cases) passed.
```
Currently, `class CRTrans r` takes just one argument (the base ring `r`), and its method `crtInfo :: Int -> Maybe (CRTInfo r)` takes the index of the CRT. At some point there may have been a good reason for having this argument, but currently I don't see it.

We should consider changing the class to `CRTrans m r`, possibly with a `Reflects m Int` superclass (but maybe not). Then `crtInfo` would have type `TaggedT m Maybe (CRTInfo r)`, or perhaps `Tagged m (Maybe (CRTInfo r))`. This tagging seems much more natural and consistent with what we do elsewhere in the code.
Before attempting this change, we should check the existing `CRTrans` instances and usages. For example, we'll need instances where the argument `m` may have kind `Factored`, `PrimePower`, or `Nat`. We already have similar (non-overlapping) instances for `Reflects`, so I don't think this presents any problem.
`RescaleCyc` nominally lets us supply the rescaling basis, but this can be done with `Rescale` by rescaling tagged `Cyc` elements. (Note: this should allow a `Rescale(Cyc)` instance for `UCyc C`.)

The main challenge in removing the class is that we currently have overlapping instances, which seem inherent due to having a different algorithm for product rings. This means that the compiler can't tell that we have a `Rescale` instance if `m` changes. If we did not have overlapping instances, `m` could appear in the instance head and GHC could still deduce the constraint.

It's unclear how to move forward on this issue, but we all agree that it would be better if `RescaleCyc` didn't have to exist.
The current GPL license for this type of project (a library) is somewhat limiting: derived works cannot easily be used in products that may become commercial (which very often requires proprietary parts to be closed).

Would you consider changing the license to LGPL, or even Apache-2.0, to preserve the openness of the project but not limit the creation of commercial products on top of it?
Are the two files challenges.proto and RLWE.proto all one needs to read the challenges? Please also specify which should be used to read the .challenge file and which for the .instance file.

Could you give example code for reading an instance, in C or C++ (or actually in any language)? It would be super helpful. Thanks!
The current definition of (say) `rescaleCyc Pow` for a tuple in the Dec basis may convert the entire tuple to the Dec basis. The offending code is `(b - reduce z)`: `reduce z` would be in the Dec basis, while each `b` is in the Pow basis. It is up to `(-)` to determine which term gets converted. To ensure the most efficient rescaling, we should "adviseBasis" to attempt to put `z` into the basis of `b` (before reducing). This will make the case above more efficient, while not impacting the efficiency of rescaling a CRT input.
The `Additive` and `Ring` instances for `CycRep` currently define `(*) = fmap (*)`. This could be (is?) much slower than letting individual `Tensor` instances define their own instances. Instead, we should add an entailment for `Additive` and `Ring` (any others?) so that `CycRep`'s instances simply call the corresponding tensor instance.
We have logic for tensoring prime transforms to prime-power transforms, and for tensoring prime-power transforms, in both Tensor.hs and RTCommon.hs.
In a personal communication, cpeikert also noted:
All the reindexing stuff (extIndices* and baseIndices*) between Zm^* and [n] has duplication. I toyed with the idea of having a special DSL for binary or ternary (0, +-1) matrices that have a single nonzero entry per row or column, which I think is general enough to capture all the two-index transformations without killing efficiency. We could even preserve the ext/baseIndices vectors if desired, but compute them in a nicer way.
Currently it's unclear, without inspecting our code, which way to associate product rings for optimal efficiency. This makes it difficult for users to write the best code. Is there some way we can improve this situation?

For example, if I do a key switch where `zq ~ b` and `zq' ~ (a,b)`, we end up calling the optimized `Cyc.RescaleCyc` instance. However, if I choose `zq' ~ (b,a)`, I end up with the generic, less efficient `Cyc.RescaleCyc` instance. One way to resolve this particular idiosyncrasy is to add a third `Cyc.RescaleCyc` instance, from `(a,b)` to `a`.

However, another overlapping instance only compounds this problem: the overlapping `RescaleCyc` instances mean that some valid rescales are not allowed. Take, for example, rescaling from three moduli to one, like this: `((a,b), c) -> c`. This matches the instance head for the optimized instance, but not the instance constraints, since we don't have, e.g., `Unbox (LiftOf (a,b)) ~ Unbox Integer`. Thus the compiler throws an error. If we instead remove the optimized instance altogether, the compiler picks the generic instance and accepts the code, since the generic instance doesn't require `Unbox Integer`.
Consider making `CTensor` into a data family, much like `Data.Vector.Unboxed.Vector` is. This would give a nicer data layout for pair element types, and would allow the C code not to have to worry about pairs at all. It might even lead to better performance due to memory layout.
Issues we identified:

- In the `Tensor` interface, because the element type isn't an argument, we have no way to be polymorphic over that element type in our (one) `Tensor` instance for `CT`. This is a problem for `scalarPow`, e.g.: we can't differentiate primitive inputs from pair inputs.
- We could add `r` to the `Tensor` class. But this is a big change, and we don't see how to implement functions like `fmapT` that change the element type.
- We could add a new class, in `CT` land, that has the element type `r` as an argument. This would let us be polymorphic on the element type. The new class might be an almost-duplicate of `Tensor` but with this extra argument, and (probably) without the troublesome `fmapT`, which would be implemented in the `Tensor` instance.

Long term, the right way to go here may be to make `Tensor` more like the two-argument `Vector` class. It seems like `Vector`'s class methods are somehow a minimal DSL for vector/stream operations. We would be aiming for some minimal DSL that lets us express decomposable linear transforms. Though it's not clear how that would interact with C code.
Currently `UCyc` uses `Double`s to sample Gaussians, but this is likely not enough precision for security proofs. We need a way to let users choose what level of precision they want.
ALCHEMY's type families for tracking noise levels and choosing the moduli from a pool are really complicated, and tax GHC so badly that we often can't even compile.

So, an idea is to make noise-tracking and modulus selection dynamic (at runtime) rather than static, and to do this within an SHE implementation. (This should probably sit as an alternative SHE alongside the existing SymmSHE, since it's a very different approach, and there are merits to both.)

The basic idea is that `CT` would be parameterized by a list of moduli that it could use, though the ciphertext might use only a prefix of that list for its actual modulus. Additionally, a `CT` value would internally carry a noise estimate (essentially, its pNoise), which would be used to decide when to mod-switch. When two ciphertexts (with the same type-level list of moduli) are combined in some way, their actual moduli might not match and would need to be reconciled, probably by switching down to match the one with the smaller pNoise.

I'm not sure about the exact form of the `CT` data type and its implementation. It would probably need to be a GADT: when you match on the constructor, you would discover the type of the actual modulus being used, and it should give you some guarantee (in the form of a constraint) that it's a prefix of the full modulus list. When combining two ciphertexts of the same type, their respective constraints should allow for determining the largest common prefix (or whatever), and getting a `Rescale[Cyc]` constraint, and anything else that would be needed. This all seems doable in principle, but is nontrivial to work out properly.
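The shape of that GADT might be roughly as follows; `Prefix` and `Body` are placeholders for a prefix-witness constraint and the actual ciphertext payload, neither of which exists yet:

```haskell
-- 'used' is existentially quantified; matching on the constructor
-- reveals it, along with a witness that it is a prefix of qs.
data CT (qs :: [k]) m zp where
  CT :: (used `Prefix` qs)
     => Double           -- runtime noise estimate (pNoise)
     -> Body used m zp   -- payload over the actual (prefix) modulus
     -> CT qs m zp
```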
Tracking pNoise and selecting moduli dynamically has some advantages. Obviously, the main one is the removal of many complex type families, including ones like `PreMul`. Less significantly, the noise estimates can be made somewhat more accurate, because they can use the actual values of the moduli and degrees (n' = phi(m')) and so forth, rather than a "one size fits all" formula.
The main disadvantage is that we'd have runtime `error`s (or incorrect decryption) in cases where the moduli in the `CT` type aren't big enough to handle the results of a homomorphic computation. This is not the end of the world, as one can just add more moduli at one place in the code.
What would be really cool is to get something like the best of both worlds, where the SHE implementation is able to do some global (but runtime) analysis to figure out how many moduli of a huge given list are needed, and starts with those for fresh ciphertexts.
Currently, `CElt` is a catch-all constraint synonym. The result is a dishonest interface that overly restricts functions and instances (e.g., requiring `NFData` to get a random sample or to add two ring elements). We fixed up `TElt`; `CElt` should be analogous for `UCyc`/`Cyc`: it should contain the constraints needed to hold the tensor (`Tensor t`, `TElt t r`, and `TElt t (CRTExt r)`), as well as to change bases (`CRTrans r`), but probably not `Random r`, `Eq r`, `NFData r`, etc. Unfortunately, due to `RT`'s implementation of `divGCRT`, changing bases requires `ID r` and `ZT r`.
criterion-measurement-0.1.2.0 does not compile on Apple Silicon (#238). The problem is fixed as of criterion-measurement-0.1.3.0; raising the version requirement to criterion-measurement-0.1.3.0 solves the problem.
We discussed adding to `decrypt` (and `absorbGFactors`?) the constraint that the PT modulus `p` must be coprime with the odd factors of `m`. Adding this constraint limits what types can be used for the plaintext modulus, because the constraint only works if `p :: Factored`. But this restriction is already in place for, e.g., ring tunneling, where we have the same constraint.
Our current representation of ring products uses Haskell tuples, which are lazy. It would probably be better to use a strict data structure.
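The minimal such replacement would be a strict pair; this is only a sketch of the obvious candidate:

```haskell
-- Strict in both components, unlike the lazy (,).
data Pair a b = Pair !a !b
  deriving (Show, Eq)
```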
The paper assumes wlog that
Once we get TravisCI set up, we can add Lol to the nightly build on Stackage: https://github.com/fpco/stackage/#get-your-package-included.
We currently need a bogus instance of `IntegralDomain` for product rings in order to be able to call `divG`. This instance fails at runtime when `oddRadical m` is not invertible mod `p`, for example if we try to use a "bad" plaintext modulus with SHE. The error in that case would be a runtime error about a failure of `modInv`, but we would instead like a more meaningful error saying something about the necessary conditions that are violated. We proposed a special class, like `CRTrans`, that just supplies `divModOddRad`.