
skip-thoughts's People

Contributors

cshallue, ryankiros


skip-thoughts's Issues

How to change the number of CPUs?

I am looking at the CPU utilization, and it looks like the code uses only 4 CPUs when encoding documents, even though I have 16. I can't find the variable that changes the default CPU count in the skip-thoughts code. Any ideas? Thanks!
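For what it's worth, Theano's CPU parallelism comes mostly from the underlying BLAS/OpenMP library rather than from anything in the skip-thoughts code itself, so (a guess at what is happening here) the thread count is usually controlled by an environment variable set before Theano is imported:

```python
import os

# The OpenMP/BLAS thread count must be set BEFORE theano (and therefore
# skipthoughts) is imported; "16" below is just an example value.
os.environ["OMP_NUM_THREADS"] = "16"

# import skipthoughts  # only import after setting the variable
print(os.environ["OMP_NUM_THREADS"])  # 16
```

If the variable has no effect, the installed BLAS may be single-threaded, in which case swapping in OpenBLAS/MKL is the usual route.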

Newbie question: How to set the paths?

I have my models in a folder called models, a subdirectory of the directory where skipthoughts.py is.
How do I need to change the path settings for my local setup?

The current output gives me:

IOError: [Errno 2] No such file or directory: '/u/rkiros/public_html/models/uni_skip.npz.pkl'
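The path in the error is hard-coded near the top of skipthoughts.py (the path_to_models and path_to_tables variables). A sketch of pointing them at a local models folder; "skip-thoughts-master" below is a placeholder for your actual checkout directory:

```python
import os

# skipthoughts.py hard-codes the author's paths, e.g.:
#   path_to_models = '/u/rkiros/public_html/models/'
# Point both path variables at your local models folder instead.
repo_dir = os.path.abspath("skip-thoughts-master")   # placeholder: your checkout
path_to_models = os.path.join(repo_dir, "models") + os.sep
path_to_tables = path_to_models                       # tables live in the same folder
print(path_to_models.endswith("models" + os.sep))     # True
```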

ValueError: Sequence is shorter then the required number of steps

Encoding a list of 20 sentences was successful, but now I am trying to encode a list of 283007 sentences, named data1, and I get a ValueError while running vec_d1 = encoder.encode(data1), as follows:

0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-27-f84532db8b01> in <module>()
----> 1 vec_d1 = encoder.encode(data1)

C:\Users\Anurag\Documents\skip-thoughts-master\skipthoughts.pyc in encode(self, X, use_norm, verbose, batch_size, use_eos)
    100       Encode sentences in the list X. Each entry will return a vector
    101       """
--> 102       return encode(self._model, X, use_norm, verbose, batch_size, use_eos)
    103 
    104 

C:\Users\Anurag\Documents\skip-thoughts-master\skipthoughts.pyc in encode(model, X, use_norm, verbose, batch_size, use_eos)
    153                 bff = model['f_w2v2'](bembedding, numpy.ones((len(caption)+1,len(caps)), dtype='float32'))
    154             else:
--> 155                 uff = model['f_w2v'](uembedding, numpy.ones((len(caption),len(caps)), dtype='float32'))
    156                 bff = model['f_w2v2'](bembedding, numpy.ones((len(caption),len(caps)), dtype='float32'))
    157             if use_norm:

C:\ProgramData\Anaconda2\lib\site-packages\theano\compile\function_module.pyc in __call__(self, *args, **kwargs)
    896                     node=self.fn.nodes[self.fn.position_of_error],
    897                     thunk=thunk,
--> 898                     storage_map=getattr(self.fn, 'storage_map', None))
    899             else:
    900                 # old-style linkers raise their own exceptions

C:\ProgramData\Anaconda2\lib\site-packages\theano\gof\link.pyc in raise_with_op(node, thunk, exc_info, storage_map)
    323         # extra long error message in that case.
    324         pass
--> 325     reraise(exc_type, exc_value, exc_trace)
    326 
    327 

C:\ProgramData\Anaconda2\lib\site-packages\theano\compile\function_module.pyc in __call__(self, *args, **kwargs)
    882         try:
    883             outputs =\
--> 884                 self.fn() if output_subset is None else\
    885                 self.fn(output_subset=output_subset)
    886         except Exception:

C:\ProgramData\Anaconda2\lib\site-packages\theano\scan_module\scan_op.pyc in rval(p, i, o, n, allow_gc)
    987         def rval(p=p, i=node_input_storage, o=node_output_storage, n=node,
    988                  allow_gc=allow_gc):
--> 989             r = p(n, [x[0] for x in i], o)
    990             for o in node.outputs:
    991                 compute_map[o][0] = True

C:\ProgramData\Anaconda2\lib\site-packages\theano\scan_module\scan_op.pyc in p(node, args, outs)
    976                                                 args,
    977                                                 outs,
--> 978                                                 self, node)
    979         except (ImportError, theano.gof.cmodule.MissingGXX):
    980             p = self.execute

theano/scan_module/scan_perform.pyx in theano.scan_module.scan_perform.perform (C:\Users\Anurag\AppData\Local\Theano\compiledir_Windows-10-10.0.10586-Intel64_Family_6_Model_142_Stepping_9_GenuineIntel-2.7.13-64\scan_perform\mod.cpp:2737)()

ValueError: ('Sequence is shorter then the required number of steps : (n_steps, seq, seq.shape):', 1, array([], shape=(0L, 4L, 1L), dtype=float32), (0L, 4L, 1L))
Apply node that caused the error: forall_inplace,cpu,encoder__layers}(Elemwise{Maximum}[(0, 0)].0, InplaceDimShuffle{0,1,x}.0, Elemwise{sub,no_inplace}.0, Subtensor{int64:int64:int8}.0, Subtensor{int64:int64:int8}.0, IncSubtensor{InplaceSet;:int64:}.0, encoder_U, encoder_Ux, ScalarFromTensor.0, ScalarFromTensor.0)
Toposort index: 50
Inputs types: [TensorType(int64, scalar), TensorType(float32, (False, False, True)), TensorType(float32, (False, False, True)), TensorType(float32, 3D), TensorType(float32, 3D), TensorType(float32, 3D), TensorType(float32, matrix), TensorType(float32, matrix), Scalar(int64), Scalar(int64)]
Inputs shapes: [(), (0L, 4L, 1L), (0L, 4L, 1L), (0L, 4L, 4800L), (0L, 4L, 2400L), (3L, 4L, 2400L), (2400L, 4800L), (2400L, 2400L), (), ()]
Inputs strides: [(), (16L, 4L, 4L), (16L, 4L, 4L), (76800L, 19200L, 4L), (38400L, 9600L, 4L), (38400L, 9600L, 4L), (19200L, 4L), (9600L, 4L), (), ()]
Inputs values: [array(1L, dtype=int64), array([], shape=(0L, 4L, 1L), dtype=float32), array([], shape=(0L, 4L, 1L), dtype=float32), array([], shape=(0L, 4L, 4800L), dtype=float32), array([], shape=(0L, 4L, 2400L), dtype=float32), 'not shown', 'not shown', 'not shown', 2400, 4800]
Inputs type_num: [9, 11, 11, 11, 11, 11, 11, 11, 9, 9]
Outputs clients: [[Subtensor{int64}(forall_inplace,cpu,encoder__layers}.0, ScalarFromTensor.0)]]

Debugprint of the apply node: 
forall_inplace,cpu,encoder__layers} [id A] <TensorType(float32, 3D)> ''   
 |Elemwise{Maximum}[(0, 0)] [id B] <TensorType(int64, scalar)> ''   
 | |Elemwise{Composite{minimum(((i0 + i1) - i2), i3)}} [id C] <TensorType(int64, scalar)> ''   
 | | |Elemwise{Composite{Switch(LT(i0, i1), (i0 + i2), i0)}} [id D] <TensorType(int64, scalar)> ''   
 | | | |Elemwise{Composite{Switch(LT((i0 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), i3), (i4 - i0), Switch(GE((i0 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), (i2 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2))), (i5 + i0), Switch(LE((i2 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), i3), (i5 + i0), i0)))}} [id E] <TensorType(int64, scalar)> ''   
 | | | | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | | | | | |embedding [id G] <TensorType(float32, 3D)>
 | | | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | | | |Elemwise{add,no_inplace} [id I] <TensorType(int64, scalar)> ''   
 | | | | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | | | | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | | | | |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 | | | | |TensorConstant{-2} [id K] <TensorType(int64, scalar)>
 | | | | |TensorConstant{2} [id L] <TensorType(int64, scalar)>
 | | | |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 | | | |Elemwise{add,no_inplace} [id I] <TensorType(int64, scalar)> ''   
 | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | |TensorConstant{1} [id M] <TensorType(int8, scalar)>
 | | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 |InplaceDimShuffle{0,1,x} [id N] <TensorType(float32, (False, False, True))> ''   
 | |Subtensor{int64:int64:int8} [id O] <TensorType(float32, matrix)> ''   
 |   |x_mask [id P] <TensorType(float32, matrix)>
 |   |ScalarFromTensor [id Q] <int64> ''   
 |   | |Elemwise{switch,no_inplace} [id R] <TensorType(int64, scalar)> ''   
 |   |   |Elemwise{le,no_inplace} [id S] <TensorType(bool, scalar)> ''   
 |   |   | |Elemwise{Composite{Switch(LT(i0, i1), i0, i1)}} [id T] <TensorType(int64, scalar)> ''   
 |   |   | | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 |   |   | | |Shape_i{0} [id U] <TensorType(int64, scalar)> ''   
 |   |   | |   |x_mask [id P] <TensorType(float32, matrix)>
 |   |   | |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 |   |   |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 |   |   |TensorConstant{0} [id V] <TensorType(int64, scalar)>
 |   |ScalarFromTensor [id W] <int64> ''   
 |   | |Elemwise{Composite{Switch(i0, i1, minimum(i2, i3))}}[(0, 2)] [id X] <TensorType(int64, scalar)> ''   
 |   |   |Elemwise{le,no_inplace} [id S] <TensorType(bool, scalar)> ''   
 |   |   |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 |   |   |Elemwise{Composite{Switch(LT(i0, i1), i0, i1)}} [id T] <TensorType(int64, scalar)> ''   
 |   |   |Shape_i{0} [id U] <TensorType(int64, scalar)> ''   
 |   |Constant{1} [id Y] <int8>
 |Elemwise{sub,no_inplace} [id Z] <TensorType(float32, (False, False, True))> ''   
 | |TensorConstant{(1L, 1L, 1L) of 1.0} [id BA] <TensorType(float32, (True, True, True))>
 | |InplaceDimShuffle{0,1,x} [id N] <TensorType(float32, (False, False, True))> ''   
 |Subtensor{int64:int64:int8} [id BB] <TensorType(float32, 3D)> ''   
 | |Elemwise{Add}[(0, 0)] [id BC] <TensorType(float32, 3D)> ''   
 | | |Reshape{3} [id BD] <TensorType(float32, 3D)> ''   
 | | | |Dot22 [id BE] <TensorType(float32, matrix)> ''   
 | | | | |Reshape{2} [id BF] <TensorType(float32, matrix)> ''   
 | | | | | |embedding [id G] <TensorType(float32, 3D)>
 | | | | | |MakeVector{dtype='int64'} [id BG] <TensorType(int64, vector)> ''   
 | | | | |   |Elemwise{Mul}[(0, 1)] [id BH] <TensorType(int64, scalar)> ''   
 | | | | |   | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | | | | |   | |Shape_i{1} [id BI] <TensorType(int64, scalar)> ''   
 | | | | |   |   |embedding [id G] <TensorType(float32, 3D)>
 | | | | |   |Shape_i{2} [id BJ] <TensorType(int64, scalar)> ''   
 | | | | |     |embedding [id G] <TensorType(float32, 3D)>
 | | | | |encoder_W [id BK] <TensorType(float32, matrix)>
 | | | |MakeVector{dtype='int64'} [id BL] <TensorType(int64, vector)> ''   
 | | |   |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | | |   |Shape_i{1} [id BI] <TensorType(int64, scalar)> ''   
 | | |   |Shape_i{1} [id BM] <TensorType(int64, scalar)> ''   
 | | |     |encoder_W [id BK] <TensorType(float32, matrix)>
 | | |InplaceDimShuffle{x,x,0} [id BN] <TensorType(float32, (True, True, False))> ''   
 | |   |encoder_b [id BO] <TensorType(float32, vector)>
 | |ScalarFromTensor [id BP] <int64> ''   
 | | |Elemwise{Composite{Switch(LE(i0, i1), i1, i2)}}[(0, 0)] [id BQ] <TensorType(int64, scalar)> ''   
 | |   |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | |   |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 | |   |TensorConstant{0} [id V] <TensorType(int64, scalar)>
 | |ScalarFromTensor [id BR] <int64> ''   
 | | |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | |Constant{1} [id Y] <int8>
 |Subtensor{int64:int64:int8} [id BS] <TensorType(float32, 3D)> ''   
 | |Elemwise{Add}[(0, 0)] [id BT] <TensorType(float32, 3D)> ''   
 | | |Reshape{3} [id BU] <TensorType(float32, 3D)> ''   
 | | | |Dot22 [id BV] <TensorType(float32, matrix)> ''   
 | | | | |Reshape{2} [id BF] <TensorType(float32, matrix)> ''   
 | | | | |encoder_Wx [id BW] <TensorType(float32, matrix)>
 | | | |MakeVector{dtype='int64'} [id BX] <TensorType(int64, vector)> ''   
 | | |   |Shape_i{0} [id F] <TensorType(int64, scalar)> ''   
 | | |   |Shape_i{1} [id BI] <TensorType(int64, scalar)> ''   
 | | |   |Shape_i{1} [id BY] <TensorType(int64, scalar)> ''   
 | | |     |encoder_Wx [id BW] <TensorType(float32, matrix)>
 | | |InplaceDimShuffle{x,x,0} [id BZ] <TensorType(float32, (True, True, False))> ''   
 | |   |encoder_bx [id CA] <TensorType(float32, vector)>
 | |ScalarFromTensor [id BP] <int64> ''   
 | |ScalarFromTensor [id BR] <int64> ''   
 | |Constant{1} [id Y] <int8>
 |IncSubtensor{InplaceSet;:int64:} [id CB] <TensorType(float32, 3D)> ''   
 | |AllocEmpty{dtype='float32'} [id CC] <TensorType(float32, 3D)> ''   
 | | |Elemwise{Composite{(Switch(LT(maximum(i0, i1), i2), (maximum(i0, i1) + i3), (maximum(i0, i1) - i2)) + i4)}}[(0, 0)] [id CD] <TensorType(int64, scalar)> ''   
 | | | |Elemwise{Composite{((maximum(i0, i1) - Switch(LT(i2, i3), (i2 + i4), i2)) + i1)}}[(0, 2)] [id CE] <TensorType(int64, scalar)> ''   
 | | | | |Elemwise{Composite{minimum(((i0 + i1) - i2), i3)}} [id C] <TensorType(int64, scalar)> ''   
 | | | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | | | |Elemwise{Composite{Switch(LT((i0 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), i3), (i4 - i0), Switch(GE((i0 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), (i2 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2))), (i5 + i0), Switch(LE((i2 - Composite{Switch(LT(i0, i1), i0, i1)}(i1, i2)), i3), (i5 + i0), i0)))}} [id E] <TensorType(int64, scalar)> ''   
 | | | | |TensorConstant{0} [id J] <TensorType(int8, scalar)>
 | | | | |Elemwise{add,no_inplace} [id I] <TensorType(int64, scalar)> ''   
 | | | |TensorConstant{2} [id L] <TensorType(int64, scalar)>
 | | | |TensorConstant{1} [id M] <TensorType(int8, scalar)>
 | | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | | |TensorConstant{1} [id H] <TensorType(int64, scalar)>
 | | |Shape_i{1} [id BI] <TensorType(int64, scalar)> ''   
 | | |Shape_i{1} [id CF] <TensorType(int64, scalar)> ''   
 | |   |encoder_Ux [id CG] <TensorType(float32, matrix)>
 | |Rebroadcast{0} [id CH] <TensorType(float32, 3D)> ''   
 | | |Alloc [id CI] <TensorType(float32, (True, False, False))> ''   
 | |   |TensorConstant{0.0} [id CJ] <TensorType(float32, scalar)>
 | |   |TensorConstant{1} [id M] <TensorType(int8, scalar)>
 | |   |Shape_i{1} [id BI] <TensorType(int64, scalar)> ''   
 | |   |Shape_i{1} [id CF] <TensorType(int64, scalar)> ''   
 | |Constant{1} [id CK] <int64>
 |encoder_U [id CL] <TensorType(float32, matrix)>
 |encoder_Ux [id CG] <TensorType(float32, matrix)>
 |ScalarFromTensor [id CM] <int64> ''   
 | |Shape_i{1} [id CF] <TensorType(int64, scalar)> ''   
 |ScalarFromTensor [id CN] <int64> ''   
   |Elemwise{Mul}[(0, 1)] [id CO] <TensorType(int64, scalar)> ''   
     |TensorConstant{2} [id L] <TensorType(int64, scalar)>
     |Shape_i{1} [id CF] <TensorType(int64, scalar)> ''   

Inner graphs of the scan ops:

forall_inplace,cpu,encoder__layers} [id A] <TensorType(float32, 3D)> ''   
 >Elemwise{Composite{((i0 * ((scalar_sigmoid(i1) * i2) + ((i3 - scalar_sigmoid(i1)) * tanh(((i4 * scalar_sigmoid(i5)) + i6))))) + (i7 * i2))}} [id CP] <TensorType(float32, matrix)> ''   
 > |<TensorType(float32, col)> [id CQ] <TensorType(float32, col)> -> [id N]
 > |Subtensor{::, int64:int64:} [id CR] <TensorType(float32, matrix)> ''   
 > | |Gemm{no_inplace} [id CS] <TensorType(float32, matrix)> ''   
 > | | |<TensorType(float32, matrix)> [id CT] <TensorType(float32, matrix)> -> [id BB]
 > | | |TensorConstant{1.0} [id CU] <TensorType(float32, scalar)>
 > | | |<TensorType(float32, matrix)> [id CV] <TensorType(float32, matrix)> -> [id CB]
 > | | |encoder_U_copy [id CW] <TensorType(float32, matrix)> -> [id CL]
 > | | |TensorConstant{1.0} [id CU] <TensorType(float32, scalar)>
 > | |<int64> [id CX] <int64> -> [id CM]
 > | |<int64> [id CY] <int64> -> [id CN]
 > |<TensorType(float32, matrix)> [id CV] <TensorType(float32, matrix)> -> [id CB]
 > |TensorConstant{(1L, 1L) of 1.0} [id CZ] <TensorType(float32, (True, True))>
 > |Dot22 [id DA] <TensorType(float32, matrix)> ''   
 > | |<TensorType(float32, matrix)> [id CV] <TensorType(float32, matrix)> -> [id CB]
 > | |encoder_Ux_copy [id DB] <TensorType(float32, matrix)> -> [id CG]
 > |Subtensor{::, int64:int64:} [id DC] <TensorType(float32, matrix)> ''   
 > | |Gemm{no_inplace} [id CS] <TensorType(float32, matrix)> ''   
 > | |Constant{0} [id DD] <int64>
 > | |<int64> [id CX] <int64> -> [id CM]
 > |<TensorType(float32, matrix)> [id DE] <TensorType(float32, matrix)> -> [id BS]
 > |<TensorType(float32, col)> [id DF] <TensorType(float32, col)> -> [id Z]

Storage map footprint:
 - encoder_U, Shared Input, Shape: (2400L, 4800L), ElemSize: 4 Byte(s), TotalSize: 46080000 Byte(s)
 - encoder_Ux, Shared Input, Shape: (2400L, 2400L), ElemSize: 4 Byte(s), TotalSize: 23040000 Byte(s)
 - encoder_W, Shared Input, Shape: (620L, 4800L), ElemSize: 4 Byte(s), TotalSize: 11904000 Byte(s)
 - encoder_Wx, Shared Input, Shape: (620L, 2400L), ElemSize: 4 Byte(s), TotalSize: 5952000 Byte(s)
 - IncSubtensor{InplaceSet;:int64:}.0, Shape: (3L, 4L, 2400L), ElemSize: 4 Byte(s), TotalSize: 115200 Byte(s)
 - encoder_b, Shared Input, Shape: (4800L,), ElemSize: 4 Byte(s), TotalSize: 19200 Byte(s)
 - encoder_bx, Shared Input, Shape: (2400L,), ElemSize: 4 Byte(s), TotalSize: 9600 Byte(s)
 - TensorConstant{2}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - Elemwise{Maximum}[(0, 0)].0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - TensorConstant{0}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - Constant{1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - TensorConstant{1}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - ScalarFromTensor.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - Elemwise{Composite{(((i0 - maximum(i1, i2)) - i3) + maximum(i4, i5))}}[(0, 0)].0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - ScalarFromTensor.0, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - TensorConstant{-2}, Shape: (), ElemSize: 8 Byte(s), TotalSize: 8.0 Byte(s)
 - TensorConstant{(1L, 1L, 1L) of 1.0}, Shape: (1L, 1L, 1L), ElemSize: 4 Byte(s), TotalSize: 4 Byte(s)
 - TensorConstant{0.0}, Shape: (), ElemSize: 4 Byte(s), TotalSize: 4.0 Byte(s)
 - TensorConstant{2}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
 - TensorConstant{1}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
 - TensorConstant{0}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
 - Constant{1}, Shape: (), ElemSize: 1 Byte(s), TotalSize: 1.0 Byte(s)
 - Elemwise{sub,no_inplace}.0, Shape: (0L, 4L, 1L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 - embedding, Input, Shape: (0L, 4L, 620L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 - InplaceDimShuffle{0,1,x}.0, Shape: (0L, 4L, 1L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 - Subtensor{int64:int64:int8}.0, Shape: (0L, 4L, 2400L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 - x_mask, Input, Shape: (0L, 4L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 - Subtensor{int64:int64:int8}.0, Shape: (0L, 4L, 4800L), ElemSize: 4 Byte(s), TotalSize: 0 Byte(s)
 TotalSize: 87120084.0 Byte(s) 0.081 GB
 TotalSize inputs: 87004852.0 Byte(s) 0.081 GB

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.

Could anyone please help me out?
Thanks.
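The zero-length shapes in the error (e.g. (0L, 4L, 1L)) suggest a batch whose sentences tokenized to zero words, which is easy to hit in a 283007-line corpus with blank or punctuation-only lines. This is only a guess, but filtering such entries before encoding is a cheap first test:

```python
# Toy stand-in for the real data1 list; the real corpus may contain
# blank or whitespace-only lines that tokenize to nothing.
data1 = ["A real sentence.", "", "   ", "Another one."]
cleaned = [s for s in data1 if s.strip()]
print(len(data1) - len(cleaned))  # 2 empty entries dropped
```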

Using skip-thought representations for computing sentence similarity

Can we use the generated skip-thought vectors as sentence representations for further tasks, the way a pre-trained AlexNet model is used in vision to obtain feature representations?
I encoded sentences using the provided files (utable.npy, btable.npy, etc.) and computed the cosine similarity of the resulting vectors, but the values seem high.
Is my understanding of using the given model as a pre-trained network for representations wrong?

The sentences for example are

  1. Two women and as many men managed to escape, in an injured condition, from the shrine.
  2. Both had done plenty of damage to the Federer brand already during the sun-shot afternoon, threatening to spoil that much-anticipated potential final showdown between the Swiss icon and his lifelong rival, Rafael Nadal.

The cosine similarity of the above sentences comes up to be 0.79259253.
Isn't the cosine similarity expected to be lower?
Am I making a mistake?
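For context: cosine similarities between skip-thought vectors are often reported to sit in a fairly high, narrow band, so the absolute value 0.79 is less informative than the ranking across many sentence pairs. The computation itself is straightforward with NumPy (the toy vectors below stand in for encoder output):

```python
import numpy as np

def cosine(u, v):
    # cosine similarity between two 1-D vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 0.0, 1.0])  # stand-ins for skipthoughts.encode(...) rows
b = np.array([1.0, 1.0, 0.0])
print(round(cosine(a, b), 6))  # 0.5 for these toy vectors
```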

Github code (decoder GRU layer) different from paper

Hi,

I'm stepping through the code and noticed that the GRU layer of the decoder doesn't take into account the context provided by the encoder.
As stated in the paper (eqs. 5, 6 and 7), an extra parameter is added to incorporate the context at every time step during decoding.

I think the code is missing the following function (taken from the neural machine translation example):
https://github.com/kyunghyuncho/dl4mt-material/blob/master/session1/nmt.py#L352

I can add it myself, but I'm afraid I will miss something and therefore won't be able to reproduce the same results. Especially given that it is supposed to train for 2 weeks...

Best,

Fréderic

Key Error

The size of my vocab is 175, so why is it looking for key 176?
Error:
model = tools.load_model(embed_map)
File "/home/raksha/prog/skip-thoughts-master/training/tools.py", line 74, in load_model
table = lookup_table(options, embed_map, worddict, word_idict, f_emb)
File "/home/raksha/prog/skip-thoughts-master/training/tools.py", line 163, in lookup_table
wordvecs = get_embeddings(options, word_idict, f_emb)
File "/home/raksha/prog/skip-thoughts-master/training/tools.py", line 184, in get_embeddings
d[word_idict[i]] = ff
KeyError: 176
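One likely culprit (an inference from the numbers, not a confirmed diagnosis): in training/vocab.py indices 0 and 1 are reserved for '<eos>' and 'UNK', so real words start at index 2 and the largest valid index is two more than the raw word count. A KeyError just past your vocab size usually means n_words in the saved model options disagrees with the dictionary by this reserved-token offset. A toy consistency check:

```python
# 175 real words plus the two reserved tokens occupy indices 0..176,
# so any loop over range(n_words) with n_words > 177 will raise KeyError.
vocab = ["w%d" % i for i in range(175)]
word_idict = dict(enumerate(["<eos>", "UNK"] + vocab))
print(len(word_idict), max(word_idict))  # 177 176
```

Checking that the n_words option equals len(worddict) + 2 before calling load_model should confirm or rule this out.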

Download Error: Connection reset by peer

I followed the guide in README.md to download the dictionary.txt file. It works well at the beginning, but when the progress bar reaches 31% it throws an error: connection reset by peer. I can download the other six files, such as uni_skip.npz and bi_skip.npz.pkl. Could you help me download this file or provide a working download link? Thanks in advance!
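One workaround, assuming the server honours HTTP Range requests (not guaranteed), is a resumable download: re-request only the bytes after the part already on disk instead of restarting from zero. A sketch of just the header computation; the filename is illustrative:

```python
import os

def range_header(partial_path):
    # request only the bytes we do not already have on disk
    start = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": "bytes=%d-" % start}

print(range_header("dictionary.txt.partial"))  # {'Range': 'bytes=0-'} with no partial file yet
```

Passing this header to urllib/requests in a retry loop, appending to the partial file each time, survives mid-transfer resets.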

pygpu.gpuarray.GpuArrayException: out of memory on small corpus

Dear experts,

I have encountered a memory issue while attempting to train a 1 million sentence corpus using the skip thought model. In the paper, the model was trained on well in excess of 1 million sentences. From Table 1 of the paper, https://arxiv.org/pdf/1506.06726.pdf, it looks like the training set consisted of 74 million sentences.

If this is really a GPU memory limitation, how was the model in the paper trained, and on what sort of hardware specifications?

I am currently working off an AWS instance with a single Tesla K80 GPU with 12 GB of memory. The memory error is displayed below.

Thank You,

Kuhan

Traceback (most recent call last):
File "training_notes.py", line 25, in
'adam', 64, model_name, vocab_name, 10, False)
File "/home/ec2-user/py27_version/skip-thoughts/training/train.py", line 119, in trainer
f_grad_shared, f_update = eval(optimizer)(lr, tparams, grads, inps, cost)
File "/home/ec2-user/py27_version/skip-thoughts/training/optim.py", line 31, in adam
v = theano.shared(p.get_value() * 0.)
File "/home/ec2-user/anaconda3/envs/py27/lib/python2.7/site-packages/theano/compile/sharedvalue.py", line 268, in shared
allow_downcast=allow_downcast, **kwargs)
File "/home/ec2-user/anaconda3/envs/py27/lib/python2.7/site-packages/theano/gpuarray/type.py", line 669, in gpuarray_shared_constructor
context=type.context)
File "pygpu/gpuarray.pyx", line 915, in pygpu.gpuarray.array (pygpu/gpuarray.c:12223)
File "pygpu/gpuarray.pyx", line 970, in pygpu.gpuarray.carray (pygpu/gpuarray.c:13105)
File "pygpu/gpuarray.pyx", line 664, in pygpu.gpuarray.pygpu_fromhostdata (pygpu/gpuarray.c:9847)
File "pygpu/gpuarray.pyx", line 301, in pygpu.gpuarray.array_copy_from_host (pygpu/gpuarray.c:5813)
pygpu.gpuarray.GpuArrayException: out of memory
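One place GPU memory goes that is easy to overlook: the Adam implementation in training/optim.py allocates extra float32 copies of every parameter (the line in the traceback, theano.shared(p.get_value() * 0.), creates one of them), so optimizer state roughly triples parameter memory. A back-of-envelope helper, not a diagnosis; the 100M figure below is an arbitrary example, not this model's size:

```python
def optimizer_footprint_gb(n_params, extra_copies=2, bytes_per=4):
    # parameters themselves plus Adam's first/second-moment slots, all float32
    return n_params * bytes_per * (1 + extra_copies) / 1e9

print(round(optimizer_footprint_gb(100000000), 1))  # 1.2 GB for a 100M-parameter model
```

If the budget checks out, the other usual suspects are Theano's gpuarray preallocation setting and the vocabulary size inflating the embedding and output layers.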

Errors when running the Semantic-Relatedness task

When I run the Semantic-Relatedness task
import eval_sick
eval_sick.evaluate(model, evaltest=True)

Errors:
In [7]: import eval_sick
Using Theano backend.

In [8]: eval_sick.evaluate(model, evaltest=True)
Preparing data...
Computing training skipthoughts...
Computing development skipthoughts...
Computing feature combinations...
Encoding labels...

Compiling model...

AttributeError Traceback (most recent call last)
in ()
----> 1 eval_sick.evaluate(model, evaltest=True)

/data/skip-thoughts/eval_sick.pyc in evaluate(model, seed, evaltest)
40
41 print 'Compiling model...'
---> 42 lrmodel = prepare_model(ninputs=trainF.shape[1])
43
44 print 'Training...'

/data/skip-thoughts/eval_sick.pyc in prepare_model(ninputs, nclass)
73 lrmodel.add(Dense(ninputs, nclass))
74 lrmodel.add(Activation('softmax'))
---> 75 lrmodel.compile(loss='categorical_crossentropy', optimizer='adam')
76 return lrmodel
77

/usr/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/models.pyc in compile(self, optimizer, loss, class_mode)
433 self.X_test = self.get_input(train=False)
434
--> 435 self.y_train = self.get_output(train=True)
436 self.y_test = self.get_output(train=False)
437

/usr/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/containers.pyc in get_output(self, train)
126
127 def get_output(self, train=False):
--> 128 return self.layers[-1].get_output(train)
129
130 def set_input(self):

/usr/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
669
670 def get_output(self, train=False):
--> 671 X = self.get_input(train)
672 return self.activation(X)
673

/usr/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_input(self, train)
171 if previous_layer_id in self.layer_cache:
172 return self.layer_cache[previous_layer_id]
--> 173 previous_output = self.previous.get_output(train=train)
174 if hasattr(self, 'layer_cache') and self.cache_enabled:
175 previous_layer_id = '%s_%s' % (id(self.previous), train)

/usr/lib/python2.7/site-packages/Keras-0.3.1-py2.7.egg/keras/layers/core.pyc in get_output(self, train)
961 def get_output(self, train=False):
962 X = self.get_input(train)
--> 963 output = self.activation(K.dot(X, self.W) + self.b)
964 return output
965

AttributeError: 'Dense' object has no attribute 'W'
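eval_sick.py targets an early Keras API (Dense(input_dim, output_dim) with weights exposed as layer.W). Errors like this one are commonly reported when the installed Keras release's layer API differs from the one the script was written against, so failing fast on the version is cheap insurance. This is a hedged sketch; "0" is an assumption about the intended major version, not something the repo pins explicitly:

```python
def keras_major(version):
    # extract the major version component from a version string
    return version.split(".")[0]

print(keras_major("0.3.1"))  # '0'
```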

DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.

We have changed the model-loading line to gensim.models.KeyedVectors.load_word2vec_format, but we are still getting the following error:

import tools
embed_map = tools.load_googlenews_vectors()
model = tools.load_model(embed_map)

DeprecationWarning Traceback (most recent call last)
in ()
1 import tools
----> 2 embed_map = tools.load_googlenews_vectors()
3 model = tools.load_model(embed_map)

/home/mpl2/daksh/tarun/skip-thoughts/training/tools.pyc in load_googlenews_vectors()
154
155 embed_map = gensim.models.KeyedVectors.load_word2vec_format(path_to_word2vec, binary=True)
--> 156 return embed_map
157
158 def lookup_table(options, embed_map, worddict, word_idict, f_emb, use_norm=False):

/usr/local/lib/python2.7/dist-packages/gensim/models/word2vec.pyc in load_word2vec_format(cls, fname, fvocab, binary, encoding, unicode_errors, limit, datatype)
1446 limit=None, datatype=REAL):
1447 """Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead."""
-> 1448 raise DeprecationWarning("Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.")
1449
1450 def save_word2vec_format(self, fname, fvocab=None, binary=False):

DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.
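Note that the traceback runs tools.pyc (compiled bytecode) yet still reaches gensim's deprecated models.word2vec.load_word2vec_format, even though the displayed source shows the edited line. A plausible explanation is a stale tools.pyc from before the edit shadowing the updated tools.py; listing leftover bytecode next to the module shows candidates to delete (hedged sketch):

```python
import os

def stale_bytecode(directory):
    # compiled files that may predate the source edit; delete these and re-run
    return sorted(f for f in os.listdir(directory) if f.endswith(".pyc"))

leftovers = stale_bytecode(".")
print(type(leftovers).__name__)  # list
```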

TypeError: 'cannot deepcopy this pattern object'

A TypeError occurs at the following line while running the evaluation code eval_sick.py:

bestlrmodel = copy.deepcopy(lrmodel)
TypeError: 'cannot deepcopy this pattern object'

Any quick fix?
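A quick fix used in several forks: snapshot the weights instead of deep-copying the compiled model object, since deepcopy chokes on the compiled internals. The class below is a toy stand-in for the Keras lrmodel, just to show the pattern:

```python
import copy

class TinyModel(object):                  # stand-in for the Keras lrmodel
    def __init__(self):
        self.weights = [1.0, 2.0]
    def get_weights(self):                # Keras models expose these two methods
        return copy.deepcopy(self.weights)
    def set_weights(self, w):
        self.weights = copy.deepcopy(w)

lrmodel = TinyModel()
best_weights = lrmodel.get_weights()      # instead of bestlrmodel = copy.deepcopy(lrmodel)
lrmodel.weights[0] = 99.0                 # further training changes the live model
lrmodel.set_weights(best_weights)         # restore the best snapshot afterwards
print(lrmodel.weights)  # [1.0, 2.0]
```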

import error vocab

import vocab

Traceback (most recent call last):
File "<pyshell#0>", line 1, in
import vocab
ImportError: No module named vocab
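vocab.py lives in the training/ subdirectory, so a top-level import vocab fails unless Python is run from that directory or it is added to sys.path. A sketch, where "skip-thoughts-master" is a placeholder for your checkout path:

```python
import os
import sys

# placeholder path: adjust to wherever the repo is checked out
training_dir = os.path.abspath(os.path.join("skip-thoughts-master", "training"))
sys.path.insert(0, training_dir)
# import vocab  # should now resolve, assuming the checkout path is correct
print(sys.path[0].endswith("training"))  # True
```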

floating point error in encoding

I have successfully trained skip-thoughts from scratch on my own data. Now, given new data X, I want to get its vector representation, so I do the following:

vectors = skipthoughts.encode(model, X)

But this gives me the following error:

floating point exception (core dumped)

I have also set the THEANO_FLAGS floatX to float32.
Any help will be appreciated.

Failed to interpret file as a pickle

Hi. I am new to this and was trying to run the code, but I am getting an error:

IOError: Failed to interpret file '/Users/anubhav/Studies/penseur-master/data/uni_skip.npz' as a pickle

How can I solve it?
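An .npz file is a zip archive underneath, and numpy.load typically raises "Failed to interpret file ... as a pickle" when the file is truncated or corrupt, with an incomplete download being the usual suspect (also check that the matching .npz.pkl file sits alongside it). A quick integrity check:

```python
import os
import zipfile

def looks_like_npz(path):
    # a complete .npz file is a valid zip archive
    return os.path.exists(path) and zipfile.is_zipfile(path)

print(looks_like_npz("uni_skip.npz"))  # should be True for a complete download
```

If this returns False, re-downloading the file and comparing its size against the server's Content-Length is the next step.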

Errors when running the Semantic-Relatedness task

When I run the Semantic-Relatedness task
import eval_sick
eval_sick.evaluate(model, evaltest=True)

Errors:
Traceback (most recent call last):
File "/home/mmc/Downloads/skip-thoughts-master/Semantic-Relatedness.py", line 8, in
eval_sick.evaluate(model, evaltest=False)
File "/home/mmc/Downloads/skip-thoughts-master/eval_sick.py", line 42, in evaluate
lrmodel = prepare_model(ninputs=trainF.shape[1])
File "/home/mmc/Downloads/skip-thoughts-master/eval_sick.py", line 75, in prepare_model
lrmodel.compile(loss='categorical_crossentropy', optimizer='adam')
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 350, in compile
self.X_train = self.get_input(train=True)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/containers.py", line 64, in get_input
return self.layers[0].get_input(train)
File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 94, in get_input
and is not an input layer.')
Exception: Layer is not connected and is not an input layer.

decoding problem

Hi. Thanks very much for your skip-thought vectors.

I can use the encode() function with the downloaded data (utable.npy, btable.npy, uni_skip.npz, etc.).

Now I want to decode my encoded sentences with the same data, but I am having trouble with decoding.

Can I get the model data (the path_to_model and path_to_dictionary data in skip-thoughts-master/decoding/tools.py) the same way as in the encoding process?

Errors when running the training

When I run the training
Step 4: Loading saved models

Errors:
In [10]: model = tools.load_model(embed_map)
Loading dictionary...
Creating inverted dictionary...
Loading model options...
Loading model parameters...
Compiling encoder...
/usr/lib/python2.7/site-packages/theano/scan_module/scan.py:1019: Warning: In the strict mode, all neccessary shared variables must be passed as a part of non_sequences
'must be passed as a part of non_sequences', Warning)

Creating word lookup tables...

KeyError Traceback (most recent call last)
in ()
----> 1 model = tools.load_model(embed_map)

/data/skip-thoughts/training/tools.pyc in load_model(embed_map)
75 # Lookup table using vocab expansion trick
76 print 'Creating word lookup tables...'
---> 77 table = lookup_table(options, embed_map, worddict, word_idict, f_emb)
78
79 # Store everything we need in a dictionary

/data/skip-thoughts/training/tools.pyc in lookup_table(options, embed_map, worddict, word_idict, f_emb, use_norm)
164 Create a lookup table from linear mapping of word2vec into RNN word space
165 """
--> 166 wordvecs = get_embeddings(options, word_idict, f_emb)
167 clf = train_regressor(options, embed_map, wordvecs, worddict)
168 table = apply_regressor(clf, embed_map, use_norm=use_norm)

/data/skip-thoughts/training/tools.pyc in get_embeddings(options, word_idict, f_emb, use_norm)
185 if use_norm:
186 ff /= norm(ff)
--> 187 d[word_idict[i]] = ff
188 return d
189

KeyError: 7818

What kind of preprocessing is required for the sentences?

If I have a corpus of documents, each with multiple sentences, how should I preprocess these sentences so that when they are tokenized they yield useful tokens?

For example, should the words be lower-cased, stemmed, punctuation, digits, and stopwords removed? Or is none of this necessary?
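For what it's worth, the examples elsewhere in this thread preprocess with nltk.word_tokenize plus lower(), and stemming or stopword removal is generally unnecessary since out-of-vocabulary words are handled by vocabulary expansion on surface forms. A minimal stdlib sketch of that normalization (a rough, hypothetical stand-in for nltk.word_tokenize, not the repo's official recipe):

```python
import re

def preprocess(sentence):
    """Lowercase and split punctuation into separate tokens: a rough
    stand-in for nltk.word_tokenize(sentence.lower()) as used in other
    examples in this thread. No stemming or stopword removal, since the
    encoder's vocabulary expansion works on surface word forms."""
    tokens = re.findall(r"[a-z0-9]+|[^\sa-z0-9]", sentence.lower())
    return " ".join(tokens)
```

For example, `preprocess("Don't remove stopwords!")` yields `"don ' t remove stopwords !"` (note that nltk would instead produce `do n't` for the contraction).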

Document borders during training?

Hi Ryan,

Your paper was a very nice read. One question I'm unclear on after reading the paper and skimming the code: did you respect document boundaries in your training objective? It looks like you just concatenated all your documents together, meaning the last sentence of one document will need to predict the first sentence of the next. While you use very large documents (books), I could see this confounding things using resources with many smaller documents, like Wikipedia.

Just asking for clarification.

Thanks,
Stephen

Typeerror in decoding

Hi,
I followed the decoding tutorial and prepared a list of sentences, X. I was trying to reconstruct a sentence, but this line in my code, train.trainer(X, X, skmodel), gave me an error:
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

My usage:

THEANO_FLAGS='floatX=float32'
python my_code.py

Thank you very much!

How to train the Bi-skip model mentioned in the paper?

Hello Ryan,

Thanks a lot for the excellent code. I was trying to train the Bi-skip model mentioned in your paper on my dataset. However only the uni-skip model seems to be present in the training code. Is that correct? Do you plan to add it?

Thanks in advance!
Bhuwan

TypeError during training

Hi Ryan,

I tried to start training a new model from scratch following your instructions, but soon after I execute the command:
train.trainer(X)
I get the following error:
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Don't have any ideas what I am doing wrong. Do you have any suggestions?

Artem

Ranking loss bug?

Hi Ryan,

I ran your code for image-sentence ranking.
It seems the embedded image vectors are not normalized before computing the loss.
Is that correct?

Mayu

What are 'utable' and 'btable'?

A model has been trained according to the instructions here. I can load the model using the following commands:

import tools
embed_map = tools.load_googlenews_vectors()
model = tools.load_model(embed_map)

After that, I want to run an experiment (for example: Semantic-Relatedness):

import eval_sick
eval_sick.evaluate(model, evaltest=True)

Here is the output:

/Users/AmirHJ/projects/skip-thoughts/skipthoughts.pyc in encode(model, X, use_norm, verbose, batch_size, use_eos)
     97     # word dictionary and init
     98     d = defaultdict(lambda : 0)
---> 99     for w in model['utable'].keys():
    100         d[w] = 1
    101     ufeatures = numpy.zeros((len(X), model['uoptions']['dim']), dtype='float32')

KeyError: 'utable'

Would you please help me to run experiments on the trained model?

Exception:

I get the following error when importing skipthoughts in IPython:

This is after I fixed the missing print parentheses problem.
Any help?

Lots of clang errors...

Exception: Compilation failed (return status=1): clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-tbm'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-fma4'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-prfchw'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-pcommit'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-clwb'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-pku'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-smap'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-rdseed'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-sse4a'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-clflushopt'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. 
clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-mwaitx'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-prefetchwt1'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-sgx'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-sha'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: unknown argument: '-target-feature'. clang: error: no such file or directory: '+sse2'. clang: error: no such file or directory: '+cx16'. clang: error: no such file or directory: '+bmi2'. clang: error: language not recognized: 'savec'. clang: error: no such file or directory: '+fsgsbase'. clang: error: no such file or directory: '+popcnt'. clang: error: no such file or directory: '+aes'. clang: error: language not recognized: 'saves'. clang: error: no such file or directory: '+mmx'. clang: error: language not recognized: 'op'. clang: error: no such file or directory: '+hle'. 
clang: error: no such file or directory: '+xsave'. clang: error: no such file or directory: '+invpcid'. clang: error: no such file or directory: '+avx'. clang: error: no such file or directory: '+rtm'. clang: error: no such file or directory: '+fma'. clang: error: no such file or directory: '+bmi'. clang: error: no such file or directory: '+rdrnd'. clang: error: no such file or directory: '+sse4.1'. clang: error: no such file or directory: '+sse4.2'. clang: error: no such file or directory: '+avx2'. clang: error: no such file or directory: '+sse'. clang: error: no such file or directory: '+lzcnt'. clang: error: no such file or directory: '+pclmul'. clang: error: no such file or directory: '+f16c'. clang: error: no such file or directory: '+ssse3'. clang: error: no such file or directory: '+cmov'. clang: error: no such file or directory: '+movbe'. clang: error: no such file or directory: '+xsaveopt'. clang: error: no such file or directory: '+sse3'. 

train.trainer returns an error

TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

Facing problems implementing decoding with skip-thought vectors

Hello,
I am having problems implementing decoding with skip-thought vectors.
I get the following error while running decoder.py:

Loading model parameters...
Compiling encoders...
Loading tables...
Packing up...
10
{'grad_clip': 5.0, 'dim': 1600, 'optimizer': 'adam', 'dim_word': 620, 'dictionary': '/home/utsav/skip-thoughts/dict', 'decay_c': 0.0, 'reload_': False, 'n_words': 18, 'batch_size': 1, 'encoder': 'gru', 'maxlen_w': 10, 'saveto': '/home/utsav/skip-thoughts/saved_models/sample.npz', 'embeddings': None, 'decoder': 'gru', 'sampleFreq': 1, 'max_epochs': 20, 'dispFreq': 1, 'dimctx': 4800, 'doutput': False, 'saveFreq': 1}
Loading dictionary...
Building model
Building sampler
Building f_init... Building f_next.. Done
Building f_log_probs... Done
Building f_cost... Done
Done
Building f_grad... Building optimizers... Optimization
Epoch  0
OK
Traceback (most recent call last):
  File "decoder.py", line 21, in <module>
    train.trainer(X,X,encoder)
  File "/home/utsav/skip-thoughts/decoding/train.py", line 180, in trainer
    x, mask, ctx = homogeneous_data.prepare_data(x, c, worddict, stmodel, maxlen=maxlen_w, n_words=n_words)
  File "/home/utsav/skip-thoughts/decoding/homogeneous_data.py", line 109, in prepare_data
    feat_list = skipthoughts.encode(model, feat_list, use_eos=False, verbose=False)
  File "/home/utsav/skip-thoughts/skipthoughts.py", line 115, in encode
    for w in model['utable'].keys():
TypeError: 'Encoder' object has no attribute '__getitem__'


Decoder.py :

import sys
sys.path.append('/home/utsav/skip-thoughts')
import skipthoughts
import nltk
import vocab
import train

model = skipthoughts.load_model()
encoder = skipthoughts.Encoder(model)

f = open('../sample.txt','r')
sentences = [nltk.word_tokenize(r.lower()) for r in f.readlines()]
X = [' '.join(s) for s in sentences]

worddict, wordcount = vocab.build_dictionary(X)

print len(worddict)

vocab.save_dictionary(worddict, wordcount, '/home/utsav/skip-thoughts/dict')

train.trainer(X,X,encoder)

Train.py :

# main trainer
def trainer(X, C, stmodel,
            dimctx=4800, #vector dimensionality
            dim_word=620, # word vector dimensionality
            dim=1600, # the number of GRU units
            encoder='gru',
            decoder='gru',
            doutput=False,
            max_epochs=20,
            dispFreq=1,
            decay_c=0.,
            grad_clip=5.,
            n_words=10,
            maxlen_w=10,
            optimizer='adam',
            batch_size = 1,
            saveto='/home/utsav/skip-thoughts/saved_models/sample.npz',
            dictionary='/home/utsav/skip-thoughts/dict',
            embeddings=None,
            saveFreq=1,
            sampleFreq=1,
            reload_=False):

Here is my training text:

He is a good person.
He is a good person.
I am a serious guy.
I am a serious guy.
He is a good person.
He is a good person.
I am a serious guy.
I am a serious guy.

example usage of `nn_words()`

Do you have an example usage of nn_words, i.e., as shown in "Table 3: Nearest neighbours of words after vocabulary expansion" in the paper?

It's not clear how to prepare the parameters to pass in.

Many thanks,

Issue with turning off verbose with skip thought encoder

Hi there. Thank you so much for the amazing models. When I try to run your neural storyteller example with your pretrained models, and call the function generate.load_all(), everything loads correctly. But at the end, when packing everything up, return z returns a printout of all the vectors in my command line. I've tried to set verbose=False in the skipthoughts.py encode function, but it doesn't seem to turn it off. Could you point me to the correct line that is causing this?

I apologize if this is a silly oversight on my part.

Thank you so much.

Are sentence vectors accessible from the trained model?

Dear experts,

I recently stumbled upon the Skip-Thought paper and found it extremely interesting. I have managed to train a small model using some 2.7 million sentences for testing purposes. My primary interest is in understanding sentence-to-sentence similarity by comparing the distance between the vector embeddings.

My question is, after training, can the vector representation of the training sentences be accessed from the model? I know I can encode the sentences afterward by using tools.encode(), but for a large number of sentences this will take quite some time, time that is already in addition to the training itself.

Naively, I thought that analogous to doc2vec models, there would be a dictionary of sentences (like a dictionary of paragraphs/documents), along with their vector embeddings.

Is this the case? Perhaps I misunderstood sections of the paper. I can certainly find the token level embeddings in the OrderedDict called model['table'].

Thank You and keep up the good work!

Kuhan
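As the question anticipates, the trained model retains only word-level tables (model['table']); sentence embeddings are not stored during training and have to be computed once, then cached yourself. A sketch of such a cache, where the tools.encode call is commented out because it assumes a loaded model (path and variable names are illustrative):

```python
import pickle

def cache_sentence_vectors(sentences, vectors, path):
    """Build and persist the sentence -> embedding lookup that
    doc2vec-style models provide out of the box. `vectors` would come
    from a single call such as tools.encode(model, sentences), done
    once and then reused from disk."""
    table = dict(zip(sentences, (tuple(map(float, v)) for v in vectors)))
    with open(path, "wb") as f:
        pickle.dump(table, f)
    return table

# vectors = tools.encode(model, sentences)   # one-off encoding pass
# table = cache_sentence_vectors(sentences, vectors, "sent_vecs.pkl")
```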

training decoder

First of all, this project is awesome!
I succeeded in getting the encoder to work by following directions on the home readme. I am stuck at the compilation stage of the decoder right now. My code is failing at line 163 of train.py with the following Theano error:

TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{add,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.'

I'm running on CPU with Mac OS X and I tried both Theano 0.7.0 and the bleeding-edge git versions (both produced the same error). My C and X vectors are identical lists of one-sentence strings (though they should not be responsible for this issue).

I welcome any advice!
Sam

Add a license file

Currently you have to scroll down to the bottom of README.md to determine how the code in this repository is licensed.

Please add a separate LICENSE file to make it clear.

Faster training

You mentioned in the README that you're working on faster training of skip-thought vectors. Could you shed some light on that?

Can I get a .pickle for Chinese?

Hi,
in NLTK I can find the following files:

czech.pickle*
danish.pickle*
dutch.pickle*
english.pickle*
estonian.pickle*
finnish.pickle*
french.pickle*
german.pickle*
greek.pickle*
italian.pickle*
norwegian.pickle*
portuguese.pickle*
README*
slovene.pickle*
spanish.pickle*
swedish.pickle*
turkish.pickle*

Can I get the pickle file for Chinese?

best,
Lan

Errors when running with an empty string

I have a bunch of text files where some of the lines are empty. When I try to extract skip-thought vectors, the process shuts down without extracting features for the whole file.
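One plausible cause (a guess, not a confirmed diagnosis): an empty sentence has no tokens, so its encoding degenerates, for example to a zero vector whose normalization divides by zero. A simple guard is to drop blank lines before calling the encoder:

```python
def drop_empty(lines):
    """Remove blank or whitespace-only sentences before encoding;
    an empty sentence can produce a degenerate vector and crash the
    normalization step (division by a zero norm)."""
    return [s.strip() for s in lines if s.strip()]

# X = drop_empty(open("some_file.txt"))        # hypothetical file name
# vectors = skipthoughts.encode(model, X)
```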

There is a NaN when calculating uff[j]

I'm using this code to calculate semantic relatedness. When I run this code:
import eval_sick
eval_sick.evaluate(model, evaltest=True)
I get the following error:

File "/home/weixinru/skip-thoughts-master/skipthoughts.py", line 145, in encode
bff[j] /= norm(bff[j])
File "/usr/lib64/python2.7/site-packages/scipy/linalg/misc.py", line 129, in norm
a = np.asarray_chkfinite(a)
File "/usr/lib/python2.7/site-packages/numpy/lib/function_base.py", line 1033, in asarray_chkfinite
"array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

I also found that this code runs well when the input file is small; if the file has more than about 200 rows, the error occurs.
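The traceback points at the `bff[j] /= norm(bff[j])` step: if a sentence's feature vector has a zero or non-finite norm (for example from an empty or all-out-of-vocabulary line somewhere in the larger file), the division produces NaNs. A defensive version of that normalization (a sketch, not the repo's actual code):

```python
import math

def safe_normalize(vec, eps=1e-8):
    """L2-normalize, but leave the vector unchanged when its norm is
    zero or non-finite; mirrors the `ff /= norm(ff)` pattern in
    skipthoughts.encode without ever producing NaNs or infs."""
    n = math.sqrt(sum(x * x for x in vec))
    if not math.isfinite(n) or n < eps:
        return list(vec)
    return [x / n for x in vec]
```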

KeyError in decoding

Upon running the load_model function in the decoding package with both the uni_skip and bi_skip models provided, I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "tools.py", line 47, in load_model
params = init_params(options)
File "model.py", line 27, in init_params
params = get_layer('ff')[0](options, params, prefix='ff_state', nin=options['dimctx'], nout=options['dim'])
KeyError: 'dimctx'

Even after I attempt to hard-code the value of dimctx, the gru_cond_simple layer throws a similar error.

Documentation Addition

I had to add "floatX=float32" to the THEANO_FLAGS in order to get the decoder trainer to work. It took me a minute to figure this out, and I've used Theano before; I expect this could be a time-consuming bug to track down for a non-Theano user. It might be a good addition to the docs.
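An alternative to setting the flag on the command line is exporting it from inside the script, as long as this runs before theano is imported anywhere in the process (Theano reads THEANO_FLAGS once at import time). A sketch:

```python
import os

# Must execute before the first `import theano` anywhere in the process;
# Theano reads THEANO_FLAGS exactly once, at import time.
flags = os.environ.get("THEANO_FLAGS", "")
if "floatX" not in flags:
    os.environ["THEANO_FLAGS"] = ("floatX=float32," + flags).rstrip(",")

# import theano  # only after the lines above
```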

building a custom decoder

I'm trying to train a custom decoder. I am using the BookCorpus data with science fiction novels. According to the docs:
https://github.com/ryankiros/skip-thoughts/blob/master/decoding/README.md

"We assume that you have two lists of strings available: X which are the target sentences and C which are the source sentences. "

What exactly are X and C for these books? I see the BookCorpus data are just text files; I'm not sure what kind of processing I'm supposed to do to get them into the right format for training.
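One plausible reading of the README (an assumption, not the authors' documented recipe): split each book into sentences, then let every sentence be a target in X conditioned on a source in C, either the sentence itself (a reconstruction decoder, as in the train.trainer(X, X, ...) calls elsewhere in this thread) or its preceding sentence:

```python
import re

def build_pairs(book_text, reconstruct=False):
    """Turn raw book text into (X, C) lists for decoding/train.py.
    A hypothetical preparation step: the regex sentence splitter is a
    crude stand-in for a real tokenizer such as nltk's punkt."""
    sents = [s.strip() for s in re.split(r"(?<=[.!?])\s+", book_text) if s.strip()]
    if reconstruct:
        return sents, sents          # decode each sentence from itself
    return sents[1:], sents[:-1]     # decode each sentence from the previous one
```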

gru_layer code differs from the paper

[two screenshots: the GRU candidate-state equations from the paper, and the corresponding code]
The code snippets in "training/layers.py" and "skipthoughts.py" don't match the formulas in the paper. Should they be modified as follows?

preactx = tensor.dot(h_ * r, Ux)
preactx = preactx + xx_
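For what it's worth, the two orderings are genuinely different operations: applying the reset gate after the recurrent transform (as in the released code) versus before it (as in the paper's equations). A toy numeric check, with row/column conventions simplified; whether this is a bug or just a different, still-valid GRU variant is a separate question, since the released weights were trained with the code's version:

```python
def matvec(M, v):
    """Plain matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def hadamard(a, b):
    """Elementwise (Hadamard) product."""
    return [x * y for x, y in zip(a, b)]

U = [[1.0, 2.0], [3.0, 4.0]]   # toy recurrent weights Ux
h = [1.0, 1.0]                 # previous hidden state h_
r = [0.5, 1.0]                 # reset gate

code_style = hadamard(matvec(U, h), r)    # dot(h_, Ux) * r, as released
paper_style = matvec(U, hadamard(h, r))   # dot(h_ * r, Ux), as in the paper

assert code_style != paper_style          # the orderings do not commute
```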

skipthoughts.encode is very slow on large number of sentences

If the number of sentences I'm trying to encode (i.e., the list X passed to the encode function in skipthoughts.py) is large, say around 10K or 100K, then encode takes a very long time to run.
What can be done to speed it up? It seems the Theano functions 'w2v' and 'w2v2' are very slow. Any help will be really appreciated... thanks!
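A few things that may help (suggestions, not verified fixes): run on a GPU, raise batch_size (it is a parameter of encode, per the signature quoted earlier in this thread), and chunk X so progress can be checkpointed between pieces. A sketch of the chunking, with the encode call commented out because it assumes a loaded model:

```python
def chunks(seq, size):
    """Yield successive slices of `seq` of length `size` (the last
    slice may be shorter), so large encoding jobs can be checkpointed."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# parts = [skipthoughts.encode(model, part, verbose=False, batch_size=512)
#          for part in chunks(X, 10000)]
# vectors = numpy.vstack(parts)
```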
