
deredact's Introduction

🦙🔍 De-redacting Elon's Email with Character-count Constrained Llama2 Decoding

๐Ÿฆ Twitter ย ย โ€ขย ย  ๐Ÿ”— Original Colab

Many are attempting to de-redact Andrej's email, but Elon's email seems more tractable for constrained decoding with a causal LLM because of its much longer unredacted prefix. Below I present a decoding algorithm that searches for completions satisfying the character-count constraints imposed by the redaction's word breaks.

Here are a few initial completions it generated that match the required word lengths and may be worth exploring (though it wasn't able to carry these through to full completions):

  • "Nothing matters but the scale and scale requires money..."
  • "Nothing changes at all unless you raise several..."
  • "Google research is far beyond any other research group..."
  • "Google Research is now doing the most cutting edge work..."
  • "Please consider how to change the above scenario from zero probability..."
  • "Please consider how to ensure that your research team and technology remain aligned with..."
  • "Please consider how to raise and spend capital with the appropriate degree..."
  • "Please consider all the facts and think through your plan thoroughly before making any decision..."

Some notes about the algorithm (a sketch follows the list):

  • we select the top top_k_to_decode logits and check whether their tokens have the correct word length
  • we look for tokens whose character count matches word_length or word_length - 1, to allow for a space or punctuation mark
  • among these correct-length tokens, we then keep those with a post-softmax probability of at least min_p_logits * i * prob_gamma
    • the i * prob_gamma term raises the minimum probability as the sequence gets longer (since the space of valid continuations naturally narrows)
  • for each surviving next-token prediction, we recursively continue the decoding process
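
A minimal sketch of this recursive search, assuming a HuggingFace causal LM (the model name, char_len_ok, and decode_recursive are illustrative; the hyperparameter names match the notes above):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumption: any HF causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

top_k_to_decode = 200
min_p_logits = 0.004
prob_gamma = 1.3

def char_len_ok(tok_text, word_length):
    # per the note above: accept word_length or word_length - 1 chars,
    # to allow for an attached space or punctuation mark
    return len(tok_text) in (word_length, word_length - 1)

def decode_recursive(tok_ids, word_lengths, i=0, results=None):
    # depth-first search over completions whose word lengths match the redaction
    if results is None:
        results = []
    if i == len(word_lengths):  # every redacted word has been filled in
        results.append(tokenizer.decode(tok_ids))
        return results
    with torch.no_grad():
        logits = model(torch.tensor([tok_ids])).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    top_probs, top_toks = probs.topk(top_k_to_decode)
    for p, tok in zip(top_probs.tolist(), top_toks.tolist()):
        if not char_len_ok(tokenizer.decode([tok]), word_lengths[i]):
            continue
        if p < min_p_logits * i * prob_gamma:  # position-scaled probability floor
            continue
        decode_recursive(tok_ids + [tok], word_lengths, i + 1, results)
    return results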

Contributing

We love contributions! Please check Issues for some high-priority directions for improvement, each with a fairly detailed description of how to get started. Issue #2 is the top priority right now: migrating away from the recursive architecture will enable large speedups via parallelization and more.


deredact's Issues

Parallelize/batchify the search process

Once we have a queue instead of a recursive DFS, we can pop batches from the queue and, at each step, extend the queue with a list of new_tok_seqs constructed from next_tok_ids. Please also include a batch_size parameter.

Depends on: Issue #2
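
A possible shape for the batched search, purely a sketch: it assumes the model/tokenizer from the README sketch, that tokenizer.pad_token has been set, and a hypothetical filter_candidates implementing the same 2-stage filter; only batch_size, new_tok_seqs, and next_tok_ids come from the issue text.

from collections import deque
import torch

def search_batched(seed_tok_ids, word_lengths, batch_size=32):
    # each queue entry is (token ids so far, index of the next redacted word)
    queue = deque([(seed_tok_ids, 0)])
    results = []
    while queue:
        batch = [queue.popleft() for _ in range(min(batch_size, len(queue)))]
        # left-pad to a common length so the batch runs in one forward pass
        # (a real implementation should also fix position ids under left padding)
        max_len = max(len(seq) for seq, _ in batch)
        padded = torch.tensor(
            [[tokenizer.pad_token_id] * (max_len - len(seq)) + seq for seq, _ in batch]
        )
        attn = (padded != tokenizer.pad_token_id).long()
        with torch.no_grad():
            logits = model(padded, attention_mask=attn).logits[:, -1]
        for (seq, i), row in zip(batch, logits):
            next_tok_ids = filter_candidates(row, word_lengths, i)  # 2-stage filter
            new_tok_seqs = [seq + [tok] for tok in next_tok_ids]
            for new_seq in new_tok_seqs:
                if i + 1 == len(word_lengths):
                    results.append(new_seq)
                else:
                    queue.append((new_seq, i + 1))
    return results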

Tune decoding hyperparams

top_k_to_decode: how many of the top logits to consider when checking their tokens' character counts.
min_p_logits: once tokens with the correct character count are found, this sets the minimum post-softmax probability for a token to be included for further decoding.
prob_gamma: scales min_p_logits upward as the decoded sequence gets longer, since possibilities naturally narrow as completions grow.

Current hyperparam exploration has been in the ballpark of these:

top_k_to_decode = 200
min_p_logits = 0.004
prob_gamma = 1.3
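
Concretely, the two stages compose roughly like this (sketch: probs is the post-softmax distribution, i the index of the current redacted word, and has_correct_length a hypothetical length check):

top_probs, top_toks = probs.topk(top_k_to_decode)  # stage 1: candidate pool
min_p = min_p_logits * i * prob_gamma              # stage 2: position-scaled floor
next_toks = [tok for p, tok in zip(top_probs.tolist(), top_toks.tolist())
             if has_correct_length(tok, word_lengths[i]) and p >= min_p]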

We may want to change the 2-stage filtering approach and deprecate the top_k_to_decode vs min_p_logits distinction.
We may want to set up a more sophisticated scheduler or weighting scheme for min_p_logits.

Issue #3 may also introduce new hyperparams related to batch sampling.

More detailed word length constraints

Just posting some thoughts before I have to go to work. I think we can get more information out of these constraints than an approximate character count: we can measure the width of each black box in pixels and check it against the width of each letter in the Sohne font. This gives us a more tightly constrained set of words that could fit in each redacted section.

Measuring the email:
[image: redacted_email_1_measured]

This gives us a new form for word_lengths that looks something like this:
{0: 107, 1: 129, 2: 152, 3: 145, 4: 122, 5: 122, 6: 39, 7: 281, 8: 54, 9: 62, 10: 24, 11: 24, 12: 46, 13: 31, 14: 39, 15: 62, 16: 39, 17: 31, 18: 84, 19: 54, 20: 54, 21: 31, 22: 61, 23: 54, 24: 31, 25: 31, 26: 31, 27: 23, 28: 46, 29: 31, 30: 31, 31: 69, 32: 69, 33: 39, 34: 39, 35: 31, 36: 46}

Comparing with Sohne letter widths:
sohne_letters = {" ": 3.376015625, "!": 3.55203125, '"': 5.28, "#": 10.56, "$": 8.880078125, "%": 12.288046875, "&": 10.928046875, "'": 2.8, "(": 5.1040625, ")": 5.1040625, "*": 6.91203125, "+": 9.728046875, ",": 3.568046875, "-": 6, ".": 3.568046875, "/": 4.024375, "0": 9.968046875, "1": 5.762578125, "2": 8.928046875, "3": 9.12, "4": 9.727109375, "5": 8.960390625, "6": 9.328984375, "7": 8.9240625, "8": 9.568046875, "9": 9.328984375, ":": 3.568046875, ";": 3.568046875, "<": 9.728046875, "=": 9.728046875, ">": 9.728046875, "?": 8.176015625, "@": 14.176015625, "A": 11.07203125, "B": 10.176015625, "C": 10.576015625, "D": 11.216015625, "E": 9.520078125, "F": 9.296015625, "G": 11.520078125, "H": 11.760078125, "I": 4.0640625, "J": 6.323046875, "K": 10.288046875, "L": 8.8, "M": 13.663984375, "N": 11.6, "O": 11.647734375, "P": 10, "Q": 11.616015625, "R": 10.35203125, "S": 9.56546875, "T": 9.856015625, "U": 11.15203125, "V": 10.848046875, "W": 14.736015625, "X": 10.4640625, "Y": 10.096015625, "Z": 9.84, "[": 5.04, "]": 5.04, "^": 7.296015625, "_": 6.880078125, "a": 8.416015625, "b": 9.488046875, "c": 8.3040625, "d": 9.488046875, "e": 8.608046875, "f": 5.08625, "g": 9.408671875, "h": 9.0240625, "i": 3.84, "j": 4.0459375, "k": 8.528046875, "l": 3.84, "m": 13.936015625, "n": 9.0240625, "o": 8.99203125, "p": 9.488046875, "q": 9.488046875, "r": 5.92, "s": 7.71171875, "t": 5.408359375, "u": 9.0240625, "v": 8.11203125, "w": 11.520078125, "x": 8.19203125, "y": 8.11203125, "z": 7.92, "{": 5.84, "|": 4, "}": 5.84, "~": 8.063984375}

With a scoring function to handle measurements like this:
def pixels_of_word(word):
    return sum(sohne_letters.get(letter, 0) for letter in word)  # unknown chars count 0

Combining the two suggests, for example, that words 10 and 11 (which have the same measured width, 24 pixels) must come from a set like
['are', 'our', 'still', 'per', 'bar', 'city', 'run', 'red', 'sky', 'her', '3rd', 'nor']
though we could set a larger margin of error to get more possibilities.
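
A sketch of how such per-box candidate sets could be generated, assuming a word list vocab and a pixel tolerance tol (both illustrative):

def words_matching_width(vocab, target_px, tol=2.0):
    # all vocabulary words whose rendered width is within tol pixels of the box
    return [w for w in vocab if abs(pixels_of_word(w) - target_px) <= tol]

candidates = words_matching_width(vocab, word_lengths[10])  # the ~24 px boxes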

Handle subword tokens

The LLM might generate "research" and subsequently "ers", extending the word past its required length. We need to detect whether each tok in top_toks is a subword token and filter it out during construction of next_toks, so we keep "research" but ignore any subsequent subword tokens.

In the future, we may want to "look ahead" to filter out "research" if it doesn't have any highly likely non-subword completions. For example "El" is almost always followed by the subword "on".

Another feature here is to keep tokens that are likely to be followed by subword tokens when the token plus all of its subword continuations sums to the correct character count.
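
As a first-pass filter, Llama's SentencePiece vocabulary marks word-initial pieces with a leading "▁", so subword continuations can be detected from the token string itself (sketch; top_toks and next_toks as above):

def is_subword(tok_id):
    # SentencePiece prefixes word-initial pieces with "▁"; pieces without it
    # (e.g. "ers", "on") continue the previous word
    return not tokenizer.convert_ids_to_tokens(tok_id).startswith("▁")

next_toks = [tok for tok in top_toks if not is_subword(tok)]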
