ngsankha / rbsyn Goto Github PK
View Code? Open in Web Editor NEWProgram synthesis for Ruby
License: BSD 3-Clause "New" or "Revised" License
Program synthesis for Ruby
License: BSD 3-Clause "New" or "Revised" License
Right now, all the strategies are run in an unspecified order by taking the descendants of the base class. Specifying an order is required for best results.
Finally get to synthesis of something like:
def self.username_available?(username, email = nil)
lower = username.downcase
return false if reserved_username?(lower)
return true if !User.exists?(username_lower: lower)
# staged users can use the same username since they will take over the account
email.present? && User.joins(:user_emails).exists?(staged: true, username_lower: lower, user_emails: { primary: true, email: email })
end
The way our branch synthesis works might sometimes trigger a call to the new candidate programs generation routine. So the final program with the branches might be higher in size than the program before.
The candidate program generation loop takes into account size constraints - but the branching doesn't. Figure out a way to do that while respecting size constraints.
Test fails now with a no candidate found. It works in the Docker image submitted to artifact review.
The subexpression Post.where(slug: arg1).first
in #13 is repeated over and over again, it would be nice if it can be refactored out. We have effect information to know that only reads are performed so it should be safe to put this subexpression on top.
We can fall back on the prog cond tuples to do the if statement generation at a later time. Will lead to better branch folding opportunities.
Disable side effects flag still knows that a side effect assertion failed just disables the feedback to the synthesis loop. That still an improvement over hunting in the dark. Should stop the assertion check altogether.
Fix this ASAP.
only the first type definition is used during synthesis now. Should use all methods
An rspec like DSL for specifying tests that can be used to synthesize functions. Example:
define :username_available?, "(String) -> %bool" do
spec "returns true when user doesn't exist" do
username_available? 'bruce1'
post { |result|
result == true
}
end
spec "returns false when user exists" do
pre {
u = User.create(name: 'Bruce Wayne', username: 'bruce1', password: 'coolcool')
u.emails.create(email: '[email protected]')
}
username_available? 'bruce1'
post { |result|
result == false
}
end
assert_equal generate_program, %{
def username_available?(arg0)
User.exists?(username: arg0)
end
}.strip
end
we only look up envref when filling holes at depth 0. They can also be used for higher depths.
The reasoning for this was the synthesis engine if stumbles across the same expression we can bump up the count. This is slow a lot of times, and reusing old expressions smartly can help.
Joins input type computation ignore the label for a particular association, so they don't work all the time.
Given a function argument of type { email: ?String, active: ?%bool, username: ?String, name: ?String }
, and if asked to give a program that returns a String the synthesizer will return programs like:
arg0.[](:name)
arg0.[](:username)
arg0.[](:active)
arg0.[](:email)
The program with :active
is not correct. Even though it can be eliminated after testing, doing so with type checking could be cheaper. Such programs are generated because the argument can be String or bool. But only after the argument is filled that the precise type can be calculated. Such kind of analysis can be directly implemented as well.
reorder worklist based on some similarity index with preconditions
There are multiple environments at the moment:
These could be merged for a much cleaner code
Need some large scale refactoring to clean up the code.
Can we use petri nets for better computation of paths through types? Side effects help when there are assertions that test that side effect. But there is no way to reach a side effect path if there is no assertion for it. Petri nets could be helpful.
Example:
# type: () -> B
def my_method
a = A.new()
a.set_some_state(:foo)
B.lookup(a)
end
there are place where any possible exceptions are blindly ignored because it could be from the synthesized code (which arguably could be wrong). This makes debugging a pain during development. Make a project specific error class that can be used to wrap all known errors and handled properly.
RbSyn is non-interactive at the moment---there are specs and then the synthesizer finds some code that satisfies it without any further human input. However, the core of RbSyn supports partial programs in the form of holes, but that functionality is not exposed anywhere. A VSCode plugin will be the best way to write programs (partial/full) and let RbSyn do the completion. This involves the following steps:
make holes a part of the AST, right now the expansion of expression feels very adhoc
Variables declared in pre-condition, the function invocation and the post condition cannot refer to each other.
the specification is provided in a custom DSL that requires us to write pre
and post
blocks. There is a possible way to write more natural specifications that looks exactly like tests, without separate pre and post blocks.
wrapped_func1:
call synthesized func
raise an exception wrapping the return value
wrapped_func2:
call wrapped_func1
if need to run user provided post condition:
catch the exception and return unwrapped return value
else:
forward the exception
eval_ast:
set the flag to indicate if running user provided postcond or not
define wrapped_func2 instead of synthesized func in the eval context
eval the spec
if exception:
call the engine provided post cond
else:
we have run the user post cond
This is a tricky change to make, but will make the user specs look much nicer.
n
tests and fails m
tests, keep the program corresponding to n
tests together (along with branch condition) and recursively apply this procedure.This method will eliminate individual cost for generating candidates for each test.
the way the code is presently written, the receiver during side effect is always a variable.
Avoid cyclic dependency in variable assignment by checking DAG of expr environment
Couple of things becoming clear while writing the paper and getting more benchmarks to pass:
ideas from separation logic could help in more optimizations of our approach. Read the paper Structuring the synthesis of heap-manipulating programs for inspiration.
Current setup can satisfy effect requirements that are more than one-level deep. This can be solved with an effect reachability graph. Will be a matter of modifying the type reachability graph mostly.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.