
tyche's People

Contributors

kylejharper


tyche's Issues

Adapt to wasted comp space

We should track comp space and expand or reduce it based on hits.

if <we're never maxing out comp space>
  reduce comp space so raw has more space

if <we're regularly maxing out comp space && comp is at INITIAL_RATIO>
  increase comp space to avoid faults
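The pseudocode above could look something like the following sketch, assuming hypothetical counters and tunables (the names `full_events`, `window_passes`, `ADJUST_STEP`, and `MAX_COMP_RATIO` are illustrative, not from the codebase):

```c
/* Hypothetical adaptation step; all names here are illustrative, not tyche's. */
#define INITIAL_RATIO   20   /* Percent of total space reserved for comp. */
#define ADJUST_STEP      5   /* Percent to grow/shrink per adaptation pass. */
#define MAX_COMP_RATIO  80   /* Never starve the raw list entirely. */

/* Decide the new comp-space ratio based on how often comp space filled up
 * during the last observation window of window_passes sweeps. */
int adapt_comp_ratio(int current_ratio, int full_events, int window_passes) {
  if (full_events == 0 && current_ratio > INITIAL_RATIO) {
    /* We never maxed out comp space: give raw more room. */
    return current_ratio - ADJUST_STEP;
  }
  if (full_events >= window_passes / 2 && current_ratio < MAX_COMP_RATIO) {
    /* We regularly maxed out comp space: grow it to avoid faults. */
    return current_ratio + ADJUST_STEP;
  }
  return current_ratio;  /* Steady state: leave the ratio alone. */
}
```

The "regularly" threshold (here, half the passes in the window) is a guess; the real trigger would need tuning against observed hit rates.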

buffer__update_ref() should lock itself

The spaghetti-string nature of the tyche project originally required callers to lock/unlock to check for victimization. As the code becomes more structured, this is becoming more and more unnecessary. Remove the caller-side requirement and modify the callers accordingly.

list__search should send a copy or reference

The list__search and list__easy_search functions don't allow the user to specify if they want a copy or a reference. The caller should be allowed to specify if they want one or the other.
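One way to expose that choice is a mode flag on the search call. This is only a sketch; the real `list__search` signature and the `Buffer` layout are simplified stand-ins here:

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for tyche's real types; illustrative only. */
typedef struct Buffer { int id; char data[64]; } Buffer;
typedef enum { SEARCH_REFERENCE, SEARCH_COPY } SearchMode;

/* Given the node the search landed on, return either the live buffer
 * (caller must observe pinning rules) or a heap copy the caller owns. */
Buffer *list__search_mode(Buffer *found, SearchMode mode) {
  if (found == NULL)
    return NULL;
  if (mode == SEARCH_REFERENCE)
    return found;               /* Live data; caller must hold/release a pin. */
  Buffer *copy = malloc(sizeof *copy);
  if (copy != NULL)
    memcpy(copy, found, sizeof *copy);
  return copy;                  /* Caller owns the copy; no pin is held. */
}
```

The copy path trades an allocation and memcpy for freedom from pin bookkeeping, which is probably the right default for short-lived reads.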

Get rid of list locking

The nature of localized locking has eliminated the need for list__acquire_write_lock. Remove it, and consider eliminating the sweeper timing since sweeping is no longer a truly stop-the-world event.

Code Name ACCRS

Now that the co-mingling of tyche and the ACCRS logic (specifically list.c/.h and buffer.c/.h) is broken up more appropriately, it's time to give the ACCRS logic pieces a proper code name so I can refer to them intelligently.

This will grant the distinction of saying "Tyche is an implementation of an ACCRS system which currently uses CODENAME for the list/pool management."

Locality (of buffers)

Consider creating batches (or all?) of buffers ahead of time and putting them in a free-list, in an effort to optimize performance through spatial locality of reference (memory locality).

Since the majority of the searching is actually done through the Skiplist Node objects, it might be prudent (perhaps even more important) to create free-lists of those such that traversing them ends up referring to fewer pages. Research will have to be done to determine if a list-per-level is necessary and if, somehow, associating one level with another (since ->down pointers exist) would be helpful... yea... :\

Update and delete in tyche

The workers in tyche need to employ configurable update and delete (remove) frequencies. This is the final piece of CRUD we've been missing. After this, tyche can be configured with a preset amount of updating and deleting to mimic database patterns and the like.

This will also be the final piece of CRUD outside of the List/Pool "core" logic. Allowing tyche to issue CRUD operations while the "core" also does CRUD (specifically, Updates from compression and deletions from sweeping) will pinpoint any performance or synchronization issues.
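The worker-side selection could be as simple as mapping a uniform draw onto the configured frequencies. The percentages mirror the idea of `-U`/`-D` style flags, but this picker function itself is hypothetical:

```c
/* Illustrative operation picker; the flag semantics (-U 5 -D 5 meaning 5%
 * updates, 5% deletes) are an assumption, and pick_op() is not tyche code. */
typedef enum { OP_READ, OP_UPDATE, OP_DELETE } CrudOp;

/* roll is a uniform value in [0, 100); update_pct and delete_pct are the
 * configured frequencies. Everything not claimed by them is a read. */
CrudOp pick_op(int roll, int update_pct, int delete_pct) {
  if (roll < update_pct)
    return OP_UPDATE;
  if (roll < update_pct + delete_pct)
    return OP_DELETE;
  return OP_READ;
}
```

Keeping the mapping in one pure function makes the mix easy to verify independently of the worker threads.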

Deadlock

There's a deadlock that still occurs, particularly when the fixed ratio is low (lots of sweeping), which means it's probably in the compressors. My guess is I'm using pthread_cond_broadcast when I should just be using pthread_cond_signal.

This currently only affects tyche, which is purpose-built to push the limits of the system. It's possible this would never affect anyone using it as an API, but it still needs to be fixed.
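For reference, the signal-one-waiter pattern the guess above points at looks like this. This is a minimal one-producer/one-consumer handoff, not the compressor code itself:

```c
#include <pthread.h>
#include <stddef.h>

/* Minimal handoff demonstrating pthread_cond_signal over broadcast when only
 * one unit of work exists; illustrative, not tyche's compressor logic. */
static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static int work_available = 0;
static int consumed = 0;

static void *consumer(void *arg) {
  (void)arg;
  pthread_mutex_lock(&lock);
  while (!work_available)            /* Always re-check the predicate. */
    pthread_cond_wait(&ready, &lock);
  consumed = work_available;
  pthread_mutex_unlock(&lock);
  return NULL;
}

int produce_and_join(int value) {
  pthread_t t;
  pthread_create(&t, NULL, consumer, NULL);
  pthread_mutex_lock(&lock);
  work_available = value;
  /* One unit of work: wake exactly one waiter instead of all of them. */
  pthread_cond_signal(&ready);
  pthread_mutex_unlock(&lock);
  pthread_join(t, NULL);
  return consumed;
}
```

Broadcasting when only one waiter can make progress causes a thundering herd at best; if the woken extras re-wait on a predicate that can no longer become true, it can also look exactly like the deadlock described above.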

Remove Skip down pointer

Consider removing the ->down pointer for skip lists and switching to fixed-size arrays:

L4--------------------------------------------------------->|NULL|
L3------------------------------------------------>| |*|--->|NULL|
L2------------>| |*|------------------------------>| |*|--->|NULL|
L1------------>| |*|--------------------->| |*|--->| |*|--->|NULL|
L0--->|1|*|--->|2|*|--->|3|*|--->|4|*|--->|5|*|--->|6|*|--->|NULL|
        ^_^      ^_^      ^_^      ^_^      ^_^      ^_^    ^____^

The areas denoted by ^_^ are arrays of pointers to further Skiplist Nodes. The left-to-right logic is still in place and works the same. However, when it's time to go "down" we don't use a pointer to the next-lower level. Instead, we have a fixed-size array associated with the node which allows us to go down by simply decrementing the level.

Note: this will fundamentally change how the skiplist is traversed at the physical level, while keeping the logical bits the same (forward and down). The primary concern I have is that my current SkiplistNode struct only requires 32 bytes; if I end up requiring a MAX_LEVELS (32) element array of pointers (8 bytes) PER NODE then we're going to drastically increase the amount of memory our skiplist consumes. This will need to be contrasted with the performance gains from better cache utilization.
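The two layouts and the memory trade-off can be sketched as follows. Field names are illustrative, not the actual SkiplistNode definition:

```c
#include <stddef.h>

/* Two node layouts for contrast; names are illustrative, not tyche's. */
#define MAX_LEVELS 32

/* Current style: one node object per level, linked by ->down. */
typedef struct SkiplistNodeDown {
  struct SkiplistNodeDown *right;
  struct SkiplistNodeDown *down;
  void *buffer;                    /* Associated Buffer. */
} SkiplistNodeDown;

/* Proposed style: one node per buffer, with a per-node forward array.
 * Going "down" is just decrementing the index into forward[]. */
typedef struct SkiplistNodeArray {
  void *buffer;
  int levels;                      /* How many entries of forward[] are live. */
  struct SkiplistNodeArray *forward[MAX_LEVELS];
} SkiplistNodeArray;

size_t down_node_size(void)  { return sizeof(SkiplistNodeDown); }
size_t array_node_size(void) { return sizeof(SkiplistNodeArray); }
```

One middle ground worth weighing: allocating `forward[]` as a flexible array member sized to the node's actual level count would keep the one-node-per-buffer design without paying for MAX_LEVELS pointers on every node, at the cost of variable-size nodes.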

My original statement above (about 15 minutes ago) indicated that we would still have the same number of SkiplistNodes in the Skip List; this was misleading. The pipes above |2| are meant to indicate a single array as part of a single SkiplistNode that contains that array. (I've updated the ASCII diagram accordingly.)

Also of import: if we implement this, we will have to have a node per buffer no matter what, and the slnode will always be the same size so it really just becomes part of the cost of doing business. This means we will no longer need a nearest_neighbor concept which could eliminate a few things.

Update: for reference: http://ticki.github.io/blog/skip-lists-done-right/

For future reference, tyche version 0.0.17 performed as follows:

e.g.:  bin/tyche -m 1500000000 -p /tmp/ram_drive -d 60 -v -D 5 -U 5
U 0  D 0  100% mem:  323 million
U 5  D 0  100% mem:  235 million
U 0  D 5  100% mem:  186 million
U 5  D 5  100% mem:  139 million

Bias (page favoritism)

Tyche needs to support a switch for bias:

tyche ... -b X,Y
X == The percentage of buffers that make up Y
Y == The percentage of hits these buffers represent

E.g.: tyche ... -b 20,80
Means: consider the first 20% of the available pages as accounting for 80% of all the hits we should attempt.
This example would follow the Pareto principle (the "80/20 Rule")

Without this we cannot introduce the bias that naturally occurs in different workloads. For example, most databases have certain pages/rows/etc. that are used drastically more than others.
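The -b X,Y mapping could be implemented by folding a uniform draw onto the two page ranges. This function is a sketch of the idea, not tyche's implementation:

```c
/* Illustrative mapping of a uniform draw onto the -b X,Y bias; hypothetical,
 * not tyche's actual code. roll is uniform in [0, 100). With -b 20,80,
 * y_pct (80) percent of rolls land in the first x_pct (20) percent of pages;
 * the rest spread uniformly over the remaining pages. */
int biased_page(int roll, int pages, int x_pct, int y_pct) {
  int hot_pages = pages * x_pct / 100;
  if (roll < y_pct)
    return (roll * hot_pages) / y_pct;                        /* Hot set.  */
  return hot_pages +
         ((roll - y_pct) * (pages - hot_pages)) / (100 - y_pct); /* Cold set. */
}
```

With 1000 pages and -b 20,80, rolls 0-79 land in pages 0-199 and rolls 80-99 land in pages 200-999, matching the Pareto example above.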

Merge reader and writer conditions for buffers, or switch to spinwait

Originally, I only wanted readers or writers to wake up when pending_writers was set or unset. However, the nature of pending_writers might mean I can only use one condition. This would save 48 bytes per buffer, which is ~25% of an entire buffer's overhead.

Specifically, I think I can reduce significant overhead by using pthread_rwlock_t instead of emulating it myself and calling a bunch of broadcasts. There might be other issues that made me avoid this early on in the programming of tyche... but my guess is I did it with 2 conditions and a mutex because I simply didn't know any better.
NOTE: Trading in an rwlock_t means I will avoid TWO (both) conditions (48 bytes each) AND the reference counters (for pinning) AND the native pin counting/tracking might be faster AND I can get rid of pending_writers... add it all up before deciding if it's worthwhile

It's entirely possible I'm wrong about this... or that the performance change won't be worth it... but it's worth checking out. Also be aware that SkiplistNode structs rely on the associated Buffer's lock, at a minimum, for protection. (This is safe because each SLNode is associated with a buffer by design; therefore, to lock an entire "stack" of slnodes at different levels, we just lock the buffer.)

Block and unblock have a race condition

The buffer__block() and buffer__unblock() functions may have a race condition because they require the caller to release their own pin before getting the block.

There is also a risk in keeping a pin and waiting for the ref count to hit 1 instead of 0. If multiple writers are pending and one is a removal thread, it could wipe out the buffer while someone else is trying to block() it... this needs more thought.

Locking head creates a bottleneck

Functions that lock the list->head create a bottleneck. By virtue of the skip list's structure, we're going to hit bottlenecks as we descend.

Efforts should be made to only lock when the slnode(s) have a ->right pointer in their array of levels that matches. In other words, don't lock slnodes as we traverse unless we're absolutely sure we're going to amend some of the values in the array of down/next pointers.

See #13
