ezrosent / allocators-rs Goto Github PK

View Code? Open in Web Editor NEW

303.0 303.0 27.0 2.88 MB

Allocators in Rust

License: Apache License 2.0

Rust 97.43% Shell 2.57%

allocators-rs's People

Contributors

Stargazers

Watchers

allocators-rs's Issues

Add missing punctuation in README.md

The Contributing section of README.md ends in a sentence which is missing a period.

mmap-alloc: Support full Alloc API

Currently, we only support alloc, dealloc, alloc_zeroed, and alloc_excess; the other methods of the Alloc trait are left as their default implementations. However, some of them might benefit from an implementation that is made with awareness of mmap's semantics.

slab-alloc: Make ptr_map support no-std and no-os

Currently, the hash map implementation in ptr_map relies on Vec and Box. Eventually, it should either drop these dependencies (and be implemented in terms of raw chunks of memory and raw pointers) or be upgraded once Vec and Box support parametric allocators.

Once this is done, we should make it so that the user can supply a custom allocator for this purpose.

Progress:

Make ptr_map support parametric allocators.
Allow the user to supply an allocator to be used with ptr_map.

Ensure elfmalloc works on iOS

Given this jemalloc issue, we should verify that elfmalloc works on iOS in addition to Mac desktops.

elfmalloc: Support Windows mapped memory

Currently, our handling of mapped memory is very specific to Unix semantics: memory can be mapped in a "clean" state in which virtual memory is consumed, but no physical memory is used. On the first use of a mapped region, the region will become "dirtied" and the OS will allocate physical memory resources.

On Windows, on the other hand, using memory that has not yet been dirtied ("committed", in Windows parlance) results in a fault. Mapped but uncommitted memory must be explicitly committed before it can be used. Like on Unix, dirty/committed memory can be explicitly uncommitted (roughly equivalent to madvise(MADV_DONTNEED) on Unix). For a good coverage of memory handling on Windows, see https://blogs.technet.microsoft.com/markrussinovich/2008/11/17/pushing-the-limits-of-windows-virtual-memory/.

Supporting Windows memory entails:

Extending the allocators in mmap-alloc to support transitions between uncommitted and committed memory on Windows
Making use of this functionality in elfmalloc

elfmalloc: Implement malloc-bind::LayoutFinder

In order to allow us to use malloc-bind to create C API bindings for elfmalloc, we need an implementation of the malloc-bind::LayoutFinder trait, which maps from allocated objects to their layouts. All of the logic to do that already exists in elfmalloc - it just needs to be encapsulated in a LayoutFinder implementation.

Upgrade interpolate_idents

Once SkylerLipthay/interpolate_idents#23 is fixed, upgrade object-alloc-test and slab-alloc to the new version.

mmap-alloc: Support committing/uncommitting memory on Windows

In addition to implementing the Alloc and UntypedObjectAlloc traits, the allocators in mmap-alloc also provide system-specific methods for handling the lifecycle of mapped memory. On Unix, this means the ability to commit and uncommit by accessing pages to ensure that they are mapped and calling madvise with MADV_DONTNEED or MADV_FREE.

A similar (though not identical) lifecycle of committing and uncommitting exists on Windows, and we should support it.

elfmalloc: Explore optimizations given layout

Most of elfmalloc is designed to perform well given the C API, which doesn't provide any layout information on dealloc, realloc, etc. However, we aim to also provide a Rust allocator. Thus, it's worth exploring optimizations that we can make when compiling for Rust programs. Possible examples include:

Not having to keep track of metadata for slabs, which we need in the C API to map pointers to object class
Being able to have more distinct slab sizes (note that there is not a one-to-one mapping between object class sizes and slab sizes)

mmap-alloc: Implement mark_unused for Windows

On Linux and Mac, we have a mark_unused function that is used when a mapping call returns 0. In this case, we request a different mapping, but leave the 0 page mapped so that future calls will never return that page again. In order to make bugs easier to find and to avoid wasting resources, we call mark_unused on the page, which informs the kernel that we don't need a physical page to back it, and sets the page's permissions to PROT_NONE so that any future accesses will result in a segfault.

Equivalents to this behavior probably exist on Windows, so we should implement mark_unused for Windows as well.

test-scripts/check-copyright-comments.sh: Fix comment typo

This line should read first_modification_year <file> instead of last_modification_year <file> (as it currently does).

Switch to using rustfmt-nightly

Switch to using rustfmt-nightly. In order to do this all the way, we should:

Format all existing code using rustfmt-nightly
Create a script in the test-scripts directory that can verify that all code is already properly formatted (see here for instructions), and adding this to Travis and AppVeyor CI configurations

Shorten type names?

RFC 356 specifies that names without prefixes should be preferred over names with prefixes - e.g., io::Error over io::IoError.

Currently, we use the latter naming convention - e.g., slab::SlabAlloc, slab::UntypedSlabAlloc, etc. We should abide by the proper, officially-accepted naming convention.

Open questions:

Should we rename ObjectAlloc? Probably not. It's in the object_alloc crate, and object_alloc::Alloc is pretty unwieldy - most people will want to use object_alloc::ObjectAlloc and then use the type as ObjectAlloc without the crate prefix.
Do we expect most people to use slab::Alloc (and other similarly-named types) as slab::Alloc or just Alloc? If it's the latter, then it may actually be more clear to leave it as is (SlabAlloc). This also avoids a conflict with the Alloc trait from the alloc crate, which is likely to be used by code that also uses a slab allocator.

Travis CI: Build for more platforms/compilers

Right now we only build for Linux and Mac on the nightly compiler. However, some of the crates may support being built on stable. Additionally, we should test on 32-bit. Thus:

Figure out which crates can be built without nightly. If there are any, then:
- Add beta and stable to the list of compiler versions
- Figure out what minimum compiler version we can build with, and build with that version in addition to current beta and stable
- Use travis-cargo instead of normal cargo so we can use the --only nightly flag to make sure we don't run tests on beta or stable that require nightly
~~Figure out if Travis supports 32-bit, and if so, run the tests on 32-bit~~ Travis doesn't support 32-bit, but it's apparently possible to get it to work.

alloc-fmt: Allow disabling backtrace

Currently, alloc-fmt always tries to print a backtrace, although that often fails because printing the backtrace involves allocation (we detect this failure and just abort). It'd be nice if the user could decide at compile time (or maybe even at runtime?) whether they wanted backtraces.

Two options I see for this:

Add a feature (we probably want it included in the crate's list of default features) that enables backtraces
Add the ability to parse the RUST_BACKTRACE environment variable according to how it is used in normal Rust. This one is tricky because it should ideally be done in a way that avoids allocation (allocating is not the end of the world so long as, just like with backtraces, we can detect any recursion and abort)

slab-alloc: Sort partially-full slabs

Currently, slabs are lumped into three buckets: empty, partially-full, and full. We allocate from the partially-full slabs if there are any, and otherwise from the full slabs. This makes it so that full slabs - which are eligible to be freed back to the system - are only made ineligible if no partially-full slabs exist.

We can improve this system - and make it more likely that a partially-full slab will become full and thus eligible for freeing - by sorting the slabs by fullness and allocating from the least-full slabs first. A heap might work, although a better option might be a simplified heap-like structure like the one described in Section 3 of SuperMalloc: A Super Fast Multithreaded Malloc for 64-bit Machines:

To implement fullest fit, SuperMalloc provides, for each object size, a heap-like data structure to sort pages by how full they are. This data structure solves a simpler problem than the general priority-queue problem because pages have a bounded number of different fullness states. For example, pages containing 64-byte objects can be in only one of 65 different states corresponding to 0 objects free, 1 object free, and so forth up to 64 objects free. For each size class, SuperMalloc simply maintains an array of length 65 containing linked lists of pages, along with a bitmap to indicate which sizes have nonempty linked lists. The linked list cells are kept outside the pages so that the pages can be uncommitted when they become empty.

bsalloc segfaults on Windows

bsalloc currently constructs its mmap_alloc::MapAlloc instance as MapAlloc::default(), which doesn't commit allocated memory. This is fine on POSIX, but on Windows, it breaks (uncommitted memory must be committed manually). We have a couple of options:

Configure MapAlloc to allocate already-committed memory on all platforms
Configure MapAlloc to allocate already-committed memory on Windows only (leave the other platforms as they are, allocating uncommitted memory)
Leave it up to bsalloc's user to commit memory

I lean towards the first for simplicity's sake, but I'd be OK with the second as well. The third seems like a bad idea to me, but I'm willing to be told I'm wrong.

Dual-license under MIT

Would it be possible to dual-license under the MIT license? This would allow use in the Rust compiler distribution and the Redox operating system.

elfmalloc: Use multiple creeks to aid in metadata mapping

Currently, we use a single creek to back all allocations that are not mmap'd directly. However, if we used multiple creeks, we could use creek membership as a proxy for membership information. This would allow us to, given a pointer, figure out what creek it came from, and thus to know something about its identity.

For example, if we had a different creek for each size class, then we could save a page on each large page allocation. Currently, for large objects, we mmap directly, but we have to mmap an extra page in order to store metadata so that, given a pointer, we can back up a page and use this stored metadata to figure out the size of the object. This would allow us to instead have a creek for power-of-two sized size classes. This would usually be more wasteful of virtual memory than the metadata-in-a-page approach, but it would be less wasteful of physical memory because only the pages needed for the object would actually be dirtied.

bsalloc: Support allocation failures.

Currently, bsalloc is written to assume that memory allocation (in particular, from mmap) can never fail. Instead, it should gracefully handle allocation failure and propagate the error up to the user.

Add AppVeyor testing

Because this set of crates is at least attempting to support Windows, it should have CI testing on AppVeyor to make sure the Windows support is working.

slab-alloc: Create mechanism for supporting time in no-std and no-os

Currently, in util::workingset, we use std::time::Instant to keep track of time. Instead, we need some mechanism (probably a trait) for keeping track of time that is agnostic to implementation, an implementation of this mechanism for a no-std environment (where OS support is present), and a mechanism for allowing the user to provide their own implementation for no-os environments.

Progress:

Create a mechanism for abstracting over time, create an implementation of this mechanism that is backed by std::time::Instant, and update util::workingset to use this mechanism.
Create an implementation of this mechanism that does not rely on std, but only on OS-provided functionality (e.g., libc on Unix and kernel32 on Windows).
Modify the slab allocator code to allow the user to provide their own implementation of this mechanism.

mmap-alloc: Consider removing huge page support.

Huge pages have a number of limitations that make it so they don't place nicely with the other features we support, and it may be worth it to remove support entirely:

On Linux, you can't uncommit a mapping created using huge pages
On Windows, you cannot map huge pages without also committing them at the same time

mmap-alloc: Eliminate distinction between MapAlloc and HugeMapAlloc?

Currently, we have two allocators: MapAlloc and HugeMapAlloc. MapAlloc always allocates using the system's page size, and HugeMapAlloc always allocates with a huge page size that is fixed at construction time.

However, it would be possible to infer the right choice - normal pages, or a particular size of huge pages - depending on the Layout passed to various methods. We should consider whether that's sufficient in all cases and thus whether it'd make sense to collapse both allocator types into one that infers what page size to use based on the provided Layout.

malloc-bind: Support exact semantics of C API

Currently, a few behaviors of the C API are not matched perfectly:

pvalloc: sizes need to be rounded up to a page size
realloc/reallocf: Decide what to return when size == 0
realloc/reallocf: Decide what alignment to use
realloc/reallocf: Decide whether to round size up to a multiple of 8
posix_memalign/memalign: Round size up to a multiple of the alignment

Abide by Rust API Guidelines

For each recommendation in the Rust API Guidelines, ensure that we either abide by it or have a good reason not to.

mmap-alloc: Change permissions before reallocing

On non-Linux platforms, the default realloc implementation is used. This breaks when either of the read or write permissions is not configured because the default implementation allocates a new object and then manually copies the bytes from the old to the new. Thus, we need to override realloc in order to temporarily change the memory permissions, perform the copy, and then change the permissions back.

sysconf: Add default hugepage support on Linux

Currently, priv_default_hugepage on Linux always returns None.

We need to implement it. It consists of parsing /proc/meminfo' looking for a line of the form Hugepagesize: xxx kB. What makes this tricky is that it has to be done using only libc`, and without heap allocation.

slab-alloc: Support no-std and no-os

slab-alloc currently has an os feature and a std feature (which depends on the os feature). However, the code as it stands today can't actually compile without both features being enabled (which is the default). This issue tracks progress towards making slab-alloc truly support a no-std or no-os environment.

Tasks:

Create a mechanism for abstracting over time. Currently, in util::workingset, we use std::time::Instant to keep track of time. Instead, we need some mechanism (probably a trait) for keeping track of time that is agnostic to implementation. (#3)
Once a mechanism for abstracting over time exists, we should make it so that the user can supply it. In an os environment, we can implement this ourselves, but in a no-os environment, we have no way of knowing how time is represented or how to get the current time, so it must be provided by the user. (#3)
Support custom allocators for the pointer hashmap (ptr_map module). Currently, it relies on Vec and Box. Eventually, it should either drop these dependencies (and be implemented in terms of raw chunks of memory and raw pointers) or be upgraded once Vec and Box support parametric allocators. (#6)
Once we support custom allocators in ptr_map, we should make it so that the user can supply a custom allocator for this purpose. (#6)

Test dlopen for dynamically-loaded elfmalloc

Once #85 is merged, dlopen should theoretically work for dynamically loading elfc-created dynamic libraries. Test to ensure this works in practice.

Add benchmark against rpmalloc / lockfree-malloc?

It looks like rpmalloc has benchmarks showing it surpasses lockless performance, but I'm not sure if it's the same lockless in your comparisons.

I like the elfmalloc graphs because they show multiple aspects of every allocator in various scenarios.

If it's easy enough, can you throw a few lock free allocator benches into the performance comparisons?

malloc-bind: Document exact semantics of each C allocation function

It took a lot of reading of manpages and Windows documentation articles to suss out the exact behavior of each of the C allocation API functions on Linux, Mac, and Windows. Since we expect users to sometimes override our default implementations, we should document the semantics so that they don't have to embark on the same trek through documentation in order to figure it out. Perhaps more importantly, some of our default implementations call other C allocation API functions on the assumption that they behave exactly as documented; if a user was to override one of these dependency functions and not implement the same semantics, it could break our implementations of other functions.

malloc-bind: LayoutFinder's methods should be unsafe

Since LayoutFinder's methods take raw pointers and they may have to dereference these pointers, the methods should be unsafe or else safe Rust could cause a memory safety violation by passing in an invalid raw pointer.

elfmalloc-performance.md: Confusing graphs

Apart from the fact that the axes on these graphs could be much more readable, a few of them have duplicate titles, in one case (the shbench graphs) there isn't even any text in between the two sets of identically named graphs. Suggestions:

Clarify the difference between the first two and the latter two graphs for the shbench comparison
Remove the titles from the graphs themselves and add them to the markdown, so they can be searched
Get rid of the weird scientific notation y axis on the first producer-consumer graph
Put units on the graph axis labels instead of in the title of the graph

elfmalloc: Make PageAlloc its own ObjectAlloc

We use a PageAlloc to allocate the memory used to back slabs in elfmalloc. It consists of a backing allocator (currently a creek) and a pair of bagpipes - one for clean (uncommited) memory, and one for dirty (potentially committed) memory. It's currently tied to the elfmalloc code, but there's no reason it needs to be.

We should make it its own ObjectAlloc. A few open questions:

What to call it? PageAlloc is a pretty generic, uninformative name, and makes more sense in the context of elfmalloc.
Should it implement both ObjectAlloc and UntypedObjectAlloc? Probably.
Should it be parametric on backing allocator, or should it always use a creek?
Related to the previous point, how can we make the handling of dirty vs clean memory more generic? For example, the distinction between dirty and clean pages is handled differently on Windows than on Unix, and the tradeoffs that we currently make (namely, wantonly wasting virtual memory space) make much less sense on 32-bit where virtual memory is in short supply. Our current thinking on this with respect to the Unix vs Windows question is to have an explicit commit method that is a no-op on Unix, and does the right thing on Windows (where committing must happen explicitly).
Should it be moved into its own crate?

alloc-fmt: Skip early frames of backtrace

In alloc-fmt's print_backtrace_and_abort, we currently print the entire backtrace including print_backtrace_and_abort itself and even the anonymous local closure inside that function. It would be good to figure out a way to only begin printing frames once we've reached the parent of print_backtrace_and_abort (which is the function invoking the macro).

Typo in project description

The project's description on GitHub is "Allocactors in Rust", there is an extra 'c'. Not a huge issue, just something that ought to be changed.

alloc-tls: Handle thread-local storage on platforms without #[thread_local]

On platforms without the #[thread_local] attribute, thread-local storage (TLS) for an allocator is particularly tricky because most obvious implementations (including the one implemented in the standard library) require allocation to work, and do so in such a way that detecting what state a TLS value is in requires allocation, meaning that recursion (caused by an allocation call accessing TLS, triggering further allocation) cannot be detected. While it's possible that this will be fixed at some point in the future, it won't be fixed any time soon.

Inspired by this comment on a jemalloc issue, our best bet may be to register a handler that is called by pthread whenever a thread spawns, and use this opportunity to preemptively initialize TLS for that thread.

AppVeyor: Mark appveyor.sh files executable

The various appveyor.sh scripts in each directory are currently not marked as executable. This hasn't actually proven to be a problem so far, but they should be anyway.

elfc: Switch to using malloc_bind

Currently, elfc implements the C API by hand. We should instead use malloc_bind. ~~Blocked by #12, but once that's done,~~ it's just a matter of going through the motions of using malloc_bind, which is very straightforward.

mmap-alloc: Support all Unices

Currently, mmap-alloc only supports Windows, Mac, and Linux. However, most of the logic on Mac and Linux are generic to most or all Unix systems. We should make it so as much of the logic as possible can be marked as #[cfg(unix)] rather than #[cfg(target_os = "linux")] or #[cfg(target_os = "macos")].

Replace offset with wrapping_offset

Currently we use offset in a number of crates. However, it is unsafe to use offset to implement an allocator because, according to the docs, "both the starting and resulting pointer must be either in bounds or one byte past the end of an allocated object."

Thus, we should switch to using wrapping_offset, which does not have this requirement.

According to the wrapping_offset docs, offset allows the compiler to make more aggressive optimizations, so we should be careful that this change does not regress performance.

Travis: Spurious test failures on Mac

Our Travis CI tests are failing on Mac before the core of the tests even gets executed. I suspect it's related to this issue, but I haven't been able to verify this.

C hooks

ay publish C hooks so that C code can use this allocator!

http://zsiciarz.github.io/24daysofrust/book/vol1/day23.html

Add links in documentation

Currently, most documentation comments do not include hyperlinks to other documentation when referencing other Rust objects.

mmap-alloc: debug_assert! that sizes are non-zero

It is undefined behavior (UB) to pass a zero-sized Layout to the Alloc trait's allocation methods. mmap-alloc's MapAlloc relies on size being non-zero for correctness. While this is allowed, we should still debug_assert! that requested sizes are zero in order to make debugging easier for our users.

slab-alloc: Be exception-safe

Currently, the slab allocator is not exception safe - if a user-provided initialization function panics and unwinds, the allocator may be left in an invalid state. We should fix this.

Exception safety in unsafe Rust code is a very subtle thing. A good place to start is The Rustinomicon.

elfmalloc: Add feature to remove smallest object size class

In the C allocation API (implemented by malloc-bind/elfc), a minimum alignment is guaranteed for all allocations.

Luckily for us, we're planning to make it so that every object size class guarantees the maximum possible alignment for any size that maps to that size class. This means that, for example, any 8-byte object is automatically 8-byte aligned.

Currently, in order to use this fact to satisfy the C API, we have to special-case allocations whose size is less than the minimum by rounding them up. This involves branching in the hot path, which we'd like to avoid.

Instead, we can avoid this by simply getting rid of any object size class smaller than the minimum alignment. Any allocation request with a size smaller than the minimum will automatically be served by the smallest object size class, and will thus automatically be properly aligned. Since this is only needed for the C API (not the Rust Alloc trait API), and would reduce performance for the Rust API (for small allocations), we should put this behavior behind a feature (off by default) which elfc enables.

elfmalloc: Come up with fallback strategy if overcommit isn't available

On systems without overcommitting, we can fail to allocate the creek. In these cases, we need a fallback option that works with the C interface (malloc-bind's Malloc trait).

mmap-alloc: Document that alloc_helper requires a page-multiple size

alloc_helper includes the following code:

let unmap_size = size - allocator.pagesize;
if unmap_size > 0 {
    unmap(allocator.pagesize as *mut u8, unmap_size);
}

This code is only correct if size is at least as large as allocator.pagesize. Currently, we always call alloc_helper with a size that is rounded up to the next multiple of the page size, so this is fine. However, we should a) document this requirement and, b) add a debug_assert! to ensure that it holds.

ezrosent / allocators-rs Goto Github PK

allocators-rs's People

Contributors

Stargazers

Watchers

Forkers

allocators-rs's Issues

Recommend Projects

Recommend Topics

Recommend Org