
Comments (6)

andrewrk commented on May 14, 2024

Here is a previous proposal for generic functions and containers:

pub struct List#(T: type) {
    items: ?&T,
    length: isize,
    capacity: isize,

    pub fn deinit(l: &List) {
        free(l.items);
        l.items = null;
    }

    pub fn append(l: &List, item: T) -> error {
        const err = l.ensure_capacity(l.length + 1);
        if err != error.None {
            return err;
        }
        const raw_items = l.items ?? unreachable;
        raw_items[l.length] = item;
        l.length += 1;
        return error.None;
    }

    pub fn at(l: List, index: usize) -> T {
        assert(index < l.length);
        const raw_items = l.items ?? unreachable;
        return raw_items[index];
    }

    pub fn ptr_at(l: &List, index: usize) -> &T {
        assert(index < l.length);
        const raw_items = l.items ?? unreachable;
        return &raw_items[index];
    }

    pub fn clear(l: &List) {
        l.length = 0;
    }

    pub fn pop(l: &List) -> T {
        assert(l.length >= 1);
        l.length -= 1;
        const raw_items = l.items ?? unreachable;
        return raw_items[l.length];
    }

    fn ensure_capacity(l: &List, new_capacity: usize) -> error {
        var better_capacity = max(l.capacity, 16);
        while better_capacity < new_capacity {
            better_capacity *= 2;
        }
        if better_capacity != l.capacity {
            const new_items = realloc(l.items, better_capacity) ?? { return error.NoMem };
            l.items = new_items;
            l.capacity = better_capacity;
        }
        error.None
    }
}

pub fn malloc#(T: type)(count: usize) -> ?&T { realloc(null, count) }

pub fn realloc#(T: type)(ptr: ?&T, new_count: usize) -> ?&T {

}

pub fn free#(T: type)(ptr: ?&T) {

}

Here is my new proposal.

The first simplification I make to the language is to require : before all types. : says to the parser: "expect a type now." So you would have to put ":" before every type lookup, for example:

var x :i32 = 2;

Note that this is exactly the same syntax so far. But let's look at more syntax:

// old
fn f() -> i32 {}
// new (unnecessary "->" deleted)
fn f() :i32 {}

//old
    const argc = asm("mov (%%rsp), %[argc]" : [argc] "=r" (-> isize));
// new (again unnecessary "->" deleted)
    const argc = asm("mov (%%rsp), %[argc]" , [argc] "=r" (:isize));

One nice consequence of this proposal is that # builtin functions (such as #sizeof()) no longer need special parser code. Everything can go in the @ namespace, like this: @sizeof(:u32). If you do @sizeof(some_var) you'll get a type mismatch error: expected metatype, got a normal type.
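For instance (a hypothetical sketch in the proposed syntax; none of this was ever implemented):

const size = @sizeof(:u32);    // ok: the argument is a metatype
var some_var :u32 = 0;
const bad = @sizeof(some_var); // compile error: expected metatype, got u32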

Further, all the @add_with_overflow_{i8,u8,i16,u16,i32,u32,i64,u64} builtins can be rolled up into one that accepts a metatype: @add_with_overflow(:u64, op1, op2, &result).
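Sketched side by side in the proposed syntax (hypothetical; only the u64 variant of the old builtins is shown):

// old: one builtin per integer type
var result_old: u64 = undefined;
const overflowed_old = @add_with_overflow_u64(op1, op2, &result_old);

// new: a single builtin taking a metatype as its first argument
var result: u64 = undefined;
const overflowed = @add_with_overflow(:u64, op1, op2, &result);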

One more thing about builtins. We need a function that serves the purpose of alloca, which I believe is the compiler's job. So I'd like an @alloca_array(type, count), where the type parameter is a metatype; it returns an array of count elements of the given type. Maybe also @alloca(type), which just returns a pointer to the given type.
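Hypothetical usage (the names @alloca_array and @alloca exist only in this sketch):

fn process(count: usize) {
    // stack-allocate count bytes; the size is known only at runtime
    var buf = @alloca_array(:u8, count);
    buf[0] = 1;

    // stack-allocate a single i32 and get a pointer to it
    var p = @alloca(:i32);
    *p = 0;
}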

Here's what an implementation of type-generic max looks like in the old proposal but with the new : syntax:

fn max#(T :type)(a :T, b :T) :T {
    if (a > b) a else b
}

Here's the new proposal:

fn max(T :type, a :T, b :T) :T {
    if (a > b) a else b
}

Now the type parameter T is just a normal parameter, of type metatype. Parameters can refer to previous parameters, so :T resolves correctly. :T is equivalent to: @child_type(@typeof(T)).

Example usage of this max implementation:

const x :i32 = 1234;
const y :i32 = 5678;
const z = max(@typeof(x), x, y);

This follows the pattern set by conditional compilation (#11) where the compiler uses normal (unambiguous) program semantics to accomplish things at compile time. For example, the implementation of max could switch on T or do if checks on it, and these would evaluate at compile time. So, type-generic functions don't need any fancy syntax. Type-generic containers still do, though:
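For example, max could do a compile-time check on T and the dead branch would be eliminated (a sketch in the proposed syntax; the @is_float builtin here is hypothetical):

fn max(T :type, a :T, b :T) :T {
    // T is known at compile time, so this if is resolved at compile time
    if (@is_float(T)) {
        // hypothetical float-specific handling: prefer a non-NaN operand
        if (a != a) b else if (a > b) a else b
    } else {
        if (a > b) a else b
    }
}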

pub struct List(T :type) {
    items :?&T,
    length :isize,
    capacity :isize,

    // ... member functions....
}

The only difference here is the # is gone in List(T :type). Since types are always marked with : we don't need # to indicate type generic parameters:

var list :List(:u8);

In the same way that enum semantics work, containers that have all void parameters (i.e. no parameters) can have the parentheses left off. (Perhaps, as briefly discussed, functions without parentheses aren't so silly after all either.)
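For example (a sketch), a parameterless container would be used without parentheses:

pub struct Point {
    x :i32,
    y :i32,
}

var p :Point;        // parentheses left off, rather than :Point()
var list :List(:u8); // parameters still required here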

Having : always mark types is consistent with the philosophy of making it very obvious what exactly the code is doing. It removes the ambiguity of:

foo.bar(1234)

foo could be an expression or a type. With this proposal, it is unambiguously an expression, and if it were a type it would look like:

:foo.bar(1234)

from zig.

andrewrk commented on May 14, 2024

Regarding alloca: just do this:

fn f(count: isize) {
    var array: [count]u8;
}

If the size component of an array type can't be evaluated as a constant expression, then the array will be allocated on the stack with a runtime size.


phase commented on May 14, 2024

Would whitespace be allowed after : for types? A lot of languages and formatting styles use name: type or name : type.


andrewrk commented on May 14, 2024

After talking to @thejoshwolfe I discarded the colon syntax. Now everywhere that a type is expected, an expression is valid. However, the expression must evaluate to a constant at compile time.
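That means any type position accepts an ordinary expression, as long as it is compile-time constant. A sketch, reusing the @compile_var builtin that appears later in this thread:

const Counter = if (@compile_var("is_release")) u64 else u32;
var count: Counter = 0;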

The language specification (#75) does not exist yet even in draft form, but you can look at some of the code in std/* or run_tests.cpp to see.


andrewrk commented on May 14, 2024

Current generics proposal. Here's an example of what generic sort would look like:

enum Cmp {
    Less,
    Equal,
    Greater,
}

struct OversizeInt(n: i32) {
    values: [n]u64,
}

const int256 = OversizeInt(4);
const int512 = OversizeInt(8);

fn compare(n: i32)(a: OversizeInt(n), b: OversizeInt(n)) -> Cmp {
    for (a.values) |_, i| {
        if (a.values[i] < b.values[i]) {
            return Cmp.Less;
        } else if (a.values[i] > b.values[i]) {
            return Cmp.Greater;
        }
    }
    return Cmp.Equal;
}

fn sort(T: type, cmp: fn(T, T)->Cmp)(array: []T) {
    // left as exercise for the reader ;-)
}

export const compare_int256 = compare(4);

fn foo() {
    var array: [100]int256 = undefined;

    sort(int256, compare(4))(array);
}

Malloc and free:

error OutOfMem;

fn malloc(T: type)(n: isize) -> %[]T {
    // ...
}

fn realloc(T: type)(mem: []T, new_len: isize) -> %[]T {
    // ...
}

fn free(T: type)(mem: []T) {
    // ...
}

fn foo() -> %void {
    const mem = malloc(u8)(100) %% |err| return err;
    defer free(u8)(mem);
}

List data structure:

struct List(T: type) {
    items: []T,
    length: isize,

    pub fn init() -> List(T) {
        List(T) {
            .items = []T{},
            .length = 0,
        }
    }

    pub fn deinit(l: &List(T)) {
        free(T)(l.items);
        l.items = null;
    }

    pub fn append(l: &List(T), item: T) -> %void {
        const new_length = l.length + 1;
        l.ensure_capacity(new_length) %% |err| return err;
        l.items[l.length] = item;
        l.length = new_length;
    }

    fn ensure_capacity(l: &List(T), new_capacity: isize) -> %void {
        var better_capacity = max(isize)(l.items.len, 16);
        while (better_capacity < new_capacity) {
            better_capacity *= 2;
        }
        if (better_capacity != l.items.len) {
            l.items = realloc(T)(l.items, better_capacity) %% |err| return err;
        }
    }
}

fn max(T: type)(a: T, b: T) -> T {
    return if (a > b) a else b;
}

fn foo() -> %void {
    var list = List(u8).init();
    defer list.deinit();

    list.append(1) %% |err| return err;
    list.append(250) %% |err| return err;
}


andrewrk commented on May 14, 2024

Here's my current proposal for HashMap:

const assert = @import("index.zig").assert;
const math = @import("math.zig");
const mem = @import("mem.zig");
const Allocator = mem.Allocator;

const want_modification_safety = !@compile_var("is_release");
const debug_u32 = if (want_modification_safety) u32 else void;

pub struct HashMap(K: type, V: type, hash: fn(key: K)->u32, eql: fn(a: K, b: K)->bool) {
    entries: []Entry,
    size: isize,
    max_distance_from_start_index: isize,
    allocator: &Allocator,
    // this is used to detect bugs where a hashtable is edited while an iterator is running.
    modification_count: debug_u32,

    const Self = HashMap(K, V, hash, eql);

    pub struct Entry {
        used: bool,
        distance_from_start_index: isize,
        key: K,
        value: V,
    }

    pub struct Iterator {
        hm: &Self,
        // how many items have we returned
        count: isize,
        // iterator through the entry array
        index: isize,
        // used to detect concurrent modification
        initial_modification_count: debug_u32,

        pub fn next(it: &Iterator) -> ?&Entry {
            if (want_modification_safety) {
                assert(it.initial_modification_count == it.hm.modification_count); // concurrent modification
            }
            if (it.count >= it.hm.size) return null;
            while (it.index < it.hm.entries.len; it.index += 1) {
                const entry = &it.hm.entries[it.index];
                if (entry.used) {
                    it.index += 1;
                    it.count += 1;
                    return entry;
                }
            }
            unreachable{} // no next item
        }
    }

    pub fn init(hm: &Self, allocator: &Allocator, capacity: isize) {
        assert(capacity > 0);
        hm.allocator = allocator;
        hm.init_capacity(capacity);
    }

    pub fn deinit(hm: &Self) {
        hm.allocator.free(hm.allocator, ([]u8)(hm.entries));
    }

    pub fn clear(hm: &Self) {
        for (hm.entries) |*entry| {
            entry.used = false;
        }
        hm.size = 0;
        hm.max_distance_from_start_index = 0;
        hm.increment_modification_count();
    }

    pub fn put(hm: &Self, key: K, value: V) {
        hm.increment_modification_count();
        hm.internal_put(key, value);

        // if we get too full (60%), double the capacity
        if (hm.size * 5 >= hm.entries.len * 3) {
            const old_entries = hm.entries;
            hm.init_capacity(hm.entries.len * 2);
            // dump all of the old elements into the new table
            for (old_entries) |*old_entry| {
                if (old_entry.used) {
                    hm.internal_put(old_entry.key, old_entry.value);
                }
            }
            hm.allocator.free(hm.allocator, ([]u8)(old_entries));
        }
    }

    pub fn get(hm: &Self, key: K) -> ?&Entry {
        return hm.internal_get(key);
    }

    pub fn remove(hm: &Self, key: K) {
        hm.increment_modification_count();
        const start_index = hm.key_to_index(key);
        {var roll_over: isize = 0; while (roll_over <= hm.max_distance_from_start_index; roll_over += 1) {
            const index = (start_index + roll_over) % hm.entries.len;
            var entry = &hm.entries[index];

            assert(entry.used); // key not found

            if (!eql(entry.key, key)) continue;

            while (roll_over < hm.entries.len; roll_over += 1) {
                const next_index = (start_index + roll_over + 1) % hm.entries.len;
                const next_entry = &hm.entries[next_index];
                if (!next_entry.used || next_entry.distance_from_start_index == 0) {
                    entry.used = false;
                    hm.size -= 1;
                    return;
                }
                *entry = *next_entry;
                entry.distance_from_start_index -= 1;
                entry = next_entry;
            }
            unreachable{} // shifting everything in the table
        }}
        unreachable{} // key not found
    }

    pub fn entry_iterator(hm: &Self) -> Iterator {
        return Iterator {
            .hm = hm,
            .count = 0,
            .index = 0,
            .initial_modification_count = hm.modification_count,
        };
    }

    fn init_capacity(hm: &Self, capacity: isize) {
        hm.entries = ([]Entry)(hm.allocator.alloc(hm.allocator, capacity * @sizeof(Entry)));
        hm.size = 0;
        hm.max_distance_from_start_index = 0;
        for (hm.entries) |*entry| {
            entry.used = false;
        }
    }

    fn increment_modification_count(hm: &Self) {
        if (want_modification_safety) {
            hm.modification_count += 1;
        }
    }

    fn internal_put(hm: &Self, orig_key: K, orig_value: V) {
        var key = orig_key;
        var value = orig_value;
        const start_index = hm.key_to_index(key);
        var roll_over: isize = 0;
        var distance_from_start_index: isize = 0;
        while (roll_over < hm.entries.len; {roll_over += 1; distance_from_start_index += 1}) {
            const index = (start_index + roll_over) % hm.entries.len;
            const entry = &hm.entries[index];

            if (entry.used && !eql(entry.key, key)) {
                if (entry.distance_from_start_index < distance_from_start_index) {
                    // robin hood to the rescue
                    const tmp = *entry;
                    hm.max_distance_from_start_index = math.max(isize)(
                        hm.max_distance_from_start_index, distance_from_start_index);
                    *entry = Entry {
                        .used = true,
                        .distance_from_start_index = distance_from_start_index,
                        .key = key,
                        .value = value,
                    };
                    key = tmp.key;
                    value = tmp.value;
                    distance_from_start_index = tmp.distance_from_start_index;
                }
                continue;
            }

            if (!entry.used) {
                // adding an entry. otherwise overwriting old value with
                // same key
                hm.size += 1;
            }

            hm.max_distance_from_start_index = math.max(isize)(distance_from_start_index, hm.max_distance_from_start_index);
            *entry = Entry {
                .used = true,
                .distance_from_start_index = distance_from_start_index,
                .key = key,
                .value = value,
            };
            return;
        }
        unreachable{} // put into a full map
    }

    fn internal_get(hm: &Self, key: K) -> ?&Entry {
        const start_index = hm.key_to_index(key);
        {var roll_over: isize = 0; while (roll_over <= hm.max_distance_from_start_index; roll_over += 1) {
            const index = (start_index + roll_over) % hm.entries.len;
            const entry = &hm.entries[index];

            if (!entry.used) return null;
            if (eql(entry.key, key)) return entry;
        }}
        return null;
    }

    fn key_to_index(hm: &Self, key: K) -> isize {
        return isize(hash(key)) % hm.entries.len;
    }
}

var global_allocator = Allocator {
    .alloc = global_alloc,
    .realloc = global_realloc,
    .free = global_free,
    .context = null,
};

var some_mem: [200]u8 = undefined;
var some_mem_index: isize = 0;

fn global_alloc(self: &Allocator, n: isize) -> %[]u8 {
    const result = some_mem[some_mem_index ... some_mem_index + n];
    some_mem_index += n;
    return result;
}

fn global_realloc(self: &Allocator, old_mem: []u8, new_size: isize) -> %[]u8 {
    const result = %return global_alloc(self, new_size);
    @memcpy(result.ptr, old_mem.ptr, old_mem.len);
    return result;
}

fn global_free(self: &Allocator, old_mem: []u8) {
}

#attribute("test")
fn basic_hash_map_test() {
    var map: HashMap(i32, i32, hash_i32, eql_i32);
    map.init(&global_allocator, 4);
    defer map.deinit();
}

fn hash_i32(x: i32) -> u32 {
    *(&u32)(&x)
}
fn eql_i32(a: i32, b: i32) -> bool {
    a == b
}

As you can see, it depends on the ability to put const variable declarations inside a struct, which I think makes sense. Maybe we should then change struct field syntax to var foo: type; like this:

pub struct HashMap(K: type, V: type, hash: fn(key: K)->u32, eql: fn(a: K, b: K)->bool) {
    var entries: []Entry;
    var size: isize;
    var max_distance_from_start_index: isize;
    var allocator: &Allocator;
    // this is used to detect bugs where a hashtable is edited while an iterator is running.
    var modification_count: debug_u32;

    // ...member functions as before...
}

But this raises the question: should there be an exception where variables in structs don't need to be initialized (with = undefined)? What happens if you do provide an initialization? Maybe this could just be a special case where variable initializations are not allowed? Maybe we should not allow variable declarations inside structs at all? Or maybe we keep the existing field syntax, and a variable declaration inside a struct is still a global variable, just scoped to the struct?

