
Comments (6)

andrewrk commented on May 14, 2024

Here is a previous proposal for generic functions and containers:

pub struct List#(T: type) {
    items: ?&T,
    length: isize,
    capacity: isize,

    pub fn deinit(l: &List) {
        free(l.items);
        l.items = null;
    }

    pub fn append(l: &List, item: T) -> error {
        const err = l.ensure_capacity(l.length + 1);
        if err != error.None {
            return err;
        }
        const raw_items = l.items ?? unreachable;
        raw_items[l.length] = item;
        l.length += 1;
        return error.None;
    }

    pub fn at(l: List, index: usize) -> T {
        assert(index < l.length);
        const raw_items = l.items ?? unreachable;
        return raw_items[index];
    }

    pub fn ptr_at(l: &List, index: usize) -> &T {
        assert(index < l.length);
        const raw_items = l.items ?? unreachable;
        return &raw_items[index];
    }

    pub fn clear(l: &List) {
        l.length = 0;
    }

    pub fn pop(l: &List) -> T {
        assert(l.length >= 1);
        l.length -= 1;
        const raw_items = l.items ?? unreachable;
        return raw_items[l.length];
    }

    fn ensure_capacity(l: &List, new_capacity: usize) -> error {
        var better_capacity = max(l.capacity, 16);
        while better_capacity < new_capacity {
            better_capacity *= 2;
        }
        if better_capacity != l.capacity {
            const new_items = realloc(l.items, better_capacity) ?? { return error.NoMem };
            l.items = new_items;
            l.capacity = better_capacity;
        }
        error.None
    }
}

pub fn malloc#(T: type)(count: usize) -> ?&T { realloc(null, count) }

pub fn realloc#(T: type)(ptr: ?&T, new_count: usize) -> ?&T {

}

pub fn free#(T: type)(ptr: ?&T) {

}

Here is my new proposal.

The first simplification I make to the language is to require : before all types. : says to the parser: "expect a type now." So you would have to put ":" before every type lookup, for example:

var x :i32 = 2;

Note that this is exactly the same syntax so far. But let's look at more syntax:

// old
fn f() -> i32 {}
// new (unnecessary "->" deleted)
fn f() :i32 {}

//old
    const argc = asm("mov (%%rsp), %[argc]" : [argc] "=r" (-> isize));
// new (again unnecessary "->" deleted)
    const argc = asm("mov (%%rsp), %[argc]" , [argc] "=r" (:isize));

One nice consequence of this proposal is that # builtin functions (such as #sizeof()) no longer need special parser code. Everything can go in the @ namespace, like this: @sizeof(:u32). If you do @sizeof(some_var) you'll get a type mismatch error: expected metatype, got a normal type.
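For instance (a hypothetical sketch in the proposed syntax; none of this was ever implemented):

const size = @sizeof(:u32);    // ok: the argument is a metatype
var some_var :u32 = 0;
const bad = @sizeof(some_var); // compile error: expected metatype, got u32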

Further, all the @add_with_overflow_{i8,u8,i16,u16,i32,u32,i64,u64} builtins can be rolled up into one that accepts a metatype: @add_with_overflow(:u64, op1, op2, &result).
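Sketched side by side in the proposed syntax (hypothetical; only the u64 variant of the old builtins is shown):

// old: one builtin per integer type
var result_old: u64 = undefined;
const overflowed_old = @add_with_overflow_u64(op1, op2, &result_old);

// new: a single builtin taking a metatype as its first argument
var result: u64 = undefined;
const overflowed = @add_with_overflow(:u64, op1, op2, &result);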

One more thing about builtins. We need a function that serves the purpose of alloca, which I believe is the compiler's job. So I'd like an @alloca_array(type, count), where the type parameter is a metatype; it returns an array of count elements of the given type. Maybe also @alloca(type), which just returns a pointer to the given type.
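Hypothetical usage (the names @alloca_array and @alloca exist only in this sketch):

fn process(count: usize) {
    // stack-allocate count bytes; the size is known only at runtime
    var buf = @alloca_array(:u8, count);
    buf[0] = 1;

    // stack-allocate a single i32 and get a pointer to it
    var p = @alloca(:i32);
    *p = 0;
}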

Here's what an implementation of type-generic max looks like in the old proposal but with the new : syntax:

fn max#(T :type)(a :T, b :T) :T {
    if (a > b) a else b
}

Here's the new proposal:

fn max(T :type, a :T, b :T) :T {
    if (a > b) a else b
}

Now the type parameter T is just a normal parameter, of type metatype. Parameters can refer to previous parameters, so :T resolves correctly. :T is equivalent to: @child_type(@typeof(T)).

Example usage of this max implementation:

const x :i32 = 1234;
const y :i32 = 5678;
const z = max(@typeof(x), x, y);

This follows the pattern set by conditional compilation (#11) where the compiler uses normal (unambiguous) program semantics to accomplish things at compile time. For example, the implementation of max could switch on T or do if checks on it, and these would evaluate at compile time. So, type-generic functions don't need any fancy syntax. Type-generic containers still do, though:
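For example, max could do a compile-time check on T and the dead branch would be eliminated (a sketch in the proposed syntax; the @is_float builtin here is hypothetical):

fn max(T :type, a :T, b :T) :T {
    // T is known at compile time, so this if is resolved at compile time
    if (@is_float(T)) {
        // hypothetical float-specific handling: prefer a non-NaN operand
        if (a != a) b else if (a > b) a else b
    } else {
        if (a > b) a else b
    }
}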

pub struct List(T :type) {
    items :?&T,
    length :isize,
    capacity :isize,

    // ... member functions....
}

The only difference here is the # is gone in List(T :type). Since types are always marked with : we don't need # to indicate type generic parameters:

var list :List(:u8);

In the same way that enum semantics work, containers that have all void parameters (i.e. no parameters) can have the parentheses left off. (Perhaps, as briefly discussed, functions without parentheses aren't so silly after all either.)
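For example (a sketch), a parameterless container would be used without parentheses:

pub struct Point {
    x :i32,
    y :i32,
}

var p :Point;        // parentheses left off, rather than :Point()
var list :List(:u8); // parameters still required here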

Having : always mark types is consistent with the philosophy of making it very obvious what exactly the code is doing. It removes the ambiguity of:

foo.bar(1234)

foo could be an expression or a type. With this proposal, it is unambiguously an expression, and if it were a type it would look like:

:foo.bar(1234)

from zig.

andrewrk commented on May 14, 2024

Regarding alloca: just do this:

fn f(count: isize) {
    var array: [count]u8;
}

If the size component of an array type can't be evaluated as a constant expression, then the array will be allocated on the stack with a runtime size.


phase commented on May 14, 2024

Would whitespace be allowed after : for types? A lot of languages and formatting styles use name: type or name : type.


andrewrk commented on May 14, 2024

After talking to @thejoshwolfe I discarded the colon syntax. Now everywhere that a type is expected, an expression is valid. However, the expression must evaluate to a constant at compile time.
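That means any type position accepts an ordinary expression, as long as it is compile-time constant. A sketch, reusing the @compile_var builtin that appears later in this thread:

const Counter = if (@compile_var("is_release")) u64 else u32;
var count: Counter = 0;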

The language specification (#75) does not exist yet even in draft form, but you can look at some of the code in std/* or run_tests.cpp to see.


andrewrk commented on May 14, 2024

Current generics proposal. Here's an example of what generic sort would look like:

enum Cmp {
    Less,
    Equal,
    Greater,
}

struct OversizeInt(n: i32) {
    values: [n]u64,
}

const int256 = OversizeInt(4);
const int512 = OversizeInt(8);

fn compare(n: i32)(a: OversizeInt(n), b: OversizeInt(n)) -> Cmp {
    for (a.values) |_, i| {
        if (a.values[i] < b.values[i]) {
            return Cmp.Less;
        } else if (a.values[i] > b.values[i]) {
            return Cmp.Greater;
        }
    }
    return Cmp.Equal;
}

fn sort(T: type, cmp: fn(T, T)->Cmp)(array: []T) {
    // left as exercise for the reader ;-)
}

export const compare_int256 = compare(4);

fn foo() {
    var array: [100]int256 = undefined;

    sort(int256, compare(4))(array);
}

Malloc and free:

error OutOfMem;

fn malloc(T: type)(n: isize) -> %[]T {
    // ...
}

fn realloc(T: type)(mem: []T, new_len: isize) -> %[]T {
    // ...
}

fn free(T: type)(mem: []T) {
    // ...
}

fn foo() -> %void {
    const mem = malloc(u8)(100) %% |err| return err;
    defer free(u8)(mem);
}

List data structure:

struct List(T: type) {
    items: []T,
    length: isize,

    pub fn init() -> List(T) {
        List(T) {
            .items = []T{},
            .length = 0,
        }
    }

    pub fn deinit(l: &List(T)) {
        free(T)(l.items);
        l.items = null;
    }

    pub fn append(l: &List(T), item: T) -> %void {
        const new_length = l.length + 1;
        l.ensure_capacity(new_length) %% |err| return err;
        l.items[l.length] = item;
        l.length = new_length;
    }

    fn ensure_capacity(l: &List(T), new_capacity: isize) -> %void {
        var better_capacity = max(isize)(l.items.len, 16);
        while (better_capacity < new_capacity) {
            better_capacity *= 2;
        }
        if (better_capacity != l.items.len) {
            l.items = realloc(T)(l.items, better_capacity) %% |err| return err;
        }
    }
}

fn max(T: type)(a: T, b: T) -> T {
    return if (a > b) a else b;
}

fn foo() -> %void {
    var list = List(u8).init();
    defer list.deinit();

    list.append(1) %% |err| return err;
    list.append(250) %% |err| return err;
}


andrewrk commented on May 14, 2024

Here's my current proposal for HashMap:

const assert = @import("index.zig").assert;
const math = @import("math.zig");
const mem = @import("mem.zig");
const Allocator = mem.Allocator;

const want_modification_safety = !@compile_var("is_release");
const debug_u32 = if (want_modification_safety) u32 else void;

pub struct HashMap(K: type, V: type, hash: fn(key: K)->u32, eql: fn(a: K, b: K)->bool) {
    entries: []Entry,
    size: isize,
    max_distance_from_start_index: isize,
    allocator: &Allocator,
    // this is used to detect bugs where a hashtable is edited while an iterator is running.
    modification_count: debug_u32,

    const Self = HashMap(K, V, hash, eql);

    pub struct Entry {
        used: bool,
        distance_from_start_index: isize,
        key: K,
        value: V,
    }

    pub struct Iterator {
        hm: &Self,
        // how many items have we returned
        count: isize,
        // iterator through the entry array
        index: isize,
        // used to detect concurrent modification
        initial_modification_count: debug_u32,

        pub fn next(it: &Iterator) -> ?&Entry {
            if (want_modification_safety) {
                assert(it.initial_modification_count == it.hm.modification_count); // concurrent modification
            }
            if (it.count >= it.hm.size) return null;
            while (it.index < it.hm.entries.len; it.index += 1) {
                const entry = &it.hm.entries[it.index];
                if (entry.used) {
                    it.index += 1;
                    it.count += 1;
                    return entry;
                }
            }
            unreachable{} // no next item
        }
    }

    pub fn init(hm: &Self, allocator: &Allocator, capacity: isize) {
        assert(capacity > 0);
        hm.allocator = allocator;
        hm.init_capacity(capacity);
    }

    pub fn deinit(hm: &Self) {
        hm.allocator.free(hm.allocator, ([]u8)(hm.entries));
    }

    pub fn clear(hm: &Self) {
        for (hm.entries) |*entry| {
            entry.used = false;
        }
        hm.size = 0;
        hm.max_distance_from_start_index = 0;
        hm.increment_modification_count();
    }

    pub fn put(hm: &Self, key: K, value: V) {
        hm.increment_modification_count();
        hm.internal_put(key, value);

        // if we get too full (60%), double the capacity
        if (hm.size * 5 >= hm.entries.len * 3) {
            const old_entries = hm.entries;
            hm.init_capacity(hm.entries.len * 2);
            // dump all of the old elements into the new table
            for (old_entries) |*old_entry| {
                if (old_entry.used) {
                    hm.internal_put(old_entry.key, old_entry.value);
                }
            }
            hm.allocator.free(hm.allocator, ([]u8)(old_entries));
        }
    }

    pub fn get(hm: &Self, key: K) -> ?&Entry {
        return hm.internal_get(key);
    }

    pub fn remove(hm: &Self, key: K) {
        hm.increment_modification_count();
        const start_index = hm.key_to_index(key);
        {var roll_over: isize = 0; while (roll_over <= hm.max_distance_from_start_index; roll_over += 1) {
            const index = (start_index + roll_over) % hm.entries.len;
            var entry = &hm.entries[index];

            assert(entry.used); // key not found

            if (!eql(entry.key, key)) continue;

            while (roll_over < hm.entries.len; roll_over += 1) {
                const next_index = (start_index + roll_over + 1) % hm.entries.len;
                const next_entry = &hm.entries[next_index];
                if (!next_entry.used || next_entry.distance_from_start_index == 0) {
                    entry.used = false;
                    hm.size -= 1;
                    return;
                }
                *entry = *next_entry;
                entry.distance_from_start_index -= 1;
                entry = next_entry;
            }
            unreachable{} // shifting everything in the table
        }}
        unreachable{} // key not found
    }

    pub fn entry_iterator(hm: &Self) -> Iterator {
        return Iterator {
            .hm = hm,
            .count = 0,
            .index = 0,
            .initial_modification_count = hm.modification_count,
        };
    }

    fn init_capacity(hm: &Self, capacity: isize) {
        hm.entries = ([]Entry)(hm.allocator.alloc(hm.allocator, capacity * @sizeof(Entry)));
        hm.size = 0;
        hm.max_distance_from_start_index = 0;
        for (hm.entries) |*entry| {
            entry.used = false;
        }
    }

    fn increment_modification_count(hm: &Self) {
        if (want_modification_safety) {
            hm.modification_count += 1;
        }
    }

    fn internal_put(hm: &Self, orig_key: K, orig_value: V) {
        var key = orig_key;
        var value = orig_value;
        const start_index = hm.key_to_index(key);
        var roll_over: isize = 0;
        var distance_from_start_index: isize = 0;
        while (roll_over < hm.entries.len; {roll_over += 1; distance_from_start_index += 1}) {
            const index = (start_index + roll_over) % hm.entries.len;
            const entry = &hm.entries[index];

            if (entry.used && !eql(entry.key, key)) {
                if (entry.distance_from_start_index < distance_from_start_index) {
                    // robin hood to the rescue
                    const tmp = *entry;
                    hm.max_distance_from_start_index = math.max(isize)(
                        hm.max_distance_from_start_index, distance_from_start_index);
                    *entry = Entry {
                        .used = true,
                        .distance_from_start_index = distance_from_start_index,
                        .key = key,
                        .value = value,
                    };
                    key = tmp.key;
                    value = tmp.value;
                    distance_from_start_index = tmp.distance_from_start_index;
                }
                continue;
            }

            if (!entry.used) {
                // adding an entry. otherwise overwriting old value with
                // same key
                hm.size += 1;
            }

            hm.max_distance_from_start_index = math.max(isize)(distance_from_start_index, hm.max_distance_from_start_index);
            *entry = Entry {
                .used = true,
                .distance_from_start_index = distance_from_start_index,
                .key = key,
                .value = value,
            };
            return;
        }
        unreachable{} // put into a full map
    }

    fn internal_get(hm: &Self, key: K) -> ?&Entry {
        const start_index = hm.key_to_index(key);
        {var roll_over: isize = 0; while (roll_over <= hm.max_distance_from_start_index; roll_over += 1) {
            const index = (start_index + roll_over) % hm.entries.len;
            const entry = &hm.entries[index];

            if (!entry.used) return null;
            if (eql(entry.key, key)) return entry;
        }}
        return null;
    }

    fn key_to_index(hm: &Self, key: K) -> isize {
        return isize(hash(key)) % hm.entries.len;
    }
}

var global_allocator = Allocator {
    .alloc = global_alloc,
    .realloc = global_realloc,
    .free = global_free,
    .context = null,
};

var some_mem: [200]u8 = undefined;
var some_mem_index: isize = 0;

fn global_alloc(self: &Allocator, n: isize) -> %[]u8 {
    const result = some_mem[some_mem_index ... some_mem_index + n];
    some_mem_index += n;
    return result;
}

fn global_realloc(self: &Allocator, old_mem: []u8, new_size: isize) -> %[]u8 {
    const result = %return global_alloc(self, new_size);
    @memcpy(result.ptr, old_mem.ptr, old_mem.len);
    return result;
}

fn global_free(self: &Allocator, old_mem: []u8) {
}

#attribute("test")
fn basic_hash_map_test() {
    var map: HashMap(i32, i32, hash_i32, eql_i32);
    map.init(&global_allocator, 4);
    defer map.deinit();
}

fn hash_i32(x: i32) -> u32 {
    *(&u32)(&x)
}
fn eql_i32(a: i32, b: i32) -> bool {
    a == b
}

As you can see, it depends on the ability to put const variable declarations inside a struct, which I think makes sense. Maybe we should then change struct field syntax to var foo: type; like this:

pub struct HashMap(K: type, V: type, hash: fn(key: K)->u32, eql: fn(a: K, b: K)->bool) {
    var entries: []Entry;
    var size: isize;
    var max_distance_from_start_index: isize;
    var allocator: &Allocator;
    // this is used to detect bugs where a hashtable is edited while an iterator is running.
    var modification_count: debug_u32;

    // ...member functions as before...
}

But this raises the question: should there be an exception where variables in structs don't need to be initialized (with = undefined)? What happens if you do provide an initialization? Maybe this could just be a special case where variable initializations are not allowed? Maybe we should not allow variable declarations inside structs at all? Or maybe we keep the existing field syntax, and a variable declaration inside a struct is still a global variable, just scoped to the struct?

