I want to write a function that will read a text file, and return its lines as an array of strings for further processing.

To this end, I am trying to get my head around correctly handling an N-length array of strings, for N determined at runtime.

I had this basic example with a hard-coded array of length two. It runs fine, and does not report any leaks:

const std = @import("std");
const stdout = std.io.getStdOut().writer();
var gpa = std.heap.GeneralPurposeAllocator(.{}){};

pub fn main() !void {
    const alloc = gpa.allocator();
    defer _ = gpa.deinit();

    const names = try getNames(alloc);
    defer destroyNames(alloc, names);

    for(names) |name| {
        try stdout.print("Hello {s}\n", .{name});
    }
}


fn destroyNames(alloc:std.mem.Allocator, names:*[2][]const u8) void {
    for(names) |name| {
        // Correctly frees the '[]const u8'
        alloc.free(name);
    }
    _ = alloc.destroy(names);
}


fn getNames(alloc:std.mem.Allocator) !*[2][]const u8 {
    const names = [_][]const u8{"Alix", "Bub"};

    const new_names = try alloc.create([2][]const u8);
    const title = "Prof";

    for(names, 0..) |name, i| {
        new_names[i] = try std.fmt.allocPrint(alloc, "{s} {s}", .{title, name} );
    }

    return new_names;
}

Satisfied with this, I moved just a couple of things around so that the length is no longer hard-coded in the types. The following runs, but crashes in the deferred destroyNames call:

const std = @import("std");
const stdout = std.io.getStdOut().writer();
var gpa = std.heap.GeneralPurposeAllocator(.{}){};

pub fn main() !void {
    const alloc = gpa.allocator();
    defer _ = gpa.deinit();

    const names = try getNames(alloc);
    defer destroyNames(alloc, names);

    for(names.*) |name| {
        try stdout.print("Hello {s}\n", .{name});
    }
}


fn destroyNames(alloc:std.mem.Allocator, names:*[][]const u8) void {
    for(names.*) |name| {
        // ERROR - core dumps here...!
        // Cannot properly free a '[]const u8' supplied like this
        alloc.free(name);
    }
    _ = alloc.destroy(names);
}


fn getNames(alloc:std.mem.Allocator) !*[][]const u8 {
    // Stand-in. Hypothetically, read from a N-line file...
    const names = [_][]const u8{"Alix", "Bub"};

    var new_names:[][]const u8 = undefined;
    new_names = try alloc.create([names.len][]const u8);
    const title = "Prof";

    for(names, 0..) |name, i| {
        new_names[i] = try std.fmt.allocPrint(alloc, "{s} {s}", .{title, name} );
    }

    return &new_names;
}

Running it results in:

$ zig run length-general.zig 
Hello Prof Alix
Hello Prof Bub
General protection exception (no address available)
/home/tai/.local/var/zig/zig-linux-x86_64-0.14.0-dev.1366+d997ddaa1/lib/compiler_rt/memset.zig:19:14: 0x10f7f10 in memset (compiler_rt)
            d[0] = c;
             ^
/home/tai/.local/var/zig/zig-linux-x86_64-0.14.0-dev.1366+d997ddaa1/lib/std/mem/Allocator.zig:313:26: 0x1040e57 in free__anon_3449 (length-general)
    @memset(non_const_ptr[0..bytes_len], undefined);
                         ^
/home/tai/scratch-zig/length-general.zig:26:19: 0x103bacc in destroyNames (length-general)
        alloc.free(name);
                  ^
/home/tai/scratch-zig/length-general.zig:14:23: 0x103b59c in main (length-general)
    defer destroyNames(alloc, names);
                      ^
/home/tai/.local/var/zig/zig-linux-x86_64-0.14.0-dev.1366+d997ddaa1/lib/std/start.zig:615:37: 0x103ad2f in posixCallMainAndExit (length-general)
            const result = root.main() catch |err| {
                                    ^
/home/tai/.local/var/zig/zig-linux-x86_64-0.14.0-dev.1366+d997ddaa1/lib/std/start.zig:250:5: 0x103a90f in _start (length-general)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)
Aborted (core dumped)

As a check, I tried alloc.free(name.*) in both examples. Both reported the same error: index syntax required for slice type '[]const u8', which suggests to me that both versions do receive a []const u8.

How can I fix these errors?

1 Answer

In the modified version, getNames returns a pointer to a local variable that lives on the stack. The compiler accepts this, but the pointer dangles as soon as the function returns, so using it is a bug.

In Zig, the array-like types are easy to confuse:

  • [N]u8 is an "array", a value type whose length N is part of the type
  • []u8 is a "slice", a pointer type. It is similar to struct {ptr: [*]u8, len: usize}
  • [*]u8 is a "many-item pointer", like *u8 but it allows array indexing syntax and carries no length.
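
To make the distinction between these three types concrete, here is a minimal sketch. The helper sum and all variable names are made up for illustration; only the type syntax comes from the list above.

```zig
const std = @import("std");

// Hypothetical helper: a slice carries its own length, so no extra
// length parameter is needed.
fn sum(values: []const u8) u32 {
    var total: u32 = 0;
    for (values) |v| total += v;
    return total;
}

pub fn main() void {
    // [3]u8: an array. It is a value; assigning it copies all three bytes.
    var array = [3]u8{ 1, 2, 3 };
    var copy = array;
    copy[0] = 99; // changes only `copy`, not `array`

    // *[3]u8: a pointer to an array. It coerces implicitly to a slice.
    const array_ptr: *[3]u8 = &array;
    const slice: []u8 = array_ptr;
    slice[1] = 20; // writes through to `array`

    // [*]u8: a many-item pointer. Indexable, but it has no length.
    const many: [*]u8 = slice.ptr;
    many[2] = 30; // also writes through to `array`

    std.debug.print("{d} {d} {d}\n", .{ array[0], slice.len, sum(&array) });
}
```

Note that slice and many both refer to the memory of array, so they are only valid for as long as array itself is.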

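A pointer to an array such as *[2]u8 can be implicitly coerced to a slice []u8, which is why assigning the result of alloc.create(...) to new_names compiles. You are then returning &new_names, which is a pointer to the slice variable itself, and that variable lives on the stack. Stack data gets overwritten after you return from the function, so by the time destroyNames runs, the pointer and length it reads are garbage and alloc.free crashes.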

The solution is to return [][]const u8 instead of *[][]const u8.


Another thing to note:

try alloc.create([names.len][]const u8);

The length of an array type must be known at comptime, so this call only works because names.len is known at comptime to be 2. If you were reading from a file, you would not know the length at comptime and this would not compile. You likely want alloc.alloc() here, which takes the element type and a runtime length and returns a slice:

try alloc.alloc([]const u8, names.len);

and alloc.free() to free it.
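
Putting the points so far together, one way the second example could look is sketched below. It is not the only possible fix. Passing the input names as a parameter and the errdefer cleanup are additions for illustration, and the title is folded into the format string; the pairing of alloc.alloc with alloc.free and the [][]const u8 return type are the changes described above.

```zig
const std = @import("std");

// Builds one heap-allocated string per input name. The outer slice and
// every inner string are owned by the caller.
fn getNames(alloc: std.mem.Allocator, names: []const []const u8) ![][]const u8 {
    // Runtime length: alloc.alloc returns a slice, not a pointer to an array.
    const new_names = try alloc.alloc([]const u8, names.len);
    var filled: usize = 0;
    // If a later allocation fails, free what was already allocated.
    errdefer {
        for (new_names[0..filled]) |name| alloc.free(name);
        alloc.free(new_names);
    }

    for (names, 0..) |name, i| {
        new_names[i] = try std.fmt.allocPrint(alloc, "Prof {s}", .{name});
        filled = i + 1;
    }
    // Return the slice by value; it points at heap memory, not the stack.
    return new_names;
}

fn destroyNames(alloc: std.mem.Allocator, names: [][]const u8) void {
    for (names) |name| alloc.free(name);
    // Memory obtained from alloc.alloc is released with alloc.free.
    alloc.free(names);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const alloc = gpa.allocator();

    const input = [_][]const u8{ "Alix", "Bub" };
    const names = try getNames(alloc, &input);
    defer destroyNames(alloc, names);

    for (names) |name| std.debug.print("Hello {s}\n", .{name});
}
```

The slice is returned by value: copying it copies only the pointer and the length, while the elements stay on the heap until destroyNames frees them.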


Another thing:

var new_names:[][]const u8 = undefined;
new_names = ...;

There is no reason to initialize a variable to undefined when you assign it on the very next line. Declare it with its value instead:

const new_names = ...;
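
For the stated goal of returning a file's lines as an array of strings, the same slice-of-slices pattern applies with a length computed at runtime. The following is a rough sketch, assuming the file contents have already been read into memory as text. The names splitLines and freeLines are hypothetical, and a trailing newline in the input produces a final empty line.

```zig
const std = @import("std");

// Hypothetical helper: copies each '\n'-separated line of `text` into its
// own allocation and returns the lines as an owned slice of slices.
fn splitLines(alloc: std.mem.Allocator, text: []const u8) ![][]u8 {
    // N is only known at runtime: one more piece than there are separators.
    const n = std.mem.count(u8, text, "\n") + 1;
    const lines = try alloc.alloc([]u8, n);
    var filled: usize = 0;
    // If a later allocation fails, free what was already allocated.
    errdefer {
        for (lines[0..filled]) |line| alloc.free(line);
        alloc.free(lines);
    }

    var it = std.mem.splitScalar(u8, text, '\n');
    while (it.next()) |line| {
        lines[filled] = try alloc.dupe(u8, line);
        filled += 1;
    }
    return lines;
}

// Frees every line, then the outer slice.
fn freeLines(alloc: std.mem.Allocator, lines: [][]u8) void {
    for (lines) |line| alloc.free(line);
    alloc.free(lines);
}

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer _ = gpa.deinit();
    const alloc = gpa.allocator();

    const lines = try splitLines(alloc, "first\nsecond\nthird");
    defer freeLines(alloc, lines);
    for (lines, 0..) |line, i| std.debug.print("{d}: {s}\n", .{ i, line });
}
```

Counting the separators first avoids needing a growable list, at the cost of scanning the text twice.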

6 Comments

Thanks for the explanation. I am getting the sense that I am not thinking about the task idiomatically... in more abstract languages I would happily have an N-array of various-length strings, all lengths unknown at compile time, resulting in an "arbitrary array of arbitrary strings". I think this is not a systems/memory way of thinking. Is there a "pattern" or idiomatic practice for this kind of use-case? Presumably a struct to represent a line, and a more complex "-list" type item?
(I have indeed played around with some of the changes you suggested, but so far not brought the code back to even a working-but-for-segfault state....)
> I would happily have an N-array of various-length strings, all lengths unknown at compile time, resulting in an "arbitrary array of arbitrary strings"
This is completely possible, but it has to be a slice of slices. [][]u8 is perfectly valid because it is a pointer (plus a length) to a contiguous run of slices in memory. Your second code block worked for me after these changes:
  • in main: for(names.*) -> for(names)
  • in destroyNames: names:*[][]const u8 -> names:[][]const u8, for(names.*) -> for(names), alloc.destroy(names) -> alloc.free(names)
  • in getNames: !*[][]const u8 -> ![][]const u8, return &new_names -> return new_names
.... I had done all that, except for destroy -> free because I had the item at names allocated using create ... I thought create had to entail use of destroy ? By not changing that, I was getting to a compile time error I didn't know what to do with .../lib/std/mem/Allocator.zig:113:28: error: ptr must be a single item pointer ...
Create should mean use destroy. free happens to work fine here only because the size and alignment are the same. You should switch create to alloc if you're going to use free, or alternatively return *[2][]const u8 from the fn, which is the type create returns.