tauoverpi / zangle

License: GNU Affero General Public License v3.0

Zig 100.00%

zangle's Introduction

Zangle

Zangle is a literate programming tool for extracting code fragments from markdown and other types of text documents into separate files ready for compilation.

NOTE: Currently zangle only supports markdown.

Community

matrix: https://matrix.to/#/#zangle:tchncs.de

Building

Minimum requirements:
- Zig 0.11.0-dev.4410+76f7b40e1 or higher

$ zig build -Drelease

Invocation

Let book/ be a directory of markdown files.

Tangle all files within a document

$ zangle tangle book/

List all files in a document

$ zangle ls book/

Render the content of a tag to stdout

$ zangle call book/ --tag='interpreter step'

Render a graph representing document structure

$ zangle graph book/ | dot -Tpng -o graph.png

Render a graph representing the structure of a single file output

$ zangle graph book/ --file=lib/Linker.zig | dot -Tpng -o graph.png

Find where given tags reside within output files (TODO)

$ zangle find README.md --tag='parser codegen' --tag='command-line parser'

Create a new literate document from existing files (TODO)

$ find src lib -name '*.zig' | zangle init build.zig --stdin > Zangle.md

Example

This project fetches the real package from sr.ht using the new Zig package manager. Most options are the same as in the init-exe template, with a few minor changes. The general structure follows:

const std = @import("std");

pub fn build(b: *std.Build) void {
    [[declare release and target options]]
    [[import zangle from the dependency list, set target parameters, and install the artifact]]
    [[setup a run command such that it can be tested without having write the path to the binary in zig-out]]
}

For the target options, .ReleaseSafe was chosen such that the program would panic as soon as it invokes safety-checked undefined behaviour.

const target = b.standardTargetOptions(.{});
const optimize = b.standardOptimizeOption(.{
    .preferred_optimize_mode = .ReleaseSafe,
});

The package is hosted on sr.ht as a subproject of a game project, which uses zangle to document every design choice made for all code included in the final game.

.{
    .name = "zangle",
    .version = "0.3.0",

    .dependencies = .{
        .zangle = .{
            .url = "https://git.sr.ht/~tauoverpi/levy/archive/b449d537902961e4d5d7704378f00655068d5cf1.tar.gz",
        },
    },
}

In build.zig, the real zangle is loaded as a dependency and set to follow the local target and optimization configuration.

const dep = b.dependency("zangle", .{});
const zangle = dep.artifact("zangle");
zangle.target = target;
zangle.optimize = optimize;

It is then installed with b.installArtifact(), which also ensures that the executable is built when zig build is invoked.

b.installArtifact(zangle);

Finally, testing out zangle should require no more than zig build run to invoke it.

const run_cmd = b.addRunArtifact(zangle);

run_cmd.step.dependOn(b.getInstallStep());

if (b.args) |args| {
    run_cmd.addArgs(args);
}

const run_step = b.step("run", "Run the app");
run_step.dependOn(&run_cmd.step);

This concludes the example zangle document, which writes two files, one of which is assembled from other tags.

zangle's Issues

Vulnerability: No filtering of '..' in non-absolute paths

The zangle code produced by README.md includes a flag that enables absolute paths. Leaving them disabled by default is intended to prevent escape from the working directory, since such an escape would let an untrusted document perform malicious actions with the user's privileges.

However, this can be bypassed without the flag on Linux systems using the double dot file system syntax. Including an arbitrary number of '../' sections in the relative path would allow for the escape of the working directory up to root '/', from which point the malicious file context can descend again into accessible directories. Presently this is the only check for the path.

if (path[0] == '/' and !options.allow_absolute_paths) {
    return error.@"Absolute paths disabled; use --allow-absolute-paths to enable them.";
}

I propose this is an oversight that needs to be corrected for safety and usability purposes with a more robust check of the paths provided for each file context. At minimum the '..' syntax should be filtered against, and a measure taken to ensure a similar issue cannot occur on Windows.
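
The kind of check proposed here could look roughly like the sketch below. It is not taken from zangle: `isSafeRelativePath` is a hypothetical helper name, and treating the backslash and a `C:`-style drive prefix as Windows hazards is an assumption made for illustration. The sketch is deliberately conservative, so it also rejects a few names that would be legal on Linux.

```zig
const std = @import("std");

/// Hypothetical helper (not part of zangle): returns true only when `path`
/// is relative and contains no `..` component.
fn isSafeRelativePath(path: []const u8) bool {
    if (path.len == 0) return false;
    // Reject POSIX absolute paths and Windows-style rooted paths.
    if (path[0] == '/' or path[0] == '\\') return false;
    // Reject anything that looks like a Windows drive prefix such as `C:`.
    if (path.len >= 2 and path[1] == ':') return false;

    // Walk the components, treating both separators alike (assumption).
    var start: usize = 0;
    var i: usize = 0;
    while (i <= path.len) : (i += 1) {
        if (i == path.len or path[i] == '/' or path[i] == '\\') {
            if (std.mem.eql(u8, path[start..i], "..")) return false;
            start = i + 1;
        }
    }
    return true;
}
```

Comparing whole components rather than searching for the substring `..` keeps names such as `a..b` usable while still rejecting `a/../../b`.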

Vulnerability: No symlink checking to prevent escape from working directory

Adding to the issues faced in #1, I realised that the path needs to be fully evaluated after the base 'absolute path' and '..' checks to compensate for the existence of symlinks. A symlink that leaves the working directory would be yet another way of accessing parts of the file system that zangle shouldn't reach.

This issue is more complex because symlinks differ significantly across platforms and filesystems, and because the behaviour depends on whether the link is a hard link or a symbolic (soft) link, but it is still a crucial factor for security purposes. The simplest solution would be to resolve each file path completely and compare the resulting absolute path with the working directory, but I am not personally familiar with how that is done to account for symlinking.

As long as the absolute path to the file starts with the absolute path of the working directory, that check together with the fixes from #3 should be sufficient to protect against malicious or unintentional modification of files outside of the working directory. A default behaviour of refusing to overwrite existing files would also be advisable, since a clean or diff command can always be provided as well.
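
The prefix comparison described above has one pitfall worth illustrating: a plain "starts with" test would accept /home/u/book-evil as being inside /home/u/book. The sketch below is hypothetical and not zangle code. It assumes both paths are already absolute, use '/' as the separator, and have had their symlinks resolved by some realpath-style call that is not shown here.

```zig
const std = @import("std");

/// Hypothetical helper: true when `resolved` equals `root` or lies beneath it.
/// Both arguments are assumed to be absolute, symlink-resolved, '/'-separated.
fn isWithin(root: []const u8, resolved: []const u8) bool {
    if (root.len == 0) return false;
    if (!std.mem.startsWith(u8, resolved, root)) return false;
    // Identical paths count as inside.
    if (resolved.len == root.len) return true;
    // A root that already ends in '/' (for example "/") contains any longer path.
    if (root[root.len - 1] == '/') return true;
    // Otherwise the prefix must end exactly on a component boundary.
    return resolved[root.len] == '/';
}
```

The boundary test on the character after the prefix is what separates a real child directory from a sibling whose name merely begins with the same text.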

Feature: Reverse graph for block annotation and block reference search

The graph command produces the top down reference tree for the program, but in complex programs, finding the parents of a block could be quite difficult and searching the entire graph is not exactly ideal.

Providing a reversed graph with block targeting would allow a user to quickly identify every place where a given block is used. This is particularly useful for reverse engineering, and it would also provide a way to automatically annotate code blocks with in-document links to all parents that use the block, so users can navigate up and down the codebase within the document itself.

As an example, for a markdown file containing:

Output a set of words
    lang: zig esc: [[]] tag: #just return a string
    ---------------------------------
    pub fn test() []u8 {
        const str : []u8 = "[[words:from('text')]]";
        return str;
    }

These are the words to be said
    lang: text esc: none tag: #words
    --------------------------------
    arbitrary text

The query of zangle graph --reverse --target=#words example.md might output something along the lines of (as a simple single-target list output example):

[ "#words", [.{.tag="#just return a string", .line=2, .character=25}]]

This could then be used by other tools, or by the cleanup command, to annotate blocks with references and produce, for example, the following:

Output a set of words
#just return a string

pub fn test() []u8 {
    const str : []u8 = "[[words:from('text')]]";
    return str;
}

Code block referenced in: example.md, line 2, column 5

These are the words to be said
#words

arbitrary text

Code block referenced in: #just return a string, line 2, column 25

The flexibility of this kind of reference extraction would also be quite useful for debugging and potential integration with document editors, such as vim plugins.
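
As a rough sketch of the reverse lookup requested in this issue, the code below filters a flat list of references by their target. Nothing here reflects zangle's internals: the `Edge` record and `referencesTo` are hypothetical names, and the sketch assumes the forward graph is available as parent-to-child edges with a source position.

```zig
const std = @import("std");

/// Hypothetical edge record: the block tagged `parent` references `child`
/// at the given position.
const Edge = struct {
    parent: []const u8,
    child: []const u8,
    line: usize,
    column: usize,
};

/// Copies every edge whose child is `target` into `out`, stopping when `out`
/// is full, and returns the filled part of `out`.
fn referencesTo(edges: []const Edge, target: []const u8, out: []Edge) []Edge {
    var n: usize = 0;
    for (edges) |edge| {
        if (n == out.len) break;
        if (std.mem.eql(u8, edge.child, target)) {
            out[n] = edge;
            n += 1;
        }
    }
    return out[0..n];
}
```

A linear scan is enough for a single --target query. Annotating every block in a document would more likely call for building a map from child tag to parent list once.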

Poor error for duplicate files

Issue:

Running commands when multiple inputs generate the same file will result in the following error message:

Unhandled error: error.DuplicateFile

This is a bug in Zangle, please report it on the issue tracker
along with any information that may help in resolving the bug.

  https://github.com/tauoverpi/zangle/issues/new

Expected behaviour:

The error-case should actually be handled and it should state which files were duplicated.
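
One way the duplicate could be reported is sketched below. It is an illustration only: `Output`, `Duplicate`, and `findDuplicate` are hypothetical names, and the sketch assumes each output file is recorded together with the document that declared it.

```zig
const std = @import("std");

/// Hypothetical record: `source` is the document that declares output `file`.
const Output = struct {
    source: []const u8,
    file: []const u8,
};

/// Hypothetical result naming both declarations of the same output file.
const Duplicate = struct {
    first: Output,
    second: Output,
};

/// Returns the first pair of entries that declare the same output file,
/// or null when every output file is unique.
fn findDuplicate(outputs: []const Output) ?Duplicate {
    var i: usize = 0;
    while (i < outputs.len) : (i += 1) {
        var j: usize = i + 1;
        while (j < outputs.len) : (j += 1) {
            if (std.mem.eql(u8, outputs[i].file, outputs[j].file)) {
                return Duplicate{ .first = outputs[i], .second = outputs[j] };
            }
        }
    }
    return null;
}
```

With both sources in hand, the message could say that foobar.md is declared in both foobar/foo.md and foobar/bar.md instead of reporting an unhandled error.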

How to reproduce:

1. Provide multiple files that generate identical files as args
2. Run commands like zangle ls <files>

Using a directory with two different files generating the same file:

$ zangle ls foobar
Unhandled error: error.DuplicateFile

This is a bug in Zangle, please report it on the issue tracker
along with any information that may help in resolving the bug.

  https://github.com/tauoverpi/zangle/issues/new

Using the same file twice:

$ zangle ls foobar/foo.md foobar/foo.md
Unhandled error: error.DuplicateFile

This is a bug in Zangle, please report it on the issue tracker
along with any information that may help in resolving the bug.

  https://github.com/tauoverpi/zangle/issues/new

Sample files:

foobar/foo.md:

```txt file: foobar.md
foo
```

foobar/bar.md:

```txt file: foobar.md
bar
```

Linking error with certain path notations

Issue:

Using zangle with directories with a trailing slash causes a linking error:

$ zangle ls a/
a/y.md: line: 1 col: 1 reason: missing_symbol context: y
error: LinkingFailed
/nix/store/jz6fyj4c0c67z7f4iwisj3aypwkq2flf-zig-0.11.0-dev.3348+3faf376b0/lib/std/array_list.zig:0:0: 0x2106d3 in main (zangle)
/home/plumeus/.cache/zig/p/1220cee43eb2a5f3204909b2457a431f2816e1bcceb2eb251d3613808a9a30e85d9f/tools/zangle.zig:0:0: 0x20f4c5 in main (zangle)

Expected behaviour:

The command should behave the same whether the trailing slash exists or not, as shown below:

$ zangle ls a
x.txt

$ zangle tangle a && cat $(zangle ls a)
Lorem ipsum
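
The report does not identify the root cause. If the mismatch comes from how the directory argument is combined with file names (an assumption), normalising the argument before use would make a/ and a behave identically. `stripTrailingSlashes` below is a hypothetical helper, not a zangle function.

```zig
const std = @import("std");

/// Hypothetical helper: removes trailing '/' characters from `path`.
/// A lone "/" is kept so the filesystem root stays addressable.
fn stripTrailingSlashes(path: []const u8) []const u8 {
    var end = path.len;
    while (end > 1 and path[end - 1] == '/') end -= 1;
    return path[0..end];
}
```

The function returns a sub-slice of its input, so it allocates nothing and can be applied directly to command-line arguments.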

How to reproduce:

1. Make a directory containing one file that depends on another file in the same directory (example provided below)
2. Run zangle <command> <dir>/ where the command could be something like ls or graph

Files used for reproducing:

a/x.md:

```txt file: x.txt
[[a/y.md::y]]
```

a/y.md:

```whatever tag: y
Lorem ipsum
```
