Giter VIP home page Giter VIP logo

Comments (14)

cessen avatar cessen commented on August 21, 2024 2

To expand on what @pickfire said: I think using RopeBuilder is just generally the better way to go for Helix, even aside from transcoding. It will let you robustly handle a lot of things that from_reader() won't. from_reader() is really just a convenience method for more casual use, quick prototyping, etc.

For example, another good use-case for RopeBuilder is loading very large files. from_reader() will just block until the whole file is loaded, which isn't great if it takes a while. But using RopeBuilder + handling IO yourself, you can do things like showing a progress bar and letting the user cancel if it's taking too long.

from helix.

cessen avatar cessen commented on August 21, 2024 1

Some thoughts below, from a semi-random passer-by:

For other text encodings: Ropey (the rope library Helix is using) is designed for streaming transcoding on both load and save. So using it in combination with something like the encoding_rs crate to do the transcoding, and some reasonable auto-detection of text encodings, would handle most valid text files you might encounter. (It won't handle all esoteric corner-cases, of course, because the text encoding landscape is... complicated. But it will at least handle most common encodings in a reasonable way.)

For binary files: there's not really an obvious "right" thing for a text editor to do, so it's more about just picking an option and going with it. Some possibilities include:

  • Including a hex-editor mode. This is what e.g. Sublime Text does.
  • Transcode it as latin-1, which round-trips losslessly for any file but displays the data as garbled text.
  • Refuse to open the file, with a message explaining that it's either binary or an unsupported text encoding. (And possibly suggest opening it in a hex editor.)

from helix.

kirawi avatar kirawi commented on August 21, 2024 1

I'll take a stab at this.

from helix.

pickfire avatar pickfire commented on August 21, 2024 1

A discussion happened at cessen/ropey#40, @cessen suggested that we use RopeBuilder if we intend to read from different file encoding, we may also want to do our own buffering if we do it that way.

from helix.

kirawi avatar kirawi commented on August 21, 2024 1

For example, another good use-case for RopeBuilder is loading very large files. from_reader() will just block until the whole file is loaded, which isn't great if it takes a while. But using RopeBuilder + handling IO yourself, you can do things like showing a progress bar and letting the user cancel if it's taking too long.

Even loading sqlite.c (5.2MB - 5385123 loc) is just taking roughly one second to load the file and start even in debug build. I wonder if we will get into cases where we need a progress bar for it. Anyone have any idea which file ls larger to test?

Maybe a huge XML file?

from helix.

L-as avatar L-as commented on August 21, 2024

Optimally it would show it as UTF-8, but show some placeholder for incorrect encodings like in kakoune. Lossless round-trips aren't necessary IMO since often you don't want to edit the file, and if you do, it's often because you want to remove the incorrectly encoded parts.

from helix.

archseer avatar archseer commented on August 21, 2024

(Hey @cessen 👋🏻 Thank you for building Ropey, it's great!)

So far I've made no effort to support anything other than UTF-8. I think moving forward we could use encoding_rs + hsivonen/chardetng for encoding detection, but I'd probably maintain a whitelist of encodings we'd allow. I've gotten hit by esoteric edge cases before where the encoding was under-specified.

from helix.

cessen avatar cessen commented on August 21, 2024

(Hey @cessen 👋🏻 Thank you for building Ropey, it's great!)

You're very welcome! If you end up running into any problems with it, please don't hesitate to file an issue. :-)

but I'd probably maintain a whitelist of encodings we'd allow.

Yeah, I think that's a very reasonable approach. Supporting a curated set of encodings that you're confident work correctly is better than trying to support everything unreliably.

from helix.

pickfire avatar pickfire commented on August 21, 2024

For example, another good use-case for RopeBuilder is loading very large files. from_reader() will just block until the whole file is loaded, which isn't great if it takes a while. But using RopeBuilder + handling IO yourself, you can do things like showing a progress bar and letting the user cancel if it's taking too long.

Even loading sqlite.c (5.2MB - 5385123 loc) is just taking roughly one second to load the file and start even in debug build. I wonder if we will get into cases where we need a progress bar for it. Anyone have any idea which file ls larger to test?

from helix.

archseer avatar archseer commented on August 21, 2024

Note: we're also discussing large file loading with regards to Led here: #219

from helix.

cessen avatar cessen commented on August 21, 2024

Here are a couple of test files I use when testing my own editor:

100mb.txt.zip (download is ~450 KB, unzipped ~100 MB)
1gb.txt.zip (download is ~4.5 MB, unzipped ~1 GB)

Having said that, even the 1 GB file loads in ~1 second in my editor (which also uses Ropey). But I've heard that some people open 10-20GB (or even larger) log files sometimes, so that would start to become pretty significant wait times. And when transcoding on load because a file isn't utf8, those times will almost certainly go up quite a bit as well.

from helix.

kirawi avatar kirawi commented on August 21, 2024

Seems like we might want a wrapper over RopeBuilder for our usecase? Or does that already exist with Document?

from helix.

kirawi avatar kirawi commented on August 21, 2024

To expand on what @pickfire said: I think using RopeBuilder is just generally the better way to go for Helix, even aside from transcoding. It will let you robustly handle a lot of things that from_reader() won't. from_reader() is really just a convenience method for more casual use, quick prototyping, etc.

For example, another good use-case for RopeBuilder is loading very large files. from_reader() will just block until the whole file is loaded, which isn't great if it takes a while. But using RopeBuilder + handling IO yourself, you can do things like showing a progress bar and letting the user cancel if it's taking too long.

This is completely new to me, so I have a few stupid questions to ask if you don't mind. How would you use RopeBuilder to open a file in a non-blocking way? From the example you gave in the other thread, it would block until the entire file was read, wouldn't it?

from helix.

cessen avatar cessen commented on August 21, 2024

How would you use RopeBuilder to open a file in a non-blocking way?

The short answer is simply that RopeBuilder doesn't handle IO, so you can do the IO yourself however you want (sync, async, hand-writing an event loop that periodically updates the UI and and checks for input, or whatever).

For example, you could do something like this (pseudo-code):

let mut reader = FancyBufReader::new(my_file);
let mut builder = RopeBuilder::new();

while let Ok(text_chunk) = reader.read_next_str_chunk_of_max_length(4096) {
    builder.append(text_chunk);
    ui_update_progress_bar();
    if user_input_cancel() {
        return None;
    }
}

return Some(builder.finish());

This does IO incrementally, a chunk at a time, and updates the UI and checks user input as it goes.

But RopeBuilder has nothing to do with IO at all, so you could write something similar to this but using async IO primitives or whatever instead. RopeBuilder doesn't care, it just takes str slices.

From the example you gave in the other thread, it would block until the entire file was read, wouldn't it?

I'm not sure exactly which example you're referring to, but if it calls from_reader(), then yes. from_reader() always blocks until the entire file is read (errors notwithstanding)

Hope that helps!

from helix.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.