Giter VIP home page Giter VIP logo

Comments (6)

ChristianMurphy avatar ChristianMurphy commented on June 3, 2024 1

Welcome @stephenlf! 👋

The Options struct is not Sync.

This is an XY Problem.

This is an issue in web servers because it means we must reallocate an Options struct every time we parse something.

Memory is allocated all over the place to do parsing and create a syntax tree.
Allocation feels like a non issue here.

This library aims to be 100% pure safe rust, sync is unsafe https://doc.rust-lang.org/nomicon/send-and-sync.html
Marking a core part of the safe API unsafe is a non-starter here.

Reading between the lines a bit you seem to be trying to reuse options across multiple threads in Rust? Is that accurate?
If so why is allocating Options an issue for you?

from markdown-rs.

wooorm avatar wooorm commented on June 3, 2024 1

without limiting the MDX functionality too much

I have some doubts (but unfounded). If it works with https://github.com/wooorm/mdxjs-rs I’d be open to it.


Anyway, closing this for now. The safe part of this project is very important. It also seems like there is a tiny fraction of time spent on this, we can probably spend our time better on parsing memory/performance. If all of that is addressed and it works with mdxjs-rs, I’d be open to a PR and rediscussing it!

Best! :)

from markdown-rs.

stephenlf avatar stephenlf commented on June 3, 2024

Current workaround I'm using. I'll write this up in a pull request later.

/// A markdown::Options wrapper that can be safely passed between threads. 
/// Note that this struct cannot wrap Options with MDX parser functions
/// specified. In other words, `options.parse.mdx_ems_parse` and 
/// `options.parse.mdx_expression_parse` must both be `None`.
/// 
/// # Examples
/// 
/// ```rust
/// use docgen::state::ThreadsafeOptions;
/// 
/// // Equivalent to `let options = ThreadsafeOptions::new(markdown::Options::default());`
/// let options = ThreadsafeOptions::default();
/// let app_state = Arc::new(options);
/// // `app_state` may now be used in a multi-threaded web server.
/// ```
#[derive(Debug, Default)]
pub struct ThreadsafeOptions(Options);

unsafe impl Send for ThreadsafeOptions {}
unsafe impl Sync for ThreadsafeOptions {}

impl ThreadsafeOptions {
    /// Create a new `ThreadsafeOptions` wrapper around the provided `Options`
    /// object. 
    /// 
    /// **PANICS**: This function will panic if you attempt to instantiate a
    /// new object with an MDX parser specified. In other words, 
    /// `options.parse.mdx_ems_parse` and `options.parse.mdx_expression_parse`
    /// must both be `None`.
    pub fn new(options: Options) -> Self {
        if options.parse.mdx_esm_parse.is_some() || options.parse.mdx_expression_parse.is_some() {
            panic!("Cannot instantiate markdown::Options with MDX parsers specified")
        }
        Self(options)
    }
}

impl From<Options> for ThreadsafeOptions {
    fn from(value: Options) -> Self {
        Self::new(value)
    }
}

impl Borrow<Options> for ThreadsafeOptions {
    fn borrow(&self) -> &Options {
        &self.0
    }
}

from markdown-rs.

stephenlf avatar stephenlf commented on June 3, 2024

I am incorporating a Markdown parser into an Axum web server I am writing. Rather than reallocate an markdown::Options object every time the parser endpoint is called, I had hoped to save on memory by instantiating a single, immutable markdown::Options object, wrapping it in Arc, and passing it in to the router's axum::State. To do this, the markdown::Options object would have to be Send/Sync, since application state may be passed between threads.

I understand not wanting any unsafe blocks in the codebase. "100% safe Rust" is a remarkable claim. My workaround is sufficient for my use case. And it sounds like the amount of allocation going into an Options struct is trivial compared to the work being done by the parser, so there's probably no need for me to worry about it. However, if you do want to create a thread-safe, "100% safe Rust" Options implementation, I believe there are ways to do it without unsafe code blocks.

I realize now it was presumptuous of me to say that I'd "set up a pull request later." I forgot this wasn't one of my projects! Let me know if you'd like me to explore this issue more.

from markdown-rs.

wooorm avatar wooorm commented on June 3, 2024

Other than safe, this project is also no_std + alloc. It’s incredibly basic, Box and Vec but that’s it. Send/Sync/Arc etc is all much higher level.
And indeed, the bottle neck here is parsing documents. I’d assume all energy is better spent there.
I’d also guess that Rust could figure this out itself if there were no mdx functions? Not sure if a thread safe version should just omit those?

from markdown-rs.

stephenlf avatar stephenlf commented on June 3, 2024

I’d also guess that Rust could figure this out itself if there were no mdx functions? Not sure if a thread safe version should just omit those?

That's the approach I took in my codebase. However, I suspect that annotating the MDX functions something like Sync + Send would also solve the problem without limiting the MDX functionality too much. I haven't explored it very much, but something like this might work:

// markdown-rs/src/util/mdx.rs
// ...
pub type EsmParse = dyn Fn(&str) -> Signal + Sync + Send;
// ...
pub type ExpressionParse = dyn Fn(&str, &ExpressionKind) -> Signal + Sync + Send;
// ...

This would of course mean that users couldn't dynamically create their MDX parsers, though I can't see this being a practical issue for most use cases. This code compiles and passes all checks, though I'm not sure what other downstream effects something like this might have.

One can imagine a ThreadsafeOptions struct that is the same as the Options struct with these additional constraints.

from markdown-rs.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.