statiqdev / statiq.framework
A flexible and extensible static content generation framework for .NET.
Home Page: https://statiq.dev/framework
License: MIT License
Index metadata should be treated specially.
- When documents have an Index value, the engine should sort all result documents based on their Index values
- The OrderDocuments module and all other modules that produce ordered results should simply clone and set the Index value and let the engine do the actual ordering
- The AsOrdered call being used in base modules for parallel operations can be removed since the documents will get ordered by the engine, so preserving order during parallel operations isn't important
The GroupBy module is confusing because it groups the result of child modules and not the input documents. This does make some sense because the input documents are used as the basis for each group, but it's different enough from how LINQ GroupBy() works that it causes confusion.
Some ideas:
Not quite sure what the best approach is here. Some are breaking changes, but I'm okay with that if it makes the module substantially easier to use.
No longer needed given more powerful document model of ObjectDocument<T>
Change all uses of DateTime to DateTimeOffset, including in the app once that gets ported.
It wouldn't always be desirable, but in projects where you distribute the source code it would be helpful to users to be able to see the file path of the source file for whatever part of the API they are looking at. A line number would be nice too.
Consider adding ImmutableArray<IDocument> IDocument.Children. This would allow any document to have a well-defined way of describing children and would be utilized by the following modules:
- PaginateDocuments
- GroupDocuments
- CreateTree
AddDocumentsToMetadata would also need to be rewritten and renamed to something like AddChildren (or SetChildren - would a separate AddChildren be needed in that case? And for that matter, should AddMetadata be renamed to SetMetadata?)
It would also let modules like CombineDocuments treat children a little differently by concatenating them rather than overwriting them as with metadata.
Could also consider adding int IDocument.Index to indicate the document's index within a list. That would be useful to persist ordering after a module clones or otherwise reorders documents. Extensions that rely on the index could be built to do things like get the next document given the current one, even if the input documents are out of order.
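The ordering and "next document" ideas above can be sketched roughly as follows. This is only an illustration: `Doc`, `OrderByIndex`, and `GetNext` are hypothetical names, and a real implementation would work against IDocument rather than this stand-in type.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a document; only the members needed here.
public sealed record Doc(string Name, int Index);

public static class IndexExtensions
{
    // What the engine could do after a module runs: a stable sort on Index
    // (LINQ's OrderBy keeps the relative order of equal keys).
    public static IReadOnlyList<Doc> OrderByIndex(this IEnumerable<Doc> docs) =>
        docs.OrderBy(d => d.Index).ToList();

    // Finds the document with the next higher Index, even when the input is
    // out of order. Returns null when there is no later document.
    public static Doc GetNext(this IEnumerable<Doc> docs, Doc current) =>
        docs.Where(d => d.Index > current.Index)
            .OrderBy(d => d.Index)
            .FirstOrDefault();
}
```

With this shape, modules only assign Index values and never sort on their own, which is the division of labor the proposal describes.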
I just tried to run a build and kept getting an error: Shortcodes: One or more errors occurred. (No such host is known) which was claiming to come from a file I haven't touched in a while which contains a youtube shortcode. Eventually I realised that my internet was down and once I got it back up the build worked fine.
So I'm assuming it has to look up what website "YouTube" is referring to or something and I'm wondering if that's something which could be cached, both to allow working offline and just to speed up builds in general.
This issue actually belongs to the app, but I'm dropping it here for now since it suggests some framework enhancements. Consider the generation of a blog with tags. Each tag needs to have a page which lists the posts for that tag, or possibly even more than one if the tag pages are paginated.
Right now that's accomplished by making the layout have a known file like _Tag.cshtml and then loading that file, grouping posts by tag, merging the groupings with the archive template for each tag, and calculating a new destination path for each one. Then each tag group is rendered and output using the content from the _Tag.cshtml file.
This works fine but requires a lot of coordination between the theme/layout and the generator. Instead, I'd like the theme to be able to do this entirely on its own. That would let themes have a lot more control over the pages they generate and would remove the burden from the generator for things like tags, authors, series, etc.
I would like to be able to write a tag.cshtml file in my theme that looks something like:
TemplateDocuments: {PostsPath}/**/*.md
TemplateGroup: tags
TemplateDestination: tags/{GroupKey}.html
---
<!-- Razor code to iterate the group items for each tag and display as an archive -->
Some thoughts:
- TemplateDocuments front matter uses a globbing pattern
- FilePath (or string) to see if the path "matches"
- FilterDocuments module if the Documents front matter is present
- IMetadata
- TemplateGroup front matter, if present, indicates the TemplateDocuments should be grouped by the specified metadata value(s) using the GroupDocuments module
Presumably the pipeline that handles this would look something like:
new ExecuteIf(
    "TemplateDocuments",
    // Process each template independently
    new ForEachDocument(
        // Merge the template into the pages or results it's representing
        new MergeDocuments(
            // Merge each potential document the template applies to onto the template
            // This gives us access to "TemplateDocuments" and "TemplateGroup"
            new MergeDocuments("Pages"),
            // Filter to only those documents that match the globbing pattern
            new FilterDocuments(Config.FromDocument(doc =>
                doc.Source?.Matches(doc.Interpolate(doc.String("TemplateDocuments"))))
            ),
            // If we're also grouping the templated documents, go ahead and do that
            new ExecuteIf(
                "TemplateGroup",
                new GroupDocuments(Config.FromDocument(doc =>
                    doc.Get<IEnumerable<object>>(doc.String("TemplateGroup")))
                )
            )
        ).WithInputDocuments().MergeIntoResults(),
        // Set the destination for each templated result document
        new SetDestination(Config.FromDocument(doc =>
            doc.Interpolate(doc.String("TemplateDestination")))
        )
    )
)
// At this point we have the original documents that don't specify TemplateDocuments
// as well as a document for each templated document or group
// with the content of the template
The highlight shortcode renders my code differently than it would render if I use the markdown syntax.
On the image below, the first piece of code is using the highlight shortcode whereas the second one is using markdown syntax for code.
The main difference is that using markdown the code is rendered inside a pre and a code wrapper whereas in the shortcode only the code element is used. The shortcode lets me specify an element to wrap the highlighted content in, so I can specify "pre" instead of "code", but I don't see how I can specify both the code and pre elements. From what I understand, highlight.js has some CSS that renders the code inline when only the code element is present and as a block when both pre and code elements are present.
Maybe I am missing something, but I think most of the time the highlighted content should be rendered as a block rather than inline, so this should be the default behavior.
Finally figured out a strategy for per-document caching that I'm happy with:
- Cache module that has child modules.
- Cache module looks at IDocument.Source and the hash of all input documents and stores them for the next run. The first time through every input document is a cache miss so it runs them all through the child modules and then saves the output document for each input source.
- Cache should have an option to either run each input cache miss document through the child modules one at a time or all at once (similar to other modules with child modules).
- and all these other documents also match...” - that'll be useful in scenarios like Razor where we can key the cache off the input Razor document but also trigger a miss if any _*.cshtml documents change (which could be captured by another pipeline specifically to guide the cache).
- .GetHashCode() for each metadata key and value (excluding settings).
- Cache module should use a metadata value/setting to disable and clear it. This would dovetail nicely with the other issue for manual regeneration, something like a “Press c to regenerate without caching”.
- Split Razor into two modules, one for compilation and another for rendering. This would allow the Cache module to aggressively cache Razor compilation since that doesn't actually rely on other documents and pipelines - it may even be performed during the process phase. Then the Razor assembly for each page would get stored as document content (maybe a new content provider for dynamic assemblies?) and could be used in the render phase without caching (or with caching that looks at all other dependent pipelines).
I think the approach of making caching a module concern rather than trying to deal with it more broadly in the engine or pipeline will be more powerful and flexible.
This will make it easier to change the output path for anything without having to redefine pipelines. Basically, whenever WriteFiles goes to output something, allow a global event handler to change the path. I.e., if you wanted to change the tag index for the blog recipe, you could add a handler that checks for the known output path and then resets it to something different.
Possibly related to #192
I have not found any documentation on this yet, but I'm wondering whether it's possible to cross-reference between pages and / or posts.
In DocFx you can set a uid in the yml header, which can be used with an @<uid> syntax (other syntaxes are also available, but this is a shorthand).
As Statiq grows, the namespace hierarchy is getting confusing. I'm even finding myself forgetting certain extensions or objects exist when they're not in scope due to being in a nested namespace. One solution that I theoretically like is flattening the namespace structure to a single namespace per package (only Statiq.Common, Statiq.Core, etc.).
The tradeoff is that namespaces are also used to communicate organizational structure, particularly for documentation. Imagine what the Statiq API docs would look like if everything in Statiq.Common were listed in a single table. Until a feasible way of presenting better organization in documentation without relying on namespaces is found, flattening the namespaces doesn't make sense.
I made a new issue for this general feature as #32 - if it can be implemented then flattening the namespaces becomes a possibility.
This will be my last request for today.
DocFX has Markdown syntax to create tabbed content. https://dotnet.github.io/docfx/spec/docfx_flavored_markdown.html?tabs=tabid-1%2Ctabid-a#tabbed-content
Can this also be added to Wyam?
Add new modules CompileContent and ExecuteContent that execute the content of a document as raw code. Document (the current document) and Context (the execution context) will be provided as global properties. The CompileContent module will pre-compile the content and store the assembly bytes in a content stream while the ExecuteContent module will either execute the precompiled content assembly or perform the compilation if not already done. Precompiling allows for better caching in the prepare vs. transform phases.
The code can return a few different things (similar to the Execute module):
- IDocument or IEnumerable<IDocument> will result in the module outputting the returned document(s)
- IModule or IEnumerable<IModule> will result in the module executing the returned module(s) without an initial input document
This will go in Statiq.CodeAnalysis.
Remove support for specifying child modules as well since that's covered by MergeDocuments or MergeMetadata.
This issue actually goes to the Statiq app, but I'm putting it here for now.
The default pipelines should run front matter through interpolation by default using the InterpolateMetadata module (#38). A global or document level metadata bool key (probably named InterpolateMetadata too) should allow control over whether metadata is interpolated either for all documents or for specific documents.
Offer an option when building docs in Statiq to present classes in their file system folder structure in addition to, or instead of, in their namespaces. That would require the docs to be built either from source or to find some way to gather symbol location data from a pdb for docs built from an assembly (and if neither source nor a pdb is available, then location data won't be either and you can't present symbols organized by file system location).
One question is what to do when we don't know the original location even when we have a pdb or source files, i.e. for external references. First thought is they just wouldn't be included in the file system structure view.
This is really a Statiq app issue but putting in Framework for now.
By default all pipelines are executed. It should be possible to specify specific pipeline(s) to execute, which will execute only them (and their dependent pipelines once pipeline parallelism/dependencies are implemented in #189). You should be able to specify one or more pipelines to run.
Pipelines should also be able to specify that they shouldn't be run by default. That will allow special pipelines that only get run when the pipeline CLI argument specifies that they should.
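The selection rule could look something like the sketch below. `PipelineInfo`, `RunByDefault`, and `PipelineSelector` are hypothetical names chosen for illustration, and dependent pipelines are left out because they depend on the work tracked in #189.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical pipeline description; RunByDefault models the opt-out flag.
public sealed record PipelineInfo(string Name, bool RunByDefault = true);

public static class PipelineSelector
{
    // No names requested: run every pipeline that runs by default.
    // Names requested: run exactly those, including non-default pipelines.
    public static IReadOnlyList<string> Select(
        IEnumerable<PipelineInfo> pipelines, IReadOnlyCollection<string> requested)
    {
        if (requested.Count == 0)
        {
            return pipelines.Where(p => p.RunByDefault).Select(p => p.Name).ToList();
        }

        var wanted = new HashSet<string>(requested, StringComparer.OrdinalIgnoreCase);
        return pipelines.Where(p => wanted.Contains(p.Name)).Select(p => p.Name).ToList();
    }
}
```

Matching names case-insensitively is a choice made here for CLI friendliness, not something the issue specifies.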
Hi, I was wondering what would it take to add a new module to wyam to support RTL content. The module needs to add dir="rtl" attribute to all elements that start with RTL characters and align="right" to tables, RTL characters are specified by section D of RFC3454. For more information, please take a look at xoofx/markdig#239.
It would also be useful if the module adds a css class to these html elements so that they can be styled (most probably their font characteristics) separately.
I am willing to create a PR, but might need some help and guidance.
It made sense when configurations were scripted out, but now that it's all in code a List<IModule> or IModule[] would work just as well for reusability.
The If module should have a convenience signature that evaluates meta keys:
If(string key, object value)
I often find myself testing meta keys in the If module, and I feel like this is common enough to be made simpler. For example:
If("SourceFileBase", "introduction")
// Which is easier and clearer than...
If((doc, ctx) => { return doc["SourceFileBase"] == "introduction"; })
This wouldn't replace anything, just make a common scenario easier.
Any reason why I shouldn't write this and submit it? Should it be another signature on If, or should it be another module (IfMeta) that wraps If?
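One way the convenience overload could build its predicate is sketched below. `MetaPredicates.KeyEquals` is a hypothetical helper and metadata is modeled as a plain dictionary rather than a real document. It compares with Equals because == on two object operands compares references, which can give a false negative for equal strings.

```csharp
using System;
using System.Collections.Generic;

public static class MetaPredicates
{
    // Builds the predicate that an If(string key, object value) overload
    // could pass to the existing delegate-based If constructor.
    public static Func<IReadOnlyDictionary<string, object>, bool> KeyEquals(string key, object value) =>
        metadata => metadata.TryGetValue(key, out var actual) && object.Equals(actual, value);
}
```

A missing key simply evaluates to false, so the predicate never throws for documents that lack the metadata.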
Similar to how GroupBy used to work, PaginateDocuments paginates documents as a result of running child modules. Instead it should paginate the input documents similar to the refactoring performed for GroupBy in #8.
Now that module execution is async, support for CancellationToken should be added to the module execution method. That's important because right now when one of the parallel modules throws there's no way to stop other modules which results in a loop that's not as tight as it could be. Should also respond to quit requests from the CLI.
Add Config.FromDocument<T>(string) and Config.FromContext<T>(string) to get config values from metadata. Audit code to see if these can be used anywhere instead of the delegate config methods.
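As a rough illustration of what a key-based factory would save callers from writing, the sketch below returns a delegate that reads and converts a metadata value. `ConfigSketch` is a hypothetical name, metadata is modeled as a dictionary, and the conversion uses Convert.ChangeType, which is far simpler than a real metadata type conversion would be.

```csharp
using System;
using System.Collections.Generic;

public static class ConfigSketch
{
    // Returns a delegate that reads the given key from metadata and converts
    // the value to T, or yields default(T) when the key is absent.
    public static Func<IReadOnlyDictionary<string, object>, T> FromDocument<T>(string key) =>
        metadata => metadata.TryGetValue(key, out var value)
            ? (T)Convert.ChangeType(value, typeof(T))
            : default;
}
```

A call such as `ConfigSketch.FromDocument<int>("PageSize")` would then stand in for a hand-written lambda that looks up and converts the same key.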
File system case sensitivity is tricky. Windows and Linux are different. And on Linux the sensitivity can actually vary from folder to folder depending on if a different file system is mounted in that location. The current LocalDirectory implementation attempts to deal with this on a directory-by-directory basis, but that requires surfacing an IsCaseSensitive property for every directory, is error prone and requires temporary file writes, and can't be cascaded all the way through for things like documents that only store a source path.
A simpler (though maybe not as correct) approach would be to define a single case-sensitivity rule in the engine and leave it at that. Most (like, almost all) sites will end up pulling in files from the same subfolder structure, especially now that alternate file systems have been removed. The engine should attempt to determine the case sensitivity of its current file system and make that the default for all path comparisons.
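A minimal sketch of a one-time probe the engine could run at startup, assuming it is acceptable to write one temporary file in a directory the engine already uses. `FileSystemCase` and its members are hypothetical names, not part of the framework.

```csharp
using System;
using System.IO;

public static class FileSystemCase
{
    // Probes the given directory once by writing a temporary file and
    // checking whether it is also visible under an upper-cased file name.
    public static bool IsCaseSensitive(string directory)
    {
        string name = "caseprobe_" + Guid.NewGuid().ToString("N") + ".tmp";
        string path = Path.Combine(directory, name);
        File.WriteAllText(path, string.Empty);
        try
        {
            return !File.Exists(Path.Combine(directory, name.ToUpperInvariant()));
        }
        finally
        {
            File.Delete(path);
        }
    }

    // The single engine-wide comparison rule derived from that probe.
    public static StringComparison PathComparison(bool caseSensitive) =>
        caseSensitive ? StringComparison.Ordinal : StringComparison.OrdinalIgnoreCase;
}
```

The result would be computed once and reused for every path comparison, which trades per-directory correctness for the simplicity the issue argues for.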
I'm working on my thesis at the moment and there are a couple of things I'm missing in almost all systems so far. One of them is something to create a glossary.
What I'm thinking of is a yaml or markdown file which contains the terms and their descriptions, and a shortcode that can be used to include the terms into markdown files. Then a page should be generated with all glossary items that are actually used. It would also be useful to get a warning when a shortcode references a term that's not included in the glossary file.
Yaml is built to create lists like this. The downside I see here is that it's harder to include other shortcodes or Markdown syntax.
When a Markdown file is used, which brings the advantage of md syntax and shortcodes, it will need to have a simple yet rigid structure.
I think this might be a useful feature for scholars, who don't really have any choice besides Word or LaTeX.
New extension methods string IMetadata.Interpolate(string) and string IMetadata.Interpolate(string, IExecutionContext) that perform a kind of string interpolation, making the current metadata (and context if provided) available to the expressions.
For example, given a document with metadata "Foo"="Bar":
- document.Interpolate("ABC {Foo} 123") = "ABC Bar 123"
- document.Interpolate("ABC {Foo.ToUpper()} 123") = "ABC BAR 123"
- document.Interpolate("ABC {Context.String("Global")} 123") = "ABC XYZ 123"
That last example demonstrates that Metadata and Context have special meanings that override any existing metadata and correspond to the passed-in metadata and execution context (if provided). If the provided metadata also contains a key named "Metadata" or "Context" it can be accessed through the Metadata value like "ABC {Metadata.String("Metadata")} 123".
This will need to use Roslyn and go in Statiq.CodeAnalysis since each expression needs to be evaluated as code. The Roslyn script host should provide metadata keys and the special Metadata and Context values as global properties. The whole thing could potentially be evaluated by wrapping the input string as an actual interpolated string and then evaluating that as code in a single script submission.
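To make the first example concrete without Roslyn, here is a deliberately reduced sketch that only substitutes plain `{Key}` placeholders. It does not evaluate expressions such as `{Foo.ToUpper()}`, so it is not the proposed implementation; `SimpleInterpolation` is a hypothetical name and metadata is modeled as a dictionary.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SimpleInterpolation
{
    // Matches a single identifier in braces, e.g. {Foo}.
    private static readonly Regex Placeholder = new Regex(@"\{(\w+)\}");

    // Replaces {Key} with the matching metadata value. Unknown keys and
    // anything that is not a bare identifier are left untouched.
    public static string Interpolate(IReadOnlyDictionary<string, object> metadata, string value) =>
        Placeholder.Replace(value, match =>
            metadata.TryGetValue(match.Groups[1].Value, out var found)
                ? (found?.ToString() ?? string.Empty)
                : match.Value);
}
```

The gap between this and the second and third examples is exactly why the issue calls for evaluating each expression as code.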
As mentioned in #831, the Razor module should be split into RazorCompile and RazorRender (exact names TBD - CompileRazor and RenderRazor?). Because Razor compilation doesn't need access to other documents, it can be performed in the prepare pipeline phase and cached. That should make rebuilds faster when the Razor files don't change.
This falls in the same line as #839, where I think it might be a nice plugin for scholars.
For this we might use a Yaml or a BibTeX file. I even thought about using C# to create a strong typed DSL for a while. It should also include a shortcode to be used in Markdown files. When generating the site it should create a page with the bibliography for all items used on md pages and show a warning when a shortcode is used for an unknown item.
It would be nice if a setting could be included in config.wyam to select between different bibliography styles (APA, IEEE, Harvard, etc.).
What do you think about this?
They're a little confusing - is there a way to merge them into a single module? May not need all of them since modules no longer start with an initial input (or starting with inputs vs. empty for child modules could be an option).
When using the fluent engine builder API, the user should be able to register plain delegates as modules and the builder will wrap them in an Execute module automatically.
The file location metadata like RelativeFilePath that modules like ReadFiles and WriteFiles use has been bugging me for a long time. This is another significant change I'm strongly considering for 2.0. I figure we're going to break so much anyway, might as well get some of the other big wish list breaks in at the same time.
Data about where the document was from and where it's going are so fundamental to the process that they should get special treatment. And we already do treat the source specially with the IDocument.Source property. But we currently don't give the destination equally special treatment. It's also really confusing right now which of the many ReadFiles and WriteFiles metadata should be used for what. This aims to remedy both those problems by providing one source of truth for where a document will be/was written to.
- FilePath property IDocument.Destination that tracks where a document should be written to (or was written to if it's already been output by the WriteFiles module).
- ReadFiles and WriteFiles (RelativeFilePath, WritePath, etc.) - they should set/use IDocument.Source and IDocument.Destination exclusively to avoid confusion.
- IDocument.Destination will initially get set by the ReadFiles module to the relative input path (same as RelativeFilePath metadata right now).
- IExecutionContext.GetDocument() overloads will be added that allow setting the destination in much the same way we can set the source when requesting a new document.
- WriteFiles module will only look at the new IDocument.Destination property to figure out where to write to. If it's relative, it'll append it to the output path. If it's absolute, that's where the file will go.
- If the WriteFiles module changes where the file was written to (i.e., by using the overload that lets you set the extension), it'll adjust the IDocument.Destination property to match.
- .GetLink()).
- Destination module that sets IDocument.Destination (similar to the current pattern of using a Meta module to set WritePath or RelativeFilePath).
Not quite sure how to do this yet (or if it's possible). Can't rely on the thread since we jump around via async. Maybe a custom SynchronizationContext per pipeline that stores the pipeline name? Or we could force getting a trace instance from the execution context, which always knows the current pipeline and module.
Since tons of other stuff is breaking in v3, I've started looking at other stuff that's been bothering me but that I'd normally just leave alone. This may be the one final opportunity to correct some of these nits.
One area I think could be improved is module names. It made sense for them to be as short as possible when we were scripting config files (at least that was the thinking), but since it'll be code-based with Intellisense in v3, maybe it's time to rethink that.
I'm considering going through modules and renaming them so they all follow the pattern [verb][noun]:
- Append -> AppendContent
- AutoLink -> InsertLinks
- Content -> SetContent
- Replace -> ReplaceContent
- Markdown -> RenderMarkdown
- Redirect -> GenerateRedirects
- Shortcodes -> ProcessShortcodes
- Branch -> ExecuteBranch
- Concat -> ConcatDocuments
- GroupBy -> GroupDocuments
- If -> ExecuteIf
You get the idea. I'll generate a table we can point to that shows the old and new names for people porting custom configs.
Thoughts? cc @deanebarker because I know you had some opinions on this a while back.
A new FilterSources module that can filter documents by their source using a globbing pattern. For example:
new FilterSources("/foo/**/*.md")
would get all the .md documents.
Instead of passing input documents to modules as a separate argument, create a Documents property in the execution context to carry them (the existing Documents property will need to be renamed to Pipelines or something). This solves a number of problems:
Merge the various Execute, ExecuteContext, and ExecuteDocument modules into a single ExecuteCode module - challenge will be differentiating constructor lambdas enough between a single document and all documents to avoid ambiguity.
This is one thought that came out of a good discussion with @deanebarker. The new code-based model in v3 promotes doing work outside the bounds of pipelines. That said, there are still some advantages to letting the engine coordinate this work such as parallelism, access to state, etc.
Before tasks:
- IExecutionContext as input (or maybe an IEngine so the before tasks can manipulate the engine - need to be careful about concurrency if we do that) which contains the file system, settings, etc.
After tasks:
- IExecutionContext as input which contains the file system, settings, etc. as well as all output documents from all pipelines.
Need to think about overall before/after vs. before/after re-execution (i.e. due to previewing). First thought is to add an innerLoop flag (name TBD) to indicate if the task should always run or just run during initial and final execution.
The before and after tasks should be implemented as IBeforeTask and IAfterTask though the fluent engine builder should also allow registering them as delegates (which will create a delegate oriented instance behind the scenes).
Globbing patterns can currently accept braces to indicate expansion: *.{xml,txt}. This has potential to conflict with string interpolation which also uses braces. One or the other could be escaped, but that's counter-intuitive and not very friendly. Instead the globbing patterns could be changed to use brackets instead which keeps similar semantics: *.[xml,txt].
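The expansion itself is independent of which delimiter is chosen, as this small sketch suggests. `GlobExpansion.Expand` is a hypothetical helper that handles only the first, non-nested group in a pattern; it is not the framework's globber.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class GlobExpansion
{
    // Expands the first non-nested group delimited by open/close into one
    // pattern per comma-separated alternative.
    public static IReadOnlyList<string> Expand(string pattern, char open, char close)
    {
        int start = pattern.IndexOf(open);
        int end = start < 0 ? -1 : pattern.IndexOf(close, start + 1);
        if (start < 0 || end < 0)
        {
            return new[] { pattern };
        }

        string prefix = pattern.Substring(0, start);
        string suffix = pattern.Substring(end + 1);
        return pattern.Substring(start + 1, end - start - 1)
            .Split(',')
            .Select(alternative => prefix + alternative + suffix)
            .ToList();
    }
}
```

Calling it with `'{', '}'` gives the current behavior and with `'[', ']'` the proposed one, so switching delimiters would mostly be a parsing change.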
Now that interpolation has been implemented, create a new InterpolateMetadata module in Statiq.CodeAnalysis that will check all metadata items in a document for the presence of { and, if found, interpolate those values and produce a new document with the interpolated metadata. The module should only modify string metadata (which covers values from front matter).
See https://github.com/Wyamio/Wyam/issues/56 for a discussion of why it was created. I think the scenario continues to be confusing, and now that the output folder is cleaned on every rebuild, the module won't work anymore as intended anyway unless you turn off output cleaning (and screw up other pipelines).
The underlying use case of not wanting to spend time processing large assets on every build could be better addressed by creating multiple generators and pre-generating the expensive assets as needed into a holding space and then using CopyFiles to get them into the output in the main generator. Could also do something similar in a single generator once #4 is implemented and you can say "only run this one pipeline" from the CLI.
The WithModel fluent method for Razor requires an IDocument. The ViewDataDictionary specifies this somewhere.
This code is out of LinqPad, and it's as self-contained and stripped down as I can get it:
void Main()
{
    var engine = new Engine();
    engine.Pipelines.Add("Foo",
        new Documents((d,c) =>
        {
            return new List<IDocument>() { c.GetDocument() };
        }),
        new Razor().WithModel(new Person() { Name = "Deane" })
    );
    engine.Execute();
}
public class Person
{
    public string Name { get; set; }
}
That throws the error:
The model item passed into the ViewDataDictionary is of type 'UserQuery+Person', but this ViewDataDictionary instance requires a model item of type 'Wyam.Common.Documents.IDocument'
Note that there was no Razor template file even specified (and the Content property of the Document passed in is empty), so there's no way this can be coming from Razor code. It has to be in the Razor module somewhere.
I've looked all through it, and I can't find it...
A shortcode to evaluate C# scripts similar to the new EvaluateScript module in Statiq.CodeAnalysis, including bringing all metadata into scope as global properties.
Documents are currently disposable because of an earlier design in which they kept open file handles that needed to be disposed (and might potentially have kept other disposable items from other file providers). The execution context was responsible for tracking documents and disposing them when they fell out of scope.
Since we no longer have alternate file providers and content is managed by content providers, there's really no need for documents to be disposable. And if they're not disposable, the execution context doesn't have to track them. And if they don't need to be tracked, we don't need a factory pattern and can allow modules to instantiate them directly.
Removing all of this legacy would open up a lot of possibilities for custom document types and strongly-typed documents while simplifying the document APIs.
- CustomDocument
- Document public
- Document to make it easy to use as a base class
- IExecutionContext to get access to settings...or maybe an IEngine is enough so they can be instantiated before execution)
- IDocument.Clone() methods and appropriate implementations to Document (keeping in mind possible use as a base class)
- MemoryStream we get from RecyclableMemoryStream should clean itself and return to the pool on finalize when the GC collects the content provider after the last document using it is collected
Right now a host can be specified with Keys.Host and the link generator provides an option for including the host at the call site. However, it's sometimes useful to turn on absolute link generation globally or for a given page, regardless of whether the call to GetLink() requests the host be included.
There should be a setting Keys.AbsoluteLinks (or similar) that forces all links that would otherwise be relative to include the Keys.Host (or alternate provided host).
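The interaction between the call-site option and the proposed setting could be as simple as the sketch below. `LinkSketch.GetLink` is a hypothetical stand-in, not the real link generator, and the hard-coded https scheme is an assumption made only to keep the example short.

```csharp
public static class LinkSketch
{
    // includeHost: what the call site asked for.
    // absoluteLinks: the proposed global or per-page setting that forces the host on.
    public static string GetLink(string path, string host, bool includeHost, bool absoluteLinks)
    {
        string rooted = "/" + path.TrimStart('/');
        bool useHost = (includeHost || absoluteLinks) && !string.IsNullOrEmpty(host);
        // The scheme is assumed here; a real generator would make it configurable.
        return useHost ? "https://" + host + rooted : rooted;
    }
}
```

The setting only ever adds the host; a call site that already requests it behaves the same as before.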
Consider switching the file provider model to ASP.NET file providers. That would allow use of existing and new file providers as well as make file provider development for Wyam more broadly usable in ASP.NET. There’s also a built-in file watching system so that could be leveraged as well.
Having two modules for sorting documents is confusing, combine them into a single OrderDocuments module (using the term "order" for the module because it implies an overall ordering whereas "sort" only implies arranging items in groups).
Now that there's finally an answer to netstandard RSS/Atom, consider ripping out the old code and using this library instead: https://github.com/dotnet/SyndicationFeedReaderWriter
It would be useful in certain situations for a script (or code) to be able to handle important events like pipeline start/finish, module start/finish, etc. The event could return a bool (or more likely, set one in an args object) that would indicate if the operation should proceed. Should also pass some context information (IExecutionContext or something else?) to the handler.
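The "set a flag in an args object" variant follows a familiar .NET event pattern, sketched below. `ModuleEventArgs`, `EngineEvents`, and `BeforeModule` are hypothetical names, and the context information is reduced to two strings for brevity.

```csharp
using System;

// Hypothetical args object: handlers set Cancel to stop the operation.
public sealed class ModuleEventArgs : EventArgs
{
    public ModuleEventArgs(string pipeline, string module)
    {
        Pipeline = pipeline;
        Module = module;
    }

    public string Pipeline { get; }
    public string Module { get; }
    public bool Cancel { get; set; }
}

public sealed class EngineEvents
{
    public event EventHandler<ModuleEventArgs> BeforeModule;

    // Raises the event and returns true when the module should still run.
    public bool RaiseBeforeModule(string pipeline, string module)
    {
        var args = new ModuleEventArgs(pipeline, module);
        BeforeModule?.Invoke(this, args);
        return !args.Cancel;
    }
}
```

Using a settable flag rather than a return value lets several handlers subscribe without the last one silently overriding the others' results.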