Archivist

Archivist is a straightforward blogging utility for generating article content at compile time from version-controlled article and image files. It is built to be used in conjunction with the Arcdown plaintext article parser library.

Archivist is inspired by the general approach of Cẩm Huỳnh's great Nabo library with some key differences:

  • Articles are formatted in Arcdown format by default, allowing for more robust articles and article features.

  • Content parsing and sorting mechanisms are exposed as anonymous functions, making it easy to plug in custom functionality.

  • Articles can be organized into nested topic directories for better organization. Topics are parsed in a hierarchical structure.

  • Use of an "intermediate library pattern" is supported, allowing content and articles to be stored in a dedicated library and separate repository.

  • Default attributes are included for both author names and email addresses.

  • created_at and published_at timestamps are permitted.

  • Flexible tags can be applied as desired to any article.

  • Custom content constraints throw warnings during compilation if violated.

  • Slug uniqueness is enforced by default and triggers compile-time warnings.

  • Image files can be stored alongside articles, and accessed with helpers.

Installation

The package can be installed by adding archivist to your list of dependencies in mix.exs:

def deps do
  [
    {:archivist, "~> 0.3"}
  ]
end

Usage

The heart of Archivist is the Archive module, which acts as a repository for exposing query functions for articles, slugs, topics, etc. You can create an Archive out of any Elixir module by using Archivist.Archive like this:

defmodule MyApp.Archive do
  use Archivist.Archive
end

# this alias is just a nicety, not required
alias MyApp.Archive

Archive.articles()
Archive.topics() # hierarchical topics
Archive.topics_list() # flattened topics and sub-topics
Archive.tags()
Archive.slugs()
Archive.authors()

Additionally, Archivist exposes helpers for reading paths for articles and image files:

Archive.article_paths()
Archive.image_paths()

Archivist 0.3.x and 0.2.x versions expect you to create your article content directory at priv/archive/articles at the root of your Elixir library, like this:

priv/archive/articles/journey_to_the_center_of_the_earth.ad

If you'd like to customize your archive's behavior, you can pass any of the following options when you use Archivist.Archive in your archive module. The values shown are the defaults:

defmodule MyApp.Archive do
  use Archivist.Archive,
    archive_dir: "priv/archive",
    content_dir: "articles",
    content_pattern: "**/*.ad",
    image_dir: "images",
    image_pattern: "**/*.{jpg,gif,png}",
    article_sorter: &(Map.get(&1, :published_at) >= Map.get(&2, :published_at)),
    article_parser: &Arcdown.parse_file(&1),
    content_parser: &Earmark.as_html!(&1),
    slug_warnings: true,
    application: nil,
    valid_tags: nil,
    valid_topics: nil,
    valid_authors: nil
end
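
Judging from the default shown above, article_sorter takes a two-argument function that returns true when the first article should come first. The sketch below is one possible custom sorter, not part of Archivist itself. It orders articles newest-first with DateTime.compare/2, since Elixir's >= operator compares structs structurally rather than chronologically. The SorterSketch module and the sample maps are hypothetical, and the sketch assumes every article has a published_at value.

```elixir
defmodule SorterSketch do
  # Hypothetical helper (not part of Archivist).
  # Returns true when article `a` should be listed before article `b`,
  # i.e. when `a` was published later than, or at the same time as, `b`.
  def newest_first(a, b) do
    DateTime.compare(a.published_at, b.published_at) != :lt
  end
end

# Stand-in data using only the :slug and :published_at fields.
articles = [
  %{slug: "older", published_at: ~U[2019-01-20 22:20:00Z]},
  %{slug: "newer", published_at: ~U[2019-04-02 04:30:00Z]}
]

sorted = Enum.sort(articles, &SorterSketch.newest_first/2)
IO.inspect(Enum.map(sorted, & &1.slug))
```

If the option behaves as assumed, such a function could be supplied as article_sorter: &SorterSketch.newest_first/2.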

Archivist will read any files with the .ad extension in your content directory or in any of its subdirectories, and parse the content of those files with the parser you've selected (Arcdown by default).
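
To see which files a given content_pattern or image_pattern would pick up, the glob can be tried out with Path.wildcard/1. This is only a sketch: whether Archivist resolves its patterns this way internally is an assumption, and the PatternSketch module is hypothetical.

```elixir
defmodule PatternSketch do
  # Hypothetical helper (not part of Archivist).
  # Joins the archive dir, the content dir and the glob pattern, then lists
  # the matching files in sorted order. "**/" matches zero or more
  # directory levels, so files directly inside the content dir match too.
  def content_paths(archive_dir, content_dir, pattern) do
    [archive_dir, content_dir, pattern]
    |> Path.join()
    |> Path.wildcard()
    |> Enum.sort()
  end
end

# Returns an empty list when the directory does not exist.
IO.inspect(PatternSketch.content_paths("priv/archive", "articles", "**/*.ad"))
IO.inspect(PatternSketch.content_paths("priv/archive", "images", "**/*.{jpg,gif,png}"))
```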

If you'd like to store your archive somewhere besides priv/archive you can assign a custom path to your archive like this:

defmodule MyApp.Archive do
  use Archivist.Archive, archive_dir: "assets/archive"
end

Arcdown

Arcdown supports the following features for articles:

  • Article Content
  • Article Summary
  • Topics
  • Sub-Topics
  • Tags
  • Published Datetime
  • Creation Datetime
  • Author Name
  • Author Email
  • Article Slug

Here is an example article written in Arcdown (.ad) format:

The Day the Earth Stood Still <the-day-the-earth-stood-still>
by Julian Blaustein <[email protected]>

Filed under: Films > Sci-Fi > Classic

Created @ 10:24pm on 1/20/2019
Published @ 10:20pm on 1/20/2019

* Sci-Fi
* Horror
* Thrillers
* Aliens

Summary:
A sci-fi classic about a flying saucer landing in Washington, D.C.

---

The Day the Earth Stood Still (a.k.a. Farewell to the Master and Journey to the
World) is a 1951 American black-and-white science fiction film from 20th Century
Fox, produced by Julian Blaustein and directed by Robert Wise.

By default Archivist will parse and return article content as Archivist.Article structs. The parsing output of the above article example would look like this:

%Archivist.Article{
  author: "Julian Blaustein",
  content: "The Day the Earth Stood Still (a.k.a. Farewell to the Master and Journey to the\nWorld) is a 1951 American black-and-white science fiction film from 20th Century\nFox, produced by Julian Blaustein and directed by Robert Wise.\n",
  parsed_content: "<p>The Day the Earth Stood Still (a.k.a. Farewell to the Master and Journey to the\nWorld) is a 1951 American black-and-white science fiction film from 20th Century\nFox, produced by Julian Blaustein and directed by Robert Wise.</p>\n",
  created_at: #DateTime<2019-01-20 22:24:00Z>,
  email: "[email protected]",
  published_at: #DateTime<2019-01-20 22:20:00Z>,
  slug: "the-day-the-earth-stood-still",
  summary: "A sci-fi classic about a flying saucer landing in Washington, D.C.",
  tags: [:sci_fi, :horror, :thrillers, :aliens],
  title: "The Day the Earth Stood Still",
  topics: ["Films", "Sci-Fi", "Classic"]
}
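
Because articles come back as plain structs, ordinary Enum functions are enough to query them. The sketch below filters by tag and looks up an article by slug. The ArchiveQueries module is hypothetical and not part of Archivist, and the sample maps stand in for the structs returned by Archive.articles().

```elixir
defmodule ArchiveQueries do
  # Hypothetical helpers (not part of Archivist). They rely only on the
  # :tags and :slug fields shown in the struct above.

  # Keep only the articles carrying the given tag atom.
  def with_tag(articles, tag), do: Enum.filter(articles, &(tag in &1.tags))

  # Return the first article whose slug matches, or nil when there is none.
  def find_by_slug(articles, slug), do: Enum.find(articles, &(&1.slug == slug))
end

# Stand-in data; in an application this would be MyApp.Archive.articles().
articles = [
  %{slug: "the-day-the-earth-stood-still", tags: [:sci_fi, :horror, :thrillers, :aliens]},
  %{slug: "heat", tags: [:crime, :action]}
]

sci_fi = ArchiveQueries.with_tag(articles, :sci_fi)
heat = ArchiveQueries.find_by_slug(articles, "heat")
IO.inspect({Enum.map(sci_fi, & &1.slug), heat.tags})
```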

Intermediate Library Pattern

While it's completely acceptable to use Archivist.Archive within the same application in which the content archive is located, sites with lots of content and publishers who commit changes frequently will quickly find the git history for their application littered with content-related commits that have nothing to do with the broader functionality of the application itself.

To remedy this issue, Archivist permits and encourages the use of an intermediate library to house the content archive (myapp_blog for example), and then to include that intermediate library in the target application where the content is being used and displayed.

This approach requires you to generate a new mix library with mix new myapp_blog, and then to publish that library so that it's available to other Elixir and Erlang applications, via hex.pm or hex.pm organizations for example.

The preferred way to implement this approach is to include archivist as a dependency in your intermediate library (rather than in your application), and then to create a new Archive in your intermediate library like this:

defmodule MyappBlog.Archive do
  use Archivist.Archive,
    application: :myapp_blog,
    archive_dir: "archive"
end

Note that this approach requires you to set the application option to the name of your OTP application when your archive is defined. Also note that archive_dir is shortened to just archive instead of priv/archive, since this approach assumes that content is stored in the priv directory of the OTP app named by the application option.
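
The relationship between the application and archive_dir options can be pictured with a small path-resolution sketch. This is an illustration under stated assumptions, not Archivist's actual code: the ArchivePathSketch module and its return shapes are hypothetical. It uses :code.priv_dir/1 to locate the priv directory of the named OTP app.

```elixir
defmodule ArchivePathSketch do
  # Hypothetical helper (not Archivist's API).

  # No application given: archive_dir is used as written,
  # e.g. "priv/archive" relative to the project root.
  def resolve(nil, archive_dir), do: {:ok, archive_dir}

  # Application given: archive_dir is looked up inside that app's priv dir,
  # which is why it is written as "archive" rather than "priv/archive".
  def resolve(app, archive_dir) when is_atom(app) do
    case :code.priv_dir(app) do
      {:error, :bad_name} -> {:error, {:unknown_app, app}}
      priv_dir -> {:ok, Path.join(priv_dir, archive_dir)}
    end
  end
end

IO.inspect(ArchivePathSketch.resolve(nil, "priv/archive"))
IO.inspect(ArchivePathSketch.resolve(:no_such_app, "archive"))
```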

Here is an example of an excerpt from the mixfile of an intermediate library:

  defp deps do
    [
      {:archivist, "~> 0.3"},
      {:ex_doc, ">= 0.0.0", only: :dev, runtime: false}
    ]
  end

  defp package do
    [
     files: ["lib", "priv", "mix.exs", "README.md"],
     organization: "my_hex_org"
    ]
  end

Including the priv dir in the files list is essential to ensuring that the content archive is packaged with the hex release, and is then made available to the target application.

Setting the organization here scopes the published package to a hex organization, thus ensuring that it remains private.

Then in the application where your content is being used, be sure to include the intermediate library as a dependency:

defp deps do
  [
    ...
    {:myapp_blog, "~> 0.1", organization: "my_hex_org"}
  ]
end

And then you should be able to use your content directly in your application:

MyappBlog.Archive.articles()
MyappBlog.Archive.topics()

Parsed Content Constraints

As of Archivist version 0.2.6 archives can receive flags for lists of valid_topics and valid_tags. Version 0.2.9 added support for valid_authors constraints. Here are some examples of constraints:

defmodule Myapp.Archive do
  use Archivist.Archive,
    valid_topics: [
      "Action",
      "Classic",
      "Crime",
      "Fiction",
      "Films",
      "Sci-Fi"
    ],
    valid_tags: [
      :action,
      :adventure,
      :aliens,
      :crime,
      :horror,
      :literature,
      :modern_classic,
      :sci_fi,
      :thrillers
    ],
    valid_authors: [
      "Jules Verne",
      "Julian Blaustein",
      "Michael Mann"
    ]
end

Adding articles with tags, topics or authors that don't conform to these lists, or using a topic directory structure that doesn't conform to these lists will throw warnings at compile time, like this:

warning: Archivist Archive contains invalid topics: Action, Classic
  (archivist) lib/archivist/parsers/article_parser.ex:77: Archivist.ArticleParser.warn_invalid/3

warning: Archivist Archive contains invalid tags: action, adventure
  (archivist) lib/archivist/parsers/article_parser.ex:77: Archivist.ArticleParser.warn_invalid/3

warning: Archivist Archive contains invalid authors: Ernest Hemingway
  (archivist) lib/archivist/parsers/article_parser.ex:77: Archivist.ArticleParser.warn_invalid/3

Compilation will not cease, however, simply because these constraints are being violated.

Please note that only exact topic and author matches are accounted for here, so "Sci-Fi" will not be considered equivalent to "SciFi" and will throw a warning. Similarly, "J.D. Salinger" will not be considered to be the same author as "JD Salinger" by the article parser.

If you do not want warnings for tags, topics or authors during compilation, simply leave valid_tags, valid_topics or valid_authors undeclared, and the corresponding check will be skipped.

Also note that valid topics are currently checked only against the flattened list of topics and sub-topics. There is no functionality in place at the moment for constraining specific topic hierarchies.
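
The exact-match behavior described above can be illustrated with a short sketch. It is not Archivist's implementation. The ConstraintSketch module is hypothetical, and it only shows how comparing against a flat list with strict equality lets near-matches such as "SciFi" slip through as invalid.

```elixir
defmodule ConstraintSketch do
  # Hypothetical helper (not Archivist's implementation).

  # No constraint declared (nil): nothing is reported.
  def invalid(_found, nil), do: []

  # Otherwise report each distinct value that is missing from the allowed
  # list. Only exact equality counts, so "SciFi" does not match "Sci-Fi".
  def invalid(found, valid) when is_list(valid) do
    found |> Enum.uniq() |> Enum.reject(&(&1 in valid))
  end
end

found_topics = ["Films", "SciFi", "Classic", "Films"]
valid_topics = ["Classic", "Films", "Sci-Fi"]

IO.inspect(ConstraintSketch.invalid(found_topics, valid_topics))
IO.inspect(ConstraintSketch.invalid(found_topics, nil))
```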

Note also that slug_warnings is on by default, meaning that the parser will throw warnings if duplicate slugs are found across articles in your content archive. This can be turned off by setting slug_warnings: false when you declare your archive, like this:

defmodule Myapp.Archive do
  use Archivist.Archive, slug_warnings: false
end
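
If you turn the warnings off but still want to check for collisions yourself, duplicate slugs can be found from the list of slugs with plain list operations. This is a sketch only: the SlugSketch module is hypothetical, and the sample list stands in for the result of Archive.slugs().

```elixir
defmodule SlugSketch do
  # Hypothetical helper (not part of Archivist).
  # Removing one occurrence of every distinct slug leaves only the extra
  # copies; Enum.uniq/1 then collapses them to one entry per duplicate.
  def duplicates(slugs) do
    (slugs -- Enum.uniq(slugs)) |> Enum.uniq()
  end
end

# Stand-in data; in an application this would come from the archive.
slugs = ["heat", "the-day-the-earth-stood-still", "heat", "heat"]
IO.inspect(SlugSketch.duplicates(slugs))
```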

Mounting Images with Plug

If you choose to store images with your archive, it's probably most useful to mount that content at a static assets path where it can be digested by Webpack or whichever assets manager you're using.

For systems built with Plug (including Phoenix), it's easy enough to mount the images path with Plug.Static at the path of your choice. Simply pass the name of the OTP app where your content is stored along with the path to the images:

plug Plug.Static,
  at: "/blog/images/",
  from: {:myapp_blog, "priv/archive/images"}

Note that for applications that employ the Intermediate Library Pattern, the options for Plug.Static will look like the example above. When Archivist is used directly in the target application, the name of the current application should be used instead, like this:

plug Plug.Static,
  at: "/blog/images/",
  from: {:myapp, "priv/archive/images"}

For use within Phoenix in particular, your plug declaration would likely go in the MyappWeb.Endpoint module, like this:

defmodule MyappWeb.Endpoint do
  use Phoenix.Endpoint, otp_app: :myapp

  # app name here will be :myapp or :myapp_blog depending on which otp app
  # contains the content archive
  plug Plug.Static,
    at: "/blog/images/",
    from: {:myapp_blog, "priv/archive/images"}
end

Development Notes

Please find additional information about known issues and planned features for Archivist in the issues tracker.

Todo

Issues and Todo enhancements are managed at the official Archivist issues tracker on GitHub.

Availability

Source code is available at the official Archivist repository on the FunctionHaus GitHub Organization.

License

Archivist source code is released under the Apache 2 License. See the LICENSE file for more information.

© 2017 FunctionHaus, LLC

archivist's Issues

Add archive_root option to abstract priv path issue

Currently the archive_dir option has to change if the application flag is used to utilize a 'remote-style' archive. This seems like an antipattern. archive_dir should refer only to the name of the archive directory, and the archive_root should be the path to the entire archive entity.

Add multiple article fixtures for testing

Testing is currently only being performed with a single article fixture. Let's add a few different article examples in sub-directories to ensure that attribute lists are being parsed and compiled as expected.

Add compile-time constraints for Topics

Add some kind of functionality for allowing users to create a constrained list of topics and sub-topics, allowing only those specific topics and sub-topics to be used by articles. Articles that apply topics and sub-topics that do not exist within the constrained list should throw errors at compile-time.

Add compile-time constraints for author names

Add some kind of functionality for allowing users to create a constrained list of author names, allowing only those specific author names to be used. Articles with author names that do not exist within the constrained list should throw warnings at compile-time.

Add intelligent parsing for system-wide topic hierarchies

Currently Topics and sub-topics are being parsed as single-level entities and mixed together without any enforced hierarchy. The topics parser should retain the hierarchy set forth in all articles and infer topics and sub-topics based on the hierarchy established in the articles.

Add option for parsing topics from directory paths or articles

Currently topics are being parsed based on the directories and sub-directories used to store the articles within the archive. The user should have the option to ignore those directory paths and instead use only the topic hierarchies listed within the article arcdown content itself.

Integrate article_parser option into parsing workflow

The article_parser Archive flag currently is unused and is a placeholder. We need to inject this into the article parsing workflow so that content is actually parsed using the module or anonymous function stored in the article_parser setting.

Resolve assets dirs relative to the app priv path

Assets directories for articles and images are currently being resolved relative to the current working directory (Path.relative_to_cwd) but should be resolved to the priv dir for the target BEAM application that is housing the archive assets (Path.join(:code.priv_dir(:some_app), "archive")).

We need to add an archive option to determine which app is housing the target assets, either instead of the current solution or as an override to the existing approach.

Consolidate articles and image dirs into common priv/archive dir

Articles and images are currently stored in separate directories in the priv dir, which could get messy for applications that already have several directories under priv. By default these should be grouped into a top-level priv/archive dir to create less clutter in the priv dir.

Allow parsing content from external OTP applications

Currently Archivist only works if it is implemented as a dependency in the application where the content and images also reside. We need to add functionality to allow the user to declare which OTP app to fetch the content from so that content paths are correctly resolved to the target directory, not just the current working directory.

Add compile-time constraints for tags

We need some kind of functionality for allowing users to create a constrained list of tags for each application, allowing only certain tags to be used within articles. Articles that apply tags that do not exist within the constrained list should throw errors at compile-time.

Add support for image parsing

Add functionality for storing images at priv/images in the target image directory, and parsing available image filenames.

Add docs for using Archivist through an OTP app

README and docs currently outline a basic workflow for using Archivist in the same Mix library/application in which the archive content is being held, but we need to outline the process of referencing content from an external OTP application being used as a dependency.

Add docs for using intermediate Archivist repository

The best design pattern for using Archivist seems to be installing it in an intermediary mix library (some_app_blog etc.), collecting the archived articles and images there, and then including the intermediate mix library in the final target application. This approach moves content-related commits into an archive/content-specific repository, and allows the consumer library to focus just on commits for code. We need documentation in the README and/or Wiki to outline this process and how it needs to be set up, both as an intermediate library and in the final destination (including using Plug.Static to serve image assets).
