novoid / lazyblorg

Blogging with Org-mode for very lazy people

License: GNU General Public License v3.0

Python 94.16% Shell 1.71% SCSS 4.13%

python orgmode emacs blog blogging blog-engine lazy simplicity simple simplistic

lazyblorg's Introduction

## -*- coding: utf-8;mode: org; -*- ## This file is best viewed with GNU Emacs Org-mode: http://orgmode.org/

«A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away.» (Antoine de Saint-Exupéry)

lazyblorg – blogging with Org-mode for very lazy people

This is a web log (blog) environment for GNU Emacs with Org-mode which generates static HTML5 web pages. It is far superior to any other Org-mode-to-blog solution I have seen so far!

<(All?) your Org-mode files>  --lazyblorg-->  static HTML pages
                                                      |
                                                      v
                                 optional upload (shell) script
                                                      |
                                                      v
                                                 your web space

There is a list of similar/alternative Org-mode blogging projects whose workflows seem really tedious to me.

See my original post to the Org-mode ML for how this idea of lazyblorg started in 2011.

This awesome piece of software is a sheer beauty with regard to:

simplicity of creating a new blog entry
- I mean it: there is no step which can be skipped!
  - add heading+content anywhere, add ID, tag, invoke lazyblorg
integration into my personal, published workflows
- here, you have to either adopt my totally awesome workflows or find alternative ways of doing the following things:
  - linking/including image files or attachments in general (I use this Memacs module)
    - advantage of my method: I simply add an image file by typing tsfile:2014-03-03-this-is-my-file-name.jpg in double-brackets and I really don’t care in which folder the file is currently located on my system
  - copying resulting HTML files to web-space (I do it using unison/rsync)
  - probably more to come

DISCLAIMER: This is a personal project

I wrote lazyblorg to get a blogging system that works for me exactly the way I need it. I did not write lazyblorg for the pleasure of the coding process - I just wanted to get the resulting thing working in order to be able to blog the way I want to blog.

Therefore, it’s mostly a works-for-me project. I won’t add anything that I don’t use myself.

For the same reason, I won’t accept pull requests for things like:

Features I don’t use myself.
Added complexity I don’t want.
Features or dependencies I don’t understand completely.
Stuff that might interfere with the way I’m using lazyblorg.

If you do want to adapt lazyblorg and implement your own features that conflict with the things listed above, I’d like to see you creating a fork of this project. If you drop me a line with a description of how your fork differs from the original, I’m glad to add it to this README file.

If you think that your idea might be also a good one for me, you can always hand in a pull request and I’ll have a look if I’m willing to add it. However, don’t be disappointed if I don’t. I want to keep the time spent on lazyblorg at a minimum. My personal priority is visible in the lazyblorg issues where I introduced tags for high and low priority things to implement.

Target group

Lazy users of Org-mode who want to do blogging very easily and totally cool.

Or simply wannabes. I’m perfectly fine with this as long as they use lazyblorg.

Other people using lazyblorg

Pages using lazyblorg are listed on my personal tag page on “lazyblorg”. Please do drop me a line when you want to get your page added to the list.

Quote from Sepp Jansen:

[…]

But when I revisited lazyblorg after studying the other packages, it suddenly seemed like a better solution. After only a short time of reading I figured out the entire templating and post generation system. Although not the most elegant, it is super simple and easy to understand. And those are my most important points.

The developer states that it is easy to configure and start building, and is absolutely right.

In just a few hours I went from installing dependencies to having a fully working website, including some layout and CSS customization. The included HTML and CSS is easy to modify so I could (lazily) make the site look like I wanted it to without too much digging in many little files. I even managed to make it look a lot like my old site without too much effort! Lazyblorg really lives up to its name!

[…]

I really like lazyblorg, and I’ll happily manage this website with it for as long as possible.

Skills necessary

modifying configuration settings, e.g., in script files
optional: creating scheduled tasks (cronjob, …) if you are a really lazy one (and if you trust lazyblorg to do its job in the background)

System requirements

lazyblorg is written in Python 3.

Development platform is Debian GNU/Linux. So with any decent GNU/Linux you should be fine.

It might work on OS X but I have never tried it.

It definitely does not work with Microsoft Windows, although a programmer can probably add a couple of os.path.thisorthat() calls here and there to get it going. Please consider sending a pull request if you fix this issue. Thanks!

Version and Changelog

Currently (2019-10-23), I consider lazyblorg in beta-status with version 0.96 or so.

I don’t maintain a specific changelog. However, when there are substantial changes to lazyblorg, you will find a blog article tagged with “lazyblorg”. Use an RSS/Atom aggregator to follow the blog.

Why lazyblorg?

Minimum effort for blogging.

And: your blog entries can be written anywhere in your Org-mode files. They will be found by lazyblorg. :-)

Further advantages are listed below.

Example workflow for creating a blog entry

write a blog entry anywhere in your Org-mode files
- With lazyblorg, you can, e.g., write a blog article about an event as a sub-heading of the event itself!
tag your entry with :blog:
add a unique ID in the PROPERTIES drawer
- You might want to use a package that automatically generates unique IDs for your headings (I don’t).
- You might want to take a look at this solution using file or directory variables.
set the state of your entry to DONE
- make sure that a :LOGBOOK: drawer entry will be created that contains the time-stamp

An example blog entry looks like this:

** DONE An Example Blog Post           :blog:lazyblorg:software:
CLOSED: [2017-06-18 Sun 00:16]
:PROPERTIES:
:ID: 2017-07-17-example-posting
:CREATED:  [2017-06-17 Sat 23:45]
:END:
:LOGBOOK:
- State "DONE"       from "NEXT"       [2017-06-18 Sun 00:16]
:END:
       […]
Today, I found out that…

That’s it. lazyblorg does the rest. It feels like magic, doesn’t it? :-)
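The four criteria from the workflow above can be sketched as a small Python check. This is only an illustration and not lazyblorg's actual parser: the name is_blog_entry and both regular expressions are assumptions made for this example.

```python
import re

# Hypothetical helper (not part of lazyblorg): checks the four criteria
# on the raw text of a single Org-mode heading.
HEADING_RE = re.compile(r"^\*+\s+(?P<state>[A-Z]+)\s+.*?\s+:(?P<tags>[\w:@]+):\s*$")
ID_RE = re.compile(r"^:ID:\s+\S+", re.MULTILINE)
DONE_LOG_RE = re.compile(
    r'^- State "DONE"\s+from\s+.*\[\d{4}-\d{2}-\d{2}[^\]]*\]', re.MULTILINE
)

EXAMPLE = """\
** DONE An Example Blog Post           :blog:lazyblorg:software:
CLOSED: [2017-06-18 Sun 00:16]
:PROPERTIES:
:ID: 2017-07-17-example-posting
:CREATED:  [2017-06-17 Sat 23:45]
:END:
:LOGBOOK:
- State "DONE"       from "NEXT"       [2017-06-18 Sun 00:16]
:END:
Today, I found out that...
"""


def is_blog_entry(heading_text: str) -> bool:
    """Return True if the heading is DONE, tagged blog, has an ID and a DONE log entry."""
    lines = heading_text.strip().splitlines()
    match = HEADING_RE.match(lines[0]) if lines else None
    if not match:
        return False
    body = "\n".join(lines[1:])
    tags = match.group("tags").split(":")
    return (
        match.group("state") == "DONE"
        and "blog" in tags  # direct tag only, no inheritance
        and ID_RE.search(body) is not None
        and DONE_LOG_RE.search(body) is not None
    )
```

Calling is_blog_entry(EXAMPLE) accepts the example entry; removing the blog tag or changing the state makes the check fail.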

Advantages

These things make a blogger a happy one:

No other Org-mode blogging system I know of is able to process blog entries which are scattered across all your Org-mode documents except the usual org-export-based approaches.

No other Org-mode blogging system I know of is able to generate a blog entry with such minimal effort for the author.

You do not need to maintain a specific Org-mode file that contains your blog posts only. *Create* blog posts anywhere in between your notes, todos, contacts, …

And there are some technological advantages you might consider as well:

You don’t need to write or correct HTML code by yourself.
produces static, state-of-the-art HTML5
- it’s super-fast on delivery to browsers
- very low computing requirements on your web server: minimum of server load
No in-between format or tool.
- Direct conversion from Org-mode to HTML/CSS.
- dependencies have the tendency to cause problems when the dependent tools change over time
- lazyblorg should be running fine for a long time after it is set up properly
Decide by yourself how and where you are hosting your blog files and log files.
you will find more advantages when running and using lazyblorg - I am very confident about that ;-)

Disadvantages

Yes, there are some disadvantages. I am totally honest with you since we are becoming close friends right now:

lazyblorg re-generates the complete set of output pages on every run
- this will probably change in a future release (to me: no high priority)
- most of the time this is not an issue at all
  - if pages are generated on a different system than the one the web server runs on, performance is a minor issue
  - if you don’t have thousands of pages, this will not take long
lazyblorg is implemented in Python:
- Its Org-mode parser supports only a (large) sub-set of Org-mode syntax and features.
  - Basic rule: use an empty line between two different syntax elements such as paragraphs, lists, tables, and so on.
  - Whenever I think that an additional Org-mode syntax element is needed for my blog, I start thinking of implementing it
  - I am using Pandoc as a fall-back for all other Org-mode syntax elements which works pretty fine
  - For a list of general Org-mode parsers please read this page
lazyblorg is using state-of-the-art HTML5 and CSS3
- No old HTML4.01 transitional stuff or similar
- Results might not be compatible with browsers such as Internet Explorer or mobile devices.
  - tell your Internet Explorer friends that they should do themselves a favor and switch to a real browser
You have to accept the one-time setup effort which requires knowledge of:
- using command-line tools
- modifying configuration files
- summary: getting this beautiful thing to work in your environment

Features

«Technology develops from the primitive via the complex to the simple.»

(Antoine de Saint-Exupéry; note: lazyblorg is currently “primitive” but with a great outlook up to the status of being simple)

Here is a selection of features of lazyblorg which helps you to blog efficiently:

Converts Org-mode To HTML5: lazyblorg supports a (large sub-)set of syntax elements of Org-mode
- also see FAQs for “What Org-mode elements are supported by lazyblorg?”
Different page types allow you to create:
1. articles related to a specific date (temporal pages)
  - Those articles are published and hardly updated.
2. articles not related to a specific date (persistent pages)
  - Frequent updates or the absence of any day-relation makes this page type very sexy to use.
3. articles describing a tag you are using (tag pages)
  - Yes, with lazyblorg, you are (optionally) able to explain how you are using a certain tag. You can link your most important tag-related articles and so forth. Most systems don’t offer any possibility to communicate the meaning of the tags used.
4. the entry page of your blog
  - You gotta give them a starting page ;-)
5. the templates which are used to generate your blog pages
  - Hooray, you are able to define all templates of your blog within Org-mode as well. No need to edit source code here. Isn’t this great?
To efficiently notify users of new articles or changes to existing articles, lazyblorg generates RSS/ATOM feeds.
Really fast to use linking to other blog articles using their ID property.
At the bottom of each article, there is a list of related articles that back-link to here.
You can very easily embed image files with automatic scaling to their desired width
- This feature is hardened against image file renaming and against links broken by moving image files to different folders
- Users of Memacs do have advanced possibilities here as well
- An optional image cache directory holds previously resized image files and therefore avoids the resizing effort on each run.
For navigating through the blog articles I do recommend using the tags. Articles related to one topic share common tags whereas a date-oriented archive has only very limited use. The tag cloud which is on the tag overview page offers a quick overview of your most used tags.
There is a search feature which brings you to the content by searching for keywords or phrases.
Easy embedding of external content such as Tweets or YouTube videos.
You can exclude content from being published with various features:
1. Comment lines
2. Mark an article/heading as hidden using the tag NOEXPORT
3. The hidden tag does publish an article but hides it from the entry page, navigational pages, and the feeds. This way, you can publish pages that can only be accessed by people who know their URL.
Reading time estimations (multi-language) following this feature request
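To give an idea of how a reading time estimation can work, here is a minimal sketch. It is not lazyblorg's implementation: the constant of 200 words per minute and the function name are assumptions, and the real feature is multi-language while this sketch uses one fixed reading speed.

```python
import math
import re

# Assumed average reading speed; not a lazyblorg configuration value.
WORDS_PER_MINUTE = 200


def estimate_reading_minutes(text: str, words_per_minute: int = WORDS_PER_MINUTE) -> int:
    """Estimate the reading time in whole minutes (at least one minute)."""
    words = re.findall(r"\w+", text)
    return max(1, math.ceil(len(words) / words_per_minute))
```

A per-language reading speed could be passed in via the words_per_minute parameter.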

FAQs

See https://github.com/novoid/lazyblorg/wiki/FAQs

Installing and Starting with lazyblorg

I am using it for my own blog and therefore it gets more and more ready to use as I add new features.

What’s working so far:

parsing a large sub-set of Org-mode
- most important: the parser requires a blank line between different Org mode elements
parsing the HTML templates
generating HTML5 pages with a sub-set of the sub-set of the Org-mode syntax elements

External dependencies

The number of external dependencies is kept at a minimum.

This is a list of the most important dependencies:

Werkzeug
- for sanitizing path components
- I installed it on Debian GNU/Linux with sudo apt-get install python3-werkzeug
pickle
- object serialization
- most likely: should be part of your Python distribution
pypandoc
- some Org-mode syntax elements are being converted using Pandoc and its Python binding pypandoc
- you can get it via sudo apt-get install pandoc and sudo pip install pypandoc (the Debian package is named python3-pypandoc)
- Note: Debian GNU/Linux 8 (Jessie) comes with a Pandoc version which has bugs. Please install a more recent version. I upgraded to pandoc-1.15.1-1-amd64.deb from: http://pandoc.org/installing.html
opencv-python
- lazyblorg scales embedded images according to the HTML export attributes
- Install using sudo apt-get install python3-opencv
Sass (optional) if you want to generate your CSS from the scss-file
orgformat
- This is my library that provides basic utility functions when working with Org mode strings

All other libraries should be part of a standard Python distribution.

If you don’t want to install the dependencies via package management, you can use the python way from requirements.txt:

pip install -r requirements.txt

Running tests also requires `pytest`.

How to Start

Get the source
- git clone https://github.com/novoid/lazyblorg.git or download current version as ZIP file
Adapt config.py to meet your settings.
Do a technological test-drive
- start: lazyblorg/example_invocation.sh
- this should work with GNU/Linux (and most probably OS X)
- if not, there is something wrong with the set-up; maybe missing external libraries, wrong paths, …
Study, understand, and adapt the content of example_invocation.sh
- with this, you are able to modify command line parameters to meet your requirements
- if unsure, ask for help using lazyblorg.py --help
Get an overview of what defines a lazyblorg blog post and write your own blog posts. A (normal temporal) blog article consists of:
1. A (direct) tag has to be blog
  - Sorry, no tag inheritance. Every blog entry has to be explicitly tagged.
2. You have to add a unique :ID: property
3. The entry has to be marked with DONE
4. A :LOGBOOK: entry has to be found with the time-stamp of setting the entry to DONE
  - in my set-up, this is created automatically
5. Familiarize yourself with the sub-set of Org-mode syntax you can use with lazyblorg
  - Always put an empty line between different syntax elements such as a heading and the next paragraph, normal text and a list or a table, and so forth.
  - You should not get a disaster if you are using elements lazyblorg is not optimized for. The result might disappoint you, that’s all.
  - However, “unknown” Org-mode elements are automatically converted through pandoc as a fall-back.
OPTIONAL: Write your own CSS file
- you can take a look at mine if you do not care that I am not really into Web design :-)
- please replace the hard-coded URL to the CSS file in lazyblorg/templates/blog-format.org and link it to your CSS file
OPTIONAL: Adapt the blog template
- default template is defined in lazyblorg/templates/blog-format.org
OPTIONAL: Create tag pages for your most important tags where you describe how you are using this tag, what are the most important blog entries related to the tag and so forth.
Publish your pages on a web space of your choice
- publishing can be done in various ways. This is how I do it using lazyblorg/make_and_publish_public_voit.sh which is an adapted version of lazyblorg/example_invocation.sh:
  1. invoking start_all_tests.sh
    - this is for checking whether or not recent code changes did something harmful to my (unfortunately very limited) set of unit tests
  2. invoking lazyblorg with my more or less fixed set of command line parameters
  3. invoking rsync -av testdata/2del/blog/* $HOME/public_html/
    - it synchronizes the newly generated blog data to the local copy of my web space data
    - this separation makes sense to me because with this, I am able to do test drives without overwriting my (local copy of my) blog
  4. invoking unison
    - in order to transfer my local copy of my web space data to my public web space
- This method has the advantage that generating (invoking lazyblorg) and publishing (invoking unison) are separate steps. This way, I can locally re-generate the blog (for testing purposes) as often as I want to. However, as long as I do not sync it to my web space, I keep the meta-data (which is in the local web space copy) of the published version (and not the meta-data of the previous test-run).
Have fun with a pretty neat method to generate your blog pages

Because we are already close friends now, I tell you a hidden feature of lazyblorg nobody knows yet: whenever you see a π-symbol in the upper right corner of a blog entry on my blog: this is a link to the original Org-mode source of that page. This way, you can compare Org-mode-source and HTML-result right away. Isn’t that cool? :-)

Five categories of page type

There are five different types of pages in lazyblorg. Most of the time, you are going to produce temporal pages. However, it is important to understand the other ones as well.

In order to process a blog-heading to its HTML5 representation, its Org-mode file has to be included in the --orgfiles command line argument of lazyblorg.py. Do not forget to include the archive files as well.

temporal
persistent
tags
entry page
templates

Please do read https://github.com/novoid/lazyblorg/wiki/Page-Types for important details.

BONUS: Preview Blog Article

It is tedious to re-generate the whole blog and even upload it to your web-space just to check the HTML version of the article you are currently writing.

Yeah, this also sucks at my side.

Good news everybody: There is a simple method to preview the article under the cursor. The script preview_blogentry.sh contains an ELISP function that extracts the current blog article (all lazyblorg criteria have to be fulfilled: ID, blog tag, status DONE), stores it into a temporary file, and invokes lazyblorg via preview_blogentry.sh with this temporary file and the Org-mode file containing the format definitions.

If this worked out, your browser shows you all generated blog articles.

Please do adapt the mentioned scripts to your specific requirements - the ones from the repository are for my personal set-up which is unlikely to fit yours (directory paths mostly).

Bang! Another damn cool feature of lazyblorg. This is getting better and better. :-)

BONUS: Jump From URL to Blog Article

Imagine, you’re looking at a blog article of your nice lazyblorg-generated blog. Now you want to go to the corresponding Org-mode source to fix a typo.

The issue here is that you either have to know where your heading is located or you have to go to the HTML page source, extract the ID, and jump to this ID.

I’ve got a better method: put the URL of your blog article into your clipboard (via C-l C-c), press a magic shortcut in Emacs, and BAAAM! you’re right on spot.

How’s that magic happening?

Just use the following Emacs lisp code snippet, adapt the domain string, and assign a keyboard shortcut:

(defun my-jump-to-lazyblorg-heading-according-to-URL-in-clipboard ()
  "Retrieve a URL from the clipboard, fetch the page behind it,
extract the Org-mode ID of the article, and jump to its heading."
  (interactive)
  (let (;; Get the URL from the clipboard. Since it may carry text
        ;; properties, strip them with `substring-no-properties'.
        (url (substring-no-properties (current-kill 0)))
        ;; Check string: if the URL in the clipboard does not start
        ;; with this, an error message is shown.
        (domain "http://karl-voit.at"))
    ;; Check whether the URL is from my domain (all other strings do
    ;; not make any sense here).
    (if (string-prefix-p (upcase domain) (upcase url))
        ;; Retrieve the content of the URL into a new buffer asynchronously.
        (url-retrieve
         url
         ;; Call this lambda function when the URL content is retrieved.
         (lambda (status)
           ;; Extract and prepare the ID.
           (let* (;; Limit the ID search to the top 1000 characters of the
                  ;; buffer (or less if the buffer is shorter).
                  (pageheader (buffer-substring 1 (min (point-max) 1000)))
                  ;; Start index of the ID.
                  (start (string-match "<meta name=\"orgmode-id\" content=\"" pageheader))
                  ;; End index of the ID.
                  (end (string-match "\" />" pageheader start))
                  ;; Number of characters to skip for the opening tag.
                  (chars-to-skip (length "<meta name=\"orgmode-id\" content=\""))
                  ;; Extract the ID if both indexes were found.
                  (lazyblorg-id (if (and start end (< start end))
                                    (substring pageheader (+ start chars-to-skip) end)
                                  nil)))
             (if lazyblorg-id
                 (progn
                   (message (concat "Looking for id:" lazyblorg-id " ..."))
                   ;; Newer Org versions name this function
                   ;; `org-link-open-from-string'.
                   (org-open-link-from-string (concat "id:" lazyblorg-id)))
               (message "Sorry: no orgmode-id found in the page header.")))))
      ;; Show at most as many characters of the URL as the domain has.
      (message (concat "Sorry: the URL \""
                       (substring url 0 (min (length url) (length domain)))
                       "...\" doesn't start with \"" domain "\". Aborting.")))))
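The extraction step of the function above (finding the orgmode-id meta tag within the first 1000 characters of the page) can also be sketched in Python, for example for use in a shell pipeline. The function name and the regular expression are assumptions for this illustration; only the meta tag name is taken from the Emacs Lisp snippet.

```python
import re
from typing import Optional

# Same meta tag the Emacs Lisp snippet searches for.
META_RE = re.compile(r'<meta name="orgmode-id" content="([^"]*)"')


def extract_orgmode_id(html: str, search_limit: int = 1000) -> Optional[str]:
    """Return the Org-mode ID from the page header, or None if it is missing."""
    match = META_RE.search(html[:search_limit])
    return match.group(1) if match else None
```

The returned ID can then be handed to Emacs, e.g. as an id: link.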

BONUS: Embedding External Things

Do read the Wiki for embedding external stuff like Tweets or YouTube videos.

Using relative URLs instead of domain-URLs

The links to the CSS & co, even on the homepage, start with a slash. This means you can’t easily have a look at the HTML locally by opening the files; you need to spin up a web server.

If you want to learn how to move to a more flexible setup, read this comment and follow its instructions.
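The basic idea of relative links can be sketched as follows. This is not the method from the linked comment, just an illustration: the function name and the example paths are assumptions.

```python
import posixpath


def to_relative_url(absolute_target: str, page_path: str) -> str:
    """Rewrite a link that starts with a slash so that it is relative to the page.

    absolute_target: e.g. "/style.css" (hypothetical file name)
    page_path:       path of the page that contains the link
    """
    page_dir = posixpath.dirname(page_path)
    return posixpath.relpath(absolute_target, start=page_dir)
```

With links rewritten this way, the generated files can be opened directly from the local file system.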

How to Thank Me

I’m glad you like my tools. If you want to support me:

Send an old-fashioned postcard by snail mail - I love personal feedback!
- see my address
Send feature wishes or improvements as an issue on GitHub
Create issues on GitHub for bugs
Contribute merge requests for bug fixes
Check out my other cool projects on GitHub

If you want to contribute to this cool project, please fork and contribute!

Issues, bugs,… are maintained in the GitHub issue tracker.

I am using Python PEP8 and some ideas from Test Driven Development (TDD).

lazyblorg's Issues

Implement a simple snippet system

I'd like to have a simple method of adding text snippets to any generated blog page.

Example Concept

Blog format template file gets extended by an additional main heading "Snippets" that holds one heading per snippet:

* README
[...]
* DONE Templates    :lb_templates:blog:
[...]
* DONE Snippets      :lb_snippets:blog:
** disclaimer

This article belongs to a series of articles about the topic "Foo Bar".

You can read more about this topic in [[id:foobar][this page]].

** example

a wonderful example

** another snippet
[...]

Blog page in Org mode:

** DONE My Example   :blog:
:PROPERTIES:
[...]
:END:

#disclaimer#

I'd love to have #example# in lazyblorg.

Resulting HTML page:

[Header]

This article belongs to a series of articles about the topic "Foo Bar".

You can read more about this topic in <a href="https://[...URL of this page...]">this page</a>.

I'd love to have a wonderful example in lazyblorg.

Leading and trailing newlines of snippets are omitted. If you need one, do newlines before/after using the snippet in your blog post.
Leading and trailing whitespaces of snippets are omitted. This way, snippets can be used within sentences.
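A minimal sketch of the proposed substitution, assuming that snippets are referenced as #name# and are already available as a dictionary. Both the function name and the decision to leave unknown names untouched are assumptions; nothing of this is implemented in lazyblorg yet.

```python
import re
from typing import Dict

# Snippet names may contain word characters, spaces, and hyphens
# (e.g. "another snippet"); this character set is an assumption.
SNIPPET_RE = re.compile(r"#([\w -]+)#")


def expand_snippets(text: str, snippets: Dict[str, str]) -> str:
    """Replace every #name# with its snippet, stripped of surrounding whitespace."""

    def replace(match):
        name = match.group(1)
        if name in snippets:
            # Leading/trailing newlines and whitespace are omitted.
            return snippets[name].strip()
        return match.group(0)  # unknown name: keep the text as it is

    return SNIPPET_RE.sub(replace, text)
```

Because the snippet text is stripped, a snippet can be used within a sentence, as in the #example# case above.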

Open questions:

Is there a need for a check on uniqueness of the snippet definitions or their starting substrings? Or is it up to the user to shoot himself in the foot?
Can existing template definitions like #BLOG-NAME# or #AUTHOR-NAME# be used when typing a blog post? (needs confirmation first)

Fix archive.web links

Links with descriptions containing a valid URL like https://web.archive.org/web/20200602015020/https://karl-voit.at/managing-digital-photographs/ do not work.

Command line parameter to extract random articles

I'd like to have the possibility to start a lazyblorg run with an additional parameter like --randomarticles which prints out (a configurable amount of) random articles with their URLs and their titles.

This can be run without re-generating the actual blog files. Therefore, this would only parse existing Org mode files, extract all the meta-data, generate the URLs and print out the results.

Using this feature, it is possible to promote older articles that you may have forgotten yourself as the author.

Implementation might relate to #43 and #44 which also might potentially extract data without generating the actual blog data (not decided yet).
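The selection step could look like the following sketch. It assumes the meta-data has already been extracted into a list of (URL, title) pairs; the function name and the optional seed parameter are assumptions.

```python
import random
from typing import List, Optional, Tuple


def pick_random_articles(
    articles: List[Tuple[str, str]], count: int, seed: Optional[int] = None
) -> List[Tuple[str, str]]:
    """Pick up to count random (url, title) pairs without repetition."""
    rng = random.Random(seed)
    return rng.sample(articles, min(count, len(articles)))
```

A fixed seed makes the selection reproducible, which might help with testing.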

Atom feeds for auto-tags (language, size, ...)

Automatically assigned tags for the dominant text language (already implemented), the article size (small/medium/large) and so forth are good candidates for Atom (better replacement for RSS) feeds.

lazyblorg shall generate feeds per auto-tag.

Allow the "Comments" section to be separated from the main text

This is a feature wish.

Consider you define a set of strings that qualify a top-level heading of the article as the comment section. In my use-case, this would be COMMENTS_LINE_DEFINITIONS = ['Comments', 'Kommentare'].

Whenever such a heading appears on the top level of the current article, its section is stored and can be handled differently, if you want.

For example, you could define the page template so that the comment section goes to the bottom of the page: below the "Related articles that link to this one" section and even below the Disqus comments.
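Splitting off the comment section could be sketched like this. It assumes the article is available as a list of Org-mode lines and that the heading title has to match one of the configured strings exactly; the function name is hypothetical.

```python
import re
from typing import List, Tuple

COMMENTS_LINE_DEFINITIONS = ['Comments', 'Kommentare']

HEADING_RE = re.compile(r"^(\*+)\s+(.*?)\s*$")


def split_comments(lines: List[str], article_level: int) -> Tuple[List[str], List[str]]:
    """Separate the comment section(s) from the main text of one article.

    article_level is the number of stars of the article heading itself;
    its top-level sub-headings have one star more.
    """
    main, comments = [], []
    in_comments = False
    for line in lines:
        match = HEADING_RE.match(line)
        if match and len(match.group(1)) == article_level + 1:
            # A new top-level section starts: is it a comment section?
            in_comments = match.group(2) in COMMENTS_LINE_DEFINITIONS
        (comments if in_comments else main).append(line)
    return main, comments
```

The template could then place the second list wherever it wants, e.g. at the bottom of the page.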

Link (language-)autotag articles to autotag-page

Articles that are tagged with autotags like language:english should be linked to the corresponding tag pages.

For example: https://karl-voit.at/2016/11/16/empty-autotag-page/ has no links at all although there are many English articles in the blog.

Robust Atom/RSS feed generator

With my simple Atom/RSS feed generating code, I often had issues with some Atom/RSS aggregators in combination with special characters, encoding, and such. Images don't work so far within Atom/RSS feeds.

Maybe somebody wants to volunteer to add a decent Atom/RSS support that generates rock solid Atom/RSS feeds so that people might follow the "full content" feed and not the "link only" feed which I am forced to recommend.

remove recommendation for simple feed

Blog entries can not start with a list

When a blog entry starts with a list instead of a text paragraph, an exception is thrown.

Simple reference/citation

Idea:

I do have a folder with all of my BibTeX files as described on https://karl-voit.at/2015/12/26/reference-management-with-orgmode/
For lazyblorg I can think of using a custom link like cite:Voit2012b when blogging. What lazyblorg does when generating the blog:

copy the BibTeX file (as is) to a special folder karl-voit.at/bib/ (or similar): karl-voit.at/bib/Voit2012b.bib
create a href link to that bib-file in the blog article
the link text is something meaningful (FIXXME: to be defined)

This way, details on references exist only once. Somebody might copy the BibTeX entry and re-use it. I keep using my usual simple citation workflow even for blog articles.

What do you think about it?
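The link generation part of this idea could be sketched as follows. Since the link text is not defined yet (see the FIXXME above), the sketch simply uses the citation key; the base URL with https, the function name, and the allowed key characters are assumptions as well. Copying the BibTeX file is not shown.

```python
import re

# Assumed target folder for the copied BibTeX files ("or similar").
BIB_BASE_URL = "https://karl-voit.at/bib/"

CITE_RE = re.compile(r"\bcite:([A-Za-z0-9_-]+)")


def render_cite_links(text: str, base_url: str = BIB_BASE_URL) -> str:
    """Replace cite:Key with an HTML link to base_url/Key.bib."""

    def replace(match):
        key = match.group(1)
        # Placeholder link text: the key itself.
        return '<a href="{0}{1}.bib">{1}</a>'.format(base_url, key)

    return CITE_RE.sub(replace, text)
```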

Config Parameters could not be empty

I now have a problem with the new config parameter ID_OF_HOWTO_PAGE. I don't have such a page, so I would like to set it to ' ' or comment it out. But when generating the blog, I get an error because this parameter is mandatory.

Implement path-tags for collecting articles in a specific sub-folder

Currently, there are five different types of pages in lazyblorg. Here is an idea for a sixth page type.

Consider the use-case of writing a book. The book has some URL references (one per chapter) to your blog for corrections and stuff that was new after the book was published. Wouldn't it be nice if all these "chapter pages" would be separated from your normal blog pages?

Here is the idea for the implementation: every page of the book blog pages gets a tag like "mybook". This tag is also listed in config.py in, let's say, a list called path_tags: path_tags=['mybook', 'recipes']

From now on, any page that gets tagged with mybook is not shown on, e.g., http://example.com/2019/07/08/pagetitle but http://example.com/mybook/pagetitle instead. So it looks like a persistent page but with an additional path level.

FIXXME: maybe find a better term for "path_tag" and add "persistent" to it? Ideas: persistent_tags, ...

Note: path_tags and lb_persistent are mutually exclusive.
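The URL decision could be sketched like this. The function name and its parameters are assumptions; only the idea of a path_tags list and the two URL layouts come from the description above.

```python
from datetime import date
from typing import List


def article_url_path(slug: str, tags: List[str], published: date, path_tags: List[str]) -> str:
    """Return the URL path of an article, honoring path tags."""
    for tag in tags:
        if tag in path_tags:
            # Path-tagged pages look like persistent pages with one extra level.
            return "/{}/{}".format(tag, slug)
    # Default: temporal page below year/month/day.
    return "/{}/{}".format(published.strftime("%Y/%m/%d"), slug)
```

What should happen when an article carries more than one path tag is left open here; the sketch takes the first match.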

First list is an implicit article update list showing a changelog

Hi,
So far, I have been manually writing paragraphs like the following to mark updates to pages like this:

Update 2014-05-14: added real world example

Update 2015-03-16: filtering photographs according to their GPS coordinates

Update 2016-08-29: replaced outdated ~show-sel.sh~ method with new ~filetags --filter~ method

Update 2017-08-28: Email comment on geeqie video thumbnails

[begin of the article content]

What about making an implicit changelog list?

Any simple itemize list at the beginning of a blog article (which was not possible so far because of #10) is implicitly interpreted (and formatted) as a changelog list with special CSS markup.

For example:

- Update 2014-05-14: added real world example
- Update 2015-03-16: filtering photographs according to their GPS coordinates
- Update 2016-08-29: replaced outdated ~show-sel.sh~ method with new ~filetags --filter~ method
- Update 2017-08-28: Email comment on geeqie video thumbnails

[begin of the article content]

... may be transformed to:

- Article changelog:
  - 2014-05-14: added real world example
  - 2015-03-16: filtering photographs according to their GPS coordinates
  - 2016-08-29: replaced outdated ~show-sel.sh~ method with new ~filetags --filter~ method
  - 2017-08-28: Email comment on geeqie video thumbnails

[begin of the article content]

... or similar.

I am not sure whether or not the actual words should be changed. Maybe just adapt the CSS markup (smaller font, different background, ...).

The goal should be that the reader is clearly able to see that this is meta-data and that the actual content is started below.
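Detecting such a leading list could be sketched as follows. Note that this sketch is narrower than the idea above: it only accepts items of the form "- Update YYYY-MM-DD: text" and not any simple list. The function name is an assumption.

```python
import re
from typing import List, Tuple

UPDATE_RE = re.compile(r"^- Update (\d{4}-\d{2}-\d{2}): (.+)$")


def split_changelog(lines: List[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split the leading update list from the rest of the article.

    Returns the (date, description) pairs and the remaining lines.
    """
    changelog = []
    index = 0
    while index < len(lines):
        match = UPDATE_RE.match(lines[index])
        if not match:
            break
        changelog.append((match.group(1), match.group(2)))
        index += 1
    return changelog, lines[index:]
```

The first return value can then be rendered with its own CSS class, and the second one is processed as the normal article content.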

Derive list of tsfile: link targets

Add an additional command line parameter similar to --tsfile-links $FILE. This $FILE gets populated with CSV content like:

2020-04-12-my-nice_image.jpeg; id:2020-04-12-article-id
2018-09-10T13.55.03 flower at a tree -- publicvoit cliparts.jpg; id:another-id

The entries refer to images which got linked using the tsfile-method. The CSV file might be handy for locating externally linked files for some reason.

Note: Determining broken links of that kind is already a built-in feature.

Implementation might relate to #41 and #43 which also might potentially extract data without generating the actual blog data (not decided yet).
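
A rough sketch of how such rows could be derived from the raw Org text of one article. The regular expression and the function name `tsfile_rows` are assumptions; lazyblorg's parser may expose the links differently:

```python
import re

# Matches [[tsfile:NAME]] as well as [[tsfile:NAME][description]]
TSFILE_LINK = re.compile(r"\[\[tsfile:([^\]]+)\](?:\[[^\]]*\])?\]")


def tsfile_rows(article_id, org_text):
    """Yield one 'filename; id:article-id' row per tsfile link found."""
    for match in TSFILE_LINK.finditer(org_text):
        yield f"{match.group(1)}; id:{article_id}"
```

Writing the rows of all articles to `$FILE` line by line would give the format shown above.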

auto-tag for categories of small/medium/large articles

Related to #47:

Think of adding an auto-tag for categories of small/medium/large articles

Define boundaries for small/medium/large
Implement small/medium/large auto-tags according to reading time from finished feature of #47
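
One possible shape, assuming the reading time from #47 is available in minutes. The boundaries below are placeholders, since they still have to be defined:

```python
# Hypothetical boundaries in minutes; the real values are still to be decided.
SMALL_MAX_MINUTES = 3
MEDIUM_MAX_MINUTES = 10


def size_autotag(reading_minutes):
    """Map an estimated reading time to a size auto-tag."""
    if reading_minutes <= SMALL_MAX_MINUTES:
        return "small"
    if reading_minutes <= MEDIUM_MAX_MINUTES:
        return "medium"
    return "large"
```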

Sidebar: List of articles published on this day (any year)

In order to support serendipity, there could be a sidebar section where all temporal+persistent only articles are listed, that were initially published on the same day, in any year. For example, an article from 2020-08-26 would list all articles from 2020-08-26, 2019-08-26, 2018-08-26, 2017-08-26, and so on.

how to handle end-to-end tests with randomized articles?
- current simple comparison method would be impossible
- maybe: purge those sections from all result files prior to comparison?
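
The selection itself could look like the following sketch. Representing articles as `(article_id, date)` tuples is an assumption made for the example, not lazyblorg's internal format:

```python
from datetime import date


def same_day_articles(articles, reference):
    """Return the IDs of articles published on the reference month and day
    in any year, newest first.

    articles: iterable of (article_id, datetime.date) tuples (assumed shape)
    """
    matches = [
        (published, article_id)
        for article_id, published in articles
        if (published.month, published.day) == (reference.month, reference.day)
    ]
    return [article_id for _published, article_id in sorted(matches, reverse=True)]
```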

Linked bigger version of an image

What and Why

I want to add big versions of images to a blog post that exceed the default max width of 560 which is not enforced but suggested by the CSS layout.

This is a common pattern: when a big image is embedded in an article, show a thumbnail version or the maximum possible size in the blog post and offer the big version as a link. This way, the reader is able to decide whether or not he/she wants to open the link (as a new tab).

Without this feature, I am limited to the maximum size according to the CSS and/or I can't see all the details in the resized image. When I ignore the maximum size according to the CSS, I end up with a broken layout where the image is exceeding the column and potentially causes various CSS side-effects.

Open Questions

The following questions need to be clarified for the concept:

how should the syntax be in Org mode?
- I don't know (yet?) how a linked image can be a link to a bigger image as well
alternative/optional: generate a link in the caption line like This is my caption (see bigger version)? (automatically?) And how is this written in the Org version of the article?

Current situation

Org mode file with image links will get transformed by lazyblorg to this HTML result. So far I am quite confident that I am 100% consistent with the normal Org mode syntax for images:

#+CAPTION: This is going to be the caption
#+ATTR_HTML: :alt This is going to be the alt parameter of the img tag
#+ATTR_HTML: :title The title is ignored
#+ATTR_HTML: :align right :width 300
[[tsfile:2017-03-11T18.29.21 Sterne im Baum -- mytag publicvoit.jpg][If there is an CAPTION, this title gets ignored]]

<figure class="image-right">
<img src="2017-03-11T18.29.21 Sterne im Baum -- mytag publicvoit scaled width 300.jpg" 
     alt="This is going to be the alt parameter of the img tag" 
     width="300" />
<figcaption>This is going to be the caption</figcaption>
</figure>

And yes: lazyblorg really scales the images to their desired width/height so that you don't rely on browser scaling and minimize transferred data amount.

The Org manual sections on image links and on handling links do not mention anything suitable that caught my eye.

Therefore, I need something that might (or might not?) differ from the standard Org mode syntax to express that the original image should be linked behind the embedded image.

A Proposal

The most Org-like way of enabling this feature would probably be inventing a new property that Org mode (hopefully) ignores:

#+CAPTION: This is going to be the caption
#+ATTR_HTML: :alt This is going to be the alt parameter of the img tag
#+ATTR_HTML: :align right :width 300
#+ATTR_HTML: :embedfullsizeimage true
[[tsfile:2017-03-11T18.29.21 Sterne im Baum -- mytag publicvoit.jpg][If there is an CAPTION, this title gets ignored]]

Please note that :embedfullsizeimage true is a stupid placeholder which should be renamed to an appropriate name.

To be honest, I don't even know if there are boolean parameters that do not require a value, thus, omitting the "true" part of my example above.

(I also asked on reddit)

Implement pages for date-based navigation

At the top of each temporal page, there is an ISO date whose elements (year, month, day) are each a clickable link. So far, the browser is able to display the folders behind those link destinations if the web server is configured accordingly (directory list permission).

In future, there should be beautifully generated pages by lazyblorg that show a nice list of items for the corresponding level.

really do list all temporal articles of a whole year?
alternatives:
- number of articles of that year
- links to all months including their total number of temporal articles
link non-hidden non-temporal pages as well

Add "estimated reading time" per article in header

Find out how to implement a feature that estimates the reading time (according to number of words?)
- https://medium.com/better-programming/how-to-estimate-any-articles-read-time-2a8403f0bd79
  - 200 WPM + Python code at https://github.com/assafelovic/reading_time_estimator
- https://help.medium.com/hc/en-us/articles/214991667-Read-time
  - "Read time is based on the average reading speed of an adult (roughly 265 WPM). We take the total word count of a post and translate it into minutes, with an adjustment made for images. For posts in Chinese, Japanese and Korean, it's a function of number of characters (500 characters/min) with an adjustment made for images."
- https://lifehacker.com/calculate-the-reading-time-for-any-book-1835121105
  - 250 WPM
Implement feature and add it to lazyblorg output
- Potential approach:
  1. Count characters or words of Org mode original (excluding comments and noexport headings) because it still lacks HTML clutter.
  2. If character-count: divide by an average of 5 characters per word to estimate the word count, then divide by the assumed WPM.
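
A minimal sketch of the word-based variant. The default of 250 WPM is just one of the values cited above, and stripping comments and noexport headings is left out here:

```python
import math


def estimate_reading_minutes(text, words_per_minute=250):
    """Estimate the reading time in whole minutes, with a minimum of one."""
    word_count = len(text.split())
    return max(1, math.ceil(word_count / words_per_minute))
```

Rounding up and never reporting zero minutes is a design choice for display purposes; a different rounding might fit the header better.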

discuss: Variablizing and Separating Content from Config

I have a few ideas on how to make lazyblorg easier to use and would like to know your thoughts.

Allow config variables to be set using environment variables.
Be more forgiving with configuration options, so that (if desired) warnings are produced but a site render will still succeed.
Support multiple sites from a single lazyblorg install through use of named configuration sets.
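
To illustrate the first idea, a small sketch of an environment override. The `LAZYBLORG_` prefix and the helper name `setting` are hypothetical and not part of lazyblorg:

```python
import os


def setting(name, default, environ=os.environ):
    """Return LAZYBLORG_<name> from the environment if set, else the default.

    The LAZYBLORG_ prefix is an assumption made for this example.
    """
    return environ.get(f"LAZYBLORG_{name}", default)
```

A config file could then use something like `BLOG_NAME = setting("BLOG_NAME", "my blog")`, where `BLOG_NAME` again only serves as an example variable.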

My hope with these proposals is to support the continued growth of the lazyblorg community, and to make it easier to accept contributions back into upstream.

I'm cooking/prototyping these ideas in a fork but ideally would like to submit PRs upstream to you.

Would you be receptive to these changes? Let's discuss!

Switch import of config.py from import to exec

So far, lazyblorg.py is using import to read the configuration file. This might be changed to exec() for some reasons mentioned in https://lobste.rs/s/qyhvhc/your_configs_suck_try_real_programming#c_0btczk
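
A sketch of what an `exec()`-based loader might look like. The function name `load_config` and the convention of keeping only upper-case names are assumptions:

```python
def load_config(path):
    """Execute a Python config file and return its upper-case settings."""
    namespace = {}
    with open(path, encoding="utf-8") as handle:
        # compile() with the file name keeps tracebacks pointing at the config file
        exec(compile(handle.read(), path, "exec"), namespace)
    return {key: value for key, value in namespace.items() if key.isupper()}
```

Like `import`, this runs arbitrary code from the config file, so it is only suitable for files the user controls.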

leveraging org-publish for the HTML-creation?

I like your approach to URLs and I like the simplicity of lazyblorg. The only thing which irks me is only having a subset of org-syntax.

Could the site be created by setting up org-mode files with exact copies of the blog-entries and then running that through org-publish? A #+setup: … line could take care of templates.

This would be no true intermediate format, since there would be no conversion step between the org-mode text of the entries in the original files and the org-mode text in the finished website. Also this would allow you to avoid following org-mode syntax.

Another advantage is that you could get a preview of an article by simply exporting the subtree (change the scope in the export).

Override auto-tag decision with explicit tagging

Sometimes, auto-tags can't be applied in an uncertain situation. For example, my blog has German and English articles. Some small articles contain keywords from both languages which results in "unsure" classification.

This feature should allow for manual tagging (overriding) of the auto-tag algorithm which ends up in a tag that looks like it has been applied by the auto-tag algorithm.

TODO: look out for differences of auto-tags and manual tags and decide how this should be implemented.

Order of tags in HTML result files are different between two runs

Due to moving from Python 2 to Python 3 as of #22, the handling of Dict has changed. This results in failed end-to-end tests because the order of the tags is different from one run to the next.

Either the Dict has to be migrated to OrderedDict or the tags should be in alphabetic order.
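
The alphabetic variant could be as small as this sketch (the function name `ordered_tags` is made up for the example):

```python
def ordered_tags(tags):
    """Return the tags de-duplicated and in a stable, case-insensitive order."""
    # The second key component breaks ties between e.g. "Org" and "org"
    # so the result does not depend on the input order.
    return sorted(set(tags), key=lambda tag: (tag.lower(), tag))
```

Sorting at output time makes the generated HTML independent of whichever container holds the tags internally.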

Implement code within quotes

I'd like lazyblorg to render the following example:

This is normal text.

#+BEGIN_QUOTE
This is normal text.

: This is a code within the quote block

This is normal text.
#+END_QUOTE

This is normal text.

fix issue with not replaced "#READING-MINUTES-SECTION#" template in tag pages

Example: https://karl-voit.at/tags/photos/ does show the text "#READING-MINUTES-SECTION#".

Write a test case for it.

Fix the error.

pypandoc not found

Hi,
When I issue:
./example_invocation.sh
I get:
Warning: MEMACS_FILE_WITH_IMAGE_FILE_INDEX is not empty but contains no existing file. Please fill it with an existing filename containing a Memacs file index or set either MEMACS_FILE_WITH_IMAGE_FILE_INDEX or CUSTOMIZED_IMAGE_LINK_KEY to an empty string.
Could not find Python module "pypandoc".
Please install it, e.g., with "sudo pip install pypandoc".

But, pypandoc is installed and the other stuff as well:
Installing:
sudo pip install memacs
sudo apt install python3-werkzeug
sudo apt install pandoc
sudo pip install python3-pypandoc
sudo apt install python3-opencv
sudo npm install -g sass
git clone https://github.com/novoid/lazyblorg.git
cd ~/Downloads/lazyblorg/
./example_invocation.sh

How do I fix those errors?
Thx a lot...

Escaping of a tilde character is wrong

see id:2014-05-09-managing-digital-photographs

: All portrait photographs are rotated using [[http://www.sentex.net/~mwandel/jhead/][jhead]]. Also
: with jhead, I generate file-name time-stamps from the Exif header
: time-stamps. Using [[https://github.com/novoid/date2name][date2name]] I add time-stamps also to the movie
: files. After processing all those files, they get moved to the
: destination folder for new digicam files: ~$HOME/tmp/digicam/tmp/~.

... will be transformed into:

All portrait photographs are rotated using
<a href="http://www.sentex.net/<code>mwandel/jhead/">jhead</a>. Also with jhead,
I generate file-name time-stamps from the Exif header time-stamps.
Using <a href="https://github.com/novoid/date2name">date2name</a> I add time-stamps
also to the movie files. After processing all those files, they get moved to the
destination folder for new digicam files: </code>$HOME/tmp/digicam/tmp/~.

... which is wrong. The <code> element must be replaced by the tilde character.

Another example within an URL:

http://sd.wareonearth.com/~phil/xdu/examp1.gif gets messed up to:
http://sd.wareonearth.com/</code>phil/xdu/examp1.gif
on https://karl-voit.at/2014/03/25/xdu

List items must not be empty

There is a parsing error when a list item does not contain any data.

Example:

- a list item
- another list item
-
- previous line has an error

KeyError: 'autotags'

revision f8bf1b6
python version 2.7.13
os macOS 10.12.5

reproducing:

git clone https://github.com/novoid/lazyblorg.git
cd lazyblorg
sh example_invocation.sh

err:

Warning: DIRECTORIES_WITH_IMAGE_ORIGINALS[1] which is set to "/Users/yqrashawn/tmp/digicam/tmp" is not an existing directory. It will be ignored.
Warning: DIRECTORIES_WITH_IMAGE_ORIGINALS[2] which is set to "/Users/yqrashawn/archive/events_memories/2017" is not an existing directory. It will be ignored.
Warning: DIRECTORIES_WITH_IMAGE_ORIGINALS[3] which is set to "/Users/yqrashawn/archive/fromweb/cliparts" is not an existing directory. It will be ignored.
Warning: MEMACS_FILE_WITH_IMAGE_FILE_INDEX is not empty but contains no existing file. Please fill it with an existing filename containing a Memacs file index or set either MEMACS_FILE_WITH_IMAGE_FILE_INDEX or CUSTOMIZED_IMAGE_LINK_KEY to an empty string.
Warning: IMAGE_CACHE_DIRECTORY is set but points to a directory which does not exist. Either empty the string or create its cache directory at "/Users/yqrashawn/src/lazyblorg/testdata/imagecache".
WARNING  Blog data file "./NONEXISTING_-_REPLACE_WITH_YOUR_PREVIOUS_METADATA_FILE.pk" is not found. Assuming first run!
INFO     Parsing "testdata/end_to_end_test/orgfiles/test.org" ...
INFO     Parsing "testdata/end_to_end_test/orgfiles/about-placeholder.org" ...
INFO     Parsing "templates/blog-format.org" ...
INFO     no previous metadata found: must be the first run of lazyblorg with this configuration
Traceback (most recent call last):
  File "./lazyblorg.py", line 581, in <module>
    generate, marked_for_feed, increment_version)
  File "./lazyblorg.py", line 182, in generate_output
    return htmlizer.run()  # FIXXME: return value?
  File "/Users/yqrashawn/projects/lazyblorg/lib/htmlizer.py", line 186, in run
    stats_generated_persistent, stats_generated_tags = self._generate_pages_for_tags_persistent_temporal()
  File "/Users/yqrashawn/projects/lazyblorg/lib/htmlizer.py", line 342, in _generate_pages_for_tags_persistent_temporal
    entry)
  File "/Users/yqrashawn/projects/lazyblorg/lib/htmlizer.py", line 1844, in _generate_persistent_article
    content += self._generate_back_references_content(entry, config.PERSISTENT)
  File "/Users/yqrashawn/projects/lazyblorg/lib/htmlizer.py", line 1674, in _generate_back_references_content
    if 'language' in entry['autotags'].keys() and entry['autotags']['language'] == 'deutsch':
KeyError: 'autotags'
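
Assuming the entry simply lacks the `autotags` key in this situation, a defensive variant of the failing check could look like the sketch below. The helper name `is_german` is made up, and whether a missing key should be tolerated at all is a separate question:

```python
def is_german(entry):
    """True if the entry carries the auto-tag language 'deutsch'.

    Entries without any 'autotags' key are treated as not German.
    """
    return entry.get('autotags', {}).get('language') == 'deutsch'
```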

Omit empty tags

If there are two colons in the tags (happens when tags get propagated), this results in empty tags of the blog article: ** My Heading :tag1:tag2::tag3: -> has tag1, tag2, empty tag, and tag3.
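
A sketch of tag parsing that drops the empty tag. The regular expression is an assumption and not the one used in orgparser.py:

```python
import re

# Trailing tag block of a heading, e.g. "   :tag1:tag2:"
HEADING_TAGS = re.compile(r"\s+:(\S+):\s*$")


def parse_tags(heading):
    """Return the non-empty tags at the end of an Org heading line."""
    match = HEADING_TAGS.search(heading)
    if not match:
        return []
    # "tag1:tag2::tag3" splits into an empty string between the double colons
    return [tag for tag in match.group(1).split(":") if tag]
```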

Do not omit the HTML block names

According to https://github.com/novoid/lazyblorg/wiki/Orgmode-Elements#html

   #+NAME: This is the caption 
   #+BEGIN_EXPORT html
   <a href="http://Karl-Voit.at">Link to my webpage</a> 
   #+END_EXPORT

... results in:

   <a href="http://Karl-Voit.at">Link to my webpage</a>

... without any caption/name.

Implement HTML block captions.

derive list of external URLs

Add an additional command line parameter similar to --external-urls $FILE. This $FILE gets populated with CSV content like:

http://foo.example.com/an/old/path; id:2020-04-12-article-id
https://bar.example.com; id:another-id

This can be used as input for any external URL checker in order to find broken links.

Implementation might relate to #41 and #44 which also might potentially extract data without generating the actual blog data (not decided yet).
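
As a starting point, a sketch that collects URLs from the raw Org text of one article. The pattern is deliberately simple and an assumption; for instance, a URL directly followed by a full stop would keep that character:

```python
import re

# Stops at whitespace, Org link brackets, and angle brackets
EXTERNAL_URL = re.compile(r"https?://[^\s\]\[<>]+")


def external_url_rows(article_id, org_text):
    """Return 'url; id:article-id' rows, one per distinct URL, in order."""
    seen = []
    for url in EXTERNAL_URL.findall(org_text):
        if url not in seen:
            seen.append(url)
    return [f"{url}; id:{article_id}" for url in seen]
```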

Move `id:2017-01-03-how-to-use-public-voit` to config variable

When invoking lazyblorg for the first time, it may be the case, that the common-sidebar definition of the blog template complains about not being able to find id:2017-01-03-how-to-use-public-voit.

Check, if it is feasible to do it similar to id:#ABOUT-PAGE-ID# so that config.py is defining how this should be handled.

Implement an Atom feed for each tag

People may not be interested in the majority of my articles. They may be only interested in articles that are tagged with tags I use sparsely. For example, when somebody is only interested in articles about Apple, she should be able to follow a feed that holds articles for this single tag.

Feeds for single tags could be published on the tag pages.

Migrate lazyblorg project to Python3

This is really overdue ;-)

Replace Disqus with a self-hosted/open alternative

Disqus is currently used in the templates to enable comments from readers.

Unfortunately, this is an external dependency to a closed service which is not a good thing to have.

One alternative would be https://posativ.org/isso/ which is using a SQLite backend.

evaluate integration effort
evaluate hosting possibilities for karl-voit.at

Article must not end with a list item

It seems to be the case that there is a problem when an article ends with a list item.

Workaround: append a normal paragraph at the end of the article after the list items.

example: id:2015-05-24-browser-keywords

fix issue with missing images in article preview

Images within article previews are not displayed and their optional link to the larger version is a dead one.

Add an end-to-end test for it.

Fix with sanitizing its image links to the corresponding article directory.

Write a requirements.txt

https://www.reddit.com/r/orgmode/comments/763gfz/blogging_with_orgmode/dox1goo/ lists some missing dependencies.

Create a clean test environment for testing initial setup and write a full requirements.txt for it.

Refactor: introduce hierarchical list that holds all articles based on date

To improve performance (replacing some [x from x where <date>]) and code maintainability, an additional(?) data structure would be helpful that is organized according to date. This way, all articles from one specific day (see #50 or #35) can be easily retrieved.

draft data structure (hierarchical list?)
identify potential spots that can be refactored in order to get cleaner code
populate data structure (while parsing?)
add also non-temporal pages (with their type of course)
document data structure
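
One candidate for the draft is a nested mapping of year, month, and day. In this sketch, articles are assumed to be `(article_id, date)` tuples, and the type of each page is left out:

```python
from collections import defaultdict
from datetime import date


def build_date_index(articles):
    """Return index[year][month][day] -> list of article IDs."""
    index = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for article_id, published in articles:
        index[published.year][published.month][published.day].append(article_id)
    return index
```

With such an index, all articles of one day are a direct lookup, and the articles of one month or year are found by walking one branch instead of filtering the complete list.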

Optimize string concatenation performance via lists

The HTMLizer does a lot of string concatenation to generate the resulting text (HTML) files out of the snippets "translated" from the internal data representation of the content. Optimizing performance for string concatenation might have a relevant impact on the overall blog (re-)generation performance.

Switch from mystring += string_to_append to my_list_of_strings.append(string_to_append); ''.join(my_list_of_strings) as stated in http://xahlee.info/python/python_append_string_in_loop.html to optimize HTMLization performance.
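
The pattern in isolation, as a sketch (the function name `render` is only for illustration):

```python
def render(snippets):
    """Concatenate HTML snippets by collecting them and joining once."""
    parts = []
    for snippet in snippets:
        parts.append(snippet)
    return ''.join(parts)
```

CPython can sometimes extend a string in place on `+=`, so the actual gain is worth measuring on a full blog run before refactoring the HTMLizer.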

Implement command line switch to change config file

In #26 somebody suggested to be able to switch between different blogs using a command line switch for the config file.

Add static types to source code (mypy)

Test: URL link over mixed formatting

Add a test to see, if following is working as expected:

*this is [[https://karl-voit.at][bold]]*
*this is [[https://karl-voit.at][bold*]]
*this [[https://karl-voit.at][is]] bold*
[[https://karl-voit.at][*This]] is bold*
*[[https://karl-voit.at][This]] is bold*

If an issue is detected, fix it accordingly.

Update 2022-01-23: lines 2 and 4 of the example can't be directly converted to HTML/XML as they may result in invalid syntax (see discussion below). For now, it should be safe to ignore those cases. Maybe this issue can only be solved by switching to a better parser: #63

Write repr() of currently_supported_orgmode_syntax.org in a text file

In order to learn about the current Python data representation of syntax elements, write the repr() for the hard-coded ID of https://github.com/novoid/lazyblorg/blob/master/testdata/end_to_end_test/orgfiles/currently_supported_orgmode_syntax.org in a text file.

Document it accordingly in the Wiki.

Multiple blogs?

Does lazyblorg support multiple blog targets out-of-the-box?

Blog entries have to be marked as done from a previous state

I get "Heading does not contain a most recent timestamp" when I do mark a fresh blog entry as DONE without having a previous state.

This behavior results from

:LOGBOOK:
- State "DONE"       from              [2016-02-27 Sat 12:03]
:END:

compared to:

:LOGBOOK:
- State "DONE"       from "NEXT"       [2016-02-27 Sat 12:00]
:END:

The first one is not recognized correctly in orgparser.py in the section after components = self.LOG_REGEX.match(line)
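
A pattern that treats the previous state as optional might look like the following sketch. It is not the actual LOG_REGEX from orgparser.py, and the group names are made up:

```python
import re

# The quoted "from" state is optional, so both LOGBOOK variants match.
LOG_LINE = re.compile(
    r'^- State\s+"(?P<new>[^"]+)"\s+from\s+(?:"(?P<old>[^"]*)"\s+)?'
    r'\[(?P<timestamp>[^\]]+)\]'
)

with_previous = LOG_LINE.match('- State "DONE"       from "NEXT"       [2016-02-27 Sat 12:00]')
without_previous = LOG_LINE.match('- State "DONE"       from              [2016-02-27 Sat 12:03]')
```

For the line without a previous state, the `old` group is `None`, which the surrounding code would have to handle.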

Escaping/sanitizing of <> characters is missing in blocks

Within blocks, < and > are not sanitized accordingly.
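
Python's standard library already covers the escaping itself. A sketch, with the helper name `sanitize_block_line` being an assumption:

```python
import html


def sanitize_block_line(line):
    """Escape &, < and > so block content is shown literally in HTML."""
    return html.escape(line, quote=False)
```

Note that `html.escape` also replaces `&`, so the function must not be applied to content that has already been escaped.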

Implement support for rendering /italics/ properly

This seems to be a tough one: implement support for italics within blog posts without breaking support for determining URLs.

To be honest, I did not spend much time investigating how Org mode is implemented. Maybe there are some clever ideas how to do it properly. So far, I did not do it. Maybe it's easily done.

Ideas for unit tests:

determining italics within a line that holds URLs as well
determining italics that span over one or even more line breaks (Org mode does it only for one line break AFAIR)

And:

add it to https://github.com/novoid/lazyblorg/blob/master/testdata/end_to_end_test/orgfiles/currently_supported_orgmode_syntax.org
add documentation to the Wiki