booky's Issues

Combine to single file for linux.

A much better option is to combine the Python file into the shell script, so that we can put that single file in PATH:

#!/bin/bash

# Change to the directory of pdf file
cd "$(dirname "$1")" || exit 1
pdf=$(basename "$1")
pdf_data="${pdf%.*}""_data.txt"
EXTRACT_FILE=booky_bookmarks_extract
bkFile="$2"


if [[ "$OSTYPE" == "darwin"* ]]; then
    SED=gsed
else
    SED=sed
fi

echo "Converting $bkFile to pdftk compatible format"
python3 -c '
import sys

level = 0
startChar = "{"
endChar = "}"
for line in sys.stdin:
	line = line.strip()
	if line == startChar:
		level = level + 1
	elif line == endChar:
		level = level - 1
	elif line:
		commaIndex = line.rfind(",")
		title = line[:commaIndex]
		pageNo = line[commaIndex + 1:].strip()
		print("BookmarkBegin")
		print("BookmarkTitle:", title.strip())
		print("BookmarkLevel:", level)
		print("BookmarkPageNumber:", pageNo.strip())' < "$bkFile" > "$EXTRACT_FILE"

echo "Dumping pdf meta data..."
pdftk "$pdf" dump_data_utf8 output "$pdf_data"

echo "Clear dumped data of any previous bookmarks"
$SED -i '/Bookmark/d' "$pdf_data"

echo "Inserting your bookmarks in the data"
$SED -i "/NumberOfPages/r $EXTRACT_FILE" "$pdf_data"

echo "Creating new pdf with your bookmarks..."
pdftk "$pdf" update_info_utf8 "$pdf_data" output "${pdf%.*}""_new.pdf"

echo "Deleting leftovers"
rm "$EXTRACT_FILE" "$pdf_data"

quick way/tips to prepare TOC text file into booky format?

Hello: Do you have any regex suggestions or tips to quickly prepare the TOC text file into booky required format?
I am very new to regex, so any help would be super.

I was trying to find a tool where I could create a template for one chapter of the TOC and then apply that template to all the other chapters, kinda like Excel's "paste special" feature.

For example:
1 Insert { at the beginning of each TOC block, and } at the end of each TOC block
2 replace TOCitem leading dots (.......67) with booky required format of /67
e.g. TOCitem ........67 ==> TOCitem/67
3 replace TOCitem (space space67) or (space space,67) or (space space space67) with booky required format of /67
4 Automate indentation of all child TOC items

Or maybe there is a repository of regex samples that apply to TOC manipulation.
I use the Sublime Text editor and could not find any specific snippets for TOC text file manipulation.

Thank you
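
A rough sketch of how the four steps above could be scripted in Python instead of hand-written editor regexes. The names `to_booky_line`, `depth` and `nest` are hypothetical and not part of booky. The separator is a parameter because this request asks for `/67` while the script in the first issue splits on a comma. Guessing the nesting level from a leading section number such as `2.1` is an assumption about how the TOC is numbered.

```python
import re

# Title, then any run of dots/spaces/commas, then the trailing page number.
LEADER = re.compile(r"^(?P<title>.*?)[\s.,]*(?P<page>\d+)\s*$")

def to_booky_line(raw, sep=", "):
    """'Intro ......67' or 'Intro   ,67' -> 'Intro, 67'; other lines are only stripped."""
    line = raw.strip()
    m = LEADER.match(line)
    if not m or not m.group("title"):
        return line
    return m.group("title") + sep + m.group("page")

def depth(entry):
    """Nesting depth guessed from a leading section number: '2' -> 1, '2.1' -> 2."""
    m = re.match(r"\d+(\.\d+)*", entry)
    return m.group(0).count(".") + 1 if m else 1

def nest(entries):
    """Insert { and } lines around entries according to depth()."""
    out, level = [], 0
    for entry in entries:
        d = depth(entry)
        while level < d:          # open blocks for deeper entries
            out.append("{")
            level += 1
        while level > d:          # close blocks when moving back up
            out.append("}")
            level -= 1
        out.append(entry)
    out.extend("}" for _ in range(level))  # close whatever is still open
    return out
```

Running `nest([to_booky_line(l) for l in lines])` over the raw TOC lines would give a first draft that still needs a read-through, since a title that itself ends in a number and has no page number cannot be told apart from a title followed by a page.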

keeps failing

Hi, I hope you can help me work out what I am doing wrong.
macOS version: 10.15.6
pdftk version: pdftk_server-2.02-mac_osx-10.11-setup

The book is called book.pdf.
The text file containing the TOC is TOC.txt.

I ran booky.sh in Terminal, and it says:

(base) XXX@XXXs-MacBook-Air booky % ./booky.sh book.pdf TOC.txt
Converting TOC.txt to pdftk compatible format
Dumping pdf meta data...
Clear dumped data of any previous bookmarks
sed: 1: "book_data.txt": undefined label 'ook_data.txt'
Inserting your bookmarks in the data
sed: 1: "book_data.txt": undefined label 'ook_data.txt'
Creating new pdf with your bookmarks...
Deleting leftovers

In the pdf file, no new bookmarks were created.
I checked that I have { } around the bookmarks.
I checked that I have ", " (comma, space) between each bookmark and its page number.

Does it handle - in a bookmark name?
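
A possible reading of the log, offered as an inference rather than a confirmed diagnosis: `undefined label 'ook_data.txt'` is what BSD sed (the default on macOS) reports when `-i` is given without a backup suffix. It takes the sed expression as the suffix and then parses the file name as the script, where `b` is the branch command. The combined script in the first issue picks `gsed` on macOS, presumably for this reason. The sketch below does the two sed steps in Python so that the sed flavour no longer matters, and shows that a title containing `-` splits cleanly when the split is on the last comma. `split_entry` and `merge_bookmarks` are hypothetical names, not part of booky.

```python
def split_entry(line):
    """Split 'title, page' on the last comma, as the converter in the script does."""
    comma = line.rfind(",")
    return line[:comma].strip(), line[comma + 1:].strip()

def merge_bookmarks(data_lines, bookmark_lines):
    """Drop existing Bookmark* lines, then add the new ones after NumberOfPages."""
    merged = []
    for line in data_lines:
        if "Bookmark" in line:        # same effect as: sed '/Bookmark/d'
            continue
        merged.append(line)
        if "NumberOfPages" in line:   # same effect as: sed '/NumberOfPages/r FILE'
            merged.extend(bookmark_lines)
    return merged
```

The inputs are lists of lines without trailing newlines, for example the result of `open(path).read().splitlines()` on the pdftk data dump.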

suggest to add the offset

I think your solution is great for creating bookmarks automatically. I have noticed that in many PDFs the page numbers printed in the table of contents do not agree with the actual PDF page numbers, so one has to calculate each page number manually from the contents. This process is error-prone. I suggest adding an offset to the page numbering, for example:

{
Title1, 1
Title2, 2
offset, 5
{
Subtitle1, 3
Subtitle2, 4
{
SubSubtitle1, 5
...
}
}
}

Then, from the point where the offset keyword is defined, the offset is automatically added to each page number. With this solution, one only needs to copy the page numbers from the contents.
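
A minimal sketch of how the suggested `offset` entry might be handled, modelled on the Python converter embedded in the script from the first issue. It is not booky's actual code. It assumes that an offset applies to every entry after it, regardless of brace nesting, and that a later `offset` line replaces the earlier one. The function name `convert` is hypothetical.

```python
def convert(lines):
    """Return pdftk bookmark lines; an 'offset, N' entry shifts later pages by N."""
    level, offset, out = 0, 0, []
    for raw in lines:
        line = raw.strip()
        if line == "{":
            level += 1
        elif line == "}":
            level -= 1
        elif line:
            comma = line.rfind(",")
            if comma < 0:             # no page number (e.g. a '...' line): skip it
                continue
            title = line[:comma].strip()
            page = int(line[comma + 1:])
            if title == "offset":     # assumed keyword, taken from the suggestion above
                offset = page
                continue
            out += [
                "BookmarkBegin",
                f"BookmarkTitle: {title}",
                f"BookmarkLevel: {level}",
                f"BookmarkPageNumber: {page + offset}",
            ]
    return out
```

With the example above, `Title1` and `Title2` would keep pages 1 and 2, while `Subtitle1, 3` would land on PDF page 8.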
