Comments (12)
After replacing every "..." with “...” (shortest match, within a single verse), the statistics become:
U+0022 " 6 QUOTATION MARK
U+201C “ 696 LEFT DOUBLE QUOTATION MARK
U+201D ” 631 RIGHT DOUBLE QUOTATION MARK
Addressing these remaining 6 locations is clearly a much simpler task.
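For readers without TextPipe, the shortest-match replacement described above could be approximated in Python. This is only a sketch, not the actual filter: it assumes each verse sits on its own line of the USFM file, and the function name `curl_quotes` is made up for illustration.

```python
import re

# A straight-quoted span that contains no other straight quote and no line
# break. Excluding '"' inside the span gives the shortest match; excluding
# '\n' keeps the match within one line (assumed here to be one verse).
PAIR = re.compile(r'"([^"\n]*)"')

def curl_quotes(usfm_text: str) -> str:
    """Replace each same-line "..." pair with left/right double quotation marks."""
    return PAIR.sub("\u201c\\1\u201d", usfm_text)

sample = (
    '\\v 1 He said, "Come" and "go"\n'
    '\\v 2 "open\n'
    '\\v 3 close"\n'
)
print(curl_quotes(sample))
```

A quotation that opens in one verse and closes in another is left untouched, which is why locations such as Numbers 6:24-26 remain in the residue.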
from bible-punjabi-pavitr-bible-1945.
Here are the 6 remaining instances of (U+0022 " QUOTATION MARK):
Numbers 5:22
\v 22 ਐਉਂ ਏਹ ਜਲ ਜਿਹੜਾ ਸਰਾਪ ਲਿਆਉਂਦਾ ਹੈ ਤੇਰੇ ਸਰੀਰ ਵਿੱਚ ਜਾ ਕੇ ਤੇਰੇ ਢਿੱਡ ਨੂੰ ਸੁਜਾਵੇ ਅਤੇ ਤੇਰੀ ਜਾਂਘ ਨੂੰ ਸਾੜੇ ਤਾਂ ਤੀਵੀਂ ਆਖੇ, " ਆਮੀਨ, ਆਮੀਨ ''
The quotation ends with two adjacent apostrophes ('') instead of a double quotation mark.
Numbers 6:24-26
\v 24 " ਯਹੋਵਾਹ ਤੇਨੂੰ ਬਰਕਤ ਦੇਵੇ ਅਤੇ ਤੇਰੀ ਰਾਖੀ ਕਰੇ,
\v 25 ਯਹੋਵਾਹ ਆਪਣੇ ਮੁਖੜੇ ਨੂੰ ਤੇਰੇ ਉੱਤੇ ਚਮਕਾਵੇ ਅਤੇ ਤੇਰੇ ਉੱਤੇ ਦਯਾ ਕਰੇ,
\v 26 ਯਹੋਵਾਹ ਆਪਣਾ ਮੁਖੜਾ ਤੇਰੇ ਵੱਲ ਫੇਰੇ ਅਤੇ ਤੈਨੂੰ ਸ਼ਾਂਤੀ ਦੇਵੇ ॥ "
The quotation spans three verses.
Numbers 10:36
\v 36 ਅਤੇ ਜਦ ਠਹਿਰਦਾ ਸੀ ਤਾਂ ਉਹ ਕਹਿੰਦਾ ਹੁੰਦਾ ਸੀ, ਹੇ ਯਹੋਵਾਹ, ਇਸਰਾਏਲ ਦੇ ਲੱਖਾਂ ਹਜ਼ਾਰਾਂ ਵਿੱਚ ਮੁੜ ਆ " ॥
A leading left double quotation mark seems to be missing.
Lamentations 2:15
\v 15 ਸਭ ਲੰਘਣ ਵਾਲੇ ਤੇਰੇ ਉੱਤੇ ਤਾਲੀਆਂ ਵਜਾਉਂਦੇ ਹਨ, ਯਰੂਸ਼ਲਮ ਦੀ ਧੀ ਦੇ ਉੱਤੇ, ਓਹ ਨੱਕ ਚੜ੍ਹਾਉਂਦੇ ਅਤੇ ਸਿਰ ਹਿਲਾਉਂਦੇ ਹਨ ਕੀ ਏਹ ਉਹ ਸ਼ਹਿਰ ਹੈ ਜਿਹ ਨੂੰ ਓਹ ਏਹ ਨਾਮ ਦਿੰਦੇ, - “ ਸੁਹੱਪਣ ਦੀ ਪੂਰਨਤਾਈ, ” ਸਾਰੇ ਸੰਸਾਰ ਦੀ ਖੁਸ਼ੀ " ?
The final double quotation mark seems to be spurious.
Acts 2:34-35
\v 34 ਕਿਉਂ ਜੋ ਦਾਊਦ ਅਕਾਸ਼ ਉੱਤੇ ਨਾ ਗਿਆ ਪਰ ਉਹ ਆਪੇ ਕਹਿੰਦਾ ਹੈ, "ਪ੍ਰਭੁ ਨੇ ਮੇਰੇ ਪ੍ਰਭੁ ਨੂੰ ਆਖਿਆ, ਤੂੰ ਮੇਰੇ ਸੱਜੇ ਪਾਸੇ ਬੈਠ,
\v 35 ਜਦ ਤੀਕਰ ਮੈਂ ਤੇਰੇ ਵੈਰੀਆਂ ਨੂੰ ਤੇਰੇ ਪੈਰ ਰੱਖਣ ਦੀ ਚੌਂਕੀ ਨਾ ਕਰ ਦਿਆਂ ॥
An ending right double quotation mark seems to be missing.
See also issue #91
That issue was easily fixed by a tweak within my TextPipe filter.
There are 46 instances of two adjacent apostrophes ('').
These are probably all typos for a double quotation mark (").
I have edited my TextPipe filter to implement this systematic replacement.
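A rough Python equivalent of this systematic replacement is sketched below. The actual TextPipe rule is not shown in this thread, so the lookarounds (which skip runs of three or more apostrophes) are an assumption of the sketch, and `fix_doubled_apostrophes` is a hypothetical name.

```python
import re

# Exactly two adjacent apostrophes, not part of a longer run of apostrophes.
# The lookbehind/lookahead guard is an assumption; the real filter may differ.
DOUBLED = re.compile(r"(?<!')''(?!')")

def fix_doubled_apostrophes(text: str) -> str:
    """Replace each '' with a single ordinary double quotation mark."""
    return DOUBLED.sub('"', text)

print(fix_doubled_apostrophes("\" \u0a06\u0a2e\u0a40\u0a28, \u0a06\u0a2e\u0a40\u0a28 ''"))
```

Running the straight-to-curly replacement afterwards then picks up the newly formed "..." pairs.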
After reprocessing, the character frequency analysis became:
merged.usfm.character.frequency.txt
The quotation mark counts became:
U+0022 " 6 QUOTATION MARK
U+0027 ' 3 APOSTROPHE
U+2018 ‘ 29 LEFT SINGLE QUOTATION MARK
U+2019 ’ 29 RIGHT SINGLE QUOTATION MARK
U+201C “ 719 LEFT DOUBLE QUOTATION MARK
U+201D ” 654 RIGHT DOUBLE QUOTATION MARK
The 3 remaining ordinary apostrophes need to be reviewed.
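The quotation mark counts above can be reproduced with a few lines of Python. This is a sketch only: the attached frequency file was produced by other means, and `quote_counts` is a name invented here.

```python
import unicodedata
from collections import Counter

# The six characters tracked in this issue.
QUOTE_CHARS = "\"'\u2018\u2019\u201c\u201d"

def quote_counts(text: str) -> dict:
    """Count each quotation character, keyed by code point and Unicode name."""
    counts = Counter(ch for ch in text if ch in QUOTE_CHARS)
    return {
        f"U+{ord(ch):04X} {unicodedata.name(ch)}": n
        for ch, n in sorted(counts.items())
    }

for label, n in quote_counts('"a" \u201cb\u201d \'').items():
    print(label, n)
```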
The 3 apostrophes are located in:
Deuteronomy 33:18
\v 18 ਜਬੂਲੁਨ ਲਈ ਉਸ ਆਖਿਆ, ਹੇ ਜ਼ਬੂਲੁਨ 'ਆਪਣੇ ਬਾਹਰ ਜਾਣ ਉੱਤੇ, ਅਤੇ ਹੈ ਯਿੱਸਾਕਰ, ਆਪਣੇ ਤੰਬੂਆਂ ਵਿੱਚ ਖੁਸ਼ ਹੋ ।
This lone instance looks simply spurious to me, in which case, it can be fixed by a further tweak to my TextPipe filter.
Isaiah 44:5 (where the apostrophe occurs twice - wrapping a comma)
\v 5 ਕੋਈ ਆਖੇਗਾ, “ ਮੈਂ ਯਹੋਵਾਹ ਦਾ ਹਾਂ,” ਕੋਈ ਆਪ ਨੂੰ ਯਾਕੂਬ ਦੇ ਨਾਉਂ ਤੇ ਸਦਾਵੇਗਾ, ਕੋਈ ਆਪਣੇ ਹੱਥ ਉੱਤੇ ਲਿਖੇਗਾ, "ਯਹੋਵਾਹ ਦਾ',' ਅਤੇ ਆਪ ਨੂੰ ਇਸਰਾਏਲ ਦੇ ਨਾਮ ਦੀ ਪਦਵੀ ਦੇਵੇਗਾ ॥
It looks as though this should have "ਯਹੋਵਾਹ ਦਾ", in which case, it can be fixed by a further tweak to my TextPipe filter.
Both tweaks now done! Updated analysis becomes:
merged.usfm.character.frequency.txt
U+0022 " 5 QUOTATION MARK
U+2018 ‘ 29 LEFT SINGLE QUOTATION MARK
U+2019 ’ 29 RIGHT SINGLE QUOTATION MARK
U+201C “ 720 LEFT DOUBLE QUOTATION MARK
U+201D ” 655 RIGHT DOUBLE QUOTATION MARK
This is significant progress, I think.
As a result of my TextPipe filter, one of the locations has been altered incorrectly:
Numbers 6:24-26 now reads:
\v 24 " ਯਹੋਵਾਹ ਤੇਨੂੰ ਬਰਕਤ ਦੇਵੇ ਅਤੇ ਤੇਰੀ ਰਾਖੀ ਕਰੇ,
\v 25 ਯਹੋਵਾਹ ਆਪਣੇ ਮੁਖੜੇ ਨੂੰ ਤੇਰੇ ਉੱਤੇ ਚਮਕਾਵੇ ਅਤੇ ਤੇਰੇ ਉੱਤੇ ਦਯਾ ਕਰੇ,
\v 26 ਯਹੋਵਾਹ ਆਪਣਾ ਮੁਖੜਾ ਤੇਰੇ ਵੱਲ ਫੇਰੇ ਅਤੇ ਤੈਨੂੰ ਸ਼ਾਂਤੀ ਦੇਵੇ ॥
\p "
The quotation mark has been shifted to after the added paragraph marker.
This happened because I inadvertently omitted the end-of-line anchor $ from my regexp.
I have tweaked this line in the filter accordingly, with the following result:
- 125 instances of mid-verse double danda are no longer split by a paragraph tag \p
NB. This includes all the "Selah" words in the Psalms & Habakkuk. See issue #92
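The effect of the missing $ can be illustrated in Python. The real TextPipe pattern is not quoted in this thread, so the regular expressions below are guesses at its general shape; the double danda is U+0965, and all names are hypothetical.

```python
import re

DOUBLE_DANDA = "\u0965"

# Without an end-of-line anchor: every double danda gets a \p after it,
# even mid-verse, which pushes any following text onto the \p line.
UNANCHORED = re.compile(DOUBLE_DANDA + r"[ \t]*")

# With "$" under re.MULTILINE: only a double danda that ends its line matches.
ANCHORED = re.compile(DOUBLE_DANDA + r"[ \t]*$", re.MULTILINE)

def add_paragraph_markers(text: str, pattern: re.Pattern) -> str:
    """Insert a USFM \\p line after each double danda matched by the pattern."""
    # A function replacement avoids backslash-escape rules in the template.
    return pattern.sub(lambda m: DOUBLE_DANDA + "\n\\p ", text)

verses = '\\v 26 peace \u0965 "\n\\v 27 next \u0965\n'
print(add_paragraph_markers(verses, UNANCHORED))
print(add_paragraph_markers(verses, ANCHORED))
```

With the unanchored pattern the closing quotation mark of verse 26 lands after the inserted \p, as in the faulty output shown above; with the anchored pattern that verse is left alone.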
Even so, this prompts the question of whether the exceptionally positioned double danda in Numbers 6:26 should be placed after the quotation mark rather than before it.
It also raises the important question of whether verses that do have a mid-verse double danda should still be split.
Here are all the lines in the concatenated USFM file that now contain a mid-verse double danda.
mid-verse double danda search results.txt
Minor tweak made to the TextPipe filter to tackle one pair of quotation marks that spanned three verses.
Numbers 6:24-26, now with left and right quotation marks, reads:
\v 24 “ ਯਹੋਵਾਹ ਤੇਨੂੰ ਬਰਕਤ ਦੇਵੇ ਅਤੇ ਤੇਰੀ ਰਾਖੀ ਕਰੇ,
\v 25 ਯਹੋਵਾਹ ਆਪਣੇ ਮੁਖੜੇ ਨੂੰ ਤੇਰੇ ਉੱਤੇ ਚਮਕਾਵੇ ਅਤੇ ਤੇਰੇ ਉੱਤੇ ਦਯਾ ਕਰੇ,
\v 26 ਯਹੋਵਾਹ ਆਪਣਾ ਮੁਖੜਾ ਤੇਰੇ ਵੱਲ ਫੇਰੇ ਅਤੇ ਤੈਨੂੰ ਸ਼ਾਂਤੀ ਦੇਵੇ ॥ ”
I have not yet done anything to insert a paragraph marker \p after verse 26.
There are now only 3 remaining ordinary double quotation marks:
U+0022 " 3 QUOTATION MARK
merged.usfm.character.frequency.txt
The 3 remaining ordinary quotation marks can be fixed by manual editing.
I have created a new issue #97 to address this.
That means that this larger issue can now be closed.
There are still 65 more LEFT double quotation marks than RIGHT ones.
Are all these explained by the use of Continuation Quotes?
Or might some of them be instances where a RIGHT was inadvertently keyed instead of a LEFT?
Further analysis is required.
I have just made a TextPipe filter to facilitate the analysis.
It finds matching pairs of left and right double quotation marks and provisionally replaces them with «...» (pointing double quotation marks).
The output file contains:
- 647 matched pairs of «...»
- 74 instances of the left double quotation mark
- 9 instances of the right double quotation mark
Thus there are 83 locations that need to be reviewed and corrected.
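The same kind of pairing analysis can be sketched in Python. This is not the TextPipe filter itself: the sketch assumes that a pair may span line breaks and that a pair is an opening mark followed by a closing mark with no other double quotation mark in between; `mark_matched_pairs` is a made-up name.

```python
import re

LEFT, RIGHT = "\u201c", "\u201d"

# A left mark, then text containing no left or right double quotation mark,
# then a right mark. Line breaks are allowed inside the pair (an assumption).
PAIR = re.compile(LEFT + "([^" + LEFT + RIGHT + "]*)" + RIGHT)

def mark_matched_pairs(text: str):
    """Swap matched pairs for «...» and count what is left unmatched."""
    marked, pairs = PAIR.subn("\u00ab\\1\u00bb", text)
    return marked, pairs, marked.count(LEFT), marked.count(RIGHT)

demo = "\u201ca\u201d \u201cb \u201cc\u201d d\u201d"
marked, pairs, left_only, right_only = mark_matched_pairs(demo)
print(marked, pairs, left_only, right_only)
```

Under these assumptions a continuation quote (a second left mark before the first is closed) leaves its first left mark unmatched, so it shows up in the residue for review.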
I will create a new issue for this.
See issue #103