Comments (5)
Sample files demonstrating the issue:
format.txt
sample.txt
Note that format.txt would be format.ini if GitHub allowed uploading .ini files.
from fwdataviz.
This is because FWDataViz uses byte counts, and not character counts, for determining field widths. This choice was made during the initial stages to keep the core algorithm for the plugin simple and fast. Unicode usage was enabled only on the File Type Label, Record Type Label and Field Labels since their use with those labels didn't affect that goal.
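To illustrate the difference, here is a minimal Python sketch (not the plugin's actual code) that slices one record by byte offsets and by character offsets. The field widths and the sample record are hypothetical and are not taken from the attached format.txt or sample.txt.

```python
# Hypothetical field widths, for illustration only.
FIELD_WIDTHS = [5, 3]


def split_by_bytes(record, widths):
    """Slice the UTF-8 encoded record at byte offsets (byte count mode)."""
    raw = record.encode("utf-8")
    fields, pos = [], 0
    for width in widths:
        # errors="replace" guards against a slice that cuts a character in half.
        fields.append(raw[pos:pos + width].decode("utf-8", errors="replace"))
        pos += width
    return fields


def split_by_chars(record, widths):
    """Slice the record at character offsets (character count mode)."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields


# "ë" occupies 2 bytes in UTF-8, so this record is 8 characters but 9 bytes.
record = "Zoë  123"
by_bytes = split_by_bytes(record, FIELD_WIDTHS)
by_chars = split_by_chars(record, FIELD_WIDTHS)
```

With byte offsets, the two-byte "ë" uses up one extra byte of the first field, so every later field boundary lands one position too early. With character offsets the fields line up as intended.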
That being said, I took the time to review the code before posting this response here. My preliminary observation is that a File Type-level flag can be added to switch the algorithm between byte count (existing) and character count (new) modes for determining field widths.
The UTF-8 character encoding can yield between 1 and 4 bytes per character. Character counting entails inspecting every byte in each record, and on every paint event of the NPP document (this paint event is triggered more frequently than you would think). This overhead affects the plugin's performance. Therefore, the switch to character count mode should be made on only those file types that may require it. So, I plan to make this switch available in the File Type Configuration dialog in a very obscure, non-intuitive manner. Only those needing this mode should be able to make the switch.
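The per-byte inspection described above can be sketched in Python as follows. This is only an illustration of the general technique, assuming well-formed UTF-8 input; the function names are hypothetical and do not come from the plugin. In UTF-8, continuation bytes always have the bit pattern 10xxxxxx, so a character boundary is any byte that does not match that pattern.

```python
def count_chars(raw):
    """Count characters by counting bytes that are not UTF-8 continuation bytes."""
    return sum(1 for byte in raw if (byte & 0xC0) != 0x80)


def char_to_byte_offset(raw, char_count):
    """Return the byte offset just past the first `char_count` characters."""
    offset = 0
    chars = 0
    while offset < len(raw) and chars < char_count:
        offset += 1  # step over the lead byte of the current character
        # Step over any continuation bytes (10xxxxxx) that belong to it.
        while offset < len(raw) and (raw[offset] & 0xC0) == 0x80:
            offset += 1
        chars += 1
    return offset


raw = "Zoë  123".encode("utf-8")   # 8 characters, 9 bytes
field_end = char_to_byte_offset(raw, 5)  # byte offset where a 5-character field ends
```

Both helpers touch every byte up to the position of interest, which is the overhead mentioned above when the scan is repeated for each record on every paint event.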
Also, note that turning on the character count mode means that TAB characters will be treated as multiple spaces and counted as such. This characteristic is part of the Scintilla API. You can see this in play with my other plugin, GotoLineCol, in which both byte-based and character-based column navigation are available.
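As a rough illustration of the TAB behavior, the following Python sketch computes a column number in which each TAB advances to the next tab stop. It assumes tab stops at fixed multiples of a tab width; the function name and the default width of 4 are hypothetical, and the sketch does not call Scintilla itself.

```python
def visual_column(line_prefix, tab_width=4):
    """Column reached after `line_prefix`, with each TAB jumping to the next tab stop."""
    column = 0
    for ch in line_prefix:
        if ch == "\t":
            # Advance to the next multiple of tab_width.
            column += tab_width - (column % tab_width)
        else:
            column += 1
    return column


# "ab" reaches column 2, the TAB jumps to column 4, and "c" ends at column 5.
col = visual_column("ab\tc", tab_width=4)
```

Under this assumption, a single TAB can count as anywhere from 1 to `tab_width` columns depending on where it falls in the line, so field boundaries after a TAB shift accordingly.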
Lastly, in this round of enhancement, the File Record Terminator, ADFT Line Regex Keys, and Record Type Regex Keys could continue to be limited to ANSI characters only. We will take baby steps, and only if and when we need them.
It will take me a few days to implement this enhancement. In the meantime, if you have any additional input, please let me know. Thanks!
The enhancement for Multi-Byte Character mode is available in the latest 2.2.0.0 Release.
Please be sure to read the Multi-Byte Character document. You may benefit from enabling the hidden option for quick override mode from the plugin panel.
DLL-only manual upgrade of a previously installed version of the plugin
Go to the 2.2.0.0 Release page. Download the zip file version to match your Notepad++ bitness. Then, either:
- In Notepad++, navigate to menu: Settings » Import » Import Plugins... and import the dll file extracted from the zip file.
- OR, extract the FWDataviz.dll file into the <NPP_Plugins_folder>/FWDataviz folder to overwrite the existing DLL therein.
That is an excellent solution to the problem, and a fine compromise between keeping the highlighter fast while also supporting formats with multi-byte characters.
Thank you for your great work!
@rasmusgreve, please try out the latest v2.3.0.0 release of FWDataViz. This release fully supports multi-byte characters, including in Record Terminators, ADFT Regexes and Record Markers. It also adds Field Copy, Field Paste and Field Hop, all of which are also accessible via menu-based keyboard shortcuts.
Thank you.