shriprem / fwdataviz

Fixed Width Data Visualizer plugin for Notepad++. Turns Notepad++ into Excel for fixed-width data files. Displays cursor position data. Jumps to specific fields. Folding Record Blocks. Extracts Data. Builtin dialogs to configure file-type, record-type & fields; Themes & Colors; and Folding. Handles homogenous, mixed & multi-line records.

License: GNU General Public License v2.0

Batchfile 0.12% C++ 76.19% C 23.69%
fixed-width visualizer navigator notepad-plus-plus plugin data-visualization excel data-extraction

fwdataviz's Introduction

FWDataViz


Fixed Width Data Visualizer plugin for Notepad++

Current Version: 2.6.3.1

image ICD-10 Billable-Flagged Order Codes Sample File, with folding applied.

Features at the Panel

Features beyond the Panel


Plugin Panel

Plugin_Panel

  • Click the View Sample Files icon and choose from the menu options to view various sample files.

  • Click the File Type Metadata Editor icon to view, modify or create your custom File Type definitions. For more information, see: File Type Metadata Editor.

  • Click the Visualizer Theme Editor icon to view, modify or create your custom Visualizer Theme definitions. For more information, see: Visualizer Theme Editor.

  • Click the Preferences button to specify preferences for the plugin. For more information, see: Preferences.

  • Check the Auto-detect File Type box to automatically visualize files with matching file type from the defined list. For more information, see: Auto-Detect File Type Configuration.

  • To make the Multi-Byte Chars checkbox visible on the plugin panel, see: Preferences. Since this is a 3-state checkbox, please review its documentation at: Multi-byte Character Data Visualization.

  • Check the Default Background box to render the fixed-width fields with just the text colors while suppressing the background colors of the theme styles.

  • Check the Show Calltip box to display the Cursor Position Data in a calltip within the editor, right below the current cursor position. The calltip will be useful during presentations and other situations when there is a need to avoid an additional glance towards the side panel to view the same Cursor Position Data.

    💡 To display the calltip text in bold, check the option for Use DirectWrite in Notepad++ Settings » Preferences » MISC.

  • Check the Trim Field Copy box to automatically ignore the left or right padding characters when copying a field. For more information, see: Field Copy.

  • For more information on the Jump to Field and left & right hop buttons, see: Quick Field Navigation.

  • For more information on Field Copy and Field Paste buttons, see: Field Copy and Field Paste.

  • For more information on the Data Extraction button, see: Data Extraction.

  • For more information on the buttons in the Folding section, see: Foldable Record Blocks and Fold Structures Editor.

  • Clicking the button at the bottom-right corner of the plugin panel displays a refreshed list of file paths for the INI files active for the current fixed-width file. Hovering over this button displays the last refreshed list.


Sample data files

Homogenous Record-type Fixed-width Data Files

Files with data in the same record format.

NOAA Weather Stations List Single_Rec

Mixed Record-type Fixed-width Data Files

Files with data in multiple record formats.

NOAA Weather Stations List (Same data from the preceding sample, but now flagged based on the GSN field) Multi_Rec

Treasury IPAC File (Real format, but with fake data. Foldable.) Multi_Rec

In this sample clip:

  • The records are being displayed in foldable blocks. For more information on this, see: Foldable Record Blocks.
  • The Transaction Header records are being visualized with a different theme. For more information on this, see: Record Type Theme.

Multi-Line Record-type Fixed-width Data Files

Files with data records that span multiple lines. End of record is indicated by a marker. For example: <END_REC>

Owing to such a marker, records for these files can be of varying width.
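As a rough illustration of the multi-line concept (not the plugin's actual code), the sketch below groups physical lines into logical records, closing each record at the line that contains the marker. The function name and the sample lines are hypothetical, and it assumes the marker may appear anywhere on the closing line.

```python
END_MARKER = "<END_REC>"  # marker text taken from the example above

def split_records(lines, marker=END_MARKER):
    """Group physical lines into logical records, closing each at the marker."""
    records, current = [], []
    for line in lines:
        current.append(line)
        if marker in line:          # this line ends the current record
            records.append(current)
            current = []
    if current:                     # trailing lines with no closing marker
        records.append(current)
    return records

# Hypothetical sample: one two-line record followed by a one-line record.
sample = [
    "STN001 First station",
    "wiki text line<END_REC>",
    "STN002 Second station<END_REC>",
]
records = split_records(sample)
```

Because records end at a marker rather than at a fixed column, each record can span a different number of lines and characters.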

Weather Stations List with location wiki and Daily Weather data (A contrived format, for concept illustration) Multi_Line


Installation

  1. Install Notepad++ version 8.4 or higher.
  2. Open Notepad++.
  3. In Notepad++, go to menu Plugins » Plugins Admin....
  4. In the Available tab, check the box for Fixed-width Data Visualizer.
  5. Click the Install button.

Attributions

  • Most of the icons used in this plugin are the originals or derivatives of the Fugue Icons designed by Yusuke Kamiyamane.

  • All screen clippings in this repository were made using FastStone Capture. Small-size application with awesome features!

  • The core visualizer algorithm of this plugin was first prototyped in Python using David Brotherstone's Python Script plugin for Notepad++.


fwdataviz's Issues

[Bug] Erroneous regex causes crash

I created an entry with an erroneous regexp, like this:
image
The plus should have been escaped with a backslash!

Result: Notepad++ hung and eventually disappeared. Needs to be fixed.
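One defensive approach, sketched here in Python rather than in the plugin's own C++, is to compile a user-supplied pattern up front and treat a malformed one as "no match" instead of letting the error propagate. The function name and the sample patterns are hypothetical; the pattern in the screenshot is not reproduced here.

```python
import re

def compile_key(pattern):
    """Return a compiled regex, or None when the pattern is malformed."""
    try:
        return re.compile(pattern)
    except re.error:
        return None  # caller can warn the user and skip this entry

bad = compile_key("+CREDIT")      # unescaped plus: nothing to repeat
good = compile_key(r"\+CREDIT")   # escaped plus matches a literal '+'
```

With this shape, an invalid entry can be flagged in the editor dialog while the remaining entries continue to work.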

'Auto-Detect File Type' crashes Notepad++ when a file with a line having more than 32767 chars is opened

Description of the Issue

While 'Auto-Detect File Type' is enabled, FWDataViz plugin is causing Notepad++ to crash when a file with a line having more than 32767 byte characters is opened.

Steps to Reproduce the Issue

  1. Prepare a file with at least one line having more than 32767 byte characters.
    [For this purpose, I used the 'Preferences' file of the Microsoft Edge browser. This compressed JSON file had a single line with 290,000+ characters (YMMV). The normal path template for this file is: 'C:\Users\UserNameHere\AppData\Local\Microsoft\Edge\User Data\Snapshots\VersionNumberHere\Default\Preferences']
  2. With the FWDataViz panel mounted and 'Auto-Detect File Type' enabled (i.e., checked), open the prepared file.

Expected Behavior

Notepad++ must not crash.

Actual Behavior

Notepad++ crashes. It avoids the crash only when 'Auto-Detect File Type' is not enabled.

Debug Information

Notepad++ v8.5.7 x64
FWDataViz plugin v2.6.3.0 x64

Feature requests: Non-wrapping Field Labels, Record Length, Auto-Detection of File Type

Hello, Sridhar

I wonder if you might consider a couple of small changes? And thank you for a wonderful productivity tool!

  1. Bug - The visualizer panel word-wraps the record type and field label even if the panel is wide enough to accommodate the string. See screen shot.
  2. Feature request - I would like to see the record length added to the panel. Both the "designed" and "actual" record length, in case there is a discrepancy.
  3. Feature request - Could you add an "auto detect" for the file type, perhaps based on a regex rule? I work with multiple data file types, each with multiple versions, so it would be great to automatically select the right file type based on contents.
    Clipboard01
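The auto-detect idea in item 3 could look something like the sketch below: each file type carries a regex rule, and the first rule that matches the first line of the file wins. The file type names and patterns are invented for illustration and do not come from the plugin's configuration.

```python
import re

# Hypothetical file types, each with a detection rule applied to the first line.
# Order matters: more specific versions should be listed first.
FILE_TYPES = [
    ("Billing v2", re.compile(r"^HDR02")),
    ("Billing v1", re.compile(r"^HDR01")),
    ("Inventory", re.compile(r"^INV\d{4}")),
]

def detect_file_type(first_line):
    """Return the name of the first file type whose rule matches, else None."""
    for name, rule in FILE_TYPES:
        if rule.search(first_line):
            return name
    return None
```

For example, `detect_file_type("HDR02 20230617")` would select the hypothetical "Billing v2" type, while a line matching no rule leaves the file unvisualized.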

Kind regards,
Jeff Bowles

Multi-byte characters break highlighter/fields

Multi-byte characters break the highlighter and field definitions.
In the format that I'm currently looking at, some text fields can contain multi-byte characters (see attached image).
The two lines in the image have the same record type and field layout, but because one of the lines contains the multi-byte characters: æøå, the line is highlighted wrong.
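The effect can be reproduced outside the plugin. In the sketch below, a hypothetical record has a 6-column text field followed by a 5-column numeric field. Slicing by characters finds the numeric field, while slicing the UTF-8 bytes at the same offsets lands short, because æ, ø and å each occupy two bytes.

```python
line = "æøå   12345"          # 3 two-byte letters + 3 spaces, then the number
start, width = 6, 5           # hypothetical field: 0-based start, width in columns

# Character-based slicing: offsets count visible characters.
by_chars = line[start:start + width]

# Byte-based slicing: the same offsets applied to the UTF-8 encoding.
encoded = line.encode("utf-8")
by_bytes = encoded[start:start + width].decode("utf-8", errors="replace")

char_len = len(line)          # 11 characters
byte_len = len(encoded)       # 14 bytes
```

Here `by_chars` is `"12345"` but `by_bytes` is `"   12"`, which matches the kind of shifted highlighting described above.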

image

Bug in column navigation (arrows)

Description of the Issue

Column navigation (arrows) near the Jump to Field is off by 1 space

Steps to Reproduce the Issue

  1. Load a file with the visualizer showing
  2. Set cursor to beginning of first line
  3. Click the right arrow >

Expected Behavior

The cursor moves to the first letter of the next field

Actual Behavior

If n is the location of first letter of the next field the cursor moves to n-1

Debug Information

Notepad++ v8.5.4 (64-bit)
Build time : Jun 17 2023 - 20:42:45
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line :
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows Server 2022 Datacenter (64-bit)
OS Version : 21H2
OS Build : 20348.1850
Current ANSI codepage : 1252
Plugins :
ComparePlugin (2.0.2)
FWDataViz (2.6.2)
HugeFiles (0.4.1)
IndentByFold (0.7.3)
JsonTools (4.10)
MarkdownViewerPlusPlus (0.8.2)
mimeTools (2.9)
NppConverter (4.5)
nppcrypt (1.0.1.6)
NppEditorConfig (0.4)
NppExport (0.4)
NppFavorites (1.0.0.1)
NPPJSONViewer (2.0.4)
NppSaveAsAdmin (1.0.211)
SurroundSelection (1.4.1)
XMLTools (3.1.1.13)

The definition has a start position and a width, but this behaves as if it's using the character position, which starts at 1, with a 0-based index.

Jump to Field works as expected. Although it can be hard to tell with cursors in screenshots, here is one showing how the cursor was set one character to the left of where it should be.
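Mixing 1-based columns with 0-based offsets is a common source of this kind of off-by-one error. The sketch below is not the plugin's code; it assumes field definitions store a 1-based start column and a width, and it converts the start to a 0-based offset exactly once before looking up the next field. The layout values are hypothetical.

```python
from bisect import bisect_right

# Hypothetical field layout: (1-based start column, width)
FIELDS = [(1, 11), (13, 8), (22, 9)]

def next_field_offset(cursor, fields=FIELDS):
    """Return the 0-based offset of the first field starting after `cursor`."""
    # Convert 1-based columns to 0-based editor offsets in one place only.
    starts = [start - 1 for start, _ in fields]
    i = bisect_right(starts, cursor)
    return starts[i] if i < len(starts) else None
```

With the cursor at offset 0, the next field begins at column 13, so the expected landing offset is 12; landing on 11 would be the n-1 behavior reported here.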

image

Folding feature request

Requesting the ability to flag certain record types as "fold begin" and "fold end" points

Most of my data files are tens of thousands of records long, consisting of contiguous and logically-related groups of records I call "record sets". It would be very helpful if certain record types could be designated as "fold begin" points, while others could be "fold end" points.

Why the standard "user defined language" feature of N++ doesn't suit my need
I have tried using the standard "user defined language" feature of N++ to designate a pattern that identifies "fold begin", but I've not found a way, with the type of data files I use, to reliably indicate a "fold end". Mainly, this is because the logical groups of records do not always end with the same record type.

Proposed solution
For this reason, it would be a wonderful feature if FWDataViz could allow the "fold begin" point to also be the "fold end" for the prior logical group.

A little background info
I posted a question in the N++ forum, but got a disappointing answer that the UDL feature has not been updated in a long time and my request would be behind many others in priority.

Examples and (I hope) clarifications

Here are some examples of typical "record sets" using common metasyntactic symbols.

Example 1 - a logical record set

foo record
bar record
baz record
plugh record
qux record

Example 2 - a logical record set

foo record
bar record
baz record
plugh record
qux record
fred record
waldo record

Example 3 - logical record set

foo record
bar record
baz record
plugh record
fred record

In all examples, "foo" is the first record of my record set, so I would like to be able to check a box on the FWDataViz metadata editor indicating "this is a fold-opening record type".

For a "waldo" record (Example 2), I could check a different box indicating "this is a fold-closing record type". In my world, "waldo" always marks the end of a logical record set, if present.

Conversely, I would not declare "qux" or "fred" as fold-closing types, because they are not always present, nor are they always the final record type in a set.

The real magic would be the additional ability to designate "foo" records as being both a fold-close (ending prior record set) and a fold-open point (beginning next record set). Maybe we call this a "open implies close" feature. I don't know if special handling would be needed at BOF and EOF, but I would not care so much if the top and bottom record set in a file were broken / did not fold correctly, as long as the thousands in between worked.
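To make the "open implies close" idea concrete, here is a minimal sketch using the record types from the three examples above. The function name and the (first line, last line) return shape are assumptions for illustration; BOF needs no special handling here, and EOF simply closes whatever set is still open.

```python
def fold_blocks(record_types, openers=frozenset({"foo"}), closers=frozenset({"waldo"})):
    """Return (first_line, last_line) index pairs, one per record set."""
    blocks, start = [], None
    for i, rtype in enumerate(record_types):
        if rtype in openers:
            if start is not None:            # "open implies close"
                blocks.append((start, i - 1))
            start = i
        elif rtype in closers and start is not None:
            blocks.append((start, i))        # explicit fold-closing record
            start = None
    if start is not None:                    # EOF closes the last open set
        blocks.append((start, len(record_types) - 1))
    return blocks

# Examples 1, 2 and 3 concatenated into one file.
types = ["foo", "bar", "baz", "plugh", "qux",
         "foo", "bar", "baz", "plugh", "qux", "fred", "waldo",
         "foo", "bar", "baz", "plugh", "fred"]
blocks = fold_blocks(types)
```

The three record sets come out as lines 0 to 4, 5 to 11 and 12 to 16, even though only the second set ends with a "waldo" record.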

Sorry for the long-winded explanation. If there is any interest in adding this feature, I'd be happy to provide more examples or clarification of the problem / proposed solution.

Thanks for an already-awesome plugin!

Jeff

Crash when clicking the Plus or Minus buttons on the filled line items after a Data Extraction

Steps to Replicate

  1. Load a fixed-width data file.
  2. Click the Data Extraction button.
  3. In the Data Extraction dialog specify a record type and field type on a few (1 or more) line items.
  4. Click the Extract Data button.
  5. Now click the Plus button next to one of the line items wherein you had specified the record and field types. This will cause Notepad++ to crash immediately.

Diagnosis

When Extract Data button is clicked, there is a change of the active document in the Notepad++ editor. To start with, the active document in Notepad++ was the fixed-width file that launched the Data Extraction dialog. But now the active document is the new document with the extracted data. This change of the active document is causing the crash.

Enhancement: Allow certain fields to be hidden

It would be useful for some record types to be able to mark "filler" fields as "do not display". Especially when these fields are near the beginning of the record. This way, one might not have to scroll as often horizontally to view the important data fields.

This feature could also be useful at the record level. Certain file formats contain boilerplate record types that don't really contribute information to the person reading the file. For example, copyright notices or "authorized use only" warnings.

FWDataViz will crash Notepad++ when scrolling beyond last line

Steps to Replicate

  1. In Notepad++, go to Preferences » Editing. Check the box for Enable scrolling beyond last line.
    image

  2. Then open a fixed-width data file. With the FWDataViz sidepanel open, the file should start getting colorized.

  3. Now, try to scroll past the last line of the file. Notepad++ will crash.

Note: This worked fine in FWDataViz 1.4.2.0 when this action was last tested.

Bug: When using foldable/collapsable regions, styles aren't applied to the last lines.

Hi, thanks for creating this plugin.

I observed an issue with foldable regions: after creating one and collapsing it, the style is no longer applied to the last x visible lines, where x is the number of collapsed (hidden) lines. You can reproduce this using one of the sample files, such as ghcnd-stations.txt: change the language in the Notepad++ Language menu to something like C, add {} with some blank lines to create a collapsible region, then collapse it.

expanded

collapsed

Porting to VS Code

Are there any plans to port this plugin as a Visual Studio Code extension?

Creating new fold structures crashes Notepad++

Creating new fold structures in the Editor frequently crashes Notepad++ for me.
I can't pinpoint exactly what triggers it or how to reproduce it. The same action works once, then crashes N++ the next time (the app just closes). If I find a pattern, I'll add it here. For now, it is worth investigating further.

Visualizer Styles don't work on large files

Description of the Issue

When I try to apply a visualizer to a large file the colors don't appear. The field information still works, however. In my case, this is a text file of about 1GB, with over 1 million lines of about 900 characters each. However, if I copy and paste the entire content into a new tab, the visualizer applies the color styling without issue. Also, if I cut the file down to about 100,000 lines, the style works correctly. So it seems to have something to do with the size of the file, and that the file was opened from disk, rather than being created in a new tab.

Steps to Reproduce the Issue

  1. Open a very large file (1GB with 1,000,000 or more lines, 900+ chars per line)
  2. Try to apply any of the FW file types.
  3. The field data works, but the colors don't apply.
  4. Copy and paste the content of the file into a new tab.
  5. Try to apply any of the FW file types.
  6. The colors show as intended.

Expected Behavior

The colors should show on the source file, without having to copy and paste the content.

Actual Behavior

The colors did not show until after copying and pasting the content into a new tab.

Debug Information

Notepad++ v8.5.4 (64-bit)
Build time : Jun 17 2023 - 20:42:45
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line : "D:\Junk\test (2).txt"
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 11 Pro (64-bit)
OS Version : 22H2
OS Build : 22621.1992
Current ANSI codepage : 1252
Plugins :
BigFiles (0.1.3)
ComparePlugin (2.0.2)
CSVLint (0.4.6.5)
FWDataViz (2.6.2)
NPPJSONViewer (2.0.5)

Trim Fields for Data Extraction

There is a "Trim Field Copy" feature, which works great. But there is no similar feature for Data Extraction, which would be great to have.
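A trim option for extraction could behave like the sketch below, which slices a field by start and width and optionally strips the padding character from both ends. The function name, the 0-based start convention and the space padding are assumptions, not the plugin's actual behavior.

```python
def extract_field(line, start, width, trim=False, pad=" "):
    """Slice a field by 0-based start and width, optionally dropping padding."""
    value = line[start:start + width]
    return value.strip(pad) if trim else value

# Hypothetical record: a 10-column left-aligned name, a 7-column right-aligned number.
line = "SMITH".ljust(10) + "42".rjust(7)

raw_name = extract_field(line, 0, 10)
trimmed_name = extract_field(line, 0, 10, trim=True)
trimmed_number = extract_field(line, 10, 7, trim=True)
```

Untrimmed, the name comes back with its five trailing spaces; trimmed, the extracted values are just `"SMITH"` and `"42"`.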

No File Type visible in FWDataViz panel initially

After loading a file and opening the panel, there is no visible File Type to choose. After going into the File Type Metadata Editor and just clicking "Save as Primary Configuration" and "Close", it appears all of a sudden.

image

Continuous Flicker with Show Calltip option when Word Wrap is enabled

Description of the Issue

There is a continuous flicker with the Show Calltip option enabled in FWDataViz when Word Wrap is enabled in Notepad++.

Steps to Reproduce the Issue

  1. Open one of the FWDataViz sample files. For example: Weather Stations List or ICD-10 Diagnosis Codes.
  2. Enable the Show Calltip option in the FWDataViz panel.
  3. Enable Word Wrap option in Notepad++. The calltip displayed at the cursor position in the editor will start flickering continuously.

Enable / Disable plugin ondemand

This plugin is awesome.

One problem is that it can't be disabled on demand, so when Notepad++ is used to edit any other file type (source code, ini, etc.), that file type's own syntax highlighting is overridden.

npp

Enhancement: Repeatable "segments"

Some document types (X12 EDI format comes to mind) allow for certain records to have repeatable groups of fields called segments. The fields within the segments are all fixed-width, but there may be N number of the same segment type within a record.
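As a sketch of what the request amounts to, the snippet below treats a record as a fixed-width header followed by any number of equal-width segments. The widths, the function name and the sample record are hypothetical and are not taken from any real file specification.

```python
def split_segments(record, header_width, segment_width):
    """Split the part of a record after its fixed header into equal-width segments."""
    body = record[header_width:]
    return [body[i:i + segment_width] for i in range(0, len(body), segment_width)]

# Hypothetical record: a 5-column header followed by three 6-column segments.
record = "HDR01" + "AAA111" + "BBB222" + "CCC333"
segments = split_segments(record, header_width=5, segment_width=6)
```

Each returned segment could then be visualized with the same field layout, regardless of how many times it repeats in the record.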

EDI file viewing may be beyond the simple design philosophy of FWDataViz. But it would be an awesome feature!

[Feature Request] - change extension of Themes file

Please use .ini or another traditional text file extension for the configuration file currently named Themes.dat. I note that all the other configuration files use the .ini extension and I believe that .dat is very often used for binary files. Using .ini makes it much easier to directly edit the file as well as preview it in the Explorer preview pane.

Thank you.

Desire for a Variable Width Data Viz.

Would it be possible to create a variable-width data visualizer? I am new to GitHub and to this plugin. I'm not a coder, but I think being able to specify characters present in every line as the column boundaries (for example, a tab or a comma) instead of fixed widths would be quite useful, as NPP does not seem to like aligning text to visual columns.

Apologies if this isn't a correct place to post this 'Idea' and for not seeing if this had already been suggested.

Enhancement: Allow certain record types to be "decorated"

It would be helpful if certain record types (or even fields within records) could have a boolean property that indicates the record (or field) should be emphasized in some way.

For example, a file might have numerous "documents" within it. If all of these begin with a particular record type, it would help if a brightly colored horizontal line could be drawn above that record type, effectively adding a visual demarcation of each new document within the file.

Manually set selections of File Type and Visualizer Theme are being lost when returning to the same doc after briefly switching to a different doc in NPP

Manually set selections of File Type and Visualizer Theme are being lost when returning to the same document after briefly switching to a different document in Notepad++. This issue started in the FWDataViz plugin with the Notepad++ 8.4 version.

Steps to Reproduce

  1. Open several files in Notepad++.
  2. On one of the fixed-width files, manually select a different fixed-width File Type and/or Visualizer Theme from their respective dropdowns in the side panel.
  3. Switch to a different document tab in Notepad++.
  4. Return to the original tab with the fixed-width file. The manually selected changes made in step 2 are lost.

Diagnosis

In Notepad++ 8.4 version, the Scintilla component was upgraded from version 4.x to 5.x. FWDataViz relied on two Scintilla API methods (SCI_SETPROPERTY & SCI_GETPROPERTY) to cache the fixed-width File Type and Visualizer Theme preferences on a per-document basis. This worked well in Notepad++ versions 8.3.3 and earlier that were built with the Scintilla 4.x component. But the feature set of the two API calls got drastically curtailed in Scintilla 5 with the separation of the Lexer interface. So, the document-level custom choices of File Type and Visualizer Theme are being lost when the tab is switched to a different document in Notepad++ versions 8.4.0 and later.
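One possible way to keep such per-document choices without relying on editor properties is a plugin-side cache keyed by a document identifier. The Python sketch below only illustrates that idea; the class, its methods and the integer IDs are hypothetical, and this is not a description of how the plugin resolved the issue.

```python
class DocChoiceCache:
    """Remembers the manually chosen File Type and Theme for each open document."""

    def __init__(self):
        self._choices = {}  # doc_id -> (file_type, theme)

    def remember(self, doc_id, file_type, theme):
        self._choices[doc_id] = (file_type, theme)

    def recall(self, doc_id):
        return self._choices.get(doc_id)  # None means "no manual choice yet"

    def forget(self, doc_id):
        self._choices.pop(doc_id, None)   # call when the document is closed

cache = DocChoiceCache()
cache.remember(101, "Treasury IPAC File", "Spectrum")  # hypothetical document ID
```

On a tab switch, the plugin would call `recall` with the newly active document's ID and reapply the stored pair, falling back to auto-detection when nothing is stored.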

FWDataViz is visualizing a fixed-width file only when the plugin is the topmost in a stack of panels

Description of the Issue

FWDataViz is visualizing a fixed-width file only when the plugin is the topmost in a stack of panels.

Steps to Reproduce the Issue

  1. Have the FWDataViz plugin panel open in Notepad++.
  2. Open other plugin panels such as Explorer or GotoLineCol so that the FWDataViz panel is stacked behind these other panels.
  3. Open a fixed-width data file.

Expected Behavior

The fixed-width file should get visualized by FWDataViz as long as its panel has been mounted.

Actual Behavior

The fixed-width file is not being visualized by FWDataViz when its panel is behind other panels in a stack.

No colorization after switching to 64 bit

This is possibly an environmental issue and not a bug.

  • I decided to update Notepad++ to 64-bit.
  • The install offered to remove the 32-bit version (keeping settings), which I allowed.
  • I downloaded the 64-bit version of FWDataViz 2.4.0.2 via the Plugins Admin tool.
  • FWDataViz properly auto-detects my custom file definitions, and all the fields are still identified correctly
  • The theme is still set to "Spectrum"
  • No coloring is happening
  • If I "View Sample Files" from the FWDataViz dialog, and choose one of the predefined files, coloring works.

Any ideas for things to try?

[Bug] Theme applied incompletely during switching when they have different number of colors defined

If you switch between themes that have a different number of colors defined (in my case 12 vs 2), the colors in positions that don't exist in one of them are not reapplied, and the resulting screen shows either a mixture of the two themes or a subset of the target theme. It looks like the number of colors to be applied is taken from the old theme instead of the new one. Switching to another app and back to N++ fixes this.

  1. Starting with a colorful theme Tango Stripes:
    image

  2. Going to change theme to Gray:
    image

  3. I expect this:
    image

  4. But I get this:
    image

  5. Then I fix it (see screen as in point 3), and I switch from Gray back to Tango Stripes, and again an error:
    image

Adding my Themes.ini for reference.
Themes.zip
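The suspected cause can be illustrated with a small sketch. It assumes, purely for illustration, that field colors are assigned by cycling through the theme's color list; the key point is that the cycle length must come from the theme being applied, not the one being replaced. The palettes and the function name are hypothetical.

```python
# Hypothetical palettes; the real definitions are in the attached Themes file.
TANGO = [f"tango{i}" for i in range(12)]
GRAY = ["gray-light", "gray-dark"]

def styles_for_fields(field_count, theme_colors):
    """Assign one theme color per field, cycling when fields outnumber colors."""
    count = len(theme_colors)  # always taken from the theme being applied
    return [theme_colors[i % count] for i in range(field_count)]

before = styles_for_fields(12, TANGO)
after = styles_for_fields(12, GRAY)
```

After the switch, every one of the 12 fields resolves to one of the two Gray colors, with no leftover Tango Stripes entries.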

More control over coloring

Would it be possible to add some more control over the colors which are used for particular fields?
Consider the following cases:

  1. Monetary value contains a sign (+/-) 10 integer digits and 3 decimal digits. Such a value appears a few times in each record (each time containing all three elements). I would like all monetary values to be colored similarly, yet slightly different (for example integers and decimals in different shades of the same color).
  2. Similar record types appear in a file. Each of them contains transaction date, but depending on record type it is moved by 4 bytes. I would like to have the date in the same color, disregarding the fact which type of record I am seeing.

Sooo, these requests can be narrowed to this: would it be possible to assign colors to fields' labels? And perhaps have it (optionally) as a global setting rather than per-file-format?

So for example I would have a setting:

  • TransDate -> black on yellow
  • PostingDate -> black on orange
  • NetValueInt -> dark blue on light blue
  • NetValueDec -> dark blue on light blue + italic
  • BrutValueInt -> dark green on light green
  • BrutValueDec -> dark green on light green + italic

And then while creating file format definitions, I could use these field labels to have desired coloring.
On top of that I would create a simple color theme with just two colors, two shades of grey, and these colors would be used in all cases when particular color was not defined per field label.
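The proposal can be sketched as a label-to-style lookup with a two-color fallback theme. The labels and colors below come from the list above; the dictionary layout, the function name and the alternating fallback are assumptions made for illustration.

```python
# Global styles keyed by field label, as listed in the request.
LABEL_STYLES = {
    "TransDate":    {"fg": "black", "bg": "yellow", "italic": False},
    "PostingDate":  {"fg": "black", "bg": "orange", "italic": False},
    "NetValueInt":  {"fg": "dark blue", "bg": "light blue", "italic": False},
    "NetValueDec":  {"fg": "dark blue", "bg": "light blue", "italic": True},
    "BrutValueInt": {"fg": "dark green", "bg": "light green", "italic": False},
    "BrutValueDec": {"fg": "dark green", "bg": "light green", "italic": True},
}

# Hypothetical fallback theme: two shades of grey, alternated by field position.
FALLBACK_THEME = [
    {"fg": "black", "bg": "light grey", "italic": False},
    {"fg": "black", "bg": "dark grey", "italic": False},
]

def resolve_style(label, field_index):
    """Use the label's global style if defined, else the fallback theme."""
    fallback = FALLBACK_THEME[field_index % len(FALLBACK_THEME)]
    return LABEL_STYLES.get(label, fallback)
```

Because the lookup is by label rather than by position, a TransDate field gets the same color in every record type, even when it is shifted by 4 bytes.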

Great job, by the way!

NPP Plugin Version

The version of this plugin that you get from using the Plugin Admin in NPP is 2.1.2.1, and it does not show up as having an update available.

The version here on Github is currently 2.4.0.1.

I could find no way to replace the old version with the current build. Can you please post up how to do that and/or replace the version pulled automatically with the latest build?

Thanks in advance!!

Improve "Record Starts With"

Hi - I'll start by saying this is a fantastic, wonderful plugin! I work with lots of flat file data exchange formats and this will be a great time saver.

Possibly this feature already exists and I just have not found it yet.

My request:

One of the formats I work with has record identifiers not in the first text column, but in the fourth through seventh character positions. It would seem that the "Record Regex Key" alone should be enough to identify the record type, and the "Record Starts With" does not understand regex.

Is there an existing way to identify records that don't "start with" but "contain at position"? If not, would you consider adding this feature?
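For reference, a regex alone can express "contains at position" by skipping a fixed number of characters from the start of the line. The Python sketch below keys record types on characters 4 through 7; the type names and identifiers are hypothetical, and whether the plugin's regex engine accepts the same syntax is an assumption.

```python
import re

# Hypothetical record types keyed on characters 4 through 7 (1-based).
RECORD_TYPES = {
    "Header": re.compile(r"^.{3}HEAD"),
    "Detail": re.compile(r"^.{3}DETL"),
}

def record_type(line):
    """Return the first record type whose key matches at the fixed position."""
    for name, rx in RECORD_TYPES.items():
        if rx.match(line):
            return name
    return None
```

Here `record_type("001HEAD2023")` identifies a Header record, while a line that starts with "HEAD" in column 1 is not matched.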

Thank you,

Jeff
