Butler Spyglass

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Butler Spyglass is a tool for extracting metadata from Qlik Sense applications.

The tool will

  • Extract data lineage for all or some applications.
  • Extract load scripts for all or some applications.
  • Extract complete info for all data connections.
  • Run once or recurring at a configurable interval.

Why extract app metadata

Data lineage

When using Sense in enterprise environments, there is often a need to understand both what apps use a certain data source, and what data sources are used by a specific app.

  • When de-commissioning an old system that feeds several Sense apps with data, it is important to know which apps these are. Butler Spyglass provides this in the form of data lineage information.
  • If a data source contains sensitive information, it is important to always have up-to-date information on what apps use the data source in question.
  • Reviewing and auditing apps is greatly simplified if there is clear information on what data sources the app in question uses.

Load scripts

By storing all app load scripts as individual files on disk, it is possible to snapshot these daily and store them in one ZIP archive for each day. This becomes a historical record of what the scripts looked like in the past.
Traditional disk backups provide a similar capability to bring back old versions. Experience has however shown that quick and easy access to old script versions is very valuable, for example if apps have become corrupt or if there is a need to revert to an earlier app version.

Butler Spyglass addresses the scenarios above by extracting both data lineage information and full load scripts for all apps.

What's new

Each release on the releases page describes what is new, whether there are any breaking changes that require special attention, etc.

The change log also keeps a complete log of all details from all releases.

Extracted data

Extracted information for each app is

  1. Data lineage, i.e. what data sources are used by the app in question.
  2. Load scripts.

In addition to the above, complete definitions (except credentials, passwords etc) for all data connections in the Qlik Sense server are extracted and stored as CSV and JSON files.

Data lineage

Whether or not to extract data lineage info for apps is controlled by the configuration parameter ButlerSpyglass.lineageExtract.enable. Set to true/false as needed.

Data lineage information is stored in CSV and JSON files, one pair for each Sense app. Files are stored in the directory defined in ButlerSpyglass.lineageExtract.exportDir.

More info about discriminators and statements is found here.

Load scripts

Whether or not to extract app load scripts is controlled by the configuration parameter ButlerSpyglass.scriptExtract.enable. Set to true/false as needed.

Each app's load script is extracted and stored in a <appid>.qvs file in a folder as defined by the ButlerSpyglass.scriptExtract.exportDir configuration parameter.

Data connection definitions

Whether or not to extract data connection definitions is controlled by the config parameter ButlerSpyglass.dataConnectionExtract.enable. Set to true/false as needed.

Data connections are stored in dataconnections.json and dataconnections.csv in the directory specified by ButlerSpyglass.dataConnectionExtract.exportDir.

Config file

A template config file is available here.
How to name and where to store the config file is described here.

The parameters in the config file are described below. All parameters must be defined in the config file - run time errors will occur otherwise.

| Parameter | Description |
| --- | --- |
| logLevel | The level of detail in the logs. Possible values are silly, debug, verbose, info, warn, error (in order of decreasing level of detail). |
| fileLogging | true/false to enable/disable logging to disk file. |
| logDirectory | Subdirectory where log files are stored. |
| extract.frequency | Time between extraction runs, in milliseconds. 60000 means that the next extraction run will start 60 seconds after the previous one ends. |
| extract.itemInterval | Delay between sets of apps being extracted, in milliseconds. The number of apps in a set is defined by extract.concurrentTasks (below). For example, if set to 500 there will be a 0.5 second delay between sets of apps being sent to the Qlik Sense engine API. |
| extract.itemTimeout | Timeout for the call to the engine API, in milliseconds. For example, if set to 5000 and no response has been received from the engine API within 5 seconds, an error will be thrown. |
| extract.concurrentTasks | Number of apps that will be sent in parallel to the engine API. Use with caution! You can easily affect performance of a Sense environment by setting this parameter too high. Start with a low value, then increase it while monitoring the real-time performance (mainly CPU) of the target server, to ensure it is not too heavily loaded by the data extraction tasks. |
| extract.enableScheduledExecution | true=start an extraction run extract.frequency milliseconds after the previous one finished. false=only run once, then exit. |
| lineageExtract.enable | Controls whether to extract lineage info or not. true/false. |
| lineageExtract.exportDir | Directory where lineage files should be stored. |
| lineageExtract.maxLengthDiscriminator | Max characters of discriminator field (=source or destination of data) to store in per-app lineage disk file. |
| lineageExtract.maxLengthStatement | Max characters of statement field (e.g. SQL statement) to store in per-app lineage disk file. |
| scriptExtract.enable | Controls whether load scripts are extracted to text files or not. true/false. |
| scriptExtract.exportDir | Directory where script files will be stored. |
| dataConnectionExtract.enable | Controls whether data connections are extracted to JSON and CSV files or not. true/false. |
| dataConnectionExtract.exportDir | Directory where the data connection JSON and CSV files will be stored. |
| appFilter.appNameExact | List of apps for which lineage and/or load scripts should be extracted. An exact match on app name is done. |
| appFilter.appId | App ids for which lineage and/or load scripts should be extracted. |
| appFilter.appTag | Lineage and/or load scripts will be extracted for apps with these tags set. |
| configEngine.engineVersion | Version of the Qlik Sense engine running on the target server. Version 12.612.0 should work with any Qlik Sense server from February 2020 and later. |
| configEngine.host | Host name, fully qualified domain name (=FQDN) or IP address of Qlik Sense Enterprise server where Qlik Engine Service (QES) is running. |
| configEngine.port | Should be 4747, unless configured otherwise in the QMC. |
| configEngine.useSSL | Set to true if https is used to communicate with the engine API. |
| configEngine.headers.X-Qlik-User | Sense user directory and user to be used when connecting to the engine API. UserDirectory=Internal;UserId=sa_repository is a system account. |
| configEngine.rejectUnauthorized | If set to true, strict checking of SSL certificates etc. will be done when connecting to the engine API. |
| configQRS.authentication | Method to authenticate with Qlik Repository Service. Valid options are: certificates. |
| configQRS.host | Host name, fully qualified domain name (=FQDN) or IP address of Qlik Sense Enterprise server where Qlik Repository Service (QRS) is running. |
| configQRS.port | Should be 4242, unless configured otherwise in the QMC. |
| configQRS.useSSL | Set to true if https is used to communicate with the repository API. |
| configQRS.headers.X-Qlik-User | Sense user directory and user to be used when connecting to the repository API. UserDirectory=Internal;UserId=sa_repository is a system account. |
| cert.clientCertCA | Root certificate, as exported from the QMC. |
| cert.clientCert | Client certificate, as exported from the QMC. |
| cert.clientCertKey | Client certificate key, as exported from the QMC. |
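
Since missing parameters lead to run time errors, it can be handy to check a config file before starting the tool. The following is a minimal sketch in Python, not part of Butler Spyglass. It assumes the YAML file has already been parsed into a nested dict (for example with a YAML library of your choice), and the function name missing_parameters is hypothetical. The app filter parameters are left out of the list because filters may be omitted.

```python
# Parameter names taken from the table above (cert.clientCertCA spelled as in
# the sample config file). appFilter.* is excluded: filters are optional.
REQUIRED_PARAMETERS = """
logLevel fileLogging logDirectory
extract.frequency extract.itemInterval extract.itemTimeout
extract.concurrentTasks extract.enableScheduledExecution
lineageExtract.enable lineageExtract.exportDir
lineageExtract.maxLengthDiscriminator lineageExtract.maxLengthStatement
scriptExtract.enable scriptExtract.exportDir
dataConnectionExtract.enable dataConnectionExtract.exportDir
configEngine.engineVersion configEngine.host configEngine.port
configEngine.useSSL configEngine.headers.X-Qlik-User
configEngine.rejectUnauthorized
configQRS.authentication configQRS.host configQRS.port
configQRS.useSSL configQRS.headers.X-Qlik-User
cert.clientCertCA cert.clientCert cert.clientCertKey
""".split()


def missing_parameters(config, required=REQUIRED_PARAMETERS):
    """Return the dotted parameter names that are absent from the config dict.

    `config` is the parsed config file, with "ButlerSpyglass" as its top-level key.
    """
    root = config.get("ButlerSpyglass", {})
    missing = []
    for dotted in required:
        node = root
        for part in dotted.split("."):
            if not isinstance(node, dict) or part not in node:
                missing.append(dotted)
                break
            node = node[part]
    return missing
```

An empty list means every parameter in the table was found. Anything else is a list of names to add to the config file.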

App filters

All apps will be processed (i.e. lineage and/or load scripts extracted) if no app filters are set in the config file.
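
The sample config file describes the filters as additive. One way to read that, sketched below in Python, is that an app is selected if it matches any of the three filters, and that all apps are selected when no filter is set. This is an interpretation, not the tool's actual code. The function name select_apps and the app dict keys id, name and tags are hypothetical.

```python
def select_apps(apps, app_filter=None):
    """Return the apps matching any filter, or all apps if no filter is set.

    `apps` is a list of dicts with the (hypothetical) keys "id", "name", "tags".
    `app_filter` mirrors the appFilter section of the config file.
    """
    app_filter = app_filter or {}
    names = set(app_filter.get("appNameExact") or [])
    ids = set(app_filter.get("appId") or [])
    tags = set(app_filter.get("appTag") or [])

    # No filters at all: every app is processed.
    if not (names or ids or tags):
        return list(apps)

    # Additive filters, read as a union: a match on any one filter is enough.
    return [
        app for app in apps
        if app["name"] in names
        or app["id"] in ids
        or tags & set(app.get("tags", []))
    ]
```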

Logging

Console logs are always enabled, with configurable logging level (in the YAML config file).

Logging to disk files can be turned on/off via the YAML config file.

Log files on disk are rotated daily and kept for 30 days, after which they are deleted.

Parallel extraction of lineage data

Lineage data is stored within each Sense app.
Each app from which lineage should be extracted must therefore be accessed.

The obvious approach is to get lineage data from one app, then move on to the next app.
This can take a long time on servers with thousands of apps, so Butler Spyglass offers parallel extraction of lineage data.
Some settings in the config file offer fine-tuning of the extraction process:

  • ButlerSpyglass.extract.concurrentTasks controls how many apps will be processed in parallel.
  • ButlerSpyglass.extract.itemInterval controls how long the pause is before processing of the next set of apps starts.
  • ButlerSpyglass.extract.itemTimeout is the timeout after which Butler Spyglass will give up for a specific app.
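
To make the interplay of the three settings concrete, here is a small Python sketch of batched extraction with a pause between batches and a per-app timeout. It illustrates the general idea only and may differ from how Butler Spyglass schedules its calls internally. The function extract_all and its arguments are hypothetical, and extract_one stands in for whatever call fetches lineage and script for one app.

```python
import asyncio


async def extract_all(app_ids, extract_one, concurrent_tasks,
                      item_interval_ms, item_timeout_ms):
    """Process apps in sets of `concurrent_tasks`, pausing between sets.

    Returns a dict mapping app id to True (success) or False (error/timeout).
    """
    results = {}

    async def guarded(app_id):
        try:
            # Give up on this app if it takes longer than the timeout.
            await asyncio.wait_for(extract_one(app_id),
                                   timeout=item_timeout_ms / 1000)
            results[app_id] = True
        except Exception:  # extraction errors and timeouts
            results[app_id] = False

    for start in range(0, len(app_ids), concurrent_tasks):
        batch = app_ids[start:start + concurrent_tasks]
        # All apps in the set are sent in parallel.
        await asyncio.gather(*(guarded(app_id) for app_id in batch))
        # Pause before the next set, but not after the last one.
        if start + concurrent_tasks < len(app_ids):
            await asyncio.sleep(item_interval_ms / 1000)
    return results
```

A low concurrent_tasks value combined with a generous item_interval_ms keeps the load on the Sense server down, at the cost of a longer total run time.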

An error will occur if lineage or load script for some reason cannot be extracted for an app.

Butler Spyglass keeps track of the ratio of successful extracts.
For example, the following text in the log means that all (100%) extracts have so far been successful:

Extracting metadata (#5, overall success rate 100%): fc90c7f0-f498-4780-8864-2f78f449d9e9 <<>> ✅ Qlik help pages

If the number is below 100% it means that one or more lineage/load script extracts failed.
There should be some info in the logs about which apps were affected and maybe also clues to what happened.
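
If the logs are collected by a monitoring system, the progress lines can be parsed to track the success rate over time. The sketch below is based only on the log line format shown above. The function name parse_progress and the returned keys are hypothetical, and the pattern may need adjusting if the log format changes between versions.

```python
import re

# Matches lines such as:
# Extracting metadata (#5, overall success rate 100%): <app id> <<>> <app name>
PROGRESS_LINE = re.compile(
    r"Extracting metadata \(#(?P<seq>\d+), "
    r"overall success rate (?P<rate>\d+(?:\.\d+)?)%\): "
    r"(?P<app_id>[0-9a-f-]{36}) <<>> (?P<app_name>.*)$"
)


def parse_progress(line):
    """Return the fields of a progress log line, or None for other lines."""
    match = PROGRESS_LINE.search(line)
    if match is None:
        return None
    return {
        "seq": int(match["seq"]),
        "success_rate": float(match["rate"]),
        "app_id": match["app_id"],
        "app_name": match["app_name"],
    }
```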

Running Butler Spyglass

There is no installer. Just download the binary for your OS from the releases page.

Then edit the config file as needed (there is a template config file here).
Place the config file in the config subdirectory in the directory where Butler Spyglass was started.
For example, if butler-spyglass.exe is stored in d:\tools\butler-spyglass, the config file should be stored in d:\tools\butler-spyglass\config.

You must also set the NODE_ENV environment variable to the name of the config file.
For example, if your config file is my-config-file.yaml the NODE_ENV environment variable should be set to my-config-file.
Butler Spyglass uses that variable to determine where to look for the config file.
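
The naming rule above can be written down as a few lines of Python. This is a simplified reading of the rule for illustration. The actual lookup done by Butler Spyglass may consider additional files or extensions, and the function name config_file_path is hypothetical.

```python
import os
from pathlib import Path


def config_file_path(start_dir, env=None):
    """Return the expected config file path for the given start directory.

    Assumes the file lives in <start_dir>/config and is named <NODE_ENV>.yaml.
    """
    env = os.environ if env is None else env
    name = env.get("NODE_ENV")
    if not name:
        raise ValueError("NODE_ENV is not set")
    return Path(start_dir) / "config" / f"{name}.yaml"
```

With NODE_ENV set to my-config-file and the tool started from d:\tools\butler-spyglass, this gives config\my-config-file.yaml inside that directory.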

Run from command line

The tree structure looks like this:

tree /F
Folder PATH listing
Volume serial number is ....-....
C:.
│   butler-spyglass.exe
│
└───config
        production.yaml

The NODE_ENV environment variable is set to production and the config file is called production.yaml.
In this example the certificates are stored elsewhere (not in a subfolder of the current folder). That's fine as long as the paths are correct.

type .\config\production.yaml
---
ButlerSpyglass:
  # Logging configuration
  logLevel: info                    # Log level. Possible log levels are silly, debug, verbose, info, warn, error
  fileLogging: true                 # true/false to enable/disable logging to disk file
  logDirectory: ./log               # Subdirectory where log files are stored. Either absolute path or relative to where Butler Spyglass was started

  # Extract configuration
  extract:
    frequency: 60000000             # Time between extraction runs. Milliseconds
    itemInterval: 250               # Time between requests to the engine API. Milliseconds
    itemTimeout: 15000              # Timeout for calls to the engine API. Milliseconds
    concurrentTasks: 3              # Simultaneous calls to the engine API. Example: If set to 3, this means 3 calls will be done at the same time, every extract.itemInterval milliseconds.
    enableScheduledExecution: false # true=start an extraction run extract.frequency milliseconds after the previous one finished. false=only run once, then exit

  lineageExtract:
    enable: true                    # Should data lineage files be created?
    exportDir: ./out/lineage        # Directory where data lineage files will be stored.
    maxLengthDiscriminator: 1000    # Max characters of discriminator field (=source or destination of data) to store in per-app lineage disk file
    maxLengthStatement: 1000        # Max characters of statement field (e.g. SQL statement) to store in per-app lineage disk file

  scriptExtract:
    enable: true                    # Should app load scripts be saved to files?
    exportDir: ./out/script         # Directory where load script files will be stored.

  dataConnectionExtract:
    enable: true                    # Should data connections definitions be saved to files? One JSON file with all data connections will be created.
    exportDir: ./out/dataconnection # Directory where data connection JSON definitions file will be stored.

  # Filter out a selection of apps for which lineage and/or load scripts should be extracted.
  # Filters are additive.
  # If no filters are specified lineage/script will be extracted for all apps in the Sense server.
  appFilter:
    appNameExact:                   # Apps for which lineage/script should be extracted. Exact matches are done on app name.
      - User retention
      - Butler 8.4 demo app
    appId:                          # App IDs for which lineage/script should be extracted.
      - d1ace221-b80e-4754-98ea-3d0a9ebc9632
      - bf4cbb34-cd3c-4fc4-b69d-6fa61d5a270e
    appTag:                         # Lineage/script will be extracted for apps having these tags set.
      - Test data
      - apiCreated
      
  configEngine:
    engineVersion: 12.612.0         # Qlik Associative Engine version to use with Enigma.js. ver 12.612.0 works with Feb 2020 and later
    host: 192.168.100.109
    port: 4747
    useSSL: true
    headers:
      X-Qlik-User: UserDirectory=Internal;UserId=sa_repository
    rejectUnauthorized: false

  configQRS:
    authentication: certificates
    host: 192.168.100.109
    port: 4242
    useSSL: true
    headers:
      X-Qlik-User: UserDirectory=Internal;UserId=sa_repository

  # Certificates to use when connecting to Sense. Get these from the Certificate Export in QMC.
  cert:
    clientCert: C:\tools\ctrl-q\cert\client.pem
    clientCertKey: C:\tools\ctrl-q\cert\client_key.pem
    clientCertCA: C:\tools\ctrl-q\cert\root.pem

Now let's run Butler Spyglass itself:

.\butler-spyglass.exe
2023-03-10T17:16:57.144Z info: --------------------------------------
2023-03-10T17:16:57.144Z info: | butler-spyglass
2023-03-10T17:16:57.144Z info: |
2023-03-10T17:16:57.144Z info: | Version    : 2.0.1
2023-03-10T17:16:57.144Z info: | Log level  : info
2023-03-10T17:16:57.144Z info: |
2023-03-10T17:16:57.144Z info: --------------------------------------
2023-03-10T17:16:57.144Z info:
2023-03-10T17:16:57.144Z info: Extracting metadata from server: 192.168.100.109
2023-03-10T17:16:57.144Z info: Data linage files will be stored in                : ./out/lineage
2023-03-10T17:16:57.144Z info: Load script files will be stored in                : ./out/script
2023-03-10T17:16:57.144Z info: Data connection definitions files will be stored in: ./out/dataconnection
2023-03-10T17:16:57.160Z info: --------------------------------------
2023-03-10T17:16:57.160Z info: Extraction run started
2023-03-10T17:16:57.284Z info: Done writing data connection metadata to disk
2023-03-10T17:16:57.660Z info: Number of apps on server: 337
2023-03-10T17:16:57.675Z info: Extracting metadata (#1, overall success rate 0%): 9e15c449-6269-4a0b-a51a-afbda794bce2 <<>> 🔑 Butler Auth
2023-03-10T17:16:57.675Z info: Extracting metadata (#2, overall success rate 0%): b34a8081-ca65-4005-8a93-5daf2d6b7364 <<>> 📨 Butler
2023-03-10T17:16:57.675Z info: Extracting metadata (#3, overall success rate 0%): 8873183c-a45d-412e-b718-d3365af58706 <<>> 🏆 Butler Control-Q
2023-03-10T17:16:58.364Z info: Extracting metadata (#4, overall success rate 100%): 7b797bd9-8354-4d00-a4d1-2d50c74c92b3 <<>> 🏆 Butler Control-Q
2023-03-10T17:16:58.364Z info: Extracting metadata (#5, overall success rate 100%): fc90c7f0-f498-4780-8864-2f78f449d9e9 <<>> ✅ Qlik help pages
2023-03-10T17:16:58.378Z info: Extracting metadata (#6, overall success rate 100%): 874369dd-cee1-431b-b9fd-22087382c3c9 <<>> ⚠️Butler SOS
2023-03-10T17:16:58.988Z info: Extracting metadata (#7, overall success rate 100%): a5f868ca-60ff-4df6-93e9-2c45577fe703 <<>> Web site analytics(1)
...
...

If ButlerSpyglass.extract.enableScheduledExecution in the config file is set to true, Butler Spyglass will keep running and start a new extraction run ButlerSpyglass.extract.frequency milliseconds after the previous run finished.
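
The scheduling behaviour can be pictured as a simple loop in which the wait starts only after a run has finished, so runs never overlap. The Python sketch below illustrates that idea under those assumptions and is not the tool's implementation. The names run_scheduled, run_once and max_runs are hypothetical, and max_runs exists only so the loop can be stopped in a test.

```python
import time


def run_scheduled(run_once, frequency_ms, enable_scheduled_execution,
                  max_runs=None, sleep=time.sleep):
    """Run extraction once, or repeatedly with a fixed pause between runs.

    Returns the number of runs performed.
    """
    runs = 0
    while True:
        run_once()
        runs += 1
        # Scheduling disabled: run once, then exit.
        if not enable_scheduled_execution:
            return runs
        if max_runs is not None and runs >= max_runs:
            return runs
        # The pause is measured from the end of the previous run.
        sleep(frequency_ms / 1000)
```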

Run using Docker

Using Docker is convenient and easy if you have an existing Docker or Kubernetes environment and know how to use those tools.
A few things to keep in mind though:

  • The NODE_ENV variable in the docker-compose.yml file controls what config file will be used. If NODE_ENV is set to production, the file ./config/production.yaml will be used.

  • The output directories defined in the ./config/production.yaml file must match the volume mapping in the docker-compose.yml file.
    I.e. if the config file defines the output directories as ./out/lineage and ./out/script, the docker-compose file must map the container's /nodeapp/out to an existing directory on the Docker host, for example

    ./out:/nodeapp/out.

The directory structure and the config files could look as follows:

Directory structure:

.
├── config
│   ├── certificate
│   │   ├── client.pem
│   │   ├── client_key.pem
│   │   └── root.pem
│   └── production.yaml
├── docker-compose.yml
└── out
   ├── dataconnection
   ├── lineage
   └── script

config/production.yaml:

---
ButlerSpyglass:
  # Logging configuration
  logLevel: info              # Log level. Possible log levels are silly, debug, verbose, info, warn, error
  fileLogging: true           # true/false to enable/disable logging to disk file
  logDirectory: logs          # Subdirectory where log files are stored

  # Extract configuration
  extract:
    frequency: 60000                # Time between extraction runs. Milliseconds
    itemInterval: 500               # Time between requests to the engine API. Milliseconds
    itemTimeout: 5000               # Timeout for calls to the engine API. Milliseconds
    concurrentTasks: 1              # Simultaneous calls to the engine API. Example: If set to 3, this means 3 calls will be done at the same time, every extract.itemInterval milliseconds.
    enableScheduledExecution: true  # true=start an extraction run extract.frequency milliseconds after the previous one finished. false=only run once, then exit

  lineageExtract:
    enable: true                  # Should data lineage files be created?
    exportDir: ./out/lineage      # Directory where data lineage files will be stored.
    maxLengthDiscriminator: 1000  # Max characters of discriminator field (=source or destination of data) to store in per-app lineage disk file
    maxLengthStatement: 1000      # Max characters of statement field (e.g. SQL statement) to store in per-app lineage disk file

  scriptExtract:
    enable: true                  # Should app load scripts be saved to files?
    exportDir: ./out/script       # Directory where load script files will be stored.

  dataConnectionExtract:
    enable: true                      # Should data connections definitions be saved to files? One JSON file with all data connections will be created.
    exportDir: ./out/dataconnection   # Directory where data connection JSON definitions file will be stored.

  # Filter out a selection of apps for which lineage and/or load scripts should be extracted.
  # Filters are additive.
  # If no filters are specified lineage/script will be extracted for all apps in the Sense server.
  appFilter:
    appNameExact:                   # Apps for which lineage/script should be extracted. Exact matches are done on app name.
      - User retention
      - Butler 8.4 demo app
    appId:                          # App IDs for which lineage/script should be extracted.
      - d1ace221-b80e-4754-98ea-3d0a9ebc9632
      - bf4cbb34-cd3c-4fc4-b69d-6fa61d5a270e
    appTag:                         # Lineage/script will be extracted for apps having these tags set.
      - Test data
      - apiCreated      

  configEngine:
    engineVersion: 12.612.0         # Qlik Associative Engine version to use with Enigma.js. ver 12.612.0 works with Feb 2020 and later 
    host: sense.ptarmiganlabs.com
    port: 4747
    useSSL: true
    headers:
      X-Qlik-User: UserDirectory=Internal;UserId=sa_repository
    rejectUnauthorized: false

  configQRS: 
    authentication: certificates
    host: sense.ptarmiganlabs.com
    port: 4242
    useSSL: true
    headers:
      X-Qlik-User: UserDirectory=Internal;UserId=sa_repository

  # Certificates to use when connecting to Sense. Get these from the Certificate Export in QMC.
  cert:
    clientCertCA: /nodeapp/config/certificate/root.pem
    clientCert: /nodeapp/config/certificate/client.pem
    clientCertKey: /nodeapp/config/certificate/client_key.pem
    rejectUnauthorized: false

docker-compose.yml:

version: '3.3'
services:
  butler-spyglass:
    image: ptarmiganlabs/butler-spyglass:latest
    container_name: butler-spyglass
    restart: always
    volumes:
      # Make config file and output directories accessible outside of the container
      - "./config:/nodeapp/config"
      - "./out:/nodeapp/out"
    environment:
      - "NODE_ENV=production"
    logging:
      driver: json-file

Output files

The output directories are emptied every time Butler Spyglass is started, so there is no need to clear them manually.

Data lineage output files

The data lineage information is saved as one JSON file and one CSV file for each app.
The file names are <app id>.csv and <app id>.json:

PS C:\tools\butler-spyglass> dir .\out\lineage\


    Directory: C:\tools\butler-spyglass\out\lineage


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        10/03/2023     18:21            101 1254a8c6-f804-4d29-b386-7aab07f512f9.csv
-a----        10/03/2023     18:21            151 1254a8c6-f804-4d29-b386-7aab07f512f9.json
-a----        10/03/2023     18:21            105 1a98ef8e-dd98-442b-8b76-58de43b5cbfa.csv
-a----        10/03/2023     18:21            155 1a98ef8e-dd98-442b-8b76-58de43b5cbfa.json
-a----        10/03/2023     18:21             92 2fa09446-86fa-495c-b803-025c4c8ebc23.csv
-a----        10/03/2023     18:21             96 2fa09446-86fa-495c-b803-025c4c8ebc23.json
-a----        10/03/2023     18:21            112 aa5fde91-f8c0-4127-9eb1-452b31355e8e.csv
-a----        10/03/2023     18:21            162 aa5fde91-f8c0-4127-9eb1-452b31355e8e.json
...

Each lineage file may contain zero or more rows, each representing a data source or destination that the app uses. Everything is included - even inline tables, resident loads, writing to QVDs etc.

This richness can be a problem though. If an inline table contains a thousand rows, all those rows will be returned as part of the lineage data. That's where the maxLengthDiscriminator config option (in the config YAML file) comes in handy. It makes it possible to set a limit to how many characters should be included for each row of lineage data. The setting is global for all apps, and applies to all rows of lineage data extracted from Sense.
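
The effect of the length limits can be pictured as a plain truncation of the two text fields. The Python sketch below assumes a simple cut at the configured number of characters. Whether the tool marks truncated values in some way is not stated here, so treat this as an illustration. The function name truncate_lineage_row is hypothetical.

```python
def truncate_lineage_row(row, max_length_discriminator, max_length_statement):
    """Return a copy of a lineage row with both text fields cut to size.

    `row` is a dict with the keys used in the lineage CSV files.
    """
    return {
        **row,
        "Discriminator": (row.get("Discriminator") or "")[:max_length_discriminator],
        "Statement": (row.get("Statement") or "")[:max_length_statement],
    }
```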

Here is an example lineage file. Note that QVDs, SQL statements and inline tables are all included in the lineage data.

AppId,AppName,Discriminator,Statement
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,Healthcheck;,"RestConnectorMasterTable:
    SQL SELECT 
        ""col_1""
    FROM CSV (header off, delimiter "","", quote """""""") ""CSV_source""
    WITH CONNECTION(Url ""http://healthcheck.ptarmiganlabs.net:8000/ping/10a887bf-4580-4891-9c6f-2affbd380f16"")"
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,INLINE;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,RESIDENT __cityAliasesBase;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,RESIDENT __cityGeoBase;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,RESIDENT __countryAliasesBase;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,RESIDENT __countryGeoBase;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,\\fileshare1\testdata\meetupcom\categories.csv;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,\\pro\sensedata\staticcontent\appcontent\c840670c-7178-4a5e-8409-ba2da69127e2\cityaliases.qvd;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,\\pro\sensedata\staticcontent\appcontent\c840670c-7178-4a5e-8409-ba2da69127e2\citygeo.qvd;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,\\pro\sensedata\staticcontent\appcontent\c840670c-7178-4a5e-8409-ba2da69127e2\countryaliases.qvd;,
c840670c-7178-4a5e-8409-ba2da69127e2,Meetup.com,\\pro\sensedata\staticcontent\appcontent\c840670c-7178-4a5e-8409-ba2da69127e2\countrygeo.qvd;,
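
The per-app CSV files can be combined to answer the question "which apps use a certain data source?". The Python sketch below builds such a reverse index from the column names shown in the example above. It assumes the files are UTF-8 encoded, and the function name apps_by_source is hypothetical. The standard csv module takes care of statements that span several lines inside quotes.

```python
import csv
from collections import defaultdict
from pathlib import Path


def apps_by_source(lineage_dir):
    """Map each discriminator (lower-cased, without the trailing ';')
    to the set of app names that use it."""
    index = defaultdict(set)
    for csv_file in sorted(Path(lineage_dir).glob("*.csv")):
        # utf-8-sig also accepts files that start with a byte order mark.
        with open(csv_file, newline="", encoding="utf-8-sig") as handle:
            for row in csv.DictReader(handle):
                source = (row.get("Discriminator") or "").rstrip(";").strip()
                if source:  # skip rows without a discriminator
                    index[source.lower()].add(row["AppName"])
    return index
```

Looking up a file share path or connection name in the returned index then gives the names of all apps that read from or write to it.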

Load script output files

Each app's load script is also stored as its own file, with the app ID as the file name:

PS C:\tools\butler-spyglass> dir .\out\script\


    Directory: C:\tools\butler-spyglass\out\script


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        10/03/2023     18:21           6470 1254a8c6-f804-4d29-b386-7aab07f512f9.qvs
-a----        10/03/2023     18:21           3170 1a98ef8e-dd98-442b-8b76-58de43b5cbfa.qvs
-a----        10/03/2023     18:21           3190 2fa09446-86fa-495c-b803-025c4c8ebc23.qvs
-a----        10/03/2023     18:21           1677 aa5fde91-f8c0-4127-9eb1-452b31355e8e.qvs
...
PS C:\tools\butler-spyglass> type .\out\script\aa5fde91-f8c0-4127-9eb1-452b31355e8e.qvs
///$tab Main
SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='$#,##0.00;-$#,##0.00';
SET TimeFormat='h:mm:ss TT';
SET DateFormat='M/D/YYYY';
SET TimestampFormat='M/D/YYYY h:mm:ss[.fff] TT';
SET FirstWeekDay=6;
SET BrokenWeeks=1;
SET ReferenceDay=0;
SET FirstMonthOfYear=1;
SET CollationLocale='en-US';
SET CreateSearchIndexOnReload=1;
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET LongMonthNames='January;February;March;April;May;June;July;August;September;October;November;December';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';
SET LongDayNames='Monday;Tuesday;Wednesday;Thursday;Friday;Saturday;Sunday';
SET NumericalAbbreviation='3:k;6:M;9:G;12:T;15:P;18:E;21:Z;24:Y;-3:m;-6:μ;-9:n;-12:p;-15:f;-18:a;-21:z;-24:y';

///$tab Test data 1
Characters:
Load Chr(RecNo()+Ord('A')-1) as Alpha, RecNo() as Num autogenerate 26;

ASCII:
Load
 if(RecNo()>=65 and RecNo()<=90,RecNo()-64) as Num,
 Chr(RecNo()) as AsciiAlpha,
 RecNo() as AsciiNum
autogenerate 255
 Where (RecNo()>=32 and RecNo()<=126) or RecNo()>=160 ;

Transactions:
Load
 TransLineID,
 TransID,
 mod(TransID,26)+1 as Num,
 Pick(Ceil(3*Rand1),'A','B','C') as Dim1,
 Pick(Ceil(6*Rand1),'a','b','c','d','e','f') as Dim2,
 Pick(Ceil(3*Rand()),'X','Y','Z') as Dim3,
 Round(1000*Rand()*Rand()*Rand1) as Expression1,
 Round(  10*Rand()*Rand()*Rand1) as Expression2,
 Round(Rand()*Rand1,0.00001) as Expression3;
Load
 Rand() as Rand1,
 IterNo() as TransLineID,
 RecNo() as TransID
Autogenerate 1000
 While Rand()<=0.5 or IterNo()=1;

 Comment Field Dim1 With "This is a field comment";

Data connections output files

All data connections for the entire Qlik Sense server are exported to JSON and CSV files:

PS C:\tools\butler-spyglass> dir .\out\dataconnection\


    Directory: C:\tools\butler-spyglass\out\dataconnection


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        10/03/2023     18:21          32953 dataconnections.csv
-a----        10/03/2023     18:21          52468 dataconnections.json

Analysing the generated files

There are currently no analysis apps included in this project. These should be fairly easy to create though. The data lineage CSV files can be loaded into a Sense app and from there be made available for analysis.

The load script .qvs files could be zipped into a daily archive by means of a scheduled task, using the standard OS scheduler.
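
A daily archive job could look something like the Python sketch below, which a scheduled task could run once per day after an extraction run. It is an example and not part of Butler Spyglass. The function name archive_scripts and the archive file naming are hypothetical.

```python
import datetime
import zipfile
from pathlib import Path


def archive_scripts(script_dir, archive_dir, day=None):
    """Zip all .qvs files in script_dir into one archive for the given day.

    Returns the path of the archive, named scripts-YYYY-MM-DD.zip.
    """
    day = day or datetime.date.today()
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    archive_path = archive_dir / f"scripts-{day.isoformat()}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for script in sorted(Path(script_dir).glob("*.qvs")):
            # Store only the file name, not the full directory path.
            archive.write(script, arcname=script.name)
    return archive_path
```

Keep the archive directory outside the Butler Spyglass output directories, since those are emptied every time the tool starts.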

Feel free to contribute with good analysis apps - pull requests are always welcome!

Security / Disclosure

If you discover any important bug with Butler Spyglass that may pose a security problem, please disclose it confidentially to [email protected] first, so that it can be assessed and hopefully fixed prior to being exploited. Please do not raise GitHub issues for security-related doubts or problems.

butler-spyglass's Issues

Script keeps looping

After I run the script (with Node.js), it keeps repeating itself. It looks like it is in a never-ending loop.

Also, the files are stored as separate CSV files instead of one single lineage.csv file.

Make scheduling of extract runs optional

Add a config option to enable/disable the automatic scheduling of extract runs.
Would be useful in cases where the host OS scheduler would be used to start Butler Spyglass only during weekends (e.g. to minimise impact on daily operations).

Make sure whatever solution that is selected also works when running the tool in a Docker container (i.e. not only when run using "node index.js" from command line).

Send progress and error info to MQTT

By sending extract progress to MQTT, it can be picked up and used by other systems.
It would for example be possible to forward the progress data to Influxdb, from where it could be shown in a nice way in a Grafana dashboard. This would be a useful feature for large QS environments, where a full lineage/script extract might take several hours to complete.

Refactor code to make use of modules

The code base is getting to a point where it is hard to keep in a single js file.
Break it up into modules: one for app extract, one for retries (when extracts have failed), one for logging etc

  • crazy-max/ghaction-virustotal v3
npm
package.json
  • axios 0.27.2
  • better-queue ^3.8.12
  • config ^3.3.9
  • csv-stringify ^6.3.0
  • csv-writer ^1.6.0
  • enigma.js ^2.10.0
  • eslint ^8.35.0
  • eslint-config-airbnb-base ^15.0.0
  • eslint-config-prettier ^8.7.0
  • eslint-plugin-import ^2.27.5
  • eslint-plugin-prettier ^4.2.1
  • fs-extra ^11.1.0
  • js-yaml ^4.1.0
  • jshint ^2.13.6
  • upath ^2.0.1
  • winston ^3.8.2
  • winston-daily-rotate-file ^4.7.1
  • ws ^8.12.1
  • prettier ^2.8.4
  • snyk ^1.1114.0

Store extracted info to disk as it is retrieved from Sense engine

Consider writing the retrieved info to disk as it becomes available. This would reduce the impact of Butler Spyglass crashing late during an extraction run. It would also allow the generated files to be immediately inspected.

Could be a good idea to create/delete files in the out-folder to indicate whether an extraction run is ongoing or done.
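
The idea above can be sketched with a marker file that exists only while a run is ongoing, plus one write per app as soon as its data arrives. The file names used here (`extract-in-progress`, `<appId>.qvs`) are assumptions for illustration, not the tool's actual naming.

```javascript
const fs = require('fs');
const path = require('path');

// Sketch only: file names and the marker-file convention are assumptions.
function beginRun(outDir) {
    fs.mkdirSync(outDir, { recursive: true });
    // The marker file exists only while an extraction run is ongoing.
    fs.writeFileSync(path.join(outDir, 'extract-in-progress'), new Date().toISOString());
}

function storeAppScript(outDir, appId, script) {
    // Write each app's script as soon as it has been retrieved.
    fs.writeFileSync(path.join(outDir, `${appId}.qvs`), script);
}

function endRun(outDir) {
    // Removing the marker signals that the run is done.
    fs.rmSync(path.join(outDir, 'extract-in-progress'), { force: true });
}
```

With this layout, a crash late in a run leaves the already written files in place, and the leftover marker file shows that the run did not finish.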

No lineage if drop table statements

Describe the bug
No lineage info is available if tables have been dropped in the load script.

To Reproduce
Run Butler Spyglass with apps whose load scripts contain Drop table statements.

Expected behavior
Get lineage for tables that have been dropped in the script.

Describe environment:

  • OS: Windows Server 2016
  • Containerisation: no
  • Version of Butler Spyglass used: master
  • Command used to start Butler Spyglass: node index.js

Config file(s)
Default config files.

Additional context
Tested with QS Nov 2019

Docker image build is failing in 2.0.0

What version of Butler Spyglass are you using?

2.0.0

What version of Node.js are you using? Not applicable if you use the standalone version of Butler Spyglass.

No response

What command did you use to start Butler Spyglass?

What operating system are you using?

What CPU architecture are you using?

What Qlik Sense versions are you using?

Describe the Bug

Docker image builds fail after the switch from drone.io to GitHub Actions.
The same build concept is used in Butler, Butler SOS and other sibling tools, so the cause is probably something trivial.

Expected Behavior

No response

To Reproduce

No response

"ButlerSpyglass.logLevel" is not defined

Describe the bug
D:\QS-Buttler-Spy-Glass\node_modules\config\lib\config.js:203
throw new Error('Configuration property "' + property + '" is not defined');
^

Error: Configuration property "ButlerSpyglass.logLevel" is not defined
at Config.get (D:\QS-Buttler-Spy-Glass\node_modules\config\lib\config.js:203:11)
at Object.<anonymous> (D:\QS-Buttler-Spy-Glass\src\logger.js:25:23)
at Module._compile (node:internal/modules/cjs/loader:1275:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1329:10)
at Module.load (node:internal/modules/cjs/loader:1133:32)
at Module._load (node:internal/modules/cjs/loader:972:12)
at Module.require (node:internal/modules/cjs/loader:1157:19)
at require (node:internal/modules/helpers:119:18)
at Object.<anonymous> (D:\QS-Buttler-Spy-Glass\src\extract_app.js:8:16)
at Module._compile (node:internal/modules/cjs/loader:1275:14)

To Reproduce
node index.js

Describe environment:

  • OS: [e.g. Windows Server 2016,
  • Containerisation: no
  • Version of Butler Spyglass used: 1.2.1
  • Command used to start Butler Spyglass: node index.js

Configuration property "ButlerSpyglass.logLevel" is not defined

Describe the bug
Hello, I followed the instructions for using Node.js, but when I run "node.js" I get an error. I renamed the config file to "Production.yaml", but I cannot make it work.

To Reproduce
I just installed using npm and then ran node index.js.

Screenshots
Error

Describe environment:

  • OS: Windows Server 2012
  • Containerisation: No
  • Version of Butler Spyglass used: Latest
  • Command used to start Butler Spyglass: node index.js

Track and store failed app metadata extracts

In the best of worlds, metadata for all apps will be extracted by Butler Spyglass.
But if some apps fail to extract for some reason, this should be recorded to disk so it is clear what went well and what failed.
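
A simple way to record failures is to append one line per failed app to a file in the output folder. This is a minimal sketch under assumptions: the record fields (`appId`, `message`, `failedAt`) and the one-JSON-object-per-line layout are choices made for this example, not something the tool is known to do.

```javascript
const fs = require('fs');

// Sketch only: the file layout and record fields are assumptions.
function recordFailure(failedFile, appId, error) {
    const record = {
        appId,
        message: error.message,
        failedAt: new Date().toISOString(),
    };
    // One JSON object per line, appended, so earlier entries survive a crash.
    fs.appendFileSync(failedFile, `${JSON.stringify(record)}\n`);
    return record;
}
```

Appending rather than rewriting the file keeps the list usable even if the run is interrupted, and the resulting file can later drive a retry of only the failed apps.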

Add app ID to error messages

Error messages are good, and they are even better if they tell you which app caused the error... Add app IDs to error messages where possible.
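
As a rough illustration, a small helper could prefix any error with the ID of the app being processed. The helper name `errorWithAppId` and the message format are hypothetical.

```javascript
// Sketch only: the helper name and message format are assumptions.
function errorWithAppId(appId, err) {
    // Prefix the message with the app ID and keep the original error
    // attached so the full stack trace is still available.
    const wrapped = new Error(`[app ${appId}] ${err.message}`);
    wrapped.cause = err;
    return wrapped;
}
```

In a catch block around the per-app extract, the wrapped error's message could then be passed to the logger instead of the original one.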
