turbot / steampipe-plugin-csv

Use SQL to instantly query data from CSV files. Open source CLI. No DB required.

Home Page: https://hub.steampipe.io/plugins/turbot/csv

License: Apache License 2.0

Makefile 1.00% PLSQL 16.59% Go 82.40%

steampipe steampipe-plugin csv sql postgresql postgresql-fdw hacktoberfest backup etl sqlite

steampipe-plugin-csv's Introduction

CSV Plugin for Steampipe

Use SQL to query data from CSV files.

Get started →
Documentation: Table definitions & examples
Community: Join #steampipe on Slack →
Get involved: Issues

Quick start

Install the plugin with Steampipe:

steampipe plugin install csv

Configure your config file to include directories with CSV files. If no directory is specified, the current working directory will be used.

Run steampipe:

steampipe query

Run a query for the my_users.csv file:

select
  first_name,
  last_name
from
  my_users;

Engines

This plugin is available for the following engines:

Engine	Description
Steampipe	The Steampipe CLI exposes APIs and services as a high-performance relational database, giving you the ability to write SQL-based queries to explore dynamic data. Mods extend Steampipe's capabilities with dashboards, reports, and controls built with simple HCL. The Steampipe CLI is a turnkey solution that includes its own Postgres database, plugin management, and mod support.
Postgres FDW	Steampipe Postgres FDWs are native Postgres Foreign Data Wrappers that translate APIs to foreign tables. Unlike Steampipe CLI, which ships with its own Postgres server instance, the Steampipe Postgres FDWs can be installed in any supported Postgres database version.
SQLite Extension	Steampipe SQLite Extensions provide SQLite virtual tables that translate your queries into API calls, transparently fetching information from your API or service as you request it.
Export	Steampipe Plugin Exporters provide a flexible mechanism for exporting information from cloud services and APIs. Each exporter is a stand-alone binary that allows you to extract data using Steampipe plugins without a database.
Turbot Pipes	Turbot Pipes is the only intelligence, automation & security platform built specifically for DevOps. Pipes provide hosted Steampipe database instances, shared dashboards, snapshots, and more.

Developing

Prerequisites:

Steampipe
Golang

Clone:

git clone https://github.com/turbot/steampipe-plugin-csv.git
cd steampipe-plugin-csv

Build, which automatically installs the new version to your ~/.steampipe/plugins directory:

make

Configure the plugin:

cp config/* ~/.steampipe/config
vi ~/.steampipe/config/csv.spc

Try it!

steampipe query
> .inspect csv

Open Source & Contributing

This repository is published under the Apache 2.0 (source code) and CC BY-NC-ND (docs) licenses. Please see our code of conduct. We look forward to collaborating with you!

Steampipe is a product produced from this open source software, exclusively by Turbot HQ, Inc. It is distributed under our commercial terms. Others are allowed to make their own distribution of the software, but cannot use any of the Turbot trademarks, cloud services, etc. You can learn more in our Open Source FAQ.

Get Involved

Join #steampipe on Slack →

Want to help but don't know where to start? Pick up one of the help wanted issues:

steampipe-plugin-csv's Issues

Empty CSV file seems to be problematic when loading

Create an empty CSV file:

touch empty.csv

With v0.19.5 I saw some problems loading it:

/tmp/crap $ steampipe query
Welcome to Steampipe v0.19.5
For more information, type .help
Warning: failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': failed to parse file header /tmp/crap/empty.csv: EOF
> .inspect csv
Error: could not find connection or table called 'csv'. Is the plugin installed? Is the connection configured?
>

I also saw different issues using it with v0.20.2. For v0.20.2 I recommend checking the hash status in the connection state table as part of testing. I saw that behavior be a little unpredictable too (which is more of a Steampipe CLI issue, but worth noting as it can be confusing when trying to pin down issues).

/tmp/crap $ steampipe query
Welcome to Steampipe v0.20.2
For more information, type .help
> select * from steampipe_connection_state where name = 'csv'
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| name | state | type | import_schema | error  | plugin                                     | schema_mode | schema_hash                      | comments_set | connection_mod_time       | plugin_mod_time           |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
| csv  | ready |      | enabled       | <null> | hub.steampipe.io/plugins/turbot/csv@latest | dynamic     | 6c368b2c8e93381d7c35de58620b72ea | true         | 2023-05-26T15:09:01-04:00 | 2023-05-15T09:04:32-04:00 |
+------+-------+------+---------------+--------+--------------------------------------------+-------------+----------------------------------+--------------+---------------------------+---------------------------+
> .inspect csv
+----------------+------------------------------------------+
| table          | description                              |
+----------------+------------------------------------------+
| country        | CSV file at /tmp/crap/country.csv        |
| json-not-csv-2 | CSV file at /tmp/crap/json-not-csv-2.csv |
| state          | CSV file at /tmp/crap/state.csv          |
+----------------+------------------------------------------+
>

paths must be configured

I built the plugin, "installed" by copying csv.spc to ~/.steampipe/config, and boiled csv.spc down to this:

connection "csv" {
  plugin = "csv"

  paths = [ "~/csv/*" ]
}

When I run .inspect in the CLI, it reports that no paths are configured, even though there are CSV files in ~/csv.

What am I missing?

Partial Case-Sensitivity makes CAMELCase.csv fail

Describe the bug
I have a CAMELCase.csv file on a case-sensitive file system.

Steampipe version (steampipe -v)
Example: v0.14.3

Plugin version (steampipe plugin list)
csv@latest | 0.3.0 | csv

To reproduce

Create a simple .csv file with a mixed-case name such as UPPERlower.csv.

.inspect works, but select always fails with "relation does not exist":

> .inspect csv.FIxme
+--------+-------+-------------------------------------------------------+
| column | type  | description                                           |
+--------+-------+-------------------------------------------------------+
| _ctx   | jsonb | Steampipe context in JSON form, e.g. connection_name. |
| one    | text  | Field 0.                                              |
| two    | text  | Field 1.                                              |
+--------+-------+-------------------------------------------------------+
> select * from FIxme
Error: relation "fixme" does not exist (SQLSTATE 42P01)

Expected behavior
the query should work

Additional context

renaming the file to alllowercase.csv works

Poor error message when the folder contains bad CSVs

Describe the bug
When the paths in the connection config point to a folder that contains one/more invalid/bad CSV files, the plugin returns the following error:

failed to plugin initialise plugin 'steampipe-plugin-csv': TableMapFunc 'PluginTables' had unhandled error: parse error on line 1, column 25: bare " in non-quoted-field

It is hard to understand what's actually wrong from this message IMO.

Steampipe version (steampipe -v)
Example: v0.9.0-rc.0

Plugin version (steampipe plugin list)
Example: v0.3.0

To reproduce
Add a bad CSV file (for example, one containing a bare " somewhere) to a folder, and add the path to the folder in the connection config.
Query a good CSV.

Empty CSV files causes plugin initialization failure

Describe the bug
If the plugin's paths match one or more .csv files that contain no data, the plugin fails to initialize.

Steampipe version (steampipe -v)
v0.17.4

Plugin version (steampipe plugin list)
v0.5.0

To reproduce

Create empty.csv, containing no data, in one of the directories listed in the paths config argument
Run steampipe query
View Steampipe logs in ~/.steampipe/logs/plugin-<date>.log

Expected behavior
Should the plugin skip over empty or malformed CSV files? This was previously discussed in #40 and #31, as these issues were more common previously when the header was expected to be valid all of the time.

Skip malformed csv files and run correctly

I have many CSV files in a folder and want to query them with Steampipe.
When I run steampipe query, I get an error that the plugin failed to start because one of the files has an error.
I remove that file and get another error from a different file.

Again and again...

I understand that some files lack headers or contain local-language characters, but I think Steampipe should skip those malformed files and still provide tables for the valid ones.
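
Until the plugin skips such files itself, one user-side workaround is to scan a directory before pointing paths at it. The sketch below is a hypothetical helper, not part of Steampipe or the plugin. It flags only the two failure modes quoted in these issues (a file with no header row and a header with a blank field) and assumes the files are UTF-8.

```python
import csv
from pathlib import Path


def find_problem_csvs(directory):
    """Return {file name: reason} for CSV files likely to break table discovery.

    Hypothetical helper: it mirrors only the two errors quoted in these
    issues, not the plugin's real validation logic.
    """
    problems = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # utf-8-sig tolerates an optional byte order mark at the start.
        with path.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), None)
        if not header:
            problems[path.name] = "no header row (empty file)"
        elif any(not field.strip() for field in header):
            problems[path.name] = "header row has an empty value"
    return problems
```

Files reported here could be moved out of the directory, or excluded by a narrower glob in paths, before starting steampipe query.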

Outputting results from query as CSV into current working directory results in error

Describe the bug
Running a query with the csv plugin and redirecting the result, as CSV, into a file in the current working directory results in an error.

Steampipe version (steampipe -v)
v0.16.0

Plugin version (steampipe plugin list)
csv: v0.3.0

To reproduce

Welcome to Steampipe v0.16.0
For more information, type .help
> .inspect test
+---------+-------+-------------------------------------------------------+
| column  | type  | description                                           |
+---------+-------+-------------------------------------------------------+
| _ctx    | jsonb | Steampipe context in JSON form, e.g. connection_name. |
| column1 | text  | Field 0.                                              |
| column2 | text  | Field 1.                                              |
+---------+-------+-------------------------------------------------------+
> .exit
➜  mod_list steampipe query "select * from test;" --output csv >> results.csv
Warning: executeQueries: query 1 of 1 failed: ERROR: failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': runtime error: invalid memory address or nil pointer dereference (SQLSTATE HV000)
➜  mod_list steampipe query "select * from test;" --output csv >> results.test
Warning: executeQueries: query 1 of 1 failed: ERROR: failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': runtime error: invalid memory address or nil pointer dereference (SQLSTATE HV000)
➜  mod_list rm results.csv
➜  mod_list steampipe query "select * from test;" --output csv >> results.test
➜  mod_list cat results.test
column1,column2,_ctx
value1,value2,"{""connection_name"":""csv""}"
value3,value4,"{""connection_name"":""csv""}"
➜  mod_list

Expected behavior
No error, or warning not to do it.

Additional context
error persists until the csv is deleted from the directory.

Default configuration breaks steampipe

Describe the bug
The default configuration file installed by the plugin is incomplete, and breaks steampipe.

The error failed to start plugin 'csv': paths must be configured occurs even if the CSV plugin isn't called.

Steampipe version (steampipe -v)
0.11.0

Plugin version (steampipe plugin list)
0.1.0

To reproduce

$ steampipe --version
steampipe version 0.11.0

$ steampipe plugin list
+------+---------+-------------+
| Name | Version | Connections |
+------+---------+-------------+
+------+---------+-------------+

$ steampipe plugin install csv

Installed plugin: csv v0.1.0
Documentation:    https://hub.steampipe.io/plugins/turbot/csv

$ steampipe plugin list
Error: Plugin Listing failed - failed to start plugin 'csv': paths must be configured

Expected behavior
The plugin should not break steampipe after installation. For example:

the plugin should only fail if it's queried
or the configuration should contain a placeholder path so that it parses correctly

gz support

I would like to use Steampipe to query csv.gz files without decompressing them first.

error if csv filename begins with a number

To reproduce, convert a file that works properly, with a name like foo.csv, to instead be 1foo.csv.

Error: syntax error at or near "1" (SQLSTATE 42601)

if a column name in the header row contains a period, the values in the column will be null

Given this in seitz.csv

"subscriptions.id","subscriptions_plan_id","subscriptions_plan_quantity"
a,b,1
d,e,2

The result for select * from csv.seitz:

+------------------+-----------------------+-----------------------------+
| subscriptions.id | subscriptions_plan_id | subscriptions_plan_quantity |
+------------------+-----------------------+-----------------------------+
| <null>           | b                     | 1                           |
| <null>           | e                     | 2                           |
+------------------+-----------------------+-----------------------------+
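
Since the columns without a period come back populated, one possible workaround is to rewrite the header before querying. This is a minimal sketch with a hypothetical helper name; it does not explain or fix the plugin's behavior, it only replaces periods in the header row with underscores and leaves the data rows alone.

```python
import csv
import io


def sanitize_header(csv_text):
    """Return csv_text with periods in header names replaced by underscores."""
    rows = list(csv.reader(io.StringIO(csv_text, newline="")))
    if rows:
        # Only the first row (the header) is rewritten.
        rows[0] = [name.replace(".", "_") for name in rows[0]]
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Applied to the seitz.csv content above, the first header becomes subscriptions_id. The writer uses minimal quoting, so the quotes around the header names are dropped.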

CSV plugin randomly fails with "relation "csv.blah" does not exist (SQLSTATE 42P01)"

Describe the bug
I have a script which, at its core, uses Steampipe and the csv plugin to manipulate a bunch of CSVs, get data from Jira, make some new CSVs, etc. This job is run using a GitLab pipeline. Randomly, the job will fail reading one of the CSVs with the error "relation "csv.blah" does not exist (SQLSTATE 42P01)". If I retrigger the exact same job, it works fine the second time. I've had this issue when testing locally and in GitLab. I assume there is some race condition happening.

Steampipe version (steampipe -v)
Latest as of writing this issue - 0.20.6

Plugin version (steampipe plugin list)
Latest as of writing this issue - 0.9.0

To reproduce
It's a bit hard to give exact steps, but potentially try having multiple CSV files, do some query to join them, write the result out to a new temp file, move the temp file to blah.csv, try and read blah.csv, do some query to join them, write the result out to a new temp file, move the temp file to blah2.csv.

Expected behavior
No error/race condition occurs.

CSV plugin loads csv tables from prior working directory when issues encountered

Describe the bug
When Steampipe with the csv plugin has been run successfully in one directory, quit, and then launched from a different directory with a CSV file that contains a "_ctx" column, it will load the CSV tables from the prior directory with no other warnings to the terminal. This occurs even if csv.spc is set to only load from the new directory. No steampipe processes were found running between runs.

Steampipe version (steampipe -v)
steampipe version 0.17.4

To reproduce
$ cd /users/me/turbot
$ steampipe query --output csv "select * from googledirectory_user" > google_users
$ mv google_users google_users.csv

oh, look, its quittin' time!

eat, sleep, get up

Do some early morning AdventOfCode work

directory contains day4.csv see https://github.com/Eric-Hacker/AOC22/tree/main/Day4 for example

$ cd /users/me/aoc/day4
$ steampipe query "select * from day4.csv"

whoops, how time flies, better get some real work done

$ cd /users/me/turbot
$ steampipe query

> .inspect csv
+-------+-----------------------------------------+
| table | description                             |
+-------+-----------------------------------------+
| day4  | CSV file at /Users/me/aoc/day4/day4.csv |
+-------+-----------------------------------------+

see csv.day4 table

think, hmm, I guess I wasn't meant to get work done today, maybe I should go back to working on AOC

Expected behavior
Should be loading tables from the current directory or give a warning if there are issues.

CSV Plugin Crashes if any files in path are in invalid format

Describe the bug
CSV Plugin Crashes if any files in path are in invalid format

Steampipe version (steampipe -v)
failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': myfile.xlsx - Detailed Findings (1).csv header row has empty value in field 1

Plugin version (steampipe plugin list)

$ steampipe --version
steampipe version 0.16.0
OPL-M-PSOLOMON4:Downloads psolomon$ steampipe plugin list
failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': myfile.xlsx - Detailed Findings (1).csv header row has empty value in field 1
+--------------------------------------------------+---------+-----------------+
| Name                                             | Version | Connections     |
+--------------------------------------------------+---------+-----------------+
| hub.steampipe.io/plugins/turbot/aws@latest       | 0.74.1  | <LIST_REDACTED> |
| hub.steampipe.io/plugins/turbot/csv@latest       | 0.3.2   |                 |
| hub.steampipe.io/plugins/turbot/datadog@latest   | 0.1.0   | datadog         |
| hub.steampipe.io/plugins/turbot/finance@latest   | 0.2.1   | finance         |
| hub.steampipe.io/plugins/turbot/github@latest    | 0.19.0  | github          |
| hub.steampipe.io/plugins/turbot/jira@latest      | 0.5.0   | jira            |
| hub.steampipe.io/plugins/turbot/net@latest       | 0.7.0   | net             |
| hub.steampipe.io/plugins/turbot/pagerduty@latest | 0.1.0   | pagerduty       |
| hub.steampipe.io/plugins/turbot/slack@latest     | 0.8.0   | slack           |
| hub.steampipe.io/plugins/turbot/terraform@latest | 0.1.0   | terraform       |
| hub.steampipe.io/plugins/turbot/whois@latest     | 0.5.0   | whois           |
| hub.steampipe.io/plugins/turbot/zoom@latest      | 0.4.0   | zoom            |
+--------------------------------------------------+---------+-----------------+

To reproduce
Create a CSV with an invalid header column (i.e. blank)

Expected behavior
Instead of completely crashing the plugin, just print warnings of the bad files.

Additional context
None at this time.

Improve README and docs by adding examples of tables/columns with mixed case

Is your feature request related to a problem? Please describe.
It's not clear how to query tables or columns with mixed case, as there are few to no examples.

Describe the solution you'd like
Docs should be updated with these examples.

Update plugin to discover CSV files in the current working directory by default

Currently the plugin requires the paths argument to be configured in the .spc file. It's inconvenient and unintuitive that the paths must be configured first. Can we discover files in the current working directory by default?

Add table csv_map

Currently the CSV plugin creates a table for each CSV file. I'd like a different table that can be used to read the rows of a CSV file into a JSONB column.

Table csv_map, with columns:

row, an integer row number. Assume the header is row zero, making the first data row be 1.
data, a jsonb map of the data for the row. The header row is used to determine the keys.
path, a string which is the full path to the CSV file.

The path should be a required column. By default, all CSV files found in the config paths will be included as data in this combination table. But, if the path is set then any file at that path can be read in immediately and alone.
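
To make the proposed row shape concrete, here is a minimal sketch of what csv_map could return for a single file. The function name is hypothetical and this is plain Python, not plugin code; it assumes a UTF-8 file whose first row is the header.

```python
import csv
import json
from pathlib import Path


def csv_map_rows(path):
    """Yield (row, data, path) tuples shaped like the proposed csv_map table."""
    full_path = str(Path(path).resolve())
    with open(path, newline="", encoding="utf-8") as handle:
        # DictReader consumes the header (row zero) and uses it for the keys,
        # so numbering the remaining rows from 1 matches the proposal.
        for row_number, record in enumerate(csv.DictReader(handle), start=1):
            yield row_number, json.dumps(record), full_path
```

Each tuple carries the row number, the row serialized as a JSON object keyed by header name, and the full path of the file it came from.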

Use file paths to qualify table names for csv files

Problem: When CSV files in the search path have the same name, all but one of them are inaccessible. I work with local CSV files a lot; they arrive as the result of Spark/Trino/pandas transforms, and I lay them out like a Hive warehouse to stay organized:

tree data/
data/
└── extracts
    └── csv
        ├── dataset=Page_Traffic
        │   └── rundate=2022-10-23
        │       └── data.csv
        ├── dataset=hdp_sessions
        │   ├── rundate=2022-10-17
        │   │   └── data.csv
        │   └── rundate=2022-10-18
        │       └── data.csv
        ├── dataset=funnel_metrics
        │   ├── rundate=2022-10-21
        │   │   └── data.csv
         ... etc....

# paths = [ "**/*.csv"]
steampipe query
> .inspect csv
+-------+--------------------------------------------------------------------------------------------+
| table | description                                                                                |
+-------+--------------------------------------------------------------------------------------------+
| data  | CSV file at /Users/../data/extracts/csv/dataset=funnel_metrics/rundate=2022-10-22/data.csv |
+-------+--------------------------------------------------------------------------------------------+

Only one of these files will become part of the csv schema. It is very common for CSV files to have the same filename but be organized by path.

I'd like easy access to all of the tables.

Solution:
Qualify the table name by prepending the path from current working directory to the file. Transform characters like /={. to underscore _. You could limit the path to 3 parent directories above the .csv file. Long table names are not a hindrance, and current user experience would be unchanged.
Import all these locations of data.csv with distinct names depending on the path from current working directory.

cd data/extracts/csv/; steampipe query

> .inspect csv
+------------------------------------------------+------------------------------------------------------------------+
| table                                          | description                                                      |
+------------------------------------------------+------------------------------------------------------------------+
| dataset_Page_Traffic_rundate_2022-10-23_data   | CSV file at ./dataset=Page_Traffic/rundate=2022-10-23/data.csv   |
| dataset_funnel_metrics_rundate_2022-10-21_data | CSV file at ./dataset=funnel_metrics/rundate=2022-10-21/data.csv |
| dataset_hdp_sessions_rundate_2022-10-17_data   | CSV file at ./dataset=hdp_sessions/rundate=2022-10-17/data.csv   |
| dataset_hdp_sessions_rundate_2022-10-18_data   | CSV file at ./dataset=hdp_sessions/rundate=2022-10-18/data.csv   |
+------------------------------------------------+------------------------------------------------------------------+
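
The naming rule proposed in this request can be sketched as a small function. The name qualified_table_name and the max_parents parameter are hypothetical; this illustrates the suggestion (keep up to three parent directories, drop the .csv extension, map the characters /={. to underscores) and is not how the plugin currently names tables.

```python
import re
from pathlib import PurePosixPath


def qualified_table_name(relative_path, max_parents=3):
    """Build a table name from a CSV path relative to the working directory."""
    path = PurePosixPath(relative_path)
    # Keep at most max_parents directories closest to the file.
    parents = list(path.parent.parts)[-max_parents:] if max_parents > 0 else []
    pieces = parents + [path.stem]  # stem drops the trailing .csv
    # Replace the characters named in the proposal with underscores.
    return re.sub(r"[/={.]", "_", "/".join(pieces))
```

A file sitting directly in the working directory keeps its current name (data.csv still maps to data), which is what keeps the existing user experience unchanged.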

Describe alternatives you've considered

Right now I have to cd data/extracts/csv/dataset=funnel_metrics/Page_Traffic/rundate=2022-10-22 to determine the table definition. Unfortunately, I can't join or combine results from multiple tables unless I rename them or move a set to a directory.
Rename and move tables around before using Steampipe to interact with them.

Additional context
I love the Steampipe tool. It's a universal interface with so many uses that it is quickly becoming a daily tool for me. Great decisions by the designers.

Use of camelCase in columns causes error

Describe the bug
Attempting to query a column with a camelCase name results in a "column does not exist" error.

Steampipe version (steampipe -v)
v0.16.0

Plugin version (steampipe plugin list)
csv: v0.3.0

To reproduce

> .inspect camels
+-------------+-------+-------------------------------------------------------+
| column      | type  | description                                           |
+-------------+-------+-------------------------------------------------------+
| CapitalCase | text  | Field 3.                                              |
| UPPERCASE   | text  | Field 2.                                              |
| _ctx        | jsonb | Steampipe context in JSON form, e.g. connection_name. |
| camelCase   | text  | Field 0.                                              |
| lowercase   | text  | Field 1.                                              |
+-------------+-------+-------------------------------------------------------+
> select * from camels;
+-----------+-----------+-----------+-------------+---------------------------+
| camelCase | lowercase | UPPERCASE | CapitalCase | _ctx                      |
+-----------+-----------+-----------+-------------+---------------------------+
| no        | yes       | no        | no          | {"connection_name":"csv"} |
+-----------+-----------+-----------+-------------+---------------------------+
> select camelCase from camels;
Error: column "camelcase" does not exist (SQLSTATE 42703)

> select camelcase from camels;
Error: column "camelcase" does not exist (SQLSTATE 42703)

> select lowercase from camels;
+-----------+
| lowercase |
+-----------+
| yes       |
+-----------+

> select UPPERCASE from camels;
Error: column "uppercase" does not exist (SQLSTATE 42703)

> select CapitalCase from camels;
Error: column "capitalcase" does not exist (SQLSTATE 42703)

Expected behavior
Should be able to query the columns using the correct case.
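
For context, PostgreSQL folds unquoted identifiers to lowercase, which is why select camelCase is reported as "camelcase" in the errors above; a double-quoted identifier keeps its case. The helper below is a hypothetical sketch of that quoting rule (it also covers names that start with a digit, such as 1foo). It does not check reserved words, and it makes no claim about whether the plugin then resolves the quoted name.

```python
import re

# Names made of a lowercase letter or underscore followed by lowercase
# letters, digits, or underscores are unchanged by lowercase folding.
_SAFE_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def quote_identifier(name):
    """Return name as it should appear in a query, double-quoted if needed."""
    if _SAFE_IDENTIFIER.fullmatch(name):
        return name
    # Inside a quoted identifier, a literal double quote is written twice.
    return '"' + name.replace('"', '""') + '"'
```

With this rule, the camelCase column would be written as select "camelCase" from camels; while lowercase needs no quotes.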

byte order mark becomes space before first field in header when exporting from excel

I think that's what's happening, anyway. When saving a sheet as CSV from Excel, my default is UTF-8, and if the first column header is "name" it becomes " name" in the schema.

My workaround: Save as CSV (MS-DOS). 😱

UTF-8 encoded CSVs can't leverage Column A

Describe the bug
CSV files saved as UTF-8 with a byte order mark (BOM) begin with hex EF BB BF. The CSV plugin doesn't skip these three bytes, which makes the first column not queryable. CSV files that MS Excel saves as UTF-8 include this BOM.

Two sample files run through the macOS xxd hex viewer:

chris$ xxd bad.csv  | head -2
00000000: efbb bf63 6f6e 7472 6f6c 2c70 6574 5f6e  ...control,pet_n
00000010: 616d 652c 6578 7065 6374 6564 5f72 6573  ame,expected_res

Note the three dots and the efbb bf first three bytes above. A steampipe parsable CSV looks like:

chris$ xxd good.csv  | head -2
00000000: 636f 6e74 726f 6c2c 7065 745f 6e61 6d65  control,pet_name
00000010: 2c65 7870 6563 7465 645f 7265 7375 6c74  ,expected_result

Steampipe version (steampipe -v)
steampipe version 0.16.4

Plugin version (steampipe plugin list)

steampipe plugin list | grep csv
| hub.steampipe.io/plugins/turbot/csv@latest    | 0.4.0   | csv         |

To reproduce
See attached files for examples:

> select count(control) from bad;
Error: column "control" does not exist (SQLSTATE 42703)
> select count(control) from good;
+-------+
| count |
+-------+
| 7     |
+-------+
> select count(pet_name) from bad;
+-------+
| count |
+-------+
| 7     |
+-------+
> select count(pet_name) from good;
+-------+
| count |
+-------+
| 7     |
+-------+
>

Expected behavior
The CSV plugin should ignore the three-byte BOM at the start of a UTF-8 CSV file.
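
The expected behavior amounts to dropping the three bytes EF BB BF before parsing the header. A minimal sketch of that step, written in Python for illustration (the plugin itself is written in Go, and read_header is a hypothetical name):

```python
import codecs
import csv
import io


def read_header(raw_bytes):
    """Return the header row of CSV bytes, dropping a leading UTF-8 BOM."""
    # codecs.BOM_UTF8 is the byte sequence EF BB BF.
    if raw_bytes.startswith(codecs.BOM_UTF8):
        raw_bytes = raw_bytes[len(codecs.BOM_UTF8):]
    text = raw_bytes.decode("utf-8")
    return next(csv.reader(io.StringIO(text, newline="")), [])
```

With the BOM removed, the first header of bad.csv would read control, the same as in good.csv.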

CSV@private S3: automatic resolution of the region

Is your feature request related to a problem? Please describe.
The documentation states:

Make sure that region is configured in the config. If not set in the config, region will be fetched from the standard environment variable AWS_REGION.

I'm confused as to why this is needed as the region is in the hostname of the S3 path to the file or folder.

The documentation suggests the use of AWS profiles, so if one were to have CSV files in two regions, they'd have to configure two different AWS profiles for Steampipe, which is at odds with most other tooling that uses AWS credentials.

The documentation also suggests passing the region, but it's not in the example and I can't get it to work; a region or aws_region parameter is not recognized:

$ steampipe query
Welcome to Steampipe v0.19.3
For more information, type .help
Warning: failed to start plugin 'hub.steampipe.io/plugins/turbot/csv@latest': failed to get directory specified by the source s3::https://XXXXXXX.s3.eu-west-1.amazonaws.com/XXXX.csv?aws_profile=aa&aws_region=eu-west-1: error downloading 'https://XXXXXXX.s3.eu-west-1.amazonaws.com/XXXXXXX.csv?aws_profile=aa&aws_region=eu-west-1': MissingRegion: could not find region configuration

Setting the region corresponding to the bucket location in ~/.aws/credentials (region=eu-west-1) works.

Setting the incorrect region in ~/.aws/credentials yields this error upon Steampipe invocation: BucketRegionError: incorrect region, the bucket is not in 'eu-central-1' region

Describe the solution you'd like
I think this feature should not require a region to be given at all. In the worst case, it can be parsed from the hostname: https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-bucket-intro.html
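
The hostname-parsing fallback suggested here could look roughly like the sketch below. The function name and the example bucket are hypothetical, and the pattern only recognizes hosts of the form s3.<region>.amazonaws.com or s3-<region>.amazonaws.com, with or without a bucket prefix; any other endpoint form returns None.

```python
import re
from urllib.parse import urlparse

# Captures the region label that sits between "s3" and ".amazonaws.com".
_S3_HOST = re.compile(r"(?:^|\.)s3[.-]([a-z0-9-]+)\.amazonaws\.com$")


def region_from_s3_url(url):
    """Return the region embedded in an S3 HTTPS URL, or None if absent."""
    # Drop the "s3::" prefix used in the source strings shown above.
    if url.startswith("s3::"):
        url = url[len("s3::"):]
    host = urlparse(url).hostname or ""
    match = _S3_HOST.search(host)
    return match.group(1) if match else None
```

For a URL like https://example-bucket.s3.eu-west-1.amazonaws.com/report.csv this yields eu-west-1, while a host without a region label (example-bucket.s3.amazonaws.com) yields None, so a configured region would still be needed as a fallback.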

versions
plugin csv 0.7.0
Steampipe v0.19.3

Enable CSV plugin to access files stored in an S3 bucket

Describe the solution you'd like
I absolutely love the power of the CSV plugin, but it would be great if you could use this plugin to access files stored in an S3 bucket. I currently get various reports delivered to S3 buckets and I'd love to be able to perform joins of the data stored in these CSV files with live data from the AWS plugin. This would a) eliminate the extra step required to download files to my local machine for access via the CSV plugin and b) enable interesting use cases where CSV files could be leveraged to build dashboards on a server running in AWS and/or in Steampipe cloud.

The way I'd envision this working is that you could configure your csv.spc to point to an S3 URL or ARN, and the credentials Steampipe is using would need to be granted access to the bucket.

foreign table persists when csv file is deleted (or, presumably, renamed)

connection "csv" {
plugin = "csv"
paths = [ "/home/jon/csv/*.csv" ]
}

cat > fruit.csv
name,color
apple,red

select * from csv.fruit
name,color
apple,red

rm fruit.csv
select * from csv.fruit
... hangs ...

sudo -u root psql -d steampipe -h localhost -p 9193

select * from information_schema.foreign_tables

steampipe, csv, fruit, steampipe, steampipe

drop foreign table csv.fruit

cat > fruit.csv
name,color
apple,red

select * from csv.fruit
name,color
apple,red

Use of DoubleQuotes not working for some table names

Describe the bug
Can't query column header for csv file with the name "Billable Hours"

Steampipe version (steampipe -v)
Example: v0.16.3

Plugin version (steampipe plugin list)
Example: v0.3.2

To reproduce
create this csv file:

Billable Hours,Bundle Type
10,xxx

query the file:

> select "Billable Hours" from test
Error: column "Billable Hours" does not exist (SQLSTATE 42703)
> select "Bundle Type" from test
+-------------+
| Bundle Type |
+-------------+
| xxx         |
+-------------+

Expected behavior
Both queries should work

Additional context
Was having trouble reproducing with other column names, e.g. "Foo Bar" works fine...
