Comments (2)
Thanks @parthosa for the feedback. I'm new to TPC-DS, and this repo was originally intended only for official TPC-DS runs.
I took a look at the dbgen_version table. It looks like it's only used to show meta info for data generation and is not used in queries, so I'm fine with removing it.
Btw, could you share what the "other Datasources" could be, apart from the official tool?
from spark-rapids-benchmarks.
There could be other ways to generate the TPCDS data that do not create the dbgen_version folder. Since this folder is not used in queries, its absence should not be a hurdle when running queries on data generated in these other ways.
I was using an S3 bucket with TPCDS data that did not have the dbgen_version folder.
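As a rough illustration of the point above, here is a minimal sketch of treating dbgen_version as optional when deciding which table folders to load. It is not the repo's actual code: the function name `tables_to_load`, the shortened table lists, and the local-directory layout (one folder per table) are all assumptions made for the example, and an S3 bucket would need a different existence check.

```python
import tempfile
from pathlib import Path

# Hypothetical, shortened table list; the real benchmark has many more tables.
REQUIRED_TABLES = ["store_sales", "item"]
# Metadata-only table: not referenced by queries, so it may be absent.
OPTIONAL_TABLES = ["dbgen_version"]


def tables_to_load(data_root):
    """Return table names to register, skipping optional folders that are absent."""
    root = Path(data_root)
    missing = [t for t in REQUIRED_TABLES if not (root / t).is_dir()]
    if missing:
        # Only folders that queries depend on are treated as fatal.
        raise FileNotFoundError(f"missing required table folders: {missing}")
    present_optional = [t for t in OPTIONAL_TABLES if (root / t).is_dir()]
    return REQUIRED_TABLES + present_optional


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a data source generated without the dbgen_version folder.
        for name in REQUIRED_TABLES:
            (Path(tmp) / name).mkdir()
        print(tables_to_load(tmp))
```

With this shape, data that lacks dbgen_version loads normally, while a missing query table still fails early.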