Comments (19)
ssh to vitrina03 and run ps -ef | grep gpfdist to see if it is running.
You can view the detailed logs in the home directory of gpadmin on that host too.
Refer to this:
https://github.com/RunningJon/TPC-DS/blob/master/04_load/start_gpfdist.sh#L8
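If the raw ps listing is hard to read, a small filter can reduce it to one line per gpfdist process. This is only a sketch: the function name list_gpfdist is made up for illustration, and it assumes gpfdist was started as gpfdist -p PORT -d DIR and that ps -ef prints the command starting in column 8.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -ef` output on stdin and print
# "PORT DIRECTORY" for each gpfdist process.
# Assumes the command name is field 8, as in standard `ps -ef` output,
# which also skips the `grep gpfdist` line itself.
list_gpfdist() {
  awk '
    $8 == "gpfdist" || $8 ~ /\/gpfdist$/ {
      port = ""; dir = ""
      for (i = 9; i < NF; i++) {
        if ($i == "-p") port = $(i + 1)   # listening port
        if ($i == "-d") dir  = $(i + 1)   # directory being served
      }
      print port, dir
    }'
}

# Usage on the segment host:
#   ps -ef | list_gpfdist
```

Each printed directory is where that gpfdist instance expects to find the generated data files.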
from tpc-ds.
Hello.
Thanks for the answer.
Here is the output of the command, along with the gpfdist log.
Sincerely, Sergey Berezin.
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@vitrina03 ~]# ps -ef | grep gpfdist
gpadmin 81087 1 0 16:57 ? 00:00:00 gpfdist -p 4001 -d /data1/primary/gpseg5/pivotalguru
gpadmin 81125 1 0 16:57 ? 00:00:00 gpfdist -p 4002 -d /data1/primary/gpseg6/pivotalguru
gpadmin 81163 1 0 16:57 ? 00:00:00 gpfdist -p 4003 -d /data1/primary/gpseg7/pivotalguru
gpadmin 81254 1 0 16:57 ? 00:00:00 gpfdist -p 4004 -d /data1/primary/gpseg8/pivotalguru
gpadmin 81297 1 0 16:57 ? 00:00:00 gpfdist -p 4005 -d /data1/primary/gpseg9/pivotalguru
root 101134 101007 0 17:23 pts/0 00:00:00 grep --color=auto gpfdist
[root@vitrina03 ~]#
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2022-11-08 16:57:03 81125 INFO Before opening listening sockets - following listening sockets are available:
2022-11-08 16:57:03 81125 INFO IPV6 socket: [::]:4002
2022-11-08 16:57:03 81125 INFO IPV4 socket: 0.0.0.0:4002
2022-11-08 16:57:03 81125 INFO Trying to open listening socket:
2022-11-08 16:57:03 81125 INFO IPV6 socket: [::]:4002
2022-11-08 16:57:03 81125 INFO Opening listening socket succeeded
2022-11-08 16:57:03 81125 INFO Trying to open listening socket:
2022-11-08 16:57:03 81125 INFO IPV4 socket: 0.0.0.0:4002
2022-11-08 16:57:03 81125 INFO Opening listening socket succeeded
Serving HTTP on port 4002, directory /data1/primary/gpseg6/pivotalguru
2022-11-08 16:57:13 81125 INFO [0:1:0:8] 192.168.11.24 requests /time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 INFO [0:1:0:8] got a request at port 30656:
GET /time_dim_[0-9]_[0-9].dat HTTP/1.1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] request headers:
2022-11-08 16:57:13 81125 INFO [0:1:0:8] Host:192.168.11.24:4002
2022-11-08 16:57:13 81125 INFO [0:1:0:8] Accept:*/*
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-XID:1667912973-0000000268
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-CID:2
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SN:0
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEGMENT-ID:6
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEGMENT-COUNT:15
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-LINE-DELIM-LENGTH:-1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-PROTO:1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-MASTER_HOST:192.168.11.63
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-MASTER_PORT:5432
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-CSVOPT:m0x 92q 0n0h0
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP_SEG_PG_CONF:/data1/primary/gpseg6/postgresql.conf
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP_SEG_DATADIR:/data1/primary/gpseg6
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-DATABASE:adb
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-USER:gpadmin
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEG-PORT:10001
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SESSION-ID:722
2022-11-08 16:57:13 81125 INFO remove sessions
2022-11-08 16:57:13 81125 INFO [0:1:6:8] r->path /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 INFO [0:1:6:8] new session trying to open the data stream
gfile stat /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat failure: No such file or directory
fstream unable to open file /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 WARN [0:1:6:8] reject request from 192.168.11.24, path /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 WARN [0:1:6:8] HTTP ERROR: 192.168.11.24 - 404 file not found
2022-11-08 16:57:13 81125 INFO [0:1:6:8] request end
2022-11-08 16:57:13 81125 INFO [0:1:6:8] detach segment request from session
2022-11-08 16:57:13 81125 INFO [0:1:6:8] successfully shutdown socket
2022-11-08 16:57:13 81125 INFO [0:1:6:8] peer closed after gpfdist shutdown
2022-11-08 16:57:13 81125 INFO [0:1:6:8] unsent bytes: 0 (-1 means not supported)
2022-11-08 16:57:13 81125 INFO [0:1:6:8] successfully closed socket
from tpc-ds.
This means gpfdist is working properly and simply couldn't find any files.
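To confirm the same thing across a long log, the rejected paths can be pulled out in one pass. A rough sketch, assuming the WARN line format shown in the log above; the function name rejected_paths and the GPFDIST_LOG variable are placeholders, not part of the benchmark.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a gpfdist log on stdin and print each distinct
# path that gpfdist rejected. Relies on lines shaped like:
#   ... WARN [...] reject request from <ip>, path <path>
rejected_paths() {
  sed -n 's/^.* WARN .*reject request from [^,]*, path \(.*\)$/\1/p' | sort -u
}

# Usage (GPFDIST_LOG is a placeholder for the log file on the segment host):
#   rejected_paths < "$GPFDIST_LOG"
```

If every rejected path points into the data directory, the server side is fine and the files themselves are missing.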
Did the generate data step finish?
Look for the log file on that host named generate_data.x.log. Refer to this:
https://github.com/RunningJon/TPC-DS/blob/master/04_load/start_gpfdist.sh#L8
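A quick way to see whether the generate step left anything behind is to count the data files in each gpfdist directory. The sketch below is illustrative only: count_matching is a made-up name, and the glob in the usage line is an assumed file-naming pattern that may need adjusting to match what dsdgen actually writes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count regular files in DIR whose names match GLOB.
# GLOB is deliberately left unquoted inside the loop so the shell expands it.
count_matching() {
  dir=$1
  pattern=$2
  n=0
  for f in "$dir"/$pattern; do
    # When nothing matches, the glob stays literal and this test fails.
    if [ -f "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Usage on the segment host (the glob is an assumption about the file names):
#   count_matching /data1/primary/gpseg6/pivotalguru 'time_dim_[0-9]*_[0-9]*.dat'
```

A count of 0 for every table suggests the generate step never ran or failed early on that host.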
from tpc-ds.
Here is the content of the log.
Sincerely, Sergey Berezin.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
GEN_DATA_SCALE: 10000
CHILD: 1
PARALLEL: 15
GEN_DATA_PATH: /data1/primary/gpseg5/pivotalguru
./generate_data.sh: line 32: /home/gpadmin/dsdgen: No such file or directory
from tpc-ds.
This means the dsdgen binary didn't get copied to the host.
https://github.com/RunningJon/TPC-DS/blob/master/00_compile_tpcds/rollout.sh#L33
Do you see this file? 00_compile_tpcds/tools/dsqgen
If so, does segment_hosts.txt have all of the correct segment hosts in it including this host?
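Both questions can be checked from the master in one loop. This is a sketch under assumptions: hosts_missing_binary and the REMOTE_SH override are invented for illustration, and it presumes passwordless ssh as gpadmin to every host listed in segment_hosts.txt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every host from HOSTS_FILE on which PATH_TO_CHECK
# is missing or not executable.
# REMOTE_SH (an assumption of this sketch) defaults to ssh and can be swapped
# out, for example for a stub when trying the function without a cluster.
hosts_missing_binary() {
  hosts_file=$1
  path=$2
  while IFS= read -r host; do
    [ -n "$host" ] || continue            # skip blank lines
    # </dev/null keeps ssh from swallowing the rest of the hosts file
    if ! ${REMOTE_SH:-ssh} "$host" "test -x '$path'" </dev/null; then
      echo "$host"
    fi
  done < "$hosts_file"
}

# Usage on the master, as gpadmin:
#   hosts_missing_binary /pivotalguru/TPC-DS/segment_hosts.txt /home/gpadmin/dsdgen
```

Any host it prints did not receive the binary, or could not be reached.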
from tpc-ds.
Yes, the dsqgen file is in place.
The hostnames are spelled correctly.
Sincerely, Sergey Berezin.
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
-rwxrwxr-x 1 gpadmin gpadmin 455416 Oct 31 15:53 dsdgen
-rwxrwxr-x 1 gpadmin gpadmin 286416 Oct 31 15:53 dsqgen
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@vitrina01 ~]# find / -name segment_hosts.txt
find: ‘/proc/255094’: No such file or directory
/pivotalguru/TPC-DS/segment_hosts.txt
[root@vitrina01 ~]# cat /pivotalguru/TPC-DS/segment_hosts.txt
vitrina03
vitrina04
vitrina02
[root@vitrina01 ~]# ping vitrina03
PING vitrina03 (192.168.11.24) 56(84) bytes of data.
64 bytes from vitrina03 (192.168.11.24): icmp_seq=1 ttl=64 time=0.198 ms
64 bytes from vitrina03 (192.168.11.24): icmp_seq=2 ttl=64 time=0.180 ms
^C
--- vitrina03 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.180/0.189/0.198/0.009 ms
[root@vitrina01 ~]# ping vitrina04
PING vitrina04 (192.168.11.25) 56(84) bytes of data.
64 bytes from vitrina04 (192.168.11.25): icmp_seq=1 ttl=64 time=0.250 ms
64 bytes from vitrina04 (192.168.11.25): icmp_seq=2 ttl=64 time=0.184 ms
^C
--- vitrina04 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.184/0.217/0.250/0.033 ms
[root@vitrina01 ~]# ping vitrina02
PING vitrina02 (192.168.11.49) 56(84) bytes of data.
64 bytes from vitrina02 (192.168.11.49): icmp_seq=1 ttl=64 time=0.268 ms
64 bytes from vitrina02 (192.168.11.49): icmp_seq=2 ttl=64 time=0.242 ms
^C
--- vitrina02 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.242/0.255/0.268/0.013 ms
[root@vitrina01 ~]#
from tpc-ds.
10TB with only 15 segments on 3 nodes will take a very long time to complete.
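For a sense of scale, here is the arithmetic behind that remark, as a sketch. It assumes the scale factor is roughly the raw data size in GB and that the data is split evenly; share_gb is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: integer share of SCALE_GB when split evenly into PARTS.
share_gb() {
  echo $(( $1 / $2 ))
}

# Usage with the values from the log (GEN_DATA_SCALE: 10000, PARALLEL: 15):
#   share_gb 10000 15   # GB per dsdgen stream
#   share_gb 10000 3    # GB per segment host
```

That works out to about 666 GB per stream and about 3,333 GB per host, so it is also worth checking that each host has that much free space under its data directories.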
The error was on vitrina03:
./generate_data.sh: line 32: /home/gpadmin/dsdgen: No such file or directory
But you said it is in place, correct? Please run: ls -la /home/gpadmin/dsdgen
from tpc-ds.
The file is in a different location.
Sincerely, Sergey Berezin.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@vitrina01 tools]# ls -la /home/gpadmin/dsdgen
ls: cannot access /home/gpadmin/dsdgen: No such file or directory
[root@vitrina01 tools]# pwd
/pivotalguru/TPC-DS/00_compile_tpcds/tools
from tpc-ds.
dsdgen is the program that is compiled on vitrina01 (master) in /pivotalguru/TPC-DS/00_compile_tpcds/tools and copied to every segment host (vitrina02, vitrina03, vitrina04). The error message said that the file didn't exist on vitrina03.
from tpc-ds.
Sincerely, Sergey Berezin.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[gpadmin@vitrina03 ~]$ ls -la /home/gpadmin/dsdgen
ls: cannot access /home/gpadmin/dsdgen: No such file or directory
[gpadmin@vitrina03 ~]$ pwd
/home/gpadmin
[gpadmin@vitrina03 ~]$
from tpc-ds.
https://github.com/RunningJon/TPC-DS/blob/master/00_compile_tpcds/rollout.sh#L33
Are you able to ssh between the nodes as gpadmin? Can you scp files from the master node to every segment host?
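One way to answer this without being prompted for a password is ssh's BatchMode option, which makes the connection fail instead of asking. A sketch, with check_ssh and the SSH_CMD override as illustrative names; run it as gpadmin, since that is the account in question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report for each host in HOSTS_FILE whether a
# non-interactive ssh login succeeds.
# BatchMode=yes disables password prompts, so a host without key-based
# login shows up as FAILED rather than hanging.
# SSH_CMD is an assumption of this sketch; it defaults to ssh.
check_ssh() {
  hosts_file=$1
  while IFS= read -r host; do
    [ -n "$host" ] || continue
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
         </dev/null 2>/dev/null; then
      echo "$host ok"
    else
      echo "$host FAILED"
    fi
  done < "$hosts_file"
}

# Usage on the master, as gpadmin:
#   check_ssh /pivotalguru/TPC-DS/segment_hosts.txt
```

A FAILED line for any segment host would explain why the binary was never copied there.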
from tpc-ds.
Yes, there is a connection.
Sincerely, Sergey Berezin.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@vitrina01 TPC-DS-master]# scp tpcds.log vitrina02:/home/gpadmin/
tpcds.log 100% 197KB 33.4MB/s 00:00
[root@vitrina01 TPC-DS-master]#
from tpc-ds.
I would set RUN_COMPILE_TPCDS to "true" and RUN_GEN_DATA to "true" in tpcds_variables.sh and then run the benchmark again. When it ran the compile the first time, it did not copy the binary to the segment hosts.
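Editing the file by hand works fine; for repeated runs, the two flags can also be flipped with a small helper. This sketch assumes each setting in tpcds_variables.sh sits on its own line in the form NAME="value"; set_var is a made-up name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the line NAME=... in FILE to NAME="VALUE".
# Assumes one setting per line, starting at column 1.
# Writes through a temp file and copies back so the original file keeps
# its permissions and ownership.
set_var() {
  file=$1
  name=$2
  value=$3
  tmp=$(mktemp) || return 1
  sed "s|^${name}=.*|${name}=\"${value}\"|" "$file" > "$tmp" \
    && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return $status
}

# Usage from the TPC-DS directory on the master:
#   set_var tpcds_variables.sh RUN_COMPILE_TPCDS true
#   set_var tpcds_variables.sh RUN_GEN_DATA true
```

Other settings in the file are left untouched.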
from tpc-ds.
Now the log is showing:
copy tpcds binaries to vitrina03:/home/gpadmin
scp: /home/gpadmin//dsdgen: Text file busy
from tpc-ds.
You need to kill all of the dsdgen processes on the segment hosts and make sure the benchmark isn't running.
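"Text file busy" is what the kernel reports when something tries to overwrite an executable that is still running, so the old dsdgen processes have to go before scp can replace the file. A sketch of doing that per host; stop_dsdgen_on and REMOTE_SH are illustrative names, and pkill/pgrep are assumed to be installed on the segment hosts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: kill dsdgen on HOST and report whether any instance
# survived. REMOTE_SH is an assumption of this sketch; it defaults to ssh.
stop_dsdgen_on() {
  host=$1
  # pkill exits non-zero when nothing matched, so its status is ignored;
  # the result depends only on whether pgrep still finds a dsdgen process.
  if ${REMOTE_SH:-ssh} "$host" \
       'pkill -x dsdgen; sleep 1; ! pgrep -x dsdgen >/dev/null' </dev/null; then
    echo "$host: no dsdgen processes left"
  else
    echo "$host: dsdgen still running (or host unreachable)"
    return 1
  fi
}

# Usage on the master, as gpadmin, once the benchmark itself is stopped:
#   for h in vitrina02 vitrina03 vitrina04; do stop_dsdgen_on "$h"; done
```

The host names in the usage line are the ones listed in segment_hosts.txt earlier in this thread.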
from tpc-ds.
Hello.
Everything seems to have started.
Thank you.
Where is the number of threads configured?
Is there a variable I can change for this, if that is possible?
The log shows: "Starting analyze with 5 workers..."
Sincerely, Sergey Berezin.
from tpc-ds.
It is running analyzedb, which started the 5 workers. The command does allow you to run more workers, but that isn't exposed in this benchmark utility.
from tpc-ds.
Could you tell me how to run more or fewer than five worker processes?
Sincerely, Sergey Berezin.
from tpc-ds.
For analyze? You could hard-code it in the script, but it really won't help much. That isn't what is taking a long time.
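For anyone who still wants to try it, the worker count corresponds to analyzedb's -p (parallel level) option, documented as an integer from 1 to 10 with a default of 5. The sketch below only builds the command line; analyzedb_cmd is a made-up name, and the exact arguments the benchmark script passes to analyzedb are not shown in this thread, so treat the other flags as assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an analyzedb command for DB with WORKERS
# parallel workers. Only -d (database), -a (no confirmation prompt) and
# -p (parallel level) are used here; the benchmark may pass other options.
analyzedb_cmd() {
  db=$1
  workers=$2
  case $workers in
    ''|*[!0-9]*)
      echo "workers must be an integer" >&2
      return 1
      ;;
  esac
  if [ "$workers" -lt 1 ] || [ "$workers" -gt 10 ]; then
    echo "workers must be between 1 and 10" >&2
    return 1
  fi
  echo "analyzedb -d $db -a -p $workers"
}

# Usage (adb is the database name that appears in the gpfdist log above):
#   analyzedb_cmd adb 8
```

As noted, this is unlikely to change the overall runtime much, since analyze is not the slow step.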
from tpc-ds.