Comments (19)

RunningJon avatar RunningJon commented on July 29, 2024

ssh to vitrina03 and run ps -ef | grep gpfdist to see if it is running.
You can view the detailed logs in the home directory of gpadmin on that host too.
Refer to this:
https://github.com/RunningJon/TPC-DS/blob/master/04_load/start_gpfdist.sh#L8
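
As a rough sketch of that check, the helper below filters `ps -ef` output and prints the port and directory of each gpfdist process it finds. The function name `list_gpfdist` is made up for illustration and is not part of the benchmark scripts; `-p` (port) and `-d` (directory) are gpfdist's own options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -ef` output on stdin and print
# "<port> <directory>" for every gpfdist process found.
# Assumes the usual `ps -ef` layout, where the command is field 8.
list_gpfdist() {
  awk '
    $8 == "gpfdist" || $8 ~ /\/gpfdist$/ {
      port = ""; dir = ""
      for (i = 9; i < NF; i++) {
        if ($i == "-p") port = $(i + 1)   # listening port
        if ($i == "-d") dir  = $(i + 1)   # directory being served
      }
      print port, dir
    }'
}

# Usage on the segment host, or from the master via ssh:
#   ps -ef | list_gpfdist
#   ssh vitrina03 'ps -ef' | list_gpfdist
```

Matching on the command field rather than the whole line keeps the `grep gpfdist` process itself out of the result.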

from tpc-ds.

skazka064 avatar skazka064 commented on July 29, 2024

Hello.
Thanks for the answer.
Below is the output of the command, followed by the gpfdist log.

Sincerely, Sergey Berezin.

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[root@vitrina03 ~]# ps -ef | grep gpfdist
gpadmin 81087 1 0 16:57 ? 00:00:00 gpfdist -p 4001 -d /data1/primary/gpseg5/pivotalguru
gpadmin 81125 1 0 16:57 ? 00:00:00 gpfdist -p 4002 -d /data1/primary/gpseg6/pivotalguru
gpadmin 81163 1 0 16:57 ? 00:00:00 gpfdist -p 4003 -d /data1/primary/gpseg7/pivotalguru
gpadmin 81254 1 0 16:57 ? 00:00:00 gpfdist -p 4004 -d /data1/primary/gpseg8/pivotalguru
gpadmin 81297 1 0 16:57 ? 00:00:00 gpfdist -p 4005 -d /data1/primary/gpseg9/pivotalguru
root 101134 101007 0 17:23 pts/0 00:00:00 grep --color=auto gpfdist
[root@vitrina03 ~]#

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

2022-11-08 16:57:03 81125 INFO Before opening listening sockets - following listening sockets are available:
2022-11-08 16:57:03 81125 INFO IPV6 socket: [::]:4002
2022-11-08 16:57:03 81125 INFO IPV4 socket: 0.0.0.0:4002
2022-11-08 16:57:03 81125 INFO Trying to open listening socket:
2022-11-08 16:57:03 81125 INFO IPV6 socket: [::]:4002
2022-11-08 16:57:03 81125 INFO Opening listening socket succeeded
2022-11-08 16:57:03 81125 INFO Trying to open listening socket:
2022-11-08 16:57:03 81125 INFO IPV4 socket: 0.0.0.0:4002
2022-11-08 16:57:03 81125 INFO Opening listening socket succeeded
Serving HTTP on port 4002, directory /data1/primary/gpseg6/pivotalguru
2022-11-08 16:57:13 81125 INFO [0:1:0:8] 192.168.11.24 requests /time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 INFO [0:1:0:8] got a request at port 30656:
GET /time_dim_[0-9]_[0-9].dat HTTP/1.1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] request headers:
2022-11-08 16:57:13 81125 INFO [0:1:0:8] Host:192.168.11.24:4002
2022-11-08 16:57:13 81125 INFO [0:1:0:8] Accept:*/*
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-XID:1667912973-0000000268
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-CID:2
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SN:0
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEGMENT-ID:6
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEGMENT-COUNT:15
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-LINE-DELIM-LENGTH:-1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-PROTO:1
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-MASTER_HOST:192.168.11.63
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-MASTER_PORT:5432
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-CSVOPT:m0x 92q 0n0h0
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP_SEG_PG_CONF:/data1/primary/gpseg6/postgresql.conf
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP_SEG_DATADIR:/data1/primary/gpseg6
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-DATABASE:adb
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-USER:gpadmin
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SEG-PORT:10001
2022-11-08 16:57:13 81125 INFO [0:1:0:8] X-GP-SESSION-ID:722
2022-11-08 16:57:13 81125 INFO remove sessions
2022-11-08 16:57:13 81125 INFO [0:1:6:8] r->path /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 INFO [0:1:6:8] new session trying to open the data stream
gfile stat /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat failure: No such file or directory
fstream unable to open file /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 WARN [0:1:6:8] reject request from 192.168.11.24, path /data1/primary/gpseg6/pivotalguru/time_dim_[0-9]_[0-9].dat
2022-11-08 16:57:13 81125 WARN [0:1:6:8] HTTP ERROR: 192.168.11.24 - 404 file not found

2022-11-08 16:57:13 81125 INFO [0:1:6:8] request end
2022-11-08 16:57:13 81125 INFO [0:1:6:8] detach segment request from session
2022-11-08 16:57:13 81125 INFO [0:1:6:8] successfully shutdown socket
2022-11-08 16:57:13 81125 INFO [0:1:6:8] peer closed after gpfdist shutdown
2022-11-08 16:57:13 81125 INFO [0:1:6:8] unsent bytes: 0 (-1 means not supported)
2022-11-08 16:57:13 81125 INFO [0:1:6:8] successfully closed socket

RunningJon avatar RunningJon commented on July 29, 2024

This means gpfdist is working properly and simply couldn't find any files.

Did the generate data step finish?

Look for the log file on that host named generate_data.x.log. Refer to this:
https://github.com/RunningJon/TPC-DS/blob/master/04_load/start_gpfdist.sh#L8
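
One way to confirm that the 404 is only about missing files is to count the files in the gpfdist directory that match the requested pattern. This is a minimal sketch; `count_data_files` is a hypothetical name, and the directory and pattern in the usage comment are copied from the gpfdist log in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the regular files directly inside a directory
# whose names match a glob, e.g. the pattern gpfdist was asked to serve.
count_data_files() {
  local dir=$1 pattern=$2
  find "$dir" -maxdepth 1 -type f -name "$pattern" | wc -l | tr -d ' '
}

# Usage on the segment host:
#   count_data_files /data1/primary/gpseg6/pivotalguru 'time_dim_[0-9]_[0-9].dat'
```

A count of 0 is consistent with the 404 and points at the data generation step rather than at gpfdist.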

skazka064 avatar skazka064 commented on July 29, 2024

Here are the contents of the log.
Sincerely, Sergey Berezin.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

GEN_DATA_SCALE: 10000
CHILD: 1
PARALLEL: 15
GEN_DATA_PATH: /data1/primary/gpseg5/pivotalguru
./generate_data.sh: line 32: /home/gpadmin/dsdgen: No such file or directory

RunningJon avatar RunningJon commented on July 29, 2024

This means the dsdgen binary didn't get copied to the host.

https://github.com/RunningJon/TPC-DS/blob/master/00_compile_tpcds/rollout.sh#L33

Do you see this file? 00_compile_tpcds/tools/dsqgen

If so, does segment_hosts.txt have all of the correct segment hosts in it including this host?
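
A quick way to check the hosts file, sketched under the assumption that segment_hosts.txt holds one hostname per line. `host_listed` is a hypothetical helper, not something the benchmark provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the host appears as a whole line
# in the hosts file (assumed format: one hostname per line).
host_listed() {
  local hosts_file=$1 host=$2
  grep -qxF -- "$host" "$hosts_file"
}

# Usage on the master, from the directory that holds segment_hosts.txt:
#   host_listed segment_hosts.txt vitrina03 && echo "listed" || echo "not listed"
```

The `-x` and `-F` options make grep match the entire line literally, so a partial name such as `vitrina0` does not count as a hit.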

skazka064 avatar skazka064 commented on July 29, 2024

Yes, the dsqgen file is in place.
The hostnames are spelled correctly.

Sincerely, Sergey Berezin.

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

-rwxrwxr-x 1 gpadmin gpadmin 455416 Oct 31 15:53 dsdgen
-rwxrwxr-x 1 gpadmin gpadmin 286416 Oct 31 15:53 dsqgen

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

[root@vitrina01 ~]# find / -name segment_hosts.txt
find: ‘/proc/255094’: No such file or directory
/pivotalguru/TPC-DS/segment_hosts.txt
[root@vitrina01 ~]# cat /pivotalguru/TPC-DS/segment_hosts.txt
vitrina03
vitrina04
vitrina02
[root@vitrina01 ~]# ping vitrina03
PING vitrina03 (192.168.11.24) 56(84) bytes of data.
64 bytes from vitrina03 (192.168.11.24): icmp_seq=1 ttl=64 time=0.198 ms
64 bytes from vitrina03 (192.168.11.24): icmp_seq=2 ttl=64 time=0.180 ms
^C
--- vitrina03 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.180/0.189/0.198/0.009 ms
[root@vitrina01 ~]# ping vitrina04
PING vitrina04 (192.168.11.25) 56(84) bytes of data.
64 bytes from vitrina04 (192.168.11.25): icmp_seq=1 ttl=64 time=0.250 ms
64 bytes from vitrina04 (192.168.11.25): icmp_seq=2 ttl=64 time=0.184 ms
^C
--- vitrina04 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 0.184/0.217/0.250/0.033 ms
[root@vitrina01 ~]# ping vitrina02
PING vitrina02 (192.168.11.49) 56(84) bytes of data.
64 bytes from vitrina02 (192.168.11.49): icmp_seq=1 ttl=64 time=0.268 ms
64 bytes from vitrina02 (192.168.11.49): icmp_seq=2 ttl=64 time=0.242 ms
^C
--- vitrina02 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.242/0.255/0.268/0.013 ms
[root@vitrina01 ~]#

RunningJon avatar RunningJon commented on July 29, 2024

10TB with only 15 segments on 3 nodes will take a very long time to complete.

The error was on vitrina03:
./generate_data.sh: line 32: /home/gpadmin/dsdgen: No such file or directory

But you said it is in place, correct? Please run: ls -la /home/gpadmin/dsdgen

skazka064 avatar skazka064 commented on July 29, 2024

The file is in a different location.

Sincerely, Sergey Berezin.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

[root@vitrina01 tools]# ls -la /home/gpadmin/dsdgen
ls: cannot access /home/gpadmin/dsdgen: No such file or directory
[root@vitrina01 tools]# pwd
/pivotalguru/TPC-DS/00_compile_tpcds/tools

RunningJon avatar RunningJon commented on July 29, 2024

dsdgen is the program that is compiled on vitrina01 (master) in /pivotalguru/TPC-DS/00_compile_tpcds/tools and copied to every segment host (vitrina02, vitrina03, vitrina04). The error message said that the file didn't exist on vitrina03.
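
The copy step can be pictured roughly as the loop below. This is an illustrative sketch, not the code in rollout.sh: `copy_to_hosts` and the `COPY_CMD` override are hypothetical, and the paths in the usage comment are the ones mentioned in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: copy one file into the same directory on every host
# listed in a hosts file (one hostname per line).
# COPY_CMD defaults to scp; set COPY_CMD=echo for a dry run that only
# prints what would be copied.
copy_to_hosts() {
  local src=$1 hosts_file=$2 dest_dir=$3 host
  # Read hosts on fd 3 so the copy command cannot swallow the host list.
  while IFS= read -r host <&3; do
    [ -n "$host" ] || continue            # skip blank lines
    ${COPY_CMD:-scp} "$src" "$host:$dest_dir/" || return 1
  done 3< "$hosts_file"
}

# Usage on the master as gpadmin, from /pivotalguru/TPC-DS:
#   COPY_CMD=echo copy_to_hosts 00_compile_tpcds/tools/dsdgen segment_hosts.txt /home/gpadmin
#   copy_to_hosts 00_compile_tpcds/tools/dsdgen segment_hosts.txt /home/gpadmin
```

The function stops at the first host where the copy fails, so a broken host is noticed immediately instead of surfacing later during data generation.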

skazka064 avatar skazka064 commented on July 29, 2024

Sincerely, Sergey Berezin.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
[gpadmin@vitrina03 ~]$ ls -la /home/gpadmin/dsdgen
ls: cannot access /home/gpadmin/dsdgen: No such file or directory
[gpadmin@vitrina03 ~]$ pwd
/home/gpadmin
[gpadmin@vitrina03 ~]$

RunningJon avatar RunningJon commented on July 29, 2024

https://github.com/RunningJon/TPC-DS/blob/master/00_compile_tpcds/rollout.sh#L33

Are you able to ssh between the nodes as gpadmin? Can you scp files from the master node to every segment host?
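
To test both things at once (non-interactive ssh and the presence of the binary), something like the following could be run on the master as gpadmin. It is a sketch under assumptions: `check_binary_on_hosts` and the `SSH_CMD` override are hypothetical names, while `-o BatchMode=yes` is a standard ssh option that makes ssh fail instead of prompting for a password.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: for every host in a hosts file, report whether a
# file exists and is executable there.
# SSH_CMD defaults to "ssh -o BatchMode=yes", so missing keys show up as a
# failure rather than a password prompt.
check_binary_on_hosts() {
  local path=$1 hosts_file=$2 host
  while IFS= read -r host <&3; do
    [ -n "$host" ] || continue            # skip blank lines
    if ${SSH_CMD:-ssh -o BatchMode=yes} "$host" "test -x '$path'" </dev/null; then
      echo "$host ok"
    else
      echo "$host MISSING-OR-UNREACHABLE"
    fi
  done 3< "$hosts_file"
}

# Usage on the master as gpadmin:
#   check_binary_on_hosts /home/gpadmin/dsdgen segment_hosts.txt
```

Running it as gpadmin matters, because a successful scp as root says nothing about gpadmin's keys.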

skazka064 avatar skazka064 commented on July 29, 2024

Yes, there is a connection.
Sincerely, Sergey Berezin.

|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

[root@vitrina01 TPC-DS-master]# scp tpcds.log vitrina02:/home/gpadmin/
tpcds.log 100% 197KB 33.4MB/s 00:00
[root@vitrina01 TPC-DS-master]#

RunningJon avatar RunningJon commented on July 29, 2024

I would set RUN_COMPILE_TPCDS to "true" and RUN_GEN_DATA to "true" in tpcds_variables.sh and then run the benchmark again. When it ran the compile the first time, it did not copy the binary to the segment hosts.
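
Editing the file by hand is fine; for a scripted change, a minimal sketch could look like this. It assumes tpcds_variables.sh stores each setting as a `NAME="value"` line, optionally prefixed with `export`, and `set_flag` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set NAME="value" in a shell-style variables file.
# Assumes one NAME="value" assignment per line, optionally prefixed
# with "export ". Lines for other variables are left untouched.
set_flag() {
  local file=$1 name=$2 value=$3 tmp
  tmp=$(mktemp) || return 1
  if sed "s/^\(export \)\{0,1\}${name}=.*/\1${name}=\"${value}\"/" "$file" > "$tmp"; then
    cat "$tmp" > "$file"    # overwrite in place, keeping the file's permissions
  fi
  rm -f "$tmp"
}

# Usage on the master:
#   set_flag tpcds_variables.sh RUN_COMPILE_TPCDS true
#   set_flag tpcds_variables.sh RUN_GEN_DATA true
```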

skazka064 avatar skazka064 commented on July 29, 2024

Now the log is showing:
copy tpcds binaries to vitrina03:/home/gpadmin
scp: /home/gpadmin//dsdgen: Text file busy

RunningJon avatar RunningJon commented on July 29, 2024

You need to kill all of the dsdgen processes on the segment hosts and make sure the benchmark isn't running.
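
"Text file busy" is what Linux reports when something tries to overwrite an executable that is still running, which is why the old dsdgen processes have to go first. A small sketch for finding them follows; `pids_named` is a hypothetical helper that works on `ps -eo pid,comm` output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `ps -eo pid,comm` output on stdin and print
# the PIDs whose command name is exactly the given name.
pids_named() {
  awk -v name="$1" '$2 == name { print $1 }'
}

# Usage on each segment host, as gpadmin:
#   ps -eo pid,comm | pids_named dsdgen     # list dsdgen processes still running
#   pkill -x dsdgen                         # stop them (exact name match)
```

Once the list is empty on every segment host, the copy should no longer fail with this error.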

skazka064 avatar skazka064 commented on July 29, 2024

Hello.
Everything seems to have started.
Thank you.
Also, where is the number of threads configured?

Is there a variable I can change for this message, if that is possible?
"Starting analyze with 5 workers..."

Sincerely, Sergey Berezin.

RunningJon avatar RunningJon commented on July 29, 2024

It is running analyzedb, which started the 5 workers. The command does allow you to run more workers, but that isn't exposed in this benchmark utility.

skazka064 avatar skazka064 commented on July 29, 2024

Could you tell me how to run more or fewer than five worker processes?

Sincerely, Sergey Berezin.

RunningJon avatar RunningJon commented on July 29, 2024

For analyze? You could hard-code it in the script, but it really won't help much. That isn't what is taking a long time.
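
For reference, analyzedb takes its parallel level through the `-p` option, documented as an integer from 1 to 10 with a default of 5. The helper below only builds such a command line; `analyzedb_cmd` is hypothetical, the database name `adb` is taken from the gpfdist log above, and the exact analyzedb call inside the benchmark scripts may use other options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an analyzedb command line for a database with
# a chosen parallel level. analyzedb documents -p as 1 to 10 (default 5);
# -a suppresses the confirmation prompt.
analyzedb_cmd() {
  local db=$1 workers=$2
  case $workers in
    ''|*[!0-9]*) echo "workers must be a number" >&2; return 1 ;;
  esac
  if [ "$workers" -lt 1 ] || [ "$workers" -gt 10 ]; then
    echo "workers must be between 1 and 10" >&2
    return 1
  fi
  echo "analyzedb -d $db -a -p $workers"
}

# Usage:
#   analyzedb_cmd adb 8      # prints the command; run it yourself once it looks right
```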
