Comments (13)

dongreenberg commented on May 18, 2024

Hi! Thank you for the detailed bug report. I actually noticed some new breakage this evening relating to the FastAPI 0.100.0 release, but given it's working on GCP I'm not sure that's it. Trying to reproduce on AWS now.

from runhouse.

shyamsn97 commented on May 18, 2024

Awesome! Thanks so much!

dongreenberg commented on May 18, 2024

I think I see the issue. Your script isn't running inside an if __name__ == "__main__": block, so when the module is imported on the cluster to run the function, the code that brings up the cluster runs again on the cluster. The corrected code would be:

import runhouse as rh

def num_cpus():
    import multiprocessing
    return f"Num cpus: {multiprocessing.cpu_count()}"

if __name__ == "__main__":
    num_cpus()

    cluster = rh.ondemand_cluster(
        name="runhouse",
        instance_type="CPU:8",
        provider="aws",  # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
    )
    cluster.up_if_not()
    num_cpus_cluster = rh.function(name="num_cpus_cluster", fn=num_cpus).to(system=cluster, reqs=["./"])
    num_cpus_cluster()

We're actually introducing new logic now that will make it impossible for a cluster to start itself again because that's just silly, so your code may actually work as is in the next release, but using the if __name__ block is best practice anyway.
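
To see why the guard matters, here is a minimal, self-contained sketch with no Runhouse involved. It writes a throwaway module to a temp directory and imports it, roughly the way a remote worker imports your script to find the function. The names launch_cluster, LAUNCHES, and the module names are made up for illustration; launch_cluster only records that it was called and does not touch any cloud provider.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module sources. `launch_cluster` stands in for any expensive
# top-level side effect, such as bringing up a cluster.
UNGUARDED = textwrap.dedent('''
    LAUNCHES = []

    def launch_cluster():
        LAUNCHES.append("launched")

    launch_cluster()  # runs on every import
''')

GUARDED = textwrap.dedent('''
    LAUNCHES = []

    def launch_cluster():
        LAUNCHES.append("launched")

    if __name__ == "__main__":
        launch_cluster()  # runs only when executed as a script
''')


def launches_on_import(source: str, module_name: str) -> int:
    """Write `source` to a temp module, import it, and count the launches."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(module_name)
            return len(module.LAUNCHES)
        finally:
            # Clean up so repeated calls start from a fresh state.
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)


if __name__ == "__main__":
    print("unguarded:", launches_on_import(UNGUARDED, "demo_unguarded"))
    print("guarded:", launches_on_import(GUARDED, "demo_guarded"))
```

The unguarded module triggers the launch as a side effect of being imported, while the guarded one does not, which matches the second "Creating a new cluster" seen in the remote logs.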

shyamsn97 commented on May 18, 2024

That worked! Thanks! This is good to close now :)

dongreenberg commented on May 18, 2024

This is fixed. I've tested with a few of the new versions of fastapi and pydantic on main and something changed about our code, their code, or both such that they now work out of the box. I've relaxed the requirement on main, which should be in the release today.
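
If you want to check which versions actually ended up in an environment (locally or on the cluster), a small standard-library sketch like the following should do. The helper name installed_versions is just for illustration; it reports None for anything that isn't installed.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["runhouse", "fastapi", "pydantic"]).items():
        print(f"{name}: {version or 'not installed'}")
```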

shyamsn97 commented on May 18, 2024

After configuring gcp correctly, looks like this works when specifying "gcp" in provider:

INFO | 2023-07-10 22:37:35,284 | Loaded Runhouse config from /home/shyam/.rh/config.yaml
INFO | 2023-07-10 22:37:44,473 | Running command on cpu-cluster: python3 -c "import numpy; print(numpy.__version__)"
1.21.6
INFO | 2023-07-10 22:37:46,311 | Connected (version 2.0, client OpenSSH_7.9p1)
INFO | 2023-07-10 22:37:46,765 | Authentication (publickey) successful!
INFO | 2023-07-10 22:37:46,767 | Checking server cpu-cluster
INFO | 2023-07-10 22:37:48,945 | Server cpu-cluster is up.
None

dongreenberg commented on May 18, 2024

I was able to reproduce this, and it is indeed an issue with the latest fastapi release. You can confirm by sshing into the server, viewing the server logs at ~/.rh/cluster_server_cpu-cluster.log, and checking that the latest exception is a pydantic issue (the latest fastapi release was a massive update to Pydantic v2). I'm pushing a fix to main and releasing presently.
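
Once you're on the server, something like this rough sketch can pull the most recent pydantic-related lines out of that log instead of scrolling through it. The helper name last_lines_mentioning is made up here, and the keyword match is a plain case-insensitive substring search, so treat the output as a hint rather than a diagnosis.

```python
from pathlib import Path


def last_lines_mentioning(log_path, keyword, max_lines=5):
    """Return up to `max_lines` of the most recent log lines containing `keyword`."""
    path = Path(log_path).expanduser()
    if not path.exists():
        return []
    lines = path.read_text(errors="replace").splitlines()
    hits = [line for line in lines if keyword.lower() in line.lower()]
    return hits[-max_lines:]


if __name__ == "__main__":
    # Log path as mentioned in the comment; adjust the cluster name as needed.
    for line in last_lines_mentioning("~/.rh/cluster_server_cpu-cluster.log", "pydantic"):
        print(line)
```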

dongreenberg commented on May 18, 2024

Just pushed https://github.com/run-house/runhouse/releases/tag/v0.0.8, which should fix this.

dongreenberg commented on May 18, 2024

@shyamsn97 confirming that the fix worked?

shyamsn97 commented on May 18, 2024

Yep, seems like that worked, and now I'm not getting an error on check_server. However, the function hangs indefinitely on AWS.

code from example:

import runhouse as rh

def num_cpus():
    import multiprocessing
    return f"Num cpus: {multiprocessing.cpu_count()}"

num_cpus()

# Using a Cloud provider
cluster = rh.ondemand_cluster(
              name="runhouse",
              instance_type="CPU:8",
              provider="aws",      # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
          )

cluster.up_if_not()

num_cpus_cluster = rh.function(name="num_cpus_cluster", fn=num_cpus).to(system=cluster, reqs=["./"])

num_cpus_cluster()

Output:

(py310) shyam@shyam-ThinkPad-P53:~/Code/test-runhouse$ python test.py 
INFO | 2023-07-11 11:50:06,755 | Loaded Runhouse config from /home/shyam/.rh/config.yaml
INFO | 2023-07-11 11:50:07,937 | Found credentials in shared credentials file: ~/.aws/credentials
I 07-11 11:50:08 optimizer.py:636] == Optimizer ==
I 07-11 11:50:08 optimizer.py:647] Target: minimizing cost
I 07-11 11:50:08 optimizer.py:659] Estimated cost: $0.4 / hour
I 07-11 11:50:08 optimizer.py:659] 
I 07-11 11:50:08 optimizer.py:732] Considered resources (1 node):
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781]  CLOUD   INSTANCE      vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE   COST ($)   CHOSEN   
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781]  AWS     m6i.2xlarge   8       32        -              us-east-1     0.38          ✔     
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781] 
I 07-11 11:50:08 cloud_vm_ray_backend.py:3495] Creating a new cluster: "runhouse" [1x AWS(m6i.2xlarge)].
I 07-11 11:50:08 cloud_vm_ray_backend.py:3495] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 07-11 11:50:08 cloud_vm_ray_backend.py:1215] To view detailed progress: tail -n100 -f /home/shyam/sky_logs/sky-2023-07-11-11-50-07-852797/provision.log
I 07-11 11:50:09 cloud_vm_ray_backend.py:1539] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1f)
I 07-11 11:51:42 log_utils.py:89] Head node is up.
I 07-11 11:53:34 cloud_vm_ray_backend.py:1352] Successfully provisioned or found existing VM.
I 07-11 11:53:41 cloud_vm_ray_backend.py:3544] Processing file mounts.
I 07-11 11:53:41 cloud_vm_ray_backend.py:3575] To view detailed progress: tail -n100 -f ~/sky_logs/sky-2023-07-11-11-50-07-852797/file_mounts.log
I 07-11 11:53:41 backend_utils.py:1254] Syncing (to 1 node): ~/.rh -> ~/.rh
I 07-11 11:53:44 cloud_vm_ray_backend.py:2808] Run commands not specified or empty.
Clusters
NAME         LAUNCHED        RESOURCES              STATUS  AUTOSTOP  COMMAND  
runhouse     a few secs ago  1x AWS(m6i.2xlarge)    UP      (down)    test.py  
cpu-cluster  20 mins ago     1x GCP(n2-standard-8)  UP      (down)    test.py  

INFO | 2023-07-11 11:53:51,543 | Restarting HTTP server on runhouse.
INFO | 2023-07-11 11:53:51,543 | Running command on runhouse: pip install runhouse==0.0.8
Collecting runhouse==0.0.8
  Downloading runhouse-0.0.8-py3-none-any.whl (142 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.7/142.7 kB 3.5 MB/s eta 0:00:00
Collecting sshfs
  Downloading sshfs-2023.4.1-py3-none-any.whl (15 kB)
Collecting typer
  Downloading typer-0.9.0-py3-none-any.whl (45 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.9/45.9 kB 2.6 MB/s eta 0:00:00
Collecting fsspec
  Downloading fsspec-2023.6.0-py3-none-any.whl (163 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 7.9 MB/s eta 0:00:00
Collecting uvicorn
  Downloading uvicorn-0.22.0-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 4.4 MB/s eta 0:00:00
Collecting sshtunnel>=0.3.0
  Downloading sshtunnel-0.4.0-py2.py3-none-any.whl (24 kB)
Requirement already satisfied: pyOpenSSL>=21.1.0 in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (22.1.0)
Collecting fastapi<=0.99.0
  Downloading fastapi-0.99.0-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.2/58.2 kB 321.7 kB/s eta 0:00:00
Collecting pyarrow
  Downloading pyarrow-12.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.9/38.9 MB 13.5 MB/s eta 0:00:00
Requirement already satisfied: rich in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (13.4.2)
Requirement already satisfied: wheel in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (0.38.4)
Requirement already satisfied: skypilot==0.3.1 in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (0.3.1)
Requirement already satisfied: filelock>=3.6.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.12.2)
Requirement already satisfied: pulp in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.7.0)
Requirement already satisfied: click<=8.0.4,>=7.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (8.0.4)
Requirement already satisfied: pycryptodome==3.12.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.12.0)
Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.1)
Requirement already satisfied: pandas in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.0.3)
Requirement already satisfied: grpcio<=1.51.3,>=1.42.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.51.3)
Requirement already satisfied: awscli in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.29.2)
Requirement already satisfied: oauth2client in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.1.3)
Requirement already satisfied: pendulum in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.1.2)
Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (5.9.5)
Requirement already satisfied: tabulate in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (0.9.0)
Requirement already satisfied: cryptography in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (38.0.3)
Requirement already satisfied: colorama<0.4.5 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (0.4.4)
Requirement already satisfied: boto3 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.28.2)
Requirement already satisfied: ray[default]<=2.4.0,>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.4.0)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.23.4)
Requirement already satisfied: PrettyTable>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.8.0)
Requirement already satisfied: packaging in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (23.1)
Requirement already satisfied: jsonschema in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.18.0)
Requirement already satisfied: jinja2>=3.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.1.2)
Requirement already satisfied: typing-extensions>=4.5.0 in /opt/conda/lib/python3.10/site-packages (from fastapi<=0.99.0->runhouse==0.0.8) (4.7.1)
Collecting starlette<0.28.0,>=0.27.0
  Downloading starlette-0.27.0-py3-none-any.whl (66 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.0/67.0 kB 588.7 kB/s eta 0:00:00
Collecting pydantic!=1.8,!=1.8.1,<2.0.0,>=1.7.4
  Downloading pydantic-1.10.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 16.3 MB/s eta 0:00:00
Collecting paramiko>=2.7.2
  Downloading paramiko-3.2.0-py3-none-any.whl (224 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.2/224.2 kB 30.5 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.16.6 in /opt/conda/lib/python3.10/site-packages (from pyarrow->runhouse==0.0.8) (1.25.1)
Requirement already satisfied: markdown-it-py>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from rich->runhouse==0.0.8) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/conda/lib/python3.10/site-packages (from rich->runhouse==0.0.8) (2.15.1)
Collecting asyncssh<3,>=2.11.0
  Downloading asyncssh-2.13.2-py3-none-any.whl (349 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 349.3/349.3 kB 2.8 MB/s eta 0:00:00
Collecting h11>=0.8
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 11.9 MB/s eta 0:00:00
Requirement already satisfied: cffi>=1.12 in /opt/conda/lib/python3.10/site-packages (from cryptography->skypilot==0.3.1->runhouse==0.0.8) (1.15.1)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2>=3.0->skypilot==0.3.1->runhouse==0.0.8) (2.1.3)
Requirement already satisfied: mdurl~=0.1 in /opt/conda/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->runhouse==0.0.8) (0.1.2)
Collecting pynacl>=1.5
  Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 856.7/856.7 kB 11.9 MB/s eta 0:00:00
Collecting bcrypt>=3.2
  Downloading bcrypt-4.0.1-cp36-abi3-manylinux_2_28_x86_64.whl (593 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 593.7/593.7 kB 67.6 MB/s eta 0:00:00
Requirement already satisfied: wcwidth in /opt/conda/lib/python3.10/site-packages (from PrettyTable>=2.0.0->skypilot==0.3.1->runhouse==0.0.8) (0.2.6)
Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.28.1)
Requirement already satisfied: aiosignal in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.3.1)
Requirement already satisfied: virtualenv<20.21.1,>=20.0.24 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (20.21.0)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.0.5)
Requirement already satisfied: frozenlist in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.3.3)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (5.4.1)
Requirement already satisfied: attrs in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (23.1.0)
Requirement already satisfied: aiohttp>=3.7 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (3.8.4)
Requirement already satisfied: aiohttp-cors in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.7.0)
Requirement already satisfied: opencensus in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.11.2)
Requirement already satisfied: smart-open in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (6.3.0)
Requirement already satisfied: prometheus-client>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.17.1)
Requirement already satisfied: colorful in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.5.5)
Requirement already satisfied: gpustat>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.1)
Requirement already satisfied: py-spy>=0.2.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.3.14)
Collecting anyio<5,>=3.4.0
  Downloading anyio-3.7.1-py3-none-any.whl (80 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.9/80.9 kB 16.7 MB/s eta 0:00:00
Requirement already satisfied: s3transfer<0.7.0,>=0.6.0 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (0.6.1)
Requirement already satisfied: docutils<0.17,>=0.10 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (0.16)
Requirement already satisfied: rsa<4.8,>=3.1.2 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (4.7.2)
Requirement already satisfied: botocore==1.31.2 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (1.31.2)
Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (1.0.1)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (1.26.11)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (2.8.2)
Requirement already satisfied: referencing>=0.28.4 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (0.29.1)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (2023.6.1)
Requirement already satisfied: rpds-py>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (0.8.10)
Requirement already satisfied: six>=1.6.1 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (1.16.0)
Requirement already satisfied: pyasn1>=0.1.7 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.5.0)
Requirement already satisfied: httplib2>=0.9.1 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.22.0)
Requirement already satisfied: pyasn1-modules>=0.0.5 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.3.0)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pandas->skypilot==0.3.1->runhouse==0.0.8) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in /opt/conda/lib/python3.10/site-packages (from pandas->skypilot==0.3.1->runhouse==0.0.8) (2023.3)
Requirement already satisfied: pytzdata>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pendulum->skypilot==0.3.1->runhouse==0.0.8) (2020.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (4.0.2)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.9.2)
Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.1.1)
Collecting exceptiongroup
  Downloading exceptiongroup-1.1.2-py3-none-any.whl (14 kB)
Collecting sniffio>=1.1
  Downloading sniffio-1.3.0-py3-none-any.whl (10 kB)
Requirement already satisfied: idna>=2.8 in /opt/conda/lib/python3.10/site-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi<=0.99.0->runhouse==0.0.8) (3.4)
Requirement already satisfied: pycparser in /opt/conda/lib/python3.10/site-packages (from cffi>=1.12->cryptography->skypilot==0.3.1->runhouse==0.0.8) (2.21)
Requirement already satisfied: nvidia-ml-py>=11.450.129 in /opt/conda/lib/python3.10/site-packages (from gpustat>=1.0.0->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (12.535.77)
Requirement already satisfied: blessed>=1.17.1 in /opt/conda/lib/python3.10/site-packages (from gpustat>=1.0.0->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.20.0)
Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /opt/conda/lib/python3.10/site-packages (from httplib2>=0.9.1->oauth2client->skypilot==0.3.1->runhouse==0.0.8) (3.1.0)
Requirement already satisfied: distlib<1,>=0.3.6 in /opt/conda/lib/python3.10/site-packages (from virtualenv<20.21.1,>=20.0.24->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.3.6)
Requirement already satisfied: platformdirs<4,>=2.4 in /opt/conda/lib/python3.10/site-packages (from virtualenv<20.21.1,>=20.0.24->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (3.8.1)
Requirement already satisfied: opencensus-context>=0.1.3 in /opt/conda/lib/python3.10/site-packages (from opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.1.3)
Requirement already satisfied: google-api-core<3.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.11.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2022.9.24)
Requirement already satisfied: google-auth<3.0.dev0,>=2.14.1 in /opt/conda/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.22.0)
Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /opt/conda/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.59.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (5.3.1)
Installing collected packages: typer, sniffio, pydantic, pyarrow, h11, fsspec, exceptiongroup, bcrypt, uvicorn, pynacl, anyio, starlette, paramiko, asyncssh, sshtunnel, sshfs, fastapi, runhouse
  Attempting uninstall: pydantic
    Found existing installation: pydantic 2.0.2
    Uninstalling pydantic-2.0.2:
      Successfully uninstalled pydantic-2.0.2
Successfully installed anyio-3.7.1 asyncssh-2.13.2 bcrypt-4.0.1 exceptiongroup-1.1.2 fastapi-0.99.0 fsspec-2023.6.0 h11-0.14.0 paramiko-3.2.0 pyarrow-12.0.1 pydantic-1.10.11 pynacl-1.5.0 runhouse-0.0.8 sniffio-1.3.0 sshfs-2023.4.1 sshtunnel-0.4.0 starlette-0.27.0 typer-0.9.0 uvicorn-0.22.0
INFO | 2023-07-11 11:53:58,889 | Running command on runhouse: pkill -f "python -m runhouse.servers.http.http_server"
INFO | 2023-07-11 11:54:00,015 | Running command on runhouse: pkill -f ".*ray.*6379.*"
INFO | 2023-07-11 11:54:01,244 | Running command on runhouse: ray start --head --port 6379 --autoscaling-config=~/ray_bootstrap_config.yaml
2023-07-11 18:54:03,331	INFO usage_lib.py:398 -- Usage stats collection is enabled by default without user confirmation because this terminal is detected to be non-interactive. To disable this, add `--disable-usage-stats` to the command that starts the cluster, or run the following command: `ray disable-usage-stats` before starting the cluster. See https://docs.ray.io/en/master/cluster/usage-stats.html for more details.
2023-07-11 18:54:03,331	INFO scripts.py:710 -- Local node IP: 172.31.46.221
2023-07-11 18:54:05,396	SUCC scripts.py:747 -- --------------------
2023-07-11 18:54:05,397	SUCC scripts.py:748 -- Ray runtime started.
2023-07-11 18:54:05,397	SUCC scripts.py:749 -- --------------------
2023-07-11 18:54:05,397	INFO scripts.py:751 -- Next steps
2023-07-11 18:54:05,397	INFO scripts.py:754 -- To add another node to this Ray cluster, run
2023-07-11 18:54:05,397	INFO scripts.py:757 --   ray start --address='172.31.46.221:6379'
2023-07-11 18:54:05,397	INFO scripts.py:766 -- To connect to this Ray cluster:
2023-07-11 18:54:05,397	INFO scripts.py:768 -- import ray
2023-07-11 18:54:05,397	INFO scripts.py:769 -- ray.init()
2023-07-11 18:54:05,397	INFO scripts.py:781 -- To submit a Ray job using the Ray Jobs CLI:
2023-07-11 18:54:05,397	INFO scripts.py:782 --   RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
2023-07-11 18:54:05,397	INFO scripts.py:791 -- See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html 
2023-07-11 18:54:05,397	INFO scripts.py:795 -- for more information on submitting Ray jobs to the Ray cluster.
2023-07-11 18:54:05,397	INFO scripts.py:800 -- To terminate the Ray runtime, run
2023-07-11 18:54:05,397	INFO scripts.py:801 --   ray stop
2023-07-11 18:54:05,397	INFO scripts.py:804 -- To view the status of the cluster, use
2023-07-11 18:54:05,397	INFO scripts.py:805 --   ray status
2023-07-11 18:54:05,397	INFO scripts.py:809 -- To monitor and debug Ray, view the dashboard at 
2023-07-11 18:54:05,397	INFO scripts.py:810 --   127.0.0.1:8265
2023-07-11 18:54:05,397	INFO scripts.py:817 -- If connection to the dashboard fails, check your firewall settings and network configuration.
INFO | 2023-07-11 11:54:05,750 | Running command on runhouse: screen -dm bash -c 'python -m runhouse.servers.http.http_server |& tee -a ~/.rh/cluster_server_runhouse.log 2>&1'
/home/shyam/miniconda3/envs/py310/lib/python3.10/site-packages/runhouse/rns/function.py:113: UserWarning: ``reqs`` and ``setup_cmds`` arguments has been deprecated. Please use ``env`` instead.
  warnings.warn(
INFO | 2023-07-11 11:54:12,193 | Setting up Function on cluster.
INFO | 2023-07-11 11:54:12,564 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:13,099 | Authentication (publickey) successful!
INFO | 2023-07-11 11:54:13,108 | Checking server runhouse
INFO | 2023-07-11 11:54:14,768 | Server runhouse is up.
INFO | 2023-07-11 11:54:14,914 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:14,928 | Authentication (publickey) failed.
INFO | 2023-07-11 11:54:14,934 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:14,949 | Authentication (publickey) successful!
2023-07-11 11:54:14,949| ERROR   | Problem setting SSH Forwarder up: Couldn't open tunnel :50052 <> 127.0.0.1:50052 might be in use or destination not reachable
ERROR | 2023-07-11 11:54:14,949 | Problem setting SSH Forwarder up: Couldn't open tunnel :50052 <> 127.0.0.1:50052 might be in use or destination not reachable
INFO | 2023-07-11 11:54:15,083 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:15,097 | Authentication (publickey) failed.
INFO | 2023-07-11 11:54:15,103 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:15,118 | Authentication (publickey) successful!
INFO | 2023-07-11 11:54:15,119 | Checking server local-cluster
INFO | 2023-07-11 11:54:16,445 | Server local-cluster is up.
INFO | 2023-07-11 11:54:16,445 | Running command on local-cluster: ray start --head
INFO | 2023-07-11 11:54:17,903 | Copying folder from file:///home/shyam/Code/test-runhouse to: runhouse
INFO | 2023-07-11 11:54:19,653 | Installing packages on cluster runhouse: ['Package: test-runhouse']
INFO | 2023-07-11 11:54:20,726 | Function setup complete.
INFO | 2023-07-11 11:54:20,731 | Running num_cpus_cluster via HTTP
INFO | 2023-07-11 11:54:21,326 | Submitted remote call to cluster for num_cpus_cluster_20230711_115420
:job_id:01000000
:task_name:get_fn_from_pointers
:job_id:01000000
INFO | 2023-07-11 18:54:21,732 | Loaded Runhouse config from /home/ubuntu/.rh/config.yaml
:task_name:get_fn_from_pointers
INFO | 2023-07-11 18:54:22,423 | Appending /home/ubuntu/test-runhouse to sys.path
INFO | 2023-07-11 18:54:22,424 | Importing module test
SkyPilot collects usage data to improve its services. `setup` and `run` commands are not collected to ensure privacy.
Usage logging can be disabled by setting the environment variable SKYPILOT_DISABLE_USAGE_COLLECTION=1.
INFO | 2023-07-11 18:54:22,944 | Found credentials in shared credentials file: ~/.aws/credentials
I 07-11 18:54:23 aws_catalog.py:120] Fetching availability zones mapping for AWS...
I 07-11 18:54:25 optimizer.py:636] == Optimizer ==
I 07-11 18:54:25 optimizer.py:647] Target: minimizing cost
I 07-11 18:54:25 optimizer.py:659] Estimated cost: $0.4 / hour
I 07-11 18:54:25 optimizer.py:659] 
I 07-11 18:54:25 optimizer.py:732] Considered resources (1 node):
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781]  CLOUD   INSTANCE      vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE   COST ($)   CHOSEN   
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781]  AWS     m6i.2xlarge   8       32        -              us-east-1     0.38          ✔     
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781] 
I 07-11 18:54:25 cloud_vm_ray_backend.py:3495] Creating a new cluster: "runhouse" [1x AWS(m6i.2xlarge)].
I 07-11 18:54:25 cloud_vm_ray_backend.py:3495] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 07-11 18:54:25 cloud_vm_ray_backend.py:1215] To view detailed progress: tail -n100 -f /home/ubuntu/sky_logs/sky-2023-07-11-18-54-22-466640/provision.log
I 07-11 18:54:26 cloud_vm_ray_backend.py:1539] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1f)

shyamsn97 commented on May 18, 2024

Weird, because it tries to start the cluster twice? Unless I'm misunderstanding the process there.

shyamsn97 commented on May 18, 2024

I see! Will try that now!

tullie commented on May 18, 2024

@dongreenberg I'm running into this issue but require FastAPI > 0.1 and Pydantic 2.x. How easy would it be to support these with the latest version of Runhouse (which seems to fix this issue)?
