Comments (13)
Hi! Thank you for the detailed bug report. I actually noticed some new breakage this evening relating to the FastAPI 0.100.0 release, but given it's working on GCP I'm not sure that's it. Trying to reproduce on AWS now.
from runhouse.
Awesome! Thanks so much!
from runhouse.
I think I see the issue. Your script isn't running inside an `if __name__ == "__main__":` block, so when the module is imported on the cluster to run the function, the code that brings up the cluster runs again on the cluster. The corrected code would be:
import runhouse as rh

def num_cpus():
    import multiprocessing
    return f"Num cpus: {multiprocessing.cpu_count()}"

if __name__ == "__main__":
    num_cpus()
    cluster = rh.ondemand_cluster(
        name="runhouse",
        instance_type="CPU:8",
        provider="aws",  # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
    )
    cluster.up_if_not()
    num_cpus_cluster = rh.function(name="num_cpus_cluster", fn=num_cpus).to(system=cluster, reqs=["./"])
    num_cpus_cluster()
We're actually introducing new logic now that will make it impossible for a cluster to start itself again because that's just silly, so your code may actually work as is in the next release, but using the `if __name__` block is best practice anyway.
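To see why the guard matters without touching a real cluster, here is a minimal sketch that imports two throwaway modules the way a worker would import a user script. Everything in it is hypothetical: the `launched` list stands in for the cluster launch, and `import_from_source` is a helper written for this example, not part of runhouse.

```python
import importlib.util
import os
import tempfile

# Stand-in for the original test.py: the "launch" sits at module level.
UNGUARDED = '''
launched = []

def num_cpus():
    return "stub"

launched.append("cluster up")  # module level: runs on every import
'''

# Stand-in for the corrected script: the "launch" sits behind the guard.
GUARDED = '''
launched = []

def num_cpus():
    return "stub"

if __name__ == "__main__":
    launched.append("cluster up")  # runs only when executed as a script
'''


def import_from_source(name, source):
    """Write source to a temp file and import it under the given module name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, f"{name}.py")
        with open(path, "w") as f:
            f.write(source)
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # __name__ is `name`, not "__main__"
        return module


unguarded = import_from_source("unguarded_example", UNGUARDED)
guarded = import_from_source("guarded_example", GUARDED)

print(unguarded.launched)  # the module-level "launch" ran during import
print(guarded.launched)    # the guarded "launch" did not run
```

In both cases `num_cpus` is still importable, which is all the remote side needs. Only the unguarded module repeats its side effects on import.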
from runhouse.
That worked! Thanks! This is good to close now :)
from runhouse.
This is fixed. I've tested with a few of the new versions of fastapi and pydantic on main and something changed about our code, their code, or both such that they now work out of the box. I've relaxed the requirement on main, which should be in the release today.
from runhouse.
After configuring gcp correctly, looks like this works when specifying "gcp" in provider:
INFO | 2023-07-10 22:37:35,284 | Loaded Runhouse config from /home/shyam/.rh/config.yaml
INFO | 2023-07-10 22:37:44,473 | Running command on cpu-cluster: python3 -c "import numpy; print(numpy.__version__)"
1.21.6
INFO | 2023-07-10 22:37:46,311 | Connected (version 2.0, client OpenSSH_7.9p1)
INFO | 2023-07-10 22:37:46,765 | Authentication (publickey) successful!
INFO | 2023-07-10 22:37:46,767 | Checking server cpu-cluster
INFO | 2023-07-10 22:37:48,945 | Server cpu-cluster is up.
None
from runhouse.
I was able to reproduce this and it is indeed an issue with the latest fastapi release. You can confirm by sshing into the server, viewing the server logs at `~/.rh/cluster_server_cpu-cluster.log`, and checking that the latest exception is a pydantic issue (the latest fastapi release was a massive update to Pydantic v2). I'm pushing a fix to main and releasing presently.
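Alongside reading the server log, a quick way to see which versions actually ended up in the cluster's Python environment is to query the installed package metadata. This is only a sketch using the standard library. The helper names `installed_versions` and `major` are made up for this example, and it assumes you run it with the same interpreter the server uses.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("fastapi", "pydantic")):
    """Map each package name to its installed version string, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def major(version_string):
    """Return the leading integer of a version such as '2.0.2', or None if unknown."""
    if version_string is None:
        return None
    try:
        return int(version_string.split(".")[0])
    except ValueError:
        return None


versions = installed_versions()
print(versions)

# Pydantic 2.x in the environment is the situation described above.
if (major(versions["pydantic"]) or 0) >= 2:
    print("pydantic 2.x is installed; check it against the fastapi version in use")
```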
from runhouse.
Just pushed https://github.com/run-house/runhouse/releases/tag/v0.0.8, which should fix this.
from runhouse.
@shyamsn97 confirming that the fix worked?
from runhouse.
Yep, seems like that worked, and now I'm not getting an error on check_server. However, the function call hangs indefinitely on AWS.
code from example:
import runhouse as rh

def num_cpus():
    import multiprocessing
    return f"Num cpus: {multiprocessing.cpu_count()}"

num_cpus()

# Using a Cloud provider
cluster = rh.ondemand_cluster(
    name="runhouse",
    instance_type="CPU:8",
    provider="aws",  # options: "AWS", "GCP", "Azure", "Lambda", or "cheapest"
)
cluster.up_if_not()
num_cpus_cluster = rh.function(name="num_cpus_cluster", fn=num_cpus).to(system=cluster, reqs=["./"])
num_cpus_cluster()
Output:
(py310) shyam@shyam-ThinkPad-P53:~/Code/test-runhouse$ python test.py
INFO | 2023-07-11 11:50:06,755 | Loaded Runhouse config from /home/shyam/.rh/config.yaml
INFO | 2023-07-11 11:50:07,937 | Found credentials in shared credentials file: ~/.aws/credentials
I 07-11 11:50:08 optimizer.py:636] == Optimizer ==
I 07-11 11:50:08 optimizer.py:647] Target: minimizing cost
I 07-11 11:50:08 optimizer.py:659] Estimated cost: $0.4 / hour
I 07-11 11:50:08 optimizer.py:659]
I 07-11 11:50:08 optimizer.py:732] Considered resources (1 node):
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781] CLOUD INSTANCE vCPUs Mem(GB) ACCELERATORS REGION/ZONE COST ($) CHOSEN
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781] AWS m6i.2xlarge 8 32 - us-east-1 0.38 ✔
I 07-11 11:50:08 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 11:50:08 optimizer.py:781]
I 07-11 11:50:08 cloud_vm_ray_backend.py:3495] Creating a new cluster: "runhouse" [1x AWS(m6i.2xlarge)].
I 07-11 11:50:08 cloud_vm_ray_backend.py:3495] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 07-11 11:50:08 cloud_vm_ray_backend.py:1215] To view detailed progress: tail -n100 -f /home/shyam/sky_logs/sky-2023-07-11-11-50-07-852797/provision.log
I 07-11 11:50:09 cloud_vm_ray_backend.py:1539] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1f)
I 07-11 11:51:42 log_utils.py:89] Head node is up.
I 07-11 11:53:34 cloud_vm_ray_backend.py:1352] Successfully provisioned or found existing VM.
I 07-11 11:53:41 cloud_vm_ray_backend.py:3544] Processing file mounts.
I 07-11 11:53:41 cloud_vm_ray_backend.py:3575] To view detailed progress: tail -n100 -f ~/sky_logs/sky-2023-07-11-11-50-07-852797/file_mounts.log
I 07-11 11:53:41 backend_utils.py:1254] Syncing (to 1 node): ~/.rh -> ~/.rh
I 07-11 11:53:44 cloud_vm_ray_backend.py:2808] Run commands not specified or empty.
Clusters
NAME LAUNCHED RESOURCES STATUS AUTOSTOP COMMAND
runhouse a few secs ago 1x AWS(m6i.2xlarge) UP (down) test.py
cpu-cluster 20 mins ago 1x GCP(n2-standard-8) UP (down) test.py
INFO | 2023-07-11 11:53:51,543 | Restarting HTTP server on runhouse.
INFO | 2023-07-11 11:53:51,543 | Running command on runhouse: pip install runhouse==0.0.8
Collecting runhouse==0.0.8
Downloading runhouse-0.0.8-py3-none-any.whl (142 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.7/142.7 kB 3.5 MB/s eta 0:00:00
Collecting sshfs
Downloading sshfs-2023.4.1-py3-none-any.whl (15 kB)
Collecting typer
Downloading typer-0.9.0-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.9/45.9 kB 2.6 MB/s eta 0:00:00
Collecting fsspec
Downloading fsspec-2023.6.0-py3-none-any.whl (163 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 7.9 MB/s eta 0:00:00
Collecting uvicorn
Downloading uvicorn-0.22.0-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 4.4 MB/s eta 0:00:00
Collecting sshtunnel>=0.3.0
Downloading sshtunnel-0.4.0-py2.py3-none-any.whl (24 kB)
Requirement already satisfied: pyOpenSSL>=21.1.0 in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (22.1.0)
Collecting fastapi<=0.99.0
Downloading fastapi-0.99.0-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.2/58.2 kB 321.7 kB/s eta 0:00:00
Collecting pyarrow
Downloading pyarrow-12.0.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.9/38.9 MB 13.5 MB/s eta 0:00:00
Requirement already satisfied: rich in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (13.4.2)
Requirement already satisfied: wheel in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (0.38.4)
Requirement already satisfied: skypilot==0.3.1 in /opt/conda/lib/python3.10/site-packages (from runhouse==0.0.8) (0.3.1)
Requirement already satisfied: filelock>=3.6.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.12.2)
Requirement already satisfied: pulp in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.7.0)
Requirement already satisfied: click<=8.0.4,>=7.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (8.0.4)
Requirement already satisfied: pycryptodome==3.12.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.12.0)
Requirement already satisfied: networkx in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.1)
Requirement already satisfied: pandas in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.0.3)
Requirement already satisfied: grpcio<=1.51.3,>=1.42.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.51.3)
Requirement already satisfied: awscli in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.29.2)
Requirement already satisfied: oauth2client in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.1.3)
Requirement already satisfied: pendulum in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.1.2)
Requirement already satisfied: psutil in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (5.9.5)
Requirement already satisfied: tabulate in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (0.9.0)
Requirement already satisfied: cryptography in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (38.0.3)
Requirement already satisfied: colorama<0.4.5 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (0.4.4)
Requirement already satisfied: boto3 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (1.28.2)
Requirement already satisfied: ray[default]<=2.4.0,>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (2.4.0)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.23.4)
Requirement already satisfied: PrettyTable>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.8.0)
Requirement already satisfied: packaging in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (23.1)
Requirement already satisfied: jsonschema in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (4.18.0)
Requirement already satisfied: jinja2>=3.0 in /opt/conda/lib/python3.10/site-packages (from skypilot==0.3.1->runhouse==0.0.8) (3.1.2)
Requirement already satisfied: typing-extensions>=4.5.0 in /opt/conda/lib/python3.10/site-packages (from fastapi<=0.99.0->runhouse==0.0.8) (4.7.1)
Collecting starlette<0.28.0,>=0.27.0
Downloading starlette-0.27.0-py3-none-any.whl (66 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.0/67.0 kB 588.7 kB/s eta 0:00:00
Collecting pydantic!=1.8,!=1.8.1,<2.0.0,>=1.7.4
Downloading pydantic-1.10.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.1/3.1 MB 16.3 MB/s eta 0:00:00
Collecting paramiko>=2.7.2
Downloading paramiko-3.2.0-py3-none-any.whl (224 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.2/224.2 kB 30.5 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.16.6 in /opt/conda/lib/python3.10/site-packages (from pyarrow->runhouse==0.0.8) (1.25.1)
Requirement already satisfied: markdown-it-py>=2.2.0 in /opt/conda/lib/python3.10/site-packages (from rich->runhouse==0.0.8) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/conda/lib/python3.10/site-packages (from rich->runhouse==0.0.8) (2.15.1)
Collecting asyncssh<3,>=2.11.0
Downloading asyncssh-2.13.2-py3-none-any.whl (349 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 349.3/349.3 kB 2.8 MB/s eta 0:00:00
Collecting h11>=0.8
Downloading h11-0.14.0-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 11.9 MB/s eta 0:00:00
Requirement already satisfied: cffi>=1.12 in /opt/conda/lib/python3.10/site-packages (from cryptography->skypilot==0.3.1->runhouse==0.0.8) (1.15.1)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/conda/lib/python3.10/site-packages (from jinja2>=3.0->skypilot==0.3.1->runhouse==0.0.8) (2.1.3)
Requirement already satisfied: mdurl~=0.1 in /opt/conda/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->runhouse==0.0.8) (0.1.2)
Collecting pynacl>=1.5
Downloading PyNaCl-1.5.0-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (856 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 856.7/856.7 kB 11.9 MB/s eta 0:00:00
Collecting bcrypt>=3.2
Downloading bcrypt-4.0.1-cp36-abi3-manylinux_2_28_x86_64.whl (593 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 593.7/593.7 kB 67.6 MB/s eta 0:00:00
Requirement already satisfied: wcwidth in /opt/conda/lib/python3.10/site-packages (from PrettyTable>=2.0.0->skypilot==0.3.1->runhouse==0.0.8) (0.2.6)
Requirement already satisfied: requests in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.28.1)
Requirement already satisfied: aiosignal in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.3.1)
Requirement already satisfied: virtualenv<20.21.1,>=20.0.24 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (20.21.0)
Requirement already satisfied: msgpack<2.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.0.5)
Requirement already satisfied: frozenlist in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.3.3)
Requirement already satisfied: pyyaml in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (5.4.1)
Requirement already satisfied: attrs in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (23.1.0)
Requirement already satisfied: aiohttp>=3.7 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (3.8.4)
Requirement already satisfied: aiohttp-cors in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.7.0)
Requirement already satisfied: opencensus in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.11.2)
Requirement already satisfied: smart-open in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (6.3.0)
Requirement already satisfied: prometheus-client>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.17.1)
Requirement already satisfied: colorful in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.5.5)
Requirement already satisfied: gpustat>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.1)
Requirement already satisfied: py-spy>=0.2.0 in /opt/conda/lib/python3.10/site-packages (from ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.3.14)
Collecting anyio<5,>=3.4.0
Downloading anyio-3.7.1-py3-none-any.whl (80 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 80.9/80.9 kB 16.7 MB/s eta 0:00:00
Requirement already satisfied: s3transfer<0.7.0,>=0.6.0 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (0.6.1)
Requirement already satisfied: docutils<0.17,>=0.10 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (0.16)
Requirement already satisfied: rsa<4.8,>=3.1.2 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (4.7.2)
Requirement already satisfied: botocore==1.31.2 in /opt/conda/lib/python3.10/site-packages (from awscli->skypilot==0.3.1->runhouse==0.0.8) (1.31.2)
Requirement already satisfied: jmespath<2.0.0,>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (1.0.1)
Requirement already satisfied: urllib3<1.27,>=1.25.4 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (1.26.11)
Requirement already satisfied: python-dateutil<3.0.0,>=2.1 in /opt/conda/lib/python3.10/site-packages (from botocore==1.31.2->awscli->skypilot==0.3.1->runhouse==0.0.8) (2.8.2)
Requirement already satisfied: referencing>=0.28.4 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (0.29.1)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (2023.6.1)
Requirement already satisfied: rpds-py>=0.7.1 in /opt/conda/lib/python3.10/site-packages (from jsonschema->skypilot==0.3.1->runhouse==0.0.8) (0.8.10)
Requirement already satisfied: six>=1.6.1 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (1.16.0)
Requirement already satisfied: pyasn1>=0.1.7 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.5.0)
Requirement already satisfied: httplib2>=0.9.1 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.22.0)
Requirement already satisfied: pyasn1-modules>=0.0.5 in /opt/conda/lib/python3.10/site-packages (from oauth2client->skypilot==0.3.1->runhouse==0.0.8) (0.3.0)
Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pandas->skypilot==0.3.1->runhouse==0.0.8) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in /opt/conda/lib/python3.10/site-packages (from pandas->skypilot==0.3.1->runhouse==0.0.8) (2023.3)
Requirement already satisfied: pytzdata>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pendulum->skypilot==0.3.1->runhouse==0.0.8) (2020.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (6.0.4)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (4.0.2)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.9.2)
Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /opt/conda/lib/python3.10/site-packages (from aiohttp>=3.7->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.1.1)
Collecting exceptiongroup
Downloading exceptiongroup-1.1.2-py3-none-any.whl (14 kB)
Collecting sniffio>=1.1
Downloading sniffio-1.3.0-py3-none-any.whl (10 kB)
Requirement already satisfied: idna>=2.8 in /opt/conda/lib/python3.10/site-packages (from anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi<=0.99.0->runhouse==0.0.8) (3.4)
Requirement already satisfied: pycparser in /opt/conda/lib/python3.10/site-packages (from cffi>=1.12->cryptography->skypilot==0.3.1->runhouse==0.0.8) (2.21)
Requirement already satisfied: nvidia-ml-py>=11.450.129 in /opt/conda/lib/python3.10/site-packages (from gpustat>=1.0.0->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (12.535.77)
Requirement already satisfied: blessed>=1.17.1 in /opt/conda/lib/python3.10/site-packages (from gpustat>=1.0.0->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.20.0)
Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in /opt/conda/lib/python3.10/site-packages (from httplib2>=0.9.1->oauth2client->skypilot==0.3.1->runhouse==0.0.8) (3.1.0)
Requirement already satisfied: distlib<1,>=0.3.6 in /opt/conda/lib/python3.10/site-packages (from virtualenv<20.21.1,>=20.0.24->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.3.6)
Requirement already satisfied: platformdirs<4,>=2.4 in /opt/conda/lib/python3.10/site-packages (from virtualenv<20.21.1,>=20.0.24->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (3.8.1)
Requirement already satisfied: opencensus-context>=0.1.3 in /opt/conda/lib/python3.10/site-packages (from opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (0.1.3)
Requirement already satisfied: google-api-core<3.0.0,>=1.0.0 in /opt/conda/lib/python3.10/site-packages (from opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.11.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.10/site-packages (from requests->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2022.9.24)
Requirement already satisfied: google-auth<3.0.dev0,>=2.14.1 in /opt/conda/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (2.22.0)
Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in /opt/conda/lib/python3.10/site-packages (from google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (1.59.1)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.10/site-packages (from google-auth<3.0.dev0,>=2.14.1->google-api-core<3.0.0,>=1.0.0->opencensus->ray[default]<=2.4.0,>=2.2.0->skypilot==0.3.1->runhouse==0.0.8) (5.3.1)
Installing collected packages: typer, sniffio, pydantic, pyarrow, h11, fsspec, exceptiongroup, bcrypt, uvicorn, pynacl, anyio, starlette, paramiko, asyncssh, sshtunnel, sshfs, fastapi, runhouse
Attempting uninstall: pydantic
Found existing installation: pydantic 2.0.2
Uninstalling pydantic-2.0.2:
Successfully uninstalled pydantic-2.0.2
Successfully installed anyio-3.7.1 asyncssh-2.13.2 bcrypt-4.0.1 exceptiongroup-1.1.2 fastapi-0.99.0 fsspec-2023.6.0 h11-0.14.0 paramiko-3.2.0 pyarrow-12.0.1 pydantic-1.10.11 pynacl-1.5.0 runhouse-0.0.8 sniffio-1.3.0 sshfs-2023.4.1 sshtunnel-0.4.0 starlette-0.27.0 typer-0.9.0 uvicorn-0.22.0
INFO | 2023-07-11 11:53:58,889 | Running command on runhouse: pkill -f "python -m runhouse.servers.http.http_server"
INFO | 2023-07-11 11:54:00,015 | Running command on runhouse: pkill -f ".*ray.*6379.*"
INFO | 2023-07-11 11:54:01,244 | Running command on runhouse: ray start --head --port 6379 --autoscaling-config=~/ray_bootstrap_config.yaml
2023-07-11 18:54:03,331 INFO usage_lib.py:398 -- Usage stats collection is enabled by default without user confirmation because this terminal is detected to be non-interactive. To disable this, add `--disable-usage-stats` to the command that starts the cluster, or run the following command: `ray disable-usage-stats` before starting the cluster. See https://docs.ray.io/en/master/cluster/usage-stats.html for more details.
2023-07-11 18:54:03,331 INFO scripts.py:710 -- Local node IP: 172.31.46.221
2023-07-11 18:54:05,396 SUCC scripts.py:747 -- --------------------
2023-07-11 18:54:05,397 SUCC scripts.py:748 -- Ray runtime started.
2023-07-11 18:54:05,397 SUCC scripts.py:749 -- --------------------
2023-07-11 18:54:05,397 INFO scripts.py:751 -- Next steps
2023-07-11 18:54:05,397 INFO scripts.py:754 -- To add another node to this Ray cluster, run
2023-07-11 18:54:05,397 INFO scripts.py:757 -- ray start --address='172.31.46.221:6379'
2023-07-11 18:54:05,397 INFO scripts.py:766 -- To connect to this Ray cluster:
2023-07-11 18:54:05,397 INFO scripts.py:768 -- import ray
2023-07-11 18:54:05,397 INFO scripts.py:769 -- ray.init()
2023-07-11 18:54:05,397 INFO scripts.py:781 -- To submit a Ray job using the Ray Jobs CLI:
2023-07-11 18:54:05,397 INFO scripts.py:782 -- RAY_ADDRESS='http://127.0.0.1:8265' ray job submit --working-dir . -- python my_script.py
2023-07-11 18:54:05,397 INFO scripts.py:791 -- See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html
2023-07-11 18:54:05,397 INFO scripts.py:795 -- for more information on submitting Ray jobs to the Ray cluster.
2023-07-11 18:54:05,397 INFO scripts.py:800 -- To terminate the Ray runtime, run
2023-07-11 18:54:05,397 INFO scripts.py:801 -- ray stop
2023-07-11 18:54:05,397 INFO scripts.py:804 -- To view the status of the cluster, use
2023-07-11 18:54:05,397 INFO scripts.py:805 -- ray status
2023-07-11 18:54:05,397 INFO scripts.py:809 -- To monitor and debug Ray, view the dashboard at
2023-07-11 18:54:05,397 INFO scripts.py:810 -- 127.0.0.1:8265
2023-07-11 18:54:05,397 INFO scripts.py:817 -- If connection to the dashboard fails, check your firewall settings and network configuration.
INFO | 2023-07-11 11:54:05,750 | Running command on runhouse: screen -dm bash -c 'python -m runhouse.servers.http.http_server |& tee -a ~/.rh/cluster_server_runhouse.log 2>&1'
/home/shyam/miniconda3/envs/py310/lib/python3.10/site-packages/runhouse/rns/function.py:113: UserWarning: ``reqs`` and ``setup_cmds`` arguments has been deprecated. Please use ``env`` instead.
warnings.warn(
INFO | 2023-07-11 11:54:12,193 | Setting up Function on cluster.
INFO | 2023-07-11 11:54:12,564 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:13,099 | Authentication (publickey) successful!
INFO | 2023-07-11 11:54:13,108 | Checking server runhouse
INFO | 2023-07-11 11:54:14,768 | Server runhouse is up.
INFO | 2023-07-11 11:54:14,914 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:14,928 | Authentication (publickey) failed.
INFO | 2023-07-11 11:54:14,934 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:14,949 | Authentication (publickey) successful!
2023-07-11 11:54:14,949| ERROR | Problem setting SSH Forwarder up: Couldn't open tunnel :50052 <> 127.0.0.1:50052 might be in use or destination not reachable
ERROR | 2023-07-11 11:54:14,949 | Problem setting SSH Forwarder up: Couldn't open tunnel :50052 <> 127.0.0.1:50052 might be in use or destination not reachable
INFO | 2023-07-11 11:54:15,083 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:15,097 | Authentication (publickey) failed.
INFO | 2023-07-11 11:54:15,103 | Connected (version 2.0, client OpenSSH_8.2p1)
INFO | 2023-07-11 11:54:15,118 | Authentication (publickey) successful!
INFO | 2023-07-11 11:54:15,119 | Checking server local-cluster
INFO | 2023-07-11 11:54:16,445 | Server local-cluster is up.
INFO | 2023-07-11 11:54:16,445 | Running command on local-cluster: ray start --head
INFO | 2023-07-11 11:54:17,903 | Copying folder from file:///home/shyam/Code/test-runhouse to: runhouse
INFO | 2023-07-11 11:54:19,653 | Installing packages on cluster runhouse: ['Package: test-runhouse']
INFO | 2023-07-11 11:54:20,726 | Function setup complete.
INFO | 2023-07-11 11:54:20,731 | Running num_cpus_cluster via HTTP
INFO | 2023-07-11 11:54:21,326 | Submitted remote call to cluster for num_cpus_cluster_20230711_115420
:job_id:01000000
:task_name:get_fn_from_pointers
:job_id:01000000
INFO | 2023-07-11 18:54:21,732 | Loaded Runhouse config from /home/ubuntu/.rh/config.yaml
:task_name:get_fn_from_pointers
INFO | 2023-07-11 18:54:22,423 | Appending /home/ubuntu/test-runhouse to sys.path
INFO | 2023-07-11 18:54:22,424 | Importing module test
SkyPilot collects usage data to improve its services. `setup` and `run` commands are not collected to ensure privacy.
Usage logging can be disabled by setting the environment variable SKYPILOT_DISABLE_USAGE_COLLECTION=1.
INFO | 2023-07-11 18:54:22,944 | Found credentials in shared credentials file: ~/.aws/credentials
I 07-11 18:54:23 aws_catalog.py:120] Fetching availability zones mapping for AWS...
I 07-11 18:54:25 optimizer.py:636] == Optimizer ==
I 07-11 18:54:25 optimizer.py:647] Target: minimizing cost
I 07-11 18:54:25 optimizer.py:659] Estimated cost: $0.4 / hour
I 07-11 18:54:25 optimizer.py:659]
I 07-11 18:54:25 optimizer.py:732] Considered resources (1 node):
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781] CLOUD INSTANCE vCPUs Mem(GB) ACCELERATORS REGION/ZONE COST ($) CHOSEN
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781] AWS m6i.2xlarge 8 32 - us-east-1 0.38 ✔
I 07-11 18:54:25 optimizer.py:781] ------------------------------------------------------------------------------------------
I 07-11 18:54:25 optimizer.py:781]
I 07-11 18:54:25 cloud_vm_ray_backend.py:3495] Creating a new cluster: "runhouse" [1x AWS(m6i.2xlarge)].
I 07-11 18:54:25 cloud_vm_ray_backend.py:3495] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 07-11 18:54:25 cloud_vm_ray_backend.py:1215] To view detailed progress: tail -n100 -f /home/ubuntu/sky_logs/sky-2023-07-11-18-54-22-466640/provision.log
I 07-11 18:54:26 cloud_vm_ray_backend.py:1539] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1f)
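The log above also reports `Problem setting SSH Forwarder up: Couldn't open tunnel :50052 <> 127.0.0.1:50052 might be in use or destination not reachable`. A rough way to rule out the "might be in use" half on the local machine is to try binding the port, as in the sketch below. `local_port_is_free` is a hypothetical helper, not part of runhouse, and it assumes the tunnel's local end binds 127.0.0.1. It says nothing about whether the destination on the cluster is reachable.

```python
import socket


def local_port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if host:port can be bound right now, i.e. nothing else holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
        return True


# 50052 is the port named in the forwarder error above.
print("port 50052 free locally:", local_port_is_free(50052))
```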
from runhouse.
Weird, because it tries to start the cluster twice? Unless I'm misunderstanding the process there.
from runhouse.
I see! Will try that now!
from runhouse.
@dongreenberg I'm running into this issue but require FastAPI > 0.1 and Pydantic 2.x. How easy would it be to support these with the latest version of Runhouse (which seems to fix this issue)?
from runhouse.