
Comments (10)

akhanna213 commented on August 15, 2024 3

The fix in #1161 for the GPU init action has been verified. We are already working on the same patch for the other init actions that fail with the same error.

If you need an urgent fix, you can clone the init action, add the same lines of code as in the fix to your copy, and use that copy for cluster creation. Please note that we do not encourage using cloned init scripts: a clone does not receive updates, so you would have to clone it again every time the init actions repository changes. So unless it is urgent, please wait for the other fixes to go in :)

from initialization-actions.

cjac commented on August 15, 2024 2

Thank you for the report. We are addressing this issue with the highest priority.


kishida-yuki commented on August 15, 2024

I am in the same situation with 1.5-debian10.


ahmedetefy commented on August 15, 2024

@akhanna213 I just tried the latest version of install_gpu_driver.sh: I created a Dataproc cluster through the UI, set that latest version of the script as the initialization action, and I am still running into initialization issues.


olbapjose commented on August 15, 2024

@akhanna213 @cjac I have run the command and it is still failing. Could you please provide an update? It is very important for us to have this up and running. I am using --image-version 2.0-debian10, which I know is a bit old, but I don't think that is related to the issue. Is that correct?

Thanks


akhanna213 commented on August 15, 2024

Hi @ahmedetefy @olbapjose, could you confirm whether the error message is still the same? We rolled out the fix a while back.


olbapjose commented on August 15, 2024

@akhanna213 Please see the image below and the attachment, which is the output file mentioned in the error.

image

google-cloud-dataproc-metainfo_initialization-script-0_output.txt

Long story short, the error says 'Unable to update packages lists.'


ahmedetefy commented on August 15, 2024

@akhanna213 Yes, I can confirm the error is still there.

Reproducing the error is quite straightforward:

gcloud dataproc clusters create cluster-e485 --enable-component-gateway --bucket <bucket_name> --region <your-region> --single-node --master-machine-type n1-standard-8 --master-boot-disk-type pd-balanced --master-boot-disk-size 500 --master-accelerator type=nvidia-tesla-t4 --image-version <any 2.1 or above image version> --optional-components JUPYTER --initialization-actions '< gcs_path to latest install GPU driver script >' --project <project_name>

I have also had issues with 2.0-ubuntu18 (even though it sometimes succeeds in installing the GPU drivers).

The following error log may help:

E: Repository 'https://packages.cloud.google.com/apt google-cloud-logging-bionic-all InRelease' changed its 'Codename' value from 'google-cloud-logging-stretch-all' to 'google-cloud-logging-bionic-all'


akhanna213 commented on August 15, 2024

Hi @ahmedetefy @olbapjose, this looks like a different issue from the one users were facing earlier. Let me check with the team to understand what is causing this breakage. I appreciate your patience and will get back to you as soon as possible.


olbapjose commented on August 15, 2024

Hi @akhanna213, do you have any updates on this? Initially I was able to work around it by adding --allow-releaseinfo-change:

function update_apt_get() {
  retry_apt_command "apt-get update --allow-releaseinfo-change"
}

and it worked, but today it is failing again with a different message:

The following NEW packages will be installed:
gnupg2
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 393 kB of archives.
After this operation, 411 kB of additional disk space will be used.
Err:1 http://deb.debian.org/debian buster/main amd64 gnupg2 all 2.2.12-1+deb10u1
404 Not Found [IP: 151.101.22.132 80]
E: Failed to fetch http://deb.debian.org/debian/pool/main/g/gnupg2/gnupg2_2.2.12-1+deb10u1_all.deb 404 Not Found [IP: 151.101.22.132 80]
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?

I will try again with --fix-missing, but it looks like the script is not robust, since it is exposed to several possible points of failure.
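For anyone patching their own copy, the retry-and-refresh idea behind the workaround can be sketched roughly as follows. This is only a sketch under assumptions, not the init action's actual code: `retry_command`, `install_packages`, `MAX_ATTEMPTS`, and `RETRY_DELAY_SECONDS` are hypothetical names introduced here for illustration. It refreshes the package lists with `--allow-releaseinfo-change` right before installing, because a stale package index is one common cause of 404 errors on `.deb` downloads. It would not help if the package has been removed from the mirror entirely.

```shell
# Hypothetical helper (not the init action's retry_apt_command):
# run a command up to MAX_ATTEMPTS times, sleeping between attempts.
retry_command() {
  local max_attempts="${MAX_ATTEMPTS:-5}"
  local delay="${RETRY_DELAY_SECONDS:-5}"
  local attempt=1
  while [ "${attempt}" -le "${max_attempts}" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt ${attempt}/${max_attempts} failed: $*" >&2
    if [ "${attempt}" -lt "${max_attempts}" ]; then
      sleep "${delay}"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Refresh package lists, accepting changed release metadata
# (for example a repository whose Codename value changed).
update_apt_get() {
  retry_command apt-get update --allow-releaseinfo-change
}

# Hypothetical wrapper: refresh the lists first, then install the given packages.
install_packages() {
  update_apt_get && retry_command apt-get install -y "$@"
}
```

With these definitions, a call such as `install_packages gnupg2` would refresh the lists and then attempt the install, retrying each step on failure.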

