Comments (30)

Shastick commented on July 28, 2024 5

We have a go-hdfs implementation working with Kerberos at https://github.com/Sqooba/hdfs (using a pure Go Kerberos library).

We haven't opened a PR yet, as the code probably deserves some cleanup and we have not figured out automated testing yet (we'd be curious to hear any suggestions on that).

Development was done against a Kerby KDC, which is reasonably easy to set up, and an HDFS mini cluster from https://github.com/sakserv/hadoop-mini-clusters/

(We attempted to use a miniKDC from the above repo, but they currently have an issue: sakserv/hadoop-mini-clusters#51)

I'm not sure of the best path to follow for testing:

  • run the whole test suite with Kerberos on?
  • just add a simple test case with Kerberos turned on, and leave it off otherwise?
  • run the tests with Kerberos on and off?

I'm only mildly familiar with Travis CI, and enabling/disabling Kerberos might require changing or reloading the Hadoop environment entirely (unless we can tell it to accept both plaintext and SASL auth; I have not checked whether that is possible yet).

Anyway, in the meantime, a few people are using the code daily and it runs smoothly.

Feedback and comments more than welcome.

PS: it works with both keytabs and credential cache

from hdfs.

didip commented on July 28, 2024 4

Hi all, sorry for digging up an old ticket, is there still interest in Kerberos?

wrouesnel commented on July 28, 2024 1

It might be an idea to gate @lomik's code behind a build tag: it's the right basic approach, we're just lacking a Go-native GSSAPI library to support it.

colinmarc commented on July 28, 2024

Hi David,

Good question! I don't really know what would be involved in that. I also don't really have a cluster set up to test with.

I'll leave this open for now - I'd love to hear any suggestions!

Georce commented on July 28, 2024

Hi Colinmarc,
I have a cluster that uses Kerberos auth, and I tested your hdfs command against it:
[image: hdfskrb]

colinmarc commented on July 28, 2024

@Georce is that somewhere on the public internet for me to test with? Otherwise it'd be kind of hard to develop against =)

Georce commented on July 28, 2024

@colinmarc OK, I will provide an all-in-one CDH5 cluster with Kerberos and Jenkins CI next week.
But without an hdfs put command, your project will be kind of hard to develop.

colinmarc commented on July 28, 2024

That sounds super useful! In the tests, I use hadoop fs -put to populate the test data: https://github.com/colinmarc/hdfs/blob/master/setup_test_env.sh#L44-L45

Georce commented on July 28, 2024

@colinmarc
I sent you an email. Is your address [email protected]?

colinmarc commented on July 28, 2024

Yup, that's right.

tristanwietsma commented on July 28, 2024

Has there been any progress on this ticket? Looking for help?

colinmarc commented on July 28, 2024

I never ended up getting an email from @Georce - so I'm still blocked on the availability of a test cluster to develop against. That, and I really have no idea how kerberos works =)

tristanwietsma commented on July 28, 2024

Well, I've got a cluster at work, so I can try to carve out some time. It is secure, so I'll just mock it for test purposes. I think snakebite supports kerberos now, so I'll check that out and try to touch base in a week.

colinmarc commented on July 28, 2024

Oh amazing, thanks!

tristanwietsma commented on July 28, 2024

Glancing at the snakebite implementation and paraphrasing heavily...

tristanwietsma commented on July 28, 2024

Plot thickens in order to support encryption: spotify/snakebite#185

Georce commented on July 28, 2024

@colinmarc What? I sent it to you again. My email is [email protected]

colinmarc commented on July 28, 2024

Huh, weird - I see it now, thank you!

tristanwietsma commented on July 28, 2024

Authentication/negotiation seems straightforward. Anyone grok the encryption workflow or know of a good working example? Snakebite is still deficient for this.

tristanwietsma commented on July 28, 2024

Any luck with the authentication part? I've been buried.

colinmarc commented on July 28, 2024

Haven't looked at all, sorry =(. Also pretty buried.

colinmarc commented on July 28, 2024

I'd still love to have this, but don't really have the time, context or environment to add the feature.

lomik commented on July 28, 2024

@colinmarc, FYI

I've implemented Kerberos authentication here: lomik@bae39b4

But this implementation is not native and requires the go-sasl library (a cgo wrapper for Cyrus SASL) :(

mxk1235 commented on July 28, 2024

@lomik does your implementation support keytabs?

thanks in advance.

lomik commented on July 28, 2024

@mxk1235, keytabs are not supported

colinmarc commented on July 28, 2024

@Shastick that's fantastic news! Feel free to open the PR even if it's not ready yet - it'd be good to see what the diff looks like.

For testing, we'd want to do a whole run of the test suite with kerberos on. We already do multiple runs to test different hadoop distributions with a build matrix:

hdfs/.travis.yml, lines 12 to 14 in 0f30457:

    env:
      - HADOOP_DISTRO=cdh
      - HADOOP_DISTRO=hdp

And then switch on it in the test setup:

hdfs/setup_test_env.sh, lines 11 to 18 in 0f30457:

    if [ $HADOOP_DISTRO = "cdh" ]; then
      HADOOP_URL="http://archive.cloudera.com/cdh5/cdh/5/hadoop-latest.tar.gz"
    elif [ $HADOOP_DISTRO = "hdp" ]; then
      HADOOP_URL="http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.0.0/tars/hadoop-2.7.1.2.4.0.0-169.tar.gz"
    else
      echo "No/bad HADOOP_DISTRO='${HADOOP_DISTRO}' specified"
      exit 1
    fi

What I'd suggest is:

  • Change the test setup to use that library you mentioned (for the normal tests). This can even be a separate PR. If that library doesn't work, we can also try using a docker setup, as someone suggested on another issue.
  • Add a build for kerberos (it would actually add two builds: one for CDH and one for Hortonworks)
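To make the "two extra builds" arithmetic concrete, here is a small sketch that enumerates the resulting matrix. The `KERBEROS` variable name is an assumption for illustration. Only `HADOOP_DISTRO` and its two values come from the quoted `.travis.yml`.

```go
package main

import "fmt"

// build describes one CI job in the hypothetical extended matrix.
type build struct {
	Distro   string
	Kerberos bool
}

// matrix returns every distro once without Kerberos and once with it.
func matrix(distros []string) []build {
	builds := make([]build, 0, 2*len(distros))
	for _, kerberos := range []bool{false, true} {
		for _, distro := range distros {
			builds = append(builds, build{Distro: distro, Kerberos: kerberos})
		}
	}
	return builds
}

// env renders the job as an env line; KERBEROS is a made-up variable name.
func (b build) env() string {
	return fmt.Sprintf("HADOOP_DISTRO=%s KERBEROS=%t", b.Distro, b.Kerberos)
}

func main() {
	for _, b := range matrix([]string{"cdh", "hdp"}) {
		fmt.Println(b.env())
	}
}
```

With the two existing distros this yields four jobs: the two current ones, plus one Kerberos job each for CDH and Hortonworks.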

colinmarc commented on July 28, 2024

link to the PR for reference: #99

xxh2000 commented on July 28, 2024

Kerberos security is common in Hadoop; without this support, it is very inconvenient to use this lib.

colinmarc commented on July 28, 2024

There is now an internal PR that adds support for Kerberos: #133

If you're running Kerberos in production, please test it out and let me know if it works!

colinmarc commented on July 28, 2024

Fixed in #133. I'll do a version release soon.
