We've done a bit of this ad-hoc, but we could spend some dedicated c

Security/Safety "Best Practices" for rOpenSci Package Developers/Reviewers about unconf18 HOT 16 OPEN

ropensci commented on June 3, 2024 1

Security/Safety "Best Practices" for rOpenSci Package Developers/Reviewers

from unconf18.

Comments (16)

hrbrmstr commented on June 3, 2024 2

I somehow missed the comment 20d ago @elinw (apologies). https://github.com/hrbrmstr/rpwnd provides some context for the evil one can do with R and https://ropenscilabs.github.io/r-security-practices/ (which @stephlocke penned and @maelle noted) has a great start for that and other topics.

Packages with embedded other-lang libraries need care & feeding and some way to inform users when they are in need of an update. Package authors may be putting vulnerable researchers (some of whom may not even know they fit that classification) and users in harm's way without even knowing it, depending on what type of internet calls they make or system traces they leave around.

We also started work last year on a way to help ensure package download safety (https://ropensci.org/blog/2017/07/25/notary/) but all of us who worked on it have been super busy and even if we weren't, it's somewhat moot b/c there's no backing infrastructure for it nor support in R itself for it (which is where it'd need to be).

One thing from the notary work that'd be an interesting "mandate" from rOpenSci is the requirement that all contributors use PGP and sign all commits and no GH merges or releases happen w/o that. Since R has no way for us to have "developer certs" like Apple or Android have for their apps, and since the package ecosystem is more collaborative in nature, the "everybody PGPs" approach at least provides a better guarantee that we can truly trace commits back to the person and not just the GH account.

In the context of ^^ perhaps one "fun" thing (I have weird ideas of what constitutes that) wld be to get everyone on Keybase at the unconf. I 💙💙💙 what @stephlocke is doing with that in her personal and professional R work and perhaps finishing https://github.com/hrbrmstr/keybase wld be a possible unconf project.

maelle commented on June 3, 2024 1

We'd like to link to https://ropenscilabs.github.io/r-security-practices/ whenever it's ready. Just sayin' 👼

hrbrmstr commented on June 3, 2024 1

That's a 👍 point @batpigandme. "Thankfully?" malicious XML and JSON docs are usually targeting browsers and wld have some serious impediments trying to account for various R interpreter environs. Similarly, malicious PDFs are usually targeting Acrobat or Preview or third-party Windows PDF readers. However, the pkgs in the R ecosystem are all using the same core, [vulnerable] libraries so there is room for caution. And, we all get Word docs, Excel docs, PDFs, etc which all have threat vectors.

elinw commented on June 3, 2024

How are you thinking about safety/security? I think this is a great concept.

karthik commented on June 3, 2024

We discuss this regularly in our staff channels and would be super grateful for your advice/help on this! cc @maelle

mmulvahill commented on June 3, 2024

@hrbrmstr I'm interested in learning more about how to think about security/safety w.r.t. R. That's all I have to add for now 😉

noamross commented on June 3, 2024

In general, we try to make sure that mandates for RO packages go through a process that includes internal use, recommendation, good tooling that reduces effort and good docs/tutorials (not necessarily in that order) before requiring them. So work that advances any of those would make a mandate more likely.

I'd be interested in starting with tooling that could add this check (using **git2r**?) to both our onboarding checks and our nightly builds, along with other security best practices (goodpractices 😉).
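
A rough sketch of what such a signed-commit check could look like, written in Python only to keep it short. The function names and the "accept only `G`" policy are assumptions for illustration, not anything rOpenSci has adopted; the one real building block is git's `%G?` log placeholder, which prints a one-letter signature status per commit (`G` for a good signature, `N` for none, `U` for good but with unknown validity, and so on).

```python
import subprocess


def git_signature_log(repo_path, rev_range="HEAD"):
    """Return '<hash> <status>' lines for a revision range, using git's %G? placeholder."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H %G?", rev_range],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_signature_log(log_text):
    """Turn the log text into (commit_hash, status) pairs, skipping blank lines."""
    pairs = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        commit_hash, status = line.split(maxsplit=1)
        pairs.append((commit_hash, status))
    return pairs


def commits_failing_policy(log_text, accepted=("G",)):
    """List commits whose signature status is not in the accepted set.

    The default policy (only 'G') is an assumption; a project might also accept 'U'.
    """
    return [h for h, status in parse_signature_log(log_text) if status not in accepted]
```

An onboarding or nightly job could then call `commits_failing_policy(git_signature_log("path/to/pkg"))` and fail, or merely warn, when the list is non-empty. Note that `G` only shows up for keys already present in the checking machine's keyring, so key distribution is the harder half of the problem.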

elinw commented on June 3, 2024

I’ve been on small projects that tried to mandate PGP and it was really painful and hard to enforce. It does worry me how much trust a lot of packages put into API calls to download data, both assuming the source must be safe and assuming nothing happens in the space between requestor and requested (e.g. a redirect to a site distributing malware). I don’t see a lot of validation or sanitizing.
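
To make the validation point concrete, here is a minimal sketch of two checks an API client could run before trusting a download: that every hop in the redirect chain stayed on HTTPS and on an expected host, and that the payload matches a checksum published out of band. It is in Python for brevity; the function names, the host allowlist and the idea that the data provider publishes a SHA-256 digest are all assumptions.

```python
import hashlib
from urllib.parse import urlsplit


def redirect_chain_is_trusted(urls, allowed_hosts):
    """Return True only if every URL in the chain uses HTTPS and an allowed host.

    `urls` is the ordered list of URLs visited, final URL included.
    `allowed_hosts` should contain lowercase host names.
    """
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme != "https":
            return False
        if parts.hostname not in allowed_hosts:
            return False
    return True


def sha256_matches(payload, expected_hex):
    """Compare the SHA-256 digest of the downloaded bytes with a published value."""
    actual = hashlib.sha256(payload).hexdigest()
    return actual == expected_hex.strip().lower()
```

With the `requests` library, for instance, the chain would be `[r.url for r in response.history] + [response.url]`. Neither check sanitizes the content itself; they only narrow down where it came from and whether it arrived intact.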

hrbrmstr commented on June 3, 2024

Aye. And there's "guidance" that might be useful to note in some API packages. For instance, I wrote epidata to access the Economic Policy Institute data and use the data from it for various classes. Each call out to that API I do from home is logged (Federal requirement and also a side-$-business) by Comcast and searchable by authorities or interested third-parties. They use that data to classify me as a left-leaning activist (when, in fact, I'm really just a non-affiliated anti-authority anarchist :-) I've seen evidence of that in various mailings, adverts on sites that manage to get through my ad-blocking infrastructure, etc. And, due to a job stint at one of the world's largest network providers, I've also maybe even seen said databases. It's worse in other countries/regions and many at-risk researchers (again, who don't even realize they're 'at-risk') do not realize they shld be using, say, a VPN for some API calls or using DNS-over-HTTPS or DNSCrypt, since DNS leaks where you're going.

I'm not suggesting rOpenSci can solve or provide guidance on all the issues, but we (I say "we" despite working in a rly strange proto-science vs a real one like y'all) cld definitely up the safety game for those using R.

hrbrmstr commented on June 3, 2024

@elinw (re: PGP) aye, it is no panacea and unless you're a die-hard infosec geek or have a die-hard infosec hobby, being religious about PGP configs and use is a pain, especially when setting up new systems. Keybase definitely helps a lot and perhaps we (like @noamross was alluding to) cld develop a "safety/security check" package/function similar to devtools::dr_devtools() or goodpractice as part of this to help both identify gaps and provide helpers or at least friendly tips on fixing things.

noamross commented on June 3, 2024

If you want to do live testing of a package, like seeing what system files/folders it modifies, I'm working on a Dockerized setup for our standard package tests: https://github.com/noamross/launchboat, so one could run tests in an isolated environment before installing.
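
As an illustration of the "what files does it modify" idea (this is not how launchboat works, just a generic sketch in Python with made-up function names), one can hash every file under a directory before and after running the package's tests and report the differences:

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each regular file under `root` (as a relative POSIX path) to its SHA-256 digest."""
    root = Path(root)
    state = {}
    for path in sorted(root.rglob("*")):
        # Symlinks are skipped so the walk never hashes files outside `root`.
        if path.is_file() and not path.is_symlink():
            relative = path.relative_to(root).as_posix()
            state[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return state


def diff_snapshots(before, after):
    """Report which files were added, removed, or modified between two snapshots."""
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }
```

Inside a throwaway container this could be pointed at the home directory or the R library path: take `snapshot()` first, run the tests, take it again, and inspect `diff_snapshots()`. It only sees file contents, not permission changes or network activity.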

boshek commented on June 3, 2024

Oh this is all so interesting. After reading about notary last year and some linked horror stories I try to sign all my commits now. So thanks @hrbrmstr !

It occurs to me that this is related to this possible project and in fact may be a key component. It is so easy to build packages/scripts and miss significant security considerations (at least for me) that this area likely has many spaces that could be improved upon. Providing means for reviewers to identify and even just consider that as part of a reviewer suite of tools would likely be useful.

hrbrmstr commented on June 3, 2024

@noamross aye. been keeping an 👀 on launchboat and am also keen to watch the network calls pkgs make.

jennybc commented on June 3, 2024

I'd appreciate knowing what the most realistic threat model is for the R package ecosystem and how that aligns against various measures to tighten things up.

Example: I am dimly aware of malicious packages in some other language's repository that had names very close to the "real" packages. And the Bad People exploited mis-spellings to get users to install and run them. That's a really different threat from, say, someone impersonating me and making commits to packages I maintain.

Which threats should we be most worried about and who has to do something to mitigate it?
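
For the look-alike-name threat specifically, a check is cheap to sketch. The following Python snippet (function names and the distance threshold of 1 are assumptions; the package names in the usage note are only examples) flags a requested name that is not itself a known package but sits within a small edit distance of one:

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char_a
                    current[j - 1] + 1,      # insert char_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def possible_typosquats(candidate, known_names, max_distance=1):
    """Return known package names that `candidate` could be imitating.

    An exact match with a known name is treated as legitimate and returns [].
    Comparison ignores case, so a name differing only in capitalisation is flagged.
    """
    if candidate in known_names:
        return []
    hits = []
    for name in known_names:
        if edit_distance(candidate.lower(), name.lower()) <= max_distance:
            hits.append(name)
    return sorted(hits)
```

For example, `possible_typosquats("ggplot", ["ggplot2", "dplyr", "readxl"])` flags `ggplot2`. A repository or an install helper would be the natural place for this, which partly answers the "who has to do something" question for this one threat; impersonated commits need a different mitigation entirely.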

batpigandme commented on June 3, 2024

+1 to all of this…
Also, and maybe this is limited audience (or just unrelated), but basic file threat-assessment. Sometimes you've gotta deal with someone else's data, and (e.g. with readxl) they have to get it to you some way…
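
One very basic form of file triage, sketched in Python with a hypothetical function name: the modern Office formats (.xlsx, .docx, .pptx and their macro-enabled siblings) are ZIP containers, and embedded VBA macros are stored in a part named `vbaProject.bin`. Looking for that part before opening the file says nothing about other attack vectors, but it is a cheap first filter.

```python
import zipfile


def office_file_flags(source):
    """Return a list of warnings for a ZIP-based Office file.

    `source` may be a path or a seekable binary file object.
    An empty list means this (deliberately narrow) check found nothing.
    """
    if not zipfile.is_zipfile(source):
        return ["not a ZIP container"]
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
    flags = []
    for name in sorted(names):
        # Macro-enabled documents carry their VBA code in a part called vbaProject.bin.
        if name.lower().endswith("vbaproject.bin"):
            flags.append("contains VBA macros: " + name)
    return flags
```

Legacy binary formats such as .xls are not ZIP files, so they come back as "not a ZIP container" and need a different tool.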

hrbrmstr commented on June 3, 2024

@jennybc that's definitely a good unconf working-group mind-meld/group convo (since I'm likely far from the typical R user and cld use some examples of daily use patterns to help with said threat modeling :-)
