Topic: ai-safety Goto Github
Something interesting about ai-safety
ai-safety,PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. Best Paper Awards @ NeurIPS ML Safety Workshop 2022
Organization: agencyenterprise
ai-safety,[ICCV2021 Oral] Fooling LiDAR by Attacking GPS Trajectory
Organization: ai4ce
Home Page: https://ai4ce.github.io/FLAT/
ai-safety,Code for "Preference Tuning For Toxicity Mitigation Generalizes Across Languages"
Organization: batsresearch
Home Page: https://arxiv.org/abs/2406.16235
ai-safety,A curated list of awesome resources for Artificial Intelligence Alignment research
User: dit7ya
ai-safety,A project to improve out-of-distribution detection (open set recognition) and uncertainty estimation by changing a few lines of code in your project! Perform efficient inferences (i.e., do not increase inference time) without repetitive model training, hyperparameter tuning, or collecting additional data.
User: dlmacedo
ai-safety,A project to add scalable state-of-the-art out-of-distribution detection (open set recognition) support by changing two lines of code! Perform efficient inferences (i.e., do not increase inference time) and detection without classification accuracy drop, hyperparameter tuning, or collecting additional data.
User: dlmacedo
ai-safety,DPLL(T)-based Verification tool for DNNs
Organization: dynaroars
ai-safety,Reading list for adversarial perspective and robustness in deep reinforcement learning.
User: ezgikorkmaz
ai-safety,A curated list of papers & technical articles on AI Quality & Safety
Organization: giskard-ai
Home Page: https://giskard.ai
ai-safety,Open-Source Evaluation & Testing for LLMs and ML models
Organization: giskard-ai
Home Page: https://docs.giskard.ai
ai-safety,AART: AI-Assisted Red-Teaming with Diverse Data Generation for New LLM-powered Applications
Organization: google-research-datasets
Home Page: https://arxiv.org/abs/2311.08592
ai-safety,Aligning AI With Shared Human Values (ICLR 2021)
User: hendrycks
ai-safety,Scan your AI/ML models for problems before you put them into production.
Organization: iqtlabs
ai-safety,A compilation of AI safety ideas, problems, and solutions.
User: jakobovski
ai-safety,LAWLIA is an open-source computational legal framework designed to revolutionize legal reasoning and analysis. It combines the power of large language models with a structured syntactical grammar to facilitate precise legal assessments, truth values, and verdicts. LAWLIA is the future of computational jurisprudence.
User: jehumtine
ai-safety,Deliver safe & effective language models
Organization: johnsnowlabs
Home Page: http://langtest.org/
ai-safety,A curated list of awesome responsible machine learning resources.
User: jphall663
ai-safety,[Findings of EMNLP 2022] Holistic Sentence Embeddings for Better Out-of-Distribution Detection
Organization: lancopku
ai-safety,[Findings of EMNLP 2022] Expose Backdoors on the Way: A Feature-Based Efficient Defense against Textual Backdoor Attacks
Organization: lancopku
ai-safety,How to Make Safe AI? Let's Discuss!
User: lets-make-safe-ai
ai-safety,An interpretability library for PyTorch
User: luanademi
Home Page: https://luanademi.github.io/toumei/
ai-safety,Analysis of the survey "Towards best practices in AGI safety and governance: A survey of expert opinion"
User: mccaffary
ai-safety,Feature Space Singularity for Out-of-Distribution Detection. (SafeAI 2021)
Organization: megvii-research
ai-safety,Evaluation & testing framework for computer vision models
Organization: moonwatcher-ai
Home Page: https://www.moonwatcher.ai/
ai-safety,RuLES: a benchmark for evaluating rule-following in language models
User: normster
Home Page: https://eecs.berkeley.edu/~normanmu/llm_rules
ai-safety,Alpha principles for the ethical use of AI and Data Driven Technologies in Ontario | Proposition de principes pour une utilisation éthique des technologies axées sur les données en Ontario
Organization: ongov
ai-safety,In situ interactive widgets for responsible AI
Organization: pair-code
Home Page: https://pair-code.github.io/farsight/
ai-safety,Code and materials for the paper S. Phelps and Y. I. Russell, Investigating Emergent Goal-Like Behaviour in Large Language Models Using Experimental Economics, working paper, arXiv:2305.07970, May 2023
User: phelps-sg
ai-safety,BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Organization: pku-alignment
Home Page: https://sites.google.com/view/pku-beavertails
ai-safety,Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Organization: pku-alignment
Home Page: https://pku-beaver.github.io
ai-safety,An attack that induces hallucinations in LLMs
Organization: pku-yuangroup
Home Page: http://arxiv.org/abs/2310.01469
ai-safety,QROA: A Black-Box Query-Response Optimization Attack on LLMs
User: qroa
ai-safety,Website to track people, organizations, and products (tools, websites, etc.) in AI safety
User: riceissa
Home Page: https://aiwatch.issarice.com/
ai-safety,A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
User: ryoungj
Home Page: https://toolemu.com/
ai-safety,[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
Organization: safeailab
Home Page: https://arxiv.org/abs/2309.07124
ai-safety,[NeurIPS '23 Spotlight] Thought Cloning: Learning to Think while Acting by Imitating Human Thinking
User: shengranhu
Home Page: https://www.shengranhu.com/ThoughtCloning/
ai-safety,AI Safety Q&A web frontend
Organization: stampyai
Home Page: https://aisafety.info
ai-safety,Awesome PrivEx: Privacy-Preserving Explainable AI (PPXAI)
User: tamlhp
Home Page: https://awesome-privex.github.io/
ai-safety,Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
User: tigerlab-ai
Home Page: https://www.tigerlab.ai
ai-safety,Code accompanying the paper Pretraining Language Models with Human Preferences
User: tomekkorbak
Home Page: https://arxiv.org/abs/2302.08582
ai-safety,Codebase for the paper "The Remarkable Robustness of LLMs: Stages of Inference?"
User: vdlad
Home Page: https://arxiv.org/pdf/2406.19384
ai-safety,Full code for the sparse probing paper.
User: wesg52
Home Page: https://arxiv.org/abs/2305.01610
ai-safety,Universal Neurons in GPT2 Language Models
User: wesg52
ai-safety,An unrestricted attack based on diffusion models that can achieve both good transferability and imperceptibility.
User: windvchen
ai-safety,A novel physical adversarial attack tackling the Digital-to-Physical Visual Inconsistency problem.
User: windvchen
ai-safety,LAMBDA is a model-based reinforcement learning agent that uses Bayesian world models for safe policy optimization
User: yardenas
ai-safety,The official implementation of the paper "Data Contamination Calibration for Black-box LLMs" (ACL 2024)
User: yyy01
ai-safety,Code for our paper "ModelObfuscator: Obfuscating Model Information to Protect Deployed ML-Based Systems", published at ISSTA'23
User: zhoumingyi