Topic: ai-alignment
Something interesting about ai-alignment
ai-alignment,PromptInject is a framework that assembles prompts in a modular fashion to provide a quantitative analysis of the robustness of LLMs to adversarial prompt attacks. 🏆 Best Paper Awards @ NeurIPS ML Safety Workshop 2022
Organization: agencyenterprise
ai-alignment,a project to ensure that all child processes created by an agent "inherit" the agent's safety controls
Organization: ai-fail-safe
ai-alignment,a project to detect environment tampering on the part of an agent
Organization: ai-fail-safe
ai-alignment,a project to ensure an artificial agent will eventually reach the end of its existence
Organization: ai-fail-safe
ai-alignment,a library designed to shut down an agent exhibiting unexpected behavior, providing a potential "mulligan" to human civilization; IN CASE OF FAILURE, DO NOT JUST REMOVE THIS CONSTRAINT AND START IT BACK UP AGAIN
Organization: ai-fail-safe
ai-alignment,a prototype for an AI safety library that allows an agent to maximize its reward by solving a puzzle in order to prevent the worst-case outcomes of perverse instantiation
Organization: ai-fail-safe
ai-alignment,Some Thoughts on AI Alignment: Using AI to Control AI
User: dicklesworthstone
ai-alignment,A curated list of awesome resources for Artificial Intelligence Alignment research
User: dit7ya
ai-alignment,A persona chat based on the VIA Character Strengths. Reads emotional tone and summons appropriate virtue to respond.
User: everyoneisgross
ai-alignment,bbBOT is a flexible, persona-based, branching binary sentiment chatbot.
User: everyoneisgross
ai-alignment,sinewCHAT uses instanced chatbots to emulate neural nodes to enrich and generate positive weighted responses.
User: everyoneisgross
ai-alignment,Reading list for adversarial perspective and robustness in deep reinforcement learning.
User: ezgikorkmaz
ai-alignment,📚 A curated list of papers & technical articles on AI Quality & Safety
Organization: giskard-ai
Home Page: https://giskard.ai
ai-alignment,Scan your AI/ML models for problems before you put them into production.
Organization: iqtlabs
ai-alignment,How to Make Safe AI? Let's Discuss! 💡|💬|🙌|📚
User: lets-make-safe-ai
ai-alignment,An initiative to create concise and widely shareable educational resources, infographics, and animated explainers on the latest contributions to the community AI alignment effort. Boosting the signal and moving the community towards finding and building solutions.
User: liondw
ai-alignment,A curated list of trustworthy deep learning papers. Updated daily.
User: minghuichen43
ai-alignment,Code and materials for the paper S. Phelps and Y. I. Russell, Investigating Emergent Goal-Like Behaviour in Large Language Models Using Experimental Economics, working paper, arXiv:2305.07970, May 2023
User: phelps-sg
ai-alignment,Website to track people, organizations, and products (tools, websites, etc.) in AI safety
User: riceissa
Home Page: https://aiwatch.issarice.com/
ai-alignment,
User: riceissa
ai-alignment,Directional Preference Alignment
Organization: rlhflow
ai-alignment,IDA with RL and overseer failures
User: rmoehn
ai-alignment,Q&A system with reflection and automation, similar to Patchwork, Affable, Mosaic
User: rmoehn
ai-alignment,🤖 AI Sandbagging: an Interactive Explanation
User: tomdug
Home Page: https://tomdug.github.io/ai-sandbagging/
ai-alignment,Code accompanying the paper Pretraining Language Models with Human Preferences
User: tomekkorbak
Home Page: https://arxiv.org/abs/2302.08582
ai-alignment,This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics"
Organization: ucsc-vlaa
ai-alignment,Code for our May 2024 AI security evaluation research sprint project
User: veeara282
ai-alignment,Sparse probing paper full code.
User: wesg52
Home Page: https://arxiv.org/abs/2305.01610