retext-equality's Introduction

retext-equality

retext plugin to check for possible insensitive, inconsiderate language.

What is this?

This package is a unified (retext) plugin to check for certain words that could be considered insensitive, or otherwise inconsiderate, in certain contexts.

When should I use this?

You can opt into this plugin when you’re dealing with your own text and want to check for potential mistakes.

Install

This package is ESM only. In Node.js (version 16+), install with npm:

npm install retext-equality

In Deno with esm.sh:

import retextEquality from 'https://esm.sh/retext-equality@7'

In browsers with esm.sh:

<script type="module">
  import retextEquality from 'https://esm.sh/retext-equality@7?bundle'
</script>

Use

Say our document example.txt contains:

Now that the child elements are floated, obviously the parent element will collapse.

…and our module example.js contains:

import retextEnglish from 'retext-english'
import retextEquality from 'retext-equality'
import retextStringify from 'retext-stringify'
import {read} from 'to-vfile'
import {unified} from 'unified'
import {reporter} from 'vfile-reporter'

const file = await unified()
  .use(retextEnglish)
  .use(retextEquality)
  .use(retextStringify)
  .process(await read('example.txt'))

console.error(reporter(file))

…then running node example.js yields:

example.txt
1:42-1:51 warning Unexpected potentially insensitive use of `obviously`, try not to use it obvious retext-equality

⚠ 1 warning

API

This package exports no identifiers. The default export is retextEquality.

unified().use(retextEquality[, options])

Check potentially insensitive language.

Parameters
  • options (Options, optional) — configuration
Returns

Transform (Transformer).

Options

Configuration (TypeScript type).

Fields
  • ignore (Array<string>, optional) — phrases not to warn about
  • binary (boolean, default: false) — whether to allow “he or she”, “garbagemen and garbagewomen”, and similar
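
For example, a configured processor might look like this (a sketch based on the fields above; the `ignore` values are placeholders for phrases you do not want flagged):

import retextEnglish from 'retext-english'
import retextEquality from 'retext-equality'
import retextStringify from 'retext-stringify'
import {unified} from 'unified'

const processor = unified()
  .use(retextEnglish)
  .use(retextEquality, {
    // Placeholder phrases not to warn about.
    ignore: ['obviously', 'just'],
    // Allow binary pairs such as “he or she”.
    binary: true
  })
  .use(retextStringify)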

Messages

See rules.md for a list of rules and how rules work.

Each message is emitted as a VFileMessage with source set to 'retext-equality', ruleId to an id from rules.md, actual to the not ok phrase, and expected to suggested phrases. Some messages also contain a note with extra info.
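
A sketch of reading those fields off each message, reusing the setup from the Use section above:

import retextEnglish from 'retext-english'
import retextEquality from 'retext-equality'
import retextStringify from 'retext-stringify'
import {unified} from 'unified'

const file = await unified()
  .use(retextEnglish)
  .use(retextEquality)
  .use(retextStringify)
  .process('Now that the child elements are floated, obviously the parent element will collapse.')

for (const message of file.messages) {
  console.log(message.source) // 'retext-equality'
  console.log(message.ruleId) // 'obvious'
  console.log(message.actual) // 'obviously'
  console.log(message.expected) // suggested phrases, if any
  console.log(message.note) // extra info, when present
}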

Types

This package is fully typed with TypeScript. It exports the additional type Options.

Compatibility

Projects maintained by the unified collective are compatible with maintained versions of Node.js.

When we cut a new major release, we drop support for unmaintained versions of Node. This means we try to keep the current release line, retext-equality@^7, compatible with Node.js 16.

Related

Contributing

See contributing.md in retextjs/.github for ways to get started. See support.md for ways to get help.

This project has a code of conduct. By interacting with this repository, organization, or community you agree to abide by its terms.

To create new patterns, add them in the YAML files in the data/ directory, and run npm install and then npm test to build everything. Please see the current patterns for inspiration. New English rules will automatically be added to rules.md.

When you are happy with the new rule, add a test for it in test.js, and open a pull request.

License

MIT © Titus Wormer

retext-equality's Issues

Gendered proverbs suggestions

In the lines here, "ladylike" and "like a man" are said to be inconsiderate, which totally makes sense.

However, the suggestions to fix these words are completely outrageous. Why on earth does a program that is made to promote equality for everyone contain suggestions that "ladylike" should be replaced with "cultured"? Is this some sort of subversive way to suggest that ladies should be cultured?

This problem also exists with "like a man", for which the suggestion is "bravely". Is the program implying that men should be brave?

I just thought I'd bring this point to attention as these suggestions really seem to lower the quality of the program and may make people think that the program is not actually promoting social equality :)

Bug with missing module

Hi team, I've been using your repo, it's great. But with the latest release of 3.9.0 I'm getting an npm error now (see error below). It looks like index.js in lib still requires it, but the patterns.json file no longer exists.

Error: Cannot find module './patterns.json'

I'd like to continue to use this, so hopefully you can fix this issue soon :)

Best,
Derek

Add support for "Cracker"

Initial checklist

Problem

This project does not include racially insensitive words for white people.

Solution

Add the word "Cracker" as an offensive word.

Alternatives

Acknowledge on the README.md that this is an incomplete project for people who work in countries with a white-minority population.

Add 'primitive', 'savage', 'sophisticated', 'tribe', 'tribal', 'stone age'

These terms are considered by professional anthropologists to be racially charged, derogatory, and harmful to the welfare of indigenous groups. The Association of Social Anthropologists issued a statement that reads:

'All anthropologists would agree that the negative use of the terms 'primitive' and 'Stone Age' to describe [indigenous peoples] has serious implications for their welfare. Governments and other social groups. . . have long used these ideas as a pretext for depriving such peoples of land and other resources.'

Similarly, a major NGO dedicated to supporting endangered indigenous groups, Survival International writes:

Terms like 'stone age' and 'primitive' have been used to describe tribal people since the colonial era, reinforcing the idea that they have not changed over time and that they are backward. This idea is both incorrect and very dangerous. It is incorrect because all societies adapt and change, and it is dangerous because it is often used to justify the persecution or forced 'development' of tribal peoples. The results are almost always catastrophic: poverty, alcoholism, prostitution, disease and death.

Could it be something like these, below?

- type: simple
  note: Replace racist language about indigenous peoples with more accurate and inclusive words
  considerate:
    - simple
    - indigenous
    - hunter-gatherer
  inconsiderate:
    - primitive
    - savage
    - stone age
- type: simple
  note: Replace racist language about indigenous peoples with more accurate and inclusive words
  considerate:
    - society
    - community
  inconsiderate:
    - tribe

I'm not sure about this next one. Is this a good way to do it?

- type: simple
  note: Replace racist language about indigenous peoples with more accurate and inclusive words
  considerate:
    - complex culture
    - complex technology
  inconsiderate:
    - sophisticated culture
    - sophisticated technology

What do you think?

Accept the responsibility associated with your work

Initial checklist

Problem

I have followed each of @Murderlon's suggestions whilst trying to contribute to this project.

This project is downloaded by 20,000 people a week who are actively working to try and improve their use of language.

This project, according to @Murderlon's comment in #118, contains only words which are "inconsiderate but is not necessarily considered profane".

This project considers "sand niggers" to be a lesser insult than "cracker" by including it only in this project and not in the profanities project.

The use of the phrase "sand niggers" is explicitly illegal in the United Kingdom and would therefore clearly be better suited to the profanity project. This is also true of the other racial epithets in this project.

@Murderlon has said that his rationale for this is so obvious that he doesn't have to explain it. Given the illegality of the words used in this project I cannot see that it is that obvious.

One might also think that if the true aim of this project was to improve the language used by people who download it then a discussion (not the immediate closure of issues) followed by an explanation might move not only the project, but the aims of the project as a whole, in the right direction.

I would like to invite @Murderlon to join a call with me and some friends from the Muslim Council of Britain to see if we can correct his thinking.

Solution

Accept responsibility

Alternatives

Acknowledge failure

Get non-software engineers to help

Initial checklist

Problem

In #122 @Murderlon makes the suggestion that anti-Arab racism is not illegal in and of itself in the United Kingdom. "Sand nigger" is not dealt with as the concatenation of the word "sand" with an actual offensive term. It is an offensive term that has legal consequences all of its own.

If the legality of a word determines which repo it sits within then this project should retain legal counsel in each country it intends to support -- if only to ensure the package is accurate.

@Murderlon, a software engineer, is not a lawyer and is making decisions that make absolutely zero logical sense in this regard.

Despite acknowledging that this project should accept input, discuss it, and try to move towards better language for all he then immediately closes the issue.

The level of effort being put into preventing insults towards Arabs from being treated the same as insults towards white people is concerning. Thus, I have reported this project to both npm and GitHub for violating community guidelines.

This entire mess is a really good example of why this project perhaps doesn't make sense at all.

Solution

  • Retain counsel in every country this project aims to support if you're making decisions based on legality. Software engineers are NOT lawyers.

  • Stop running the project. Inaccuracies on the issue of racism are worse than the project not existing at all. This could very easily result in an increase in the use of "sand niggers", as it is a lesser problem according to the linter... almost making it acceptable to some people, much like a compiler warning vs an error.

  • Discuss things in issues, don't just close them

Alternatives

Acknowledge that the issues this repo attempts to solve are socio-cultural.

Acknowledge that software engineers are famously not the right people to solve socio-cultural problems.

Acknowledge that people who cannot accept constructive criticism on socio-cultural issues should not be running projects which aim to improve socio-cultural problems.

Spanish Translation Integration

I need some help on where to get started with integrating the Spanish wordlist from words/cuss. If anyone has a hint or idea of where to get started, that would be greatly appreciated :)

All racial epithets should be equal

Initial checklist

Problem

No one person can belong to all racial groups.

This means that no one person can objectively determine how offensive a word is.

Tier-ranking how offensive one racial epithet is relative to another seems like a slippery slope to the type of problem this project purports to be trying to solve.

Solution

All racial epithets should be in this package. They're an equality issue, not a profanity issue. They should all be considered at the same level and as one evil.

Alternatives

Write supporting documentation which details the thought process behind the tier ranking of each racial epithet. I suspect this won't be pretty.

Suggest `spree`, (?) instead of `binge`

Subject of the feature

Suggest spree, other things (?) instead of binge

Problem

binge might be insensitive towards folks with eating disorders

Expected behavior

🤷‍♂️


Suggested over email

German Language Support

Initial checklist

Problem

I am supporting a colleague's project to build a browser plugin for encouraging gender-neutral language in German. I was really happy to find retext-equality and think it could be a brilliant base for some of the functionality.

I want to support my colleague in adding German language to retext-equality. (Some of the contributors won't be familiar with programming and I don't speak German.) I have seen that there are similar approaches to support retext-profanities (get-alex/alex#212). I also found a relevant issue here which ended with @wooorm refactoring the code to support multiple languages (#59). This is great!

I've started following the suggestions in #59 (comment) by adding a couple of basic rules and I am now trying to add the tests. I have a few questions about the best way to do this as it will form the template for both German and adding other languages.

Solution

  1. Should I follow the approach taken in retext-profanities and export retextEqualityDe from de.js?

  2. How should I set up tests?

I'm not super familiar with writing tests in JavaScript but it looks simple enough. That said, I would like some support in creating a template for testing non-English language patterns in retext-equality.

The main reference I have found is that retext-profanities has the following test (https://github.com/retextjs/retext-profanities/blob/main/test.js#L35):

const fileFr = await retext().use(retextProfanitiesFrench).process('Merde!')

assert.deepEqual(
  fileFr.messages.map(String),
  ['1:1-1:6: Don’t use `Merde`, it’s profane'],
  'should support other languages'
)

This only runs for one test, whereas in retext-equality there are many tests for English.

Could someone give me an example of how a good template for the non-English tests would look?
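
Something along these lines is roughly what I imagine (a sketch only; `retextEqualityDe`, its import path, the fixture sentence, and the expected message are all hypothetical placeholders):

import assert from 'node:assert/strict'
import test from 'node:test'
import {retext} from 'retext'
// Hypothetical language-specific entry point, mirroring retext-profanities.
import retextEqualityDe from 'retext-equality/de'

test('retext-equality (German)', async function () {
  // Placeholder fixture: a German sentence containing a pattern under test.
  const fileDe = await retext().use(retextEqualityDe).process('<German fixture>')

  assert.deepEqual(
    fileDe.messages.map(String),
    ['<expected message>'],
    'should support German patterns'
  )
})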

Alternatives

I could intuitively push forward or not put in tests, but I don't think either is a good idea!

LatinX is not a synonym for Mexican

Initial checklist

Affected packages and versions

6.2.0

Link to runnable example

No response

Steps to reproduce

See the discussion from January. As it is now, this library replaces the word "Mexican" in a phrase such as "Mexican Indie Rock" with "Latinx". The rule behind this suggests that "Latino" and "Latina" should be replaced with the "gender inclusive" word "Latinx".

I'm not here to discuss the acceptability of Latinx over Latino and Latina. Instead I'd like to ask why "Mexican" should be included as a target for replacement following this rule.

I understand that the current rule that favors "Latinx" over "Mexican" was probably written to prevent English speakers from using "Mexican" to refer to people from South or Central America. However, that is:

  1. Totally unrelated to the rule "try to be gender inclusive"
  2. Definitely not what we should assume is happening when someone uses the word "Mexican"

If I understand the discussion started by @metaverde, that user actually seems to suggest this project implement a rule like this:

- type: basic
  note: Avoid describing Indigenous people in terms of their colonizers.
  considerate:
    - people from Mexico, South America, or Central America
  inconsiderate:
    - latino
    - latina
    - latinx

In their view, "Latinx" actually is "describing Indigenous people in terms of their colonizers". So, I can imagine they were a bit shocked when this project suggested the opposite: to replace "Mexican" with "Latinx". The user didn't seem to see any issue with the stated goal of the rule, to prevent gendered language. In any case, I'll open a pull request #114 to simply drop "Mexican" from the "inconsiderate" words that trigger that gender-neutral rule.

As for the other suggested rule, for now I would advise this project to avoid pitting words like "Mexican" and "Latinx" against each other. The Washington Post article cited by the user @metaverde in the discussion is more about the gender-neutral language issue than about the acceptability of common terms like "Hispanic" or "Latino".

Expected behavior

"Mexican Indie Rock," as well as most uses of the word "Mexican," should not be flagged as offensive.

Actual behavior

"Mexican Indie Rock" is flagged as offensive.

Runtime

Node v16

Package manager

yarn v2

OS

macOS

Build and bundle tools

No response

Add support for "Cracker"

Initial checklist

Problem

@Murderlon has asked me to re-open this issue as it covers a specific term that can be rectified. He outlines his reasoning for this in #119.

No one person can belong to all racial groups.

This means that no one person can objectively determine how offensive a word is.

Tier-ranking how offensive one racial epithet is relative to another seems like a slippery slope to the type of problem this project purports to be trying to solve.

If sand nigger is appropriate here, so is cracker.

Without making a judgement on the validity of people's hurt feelings, we cannot come up with a reason for why one is here and another isn't.

Solution

Add cracker to this repo

Alternatives

Remove sand nigger from this repo

Seems like "man" is missing from "gals-men"

Subject of the issue

Seems like `man` is missing from `gals-men`: `woman` is picked up, but `man` is not.

Your environment

  • OS: Mac 11.1
  • Packages:
  • Env: node v15.5

Steps to reproduce

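A minimal reproduction might look like this (a sketch; the exact input sentence is a guess, and it assumes the retext and retext-equality packages are installed):

import {retext} from 'retext'
import retextEquality from 'retext-equality'

const file = await retext()
  .use(retextEquality)
  .process('A woman and a man.')

// Observed: a warning is emitted for `woman`, but none for `man`.
console.log(file.messages.map(String))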

Expected behavior

  • woman -> flagged
  • man -> flagged

Actual behavior

  • woman -> flagged
  • man -> not flagged

Adding notes for some LGBTQ stuff

Hey there! I am working on adding some notes for LGBTQ+ terms (and one ableist note) and I had two questions.

  1. Do I need to add notes to the tests somewhere?

  2. In this test, if I am changing the note to be an actual description, does this test need to be replicated somewhere else?

t.same(
  retext()
    .use(equality)
    .processSync(doc).messages[0].note,
  'If possible, describe exacly what this is. (source: http://ncdj.org/style-guide/)',
  'should patch `description` when applicable'
)

Once I get some answers and have committed a few rules/notes, I'd love to help get a document set up about contributing, if that would be helpful!

Thanks!

Political slogan

Hi! Is it within the scope of this package to include screening for phrases similar to "make America great again," which is a political slogan? It is problematic to many of the groups that this library tries to prevent insensitive language towards. Unfortunately it shows up in a lot of things as a joke, since it's an easy play on words, but it's pretty hurtful to some people. Sources: one and two

Thoughts? Ideally something like this would be a regex-type match of anything like "make ___ great again" but I don't see that kind of match in any of the other examples. Possibly checking for "great again" would be enough. I haven't thought of any false positives that don't sound like a reference to the phrase.
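
For illustration, a regex along these lines would cover the fill-in-the-blank form (a sketch; as far as I can tell the current data format has no way to express this kind of match):

// Matches “make <something> great again”, case-insensitively.
const slogan = /\bmake\s+(?:\w+\s+)+great\s+again\b/i

console.log(slogan.test('Make America great again')) // true
console.log(slogan.test('This movie is great, again.')) // false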

I am happy to PR this if you think it's within scope and agree that "great again" is a reasonable search term. Thank you!

Add `straight-forward` to condescending list

Initial checklist

Problem

Like most words in the condescending list, `straight-forward` adds little to a sentence. Often a sentence containing it can be deleted without loss of value.

The method is straight-forward.
The FooBar algorithm is straight-forward.
You can follow the straight-forward steps to solve your problem.

It is somewhat like `just` and `easy`, which are on that list.

Solution

Add straight-forward to condescending list.

Alternatives

Don't add straight-forward to condescending list.
Maybe it isn't a great fit.

Don't flag the word `mental` in the phrase "mental health"

Subject of the issue

Currently the word "mental" is flagged as inconsiderate in the context of the phrase "mental health".

Expected behaviour

When used in a harmless context such as the phrase "mental health", the word "mental" should not be flagged.

Actual behaviour

The word "mental" is flagged as inappropriate.

Autistic isn't a slur

"Autistic" is a valid way to refer to a person with autism and should be preferred to "person with autism spectrum disorder", not the other way around. Autism is a disability, not a mental disorder.

Add explanatory note to 'sophisticated technology'

Initial checklist

Problem

The term 'sophisticated technology' describes advanced technology. As a person not deeply involved in sensitive language, I don't know what's wrong with that term, as it doesn't describe a group of people but rather technical things like machines or inventions.

Solution

Could you perhaps add an explanation to that entry in race.yml, so that I can make an informed decision about whether using the word 'sophisticated' is OK in a certain context? At least, https://en.wiktionary.org/wiki/sophisticated#English doesn't generally describe the word as discouraged; it lists an obsolete meaning that might have caused this word to appear in the list of racial words.

Alternatives

Leave the users (especially non-native speakers who may only know the primary meaning of the word) unclear about the decision to include the word in the list of racial words.

Repo: readme.md example output no longer matches text

Commit 1585a36 changed the sample prose in the usage example to this sentence:

He’s pretty set on beating your butt for sheriff.

But it didn't adjust the output to match. As a result, the sample output:

example.txt
    1:1-1:4  warning  `His` may be insensitive, use `Their`, `Theirs`, `Them` instead           her-him       retext-equality
  1:31-1:37  warning  `master` / `slave` may be insensitive, use `primary` / `replica` instead  master-slave  retext-equality

⚠ 2 warnings

contains admonishments about language (master, slave) that doesn't appear anywhere in the input text.
