typos

Source code spell checker

Finds and corrects spelling mistakes among source code:

  • Fast enough to run on monorepos
  • Low false positives so you can run on PRs

Dual-licensed under MIT or Apache 2.0

Documentation

Install

Download a pre-built binary (installable via gh-install).

Or use Rust (via Cargo) to install:

cargo install typos-cli

Or use Homebrew to install:

brew install typos-cli

Or use Conda to install:

conda install typos

Or use Pacman to install:

sudo pacman -S typos

Getting Started

Most commonly, you'll either want to see what typos are present with

typos

Or have them fixed

typos --write-changes
typos -w

If there is any ambiguity (multiple possible corrections), typos will just report it to the user and move on.
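
The ambiguity rule can be pictured with a small Python sketch of the --write-changes behavior. This is an illustration only, not typos' actual implementation: the CORRECTIONS table, its entries, and the fix_word helper are hypothetical names made up for this example.

```python
# Hypothetical correction table: each known typo maps to its candidate fixes.
# (Illustrative data only; not taken from typos' real dictionary.)
CORRECTIONS = {
    "teh": ["the"],
    "fo": ["of", "for", "do", "go", "to"],
}


def fix_word(word):
    """Return (new_word, report) for a single word.

    - Unknown words are left alone and produce no report.
    - A single candidate is applied automatically.
    - Several candidates are only reported; the word is left unchanged.
    """
    candidates = CORRECTIONS.get(word)
    if not candidates:
        return word, None
    if len(candidates) == 1:
        return candidates[0], None
    return word, "'{}' could be any of: {}".format(word, ", ".join(candidates))
```

Under these assumptions, fix_word("teh") rewrites the word, while fix_word("fo") leaves it untouched and returns a message for the user to act on.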

False-positives

Sometimes, what looks like a typo is intentional, like with people's names, acronyms, or localized content.

To mark a word or an identifier (grouping of words) as valid, add it to your _typos.toml, declaring the word itself as its valid spelling:

[default]
extend-ignore-identifiers-re = [
    # *sigh* this just isn't worth the cost of fixing
    "AttributeID.*Supress.*",
]

[default.extend-identifiers]
# *sigh* this just isn't worth the cost of fixing
AttributeIDSupressMenu = "AttributeIDSupressMenu"

[default.extend-words]
# Don't correct the surname "Teh"
teh = "teh"
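
One way to picture the extend-words entry: a word that maps to itself is treated as correctly spelled. The sketch below assumes plain Python dicts standing in for the built-in and user dictionaries; BUILTIN, EXTEND_WORDS, and check_word are hypothetical names, and the real lookup inside typos may work differently.

```python
# Hypothetical built-in corrections (illustrative data only).
BUILTIN = {"teh": "the", "recieve": "receive"}

# Mirrors `teh = "teh"` under [default.extend-words].
EXTEND_WORDS = {"teh": "teh"}


def check_word(word):
    """Return the suggested correction, or None when the word is accepted."""
    merged = {**BUILTIN, **EXTEND_WORDS}  # user entries override built-ins
    correction = merged.get(word)
    if correction is None or correction == word:
        return None  # unknown word, or explicitly declared valid
    return correction
```

With the override in place, check_word("teh") returns None, while other built-in corrections such as "recieve" still apply.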

For cases like localized content, you can disable spell checking of file contents while still checking the file name:

[type.po]
extend-glob = ["*.po"]
check-file = false

(run typos --type-list to see configured file types)

If you need some more flexibility, you can completely exclude some files from consideration:

[files]
extend-exclude = ["localized/*.po"]

Integrations

Custom

typos provides several building blocks for custom native integrations:

  • - reads from stdin; with --write-changes, the corrected content is written to stdout
  • --diff to provide a diff
  • --format json to get jsonlines with exit code 0 on no errors, code 2 on typos, anything else is an error.

Examples:

# Read file from stdin, write corrected version to stdout
typos - --write-changes
# Creates a diff of what would change
typos dir/file --diff
# Fully programmatic control
typos dir/file --format json
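
A wrapper script could build on these pieces roughly as follows. This is a minimal Python sketch, assuming the typos binary is on PATH and that the JSON records are printed to stdout; run_typos, classify_exit_code, and parse_jsonlines are hypothetical helper names, and the sketch deliberately does not rely on any particular field inside the JSON records.

```python
import json
import subprocess


def parse_jsonlines(text):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def classify_exit_code(code):
    """Map the exit codes described above to a status label."""
    if code == 0:
        return "clean"   # no errors
    if code == 2:
        return "typos"   # typos were found
    return "error"       # anything else is an error


def run_typos(path):
    """Run typos on `path` (assumes the `typos` binary is on PATH)."""
    proc = subprocess.run(
        ["typos", path, "--format", "json"],
        capture_output=True,
        text=True,
    )
    status = classify_exit_code(proc.returncode)
    # Assumption: records arrive on stdout; skip parsing on a hard error.
    records = parse_jsonlines(proc.stdout) if status != "error" else []
    return status, records
```

Keeping the exit-code check separate from the parsing lets a caller tell "typos found" apart from "the tool failed" before looking at any output.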

Debugging

You can see what the effective config looks like by running

typos --dump-config -

You can then see how typos is processing your project with

typos --files
typos --identifiers
typos --words

If you need to dig in more, you can enable debug logging with -v

FAQ

Why was ... not corrected?

Does the file show up in typos --files? If not, check your config with typos --dump-config -. The [files] table controls how we walk files. If you are using files.extend-exclude, are you running into #593? If you are using files.ignore-vcs = true, is the file in your .gitignore even though git tracks it anyway? Prefer allowing the file explicitly (see #909).

Does the identifier show up in typos --identifiers or the word show up in typos --words? If not, it might be subject to one of typos' heuristics for detecting non-words (like hashes) or ambiguous words (like words after a \ escape).

If it is showing up, likely typos doesn't know about it yet.

typos maintains a list of known typo corrections to keep the false positive count low so it can safely run unassisted.

This is in contrast to most spell checking UIs people use where there is a known list of valid words. In this case, the spell checker tries to guess your intent by finding the closest-looking word. It then has a gauge for when a word isn't close enough and assumes you know best. The user has the opportunity to verify these corrections and explicitly allow or reject them.
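
The difference between the two approaches can be sketched in Python. Both word lists are tiny hypothetical stand-ins, the function names are made up for this example, and difflib's similarity cutoff is used only as a rough analogue of the "gauge" described above; neither function reflects how typos or any particular spell checker is actually implemented.

```python
import difflib

KNOWN_TYPOS = {"recieve": "receive"}           # hypothetical correction list
VALID_WORDS = ["receive", "recipe", "result"]  # hypothetical dictionary


def correct_from_known_typos(word):
    """Only flag words that appear in the list of known typos."""
    return KNOWN_TYPOS.get(word)


def correct_from_dictionary(word, cutoff=0.8):
    """Flag any word missing from the dictionary and guess the closest one.

    `cutoff` plays the role of the gauge: below it, no guess is offered
    and the word is assumed to be intentional.
    """
    if word in VALID_WORDS:
        return None
    matches = difflib.get_close_matches(word, VALID_WORDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The first function can only ever suggest corrections someone has vetted, which is what makes it safe to run unattended. The second guesses for every word missing from its dictionary, so its suggestions need a human to confirm them.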

For more on the trade-offs of these approaches, see Design.

typos's Issues

Layered config

For large projects, it can be helpful to support layered configs.

Audit API

The API has gone through some churn. We should audit it before 1.0 to make sure it's something we want.

Calculate line number / line offset only when typo is found?

Right now we proactively parse out lines and then parse within a line. What if instead we found our line number by counting the newlines afterwards? This puts the cost on the typo case, which should be rare, rather than on every case when parsing.
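
The idea in this issue can be sketched in a few lines of Python. This assumes the whole file is held in memory as bytes and that the typo's byte offset is already known; locate is a hypothetical helper, not a function from the typos codebase.

```python
def locate(buffer: bytes, offset: int):
    """Return (line_number, column) for a byte offset, both 1-based.

    The line number is found by counting newlines before the offset, so
    the work only happens once a typo has actually been found.
    The column is measured in bytes, not characters.
    """
    line_start = buffer.rfind(b"\n", 0, offset) + 1  # 0 when on the first line
    line_number = buffer.count(b"\n", 0, offset) + 1
    column = offset - line_start + 1
    return line_number, column
```

For example, with buffer = b"let a = 1;\nlet teh = 2;\n", calling locate(buffer, buffer.index(b"teh")) gives line 2, column 5.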

Support file types embedded in file types?

With #14, we're going to have special handling of different file types, but one file isn't always a single type:

  • markdown files that have code fences
    • treat markdown as non-code (no identifier support), `` as generic code, and code-fences as the specified language
  • rust comments have markdown which have code fences
  • mako files are a mixture of python and whatever the generated type will be.

Custom ignores?

Sometimes files should just be ignored for spell checking but still be picked up by all other tools.

Perf: remove allocation when case correcting by switching to KString

KStringCow has the following states:

  • Box<str>
  • 'static str
  • 's str
  • inlined string

If we add a From to it, we can possibly detect being able to use the inline string and write straight to it, avoiding the allocation when case correcting.

In addition, we'd be dropping from 4 machine words to 3 machine words iirc.

Config file support

We're developing a lot of flags. It'd be good if we added a config file so people can easily get a consistent experience

Support an any-dialect mode

Currently, all corrections force words into a single English dialect. This will cause a lot more failures in CI. We should support any dialect.

Add benchmarks

Possibly steal ripgrep's cases

Compare to scspell, the Go one that we took the list from, and some kind of baseline search, like ripgrep

Custom dictionaries

Source

  • passed in on cli
  • found on disk

Include

  • file type definitions
  • per file type corrections

Fill in misspell-go's comparison

Per-file type identifier rules

We'll need to define file types and what traits those file types should have (specialized dictionaries, _ / - as identifier characters, and whether escape sequences are supported (#3)).

This can then be extended into a config file that works with custom dictionaries (#9) to allow the user to override existing file type definitions or add their own.
