Edi.WordFilter

A basic word filter used in my blog system to mask offensive words (e.g. insulting or impertinent language).

Install

dotnet add package Edi.WordFilter

Usage

Prepare a text file containing the banned words, separated by a delimiter such as |. For example:

fuck|shit|ass

Then use it like this:

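// Path to the banned-word file. Constants.DataDirectory is not defined in this README;
// substitute the data-directory key your own application uses.
var wordFilterDataFilePath = $"{AppDomain.CurrentDomain.GetData(Constants.DataDirectory)}\\BannedWords.txt";
// Load the banned words from the file into the filter.
var filter = new TrieTreeWordFilter(wordFilterDataFilePath);

// Each banned word found in the input is replaced with asterisks.
var output = filter.FilterContent("Go fuck yourself and eat some shit!");
// output: Go **** yourself and eat some ****!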
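
The snippet above builds the path with a hard-coded `\\`, which is only a directory separator on Windows. The following is a minimal sketch of preparing a `|`-separated word list at a portable path using `Path.Combine`. The directory name `wordfilter-demo`, the class `WordListSketch`, and the sample words are hypothetical, and how the package itself parses the file is not shown in this README, so the `Load` helper is an assumption for illustration only.

```csharp
using System;
using System.IO;

// Build the path with Path.Combine so the separator matches the current OS.
var dataDirectory = Path.Combine(Path.GetTempPath(), "wordfilter-demo");
Directory.CreateDirectory(dataDirectory);
var listPath = Path.Combine(dataDirectory, "BannedWords.txt");

// Write a small pipe-separated word list, then read it back.
File.WriteAllText(listPath, "darn|heck|drat");
Console.WriteLine(string.Join(", ", WordListSketch.Load(listPath)));
// prints: darn, heck, drat

// Hypothetical helper, not part of Edi.WordFilter.
static class WordListSketch
{
    // Splits on '|', trims whitespace, and drops empty entries.
    public static string[] Load(string path) =>
        File.ReadAllText(path).Split(
            '|',
            StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
}
```

The resulting `listPath` could then be passed to the `TrieTreeWordFilter` constructor in place of the interpolated string.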

Design Details

TrieTreeWordFilter

The FilterContent() method of the TrieTreeWordFilter class filters sensitive words out of a given string. It uses a Trie data structure to find sensitive words efficiently and replaces each one with asterisks (*).

This method is efficient for filtering sensitive words, especially when there is a large set of words to filter. However, it assumes that the Trie tree has been properly initialized with all the sensitive words.
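
To make the idea concrete, here is a minimal sketch of trie-based masking. It is not the package's actual implementation: the names `MaskNode` and `MaskSketch` are hypothetical, and the choices made here (case-sensitive matching, longest match wins, no word-boundary check) are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Demo: build the trie from a pipe-separated list, then mask the matches.
var maskRoot = MaskSketch.Build("darn|heck".Split('|'));
Console.WriteLine(MaskSketch.Mask(maskRoot, "Oh heck, darn it!"));
// prints: Oh ****, **** it!

// Hypothetical node type for this sketch.
class MaskNode
{
    public Dictionary<char, MaskNode> Children { get; } = new();
    public bool IsEnd { get; set; } // true if a banned word ends at this node
}

static class MaskSketch
{
    // Inserts every word into the trie, one character per level.
    public static MaskNode Build(IEnumerable<string> words)
    {
        var root = new MaskNode();
        foreach (var word in words)
        {
            if (word.Length == 0) continue; // ignore empty entries
            var node = root;
            foreach (var ch in word)
            {
                if (!node.Children.TryGetValue(ch, out var next))
                {
                    next = new MaskNode();
                    node.Children[ch] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }
        return root;
    }

    // Replaces the longest banned word starting at each position with asterisks.
    public static string Mask(MaskNode root, string content)
    {
        var result = new StringBuilder(content);
        var i = 0;
        while (i < content.Length)
        {
            var node = root;
            var matchLength = 0;
            for (var j = i; j < content.Length; j++)
            {
                if (!node.Children.TryGetValue(content[j], out var next)) break;
                node = next;
                if (node.IsEnd) matchLength = j - i + 1; // remember the longest match
            }
            if (matchLength == 0) { i++; continue; }
            for (var k = 0; k < matchLength; k++) result[i + k] = '*';
            i += matchLength;
        }
        return result.ToString();
    }
}
```

In this sketch the scan restarts at each position and walks at most as deep as the longest banned word, so the cost depends on the content length and the longest word rather than on how many words the list contains.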

A Trie, also known as a prefix tree, is a tree-like data structure used to store a collection of strings. Each node of the Trie represents one character of a string, and the root represents the empty string (the start of every string). The strings are stored so that all descendants of a node share the prefix associated with that node. Here's a simple example of how a Trie might look when storing the words "car", "cat", and "dog":

root
├── c
│   └── a
│       ├── r
│       └── t
└── d
    └── o
        └── g

In this example, each path from the root to a node represents a string. For instance, the path from the root to the node 'r' represents the string "car".
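
The "car", "cat", "dog" example can be sketched in code as follows. `LookupNode` and `LookupSketch` are hypothetical names used only for this illustration. The sketch also shows why nodes usually carry an end-of-word flag: the path for "ca" exists in the tree, but no stored word ends there.

```csharp
using System;
using System.Collections.Generic;

var lookupRoot = LookupSketch.Build(new[] { "car", "cat", "dog" });
Console.WriteLine(LookupSketch.Contains(lookupRoot, "car"));  // True: root -> c -> a -> r ends a word
Console.WriteLine(LookupSketch.Contains(lookupRoot, "ca"));   // False: the path exists, but no word ends at 'a'
Console.WriteLine(LookupSketch.HasPrefix(lookupRoot, "ca"));  // True: "ca" is a prefix of "car" and "cat"
Console.WriteLine(LookupSketch.HasPrefix(lookupRoot, "cow")); // False: 'c' has no child 'o'

// Hypothetical node type for this sketch.
class LookupNode
{
    public Dictionary<char, LookupNode> Children { get; } = new();
    public bool IsEnd { get; set; } // true if a stored word ends at this node
}

static class LookupSketch
{
    public static LookupNode Build(IEnumerable<string> words)
    {
        var root = new LookupNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var ch in word)
            {
                if (!node.Children.TryGetValue(ch, out var next))
                {
                    next = new LookupNode();
                    node.Children[ch] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }
        return root;
    }

    // Follows the characters of text from the root; fails if an edge is missing.
    static bool TryWalk(LookupNode root, string text, out LookupNode end)
    {
        end = root;
        foreach (var ch in text)
        {
            if (!end.Children.TryGetValue(ch, out var next)) return false;
            end = next;
        }
        return true;
    }

    public static bool Contains(LookupNode root, string word) =>
        TryWalk(root, word, out var end) && end.IsEnd;

    public static bool HasPrefix(LookupNode root, string prefix) =>
        TryWalk(root, prefix, out _);
}
```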

Tries are particularly useful for operations that involve prefix matching, such as autocomplete features in text editors or web browsers, as they allow for efficient retrieval of all keys with a given prefix. They are also used in word filtering, as in the FilterContent() method described above.
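
As an illustration of prefix retrieval, the sketch below walks down to the node for a prefix and then collects every word stored beneath it. `PrefixNode` and `AutocompleteSketch` are hypothetical names, and using a `SortedDictionary` for the children is simply a choice made here so that results come back in alphabetical order.

```csharp
using System;
using System.Collections.Generic;

var acRoot = AutocompleteSketch.Build(new[] { "car", "cat", "dog" });
Console.WriteLine(string.Join(", ", AutocompleteSketch.WithPrefix(acRoot, "ca")));
// prints: car, cat

// Hypothetical node type for this sketch.
class PrefixNode
{
    // SortedDictionary iterates children in character order.
    public SortedDictionary<char, PrefixNode> Children { get; } = new();
    public bool IsEnd { get; set; }
}

static class AutocompleteSketch
{
    public static PrefixNode Build(IEnumerable<string> words)
    {
        var root = new PrefixNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var ch in word)
            {
                if (!node.Children.TryGetValue(ch, out var next))
                {
                    next = new PrefixNode();
                    node.Children[ch] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }
        return root;
    }

    // Returns every stored word that starts with the given prefix.
    public static List<string> WithPrefix(PrefixNode root, string prefix)
    {
        var results = new List<string>();
        var node = root;
        foreach (var ch in prefix)
        {
            if (!node.Children.TryGetValue(ch, out var next)) return results; // no such prefix
            node = next;
        }
        Collect(node, prefix, results);
        return results;
    }

    // Depth-first walk that records a word at every end-of-word node.
    static void Collect(PrefixNode node, string current, List<string> results)
    {
        if (node.IsEnd) results.Add(current);
        foreach (var pair in node.Children)
            Collect(pair.Value, current + pair.Key, results);
    }
}
```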

However, Tries can be memory-intensive, as each node may need to store pointers to many children. There are variations of the Trie data structure, such as the compressed Trie (also known as a Radix tree or Patricia tree), which help to mitigate this issue by merging nodes with a single child.
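
To give a rough sense of the saving, the sketch below counts the nodes of a plain Trie and the nodes that would remain after merging single-child chains. It does not build a real Radix tree; `CountNode` and `CompressionSketch` are hypothetical names, and the rule used (a non-root node that ends no word and has exactly one child is merged away) is a simplified assumption.

```csharp
using System;
using System.Collections.Generic;

var countRoot = CompressionSketch.Build(new[] { "car", "cat", "dog" });
Console.WriteLine(CompressionSketch.CountNodes(countRoot));           // 8: root, c, a, r, t, d, o, g
Console.WriteLine(CompressionSketch.CountCompressedNodes(countRoot)); // 5: root, "ca", "r", "t", "dog"

// Hypothetical node type for this sketch.
class CountNode
{
    public Dictionary<char, CountNode> Children { get; } = new();
    public bool IsEnd { get; set; }
}

static class CompressionSketch
{
    public static CountNode Build(IEnumerable<string> words)
    {
        var root = new CountNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var ch in word)
            {
                if (!node.Children.TryGetValue(ch, out var next))
                {
                    next = new CountNode();
                    node.Children[ch] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }
        return root;
    }

    // Counts every node of the plain trie, including the root.
    public static int CountNodes(CountNode node)
    {
        var total = 1;
        foreach (var child in node.Children.Values) total += CountNodes(child);
        return total;
    }

    // Counts the nodes left after merging: a non-root node that ends no word
    // and has exactly one child is folded into that child's edge label.
    public static int CountCompressedNodes(CountNode node, bool isRoot = true)
    {
        var survives = isRoot || node.IsEnd || node.Children.Count != 1;
        var total = survives ? 1 : 0;
        foreach (var child in node.Children.Values)
            total += CountCompressedNodes(child, false);
        return total;
    }
}
```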

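Disclaimer

This project (Edi.WordFilter) and its companion components are free, open-source products intended only for learning and exchange. They do not directly provide services to **; ** users should delete them immediately after downloading.

No organization or individual within ** may use this project (Edi.WordFilter) or its companion components to build any form of website or service aimed at users within **.

It must not be used for any purpose that violates the laws and regulations of the People's Republic of China (including ** Province) or of the user's own region.

The author (that is, myself) has only carried out the development and open-sourcing of the code (open source means anyone can download and use it) and has never taken part in any user's operations or profit-making activities.

The author also has no knowledge of what users subsequently do with the source code, so any legal liability arising from its use is borne by the users themselves.

"Open-source software has a vulnerability: is the author responsible? Yes!"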
