Choose

This is choose, a human-friendly and fast alternative to cut and (sometimes) awk.

Features

  • terse field selection syntax similar to Python's list slices
  • negative indexing from end of line
  • optional start/end index
  • zero-indexed
  • reverse ranges
  • slightly faster than cut for sufficiently long inputs, much faster than awk
  • regular expression field separators using Rust's regex syntax

Rationale

The AWK programming language is designed for text processing and is extremely capable in this endeavor. However, the awk command is not ideal for rapid shell use, with its requisite quoting of a line wrapped in curly braces, even for the simplest of programs:

awk '{print $1}'

Likewise, cut is far from ideal for rapid shell use, because of its confusing syntax. Field separators and ranges are just plain difficult to get right on the first try.

It is for these reasons that I present to you choose. It is not meant to be a drop-in or complete replacement for either of the aforementioned tools, but rather a simple and intuitive tool to reach for when the basics of awk or cut will do, but the overhead of getting them to behave should not be necessary.
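To make the field separator complaint concrete, here is a small sketch that uses only standard cut, tr, and awk. The helper function names are hypothetical and exist only for this illustration. It shows why selecting "the second word" with cut tends to fail on the first try when the input contains runs of spaces.

```shell
# Hypothetical helper names; only standard cut, tr, and awk are used.

# cut treats every single space as a delimiter, so a run of spaces
# yields empty fields between the words.
second_field_cut() {
    printf '%s\n' "$1" | cut -d ' ' -f 2
}

# Squeezing repeated spaces first gives the field most people expect.
second_field_cut_squeezed() {
    printf '%s\n' "$1" | tr -s ' ' | cut -d ' ' -f 2
}

# awk splits on runs of whitespace by default, at the cost of the
# quoted '{print $2}' program.
second_field_awk() {
    printf '%s\n' "$1" | awk '{print $2}'
}

second_field_cut 'alpha   beta gamma'            # prints an empty line
second_field_cut_squeezed 'alpha   beta gamma'   # prints "beta"
second_field_awk 'alpha   beta gamma'            # prints "beta"
```

Going by the usage text in this README, `choose 1` is the intended shorthand for the same selection, since fields are zero-indexed and split on whitespace by default.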

Contributing

Please see our guidelines in contributing.md.

Usage

$ choose --help
choose 1.2.0
`choose` sections from each line of files

USAGE:
    choose [FLAGS] [OPTIONS] <choices>...

FLAGS:
    -c, --character-wise    Choose fields by character number
    -d, --debug             Activate debug mode
    -x, --exclusive         Use exclusive ranges, similar to array indexing in many programming languages
    -h, --help              Prints help information
    -n, --non-greedy        Use non-greedy field separators
    -V, --version           Prints version information

OPTIONS:
    -f, --field-separator <field-separator>
            Specify field separator other than whitespace, using Rust `regex` syntax

    -i, --input <input>                                      Input file
    -o, --output-field-separator <output-field-separator>    Specify output field separator

ARGS:
    <choices>...    Fields to print. Either a, a:b, a..b, or a..=b, where a and b are integers. The beginning or end
                    of a range can be omitted, resulting in including the beginning or end of the line,
                    respectively. a:b is inclusive of b (unless overridden by -x). a..b is exclusive of b and a..=b
                    is inclusive of b

Examples

choose 5                # print the 5th item from a line (zero indexed)

choose -f ':' 0 3 5     # print the 0th, 3rd, and 5th item from a line, where
                        # items are separated by ':' instead of whitespace

choose 2:5              # print everything from the 2nd to 5th item on the line,
                        # inclusive of the 5th

choose -x 2:5           # print everything from the 2nd to 5th item on the line,
                        # exclusive of the 5th

choose :3               # print the beginning of the line to the 3rd item

choose -x :3            # print the beginning of the line to the 3rd item,
                        # exclusive

choose 3:               # print the third item to the end of the line

choose -1               # print the last item from a line

choose -3:-1            # print the last three items from a line
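For readers who think in awk, the following sketch gives rough awk counterparts for a few of the examples above. The function names are hypothetical, and the mapping is inferred from the comments in this README rather than checked against choose's actual output. The main thing it shows is the index shift: awk fields are one-indexed, while choose fields are zero-indexed.

```shell
# Hypothetical helpers; each reads lines on stdin and uses standard awk only.

# Roughly `choose N` for N >= 0: zero-indexed N becomes awk field N + 1.
awk_nth() {
    awk -v n="$1" '{ print $(n + 1) }'
}

# Roughly `choose -1`: the last field on the line.
awk_last() {
    awk '{ print $NF }'
}

# Roughly `choose A:B` for 0 <= A <= B: inclusive of the Bth item.
awk_range() {
    awk -v a="$1" -v b="$2" '{
        for (i = a + 1; i <= b + 1; i++)
            printf "%s%s", $i, (i <= b ? OFS : ORS)
    }'
}

echo 'a b c d e f g' | awk_nth 5       # prints "f"
echo 'a b c d e f g' | awk_last        # prints "g"
echo 'a b c d e f g' | awk_range 2 5   # prints "c d e f"
```

The loop in `awk_range` is the kind of overhead the Rationale section refers to: a simple inclusive range already needs a hand-written loop and separator logic in awk.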

Compilation and Installation

Installing From Source

To build choose, you will need the Rust toolchain installed. Installation instructions are available on the official Rust website.

Then, to install:

git clone https://github.com/theryangeary/choose.git
cd choose
cargo build --release
install target/release/choose <DESTDIR>

Make sure DESTDIR is a directory listed in your PATH.
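One way to check the PATH requirement is a small POSIX shell function like the sketch below. The function name `dir_in_path` and the example directory are assumptions made for illustration; they are not part of choose or its build.

```shell
# Hypothetical helper: succeed if the given directory appears in $PATH.
# This is an exact string match, so a trailing slash or a symlinked
# spelling of the same directory will not be recognized.
dir_in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example directory only; substitute the DESTDIR you installed into.
if dir_in_path "$HOME/.local/bin"; then
    echo "on PATH"
else
    echo "not on PATH"
fi
```

If the directory is missing, adding it to PATH in your shell's startup file is the usual fix.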

Installing From Package Managers

Cargo:

cargo install choose

Arch Linux:

yay -S choose-rust-git

Fedora/CentOS COPR:

dnf copr enable atim/choose
dnf install choose

Homebrew:

brew install choose-rust

MacPorts:

sudo port install choose

Benchmarking

Benchmarking is performed using the bench utility.

Benchmarking is based on the assumption that there are five files in test/ that match the glob "long*txt". GitHub doesn't support files big enough in normal repos, but for reference the files I'm working with have lengths like these:

     1000 test/long.txt
    19272 test/long_long.txt
    96360 test/long_long_long.txt
   963600 test/long_long_long_long.txt
 10599600 test/long_long_long_long_long.txt

and content generally like this:

Those an equal point no years do. Depend warmth fat but her but played. Shy and
subjects wondered trifling pleasant. Prudent cordial comfort do no on colonel as
assured chicken. Smart mrs day which begin. Snug do sold mr it if such.
Terminated uncommonly at at estimating. Man behaviour met moonlight extremity
acuteness direction.

Ignorant branched humanity led now marianne too strongly entrance. Rose to shew
bore no ye of paid rent form. Old design are dinner better nearer silent excuse.
She which are maids boy sense her shade. Considered reasonable we affronting on
expression in. So cordial anxious mr delight. Shot his has must wish from sell
