lfind-isr-project

lfind is a command-line tool written entirely in Java that uses Apache Lucene to search files and directories in the file system efficiently.

Links

YouTube Link
Google Slides
GitHub Link
Website

Demo

YouTube Demo Link

img.png img_2.png img_3.png img_4.png

Features

  • Interactive and Direct Command Modes: lfind supports both interactive mode for iterative search queries and direct command mode for quick one-time searches.
  • Search File Names and Content: Easily search for file names and within file content across various file types including PDFs, Word documents, and plain text files.
  • Advanced Query Support: Supports boolean, phrase, partial, wildcard, and prefix queries, all backed by Lucene's query capabilities.
  • Piped Input Support: Accepts piped input, so lfind can be combined with other commands in Bash scripts and pipelines.
  • Cross-Platform Compatibility: Works on any machine with Java installed. Tested on macOS and Windows; expected to run on Linux as well.
  • Rich Command Line Interface: Powered by picocli, lfind provides a user-friendly and feature-rich command line experience.
  • Automatic Index Cleanup: lfind automatically cleans up index files upon safe termination of the application, ensuring efficient resource management.
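
As an illustration of the automatic index cleanup feature, the sketch below shows one common way a Java program can remove a temporary directory when the JVM exits normally or is interrupted with Ctrl+C. This is an assumption about the general technique, not lfind's actual implementation; the class name IndexCleanupSketch, its methods, and the directory prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical sketch: not taken from the lfind source code.
class IndexCleanupSketch {

    // Creates a scratch directory that stands in for a temporary index location.
    static Path createScratchIndexDir() {
        try {
            return Files.createTempDirectory("lfind-index-sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes a directory tree; reverse ordering visits children before parents.
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Registers a JVM shutdown hook that removes the directory at exit.
    static Thread registerCleanup(Path indexDir) {
        Thread hook = new Thread(() -> deleteRecursively(indexDir), "index-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Shutdown hooks run on normal exit and on interrupt signals, but not when the process is killed forcibly, which is consistent with cleanup happening only on safe termination.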

Process Flow

img_1.png

Installation

Requirements

  • Java 11 or higher (tested with Amazon Corretto 11)
  • Tested on Windows 11 and macOS (M3 Pro)

How to run

  1. Download the Package

    • For Windows users, navigate to package/windows-pkg and run using lfind.bat.

    • For Mac users, navigate to package/mac-pkg and run using lfind.sh.

  2. Run the Application

    • Mac:

      ./lfind.sh [options] <queries>
    • Windows:

      .\lfind.bat [options] <queries>

Adding to Environment Path (Optional)

To access lfind from anywhere in the terminal, you can add the appropriate folder (package/windows-pkg for Windows or package/mac-pkg for Mac) to your system's environment path.

  • Mac:
    export PATH=$PATH:/path/to/package/mac-pkg
  • Windows:
    setx PATH "%PATH%;C:\path\to\package\windows-pkg"

Usage

The commands below assume that lfind is on your PATH (see the Adding to Environment Path section). If it is not, run them from the package directory using ./lfind.sh or .\lfind.bat instead of lfind.

Running Direct Commands

lfind [options] <query>

Running in Interactive Mode

lfind [options]

Options

  • -c, --content: Enable searching within file content (default: false).
  • -e, --expression: Treat queries as Lucene query expressions (default: false). When enabled, all queries are interpreted as Lucene queries.
  • -h, --help: Display help message and exit.
  • -m, --mimetypes=<mimeTypes>: Specify MIME types to include in content search (comma-separated, e.g., pdf,doc,text). Default is no filter; used only in content search mode.
  • -p, --path=<directory>: Specify the starting path for the search (default: current working directory).
  • -v, --verbose: Enable verbose mode to print useful debugging information.

Examples

Note: Except for piped input search, the following examples work in either interactive mode (start lfind without a query) or direct command mode (pass the query on the command line).

Piped Input Search

You can combine lfind with other commands using pipes to search within the piped input.

Search for a term (substring) in the piped input (e.g., output of ls command):

ls | lfind "query"

Search for an expression in the piped input (e.g., contents of a file):

cat "test.txt" | lfind -e "one AND two"
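
To make the idea of a term (substring) search over piped input concrete, here is a minimal sketch that reads lines from standard input and keeps those containing the term. It is not lfind's implementation, which is built on Lucene; the class name PipedFilterSketch is hypothetical, and case-insensitive matching is an assumption made for this example.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical sketch: plain substring filtering, without Lucene.
class PipedFilterSketch {

    // Returns the input lines that contain the term, ignoring case (an assumption).
    static List<String> filter(BufferedReader input, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return input.lines()
                .filter(line -> line.toLowerCase(Locale.ROOT).contains(needle))
                .collect(Collectors.toList());
    }

    // Reads piped lines from standard input and prints the matching ones.
    public static void main(String[] args) {
        String term = args.length > 0 ? args[0] : "";
        BufferedReader stdin = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        filter(stdin, term).forEach(System.out::println);
    }
}
```

With Java 11 or higher, the sketch can be tried as `ls | java PipedFilterSketch.java query`, assuming the code is saved in a file named PipedFilterSketch.java.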

File name search

Search within a specific directory path:

lfind -p "/path/to/directory" "query"

Search for a term (substring) in file names within the current directory:

lfind "query"

Search for an expression in file names:

lfind -e "one AND two"
lfind -e "one OR two"
lfind -e "one -two"
lfind -e "one AND two*"

Enable verbose mode to display additional information:

lfind -v "query"

File content search

Search for a term (substring) within file content:

lfind -c "query"

Search for an expression within file content:

lfind -ce "one AND two"

or

lfind -c -e "one AND two"

Apply file type filters (e.g., PDF and text files) to your content searches:

lfind -c -m pdf,text "query"

or

lfind -c -m pdf -m text "query"
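
The two filter forms above are shown as equivalent. One way such values could be normalized into a single set is sketched below. This is a hedged illustration, not lfind's code (which uses picocli for option parsing); the class name MimeFilterSketch and the lowercasing and trimming rules are assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: merges comma-separated and repeated filter values.
class MimeFilterSketch {

    // Splits each raw value on commas, trims and lowercases it, and drops blanks.
    static Set<String> normalize(List<String> rawValues) {
        Set<String> types = new LinkedHashSet<>();
        for (String raw : rawValues) {
            Arrays.stream(raw.split(","))
                    .map(s -> s.trim().toLowerCase(Locale.ROOT))
                    .filter(s -> !s.isEmpty())
                    .forEach(types::add);
        }
        return types;
    }
}
```

Under these assumptions, `normalize(List.of("pdf,text"))` and `normalize(List.of("pdf", "text"))` produce the same set, mirroring the two command forms.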

Experiment with different options and queries to get the most out of lfind when searching file names and content.

Contributing Guidelines

  1. Bug Reports and Feature Requests
    • If you encounter a bug or have an idea for a new feature, please open an issue on GitHub to report it.
    • Provide detailed steps to reproduce the bug, including any relevant error messages or screenshots.
    • Clearly describe the expected behavior or propose your feature idea.
  2. Code Contributions
    • Fork the repository and create a new branch for your changes.
    • Ensure your code follows the project's coding style and conventions.
    • Write clear, concise commit messages that explain the purpose of each change.
    • Open a pull request (PR) with a descriptive title and detailed description of your changes.
  3. Documentation
    • Improve existing documentation or write new documentation for features or APIs.
    • Update the README file with any necessary information for users and contributors.

Future Scope

While lfind currently provides powerful file system searching capabilities, there are several potential areas for future improvement and expansion:

  • Performance Optimizations: Investigate methods to further optimize indexing and searching for larger file systems, enhancing speed and efficiency.

  • Additional File Types: Expand file type support to include a broader range of formats, enabling comprehensive searching across various file types.

  • Enhanced Result Details: Enhance result output by providing additional details such as matched lines within files or specific substrings matched during content-based searches.

  • File Monitoring and Precomputed Indexing: Implement precomputed indexing and background file monitoring to maintain index consistency and accelerate search processes.

  • Custom Ranking and Scoring: Customize the ranking and scoring functions to tailor search results based on specific file system retrieval tasks, such as boosting file matches over folder matches or prioritizing exact matches.

  • Query Optimization: Optimize query processing internally to improve search performance, particularly for token-based searches using wildcard queries.

  • Improved Packaging: Explore solutions to resolve compatibility issues with GraalVM, potentially downgrading the Lucene version or adopting other strategies for efficient packaging.

  • User Interface (UI): Develop a graphical user interface (GUI) version for users who prefer a visual interaction.

  • Advanced Search Filters: Implement advanced search filters based on file attributes, metadata, or content properties to refine search results based on specific criteria.

  • Internationalization (i18n): Add support for different languages and localization.

Contributions and suggestions for these or other enhancements are welcome! Please feel free to open an issue or submit a pull request to discuss and collaborate on the future development of lfind.

Acknowledgements

We extend our sincere gratitude to the following individuals and resources that have contributed to the development and improvement of lfind:

  • Open Source Libraries and Tools:

    • Apache Lucene: Powerful search library used as the core engine for lfind.
    • Picocli: Command line parsing and validation library that enhances the user experience of lfind.
    • Apache PDFBox: Library for working with PDF documents, providing essential features for PDF file handling in lfind.
    • Apache Tika: Toolkit for detecting and extracting metadata and text content from various file formats, enhancing content extraction capabilities in lfind.
    • Apache POI: Library for reading and writing Microsoft Office file formats (such as Word documents), extending file type support in lfind.
  • Development Tools:

    • ChatGPT: Leveraged for generating this README and providing assistance in resolving coding issues.

Contributors

  • vva2
  • bhavan-dondapati
  • mkr-peta
  • spchimmanitamu

License

  • Licensed under the Apache-2.0 license.