telegram-bot

This project is a command-line program that downloads all relevant data from the Telegram groups and chats you have joined.

To do that, it relies heavily on the Telethon library to interface with Telegram's official API. You can read more about its usage in the Telethon docs.

Installation

Before installing, edit docker-compose.yml and config/config.yaml to provide the credentials the application needs. For everything to work, supply:

  • Pgadmin credentials (optional)
  • Postgres database credentials
  • Telegram API credentials
  • Twitter API credentials (optional)
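As a rough illustration of the required/optional split above, the sketch below checks a configuration dictionary for missing credentials. It assumes the values have already been loaded into a single dictionary, and the section names (`postgres`, `telegram`, `pgadmin`, `twitter`) are hypothetical; the project's actual schema may differ.

```python
# Hypothetical section names: these are assumptions for illustration,
# not the project's actual configuration schema.
REQUIRED_SECTIONS = ("postgres", "telegram")
OPTIONAL_SECTIONS = ("pgadmin", "twitter")


def missing_credentials(config: dict) -> list[str]:
    """Return the required sections that are absent or empty in `config`."""
    return [name for name in REQUIRED_SECTIONS if not config.get(name)]


example = {
    "postgres": {"user": "bot", "password": "secret"},
    "telegram": {},  # present but empty, so it is still reported as missing
}
print(missing_credentials(example))  # prints ['telegram']
```

Optional sections are deliberately not checked, so leaving out the Pgadmin or Twitter credentials does not produce an error.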

Then initialize the infrastructure and the virtual environment:

docker-compose up -d
poetry install
poetry run alembic upgrade head

Running the code

The program implements several features. To see the current list, run poetry run python main.py --help.
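For orientation, the flags documented in the rest of this README could be declared with the standard `argparse` module roughly as follows. This is only a sketch: the real main.py may be organized differently, and the help strings are paraphrased from this README rather than copied from the program.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the flags described in this README (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Download data from joined Telegram groups and chats.",
    )
    parser.add_argument("--list-dialogs", action="store_true",
                        help="list joined groups and channels with their IDs")
    parser.add_argument("--without-media", action="store_true",
                        help="download messages only, skipping media")
    parser.add_argument("--get-participants", action="store_true",
                        help="download only users")
    parser.add_argument("--download-past-media", action="store_true",
                        help="download media from already seen messages")
    parser.add_argument("--search-twitter", action="store_true",
                        help="search old tweets for invite links")
    parser.add_argument("--search-messages", action="store_true",
                        help="search collected messages for invite links")
    return parser


args = build_parser().parse_args(["--without-media"])
print(args.without_media, args.list_dialogs)  # prints: True False
```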

By default, it runs on every joined group and chat. To change that behavior, specify a whitelist or a blacklist in config/config.yaml.
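One plausible way to apply such a filter is sketched below. The semantics are an assumption, not taken from the project: a non-empty whitelist restricts processing to the listed chats, otherwise every chat is processed except those in the blacklist. The function name `is_selected` is hypothetical.

```python
from typing import Collection, Optional


def is_selected(chat_id: int,
                whitelist: Optional[Collection[int]] = None,
                blacklist: Optional[Collection[int]] = None) -> bool:
    """Decide whether a chat should be processed (assumed semantics).

    A non-empty whitelist wins: only listed chats are processed.
    Otherwise every chat is processed unless it is blacklisted.
    """
    if whitelist:
        return chat_id in whitelist
    if blacklist:
        return chat_id not in blacklist
    return True


print(is_selected(42))                        # prints True
print(is_selected(42, whitelist=[1, 2, 3]))   # prints False
print(is_selected(42, blacklist=[42]))        # prints False
```

The chat IDs to put in either list can be obtained with the `--list-dialogs` option.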

List all chats

To list all joined groups and channels, along with their IDs, type

poetry run python main.py --list-dialogs
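For readers unfamiliar with Telethon, listing dialogs with the library looks roughly like the sketch below. It is not the project's implementation: the function names, the session name, and the tab-separated output format are assumptions. Running it requires the `telethon` package and real API credentials (obtainable from my.telegram.org), and the first run prompts interactively for a login.

```python
import asyncio


def format_dialog(dialog_id: int, name: str) -> str:
    """Render one dialog as '<id><TAB><name>' (hypothetical output format)."""
    return f"{dialog_id}\t{name}"


async def list_dialogs(api_id: int, api_hash: str,
                       session: str = "telegram-bot") -> None:
    """Print the ID and name of every dialog the account has joined."""
    # Imported here so format_dialog can be used without Telethon installed.
    from telethon import TelegramClient

    # Entering the context manager starts the client and logs in if needed.
    async with TelegramClient(session, api_id, api_hash) as client:
        async for dialog in client.iter_dialogs():
            print(format_dialog(dialog.id, dialog.name))


# Usage, with placeholder credentials that must be replaced by real ones:
# asyncio.run(list_dialogs(api_id=12345, api_hash="0123456789abcdef"))
```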

Download everything

To download channels, users, messages and media from joined groups and channels, type

poetry run python main.py

Download only text

To speed up the process by downloading only messages from joined groups and channels, skipping media, type

poetry run python main.py --without-media

Download only users

To download only users from joined groups and channels, type

poetry run python main.py --get-participants

Download only past media (WIP)

To download only media from already seen messages, type

poetry run python main.py --download-past-media

Download only groups/channels

TO BE IMPLEMENTED

Searching for invite links

In order to enumerate channels and groups to join, there are three strategies available:

  • Manual search: search the web yourself for available invite links. Any channel or group joined through the official app will show up the next time you run this code.
  • Twitter search: use twarc2 to automate the search for invite links in old tweets.
  • Telegram/Messages search: search the already collected Telegram messages for invite links.

For now, the program joins every valid invite link found by the automated methods; there is no option for manual approval.

Twitter's method

You can change the queries used to search Twitter by modifying the search_queries.txt file.

poetry run python main.py --search-twitter
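The format of search_queries.txt is not described here. Assuming it holds one query per line, reading it could look like the sketch below; the function name `parse_queries` and the sample queries are made up for illustration.

```python
def parse_queries(text: str) -> list[str]:
    """Split file contents into queries, assuming one query per line.

    Blank lines are skipped. Lines starting with '#' are kept, because
    a Twitter query may legitimately begin with a hashtag.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


sample = 't.me/joinchat\n\n"telegram group" lang:pt\n#telegram t.me\n'
print(parse_queries(sample))
# prints ['t.me/joinchat', '"telegram group" lang:pt', '#telegram t.me']
```

To use it with the actual file, pass in the contents read with `open("search_queries.txt", encoding="utf-8").read()`.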

Telegram's method

poetry run python main.py --search-messages
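Searching messages for invite links comes down to pattern matching on the message text. The sketch below shows one way to do it with a regular expression covering the two private invite formats, `t.me/joinchat/<hash>` and `t.me/+<hash>`. The pattern and the function name are assumptions; the project may match links differently, and links to public usernames (such as `t.me/somechannel`) are deliberately not matched.

```python
import re

# Matches t.me or telegram.me invite links, with or without the scheme,
# and captures the invite hash. The pattern is an illustrative assumption.
INVITE_LINK = re.compile(
    r"(?:https?://)?(?:t|telegram)\.me/(?:joinchat/|\+)([A-Za-z0-9_-]+)"
)


def find_invite_hashes(text: str) -> list[str]:
    """Return the unique invite hashes in `text`, in order of first appearance."""
    return list(dict.fromkeys(INVITE_LINK.findall(text)))


message = (
    "Join https://t.me/joinchat/AAAAAE9xyz_12 or t.me/+AbCdEf-123, "
    "not https://t.me/somechannel"
)
print(find_invite_hashes(message))  # prints ['AAAAAE9xyz_12', 'AbCdEf-123']
```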

Export collected data

You can export the collected data stored in the database as a pg_dump file.

poetry run python scripts.py export [--dest-file DEST_FILE] [--compress]
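As an illustration of what such an export involves, the sketch below builds a `pg_dump` command line. It is not the code in scripts.py: the function name, the database name, and the way `--compress` is mapped to pg_dump's custom archive format are all assumptions.

```python
import subprocess


def build_pg_dump_command(dbname: str, dest_file: str,
                          compress: bool = False) -> list[str]:
    """Build a pg_dump invocation that writes `dbname` to `dest_file`."""
    command = ["pg_dump", f"--dbname={dbname}", f"--file={dest_file}"]
    if compress:
        # The custom archive format is compressed by default and is
        # restored with pg_restore rather than psql.
        command.append("--format=custom")
    return command


print(build_pg_dump_command("telegram", "dump.sql"))
# prints ['pg_dump', '--dbname=telegram', '--file=dump.sql']

# To actually run it (requires pg_dump and a reachable database):
# subprocess.run(build_pg_dump_command("telegram", "dump.pgc", compress=True), check=True)
```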

Related work

These are two related repositories that heavily inspired the development of telegram-bot.

Contributors

pedro-h-dias