hadith-data-sets's Introduction

Hadith-Data-Sets

There are two hadith files. The first contains all 62,169 hadiths from the Nine Books, both with tashkil (diacritics) and without tashkil. The second contains pre-processed hadith data, to which normalization, stop-word removal, and lemmatization have been applied.
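As an illustration of what "without tashkil" means, here is a minimal Python sketch that strips Arabic diacritics from a string. It is not the repository's own preprocessing code, which is not shown here. The character range used (U+064B to U+0652, plus the superscript alef U+0670 and the tatweel U+0640) is an assumption about which marks count as tashkil, and the name `strip_tashkil` is hypothetical.

```python
import re

# Assumed set of tashkil marks: fathatan through sukun (U+064B-U+0652)
# plus the superscript alef (U+0670).
TASHKIL = re.compile(r"[\u064B-\u0652\u0670]")

# Tatweel (kashida) is an elongation character, not a diacritic;
# removing it here is an extra normalization choice, not a requirement.
TATWEEL = "\u0640"


def strip_tashkil(text: str) -> str:
    """Return `text` with the assumed diacritics and tatweel removed."""
    return TASHKIL.sub("", text).replace(TATWEEL, "")


if __name__ == "__main__":
    # The vocalized word is written with escapes to keep the source ASCII-safe.
    vocalized = "\u062d\u064e\u062f\u0651\u064e\u062b\u064e\u0646\u064e\u0627"
    print(strip_tashkil(vocalized))
```

Stop-word removal and lemmatization depend on the word list and lemmatizer chosen, so they are left out of this sketch.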

Number of hadiths across all books: 62,169

Book Name: Number of Hadiths
Sahih Bukhari: 7008
Sahih Muslim: 5362
Sunan al Tirmidhi: 3891
Sunan al-Nasai: 5662
Sunan Abu Dawud: 4590
Sunan Ibn Maja: 4332
Musnad Ahmad ibn Hanbal: 26363
Maliks Muwatta: 1594
Sunan al Darami: 3367

hadith-data-sets's People

Contributors

abdelrahmaan, mohamed-ali

hadith-data-sets's Issues

Number of Hadiths is wrong or incomplete

The hadith counts aren't correct:

  • Sunan al-Nasai Without_Tashkel.csv does not have the same number of lines as Sunan al-Nasai.csv
  • Musnad Ahmad ibn Hanbal.csv has 18445, but in the project description you wrote 26363
  • Sunan Abu Dawud.csv has 4601, but you wrote 4590
  • Sunan Ibn Maja.csv has 4351, but you wrote 4332

Please double check.

$ git clone https://github.com/abdelrahmaan/Hadith-Data-Sets.git
Cloning into 'Hadith-Data-Sets'...
remote: Enumerating objects: 76, done.
remote: Total 76 (delta 0), reused 0 (delta 0), pack-reused 76
Unpacking objects: 100% (76/76), done.
$ 
$ cd Hadith-Data-Sets
$ ls 
'Maliks Muwatta.csv'                            README.md                           'Sahih Muslime Without_Tashkel.csv'    'Sunan al Darami.csv'                  'Sunan al Tirmidhi.csv'
'Maliks Muwatta Without_Tashkel.csv'           'Sahih Bukhari.csv'                   Scrapping_Hadith.ipynb                'Sunan al Darami Without_Tashkel.csv'  'Sunan al Tirmidhi Without_Tashkel.csv'
'Musnad Ahmad ibn Hanbal.csv'                  'Sahih Bukhari Without_Tashkel.csv'  'Sunan Abu Dawud.csv'                  'Sunan al-Nasai.csv'                   'Sunan Ibn Maja.csv'
'Musnad Ahmad ibn Hanbal Without_Tashkel.csv'  'Sahih Muslim.csv'                   'Sunan Abu Dawud Without_Tashkel.csv'  'Sunan al-Nasai Without_Tashkel.csv'   'Sunan Ibn Maja Without_Tashkel.csv'
$ wc -l *csv
    1595 Maliks Muwatta.csv
    1595 Maliks Muwatta Without_Tashkel.csv
   18445 Musnad Ahmad ibn Hanbal.csv
   18445 Musnad Ahmad ibn Hanbal Without_Tashkel.csv
    7009 Sahih Bukhari.csv
    7009 Sahih Bukhari Without_Tashkel.csv
    5363 Sahih Muslim.csv
    5363 Sahih Muslime Without_Tashkel.csv
    4601 Sunan Abu Dawud.csv
    4601 Sunan Abu Dawud Without_Tashkel.csv
    3368 Sunan al Darami.csv
    3368 Sunan al Darami Without_Tashkel.csv
    5663 Sunan al-Nasai.csv
    5613 Sunan al-Nasai Without_Tashkel.csv
    3892 Sunan al Tirmidhi.csv
    3892 Sunan al Tirmidhi Without_Tashkel.csv
    4351 Sunan Ibn Maja.csv
    4351 Sunan Ibn Maja Without_Tashkel.csv
  108524 total
$
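The comparison above can be reproduced with a short Python sketch that uses only numbers already quoted in this issue. One assumption is made: each file has exactly one non-data line (a header). That fits the books whose counts agree, since their `wc -l` figure is always the described count plus one, but it has not been verified against the files themselves. The function name `mismatches` is hypothetical.

```python
# Counts stated in the project description.
README_COUNTS = {
    "Sahih Bukhari": 7008,
    "Sahih Muslim": 5362,
    "Sunan al Tirmidhi": 3891,
    "Sunan al-Nasai": 5662,
    "Sunan Abu Dawud": 4590,
    "Sunan Ibn Maja": 4332,
    "Musnad Ahmad ibn Hanbal": 26363,
    "Maliks Muwatta": 1594,
    "Sunan al Darami": 3367,
}

# Line counts reported by `wc -l` for the files with tashkil.
WC_LINES = {
    "Sahih Bukhari": 7009,
    "Sahih Muslim": 5363,
    "Sunan al Tirmidhi": 3892,
    "Sunan al-Nasai": 5663,
    "Sunan Abu Dawud": 4601,
    "Sunan Ibn Maja": 4351,
    "Musnad Ahmad ibn Hanbal": 18445,
    "Maliks Muwatta": 1595,
    "Sunan al Darami": 3368,
}


def mismatches(readme, wc_lines, header_lines=1):
    """Map each book whose counts disagree to (described count, file rows).

    `header_lines` is the assumed number of non-data lines per file.
    """
    result = {}
    for book, described in readme.items():
        rows = wc_lines[book] - header_lines
        if rows != described:
            result[book] = (described, rows)
    return result


if __name__ == "__main__":
    for book, (described, rows) in mismatches(README_COUNTS, WC_LINES).items():
        print(f"{book}: description says {described}, file has {rows} rows")
```

Under the one-header-line assumption, only Sunan Abu Dawud, Sunan Ibn Maja, and Musnad Ahmad ibn Hanbal are flagged. The al-Nasai discrepancy is a separate matter: the file without tashkil has 50 fewer lines than the file with tashkil (5613 versus 5663).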

Error in the first line of every file

Peace be upon you, and thank you for the good work in collecting the hadiths.

There is an error in all of the files: each one begins with the words Sahih Bukhari, even though the sources differ.
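A quick way to confirm this report is to read the first line of every CSV file and list the ones that mention Sahih Bukhari without belonging to that book. The Python sketch below assumes the files are UTF-8 encoded and sit in one directory. The function names and the `label` parameter are hypothetical.

```python
from pathlib import Path


def first_lines(directory):
    """Return {file name: first line} for every .csv file in `directory`."""
    result = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # "utf-8-sig" also reads plain UTF-8 and drops a leading BOM if present.
        with path.open(encoding="utf-8-sig", errors="replace") as handle:
            result[path.name] = handle.readline().strip()
    return result


def suspicious_headers(directory, label="Sahih Bukhari"):
    """List files whose first line contains `label` but whose name does not start with it."""
    return [
        name
        for name, line in first_lines(directory).items()
        if label in line and not name.startswith(label)
    ]


if __name__ == "__main__":
    # Run from inside the cloned repository.
    for name in suspicious_headers("."):
        print(name)
```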
