repo-classifier's People

Contributors

blackdark, kuznecpl, linkvt

repo-classifier's Issues

Language Feature extraction

Evaluate whether it makes sense to:

  • Extract all languages as features (AllLanguageFeatureExtractor). This could of course break our classification, which would not be acceptable. A related question: would this give us the option of removing irrelevant features automatically?

  • Or do it entirely by hand and define the languages that are relevant for each category. The same question applies here: do we build the features per language or per category?
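A minimal sketch of the first option, assuming each repo is described by a `{language: bytes}` mapping. The helper names `language_shares` and `drop_low_variance` are hypothetical and not part of the repository, and a simple variance threshold stands in for whatever feature selection would actually be chosen.

```python
from statistics import pvariance


def language_shares(byte_counts, vocabulary):
    """Turn {language: bytes} into one share per vocabulary entry (0.0 if absent)."""
    total = sum(byte_counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [byte_counts.get(lang, 0) / total for lang in vocabulary]


def drop_low_variance(rows, vocabulary, threshold=0.0):
    """Keep only the columns whose variance across all repos exceeds threshold."""
    keep = [
        i for i in range(len(vocabulary))
        if pvariance([row[i] for row in rows]) > threshold
    ]
    kept_vocabulary = [vocabulary[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_rows, kept_vocabulary


# Made-up sample data, purely for illustration.
repos = [
    {"Python": 900, "Shell": 100},
    {"TeX": 500},
    {"Python": 300, "TeX": 700},
]
vocabulary = sorted({lang for repo in repos for lang in repo})
rows = [language_shares(repo, vocabulary) for repo in repos]
filtered_rows, filtered_vocabulary = drop_low_variance(rows, vocabulary, threshold=0.01)
```

With this sample, the nearly constant "Shell" column falls below the threshold and is dropped, while "Python" and "TeX" remain. Whether such a filter helps or hurts the classification would still have to be measured.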

Extend LanguageDEVExtractor

I think it currently contains too few languages. We should improve it, as was already done for DOCS and DATA in #41.
The change would have to go here: https://github.com/linkvt/repo-classifier/blob/master/classification/feature_extraction/dev.py#L36

The question is: do we go by file extensions or by languages?

Languages available through the API

  • easy to fetch
  • languages can be pulled from languages.yaml

File extensions

  • The API does not show all languages; for example, Markdown comes back empty even though it exists: https://github.com/trending/markdown
  • Scripts in DOCS/DATA repos account for 100% of the language share, so the use of Markdown or similar gets lost. TeX and XML can be detected.
  • File extensions can in principle also be pulled from languages.yaml
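A rough sketch of the file-extension route, assuming languages.yaml has already been parsed into a dict that maps each language name to an entry with an `extensions` list. That shape, the sample entries, and the helper names are assumptions for illustration.

```python
import os
from collections import defaultdict


def build_extension_index(languages):
    """Map each lower-cased extension to the set of languages that claim it."""
    index = defaultdict(set)
    for name, spec in languages.items():
        for ext in spec.get("extensions", []):
            index[ext.lower()].add(name)
    return index


def languages_for_path(path, index):
    """Return the candidate languages for a file path, sorted by name."""
    ext = os.path.splitext(path)[1].lower()
    return sorted(index.get(ext, ()))


# Hand-written stand-in for a parsed languages.yaml (assumed shape).
SAMPLE_LANGUAGES = {
    "Markdown": {"extensions": [".md", ".markdown"]},
    "TeX": {"extensions": [".tex", ".sty"]},
    "Objective-C": {"extensions": [".m", ".h"]},
    "C": {"extensions": [".c", ".h"]},
}
index = build_extension_index(SAMPLE_LANGUAGES)
```

Note that an extension such as `.h` maps to more than one language, so going by extension alone is ambiguous. Lower-casing also merges extensions that differ only in case, which may or may not be wanted.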

Improve DATA and DOCS

DOCS

  • more file extensions: .tex, .pdf, .markdown, .rst, .txt
  • look for document file types in general, such as .docx, .key, .ppt, .doc, .odt, or similar?
  • .pod ?
  • .ipynb
  • .graffle
  • Image files?
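One way the extension list above could become a single DOCS feature is the share of files in a repo whose extension is in the list. This is only a sketch: `DOC_EXTENSIONS` and `docs_share` are hypothetical names, `.md` is added as an assumption, and image files are left out because that point is still open.

```python
import os

# Extensions proposed in the list above; ".md" is an assumption added here.
DOC_EXTENSIONS = frozenset({
    ".md", ".markdown", ".rst", ".txt", ".tex", ".pdf",
    ".docx", ".doc", ".odt", ".ppt", ".key",
    ".pod", ".ipynb", ".graffle",
})


def docs_share(paths, doc_extensions=DOC_EXTENSIONS):
    """Fraction of files whose (lower-cased) extension is a documentation type."""
    if not paths:
        return 0.0
    hits = sum(
        1 for path in paths
        if os.path.splitext(path)[1].lower() in doc_extensions
    )
    return hits / len(paths)


example_share = docs_share(["README.md", "guide.rst", "logo.png", "main.py"])
```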

DATA

All file extensions
DOCS: (.md, 8596) (.jpg, 2447) (.png, 1325) (.js, 753) (.pdf, 709) (.swift, 392) (, 311) (.scss, 284) (.jpeg, 276) (.html, 274) (.log, 265) (.c, 239) (.css, 196) (.gif, 174) (.markdown, 170) (.xcworkspacedata, 145) (.json, 143) (.csv, 121) (.h, 102) (.plist, 87) (.less, 87) (.storyboard, 77) (.pbxproj, 74) (.xcplayground, 68) (.graffle, 66) (.xml, 64) (.launch, 57) (.tex, 57) (.txt, 47) (.svg, 40) (.pod, 40) (.xcscheme, 38) (.cc, 34) (.sks, 30) (.scssc, 30) (.ttf, 29) (.xctimeline, 27) (.yml, 20) (.caf, 20) (.jpg-large, 20) (.eot, 20) (.woff, 19) (.sh, 18) (.S, 17) (.py, 16) (.pptx, 15) (.java, 15) (.php, 13) (.jpg-thumb, 12) (.JPG, 10) (.graphml, 10) (.asciidoc, 9) (.erb, 9) (.ico, 9) (.t, 9) (.xcsettings, 8) (.pl, 7) (.dts, 7) (.loaded_0, 6) (.clj, 6) (.odp, 6) (.lock, 6) (.vm, 5) (.woff2, 5) (.xsl, 5) (.sty, 5) (.webp, 5) (.key, 4) (.bin, 4) (.template_partial, 4) (.pgp, 4) (.zip, 4) (.docx, 4) (.wav, 4) (.sla, 4) (.rb, 3) (.mobi, 3) (.psd, 3) (.asm, 3) (.haml, 3) (.ini, 3) (.xls, 3) (.conf, 3) (.template, 3) (.dic, 3) (.ai, 3) (.epub, 3) (.hdbs, 3) (.loaded_1, 3) (.indd, 2) (.mp3, 2) (.m4a, 2) (.pm, 2) (.base64, 2) (.a, 2) (.PL, 2) (.map, 2) (.jade, 2) (.otf, 2) (.db, 2) (.textmpl, 2) (.1, 2) (.dxf, 2) (.ld, 2) (.sketch, 2) (.sig, 2) 
HW: (.js, 23279) (, 7537) (.md, 6562) (.json, 5511) (.cs, 2118) (.m, 1930) (.html, 1795) (.rb, 1784) (.png, 1281) (.py, 1218) (.yml, 1206) (.dll, 1132) (.css, 1086) (.ejs, 1012) (.c, 907) (.config, 851) (.svn-base, 612) (.less, 603) (.h, 588) (.csproj, 552) (.txt, 504) (.xml, 493) (.erb, 465) (.st, 402) (.nupkg, 398) (.java, 397) (.result, 359) (.cache, 330) (.nuspec, 319) (.pdf, 294) (.cshtml, 222) (.pp, 197) (.cpp, 178) (.ps1, 171) (.coffee, 162) (.jpg, 156) (.tex, 140) (.markdown, 138) (.out, 129) (.map, 127) (.wav, 119) (.sln, 118) (.gif, 110) (.scss, 105) (.jsm, 102) (.hbs, 98) (.lock, 90) (.ru, 88) (.asm, 85) (.sym, 85) (.in, 83) (.sh, 82) (.php, 79) (.neu, 74) (.EXE, 71) (.ico, 69) (.exe, 68) (.tst, 67) (.C, 67) (.jade, 66) (.d, 64) (.transform, 62) (.psd1, 56) (.cmp, 56) (.sql, 56) (.XML, 54) (.rdoc, 53) (.mat, 52) (.cc, 48) (.csv, 46) (.dat, 46) (.S, 44) (.patch, 43) (.ls, 42) (.odt, 42) (.o, 40) (.1, 39) (.cmm, 38) (.opts, 37) (.jack, 37) (.OBJ, 37) (.log, 37) (.pkt, 36) (.class, 36) (.DotSettings, 34) (.pl0, 34) (.hdl, 33) (.MAP, 33) (.obj, 32) (.ts, 31) (.bnf, 31) (.R, 31) (.zip, 29) (.pv, 28) (.pl, 27) (.go, 27) (.ipynb, 27) (.psm1, 26) (.StyleCop, 25) (.docx, 25) (.mkd, 24) (.svg, 24) (.data, 24) (.glsl, 22) (.aspx, 22) (.priv, 21) (.pdb, 21) (.swift, 21) (.woff, 21) (.rpt, 20) (.ttf, 20) (.adj, 20) (.eot, 19) (.woff2, 18) (.swf, 18) (.pub, 17) (.gzip, 16) (.vcxproj, 16) (.filters, 16) (.asax, 15) (.properties, 15) (.Rmd, 15) (.tt, 15) (.tar, 15) (.2, 14) (.bak, 14) (.glsf, 14) (.rst, 13) (.fig, 13) (.vm, 13) (.v, 12) (.s, 12) (.bat, 12) (.MIT, 12) (.xls, 12) (.p, 12) (.xaml, 12) (.bass, 11) (.scala, 11) (.doc, 11) (.plist, 11) (.pptx, 11) (.diff, 10) (.rkt, 10) (.img, 10) (.rar, 10) (.3, 10) (.sty, 10) (.inp, 10) (.psd, 10) (.APACHE2, 10) (.JPG, 9) (.old, 9) (.ui, 9) (.user, 9) (.clj, 9) (.xlsx, 9) (.manifest, 9) (.gz, 9) (.tlog, 9) (.qrc, 9) (._, 8) (.un~, 8) (.ext1, 8) (.list, 8) (.conf, 7) (.pro, 7) (.4, 7) (.diagram, 7) (.crt, 7) (.DS_Store, 7) 
(.ini, 7) (.edmx, 7) (.ascx, 7) (.xsd, 6) (.targets, 6) (.jbuilder, 6) (.6, 6) (.pem, 6) (.ext2, 6) (.nb, 6) (.Config, 6) (.7, 6) (.5, 6) (.ftr, 6) (.pages, 6) (.scm, 6) (.spec, 6) (.hdr, 6) (.jar, 6) (.cd, 5) (.ra, 5) (.ods, 5) (.xcconfig, 5) (.htc, 5) (.mexa64, 5) (.swp, 5) (.gyp, 5) (.cae, 5) (.mws, 5) (.g, 5) (.dvi, 5) (.win, 5) (.ldf, 5) (.dtd, 5) (.tgz, 5) (.mk, 5) (.8, 5) (.mdf, 5) (.node, 5) (.jnl, 5) (.save, 4) (.appx, 4) (.lua, 4) (.tsv, 4) (.prefs, 4) (.kernel, 4) (.0, 4) (.pbxproj, 4) (.egg, 4) (.svcinfo, 4) (.10, 4) (.hack, 4) (.lisp, 4) (.solaris, 4) (.ext3, 4) (.Master, 4) (.graph, 4) (.tmpl, 4) (.pz, 4) (.targ, 4) (.py~, 4) (.tpl, 4) (.9, 4) (.xib, 4) (.jsx, 4) (.xcworkspacedata, 4) (.pcap, 4) (.r, 4) (.xcscheme, 4) (.ml, 3) (.Makefile, 3) (.eps, 3) (.y, 3) (.gnu, 3) (.g~, 3) (.gypi, 3) (.db, 3) (.pyc, 3) (.BSD, 3) (.sml, 3) (.pch, 3) (.htm, 3) (.iml, 3) (.js~, 2) (.dump, 2) (.pfx, 2) (.svcmap, 2) (.awk, 2) (.back, 2) (.csh, 2) (.11, 2) (.html~, 2) (.cbp, 2) (.err, 2) (.jpeg, 2) (.strings, 2) (.cls, 2) (.inc, 2) (.kernel#, 2) (.poly, 2) (.layout, 2) (.storyboard, 2) (.design, 2) (.scssc, 2) (.l, 2) (.cmd, 2) (.mexmaci64, 2) (.rtf, 2) (.types, 2) (.sitemap, 2) (.wsdl, 2) (.el, 2) (.sass, 2) (.12, 2) (.lab, 2) (.deps, 2) (.ns, 2) (.hs, 2) (.vssscc, 2) (.linux, 2) (.edu, 2) 
EDU: (.js, 26633) (, 8055) (.png, 7537) (.xml, 4138) (.q, 2247) (.css, 1846) (.html, 1476) (.otf, 1232) (.scala, 1181) (.pdf, 869) (.md, 773) (.java, 605) (.Rmd, 465) (.ipynb, 425) (.rdx, 408) (.rdb, 408) (.RData, 408) (.txt, 407) (.json, 378) (.jpg, 355) (.m, 278) (.csv, 175) (.glsl, 161) (.c, 110) (.R, 109) (.py, 101) (.h, 99) (.tex, 81) (.tga, 79) (.hs, 61) (.jar, 50) (.prefs, 50) (.zip, 49) (.erb, 49) (.properties, 47) (.mat, 43) (.hdr, 43) (.less, 41) (.scss, 40) (.sh, 39) (.jpeg, 39) (.rda, 39) (.docx, 33) (.data, 31) (.yml, 29) (.pptx, 28) (.lhs, 27) (.rb, 26) (.gif, 24) (.svg, 21) (.markdown, 20) (.pyc, 19) (.rst, 14) (.cmd, 13) (.key, 10) (.obj, 10) (.dll, 8) (.a, 8) (.xlsx, 8) (.Rnw, 8) (.log, 7) (.dat, 7) (.ttf, 6) (.class, 6) (.template, 6) (.sql, 6) (.rc, 6) (.bat, 5) (.woff, 5) (.eot, 5) (.MD, 5) (.PNG, 5) (.xls, 4) (.seq, 4) (.Rpres, 4) (.mtl, 4) (.eps, 4) (.swp, 3) (.rds, 3) (.map, 3) (.topojson, 3) (.avro, 3) (.org, 3) (.xqy, 3) (.bib, 3) (.db, 3) (.old, 3) (.qv, 2) (.h5, 2) (.cabal, 2) (.sch, 2) (.mk, 2) (.LESSER, 2) (.bash, 2) (.mdown, 2) (.sqlite, 2) (.ico, 2) (.fasta, 2) (.ps, 2) (.gz, 2) (.orig, 2) (.jks, 2) (.lib, 2) 
OTHER: (.md, 2) 
WEB: (.html, 4061) (.md5, 3497) (.sha1, 3494) (.jar, 3059) (.md, 2979) (.png, 2756) (.js, 2450) (.pom, 1129) (.css, 968) (.jpg, 757) (.json, 730) (.scss, 574) (.svg, 570) (, 365) (.less, 293) (.markdown, 226) (.gif, 164) (.rb, 162) (.pdf, 161) (.yaml, 147) (.ejs, 129) (.txt, 114) (.woff, 102) (.ttf, 88) (.otf, 80) (.yml, 79) (.eot, 77) (.gem, 75) (.xml, 62) (.map, 60) (.php, 55) (.erb, 50) (.jsx, 45) (.rst, 33) (.woff2, 32) (.plist, 26) (.vm, 24) (.jpeg, 20) (.py, 19) (.ico, 19) (.sh, 16) (.lock, 16) (.eps, 16) (.psd, 15) (.ai, 10) (.styl, 10) (.JPG, 9) (.jpe, 9) (.hbs, 9) (.csv, 9) (.swf, 8) (.sass, 8) (.ru, 7) (.coffee, 7) (.un~, 6) (.war, 6) (.db, 6) (.htm, 6) (.PNG, 6) (.cur, 5) (.xsl, 5) (.go, 5) (.tmp, 4) (.webfinger, 4) (.opts, 4) (.properties, 4) (.out, 3) (.jade, 3) (.graffle, 3) (.1, 3) (.zip, 3) (.tsv, 3) (.f, 3) (.ics, 2) (.rake, 2) (.svgz, 2) (.gz, 2) (.ps, 2) (.atom, 2) 
DEV: (.h, 44315) (.py, 39470) (.js, 29804) (.java, 28994) (.go, 27405) (.png, 19761) (.cc, 17414) (.cpp, 16222) (, 15798) (.cs, 13234) (.json, 13167) (.rb, 13042) (.swift, 10083) (.md, 9720) (.c, 9598) (.rs, 8485) (.xml, 7691) (.txt, 7123) (.php, 5988) (.html, 5444) (.rst, 4376) (.hpp, 3497) (.sh, 3082) (.svg, 2933) (.yaml, 2577) (.cxx, 2509) (.yml, 2469) (.vim, 2462) (.ts, 2367) (.scala, 2241) (.cmake, 2155) (.css, 1829) (.lua, 1787) (.po, 1347) (.I, 1301) (.mo, 1252) (.jpg, 1235) (.1, 1175) (.pod, 1099) (.erb, 1026) (.gif, 952) (.pl, 859) (.C, 844) (.properties, 771) (.pdf, 760) (.in, 745) (.glsl, 737) (.R, 707) (.coffee, 658) (.dll, 656) (.csv, 653) (.mk, 623) (.f, 622) (.patch, 609) (.s, 585) (.less, 583) (.dds, 568) (.cfg, 556) (.pem, 538) (.bat, 515) (.scss, 491) (.sil, 470) (.conf, 467) (.plist, 433) (.pm, 431) (.bmp, 421) (.doc, 412) (.mm, 408) (.csproj, 403) (.xksl, 400) (.el, 400) (.hlsl, 396) (.m, 381) (.grid2, 358) (.taml, 353) (.proto, 347) (.jsx, 347) (.ico, 337) (.pyc, 328) (.t, 326) (.sql, 324) (.dat, 318) (.xktex, 313) (.stderr, 303) (.d, 289) (.config, 275) (.res, 270) (.ttf, 264) (.gui, 264) (.cix, 262) (.gradle, 260) (.vcxproj, 254) (.as, 247) (.ps1, 245) (.asm, 239) (.jar, 232) (.gz, 226) (.filters, 222) (.ogg, 221) (.xkmat, 218) (.hh, 217) (.3, 215) (.shp, 214) (.Rd, 213) (.pp, 209) (.jun, 207) (.tif, 205) (.hbs, 201) (.pkg, 200) (.ini, 200) (.out, 190) (.cntk, 187) (.zip, 186) (.markdown, 185) (.tt, 181) (.wxs, 179) (.exe, 178) (.toml, 176) (.i, 176) (.tpl, 174) (.frag, 173) (.def, 172) (.des, 171) (.wav, 168) (.csh, 166) (.komodotool, 164) (.sls, 163) (.xul, 162) (.sbt, 158) (.inl, 155) (.xcscheme, 154) (.cu, 152) (.sln, 149) (.props, 149) (.xlf, 144) (.decTest, 143) (.msg, 143) (.dae, 143) (.org, 142) (.com, 141) (.ctp, 138) (.sass, 137) (.template, 136) (.ipynb, 136) (.gyb, 134) (.lib, 132) (.groovy, 131) (.mat, 131) (.waf_files, 129) (.dec, 129) (.rc, 129) (.inc, 125) (.ok, 124) (.resx, 123) (.fbx, 123) (.nsh, 122) (.xpm, 121) (.hs, 120) 
(.udl, 120) (.afm, 120) (.woff, 119) (.ml, 119) (.psd, 116) (.tmp, 116) (.idl, 116) (.oramap, 115) (.ipp, 114) (.param, 114) (.icns, 108) (.nib, 108) (.htm, 106) (.pro, 106) (.log, 104) (.pvr, 104) (.bin, 104) (.pyx, 103) (.cgf, 103) (.test, 103) (.gdg, 101) (.map, 101) (.lang, 101) (.dtd, 100) (.S, 97) (.url, 97) (.tcl, 96) (.vert, 96) (.obj, 94) (.pbxproj, 93) (.key, 93) (.m4, 93) (.mod, 93) (.prefs, 93) (.xkm3d, 93) (.manifest, 91) (.material, 89) (.enc, 88) (.mcmeta, 88) (.twig, 88) (.root, 88) (.info, 86) (.pdb, 84) (.icc, 82) (.rake, 82) (.tem, 81) (.class, 81) (.ent, 81) (.prefab, 81) (.xproj, 81) (.mtl, 80) (.adoc, 78) (.a, 78) (.cnf, 77) (.so, 77) (.eot, 76) (.targets, 75) (.vcproj, 75) (.cshtml, 75) (.xkprefab, 75) (.tpp, 75) (.xsd, 75) (.golden, 74) (.mli, 74) (.gyp, 74) (.block, 74) (.diff, 74) (.cfx, 72) (.ors, 72) (.crt, 72) (.F, 72) (.fnt, 71) (.tmpl, 70) (.textinfo, 70) (.woff2, 68) (.toc, 68) (.cljs, 68) (.gemspec, 68) (.ps, 67) (.response, 67) (.xib, 66) (.ui, 65) (.xkscene, 65) (.jpeg, 63) (.j2, 62) (.xcworkspacedata, 62) (.nsi, 61) (.fa, 60) (.slim, 60) (.mak, 60) (.apinotes, 59) (.cmd, 59) (.dist, 59) (.PNG, 59) (.xnb, 58) (.nlf, 58) (.predict, 57) (.j3md, 56) (.cfi, 56) (.aap, 55) (.xaml, 55) (.prototxt, 54) (.pz, 54) (.xkfx, 54) (.hdr, 54) (.TXT, 53) (.ejs, 53) (.pyd, 53) (.storyboard, 52) (.ec, 52) (.tmx, 52) (.egg, 52) (.styl, 52) (.pxd, 52) (.jade, 52) (.xksheet, 51) (.tga, 51) (.nuspec, 51) (.lut, 51) (.dot, 51) (.xkfnt, 50) (.lock, 50) (.scene, 49) (.tmCommand, 49) (.xkpromodel, 48) (.gltf, 48) (.service, 48) (.pc, 48) (.jst, 47) (.8, 47) (.xkpkg, 47) (.sav, 47) (.dts, 47) (.provides, 46) (.stdout, 45) (.ndl, 45) (.tags, 44) (.mp3, 44) (.spark, 44) (.cbp, 44) (.am, 44) (.xsl, 44) (.strings, 44) (.j3o, 43) (.cr, 43) (.pch, 43) (.xkanim, 42) (.list, 42) (.oref, 41) (.swiftdeps, 41) (.ani, 41) (.data, 41) (.sample, 40) (.asciidoc, 40) (.haml, 40) (.xkgamesettings, 39) (.cson, 39) (.routes, 39) (.psh, 38) (.rhtml, 38) (.aud, 38) (.modulemap, 
38) (.bash, 37) (.sg, 37) (.mdl, 37) (.rdf, 36) (.ccbi, 36) (.ccb, 36) (.dsp, 35) (.fsh, 35) (.j3m, 35) (.shape, 34) (.opt, 34) (.spec, 34) (.pu, 34) (.cgfx, 34) (.par, 34) (.o, 34) (.leo, 33) (.blend, 33) (.wixproj, 32) (.db, 32) (.tar, 32) (.appxmanifest, 32) (.tgz, 32) (.awk, 32) (.xksnd, 31) (.g3dj, 31) (.stl, 31) (.arff, 30) (.tex, 30) (.gliffy, 30) (.unitless, 30) (.example, 30) (.rdoc, 29) (.xktpl, 28) (.whl, 28) (.jsm, 28) (.user, 28) (.rtf, 27) (.isolate, 27) (.utf-8, 27) (.vsh, 27) (.r, 27) (.pfx, 26) (.ru, 26) (.max, 26) (.sno, 26) (.j3sn, 26) (.ext, 26) (.src, 26) (.fx, 26) (.fits, 26) (.sec, 26) (.cer, 25) (.fwc, 25) (.mkd, 25) (.builder, 25) (.DotSettings, 24) (.doi, 24) (.axml, 24) (.dylib, 24) (.2, 24) (.otf, 23) (.capnp, 23) (.vstemplate, 23) (.DAE, 23) (.form, 23) (.skin, 23) (.bz2, 22) (.podspec, 22) (.bats, 22) (.mplstyle, 22) (.vp, 22) (.install, 22) (.FBX, 21) (.reg, 21) (.cubeset, 21) (.status, 21) (.ppm, 21) (.jsonschema, 21) (.gypi, 21) (.glsllib, 21) (.7, 21) (.ss, 21) (.asset, 21) (.tmSnippet, 21) (.http, 20) (.fq, 20) (.opts, 20) (.dox, 20) (.BUILD, 20) (.qml, 19) (.gpb, 19) (.vm, 19) (.texinfo, 19) (.srl, 19) (.dsw, 19) (.sha, 19) (.iml, 19) (.sshtml, 18) (.ep, 18) (.xkeffectlog, 18) (.fp, 18) (.launch, 18) (.TIF, 18) (.mgcb, 18) (.rgb, 18) (.raw, 18) (.xkphy, 17) (.tiff, 17) (.init, 17) (.tmLanguage, 17) (.Rmd, 17) (.xcscmblueprint, 17) (.kkf, 17) (.xcconfig, 17) (.wxl, 17) (.bdc3D, 16) (.opam, 16) (.build, 16) (.lnx, 16) (.mca, 16) (.jinja, 16) (.node, 16) (.doctest, 15) (.graffle, 15) (.kt, 15) (.isl, 15) (.jsp, 15) (.tech, 15) (.settings, 15) (.pyf, 15) (.kpf, 15) (.definition, 15) (.ac, 15) (.env, 14) (.cuh, 14) (.ftl, 14) (.loader, 14) (.default, 14) (.input, 14) (.npz, 14) (.csb, 14) (.mb, 14) (.jinja2, 14) (.prebuilt, 14) (.flt, 14) (.kernel, 14) (.atlas, 14) (.factories, 14) (.5, 14) (.xkskel, 14) (.0, 13) (.handlebars, 13) (.fish, 13) (.reg2, 13) (.desktop, 13) (.c3b, 13) (.int, 13) (.t4, 13) (.miml, 13) (.supp, 13) (.ignore, 
13) (.docx, 13) (.mcr, 13) (.feature, 13) (.aj, 13) (.fragment, 13) (.command, 13) (.rng, 13) (.py-tpl, 13) (.eps, 12) (.xkraw, 12) (.man, 12) (.rules, 12) (.LICENSE, 12) (.BSD, 12) (.sub, 12) (.rpm, 12) (.pxm, 12) (.bpr, 12) (.mgfxo, 12) (.sed, 12) (.lhe, 12) (.pbtxt, 11) (.pdl, 11) (.handlers, 11) (.std, 11) (.gsh, 11) (.ref, 11) (.preds, 11) (.xcuserstate, 11) (.guess, 11) (.erl, 11) (.swf, 11) (.bzl, 11) (.ktx, 11) (.tmTheme, 11) (.ms, 11) (.mel, 11) (.upstart, 11) (.cbc, 11) (.ll, 10) (.recipes, 10) (.terrain, 10) (.inventory, 10) (.st, 10) (.bak, 10) (.cd, 10) (.schemas, 10) (.pkl, 10) (.jspx, 10) (.num, 10) (.xrc, 10) (.kml, 10) (.pub, 10) (.snk, 10) (.styled, 10) (.goconvey, 10) (.jks, 10) (.pas, 10) (.types, 10) (.dcl, 10) (.README, 10) (.cryproject, 10) (.wts, 10) (.mustache, 10) (.lis, 9) (.gn, 9) (.atomic, 9) (.meta, 9) (.markmin, 9) (.l, 9) (.ctf, 9) (.scp, 9) (.war, 9) (.psm1, 9) (.prop, 9) (.xkpcfnt, 9) (.tbl, 9) (.development, 9) (.simple, 9) (.vhd, 9) (.dir, 9) (.docs, 9) (.hxx, 9) (.lyx, 9) (.ocp, 9) (.tooling, 9) (.pmml, 9) (.vcxitems, 9) (.N, 9) (.dbf, 9) (.xpi, 9) (.epl, 9) (.fullpath, 8) (.mbox, 8) (.shx, 8) (.model, 8) (.xksky, 8) (.pump, 8) (.mount, 8) (.tmPreferences, 8) (.xz, 8) (.urdf, 8) (.nt, 8) (.au, 8) (.20, 8) (.rts, 8) (.bmfc, 8) (.VMS, 8) (.blocksounds, 8) (.rdata, 8) (.bpf, 8) (.erubis, 8) (.result, 8) (.liquid, 8) (.wxcp, 8) (.Dockerfile, 8) (.tsx, 8) (.flo, 8) (.shade, 8) (.phpt, 8) (.xcf, 8) (.spritefont, 7) (.gni, 7) (.g3db, 7) (.y, 7) (.postinst, 7) (.pxi, 7) (.adb, 7) (.sdf, 7) (.xslt, 7) (.glb, 7) (.xhtml, 7) (.hqx, 7) (.mar, 7) (.deb, 7) (.orig, 7) (.cql, 7) (.DEC, 7) (.vb, 7) (.cgh, 7) (.npy, 7) (.leafdoc, 7) (.mms, 7) (.odg, 7) (.resolved, 7) (.job, 7) (.engine, 7) (.proj, 7) (.xcplayground, 7) (.f90, 7) (.header, 7) (.doit, 7) (.clj, 6) (.mp4, 6) (.mlf, 6) (.dep, 6) (.str, 6) (.minitest, 6) (.j3odata, 6) (.mis, 6) (.es6, 6) (.dasc, 6) (.centos7, 6) (.lintian-overrides, 6) (.ptf, 6) (.theme, 6) (.te, 6) (.sysconfig, 6) 
(.crl, 6) (.ASN1, 6) (.twirl, 6) (.rspec, 6) (.empty, 6) (.xkprj, 6) (.swiftmodule, 6) (.test-sh, 6) (.expected, 6) (.tfs, 6) (.cur, 6) (.cat, 6) (.gcc, 6) (.gem, 6) (.fxh, 6) (.xls, 6) (.sha1, 6) (.4, 6) (.ldif, 6) (.load, 6) (.prj, 6) (.ma, 6) (.libyaml, 6) (.bib, 6) (.cmf, 6) (.g4, 6) (.doxy, 6) (.if, 6) (.bsh, 6) (.c3t, 6) (.6, 6) (.mvac, 6) (.mdb, 6) (.inf, 6) (.git, 6) (.vms, 6) (.fc, 6) (.printme, 6) (.fbs, 6) (.postrm, 6) (.aps, 5) (.CROSS, 5) (.dia, 5) (.environment, 5) (.dj, 5) (.tab, 5) (.workspace, 5) (.win, 5) (.UNIX, 5) (.appcache, 5) (.changes, 5) (.attr, 5) (.cin, 5) (.msc, 5) (.translation, 5) (.phiz, 5) (.ANY, 5) (.ascii, 5) (.URL, 5) (.mc, 5) (.APACHE2, 5) (.css_t, 5) (.czml, 5) (.pal, 5) (.H, 5) (.DLL, 5) (.pot, 5) (.JPG, 5) (.tps, 5) (.tap, 5) (.projitems, 5) (.label, 5) (.MF, 5) (.bs, 5) (.vc, 5) (.aiff, 5) (.3ds, 5) (.nupkg, 5) (.APACHE, 5) (.md5, 5) (.manpages, 5) (.sas, 5) (.mxml, 5) (.hin, 5) (.MIT, 5) (.glade, 5) (.xs, 5) (.MAC, 5) (.bc, 5) (.cp1250, 5) (.tmDragCommand, 5) (.ecr, 5) (.yxx, 5) (.g, 5) (.pck, 5) (.ai, 5) (.mobileprovision, 5) (.shproj, 5) (.sha512, 5) (.kubeconfig, 5) (.xctimeline, 5) (.T, 5) (.dirs, 5) (.mac, 5) (.fstab, 5) (.GNU, 5) (.doctree, 5) (.cmakein, 5) (.gzip, 5) (.private, 5) (.phipos, 5) (.xkss4a, 5) (.options, 5) (.translations, 5) (.natvis, 5) (.wxi, 4) (.rgs, 4) (.jdl, 4) (.idx, 4) (.snippet, 4) (.dpr, 4) (.scpt, 4) (.MD, 4) (.s390x, 4) (.seed, 4) (.gexf, 4) (.rdb, 4) (.aarch64, 4) (.gitbundle, 4) (.ipa, 4) (.dfm, 4) (.pom, 4) (.xkuipage, 4) (.qrc, 4) (.bak2, 4) (.xbm, 4) (.rsc, 4) (.old, 4) (.mf, 4) (.jxs, 4) (.ccz, 4) (.pfxp, 4) (.kmz, 4) (.physics, 4) (.ppc64le, 4) (.targ, 4) (.rnn, 4) (.fre, 4) (.mc6, 4) (.mul, 4) (.lxx, 4) (.aidl, 4) (.armhf, 4) (.sty, 4) (.components, 4) (.csd, 4) (.wxl_template, 4) (.sol, 4) (.vw, 4) (.graphml, 4) (.script, 4) (.stp, 4) (.torsion, 4) (.devel, 4) (.code, 4) (.vsmacros, 4) (.gv, 4) (.transform, 4) (.cpu, 4) (.tmMacro, 4) (.100, 4) (.ds, 4) (.plugin, 4) (.nbt, 4) (.pak, 
4) (.jsd, 4) (.templ, 4) (.manx, 4) (.rda, 4) (.gpu, 4) (.erb~, 4) (.wat, 4) (.vsd, 4) (.profiles, 4) (.ver, 4) (.dict, 4) (.userprefs, 4) (.dif, 4) (.bcc, 4) (.vox, 4) (.xcsettings, 3) (.DDS, 3) (.particle, 3) (.pbfilespec, 3) (.hook, 3) (.nam, 3) (.23, 3) (.dnn, 3) (.sug, 3) (.exp, 3) (.zsh, 3) (.unix, 3) (.rc-d, 3) (.la, 3) (.wl, 3) (.rhel6, 3) (.p, 3) (.SSLeay, 3) (.W64, 3) (.pyw, 3) (.labels, 3) (.OS2, 3) (.Debian, 3) (.xclangspec, 3) (.h5, 3) (.bank, 3) (.PL, 3) (.deproj, 3) (.mab, 3) (.lds, 3) (.twb, 3) (.index, 3) (.sketch, 3) (.aifc, 3) (.addins, 3) (.webp, 3) (.euc, 3) (.gnu, 3) (.cabal, 3) (.geojson, 3) (.W32, 3) (.Config, 3) (.GIF, 3) (.rl, 3) (.MacOS, 3) (.head, 3) (.skel, 3) (.21, 3) (.NW, 3) (.story, 3) (.ldf, 3) (.chm, 3) (.os4, 3) (.AUD, 3) (.csr, 3) (.wasm, 3) (.cert, 3) (.text, 3) (.bundle, 3) (.asc, 3) (.cbproj, 3) (.lst, 3) (.p12, 3) (.names, 3) (.SIC, 3) (.copy, 3) (.mmp, 3) (.mdown, 3) (.PRJ, 3) (.shared, 3) (.mll, 3) (.cl, 3) (.iss, 3) (.ENGINE, 3) (.pg, 3) (.ads, 3) (.spl, 3) (.~, 3) (.htpasswd, 3) (.clh, 3) (.apt, 3) (.mingw, 3) (.base, 3) (.tip, 3) (.td, 3) (.en, 3) (.gost, 3) (.pending, 3) (.hpux10-cc, 3) (.mgd, 3) (.pdn, 3) (.RootCerts, 3) (.bor, 3) (.pdef, 3) (.pack, 3) (.22, 3) (.jsonl, 3) (.txt~, 3) (.cscfg, 3) (.nc, 3) (.shh, 3) (.xccheckout, 3) (.solaris, 3) (.ctypes, 3) (.mmlog, 3) (.WCE, 3) (.fontified, 3) (.fr, 3) (.soc, 3) (.foo, 3) (.DJGPP, 3) (.dc, 3) (.data-00000-of-00001, 3) (.textile, 3) (.XML, 3) (.mt, 3) (.suo, 3) (.tsv, 3) (.linux, 3) (.wpr, 3) (.LESSER, 3) (.pe, 3) (.audit_regr, 2) (.rjs, 2) (.js~, 2) (.initd, 2) (.jruby:minitest, 2) (.vbs, 2) (.x, 2) (.resources, 2) (.bar, 2) (.ansi, 2) (.CPP, 2) (.query, 2) (.arson, 2) (.zktx, 2) (.cls, 2) (.pri, 2) (.readme, 2) (.rss, 2) (.statelist, 2) (.7z, 2) (.layout, 2) (.libvpx, 2) (.train, 2) (.adv, 2) (.TTF, 2) (.xpt, 2) (.pex, 2) (.btfcover, 2) (.csv-neighbors, 2) (.9, 2) (.js2, 2) (.ecore, 2) (.apache, 2) (.mly, 2) (.component, 2) (.local, 2) (.vc6, 2) (.standalone, 2) 
(.ht, 2) (.lsf, 2) (.pyi, 2) (.kpz, 2) (.sublime-settings, 2) (.pgm, 2) (.emx, 2) (.en-US, 2) (.seldontestvm, 2) (.bnk, 2) (.msvc, 2) (.logrotate, 2) (.windows, 2) (.cint, 2) (.decl, 2) (.ssae, 2) (.getopt, 2) (.preinst, 2) (.ipy, 2) (.osf, 2) (.tcsh, 2) (.aug, 2) (.wgt, 2) (.Makefile, 2) (.htc, 2) (.se, 2) (.b3d, 2) (.valgrind, 2) (.iphone, 2) (.coalang, 2) (.irx, 2) (.jbuilder, 2) (.inv, 2) (.cjstyles, 2) (.python, 2) (.udev, 2) (.abc, 2) (.z, 2) (.rxml, 2) (.rpgle, 2) (.xkuilib, 2) (.extension, 2) (.drv, 2) (.testfile, 2) (.apns, 2) (.ttproj, 2) (.tau, 2) (.zoo, 2) (.neon, 2) (.leox, 2) (.xcbkptlist, 2) (.keymap, 2) (.pyt, 2) (.license, 2) (.scr, 2) (.seldonvm, 2) (.csv-distances, 2) (.CommandFactory, 2) (.tagset, 2) (.pred, 2) (.known, 2) (.atf, 2) (.02, 2) (.unw, 2) (.orm, 2) (.examples, 2) (.dbg, 2) (.mscala, 2) (.bff, 2) (.dev, 2) (.xml_AGV, 2) (.nustache, 2) (.hairy, 2) (.android, 2) (.puml, 2) (.bash-completion, 2) (.ct, 2) (.me, 2) (.rgba, 2) (.ccbproj, 2) (.xkframe, 2) (.cry, 2) (.celeryd, 2) (.htdigest, 2) (.aff, 2) (.win32, 2) (.socket, 2) (.01, 2) (.btf, 2) (.hepmc, 2) (.nunit, 2) (.xkentity, 2) (.lyr, 2) (.contrib, 2) (.dependency, 2) (.cshrc, 2) (.ssce, 2) (.w31, 2) (.JS, 2) (.fixture, 2) (.ruleset, 2) (.xktest, 2) (.dos, 2) (.pkm, 2) (.mft, 2) (.tld, 2) (.363, 2) (.fail, 2) (.seqdiag, 2) (.stoptags, 2) (.dic, 2) (.safariextz, 2) (.nex, 2) (.crab, 2) (.xkss4s, 2) (.nanorc, 2) (.idb, 2) (.yz, 2) (.lzma, 2) (.scm, 2) (.installed, 2) (.vbhtml, 2) (.vstdir, 2) (.pre, 2) (.confd, 2) (.pptx, 2) (.radius, 2) (.network, 2) (.prof, 2) (.tutor, 2) (.doxyfile, 2) (.SKIP, 2) (.deps, 2) (.gnuplot, 2) (.der, 2) (.ter, 2) (.cgi, 2) (.StyleCop, 2) (.tbz, 2) (.filter, 2) (.jruby:slim, 2) (.dmg, 2) (.chunk, 2) (.policy, 2) (.js-script, 2) (.behaviors, 2) (.footer, 2) (.javadoc, 2) (.vsixmanifest, 2) (.nokogiri, 2) (.macros, 2) (.qsf, 2) (.emacs, 2) (.03, 2) (.vtp, 2) (.sublime-project, 2) (.yajl, 2) (.m4f, 2) (.btfatlas, 2) (.meas, 2) (.bsd, 2) (.jdo, 2) (.b, 2) 
(.ruby, 2) (.litcoffee, 2) (.adj, 2) (.24, 2) (.kts, 2) (.rabl, 2) (.scml, 2) (.thread_config, 2) (.sqlite, 2) (.req, 2) (.wlang, 2) (.ttinclude, 2) (.bnf, 2) (.fcgi, 2) (.mva, 2) (.nl, 2) (.sparse, 2) (.Processor, 2) (.caf, 2) (.verb, 2) (.sassc, 2) (.entitlements, 2) (.ark, 2) (.jruby:rspec, 2) (.rnd, 2) 
DATA: (.glif, 128448) (.png, 68351) (.svg, 6598) (.xml, 6112) (.json, 6108) (.glyph, 4091) (.txt, 2328) (.ttf, 2120) (.html, 1493) (.pb, 840) (.yaml, 652) (.csv, 586) (.pdf, 571) (.eps, 491) (.geojson, 473) (, 467) (.plist, 466) (.configFile, 432) (.less, 388) (.py, 264) (.js, 169) (.md, 168) (.scss, 150) (.fea, 126) (.tsv, 116) (.otf, 103) (.css, 98) (.woff, 80) (.eot, 80) (.R, 56) (.woff2, 44) (.category, 42) (.nam, 35) (.processedHashMap, 32) (.korean, 28) (.ai, 26) (.ptl, 25) (.tpl, 25) (.c, 22) (.root, 22) (.yml, 21) (.jade, 19) (.jpg, 17) (.gif, 17) (.markdown, 16) (.h, 16) (.sh, 16) (.glyphs, 15) (.jpeg, 13) (.vfb, 13) (.Rds, 12) (.edn, 11) (.thai, 8) (.props, 8) (.ico, 7) (.lao, 6) (.xlsx, 5) (.designspace, 5) (.Rmd, 5) (.jsx, 5) (.cc, 5) (.styl, 5) (.sp3, 5) (.rb, 4) (.bft, 4) (.cfg, 4) (.sfd, 3) (.sass, 3) (.pyc, 3) (.rst, 3) (.vfbak, 3) (.tamil, 3) (.zip, 3) (.map, 3) (.go, 3) (.C, 3) (.ipynb, 2) (.third_party, 2) (.swf, 2) (.rtf, 2) (.ethiopic, 2) (.xls, 2) (.mk, 2) (.erb, 2) (.gz, 2) (.bib, 2) (.toml, 2) (.lock, 2) (.cmd, 2) 
File extensions, counted only once per repo
DEV: (, 297) (.md, 281) (.yml, 212) (.json, 211) (.png, 205) (.txt, 193) (.js, 187) (.html, 173) (.css, 146) (.sh, 138) (.xml, 105) (.py, 104) (.ico, 92) (.svg, 87) (.h, 78) (.jpg, 78) (.gif, 75) (.bat, 72) (.in, 71) (.java, 59) (.c, 59) (.cfg, 55) (.rst, 52) (.conf, 50) (.cpp, 49) (.plist, 49) (.rb, 48) (.ttf, 47) (.ini, 46) (.properties, 44) (.yaml, 41) (.woff, 36) (.jar, 36) (.gz, 36) (.lock, 34) (.pdf, 34) (.pl, 33) (.csv, 32) (.template, 31) (.markdown, 30) (.gradle, 30) (.1, 30) (.cc, 29) (.m, 29) (.zip, 28) (.rc, 28) (.eot, 28) (.cmd, 28) (.exe, 27) (.config, 26) (.pbxproj, 26) (.cmake, 25) (.scss, 25) (.sln, 25) (.dat, 24) (.mk, 23) (.hpp, 23) (.less, 23) (.dll, 23) (.icns, 23) (.coffee, 23) (.inc, 22) (.erb, 22) (.bmp, 22) (.map, 22) (.ps1, 22) (.cs, 21) (.php, 21) (.ogg, 20) (.sql, 19) (.pem, 19) (.xcscheme, 19) (.wav, 19) (.lua, 19) (.gemspec, 19) (.patch, 19) (.go, 18) (.TXT, 18) (.proto, 17) (.crt, 17) (.m4, 17) (.key, 17) (.xcworkspacedata, 17) (.spec, 16) (.def, 16) (.psd, 16) (.el, 16) (.swift, 16) (.tmpl, 16) (.ipynb, 15) (.xib, 15) (.mm, 15) (.scala, 14) (.manifest, 14) (.vcxproj, 14) (.tex, 14) (.example, 14) (.otf, 14) (.woff2, 14) (.tgz, 13) (.opts, 13) (.tpl, 13) (.log, 13) (.info, 13) (.csproj, 13) (.inl, 13) (.ts, 12) (.rake, 12) (.props, 12) (.bash, 12) (.ac, 12) (.lib, 12) (.vim, 12) (.glsl, 12) (.pro, 12) (.am, 12) (.storyboard, 12) (.ru, 12) (.awk, 11) (.obj, 11) (.filters, 11) (.podspec, 11) (.s, 11) (.strings, 11) (.install, 11) (.xsl, 11) (.dds, 11) (.com, 11) (.rtf, 11) (.2, 11) (.doc, 11) (.xsd, 11) (.pch, 10) (.3, 10) (.sample, 10) (.pm, 10) (.so, 10) (.dylib, 10) (.xpm, 10) (.service, 10) (.bin, 10) (.mtl, 10) (.dot, 10) (.nsi, 10) (.toml, 10) (.test, 10) (.mp3, 10) (.enc, 10) (.README, 10) (.diff, 9) (.data, 9) (.po, 9) (.a, 9) (.desktop, 9) (.jpeg, 9) (.sub, 9) (.supp, 9) (.dsp, 9) (.vcproj, 9) (.raw, 9) (.default, 9) (.fnt, 9) (.ejs, 8) (.db, 8) (.htm, 8) (.LICENSE, 8) (.cxx, 8) (.list, 8) (.groovy, 8) (.pyx, 8) (.nuspec, 8) 
(.tif, 8) (.mo, 8) (.rdoc, 8) (.cnf, 8) (.tar, 8) (.pfx, 8) (.out, 8) (.resx, 8) (.R, 8) (.d, 8) (.vert, 8) (.frag, 8) (.dsw, 8) (.prefs, 8) (.guess, 8) (.targets, 8) (.asm, 8) (.settings, 7) (.cu, 7) (.7, 7) (.dist, 7) (.gyp, 7) (.sass, 7) (.swf, 7) (.0, 7) (.y, 7) (.VMS, 7) (.material, 7) (.user, 7) (.r, 7) (.xcf, 7) (.pod, 7) (.lang, 6) (.tga, 6) (.dtd, 6) (.wxs, 6) (.hbs, 6) (.sed, 6) (.graffle, 6) (.bak, 6) (.BSD, 6) (.inf, 6) (.i, 6) (.S, 6) (.reg, 6) (.tt, 6) (.bz2, 6) (.lis, 6) (.eps, 6) (.srl, 6) (.hlsl, 6) (.upstart, 6) (.5, 6) (.crl, 6) (.mak, 6) (.init, 6) (.idl, 6) (.cshtml, 6) (.class, 6) (.build, 6) (.pp, 5) (.xs, 5) (.bc, 5) (.ppm, 5) (.ANY, 5) (.mms, 5) (.8, 5) (.pub, 5) (.mcmeta, 5) (.launch, 5) (.bib, 5) (.mc, 5) (.docx, 5) (.jsx, 5) (.styl, 5) (.ps, 5) (.rules, 5) (.orig, 5) (.man, 5) (.cin, 5) (.tmLanguage, 5) (.cur, 5) (.resolved, 5) (.css_t, 5) (.UNIX, 5) (.types, 5) (.hin, 5) (.MAC, 5) (.xpi, 5) (.docs, 5) (.libyaml, 5) (.fish, 5) (.env, 5) (.xslt, 5) (.H, 5) (.cer, 5) (.rpm, 5) (.tiff, 5) (.simple, 5) (.MF, 5) (.CROSS, 5) (.xz, 5) (.gypi, 5) (.DLL, 5) (.git, 5) (.gcc, 5) (.GNU, 5) (.snk, 5) (.ai, 5) (.opt, 5) (.dif, 4) (.sty, 4) (.rs, 4) (.cbp, 4) (.label, 4) (.pot, 4) (.proj, 4) (.fre, 4) (.dox, 4) (.MD, 4) (.attr, 4) (.manpages, 4) (.arff, 4) (.iml, 4) (.input, 4) (.atlas, 4) (.dict, 4) (.psm1, 4) (.scene, 4) (.sysconfig, 4) (.xcplayground, 4) (.ldif, 4) (.slim, 4) (.cuh, 4) (.odg, 4) (.lnx, 4) (.feature, 4) (.fbx, 4) (.vms, 4) (.sol, 4) (.mar, 4) (.npy, 4) (.Rd, 4) (.nib, 4) (.xhtml, 4) (.doxy, 4) (.l, 4) (.command, 4) (.pdb, 4) (.ui, 4) (.hqx, 4) (.org, 4) (.dae, 4) (.hh, 4) (.builder, 4) (.golden, 4) (.ec, 4) (.tmx, 4) (.private, 4) (.t, 4) (.APACHE, 4) (.wixproj, 4) (.mul, 4) (.num, 4) (.pkg, 4) (.f, 4) (.PNG, 4) (.20, 4) (.form, 4) (.jade, 4) (.clj, 4) (.sbt, 4) (.vb, 3) (.glade, 3) (.RootCerts, 3) (.workspace, 3) (.erl, 3) (.mac, 3) (.hs, 3) (.SIC, 3) (.pc, 3) (.iss, 3) (.PRJ, 3) (.csh, 3) (.g4, 3) (.sdf, 3) (.win, 3) (.pkl, 3) 
(.node, 3) (.tab, 3) (.text, 3) (.tsv, 3) (.ipp, 3) (.g, 3) (.exp, 3) (.mmp, 3) (.natvis, 3) (.dia, 3) (.appxmanifest, 3) (.6, 3) (.WCE, 3) (.vc, 3) (.vm, 3) (.gem, 3) (.mp4, 3) (.22, 3) (.idx, 3) (.xccheckout, 3) (.xcscmblueprint, 3) (.LESSER, 3) (.fx, 3) (.cmakein, 3) (.23, 3) (.tap, 3) (.postrm, 3) (.Rmd, 3) (.zsh, 3) (.W64, 3) (.as, 3) (.jst, 3) (.j2, 3) (.mat, 3) (.OS2, 3) (.p12, 3) (.hpux10-cc, 3) (.textile, 3) (.wxl, 3) (.res, 3) (.max, 3) (.pyd, 3) (.ASN1, 3) (.src, 3) (.ENGINE, 3) (.lintian-overrides, 3) (.modulemap, 3) (.pas, 3) (.jsp, 3) (.MacOS, 3) (.chm, 3) (.rdb, 3) (.shared, 3) (.postinst, 3) (.DJGPP, 3) (.tmPreferences, 3) (.head, 3) (.htpasswd, 3) (.SSLeay, 3) (.xcconfig, 3) (.tcl, 3) (.NW, 3) (.21, 3) (.sas, 3) (.xcsettings, 3) (.pxd, 3) (.xaml, 3) (.addins, 3) (.cabal, 3) (.haml, 3) (.whl, 3) (.xbm, 3) (.cert, 3) (.mb, 3) (.W32, 3) (.4, 3) (.gost, 3) (.o, 3) (.os4, 3) (.ss, 3) (.nsh, 3) (.tsx, 3) (.adoc, 3) (.mingw, 3) (.changes, 2) (.userprefs, 2) (.7z, 2) (.stp, 2) (.ors, 2) (.sublime-settings, 2) (.qrc, 2) (.pxm, 2) (.pck, 2) (.safariextz, 2) (.confd, 2) (.capnp, 2) (.ark, 2) (.taml, 2) (.ppc64le, 2) (.projitems, 2) (.fp, 2) (.msvc, 2) (.24, 2) (.linux, 2) (.PL, 2) (.xcuserstate, 2) (.MIT, 2) (.psh, 2) (.pyi, 2) (.deps, 2) (.f90, 2) (.armhf, 2) (.bnf, 2) (.dep, 2) (.mel, 2) (.pvr, 2) (.js~, 2) (.cls, 2) (.xclangspec, 2) (.GIF, 2) (.cson, 2) (.cubeset, 2) (.te, 2) (.mustache, 2) (.lut, 2) (.asc, 2) (.handlers, 2) (.scm, 2) (.gliffy, 2) (.targ, 2) (.ep, 2) (.j3o, 2) (.fragment, 2) (.vsh, 2) (.mdb, 2) (.nupkg, 2) (.mcr, 2) (.bcc, 2) (.DAE, 2) (.xul, 2) (.transform, 2) (.cpu, 2) (.pxi, 2) (.recipes, 2) (.APACHE2, 2) (.j3odata, 2) (.rdf, 2) (.fontified, 2) (.liquid, 2) (.egg, 2) (.xrc, 2) (.contrib, 2) (.extension, 2) (.footer, 2) (.JPG, 2) (.graphml, 2) (.scpt, 2) (.sls, 2) (.gzip, 2) (.nex, 2) (.cd, 2) (.webp, 2) (.mbox, 2) (.ani, 2) (.devel, 2) (.s390x, 2) (.gexf, 2) (.blend, 2) (.wxi, 2) (.mc6, 2) (.xctimeline, 2) (.F, 2) (.train, 2) 
(.pbfilespec, 2) (.unix, 2) (.examples, 2) (.tmSnippet, 2) (.status, 2) (.suo, 2) (.snippet, 2) (.bmfc, 2) (.pyw, 2) (.tip, 2) (.local, 2) (.mca, 2) (.readme, 2) (.litcoffee, 2) (.initd, 2) (.bpr, 2) (.fa, 2) (.Processor, 2) (.abc, 2) (.socket, 2) (.prefab, 2) (.udev, 2) (.qml, 2) (.vsixmanifest, 2) (.hxx, 2) (.mab, 2) (.pdn, 2) (.mkd, 2) (.dcl, 2) (.cgfx, 2) (.ml, 2) (.header, 2) (.mll, 2) (.dj, 2) (.ds, 2) (.component, 2) (.gyb, 2) (.manx, 2) (.asset, 2) (.stderr, 2) (.ref, 2) (.npz, 2) (.appcache, 2) (.Dockerfile, 2) (.policy, 2) (.lst, 2) (.shp, 2) (.ruleset, 2) (.model, 2) (.handlebars, 2) (.yz, 2) (.foo, 2) (.terrain, 2) (.kml, 2) (.j3m, 2) (.factories, 2) (.empty, 2) (.jks, 2) (.logrotate, 2) (.DotSettings, 2) (.bor, 2) (.hairy, 2) (.deb, 2) (.vsd, 2) (.tmTheme, 2) (.ktx, 2) (.pgm, 2) (.meta, 2) (.labels, 2) (.apt, 2) (.XML, 2) (.req, 2) (.asciidoc, 2) (.expected, 2) (.z, 2) (.win32, 2) (.ansi, 2) (.SKIP, 2) (.j3md, 2) (.jsonl, 2) (.bar, 2) (.scp, 2) (.fr, 2) (.Config, 2) (.rda, 2) (.twig, 2) (.ver, 2) (.solaris, 2) (.wat, 2) (.bats, 2) (.fc, 2) (.nunit, 2) (.tmCommand, 2) (.mf, 2) (.mdown, 2) (.license, 2) (.pptx, 2) (.aidl, 2) (.cfi, 2) (.bash-completion, 2) (.options, 2) (.rng, 2) (.hdr, 2) (.mxml, 2) (.cl, 2) (.windows, 2) (.std, 2) (.pack, 2) (.vbs, 2) (.afm, 2) (.torsion, 2) (.entitlements, 2) (.response, 2) (.ftl, 2) (.StyleCop, 2) (.x, 2) (.shproj, 2) (.nbt, 2) (.csr, 2) (.Makefile, 2) (.gpu, 2) (.htc, 2) (.android, 2) (.if, 2) (.stdout, 2) (.sublime-project, 2) (.names, 2) (.http, 2) (.kt, 2) (.cgi, 2) (.fail, 2) (.pyc, 2) (.code, 2) (.rl, 2) (.aarch64, 2) (.isl, 2) (.old, 2) (.cql, 2) (.python, 2) (.templ, 2) (.p, 2) (.nanorc, 2) (.lzma, 2) (.msc, 2) (.ok, 2) (.jsm, 2) (.nl, 2) (.fcgi, 2) (.environment, 2) (.geojson, 2) (.gnuplot, 2) (.resources, 2) (.3ds, 2) (.pri, 2) (.ent, 2) (.valgrind, 2) (.dev, 2) (.jinja2, 2) (.dirs, 2) (.mdl, 2) (.bpf, 2) (.xls, 2) (.sec, 2) (.htdigest, 2) (.gnu, 2) (.jdl, 2) (.ms, 2) (.st, 2) (.inv, 2) (.xproj, 2) 
DATA: (, 39) (.md, 38) (.ttf, 29) (.svg, 26) (.json, 25) (.png, 22) (.txt, 22) (.html, 21) (.eot, 21) (.woff, 21) (.css, 21) (.js, 16) (.py, 13) (.yml, 12) (.plist, 12) (.otf, 12) (.fea, 11) (.glif, 11) (.csv, 9) (.pdf, 9) (.scss, 9) (.jpg, 8) (.less, 8) (.sh, 8) (.ico, 6) (.woff2, 6) (.markdown, 6) (.xml, 6) (.glyphs, 5) (.jpeg, 4) (.vfb, 4) (.ai, 4) (.sp3, 3) (.styl, 3) (.processedHashMap, 3) (.designspace, 3) (.jade, 3) (.lock, 2) (.Rmd, 2) (.R, 2) (.cc, 2) (.vfbak, 2) (.cfg, 2) (.sfd, 2) (.tpl, 2) (.cmd, 2) (.tsv, 2) (.map, 2) (.erb, 2) (.yaml, 2) (.xlsx, 2) (.rb, 2) (.zip, 2) 
EDU: (.md, 32) (, 27) (.png, 23) (.txt, 19) (.pdf, 18) (.html, 15) (.csv, 13) (.css, 12) (.json, 11) (.js, 10) (.jpg, 9) (.py, 8) (.sh, 8) (.xml, 8) (.svg, 8) (.tex, 7) (.gif, 6) (.data, 6) (.ipynb, 6) (.zip, 6) (.Rmd, 5) (.rdb, 5) (.R, 5) (.java, 5) (.xlsx, 5) (.rdx, 5) (.RData, 5) (.pptx, 4) (.yml, 4) (.log, 4) (.pyc, 4) (.jpeg, 4) (.rst, 4) (.rda, 4) (.MD, 3) (.properties, 3) (.otf, 3) (.jar, 3) (.ttf, 3) (.bat, 3) (.ico, 2) (.Rpres, 2) (.bash, 2) (.orig, 2) (.docx, 2) (.sql, 2) (.woff, 2) (.Rnw, 2) (.h5, 2) (.xls, 2) (.dat, 2) (.db, 2) (.eot, 2) (.m, 2) (.gz, 2) (.rds, 2) (.hs, 2) (.mat, 2) (.rb, 2) 
OTHER: (.md, 2) 
WEB: (.html, 48) (.css, 47) (.png, 46) (.js, 43) (, 42) (.md, 42) (.jpg, 32) (.svg, 31) (.yml, 24) (.woff, 23) (.ttf, 23) (.json, 22) (.eot, 22) (.gif, 20) (.txt, 20) (.ico, 17) (.scss, 17) (.lock, 14) (.otf, 12) (.xml, 12) (.rb, 12) (.pdf, 11) (.woff2, 10) (.jpeg, 8) (.php, 8) (.markdown, 7) (.map, 7) (.less, 6) (.psd, 6) (.JPG, 5) (.csv, 5) (.sh, 5) (.eps, 4) (.ru, 4) (.ai, 4) (.swf, 3) (.py, 3) (.erb, 3) (.yaml, 3) (.zip, 3) (.PNG, 2) (.graffle, 2) (.properties, 2) (.jar, 2) (.atom, 2) (.coffee, 2) (.opts, 2) (.war, 2) (.db, 2) 
HW: (, 37) (.md, 30) (.png, 20) (.txt, 19) (.html, 15) (.java, 14) (.py, 14) (.pdf, 13) (.jpg, 13) (.sh, 12) (.h, 12) (.json, 10) (.c, 9) (.cpp, 9) (.tex, 8) (.css, 8) (.js, 8) (.m, 7) (.xml, 7) (.gif, 6) (.class, 6) (.out, 6) (.cs, 5) (.csv, 5) (.dat, 5) (.zip, 5) (.sql, 5) (.rb, 5) (.sln, 5) (.JPG, 4) (.jar, 4) (.gz, 4) (.markdown, 4) (.docx, 4) (.map, 4) (.svg, 4) (.lock, 4) (.ico, 4) (.in, 4) (.woff, 4) (.scss, 4) (.pl, 4) (.coffee, 4) (.ipynb, 4) (.o, 4) (.mat, 4) (.mk, 3) (.sty, 3) (.erb, 3) (.d, 3) (.2, 3) (.config, 3) (.yml, 3) (.ttf, 3) (.plist, 3) (.1, 3) (.bak, 3) (.xcworkspacedata, 3) (.cc, 3) (.csproj, 3) (.old, 3) (.doc, 3) (.pbxproj, 3) (.eps, 3) (.xlsx, 3) (.nb, 3) (.prefs, 3) (.mkd, 3) (.iml, 3) (.asm, 3) (.xls, 3) (.log, 3) (.nuspec, 2) (.R, 2) (.sass, 2) (.properties, 2) (.rtf, 2) (.6, 2) (.xib, 2) (.pdb, 2) (.rdoc, 2) (.pages, 2) (.obj, 2) (.eot, 2) (.Makefile, 2) (.xcscheme, 2) (.0, 2) (.gypi, 2) (.cache, 2) (.odt, 2) (.scala, 2) (.pyc, 2) (.tgz, 2) (.dtd, 2) (.exe, 2) (.node, 2) (.3, 2) (.5, 2) (.woff2, 2) (.s, 2) (.js~, 2) (.EXE, 2) (.7, 2) (.y, 2) (.conf, 2) (.vcxproj, 2) (.diff, 2) (.patch, 2) (.rst, 2) (.storyboard, 2) (.list, 2) (.swp, 2) (.swift, 2) (.pptx, 2) (.fig, 2) (.ru, 2) (.less, 2) (.tar, 2) (.8, 2) (.php, 2) (.data, 2) (.filters, 2) (.ps1, 2) (.bat, 2) (.jpeg, 2) (.gyp, 2) (.4, 2) 
DOCS: (.md, 55) (, 40) (.png, 29) (.pdf, 20) (.html, 16) (.json, 16) (.jpg, 16) (.css, 16) (.js, 16) (.txt, 14) (.yml, 12) (.svg, 12) (.sh, 9) (.gif, 8) (.py, 8) (.lock, 6) (.ico, 6) (.scss, 6) (.tex, 6) (.xml, 6) (.ttf, 6) (.csv, 5) (.markdown, 4) (.eot, 4) (.mobi, 3) (.docx, 3) (.epub, 3) (.woff, 3) (.rb, 3) (.pptx, 2) (.woff2, 2) (.indd, 2) (.key, 2) (.plist, 2) (.pbxproj, 2) (.conf, 2) (.xcplayground, 2) (.storyboard, 2) (.pod, 2) (.map, 2) (.sty, 2) (.erb, 2) (.ai, 2) (.php, 2) (.xcworkspacedata, 2) (.zip, 2) (.swift, 2) (.h, 2) 

Analyze descriptions

Repository descriptions often hint at the likely class, e.g. the word 'homework' for HW or 'data' for DATA. We should therefore analyze the descriptions automatically, e.g. with respect to word frequencies.
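
A minimal sketch of such a word-frequency analysis, assuming the training data is available as (description, category) pairs. The sample descriptions, the stop word list, and the function names are made up for illustration and are not taken from the project.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample data: (description, category) pairs.
SAMPLES = [
    ("Homework solutions for CS101", "HW"),
    ("My homework for the algorithms course", "HW"),
    ("Open data sets about city traffic", "DATA"),
    ("Census data as CSV", "DATA"),
]

# Tiny illustrative stop word list; a real analysis would need a larger one.
STOPWORDS = {"for", "the", "my", "as", "about", "a", "an", "of", "and"}


def tokenize(text):
    """Lowercase the text, keep alphabetic words, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def word_frequencies(samples):
    """Count how often each word occurs in the descriptions of each category."""
    counts = defaultdict(Counter)
    for description, category in samples:
        # Descriptions can be missing, so fall back to an empty string.
        counts[category].update(tokenize(description or ""))
    return counts


frequencies = word_frequencies(SAMPLES)
print(frequencies["HW"].most_common(3))
```

Words that are frequent in one category and rare in all others would be candidates for keyword features.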

Extractor for collecting file type counts

Important for EDU, for example, to check whether md files are present.

  • Number of files with extension X relative to all files in the repo
  • File size of files with extension X relative to all files in the repo
  • Use languages.yml
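
The two ratios above could be computed roughly as follows. This is only a sketch: it assumes the file listing is already available as (path, size in bytes) pairs, and the function name is hypothetical. Extensions are lowercased here, which merges e.g. .JPG and .jpg; that is a design choice, not something the issue prescribes.

```python
import os
from collections import Counter


def extension_ratios(files):
    """Return (count_ratio, size_ratio) dicts keyed by file extension.

    files: iterable of (path, size_in_bytes) pairs (assumed input format).
    Files without an extension are grouped under the empty string.
    """
    counts = Counter()
    sizes = Counter()
    for path, size in files:
        ext = os.path.splitext(path)[1].lower()
        counts[ext] += 1
        sizes[ext] += size
    total_count = sum(counts.values())
    total_size = sum(sizes.values())
    # Guard against empty repos and repos that only contain empty files.
    count_ratio = {e: c / total_count for e, c in counts.items()} if total_count else {}
    size_ratio = {e: s / total_size for e, s in sizes.items()} if total_size else {}
    return count_ratio, size_ratio


example_files = [("README.md", 100), ("docs/intro.md", 300), ("main.py", 600)]
count_ratio, size_ratio = extension_ratios(example_files)
print(count_ratio[".md"], size_ratio[".md"])
```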

Submission

  • Mention the CLI in the README (Eddi)
  • #52
  • Submit the model as well (it should be trained on all data, not on some split)
  • Select the best model (not the first)
  • Maybe replace the icons?

Written report

  • Write down that the GitHub API does not return all languages (Eddi)
  • Explain StandardScaler
  • EDU repositories from the additional data sets were removed if they contain HW (before that, HW from the professor's perspective was also filed under EDU)
  • Important keywords were found through statistical analysis
  • Cite the languages YML (type is used, e.g. programming = dev) (Eddi)
  • Future work: error handling

Improvements for presentation

  • Nicer font
  • Shadows around containers of classification input
  • Action-bar colorized
  • Histogram - Adjust dropdown width to visualization width
  • Make result of single classification nicer (e.g. bar chart)
  • Make random repo selection nicer
  • Icons

Handling missing repos

If certain repos in the imported list no longer exist, they should be skipped during training and classification, and (possibly) a corresponding message should be printed.
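
One way this could look, as a sketch only: `RepoNotFound` and the `fetch` callable are hypothetical stand-ins for whatever error and loader the GitHub client in use actually provides.

```python
class RepoNotFound(Exception):
    """Hypothetical error raised when a repository no longer exists."""


def load_existing(repo_names, fetch):
    """Fetch every repo, skipping the ones that are gone.

    fetch: callable taking a repo name and returning repo data,
    raising RepoNotFound for missing repos (assumed interface).
    Returns (loaded_repos, skipped_names).
    """
    loaded, skipped = [], []
    for name in repo_names:
        try:
            loaded.append(fetch(name))
        except RepoNotFound:
            skipped.append(name)
            print(f"Skipping missing repository: {name}")
    return loaded, skipped


def fake_fetch(name):
    """Stand-in loader for the example: only one repo 'exists'."""
    if name != "linkvt/repo-classifier":
        raise RepoNotFound(name)
    return {"full_name": name}


repos, missing = load_existing(["linkvt/repo-classifier", "someone/deleted"], fake_fetch)
```

Returning the skipped names lets the caller decide whether to only log them or also remove them from the imported list.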

Balance categories

  • e.g. by duplicating repos of categories so that each category has the same number
  • train only on the duplicated set, evaluate on the normal one
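
A sketch of balancing by duplication (random oversampling), assuming samples are (repo, category) pairs. The function name and the fixed seed are illustrative choices.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, seed=0):
    """Duplicate random samples of smaller categories until all are equally large.

    samples: list of (repo, category) pairs (assumed input format).
    """
    if not samples:
        return []
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for sample in samples:
        by_category[sample[1]].append(sample)
    target = max(len(group) for group in by_category.values())
    balanced = []
    for group in by_category.values():
        balanced.extend(group)
        # Fill up with random duplicates from the same category.
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    return balanced


train = [("a", "DEV"), ("b", "DEV"), ("c", "DEV"), ("d", "HW")]
balanced_train = oversample(train)
print(Counter(category for _, category in balanced_train))
```

To follow the second bullet, the data would be split into training and evaluation sets first and only the training part passed to `oversample`, so that no duplicated repo leaks into the evaluation set.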

HW Extractors

  • Last commit deemed unnecessary; ActiveTime may be more useful
  • Add keywords from features.md

Implement general FeatureExtractors

We should write one FeatureExtractor for each of the general features:

  • Number of forks
  • Number of contributors
  • Number of commits
  • Number of branches
  • Number of files
  • Number of stars
  • Number of open issues
  • Is fork?
  • LastUpdated - CreatedAt
  • Size
  • Number of watchers
  • Has issues?
  • Has downloads?
  • Has wiki?
  • Has Pages? - can be fetched via api_repo._rawData.get('has_pages')
  • Contains a certain word in the description, e.g. "homework"

Please add more, see e.g. https://api.github.com/repos/angular/angular
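
Several of these features can be read directly from the JSON document that the repository endpoint linked above returns. The following sketch works on such a document as a plain dict. The function names and the example values are made up; counts for contributors, commits, branches, and files are not part of that document and are left out here.

```python
from datetime import datetime


def parse_time(value):
    """Parse a GitHub timestamp such as '2014-09-18T16:12:01Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def general_features(repo, keyword="homework"):
    """Map a repository JSON dict to a flat feature dict (illustrative sketch)."""
    description = (repo.get("description") or "").lower()
    active = parse_time(repo["updated_at"]) - parse_time(repo["created_at"])
    return {
        "forks": repo.get("forks_count", 0),
        "stars": repo.get("stargazers_count", 0),
        "open_issues": repo.get("open_issues_count", 0),
        "watchers": repo.get("subscribers_count", 0),
        "size": repo.get("size", 0),
        "is_fork": int(bool(repo.get("fork"))),
        "has_issues": int(bool(repo.get("has_issues"))),
        "has_downloads": int(bool(repo.get("has_downloads"))),
        "has_wiki": int(bool(repo.get("has_wiki"))),
        "has_pages": int(bool(repo.get("has_pages"))),
        "active_days": active.days,  # LastUpdated - CreatedAt
        "has_keyword": int(keyword in description),
    }


# Hand-written example document, not real API output.
example_repo = {
    "forks_count": 3,
    "stargazers_count": 10,
    "fork": False,
    "has_wiki": True,
    "description": "Homework for the databases lecture",
    "created_at": "2016-01-01T00:00:00Z",
    "updated_at": "2016-01-31T12:00:00Z",
}
features = general_features(example_repo)
```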

WEB Extractors

  • index.html in the root directory
  • Mainly web programming languages (HTML, JS, CSS)
  • URL in the repo description
  • Examine keywords and include them
  • font, assets, sass, css, js, img, _data folders
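
The ideas above could be turned into features along these lines. This is a sketch under assumptions: the names of the root directory entries and the language statistics (a dict of language name to bytes, the shape the GitHub languages endpoint returns) are expected to be fetched elsewhere, and all names in the code are illustrative.

```python
import re

WEB_LANGUAGES = {"HTML", "JavaScript", "CSS"}
WEB_FOLDERS = {"font", "assets", "sass", "css", "js", "img", "_data"}
URL_PATTERN = re.compile(r"https?://\S+")


def web_features(root_entries, languages, description):
    """Compute simple WEB indicators for one repository.

    root_entries: names of files and folders in the root directory.
    languages: dict mapping language name to number of bytes.
    description: repository description, may be None.
    """
    total = sum(languages.values())
    web_bytes = sum(b for lang, b in languages.items() if lang in WEB_LANGUAGES)
    return {
        "has_index_html": "index.html" in root_entries,
        "web_language_share": web_bytes / total if total else 0.0,
        "has_url_in_description": bool(URL_PATTERN.search(description or "")),
        "web_folder_count": len(WEB_FOLDERS & set(root_entries)),
    }


example = web_features(
    ["index.html", "css", "js", "README.md"],
    {"HTML": 600, "CSS": 200, "Python": 200},
    "My personal site, see http://example.com",
)
```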

Save model

  • repo is called outside of extractors (#7 (comment))
  • maybe output the correct result; possibly a separate issue (see below); already happens in the classifier

Caching of feature values

We should think about how to cache the feature values of individual repos so that we do not have to reload all data before every training run.
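
One simple option would be a JSON file keyed by repository name. The sketch below is an assumption about how such a cache could look, not the project's implementation. The class name and the `compute` callable are hypothetical, and feature values are assumed to be JSON-serializable.

```python
import json
import os


class FeatureCache:
    """Store feature values per repository in a JSON file."""

    def __init__(self, path):
        self.path = path
        self.values = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as handle:
                self.values = json.load(handle)

    def get_or_compute(self, repo_name, compute):
        """Return cached features, computing and persisting them on a miss."""
        if repo_name not in self.values:
            self.values[repo_name] = compute(repo_name)
            self.save()
        return self.values[repo_name]

    def save(self):
        with open(self.path, "w", encoding="utf-8") as handle:
            json.dump(self.values, handle)
```

Such a cache would have to be invalidated whenever an extractor changes, for example by deleting the file or by including a version number in the key.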

EDU Extractors

  • Add keywords
  • Number of md files, blocked by #29
  • Committer has a university/edu email address -> left out for now because the data is probably not good enough (not everyone uses an edu email or has a university listed)

Clean up

  • Merge classes
  • Remove category

Problems with Python 3.6

The dependency tablib cannot be installed with Python 3.6.

  • Take this into account in the documentation
  • Fix dependencies
