
minicrawler's People

Contributors

ozzyczech, pracj3am

minicrawler's Issues

build check libicuuc

Hi,
configure only checks libicuuc up to uidna_nameToASCII_63, but the current release is 68.
I already have version 64 installed, so I changed configure.ac to 64 and wondered whether it would work. It compiled correctly, at least.

Have fun.

Crawl Example

@pracj3am
Hey, thanks for putting this project on GitHub. It is very interesting and I am glad I found it.

I have tried the example provided in README.md, and also the minicrawler executable, to check the program, but it seems that only the first page is fetched and then the program exits.

Is that example correct? It does not extract URLs to crawl.

Thanks

OpenSSL problem on Mac

OpenSSL is deprecated in OS X 10.7:

http://stackoverflow.com/questions/7406946/why-is-apple-deprecating-openssl-in-macos-10-7-lion

src/crawler.c:34:4: error: "please install OpenSSL 1.0.1"
#  error "please install OpenSSL 1.0.1"
   ^
src/crawler.c:86:20: warning: 'SSL_ctrl' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const long opts = SSL_get_options(u->ssl);
                          ^
/usr/include/openssl/ssl.h:592:9: note: expanded from macro 'SSL_get_options'
        SSL_ctrl((ssl),SSL_CTRL_OPTIONS,0,NULL)
        ^
/usr/include/openssl/ssl.h:1512:6: note: 'SSL_ctrl' has been explicitly marked deprecated here
long    SSL_ctrl(SSL *ssl,int cmd, long larg, void *parg) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:92:13: error: use of undeclared identifier 'SSL_OP_NO_TLSv1_1'
        if (opts & SSL_OP_NO_TLSv1_1) {
                   ^
src/crawler.c:95:20: error: use of undeclared identifier 'SSL_OP_NO_TLSv1_2'
        } else if (opts & SSL_OP_NO_TLSv1_2) {
                          ^
src/crawler.c:96:21: error: use of undeclared identifier 'SSL_OP_NO_TLSv1_1'
                u->ssl_options |= SSL_OP_NO_TLSv1_1;
                                  ^
src/crawler.c:99:21: error: use of undeclared identifier 'SSL_OP_NO_TLSv1_2'
                u->ssl_options |= SSL_OP_NO_TLSv1_2;
                                  ^
src/crawler.c:114:16: warning: 'SSL_connect' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const int t = SSL_connect(u->ssl);
                      ^
/usr/include/openssl/ssl.h:1508:6: note: 'SSL_connect' has been explicitly marked deprecated here
int     SSL_connect(SSL *ssl) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:121:21: warning: 'SSL_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
    const int err = SSL_get_error(u->ssl, t);
                    ^
/usr/include/openssl/ssl.h:1517:5: note: 'SSL_get_error' has been explicitly marked deprecated here
int     SSL_get_error(const SSL *s,int ret_code) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:159:14: warning: 'ERR_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        while ((e = ERR_get_error())) {
                    ^
/usr/include/openssl/err.h:266:15: note: 'ERR_get_error' has been explicitly marked deprecated here
unsigned long ERR_get_error(void) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
              ^
src/crawler.c:160:36: warning: 'ERR_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                debugf("[%d]\t\t%s\n", u->index, ERR_error_string(e, NULL));
                                                 ^
src/h/proto.h:9:50: note: expanded from macro 'debugf'
#define debugf(...)   {if(debug) fprintf(stderr, __VA_ARGS__);}
                                                 ^
/usr/include/openssl/err.h:279:7: note: 'ERR_error_string' has been explicitly marked deprecated here
char *ERR_error_string(unsigned long e,char *buf) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
      ^
src/crawler.c:167:58: warning: 'ERR_reason_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                        sprintf(u->error_msg + strlen(u->error_msg), " (%s)", ERR_reason_error_string(last_e));
                                                                              ^
/usr/include/secure/_stdio.h:47:56: note: expanded from macro 'sprintf'
  __builtin___sprintf_chk (str, 0, __darwin_obsz(str), __VA_ARGS__)
                                                       ^
/usr/include/openssl/err.h:283:13: note: 'ERR_reason_error_string' has been explicitly marked deprecated here
const char *ERR_reason_error_string(unsigned long e) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
            ^
src/crawler.c:174:3: warning: 'SSL_shutdown' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                SSL_shutdown(u->ssl);
                ^
/usr/include/openssl/ssl.h:1548:5: note: 'SSL_shutdown' has been explicitly marked deprecated here
int SSL_shutdown(SSL *s) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
    ^
src/crawler.c:175:3: warning: 'SSL_free' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                SSL_free(u->ssl);
                ^
/usr/include/openssl/ssl.h:1506:6: note: 'SSL_free' has been explicitly marked deprecated here
void    SSL_free(SSL *ssl) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:190:16: warning: 'SSL_read' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const int t = SSL_read(u->ssl, buf, size);
                      ^
/usr/include/openssl/ssl.h:1509:6: note: 'SSL_read' has been explicitly marked deprecated here
int     SSL_read(SSL *ssl,void *buf,int num) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:203:18: warning: 'SSL_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const int err = SSL_get_error(u->ssl, t);
                        ^
/usr/include/openssl/ssl.h:1517:5: note: 'SSL_get_error' has been explicitly marked deprecated here
int     SSL_get_error(const SSL *s,int ret_code) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:220:16: warning: 'ERR_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                        while ((e = ERR_get_error())) {
                                    ^
/usr/include/openssl/err.h:266:15: note: 'ERR_get_error' has been explicitly marked deprecated here
unsigned long ERR_get_error(void) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
              ^
src/crawler.c:221:38: warning: 'ERR_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                                debugf("[%d]\t\t%s\n", u->index, ERR_error_string(e, NULL));
                                                                 ^
src/h/proto.h:9:50: note: expanded from macro 'debugf'
#define debugf(...)   {if(debug) fprintf(stderr, __VA_ARGS__);}
                                                 ^
/usr/include/openssl/err.h:279:7: note: 'ERR_error_string' has been explicitly marked deprecated here
char *ERR_error_string(unsigned long e,char *buf) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
      ^
src/crawler.c:230:34: warning: 'ERR_reason_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                                sprintf(errbuf + n, " (%s)", ERR_reason_error_string(last_e));
                                                             ^
/usr/include/secure/_stdio.h:47:56: note: expanded from macro 'sprintf'
  __builtin___sprintf_chk (str, 0, __darwin_obsz(str), __VA_ARGS__)
                                                       ^
/usr/include/openssl/err.h:283:13: note: 'ERR_reason_error_string' has been explicitly marked deprecated here
const char *ERR_reason_error_string(unsigned long e) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
            ^
src/crawler.c:247:16: warning: 'SSL_write' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const int t = SSL_write(u->ssl, buf, size);
                      ^
/usr/include/openssl/ssl.h:1511:6: note: 'SSL_write' has been explicitly marked deprecated here
int     SSL_write(SSL *ssl,const void *buf,int num) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:252:18: warning: 'SSL_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        const int err = SSL_get_error(u->ssl, t);
                        ^
/usr/include/openssl/ssl.h:1517:5: note: 'SSL_get_error' has been explicitly marked deprecated here
int     SSL_get_error(const SSL *s,int ret_code) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:269:16: warning: 'ERR_get_error' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                        while ((e = ERR_get_error())) {
                                    ^
/usr/include/openssl/err.h:266:15: note: 'ERR_get_error' has been explicitly marked deprecated here
unsigned long ERR_get_error(void) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
              ^
src/crawler.c:270:38: warning: 'ERR_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                                debugf("[%d]\t\t%s\n", u->index, ERR_error_string(e, NULL));
                                                                 ^
src/h/proto.h:9:50: note: expanded from macro 'debugf'
#define debugf(...)   {if(debug) fprintf(stderr, __VA_ARGS__);}
                                                 ^
/usr/include/openssl/err.h:279:7: note: 'ERR_error_string' has been explicitly marked deprecated here
char *ERR_error_string(unsigned long e,char *buf) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
      ^
src/crawler.c:275:34: warning: 'ERR_reason_error_string' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                                sprintf(errbuf + n, " (%s)", ERR_reason_error_string(last_e));
                                                             ^
/usr/include/secure/_stdio.h:47:56: note: expanded from macro 'sprintf'
  __builtin___sprintf_chk (str, 0, __darwin_obsz(str), __VA_ARGS__)
                                                       ^
/usr/include/openssl/err.h:283:13: note: 'ERR_reason_error_string' has been explicitly marked deprecated here
const char *ERR_reason_error_string(unsigned long e) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
            ^
src/crawler.c:622:13: warning: 'SSL_new' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        SSL *ssl = SSL_new(mossad());
                   ^
/usr/include/openssl/ssl.h:1497:7: note: 'SSL_new' has been explicitly marked deprecated here
SSL *   SSL_new(SSL_CTX *ctx) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:623:14: warning: 'BIO_new_socket' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        BIO *sbio = BIO_new_socket(u->sockfd, BIO_NOCLOSE);
                    ^
/usr/include/openssl/bio.h:685:6: note: 'BIO_new_socket' has been explicitly marked deprecated here
BIO *BIO_new_socket(int sock, int close_flag) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
     ^
src/crawler.c:624:2: warning: 'SSL_set_bio' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        SSL_set_bio(ssl, sbio, sbio);
        ^
/usr/include/openssl/ssl.h:1391:6: note: 'SSL_set_bio' has been explicitly marked deprecated here
void    SSL_set_bio(SSL *s, BIO *rbio,BIO *wbio) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:625:2: warning: 'SSL_ctrl' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        SSL_set_options(ssl, u->ssl_options);
        ^
/usr/include/openssl/ssl.h:588:2: note: expanded from macro 'SSL_set_options'
        SSL_ctrl((ssl),SSL_CTRL_OPTIONS,(op),NULL)
        ^
/usr/include/openssl/ssl.h:1512:6: note: 'SSL_ctrl' has been explicitly marked deprecated here
long    SSL_ctrl(SSL *ssl,int cmd, long larg, void *parg) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:626:2: warning: 'SSL_ctrl' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
        SSL_set_tlsext_host_name(ssl, u->host);
        ^
/usr/include/openssl/tls1.h:157:42: note: expanded from macro 'SSL_set_tlsext_host_name'
#define SSL_set_tlsext_host_name(s,name) \
                                         ^
/usr/include/openssl/ssl.h:1512:6: note: 'SSL_ctrl' has been explicitly marked deprecated here
long    SSL_ctrl(SSL *ssl,int cmd, long larg, void *parg) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:1157:3: warning: 'MD5_Init' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                MD5_Init(&context);
                ^
/usr/include/openssl/md5.h:113:5: note: 'MD5_Init' has been explicitly marked deprecated here
int MD5_Init(MD5_CTX *c) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
    ^
src/crawler.c:1158:3: warning: 'MD5_Update' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                MD5_Update(&context, u->buf + u->headlen, u->bufp - u->headlen);
                ^
/usr/include/openssl/md5.h:114:5: note: 'MD5_Update' has been explicitly marked deprecated here
int MD5_Update(MD5_CTX *c, const void *data, size_t len) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
    ^
src/crawler.c:1159:3: warning: 'MD5_Final' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                MD5_Final(HEntity, &context);
                ^
/usr/include/openssl/md5.h:115:5: note: 'MD5_Final' has been explicitly marked deprecated here
int MD5_Final(unsigned char *md, MD5_CTX *c) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
    ^
src/crawler.c:1829:4: warning: 'SSL_free' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                        SSL_free(u->ssl);
                        ^
/usr/include/openssl/ssl.h:1506:6: note: 'SSL_free' has been explicitly marked deprecated here
void    SSL_free(SSL *ssl) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:1947:6: warning: 'SSL_free' is deprecated: first deprecated in OS X 10.7 [-Wdeprecated-declarations]
                                        SSL_free(u->ssl);
                                        ^
/usr/include/openssl/ssl.h:1506:6: note: 'SSL_free' has been explicitly marked deprecated here
void    SSL_free(SSL *ssl) DEPRECATED_IN_MAC_OS_X_VERSION_10_7_AND_LATER;
        ^
src/crawler.c:2099:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                read:plain_read,
                ^~~~~
                .read = 
src/crawler.c:2100:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                write:plain_write,
                ^~~~~~
                .write = 
src/crawler.c:2101:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                parse_url:parseurl,
                ^~~~~~~~~~
                .parse_url = 
src/crawler.c:2102:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                launch_dns:launchdns,
                ^~~~~~~~~~~
                .launch_dns = 
src/crawler.c:2103:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                check_dns:checkdns,
                ^~~~~~~~~~
                .check_dns = 
src/crawler.c:2104:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                open_socket:opensocket,
                ^~~~~~~~~~~~
                .open_socket = 
src/crawler.c:2105:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                connect_socket:connectsocket,
                ^~~~~~~~~~~~~~~
                .connect_socket = 
src/crawler.c:2106:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                handshake:empty_handshake,
                ^~~~~~~~~~
                .handshake = 
src/crawler.c:2107:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                gen_request:genrequest,
                ^~~~~~~~~~~~
                .gen_request = 
src/crawler.c:2108:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                send_request:sendrequest,
                ^~~~~~~~~~~~~
                .send_request = 
src/crawler.c:2109:3: warning: use of GNU old-style field designator extension [-Wgnu-designator]
                recv_reply:readreply,
                ^~~~~~~~~~~
                .recv_reply = 
39 warnings and 5 errors generated.
make: *** [src/crawler.lo] Error 1
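
The `-Wgnu-designator` warnings at the end of the log are separate from the OpenSSL errors: clang flags the old GNU `field: value` initializer syntax and suggests the C99 `.field = value` form. A minimal sketch of that form follows. The struct and the `demo_` functions are hypothetical stand-ins, since the real handler struct in src/crawler.c does not appear in the log.

```c
#include <stddef.h>

/* Hypothetical, simplified handler table. The real struct in src/crawler.c
   is not shown in the log, so these field signatures are assumptions. */
struct demo_handlers {
    long (*read)(const char *buf, size_t size);
    long (*write)(const char *buf, size_t size);
};

/* Stub handlers that just report how many bytes they were given. */
static long demo_plain_read(const char *buf, size_t size) {
    (void)buf;
    return (long)size;
}

static long demo_plain_write(const char *buf, size_t size) {
    (void)buf;
    return (long)size;
}

/* C99 designated initializers: ".field = value" replaces the GNU
   extension "field: value" that triggers -Wgnu-designator. */
static const struct demo_handlers demo_plain_handlers = {
    .read = demo_plain_read,
    .write = demo_plain_write,
};
```

Calling `demo_plain_handlers.read("abc", 3)` dispatches to `demo_plain_read`. The same mechanical change (`read:plain_read,` to `.read = plain_read,`) would apply to each flagged line, assuming the fields are ordinary struct members.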
