colinsongf / spider_douban
This project is forked from yuanxiangxie/spider_douban
A Python script that crawls the Douban Top 250 movies.
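### Author: Made By Yuanxiang
### Date: 2016/08/21
### Content: crawl the Top 250 movies on Douban
### Items: movie title, synopsis link, promotional image, star rating, number of raters, and so on
### Difficulty: handling all the different encodings

Here is a brief introduction to the project.

1. First, crawling the movie content: skipped (something this naive is not worth explaining; it is a very simple urllib2 + BeautifulSoup implementation).
2. Next, loading the data into a database: this is a huge pain point!!! First you have to be sure of the encoding of your own computer platform, then the encoding of the Python environment (pycharm), then the encoding the script file is saved in, then the encoding set on the database. Say the database you create is called DouBan; its encoding has to be right. Then come the encoding of the tables you create, the encoding used when writing to the database, and the encoding used when reading from it. If any one of these is off, you get garbled web pages, garbled Chinese, garbled database contents, and a whole series of bizarre problems. I spent an entire day digging through countless blogs, stackoverflow, forums, and Tieba before I solved it. Good luck to you. Some of you may say "I did exactly the same and never hit any of this" (contempt face); all I can say is that you were lucky. When I switched to another computer I did not run into this many problems either. Heh.
3. I'm worn out and don't feel like writing any more.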
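
The encoding chain described in item 2 can be illustrated with a minimal sketch. It is not the project's actual code: it is written for Python 3 (the original used urllib2, so Python 2), it uses the standard-library `sqlite3` module as a stand-in because the README does not say which database was used, and the table name, column names, and sample values are all hypothetical. The idea is to decode the downloaded bytes explicitly once, keep Unicode text everywhere inside the script, and check that the text survives a write and a read.

```python
# -*- coding: utf-8 -*-
import sqlite3

# Hypothetical sample: the raw bytes a server might send for a UTF-8 page.
raw_bytes = "肖申克的救赎".encode("utf-8")

# Decoding with the wrong codec does not fail here, it silently yields garbled text.
garbled = raw_bytes.decode("latin-1")

# Decoding with the codec the bytes were written in restores the title.
title = raw_bytes.decode("utf-8")


def save_and_load(movie_title, rating):
    """Store one row and read it back (sqlite3 is a stand-in for the real database)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE movie (title TEXT, rating REAL)")
        # Parameter binding passes the Unicode string to the driver unchanged.
        conn.execute(
            "INSERT INTO movie (title, rating) VALUES (?, ?)",
            (movie_title, rating),
        )
        conn.commit()
        return conn.execute("SELECT title, rating FROM movie").fetchone()
    finally:
        conn.close()


# The rating is a made-up sample value, not scraped data.
row = save_and_load(title, 9.0)
print(row[0] == title)   # the text read back matches what was written
print(garbled == title)  # the wrongly decoded text does not match
```

With a server database there are more places to check, as the README lists: the database default, the table definition, and the connection used for writes and reads each carry their own character-set setting, and this sketch does not cover them.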