
china_regions's Introduction

china_regions

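The most complete and up-to-date JSON and SQL data for China's provinces, cities, and districts. The data is scraped automatically from the national standard at http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/, and the JavaScript ES6 modules and SQL data are generated automatically from it.

The latest national standard now goes down to the neighborhood committee level, and the administrative division codes have become longer. Hong Kong, Macau, and Taiwan are not included. Pick the version that fits your needs; see https://github.com/wecatch/china_regions/releases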

Demo

http://wecatch.me/china_regions/

Usage

The data is stored in three formats: JSON, ES6 module, and SQL. The ES6 modules and SQL are generated from the JSON, and the JSON is generated from the latest national standard.

├── js              # JS module format
├── json            # JSON format
└── mysql           # MySQL SQL format

The json and es6 files can be copied and used as they are, or you can generate modules for other languages from them.
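As a rough illustration of using the JSON data directly, the Python sketch below looks up the cities of one province. The dictionary shape (a province id mapped to a list of records with `province`, `name`, and `id` keys) is an assumption based on a city.json excerpt quoted in one of the issues on this page; the current files may use different keys or longer ids. `cities_of` is a hypothetical helper, not part of the repository.

```python
import json

# Inline sample shaped like an older city.json excerpt; the real file layout is an assumption.
SAMPLE_CITY_JSON = """
{
  "460000": [
    {"province": "海南省", "name": "海口市", "id": "460100"},
    {"province": "海南省", "name": "三亚市", "id": "460200"}
  ]
}
"""


def cities_of(city_data, province_id):
    """Return the city names listed under province_id, or [] if the id is unknown."""
    return [record["name"] for record in city_data.get(province_id, [])]


city_data = json.loads(SAMPLE_CITY_JSON)
print(cities_of(city_data, "460000"))
```

With a real file, `json.load(open(path, encoding="utf-8"))` would replace the inline sample.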

The village data file is very large and is not included in the repository by default. Clone the repository, extract the village archive in src, then run python makedata.py

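How to update to the latest national standard

The data currently in the repository was generated from the latest national standard. If you find that the standard has changed, you can update the data manually. This requires Node 8 or later: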

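  1. git clone this repository
  2. yarn install or npm install
  3. Remove the json files under the src directory:
    ├── city.json
    ├── country.json
    ├── province.json
    ├── source.json
    ├── town.json
    └── village.json
  4. Open main.js, uncomment the call to the main function, and run node main.js. This normally fetches the province, city, and country data without trouble
  5. Using the province, city, and country data already fetched, sync the remaining levels: comment out the main call again and enable pullTownDataSync or pullVillageDataSync as needed to fetch the other levels; see the comments on those functions for caveats
  6. Finally, run python makedata.py to generate the files in every format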
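The step that removes the JSON files under src could be scripted. The following is a minimal Python sketch, not something shipped with the repository: `remove_generated_json` is a hypothetical helper, and the demo runs against a temporary directory so nothing real is deleted.

```python
import tempfile
from pathlib import Path

# File names as listed in the removal step; the directory is supplied by the caller.
GENERATED_FILES = [
    "city.json",
    "country.json",
    "province.json",
    "source.json",
    "town.json",
    "village.json",
]


def remove_generated_json(src_dir):
    """Delete the generated JSON files in src_dir and return the names actually removed."""
    removed = []
    for name in GENERATED_FILES:
        path = Path(src_dir) / name
        if path.exists():
            path.unlink()
            removed.append(name)
    return removed


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "city.json").write_text("{}", encoding="utf-8")
    print(remove_generated_json(tmp))
```

Against a real checkout the call would be `remove_generated_json("src")`; files that are already missing are skipped.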

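Notes

The village data scraped from the towns is very large, so village data is not generated automatically by default. Clone the repository and generate it yourself if you need it.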

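The administrative levels, in order, are province -> city -> country -> town -> village, corresponding to: 省 (province) -> 市 (city, municipal districts) -> 县 (county, district, county-level city) -> 镇 (town, subdistrict) -> 村 (village, neighborhood committee)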
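The same hierarchy is visible in the 12-digit codes. The sketch below assumes the usual layout of the statistical division codes (2 digits each for province, city, and county, then 3 digits each for town and village) and assumes that parent codes are padded with zeros to 12 digits, as in the ids quoted elsewhere on this page. `ancestor_codes` is a hypothetical helper, not part of the repository.

```python
# Number of significant leading digits at each level (assumed code layout).
LEVEL_PREFIX_LENGTHS = [
    ("province", 2),
    ("city", 4),
    ("country", 6),  # the repository's name for the county level
    ("town", 9),
    ("village", 12),
]


def ancestor_codes(code):
    """Map each level name to the zero-padded 12-digit code of that level for `code`."""
    if len(code) != 12 or not code.isdigit():
        raise ValueError("expected a 12-digit code")
    return {level: code[:length].ljust(12, "0") for level, length in LEVEL_PREFIX_LENGTHS}


print(ancestor_codes("441900003000")["city"])  # 441900000000
```

Cities without a county level, such as Dongguan, produce a county-level code identical to the city code, so callers may want to special-case them.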

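Because the village data set is so large, Node.js runs into memory leaks while scraping it. Each incremental update therefore backs up the file automatically to src/village_backup.json, which is not committed to the repository, and you then adjust the offset by hand to resume.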
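The backup-and-offset idea can be sketched in Python as follows. This is not the logic in main.js: `process_with_backup`, the `fetch` callback, and the batch size are all hypothetical, and the sketch only shows how periodic backups let a run resume from a known offset.

```python
import json
import tempfile
from pathlib import Path


def process_with_backup(items, fetch, backup_path, offset=None, batch_size=100):
    """Process items from `offset` on, saving all results so far after every batch.

    If a backup already exists it is loaded first; when offset is None the run
    resumes at len(backup), i.e. right after the last saved result.
    """
    backup_path = Path(backup_path)
    results = []
    if backup_path.exists():
        results = json.loads(backup_path.read_text(encoding="utf-8"))
    if offset is None:
        offset = len(results)

    def save():
        backup_path.write_text(json.dumps(results, ensure_ascii=False), encoding="utf-8")

    for count, item in enumerate(items[offset:], start=1):
        results.append(fetch(item))
        if count % batch_size == 0:
            save()
    save()  # covers the last partial batch
    return results


# Demo with a stand-in fetch function and a temporary backup file.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "village_backup.json"
    print(process_with_backup([1, 2, 3], lambda n: n * 2, backup, batch_size=2))  # [2, 4, 6]
```

Passing an explicit `offset` corresponds to the manual adjustment described above; leaving it as `None` derives the offset from the backup.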

The village data file is compressed; to extract it, run tar xvfz village.tar.gz .

Village-level data is not generated by default; if you need it, run makedata.py

Feedback

If the HTML structure of the national standard pages changes, please open an issue.

Changelog

2021.2.23

Updated to the latest 2020 data at http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2020/index.html

2019.4.17

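fix #17: Dongguan, Zhongshan, and Danzhou, three cities that have no districts, are now handled separately. These cities go straight from city to town, so the level above a town is the city itself. Developers can special-case this as they see fit; see src/special_city.json. The SQL data is included in town.sql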

2019.4.9

2019.3.10

  • Changed how the data is generated
  • Check that the generated data is accurate: cat src/village.json | grep id | wc -l == wc -l mysql/village.sql
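The count comparison above can be approximated in Python. The sketch assumes, as the shell check implies, that village.json has one `id` per line and village.sql has one statement per line. `count_lines` and `counts_match` are hypothetical helpers, and the file contents in the demo are made up.

```python
import tempfile
from pathlib import Path


def count_lines(path, needle=None):
    """Count lines in path; if needle is given, count only lines containing it."""
    with Path(path).open(encoding="utf-8") as handle:
        return sum(1 for line in handle if needle is None or needle in line)


def counts_match(json_path, sql_path):
    """Compare lines mentioning "id" in the JSON file with the line count of the SQL file."""
    return count_lines(json_path, "id") == count_lines(sql_path)


# Demo with tiny made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    json_file = Path(tmp) / "village.json"
    sql_file = Path(tmp) / "village.sql"
    json_file.write_text('[\n{"id": "1", "name": "a"},\n{"id": "2", "name": "b"}\n]\n', encoding="utf-8")
    sql_file.write_text("INSERT INTO village VALUES (1);\nINSERT INTO village VALUES (2);\n", encoding="utf-8")
    print(counts_match(json_file, sql_file))  # True
```

Like the `grep id` original, this matches any line containing the substring `id`, so it is a sanity check rather than a proof of correctness.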

2019.2.11

  • Changed how the data is scraped; scraping now uses nodejs
  • Updated the data to the latest 2018 national standard
  • Removed sqlite and postgresql support

china_regions's People

Contributors

dependabot[bot], edwardzjl, jayl1n, mofelee, zhyq0826


china_regions's Issues

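Duplicate Tianjin data in the SQL files

Hello, and thank you for providing this data. I found that the Tianjin data is duplicated in county.sql and also in town.sql. Please fix this.

Errors when following the instructions

After deleting the json files, it reports that the file does not exist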

PS F:\china_regions-3.3> node main.js
2020-09-09T11:07:58+0800 main.js:51 (newRequestPromise) request == > http://219.235.129.117:80/tjsj/tjbz/tjyqhdmhcxhfdm/2018/
undefined:1

SyntaxError: Unexpected end of JSON input
at JSON.parse (<anonymous>)
at pullVillageDataSync (F:\china_regions-3.3\main.js:391:25)
at Object.<anonymous> (F:\china_regions-3.3\main.js:422:1)
at Module._compile (internal/modules/cjs/loader.js:1133:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1153:10)
at Module.load (internal/modules/cjs/loader.js:977:32)
at Function.Module._load (internal/modules/cjs/loader.js:877:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:74:12)
at internal/main/run_main_module.js:18:47
PS F:\china_regions-3.3> node main.js
2020-09-09T11:08:42+0800 main.js:51 (newRequestPromise) request == > http://219.235.129.117:80/tjsj/tjbz/tjyqhdmhcxhfdm/2018/

Then I added town.json to the folder

F:\china_regions-3.3\main.js:393
jsonData.slice(offset).forEach(function(element, index) {
^

TypeError: jsonData.slice is not a function
at pullVillageDataSync (F:\china_regions-3.3\main.js:393:14)
at Object.<anonymous> (F:\china_regions-3.3\main.js:422:1)
at Module._compile (internal/modules/cjs/loader.js:1133:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1153:10)
at Module.load (internal/modules/cjs/loader.js:977:32)
at Function.Module._load (internal/modules/cjs/loader.js:877:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:74:12)
at internal/main/run_main_module.js:18:47
PS F:\china_regions-3.3>

The data has not been updated; is this project still maintained?

Yijiang District, Wuhu, Anhui Province is
{"id":"340203000000","name":"弋江区","url":"http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2020/34/02/340203.html"}
The latest is
[340209000000](http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2021/34/02/340209.html)

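Missing data

county has no row with county_id=441900000000 (Dongguan), but the town table does contain rows with county_id=441900000000.

The cause is that Dongguan has no county level; it goes straight from the second level (city) to the fourth level (town).

Proposal: should a row be inserted into county?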

INSERT county VALUES(1,'东莞市','441900000000','441900000000')

Details:

SELECT * FROM town a
left JOIN county b ON  a.county_id = b.county_id
WHERE b.NAME IS null

SELECT * FROM city
WHERE city_id ='441900000000'
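The LEFT JOIN check above has a simple in-memory equivalent. In this Python sketch the column names `county_id` and `name` follow the queries, while the rows are illustrative rather than taken from the real tables; `orphan_towns` is a hypothetical helper.

```python
def orphan_towns(towns, counties):
    """Return towns whose county_id has no matching county row (LEFT JOIN ... IS NULL)."""
    known = {county["county_id"] for county in counties}
    return [town for town in towns if town["county_id"] not in known]


# Illustrative rows only; the second town is made up for contrast.
towns = [
    {"name": "东城街道办事处", "county_id": "441900000000"},
    {"name": "example town", "county_id": "340203000000"},
]
counties = [{"name": "弋江区", "county_id": "340203000000"}]

print([town["name"] for town in orphan_towns(towns, counties)])
```

Any town returned here points at a parent that exists only at the city level, which is the situation described for Dongguan.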

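Is the 市辖区 entry at the third level redundant?

Shandong Province, Jinan, 市辖区: this data is wrong, isn't it? Directly under Jinan there should be Lixia District, Huaiyin District, and so on. Why is there an extra 市辖区 (municipal district) entry?

Also, this third-level 市辖区 entry has no fourth-level data under it.

Could the author remove this entry?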

demo bug

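Your demo has a bug. The first time you pick a city, the district list is populated. If you then pick Dongguan, Zhongshan, or a similar city, the district list still shows the districts of the first city you picked.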

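Missing data

District and county data does not exist for Dongguan (Guangdong), Zhongshan (Guangdong), Sansha (Hainan), and similar places.

city.json may contain incorrect data

At line 1102 of city.json, Hainan Province contains Hainan Province: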

"460000": [
        {
            "province": "海南省", 
            "name": "海南省", 
            "id": "460000"
        }, 
        {
            "province": "海南省", 
            "name": "海口市", 
            "id": "460100"
        }, 
        {
            "province": "海南省", 
            "name": "三亚市", 
            "id": "460200"
        }, 
        {
            "province": "海南省", 
            "name": "三沙市", 
            "id": "460300"
        }, 
        {
            "province": "海南省", 
            "name": "儋州市", 
            "id": "460400"
        }, 
        {
            "province": "海南省", 
            "name": "省直辖县级行政区划", 
            "id": "469000"
        }
    ], 
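A self-listing like this can be detected mechanically. The Python sketch below assumes the dictionary shape shown in the excerpt (province id mapped to a list of records with an `id` key) and flags provinces that list themselves as a city or have an empty list. `suspicious_entries` is a hypothetical helper, and the second sample province is made up.

```python
def suspicious_entries(city_data):
    """Yield (province_id, reason) for provinces that list themselves or have no cities."""
    for province_id, cities in city_data.items():
        if not cities:
            yield province_id, "empty list"
        for city in cities:
            if city["id"] == province_id:
                yield province_id, "lists itself as a city"


sample = {
    "460000": [
        {"province": "海南省", "name": "海南省", "id": "460000"},
        {"province": "海南省", "name": "海口市", "id": "460100"},
    ],
    "990000": [],  # made-up province id with no cities
}

for province_id, reason in suspicious_entries(sample):
    print(province_id, reason)
```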

Line 11319 of area.json may contain an error

 "460000": [], 

Add Dongguan, Zhongshan, and Danzhou

INSERT INTO country VALUES ('3283', '东城街道办事处', '441900003000', '441900000000');
INSERT INTO country VALUES ('3284', '南城街道办事处', '441900004000', '441900000000');
INSERT INTO country VALUES ('3285', '万江街道办事处', '441900005000', '441900000000');
INSERT INTO country VALUES ('3286', '莞城街道办事处', '441900006000', '441900000000');
INSERT INTO country VALUES ('3287', '石碣镇', '441900101000', '441900000000');
INSERT INTO country VALUES ('3288', '石龙镇', '441900102000', '441900000000');
INSERT INTO country VALUES ('3289', '茶山镇', '441900103000', '441900000000');
INSERT INTO country VALUES ('3290', '石排镇', '441900104000', '441900000000');
INSERT INTO country VALUES ('3291', '企石镇', '441900105000', '441900000000');
INSERT INTO country VALUES ('3292', '横沥镇', '441900106000', '441900000000');
INSERT INTO country VALUES ('3293', '桥头镇', '441900107000', '441900000000');
INSERT INTO country VALUES ('3294', '谢岗镇', '441900108000', '441900000000');
INSERT INTO country VALUES ('3295', '东坑镇', '441900109000', '441900000000');
INSERT INTO country VALUES ('3296', '常平镇', '441900110000', '441900000000');
INSERT INTO country VALUES ('3297', '寮步镇', '441900111000', '441900000000');
INSERT INTO country VALUES ('3298', '樟木头镇', '441900112000', '441900000000');
INSERT INTO country VALUES ('3299', '大朗镇', '441900113000', '441900000000');
INSERT INTO country VALUES ('3300', '黄江镇', '441900114000', '441900000000');
INSERT INTO country VALUES ('3301', '清溪镇', '441900115000', '441900000000');
INSERT INTO country VALUES ('3302', '塘厦镇', '441900116000', '441900000000');
INSERT INTO country VALUES ('3303', '凤岗镇', '441900117000', '441900000000');
INSERT INTO country VALUES ('3304', '大岭山镇', '441900118000', '441900000000');
INSERT INTO country VALUES ('3305', '长安镇', '441900119000', '441900000000');
INSERT INTO country VALUES ('3306', '虎门镇', '441900121000', '441900000000');
INSERT INTO country VALUES ('3307', '厚街镇', '441900122000', '441900000000');
INSERT INTO country VALUES ('3308', '沙田镇', '441900123000', '441900000000');
INSERT INTO country VALUES ('3309', '道滘镇', '441900124000', '441900000000');
INSERT INTO country VALUES ('3310', '洪梅镇', '441900125000', '441900000000');
INSERT INTO country VALUES ('3311', '麻涌镇', '441900126000', '441900000000');
INSERT INTO country VALUES ('3312', '望牛墩镇', '441900127000', '441900000000');
INSERT INTO country VALUES ('3313', '中堂镇', '441900128000', '441900000000');
INSERT INTO country VALUES ('3314', '高埗镇', '441900129000', '441900000000');
INSERT INTO country VALUES ('3315', '松山湖管委会', '441900401000', '441900000000');
INSERT INTO country VALUES ('3316', '东莞港', '441900402000', '441900000000');
INSERT INTO country VALUES ('3317', '东莞生态园', '441900403000', '441900000000');
INSERT INTO country VALUES ('3318', '石岐区街道办事处', '442000001000', '442000000000');
INSERT INTO country VALUES ('3319', '东区街道办事处', '442000002000', '442000000000');
INSERT INTO country VALUES ('3320', '火炬开发区街道办事处', '442000003000', '442000000000');
INSERT INTO country VALUES ('3321', '西区街道办事处', '442000004000', '442000000000');
INSERT INTO country VALUES ('3322', '南区街道办事处', '442000005000', '442000000000');
INSERT INTO country VALUES ('3323', '五桂山街道办事处', '442000006000', '442000000000');
INSERT INTO country VALUES ('3324', '小榄镇', '442000100000', '442000000000');
INSERT INTO country VALUES ('3325', '黄圃镇', '442000101000', '442000000000');
INSERT INTO country VALUES ('3326', '民众镇', '442000102000', '442000000000');
INSERT INTO country VALUES ('3327', '东凤镇', '442000103000', '442000000000');
INSERT INTO country VALUES ('3328', '东升镇', '442000104000', '442000000000');
INSERT INTO country VALUES ('3329', '古镇镇', '442000105000', '442000000000');
INSERT INTO country VALUES ('3330', '沙溪镇', '442000106000', '442000000000');
INSERT INTO country VALUES ('3331', '坦洲镇', '442000107000', '442000000000');
INSERT INTO country VALUES ('3332', '港口镇', '442000108000', '442000000000');
INSERT INTO country VALUES ('3333', '三角镇', '442000109000', '442000000000');
INSERT INTO country VALUES ('3334', '横栏镇', '442000110000', '442000000000');
INSERT INTO country VALUES ('3335', '南头镇', '442000111000', '442000000000');
INSERT INTO country VALUES ('3336', '阜沙镇', '442000112000', '442000000000');
INSERT INTO country VALUES ('3337', '南朗镇', '442000113000', '442000000000');
INSERT INTO country VALUES ('3338', '三乡镇', '442000114000', '442000000000');
INSERT INTO country VALUES ('3339', '板芙镇', '442000115000', '442000000000');
INSERT INTO country VALUES ('3340', '大涌镇', '442000116000', '442000000000');
INSERT INTO country VALUES ('3341', '神湾镇', '442000117000', '442000000000');
INSERT INTO country VALUES ('3342', '那大镇', '460400100000', '460400000000');
INSERT INTO country VALUES ('3343', '和庆镇', '460400101000', '460400000000');
INSERT INTO country VALUES ('3344', '南丰镇', '460400102000', '460400000000');
INSERT INTO country VALUES ('3345', '大成镇', '460400103000', '460400000000');
INSERT INTO country VALUES ('3346', '雅星镇', '460400104000', '460400000000');
INSERT INTO country VALUES ('3347', '兰洋镇', '460400105000', '460400000000');
INSERT INTO country VALUES ('3348', '光村镇', '460400106000', '460400000000');
INSERT INTO country VALUES ('3349', '木棠镇', '460400107000', '460400000000');
INSERT INTO country VALUES ('3350', '海头镇', '460400108000', '460400000000');
INSERT INTO country VALUES ('3351', '峨蔓镇', '460400109000', '460400000000');
INSERT INTO country VALUES ('3352', '王五镇', '460400111000', '460400000000');
INSERT INTO country VALUES ('3353', '白马井镇', '460400112000', '460400000000');
INSERT INTO country VALUES ('3354', '中和镇', '460400113000', '460400000000');
INSERT INTO country VALUES ('3355', '排浦镇', '460400114000', '460400000000');
INSERT INTO country VALUES ('3356', '东成镇', '460400115000', '460400000000');
INSERT INTO country VALUES ('3357', '新州镇', '460400116000', '460400000000');
INSERT INTO country VALUES ('3358', '洋浦经济开发区', '460400499000', '460400000000');
INSERT INTO country VALUES ('3359', '华南热作学院', '460400500000', '460400000000');
