Novel Crawler
Kotlin
- Selenium
- JSoup
- Kotlin Serialization
- Log4j
- Apache POI
- Fetch novel information
// top.bilitianx.network.PersistKt
fun persistJSON(id: Int, filename: String)
id: the ID of the novel
filename: the path where the JSON file is saved
Note: some chapter URLs come out as "javascript:cid()" and must be corrected by hand.
Sample output:
{
"id": 8,
"name": "欢迎来到实力至上主义的教室",
"volumes": [
{
"name": "第一卷",
"chapters": [
{
"name": "插图",
"url": "https://www.linovelib.com/novel/8/114783.html"
},
// ......
]
}
// ......
]
}
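Since chapters with a "javascript:cid()" URL have to be fixed by hand, it can help to list them first. The following is a minimal sketch, not part of the project: the `Chapter`, `Volume`, and `Novel` classes and the `findPlaceholderChapters` function are hypothetical names that simply mirror the fields in the sample output above, and the project's actual classes may look different. The second chapter in `main` is made-up data for illustration.

```kotlin
// Hypothetical data model mirroring the sample JSON; the project's real classes may differ.
data class Chapter(val name: String, val url: String)
data class Volume(val name: String, val chapters: List<Chapter>)
data class Novel(val id: Int, val name: String, val volumes: List<Volume>)

// Returns (volume name, chapter) pairs whose URL is a "javascript:" placeholder
// instead of a real page address, so they can be corrected manually.
fun findPlaceholderChapters(novel: Novel): List<Pair<String, Chapter>> =
    novel.volumes.flatMap { volume ->
        volume.chapters
            .filter { it.url.trim().startsWith("javascript:", ignoreCase = true) }
            .map { volume.name to it }
    }

fun main() {
    val novel = Novel(
        id = 8,
        name = "欢迎来到实力至上主义的教室",
        volumes = listOf(
            Volume(
                name = "第一卷",
                chapters = listOf(
                    Chapter("插图", "https://www.linovelib.com/novel/8/114783.html"),
                    // Made-up entry to illustrate the placeholder case.
                    Chapter("Example chapter", "javascript:cid()")
                )
            )
        )
    )
    for ((volumeName, chapter) in findPlaceholderChapters(novel)) {
        println("$volumeName / ${chapter.name}: ${chapter.url}")
    }
}
```

Reporting the volume name alongside each chapter makes the entry easier to locate in the JSON file when editing it.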
- Crawl into a Word document
// top.bilitianx.network.PersistKt
fun persistMSWord(filename: String)
filename: the path to the novel's JSON file