Comments (5)

ddcw commented on July 26, 2024

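> Hi, I get this error when I run it. I then added sys.setrecursionlimit(2000) to ibd2sql\mysql_json.py, and it now runs without an error, but the generated SQL file is clearly missing data. How can I fix this?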

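  1. The default recursion limit should be 1000. If the JSON data is very large, that may indeed not be enough, so you can raise the limit a bit (sys.setrecursionlimit(10000)).
  2. Is the missing data in this JSON column, or in other columns? Has an online DDL been run on the table? Which MySQL version is it?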
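
To illustrate point 1, here is a minimal standalone sketch (not ibd2sql's actual JSON parser) of how a recursive walk over deeply nested data hits Python's recursion limit and succeeds once the limit is raised. The helper names `nesting_depth`, `build_nested`, and `depth_with_limit` are made up for this example.

```python
import sys

def nesting_depth(value):
    """Recursively count how many lists are nested inside each other."""
    if isinstance(value, list) and value:
        return 1 + nesting_depth(value[0])
    return 1 if isinstance(value, list) else 0

def build_nested(depth):
    """Build [[[...]]] of the given depth iteratively (no recursion needed)."""
    value = []
    for _ in range(depth - 1):
        value = [value]
    return value

def depth_with_limit(value, limit):
    """Run nesting_depth under a temporary recursion limit.

    Returns None if the walk raises RecursionError, and always restores
    the previous limit afterwards.
    """
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return nesting_depth(value)
    except RecursionError:
        return None
    finally:
        sys.setrecursionlimit(old_limit)

deep = build_nested(3000)
print(depth_with_limit(deep, 1000))    # too shallow for 3000 levels
print(depth_with_limit(deep, 10000))   # enough headroom
```

Raising the limit only removes the RecursionError; it does not by itself explain or repair missing rows, which is why the second question matters.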

from ibd2sql.

lihong-roy commented on July 26, 2024

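After I added sys.setrecursionlimit(10000) it runs without an error, but the database originally held roughly four to five hundred rows and only 160-odd were generated. I suspect there is dirty data, but I don't know which row it is. If dirty data is encountered, can it be skipped so that the rows after it are still generated? Thanks.
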

ddcw commented on July 26, 2024

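If the row count is wrong, the usual cause is a corrupt page, but a corrupt page would raise an error. If you want all of the data, you can brute-force parse the file page by page. See the script below; just change filename to the actual value.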

filename="/tmp/ddcw_alltype_table.ibd" # ibd file to parse
python3 main.py "${filename}" --ddl # print the table structure (DDL)
filesize=$(stat -c %s "${filename}")
maxpagecount=$(( filesize / 16384 )) # 16384 = default InnoDB page size in bytes
current_page=1
while [ "${current_page}" -le "${maxpagecount}" ]; do
	echo "-- ${filename} PAGE NO: ${current_page}"
	# parse exactly one page; errors from bad pages are discarded
	python3 main.py "${filename}" --sql --page-start "${current_page}" --page-count 1 2>/dev/null
	current_page=$(( current_page + 1 ))
done
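
The same page-by-page loop can also be sketched in Python, which runs unchanged on Linux and Windows. This is only an illustration built on the `--sql`, `--page-start`, and `--page-count` flags used above. The helper names are made up, the page size is assumed to be the default 16384 bytes, and the script is assumed to be started from the directory that contains main.py.

```python
import os
import subprocess
import sys

PAGE_SIZE = 16384  # assumption: default InnoDB page size, as in the script above

def page_count(path, page_size=PAGE_SIZE):
    """Number of whole pages in the file, derived from its size in bytes."""
    return os.path.getsize(path) // page_size

def page_command(path, page_no, python=sys.executable):
    """Build the argv that parses exactly one page of the ibd file."""
    return [python, "main.py", path, "--sql",
            "--page-start", str(page_no), "--page-count", "1"]

def parse_all_pages(path):
    """Parse pages 1..page_count one at a time.

    A page that fails to parse only affects its own subprocess, so the
    loop simply moves on to the next page.
    """
    for page_no in range(1, page_count(path) + 1):
        # flush so the marker stays ahead of the child's output when redirected
        print(f"-- {path} PAGE NO: {page_no}", flush=True)
        subprocess.run(page_command(path, page_no), stderr=subprocess.DEVNULL)
```

Calling `parse_all_pages("/tmp/ddcw_alltype_table.ibd")` and redirecting standard output to a file gives the same kind of output as the shell loop.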


lihong-roy commented on July 26, 2024

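Could you provide a version of the script above that runs in Windows cmd? I can't quite follow it, thanks. I tried python main.py file.ibd --sql --page-start 1 --page-count 60, and at most it still generates the same data as last time. Sorry for the trouble, and thank you.
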

ddcw commented on July 26, 2024

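> Could you provide a version of the script above that runs in Windows cmd? I can't quite follow it, thanks. I tried python main.py file.ibd --sql --page-start 1 --page-count 60, and at most it still generates the same data as last time. Sorry for the trouble, and thank you.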

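The ibd file probably only contains that much data. I'm not familiar with cmd either, but you can refer to the commands below (change the file path, save them in a .bat file, then run it).
This script parses the data in the ibd file one page at a time. It is fairly slow, but every page gets parsed, so no data is skipped.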

@echo off

REM ibd file to parse
set "filename=F:\py_workspace\ddcw_alltype_table.ibd"

REM Print the table structure (DDL)
python main.py "%filename%" --ddl

REM Get the file size via PowerShell
for /f "usebackq" %%A in (`PowerShell -Command "(Get-Item '%filename%').length"`) do set filesize=%%A

REM Compute the number of pages (16384-byte pages)
SET /A maxpagecount=%filesize% / 16384
SET /A current_page=1

REM Process every page in turn
:loop
if %current_page% gtr %maxpagecount% (
    goto :endloop
)

REM Show the current page
echo -- %filename% PAGE NO: %current_page%

REM Parse the current page; errors are discarded
python main.py "%filename%" --sql --page-start %current_page% --page-count 1 2> nul

REM Advance to the next page
SET /A current_page=current_page+1

REM Continue with the next iteration
goto :loop

:endloop

REM Wait (up to 10000 seconds, or until a key is pressed) so the window stays open
timeout /t 10000
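
To check how many rows each page actually produced, the script's output can be saved to a file and tallied. The sketch below is a hypothetical helper, not part of ibd2sql. It relies on the `-- <file> PAGE NO: <n>` marker lines echoed by the script, and it assumes each recovered row is written as a single line starting with INSERT.

```python
import re

# Marker line echoed by the script before each page is parsed
PAGE_MARKER = re.compile(r"^-- .* PAGE NO: (\d+)\s*$")

def rows_per_page(lines):
    """Map page number -> number of INSERT lines that follow its marker."""
    counts = {}
    current = None
    for line in lines:
        match = PAGE_MARKER.match(line)
        if match:
            current = int(match.group(1))
            counts.setdefault(current, 0)
        elif current is not None and line.lstrip().upper().startswith("INSERT"):
            counts[current] += 1
    return counts

sample = [
    "-- F:\\py_workspace\\ddcw_alltype_table.ibd PAGE NO: 1",
    "-- F:\\py_workspace\\ddcw_alltype_table.ibd PAGE NO: 2",
    "INSERT INTO t VALUES (1);",
    "INSERT INTO t VALUES (2);",
    "-- F:\\py_workspace\\ddcw_alltype_table.ibd PAGE NO: 3",
    "INSERT INTO t VALUES (3);",
]
print(rows_per_page(sample))
```

Summing the values gives the total number of recovered rows, which can be compared with the expected four to five hundred. Pages with a count of 0 are either non-data pages or pages that failed to parse.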
