Giter VIP home page Giter VIP logo

Comments (11)

wklken avatar wklken commented on August 30, 2024

优先级提升, 1.15 或者后一个版本需要出去

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

难点:

  • 如何保证并发的情况下, 在删除数据的同时, 可能数据已经被读取但是还没有被引用, 删除后再被引用的情况, 没有外键约束

原则:

  • 可以删除失败, 但是不能错误的删除

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

可能需要进行到Y2022M02

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

问题描述:

不考虑并发的情况下直接删除未被引用的expression:

DELETE FROM expression WHERE pk NOT IN (SELECT expression_pk FROM policy)

但是在删除的过程中,如果存在并发的模板授权操作,并且能查询到正在删除的expression的数据:

SELECT
pk,
type,
expression,
signature
FROM expression
WHERE pk IN (
	SELECT
	MIN(pk)
	FROM expression
	WHERE signature IN (?)
	AND type = ?
	GROUP BY signature
)

那么policy中模板授权的数据引用的expression_pk不存在的问题,即悬空引用

问题分析

要解决以上问题,可以从2个方向考虑

  1. 查询的时候对查询得到的数据上锁,使之删除的时候被忽略 -- 行不通
  2. 删除的时候确保删除的数据一定不会被引用到 -- 可以实现

方向1: 查询的时候对查询得到的数据上锁,使之删除的时候被忽略

通过SELECT * FROM expression FOR UPDATE可以上行锁, 删除的时候只会阻塞有锁的数据,等到锁释放的时候,数据还是会被删除,暂未找到删除时忽略被锁数据的方法

方向2: 删除的时候确保删除的数据一定不会被引用到

  • 对未被引用的数据先标记, 新增一个tinyint字段作为标记, 定时任务更新未被引用的数据状态
ALTER TABLE `bkiam`.`expression` ADD COLUMN `is_quoted` TINYINT(1) DEFAULT 1 NOT NULL;

// 定时更新
UPDATE experssion SET is_quoted = 0 WHERE pk NOT IN (SELECT expression_pk FROM policy);
  • 创建表达式时, 查询复用的expression筛选被标记的数据
SELECT
pk,
type,
expression,
signature
FROM expression
WHERE pk IN (
	SELECT
	MIN(pk)
	FROM expression
	WHERE signature IN (?)
	AND type = ?
	AND is_quoted = 1  // 增加筛选条件
	GROUP BY signature
)
  • 删除时选择标记过后一段时间内的数据,并且判断pk是否被引用, 定时任务
DELETE FROM expression WHERE pk NOT IN (SELECT expression_pk FROM policy) AND is_quoted = 0 AND updated_at < ?

2次定时任务之间存在时间差, 保证查询时一定不会查到已标记的数据

from bk-iam-saas.

wklken avatar wklken commented on August 30, 2024

方案 2 改一下, 更新signature使其无法被引用

问题: 第一个sql 有没有并发问题吗?

# 1. 确保不会被引用到
UPDATE expression set signature = signature+'timestamp' WHERE pk NOT IN (SELECT expression_pk FROM policy)

# 2. 删除
DELETE from expression where signature = 'timestamp' 

有问题, signature是 32 位, 应该不能更新

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024
  • 现有的expression表中有type类型,0 自定义, 1 模板策略,如果新增一个type 为-1, 或者其它的数值,更新时直接更新为新的值, 然后,删除时做筛选
  • 在删除的同时,需要把-1的数据如果有被引用刷新回1
  • 注释一下, 如果自定义的权限未来会被引用,需要把未引用的策略更新成-1

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024
ALTER TABLE `bkiam`.`expression` MODIFY COLUMN `type` SMALLINT(5) signed NOT NULL DEFAULT 0;

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

定时任务流程

  1. 更新因并发查询导致更新的, 更新时间为当前时间减1天
    UPDATE expression SET type=1 WHERE type=-1 AND pk IN (SELECT expression_pk FROM policy) AND updated_at < ?

  2. 删除真正未被引用的数据
    DELETE FROM expression WHERE type=-1 AND pk NOT IN (SELECT expression_pk FROM policy) AND updated_at < ?

  3. 更新未被引用的
    UPDATE expression SET type=-1 WHERE type=1 AND pk NOT IN (SELECT expression_pk FROM policy)

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024
  • saas增加调用后台清理api任务

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

sql: IN (SELECT expression_pk FROM policy)

  • 考虑是否用join可行,是否会更好
  • 用SELECT EXISTS是否会更好

from bk-iam-saas.

zhu327 avatar zhu327 commented on August 30, 2024

调研结果:

SELECT * FROM expression WHERE pk NOT IN (SELECT expression_pk FROM policy); 2.43

SELECT * FROM expression WHERE pk NOT IN (SELECT DISTINCT expression_pk FROM policy); 2.50

ALTER TABLE `policy` ADD INDEX `idx_expression_pk` (`expression_pk`);

SELECT * FROM expression LEFT JOIN policy ON expression.pk = policy.expression_pk WHERE policy.expression_pk IS NULL; 3.40

SELECT * FROM expression WHERE NOT EXISTS (SELECT * from policy WHERE expression.pk=policy.expression_pk); 2.73

使用join或者exists都依赖于新建的索引,policy/expression表数量都大于150万

from bk-iam-saas.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.