Five pitfalls of Redis distributed locks, and the tryLock method of Redisson's RLock

RLock tryLock leaseTime
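Redisson uses a renewal mechanism: at regular intervals it checks whether the lock is still held and, if so, extends the TTL of the corresponding key, so that the key cannot expire and be deleted while the lock is still in use. This renewal only applies when the lock is acquired without an explicit leaseTime; a lock taken with a leaseTime simply expires when that time is up.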

RLock tryLock WRONGTYPE Operation against a key holding the wrong kind of value
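Cause: the command being used conflicts with the type of the data stored under that key on the Redis server.
For example, a key holds a list, but a non-list command is run against it. With RLock this usually means the lock name collides with an existing key of another type (Redisson stores the lock as a hash).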

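How RLock and Lock acquire a lock: the key difference is the long leaseTime parameter, an automatic expiry that covers the case where the lock is never released properly, for example when an exception or crash prevents the unlock in the finally block from running.
RLock extends the Lock interface along with other Redisson interfaces and adds some methods, such as boolean tryLock(long waitTime, long leaseTime, TimeUnit unit). The new leaseTime sets the lock's expiry time: if the lock has not been released once leaseTime has elapsed, the key expires in Redis and the lock is released. If no leaseTime is specified, the lock starts with a 30 s expiry by default (the watchdog timeout) and is renewed automatically while it is held.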

RLock.java
Returns true as soon as the lock is acquired. If the lock is currently held by another thread in this or any other process in the distributed system this method keeps trying to acquire the lock for up to waitTime before giving up and returning false. If the lock is acquired, it is held until unlock is invoked, or until leaseTime have passed since the lock was granted - whichever comes first.
Params:
waitTime – the maximum time to acquire the lock
leaseTime – lease time
unit – time unit
Returns:
true if lock has been successfully acquired
Throws:
InterruptedException – if the thread is interrupted before or during this method.

##Method
boolean tryLock(long waitTime, long leaseTime, TimeUnit unit) throws InterruptedException;




Lock.java
Acquires the lock if it is free within the given waiting time and the current thread has not been interrupted.
If the lock is available this method returns immediately with the value true. If the lock is not available then the current thread becomes disabled for thread scheduling purposes and lies dormant until one of three things happens:
The lock is acquired by the current thread; or
Some other thread interrupts the current thread, and interruption of lock acquisition is supported; or
The specified waiting time elapses
If the lock is acquired then the value true is returned.

If the current thread:
has its interrupted status set on entry to this method; or
is interrupted while acquiring the lock, and interruption of lock acquisition is supported,
then InterruptedException is thrown and the current thread's interrupted status is cleared.
If the specified waiting time elapses then the value false is returned. If the time is less than or equal to zero, the method will not wait at all.
Implementation Considerations
The ability to interrupt a lock acquisition in some implementations may not be possible, and if possible may be an expensive operation. The programmer should be aware that this may be the case. An implementation should document when this is the case.
An implementation can favor responding to an interrupt over normal method return, or reporting a timeout.
A Lock implementation may be able to detect erroneous use of the lock, such as an invocation that would cause deadlock, and may throw an (unchecked) exception in such circumstances. The circumstances and the exception type must be documented by that Lock implementation.
Params:
time – the maximum time to wait for the lock
unit – the time unit of the time argument

Returns:
true if the lock was acquired and false if the waiting time elapsed before the lock was acquired
Throws:
InterruptedException – if the current thread is interrupted while acquiring the lock (and interruption of lock acquisition is supported)

##Method
boolean tryLock(long time, TimeUnit unit) throws InterruptedException;
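
The timeout behaviour described in the Javadoc above can be tried out with the JDK alone. The following is a minimal sketch that uses ReentrantLock as the Lock implementation; the class name TryLockTimeoutDemo and the helper tryWhileHeld are made up for this illustration. One thread holds the lock for a while, and the caller tries to take it with a bounded wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class TryLockTimeoutDemo {

    /** Tries to take the lock within waitMillis while another thread holds it for holdMillis. */
    static boolean tryWhileHeld(long holdMillis, long waitMillis) {
        Lock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();          // signal that the lock is now taken
                Thread.sleep(holdMillis);  // simulate business logic
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.setDaemon(true);
        holder.start();

        boolean acquired = false;
        try {
            held.await();                  // make sure the holder really owns the lock
            acquired = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            if (acquired) {
                lock.unlock();             // only unlock what was actually acquired
            }
        }
    }

    public static void main(String[] args) {
        // The holder outlasts the wait, so tryLock gives up
        System.out.println(tryWhileHeld(1000, 100));  // false
        // The holder releases within the wait, so tryLock succeeds
        System.out.println(tryWhileHeld(100, 2000));  // true
    }
}
```

RLock.tryLock(waitTime, leaseTime, unit) has the same waiting behaviour, with the extra leaseTime bounding how long the lock is held once acquired.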

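Supplement:
Five pitfalls of Redis distributed locks
1. The lock is never released
A thread that holds the lock should release it as soon as its business logic is done. With a reentrant lock, a thread that fails to acquire the lock can release its current connection and sleep for a while before retrying.
RLock lock = redissonClient.getLock("stockLock");
lock.lock();
try {
    // business logic
} finally {
    lock.unlock();
}

// Release the current Redis connection
redis.close();
// Sleep for 1000 ms
sleep(1000);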

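2. B's lock is released by A
Redis locking is based on the SETNX command. When the key does not exist, SETNX sets it to value and returns 1; if the key already exists, SETNX does nothing and returns 0.
SETNX key value
Threads A and B both try to lock the key myLock. A gets the lock first (say it expires after 3 seconds) and B waits, retrying. So far so good.
Now suppose the business logic is slow and runs past the lock's expiry time. A's lock is released automatically (the key is deleted), B sees that myLock no longer exists, runs SETNX and gets the lock.
But when A finishes its business logic it still goes on to release the lock (delete the key), so B's lock ends up being released by A.
To avoid this, each thread usually locks with its own unique value as an identifier, and on release deletes the key only if it still holds that value; otherwise lock releases get mixed up.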
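
The A/B scenario above can be replayed with a small in-memory sketch. It is not a Redis client: a ConcurrentHashMap stands in for the server, putIfAbsent plays the role of SETNX, and the class and method names (OwnerTokenLockSketch, expire) are made up for illustration. Against a real Redis server the "compare value, then delete" step has to be atomic, which is typically done with a Lua script.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class OwnerTokenLockSketch {
    // Stand-in for Redis: lock key -> unique value of the current holder
    private final ConcurrentMap<String, String> store = new ConcurrentHashMap<>();

    /** Like SETNX key token: succeeds only if the key does not exist yet. */
    boolean tryLock(String key, String token) {
        return store.putIfAbsent(key, token) == null;
    }

    /** Deletes the key only if it still holds the caller's token (atomic in this map). */
    boolean unlock(String key, String token) {
        return store.remove(key, token);
    }

    /** Simulates the key's expiry time firing. */
    void expire(String key) {
        store.remove(key);
    }

    public static void main(String[] args) {
        OwnerTokenLockSketch redis = new OwnerTokenLockSketch();
        String a = UUID.randomUUID().toString();  // A's unique value
        String b = UUID.randomUUID().toString();  // B's unique value

        System.out.println(redis.tryLock("myLock", a)); // true: A takes the lock
        redis.expire("myLock");                          // A is slow, the key expires
        System.out.println(redis.tryLock("myLock", b)); // true: B takes the lock
        System.out.println(redis.unlock("myLock", a));  // false: A's late unlock is ignored
        System.out.println(redis.unlock("myLock", b));  // true: B releases its own lock
    }
}
```

The third call is the one that matters: without the value check, A's late unlock would have deleted B's key.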

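3. Database transaction timeout

@Transactional
   public void lock() {

        while (true) {
            boolean flag = this.getLock(key);
            if (flag) {
                insert();
                break; // stop retrying once the work is done
            }
        }
    }

For example, we parse a large file and store the data in the database; if this takes too long, the transaction times out and is rolled back automatically.
Likewise, if the lock cannot be acquired for a long time and the wait far exceeds the database transaction timeout, the program throws an exception.
The usual fix is to commit and roll back the transaction manually.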

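@Autowired
    DataSourceTransactionManager dataSourceTransactionManager;

    // No @Transactional here: the transaction is managed by hand
    public void lock() {
        // Wait for the lock before opening the transaction, so that the wait
        // does not count against the transaction timeout
        while (!this.getLock(key)) {
            // retry until the lock is acquired
        }
        // Start the transaction manually
        TransactionStatus transactionStatus =
                dataSourceTransactionManager.getTransaction(new DefaultTransactionDefinition());
        try {
            insert();
            // Commit the transaction manually
            dataSourceTransactionManager.commit(transactionStatus);
        } catch (Exception e) {
            // Roll back the transaction manually
            dataSourceTransactionManager.rollback(transactionStatus);
        }
	}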

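4. The lock expires before the business logic finishes
We could lengthen the lock's expiry time by hand when locking, but how long is long enough? The execution time of business logic is unpredictable, and an overly long expiry hurts performance.
It would be nice if the lock's expiry time could be renewed automatically.
To solve this we use the Redis client Redisson. Redisson handles several thorny problems of using Redis in a distributed environment; its aim is to let users think less about Redis and spend more effort on business logic.
After a lock is acquired successfully, Redisson registers a scheduled task that checks the lock every 10 seconds and, if the lock is still held, renews its expiry time. The default expiry time is 30 seconds. This mechanism is known as the "watchdog". It only runs when the lock is acquired without an explicit leaseTime.
For example: the lock is taken with a 30-second expiry and checked every 10 seconds; as long as the business logic under the lock has not finished, each check resets the expiry time to 30 seconds.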
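
The renewal idea can be sketched without Redis. The class below is an in-memory illustration only, not Redisson's implementation: WatchdogLockSketch and all of its members are made-up names, the expiry is a timestamp field instead of a key TTL, and the renewal period is assumed to be one third of the lease, mirroring the 30 s / 10 s figures above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class WatchdogLockSketch {
    private final long leaseMillis;
    private final boolean watchdog;
    // Daemon thread so the renewal task never keeps the JVM alive
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "lock-watchdog");
                t.setDaemon(true);
                return t;
            });
    private String owner;          // null while the lock is free
    private long expiresAtMillis;  // plays the role of the key's TTL
    private ScheduledFuture<?> renewal;

    WatchdogLockSketch(long leaseMillis, boolean watchdog) {
        this.leaseMillis = leaseMillis;
        this.watchdog = watchdog;
    }

    synchronized boolean tryLock(String token) {
        long now = System.currentTimeMillis();
        if (owner != null && now < expiresAtMillis) {
            return false;                       // held and not yet expired
        }
        if (renewal != null) {
            renewal.cancel(false);              // drop the task of an expired holder
        }
        owner = token;
        expiresAtMillis = now + leaseMillis;
        if (watchdog) {
            long period = Math.max(1, leaseMillis / 3);
            renewal = scheduler.scheduleAtFixedRate(
                    () -> renew(token), period, period, TimeUnit.MILLISECONDS);
        }
        return true;
    }

    /** Watchdog tick: push the expiry out again if the same owner still holds the lock. */
    private synchronized void renew(String token) {
        long now = System.currentTimeMillis();
        if (token.equals(owner) && now < expiresAtMillis) {
            expiresAtMillis = now + leaseMillis;
        }
    }

    synchronized boolean isHeldBy(String token) {
        return token.equals(owner) && System.currentTimeMillis() < expiresAtMillis;
    }

    synchronized void unlock(String token) {
        if (!token.equals(owner)) {
            return;                             // never release someone else's lock
        }
        owner = null;
        if (renewal != null) {
            renewal.cancel(false);              // stop renewing once released
        }
    }

    /** Holds the lock for workMillis and reports whether it is still held afterwards. */
    static boolean stillHeldAfter(long leaseMillis, long workMillis, boolean watchdog) {
        WatchdogLockSketch lock = new WatchdogLockSketch(leaseMillis, watchdog);
        if (!lock.tryLock("A")) {
            return false;
        }
        try {
            Thread.sleep(workMillis);           // business logic longer than the lease
            return lock.isHeldBy("A");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock("A");
        }
    }

    public static void main(String[] args) {
        // 600 ms lease, 1500 ms of work: the lock survives only with the watchdog
        System.out.println("with watchdog:    " + stillHeldAfter(600, 1500, true));
        System.out.println("without watchdog: " + stillHeldAfter(600, 1500, false));
    }
}
```

The second case corresponds to pitfall 4 itself: with a fixed expiry and no renewal, the lock is gone before the work is done.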

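5. Redis master-slave replication
In a Redis Cluster environment, when client A wants to lock, it picks a master node according to the routing rules and writes the key mylock there. After the lock succeeds, the master replicates the key asynchronously to its slave.
If the master goes down at this point, a failover takes place to keep the cluster available and the slave becomes the new master. Client B then locks successfully on the new master, while client A still believes it holds the lock.
As a result, several clients hold the same distributed lock at the same time, which leads to all kinds of dirty data.
Within a replicated setup there is no real cure for this; all one can do is keep the machines as stable as possible to reduce the probability of it happening.

If a failover happens at the wrong moment, the slave being promoted may well not have the master's lock yet. The sequence is:
1. A client gets the lock on the Redis master;
2. the lock key has not yet been replicated to the slave;
3. the master fails, a failover occurs, and the slave is promoted to master.
The lock ends up lost.
Against this background, Redis author antirez proposed a more advanced distributed lock for distributed environments: Redlock.

As I understand it, the algorithm is roughly as follows.
Assume there are N Redis nodes. They are completely independent of each other, with no master-slave replication or any other cluster coordination mechanism. The lock is acquired and released on each of the N instances with the same method used for a single Redis instance.
The key point is that the nodes are completely independent!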
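
To make the role of independent nodes concrete, here is a rough in-memory sketch. It assumes the majority rule of the Redlock algorithm (the lock counts as acquired only if at least N/2 + 1 nodes accept it) and deliberately leaves out the timing parts (key expiry and the elapsed-time validity check). RedlockQuorumSketch, Node and their methods are made-up names, and each Node is just a map standing in for a standalone Redis instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RedlockQuorumSketch {

    /** One independent node; an in-memory stand-in for a standalone Redis instance. */
    static class Node {
        private final ConcurrentMap<String, String> keys = new ConcurrentHashMap<>();

        boolean setIfAbsent(String key, String token) {
            return keys.putIfAbsent(key, token) == null;
        }

        void deleteIfOwner(String key, String token) {
            keys.remove(key, token);   // only removes the key if it holds this token
        }
    }

    /** Acquires the lock only if a majority of nodes accept it; otherwise rolls back. */
    static boolean tryLock(List<Node> nodes, String key, String token) {
        int acquired = 0;
        for (Node node : nodes) {
            if (node.setIfAbsent(key, token)) {
                acquired++;
            }
        }
        int quorum = nodes.size() / 2 + 1;
        if (acquired >= quorum) {
            return true;
        }
        unlock(nodes, key, token);     // give back any partial acquisitions
        return false;
    }

    /** Releases the lock on every node where this token still owns it. */
    static void unlock(List<Node> nodes, String key, String token) {
        for (Node node : nodes) {
            node.deleteIfOwner(key, token);
        }
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            nodes.add(new Node());
        }

        System.out.println(tryLock(nodes, "mylock", "A")); // true: 3 of 3 nodes
        System.out.println(tryLock(nodes, "mylock", "B")); // false: 0 of 3 nodes

        // One node crashes and comes back empty, losing A's key
        nodes.set(0, new Node());
        System.out.println(tryLock(nodes, "mylock", "B")); // false: only 1 of 3 nodes

        unlock(nodes, "mylock", "A");
        System.out.println(tryLock(nodes, "mylock", "B")); // true: 3 of 3 nodes
    }
}
```

The third call is the master-failure case from above: losing a single node's copy of the key is not enough for B to obtain the lock, because B still cannot reach a majority.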

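A simple demonstration:

RedissonRedLock redLock = new RedissonRedLock(lock1, lock2, lock3);
// lock1, lock2 and lock3 are locks obtained from separate Redis nodes
boolean isLock = false;
try {
    isLock = redLock.tryLock(500, 30000, TimeUnit.MILLISECONDS);
    System.out.println("isLock = " + isLock);
    if (isLock) {
        // TODO: lock acquired, do the work here
        Thread.sleep(30000);
    }
} catch (InterruptedException e) {
    // Restore the interrupt flag instead of swallowing the exception
    Thread.currentThread().interrupt();
} finally {
    // Always release the lock, but only if it was actually acquired
    if (isLock) {
        redLock.unlock();
    }
}

The main change is how the RedLock is initialized: RedissonRedLock redLock = new RedissonRedLock(lock1, lock2, lock3); three nodes are used here, but more can be used.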

# Notes