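1. Background
During a preliminary study of bidirectional Redis synchronization, an application began hitting Redis connection errors after it was connected to Dynomite, throwing Unexpected end of stream.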
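2. Conclusions
- Dynomite's documentation on Redis command support states that the TIME command is not supported.
- In spring-data-redis 1.6.0.RELEASE, when PEXPIRE is executed with a TTL greater than Integer.MAX_VALUE, the library avoids an int overflow by issuing TIME to fetch the Redis server's clock and adding the TTL to that value.
- Upgrading spring-data-redis fixes the problem (verified with 1.8.16.RELEASE).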
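To make the overflow concrete, here is a minimal sketch in plain Java with no Redis or Spring involved. The class and method names are made up for illustration. It shows that a TTL of 30 days expressed in milliseconds no longer fits in an int, which is the situation the workaround described above tries to avoid.

```java
import java.util.concurrent.TimeUnit;

final class TtlOverflowDemo {
    /** Narrowing cast that an int-based expire API would force on the caller. */
    static int narrow(long ttl) {
        return (int) ttl;
    }

    /** True when the TTL cannot be passed through an int parameter unchanged. */
    static boolean overflowsInt(long ttl) {
        return ttl > Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long thirtyDaysMs = TimeUnit.DAYS.toMillis(30);  // 2_592_000_000
        System.out.println(overflowsInt(thirtyDaysMs));  // true
        System.out.println(narrow(thirtyDaysMs));        // -1702967296
        // Largest whole number of days whose millisecond value still fits in an int
        System.out.println(TimeUnit.MILLISECONDS.toDays(Integer.MAX_VALUE)); // 24
    }
}
```

In other words, any millisecond TTL longer than roughly 24.8 days crosses the Integer.MAX_VALUE threshold.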
3. Process
3.1 Symptom
After the application was connected to Dynomite, calls to one endpoint failed with a Redis connection error. The exception log:
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40)
at redis.clients.jedis.Protocol.process(Protocol.java:151)
at redis.clients.jedis.Protocol.read(Protocol.java:215)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:340)
at redis.clients.jedis.Connection.getIntegerReply(Connection.java:265)
at redis.clients.jedis.BinaryJedis.expire(BinaryJedis.java:436)
at org.springframework.data.redis.connection.jedis.JedisConnection.expire(JedisConnection.java:807)
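3.2 Diagnosis
Confirming the Redis client configuration
Open a shell on the application host and use Arthas to inspect the JedisConnectionFactory:
java -jar arthas-boot.jar
# Record invocations of RequestMappingHandlerAdapter to obtain a time-tunnel index
tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandleMethod
# Use that index to reach the Spring context and fetch a connection from the factory
tt -i [index] -w 'target.getApplicationContext().getBean("jedisConnectionFactory").getConnection()'
The Redis configuration turned out to be correct, and the client was confirmed to be pointing at Dynomite.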
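Locating the problem from the stack trace
The stack trace in 3.1 hides part of the story: debugging shows that a Redis command had already been executed before the methods in the trace were invoked.
Repeated debugging narrowed the problem down to JedisConnection.expire(byte[], long):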
// spring-data-redis 1.6.0.RELEASE
public class JedisConnection extends AbstractRedisConnection {
    public Boolean expire(byte[] key, long seconds) {
        /*
         * @see DATAREDIS-286 to avoid overflow in Jedis
         *
         * TODO Remove this workaround when we upgrade to a Jedis version that contains a
         * fix for: https://github.com/xetorthio/jedis/pull/575
         */
        if (seconds > Integer.MAX_VALUE) {
            // LINE 982 at JedisConnection
            // time() is the problematic call: it sends TIME to the server
            return pExpireAt(key, time() + TimeUnit.SECONDS.toMillis(seconds));
        }
        try {
            if (isPipelined()) {
                pipeline(new JedisResult(pipeline.expire(key, (int) seconds), JedisConverters.longToBoolean()));
                return null;
            }
            if (isQueueing()) {
                transaction(new JedisResult(transaction.expire(key, (int) seconds), JedisConverters.longToBoolean()));
                return null;
            }
            return JedisConverters.toBoolean(jedis.expire(key, (int) seconds));
        } catch (Exception ex) {
            throw convertJedisAccessException(ex);
        }
    }
}

// Jedis
public class BinaryJedis {
    public List<String> time() {
        checkIsInMultiOrPipeline();
        client.time();
        return client.getMultiBulkReply();
    }
}

public class BinaryClient {
    public void time() {
        // Root cause: the TIME command is sent to the server
        sendCommand(TIME);
    }
}
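Why the trace points at expire rather than at TIME can be illustrated with a toy model. Everything here is hypothetical: FakeConnection stands in for a connection to a proxy that hangs up on TIME, and expireWithFallback assumes a caller that catches the first failure and retries with EXPIRE on the same connection, which is what the packet capture described later in this section suggests. It is not the spring-data-redis or Jedis code.

```java
import java.util.ArrayList;
import java.util.List;

final class HiddenTimeCallSketch {
    /** Toy stand-in for a connection to a proxy that hangs up when it sees TIME. */
    static final class FakeConnection {
        final List<String> sent = new ArrayList<>();
        private boolean closedByPeer = false;

        void send(String command) {
            sent.add(command);
            if (closedByPeer || command.equals("TIME")) {
                closedByPeer = true; // once closed, every later command fails too
                throw new IllegalStateException(
                        "Unexpected end of stream while reading reply to " + command);
            }
        }
    }

    /** Hypothetical caller: try the millisecond path first, fall back to EXPIRE on failure. */
    static void expireWithFallback(FakeConnection conn, long ttlMillis) {
        try {
            if (ttlMillis > Integer.MAX_VALUE) {
                conn.send("TIME");      // overflow workaround asks the server for its clock
                conn.send("PEXPIREAT"); // never reached in this model
            } else {
                conn.send("PEXPIRE");
            }
        } catch (IllegalStateException first) {
            conn.send("EXPIRE");        // retried on the same, already closed connection
        }
    }

    public static void main(String[] args) {
        FakeConnection conn = new FakeConnection();
        try {
            expireWithFallback(conn, 2_592_000_000L); // 30 days in milliseconds
        } catch (IllegalStateException surfaced) {
            // Only the second failure is visible to the caller
            System.out.println(surfaced.getMessage());
        }
        System.out.println(conn.sent); // [TIME, EXPIRE]
    }
}
```

In this model the exception that reaches the caller names EXPIRE, while the command that actually caused the disconnect (TIME) appears only in the list of sent commands.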
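Final confirmation
Connecting a client directly to Dynomite and executing TIME causes the server to close the connection immediately.
The packet capture shows the same thing: Dynomite actively closes the connection with a FIN, and when the client then sends another request (the expire command issued after the catch), Dynomite replies with RST.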
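The direct test can also be scripted. The sketch below assumes only that the target speaks the Redis protocol (RESP). The class name, the two-second timeouts, and the command-line arguments are illustrative choices, not part of any library. It sends a raw TIME command and reports whether the peer closed the connection instead of replying.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;

final class TimeProbe {
    // RESP encoding of a one-element command array containing "TIME"
    static final byte[] TIME_COMMAND =
            "*1\r\n$4\r\nTIME\r\n".getBytes(StandardCharsets.US_ASCII);

    /** Sends TIME and returns true if the peer closes or resets instead of replying. */
    static boolean closesOnTime(String host, int port) throws IOException {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 2000);
            socket.setSoTimeout(2000); // a silent peer ends in SocketTimeoutException
            OutputStream out = socket.getOutputStream();
            out.write(TIME_COMMAND);
            out.flush();
            InputStream in = socket.getInputStream();
            try {
                return in.read() == -1; // -1 means the peer closed the stream
            } catch (SocketException reset) {
                return true;            // connection reset also counts as a hang-up
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java TimeProbe <host> <port>");
            return;
        }
        boolean closed = closesOnTime(args[0], Integer.parseInt(args[1]));
        System.out.println(closed ? "peer closed the connection on TIME" : "peer replied to TIME");
    }
}
```

Running it once against the Dynomite endpoint and once against a plain Redis instance gives a quick side-by-side comparison without involving the application.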
Fix
Upgrade spring-data-redis:
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-redis</artifactId>
    <!--<version>1.6.0.RELEASE</version>-->
    <version>1.8.16.RELEASE</version>
</dependency>
In that version, JedisConnection.expire(byte[], long) reads as follows:
public Boolean expire(byte[] key, long seconds) {
    Assert.notNull(key, "Key must not be null!");
    if (seconds > Integer.MAX_VALUE) {
        return pExpire(key, TimeUnit.SECONDS.toMillis(seconds));
    }
    try {
        if (isPipelined()) {
            pipeline(new JedisResult(pipeline.expire(key, (int) seconds), JedisConverters.longToBoolean()));
            return null;
        }
        if (isQueueing()) {
            transaction(new JedisResult(transaction.expire(key, (int) seconds), JedisConverters.longToBoolean()));
            return null;
        }
        return JedisConverters.toBoolean(jedis.expire(key, (int) seconds));
    } catch (Exception ex) {
        throw convertJedisAccessException(ex);
    }
}
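The difference between the two versions comes down to absolute versus relative expiry. As a sketch in plain Java (the class and method names are illustrative, not library API): an absolute deadline needs the server's current time as an input, whereas a relative TTL in milliseconds is computed from the argument alone, so no TIME round trip is required.

```java
import java.util.concurrent.TimeUnit;

final class ExpiryStrategies {
    /** Absolute deadline in ms: depends on the server clock, hence a TIME call. */
    static long absoluteDeadlineMillis(long serverNowMillis, long ttlSeconds) {
        return serverNowMillis + TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    /** Relative TTL in ms: derived from the argument only. */
    static long relativeTtlMillis(long ttlSeconds) {
        return TimeUnit.SECONDS.toMillis(ttlSeconds);
    }

    public static void main(String[] args) {
        long ttlSeconds = 3_000_000_000L; // larger than Integer.MAX_VALUE
        System.out.println(relativeTtlMillis(ttlSeconds));              // 3000000000000
        System.out.println(absoluteDeadlineMillis(1_000L, ttlSeconds)); // 3000000001000
    }
}
```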
Debugging and capturing packets again confirmed that the TCP traffic is now normal.