【hbase-2.0.2】 HBase node crashes on its own

2018-11-09 09:12:09,053 INFO  [Close-WAL-Writer-22] util.FSHDFSUtils: Recover lease on dfs file /hbase/WALs/dn1,16020,1541665707058/dn1%2C16020%2C1541665707058.1541725870076
2018-11-09 09:12:09,080 WARN  [Close-WAL-Writer-22] wal.AsyncFSWAL: close old writer failed
java.io.FileNotFoundException: File does not exist: /hbase/WALs/dn1,16020,1541665707058/dn1%2C16020%2C1541665707058.1541725870076
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2854)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:669)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:675)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1758)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
    at org.apache.hadoop.hdfs.DFSClient.recoverLease(DFSClient.java:1256)
    at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:279)
    at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:275)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.recoverLease(DistributedFileSystem.java:275)
    at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverLease(FSHDFSUtils.java:283)
    at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverDFSFileLease(FSHDFSUtils.java:216)
    at org.apache.hadoop.hbase.util.FSHDFSUtils.recoverFileLease(FSHDFSUtils.java:163)
    at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.recoverAndClose(FanOutOneBlockAsyncDFSOutput.java:555)
    at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.close(AsyncProtobufLogWriter.java:156)
    at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.lambda$closeWriter$6(AsyncFSWAL.java:641)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /hbase/WALs/dn1,16020,1541665707058/dn1%2C16020%2C1541665707058.1541725870076
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2854)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:669)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:675)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
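The trace above comes from the AsyncFSWAL code path (`FanOutOneBlockAsyncDFSOutput` / `AsyncProtobufLogWriter`), which was the default WAL implementation in HBase 2.0.x and had several known issues. As a hedged workaround sketch (an assumption, not a fix confirmed by this thread), some deployments fall back to the classic FSHLog writer via `hbase.wal.provider` in hbase-site.xml:

```xml
<!-- hbase-site.xml: switch the WAL from the default asyncfs implementation
     back to the classic FSHLog ("filesystem") provider. This is only a
     workaround sketch to isolate AsyncFSWAL-related failures; it is not a
     confirmed fix for this specific crash. Requires a RegionServer restart. -->
<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>
```

If the crashes stop after this change, that strongly suggests the AsyncFSWAL path is involved; if they continue, the root cause is more likely below HBase (HDFS or the machine itself).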


2018-11-10 update
Log from the second crash:

2018-11-10 06:44:51,817 INFO [MemStoreFlusher.1] regionserver.HRegion: Flushing 1/1 column families, dataSize=76.84 MB heapSize=128.02 MB
2018-11-10 06:44:51,885 INFO [RpcServer.default.FPBQ.Fifo.handler=76,queue=4,port=16020] regionserver.HRegion: writing data to region [redacted],\x83(@\x8Bfniqf.0l9r.cn\x01,1541803051295.343419b1785be1a24cde6046ff1506ef. with WAL disabled. Data may be lost in the event of a crash.
2018-11-10 06:45:00,882 INFO [MemStoreFlusher.0] regionserver.HRegion: Flushing 1/1 column families, dataSize=25.54 MB heapSize=57.49 MB
2018-11-10 06:45:59,721 WARN [ResponseProcessor for block BP-738707272-[redacted]-1530583979305:blk_1083816520_10079749] hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-738707272-[redacted]-1530583979305:blk_1083816520_10079749
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2294)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:847)
2018-11-10 06:46:00,256 WARN [main-SendThread(nna:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 75830ms for sessionid 0x401c6473d414cd4
2018-11-10 06:46:00,256 INFO [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=1.36 GB, freeSize=18.64 GB, max=20 GB, blockCount=11016, accesses=3009211, hits=3009211, hitRatio=100.00%, , cachingAccesses=3005897, cachingHits=3005897, cachingHitsRatio=100.00%, evictions=6778, evicted=0, evictedPerRun=0.0
2018-11-10 06:46:00,256 INFO [BucketCacheStatsExecutor] bucket.BucketCache: failedBlockAdditions=0, totalSize=32.00 GB, freeSize=31.64 GB, usedSize=369.64 MB, cacheSize=361.53 MB, accesses=27986918, hits=10715280, IOhitsPerSecond=393, IOTimePerHit=0.01, hitRatio=38.29%, cachingAccesses=10742305, cachingHits=10709179, cachingHitsRatio=99.69%, evictions=0, evicted=6355, evictedPerRun=0.0
2018-11-10 06:45:59,721 WARN [ResponseProcessor for block BP-738707272-[redacted]-1530583979305:blk_1083816517_10079746] hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block BP-738707272-[redacted]-1530583979305:blk_1083816517_10079746
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2294)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:847)
2018-11-10 06:46:00,261 WARN [AsyncFSWAL-0] wal.AsyncFSWAL: sync failed
java.io.IOException: stream already broken
at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:424)
at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:513)
at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.sync(AsyncProtobufLogWriter.java:143)
at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:351)
at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume(AsyncFSWAL.java:534)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
2018-11-10 06:46:00,260 INFO [main-SendThread(nna:2181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 75830ms for sessionid 0x401c6473d414cd4, closing socket connection and attempting reconnect
2018-11-10 06:46:00,260 WARN [DataStreamer for file /hbase/data/default/[redacted]/f321ac78d0e3d7cac6dfdc6af9a74b48/.tmp/a/7d6fe995824c4921976044021ef6165e block BP-738707272-[redacted]-1530583979305:blk_1083816510_10079739] hdfs.DFSClient: DataStreamer Exception
java.io.IOException: 断开的管道 (Broken pipe)
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.hadoop.hdfs.DFSPacket.writeTo(DFSPacket.java:176)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:613)
2018-11-10 06:46:00,271 WARN [AsyncFSWAL-0] wal.AsyncFSWAL: sync failed
java.io.IOException: stream already broken
at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush0(FanOutOneBlockAsyncDFSOutput.java:424)
at org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.flush(FanOutOneBlockAsyncDFSOutput.java:513)
at org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.sync(AsyncProtobufLogWriter.java:143)
at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:351)
at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.consume(AsyncFSWAL.java:534)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
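Note the pattern in this second log: the ZooKeeper client had "not heard from server in 75830ms", and at the same moment every HDFS write pipeline broke at once (Premature EOF, Broken pipe, "stream already broken"). That is the classic signature of the whole JVM or machine stalling for over a minute (a long stop-the-world GC pause, swapping, or a disk/network hang) rather than a problem on the ZooKeeper side: the RegionServer loses its session and then aborts itself. As a hedged mitigation sketch (the value below is an illustrative assumption), the session timeout can be raised in hbase-site.xml while the root cause of the stall is investigated:

```xml
<!-- hbase-site.xml: raise the RegionServer's ZooKeeper session timeout so a
     long stall does not immediately cost it its session. Illustrative value;
     the effective timeout is negotiated with the quorum and must fit within
     ZooKeeper's minSessionTimeout/maxSessionTimeout window, so those server
     settings may need raising too. This masks the stall, it does not fix it. -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
```

Independently of this setting, checking the RegionServer GC logs (and system metrics such as swap activity) around 06:45-06:46 should show whether a pause of that length actually occurred.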

ev


This is a bug. Under what circumstances did you trigger it?
Issue link: http://mail-archives.apache.or ... %253E

Potato - 我的土豆不是梦


I also run into the errors from your second log all the time.

Did you ever solve this problem? How?
In my case I do nothing more than open an HBase connection and read data. I frequently hit that error, and when I run it again the error is gone. I still haven't found a solution!



China HBase Technology Community WeChat official account: hbasegroup