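I: Background
1. The story
In the middle of last month, a friend from my Knowledge Planet community reached out on WeChat. His program's memory kept growing slowly while it ran and was never released, and he wanted to know how to fix it.
Well, it looks like the community still needs more of my attention! Either way, let WinDbg do the talking.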
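II: WinDbg analysis
1. Reasoning from experience
My friend's screenshot shows a large number of 8216-byte byte[] arrays. What does that tell us? Readers who follow this series may remember the memory-surge dump from a certain Grade-A tertiary hospital, which contained byte[] arrays of exactly this size (8216 - 24 = 8192, the 24 bytes being the array's object header, method table pointer and length field on 64-bit). In that case the OraBuf inside the Oracle SDK misbehaved when reading a large field. In other words, this is almost certainly another pooled buffer created by a low-level or third-party library. Let's start with the managed heap.
2. Inspecting the managed heap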
    0:000> !dumpheap -stat
    Statistics:
    00007ffe107248f0   483707     15478624 System.Threading.PreAllocatedOverlapped
    00007ffe1079c160   483744     15479808 System.Threading.ThreadPoolBoundHandle
    00007ffe1079cff8   483701     23217648 System.Threading._IOCompletionCallback
    00007ffe106e7a90   483704     23217792 Microsoft.Win32.SafeHandles.SafeFileHandle
    00007ffe1079b088   483703     30956992 System.IO.FileSystemWatcher+AsyncReadState
    00007ffe1079ceb0   483707     34826904 System.Threading.OverlappedData
    00007ffe1079ccb0   483707     34826904 System.Threading.ThreadPoolBoundHandleOverlapped
    0000016c64651080   245652   1473128080 Free
    00007ffe105abf30   488172   3977571092 System.Byte[]
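After scanning the managed heap, wow: it wasn't byte[] that caught my eye but System.IO.FileSystemWatcher+AsyncReadState. System.IO.FileSystemWatcher has burned me so many times that it is etched into my memory. It has been behind both hung programs and runaway handle counts. Odds are it is the culprit again this time, and evidently plenty of programmers still fall into this trap.
To be rigorous, I still started from the largest entry, System.Byte[], grouping its instances by size and sorting the groups by total size in descending order. I'll spare you the ugly script and go straight to its output.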
    !dumpheap -mt 00007ffe105abf30
    size=8216,count=483703,totalsize=3790M
    size=8232,count=302,totalsize=2M
    size=65560,count=6,totalsize=0M
    size=131096,count=2,totalsize=0M
    size=4120,count=11,totalsize=0M
    size=56,count=301,totalsize=0M
    size=88,count=186,totalsize=0M
    size=848,count=16,totalsize=0M
    size=152,count=85,totalsize=0M
    size=46,count=242,totalsize=0M
    size=279,count=38,totalsize=0M

    !dumpheap -mt 00007ffe105abf30 -min 0n8216 -max 0n8216 -short
    0000016c664277f0
    0000016c66432a48
    0000016c6648ef88
    0000016c6649daa8
    0000016c6649fb00
    0000016c664a8b90
    ...
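For readers who want to reproduce the grouping, here is a minimal sketch of the idea. It is not the script used for the output above. It assumes the text of `!dumpheap -mt <MT>` has been saved to a file and that each object line has three whitespace-separated columns (address, method table, decimal size); `HeapSizeGrouping` and `GroupBySize` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class HeapSizeGrouping
{
    // Groups "<address> <mt> <size>" lines by size and orders the groups
    // by total bytes, largest first. Lines that do not match (headers,
    // statistics, blank lines) are skipped.
    public static List<(long Size, int Count, long TotalSize)> GroupBySize(IEnumerable<string> lines)
    {
        var counts = new Dictionary<long, int>();
        foreach (var line in lines)
        {
            var parts = line.Split(' ', StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length != 3)
                continue;

            // Assumption: the size column is printed in decimal.
            if (!long.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out var size))
                continue;

            counts[size] = counts.TryGetValue(size, out var seen) ? seen + 1 : 1;
        }

        return counts
            .Select(kv => (Size: kv.Key, Count: kv.Value, TotalSize: kv.Key * kv.Value))
            .OrderByDescending(row => row.TotalSize)
            .ToList();
    }
}
```

Feeding it `File.ReadLines(path)` for the saved output and printing each row as `size=...,count=...,totalsize=...` would give a listing of the same shape as the one above.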
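The grouped output shows roughly 480,000 byte[] arrays with size=8216, and the script also lists some of the addresses of those 8216-byte arrays. Next, use !gcroot to see what is holding on to them.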
    0:000> !gcroot 0000016c664277f0
    HandleTable:
        0000016C65FC28C0 (async pinned handle)
        -> 0000016C6628DEB0 System.Threading.OverlappedData
        -> 0000016C664277F0 System.Byte[]

    Found 1 unique roots (run '!gcroot -all' to see all roots).

    0:000> !gcroot 0000016c667c80d0
    HandleTable:
        0000016C65FB7920 (async pinned handle)
        -> 0000016C663260F8 System.Threading.OverlappedData
        -> 0000016C667C80D0 System.Byte[]
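The output shows that these byte[] arrays are all rooted by async pinned handles. In other words, they are the buffers that get filled in when an asynchronous I/O operation completes. The next step is to follow OverlappedData back to the place in the source that allocates the 8192-byte byte[].
If you know FileSystemWatcher, the reverse lookup chain is roughly OverlappedData -> ThreadPoolBoundHandleOverlapped -> System.IO.FileSystemWatcher+AsyncReadState -> Buffer (the byte[]), with the binding between the ThreadPool and the SafeHandle involved along the way.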
    0:000> !do 0000016C663260F8
    Name:        System.Threading.OverlappedData
    MethodTable: 00007ffe1079ceb0
    EEClass:     00007ffe107ac8d0
    Size:        72(0x48) bytes
    File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\5.0.10\System.Private.CoreLib.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    00007ffe106e3c08  40009ce        8  System.IAsyncResult  0 instance 0000000000000000 _asyncResult
    00007ffe104a0c68  40009cf       10        System.Object  0 instance 0000016c66326140 _callback
    00007ffe1079cb60  40009d0       18 ...eading.Overlapped  0 instance 0000016c663260b0 _overlapped
    00007ffe104a0c68  40009d1       20        System.Object  0 instance 0000016c667c80d0 _userObject
    00007ffe104af508  40009d2       28                  PTR  0 instance 00000171728f66e0 _pNativeOverlapped
    00007ffe104aee60  40009d3       30        System.IntPtr  1 instance 0000000000000000 _eventHandle
    00007ffe104ab258  40009d4       38         System.Int32  1 instance                0 _offsetLow
    00007ffe104ab258  40009d5       3c         System.Int32  1 instance                0 _offsetHigh

    0:000> !do 0000016c663260b0
    Name:        System.Threading.ThreadPoolBoundHandleOverlapped
    MethodTable: 00007ffe1079ccb0
    EEClass:     00007ffe107ac858
    Size:        72(0x48) bytes
    File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\5.0.10\System.Private.CoreLib.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    00007ffe1079ceb0  40009d6        8 ...ng.OverlappedData  0 instance 0000016c663260f8 _overlappedData
    00007ffe1079b818  40009c0       10 ...ompletionCallback  0 instance 0000016f661ab8a0 _userCallback
    00007ffe104a0c68  40009c1       18        System.Object  0 instance 0000016c667ca0e8 _userState
    00007ffe107248f0  40009c2       20 ...locatedOverlapped  0 instance 0000016c66326090 _preAllocated
    00007ffe104af508  40009c3       30                  PTR  0 instance 00000171728f66e0 _nativeOverlapped
    00007ffe1079c160  40009c4       28 ...adPoolBoundHandle  0 instance 0000000000000000 _boundHandle
    00007ffe104a7238  40009c5       38       System.Boolean  1 instance                0 _completed
    00007ffe1079b818  40009bf      738 ...ompletionCallback  0   static 0000016f661ab990 s_completionCallback

    0:000> !do 0000016c667ca0e8
    Name:        System.IO.FileSystemWatcher+AsyncReadState
    MethodTable: 00007ffe1079b088
    EEClass:     00007ffe107a9dc0
    Size:        64(0x40) bytes
    File:        C:\Program Files\dotnet\shared\Microsoft.NETCore.App\5.0.10\System.IO.FileSystem.Watcher.dll
    Fields:
                  MT    Field   Offset                 Type VT     Attr            Value Name
    00007ffe104ab258  400002b       30         System.Int32  1 instance                1 <Session>k__BackingField
    00007ffe105abf30  400002c        8        System.Byte[]  0 instance 0000016c667c80d0 <Buffer>k__BackingField
    00007ffe106e7a90  400002d       10 ...es.SafeFileHandle  0 instance 0000016c66326028 <DirectoryHandle>k__BackingField
    00007ffe1079c160  400002e       18 ...adPoolBoundHandle  0 instance 0000016c66326058 <ThreadPoolBinding>k__BackingField
    00007ffe107248f0  400002f       20 ...locatedOverlapped  0 instance 0000016c66326090 <PreAllocatedOverlapped>k__BackingField
    00007ffe1079b8c8  4000030       28 ...eSystem.Watcher]]  0 instance 0000016c66326078 <WeakWatcher>k__BackingField
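The <Buffer>k__BackingField above is the buffer that was handed to OverlappedData for the asynchronous I/O read. The next stop is the source of System.IO.FileSystemWatcher+AsyncReadState.
With that understood, the next step was to ask my friend whether appsettings was being loaded anywhere with reloadOnChange: true, since that option makes the configuration provider watch the file through a FileSystemWatcher. He searched the code and found something roughly like this: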
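    // Called on every request.
    public object GetxxxFlag()
    {
        string value = AppConfig.GetConfig("appsettings.json").GetValue("xxxx", "0");
        return new { state = 200, data = value };
    }

    public class AppConfig
    {
        private readonly IConfigurationRoot _config;
        private readonly string _settingfile;

        // static, but not a singleton: every call constructs a new AppConfig.
        public static AppConfig GetConfig(string settingfile = "appsettings.json")
        {
            return new AppConfig(settingfile);
        }

        private AppConfig(string settingfile)
        {
            // reloadOnChange: true sets up file watching for each configuration built here.
            _config = new ConfigurationBuilder().AddJsonFile(settingfile, optional: true, reloadOnChange: true).Build();
            _settingfile = settingfile;
        }

        // GetValue(key, defaultValue) is not shown in the excerpt.
    }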
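Judging from this code, my guess is that my friend assumed marking GetConfig as static made it a singleton, so that later calls would not run new AppConfig(settingfile) again. That is exactly where the problem lies: every call builds a fresh configuration with reloadOnChange: true, and the dump shows roughly 480,000 of the resulting watcher states still alive.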
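One possible way to get the behavior my friend expected is to cache one instance per settings file, so the configuration (and its watcher) is built only once. This is a minimal sketch under that assumption, not the fix from the earlier posts; `PerFileCache` is a hypothetical helper that uses only `ConcurrentDictionary` and `Lazy<T>` from the base class library.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: builds at most one value per settings file name.
public sealed class PerFileCache<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _cache =
        new ConcurrentDictionary<string, Lazy<T>>(StringComparer.OrdinalIgnoreCase);

    private readonly Func<string, T> _factory;

    public PerFileCache(Func<string, T> factory)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
    }

    // Lazy<T> (default thread-safety mode) ensures the factory runs once per key,
    // even if several threads ask for the same file at the same time.
    public T Get(string settingFile)
    {
        return _cache.GetOrAdd(settingFile, file => new Lazy<T>(() => _factory(file))).Value;
    }
}
```

In the code above, `AppConfig` could hold a `static readonly PerFileCache<AppConfig>` created with `file => new AppConfig(file)` and have `GetConfig` return `Cache.Get(settingfile)` instead of constructing a new object on every call.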
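Interestingly, in the previous two posts FileSystemWatcher made the program hang, so why not this time? It just so happens that he does not write log files under the program's root directory; otherwise... Still, having dodged the hang, he could not dodge the 8 KB (8192-byte) buffer that every watcher allocates by default.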
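To put a number on that per-watcher cost: `FileSystemWatcher.InternalBufferSize` defaults to 8192 bytes, and in the dump each buffer occupies 8216 bytes once the 24-byte array overhead is included. The sketch below is only a rough estimate under those assumptions; `WatcherBufferEstimate` and its method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class WatcherBufferEstimate
{
    // Assumption taken from the dump: 8216 - 8192 = 24 bytes of array overhead on 64-bit.
    private const long ArrayOverheadBytes = 24;

    // Reads the default buffer size from a watcher that is never started.
    public static int DefaultBufferSize()
    {
        using (var watcher = new FileSystemWatcher())
        {
            return watcher.InternalBufferSize;
        }
    }

    // Estimated heap bytes held by the read buffers of watcherCount watchers.
    public static long EstimateBytes(long watcherCount, int bufferSize)
    {
        return watcherCount * (bufferSize + ArrayOverheadBytes);
    }
}
```

With the 483,703 buffers counted earlier, `EstimateBytes(483703, 8192)` gives 3,974,103,848 bytes, which matches the 3790M reported for the size=8216 group.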
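III: Summary
All in all, be very careful with reloadOnChange: true. It can make your program hang, leak handles, leak memory, and more. I won't go into the fix here; see the earlier posts in this series.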