{"id":126283,"date":"2024-11-13T10:36:57","date_gmt":"2024-11-13T01:36:57","guid":{"rendered":"https:\/\/softantenna.com\/blog\/?p=126283"},"modified":"2024-11-13T10:36:57","modified_gmt":"2024-11-13T01:36:57","slug":"linux-performance-4000","status":"publish","type":"post","link":"https:\/\/softantenna.com\/blog\/linux-performance-4000\/","title":{"rendered":"One-Line Change Boosts Linux Performance by 4000% (40x)"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/softantenna.com\/blog\/wp-content\/uploads\/2023\/08\/pexels-photo-842654.jpeg\" alt=\"\" width=\"1125\" height=\"750\" class=\"aligncenter size-full wp-image-113050\" \/><\/p>\n<p>A change to just one line of the Linux kernel source code has been found to improve performance by roughly 4000%, and it is drawing attention (<a href=\"https:\/\/www.phoronix.com\/news\/Intel-Linux-3888.9-Performance\">Phoronix<\/a>).<\/p>\n<p>According to a <a href=\"https:\/\/lore.kernel.org\/lkml\/202411072132.a8d2cf0f-oliver.sang@intel.com\/\">report<\/a> from the Linux kernel test robot operated by Intel, the \"will-it-scale.per_process_ops\" scalability test case running on an Intel Xeon Platinum (Cooper 
Lake) test server showed a 3888.9% improvement. The kernel test robot is a service that monitors the Linux kernel source code and automatically builds and tests it whenever something changes; it is run to improve kernel quality and catch bugs early.<\/p>\n<p>The commit behind the large speedup is \"<a href=\"https:\/\/git.kernel.org\/pub\/scm\/linux\/kernel\/git\/torvalds\/linux.git\/commit\/?id=d4148aeab412432bf928f311eca8a2ba52bb05df\">mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes<\/a>\". According to the patch message, it fixes an earlier performance regression and has been confirmed to improve performance substantially in certain special cases.<\/p>\n<blockquote>\n<p>Since commit efa7df3e3bb5 (\"mm: align larger anonymous mappings on THP boundaries\") a mmap() of anonymous memory without a specific address hint and of at least PMD_SIZE will be aligned to PMD so that it can benefit from a THP backing page.<\/p>\n<p>However this change has been shown to regress some workloads significantly. [1] reports regressions in various spec benchmarks, with up to 600% slowdown of the cactusBSSN benchmark on some platforms. The benchmark seems to create many mappings of 4632kB, which would have merged to a large THP-backed area before commit efa7df3e3bb5 and now they are fragmented to multiple areas each aligned to PMD boundary with gaps between. 
The regression then seems to be caused mainly due to the benchmark's memory access pattern suffering from TLB or cache aliasing due to the aligned boundaries of the individual areas.<\/p>\n<p>Another known regression bisected to commit efa7df3e3bb5 is darktable and early testing suggests this patch fixes the regression there as well.<\/p>\n<p>To fix the regression but still try to benefit from THP-friendly anonymous mapping alignment, add a condition that the size of the mapping must be a multiple of PMD size instead of at least PMD size. 
In case of many odd-sized mapping like the cactusBSSN creates, those will stop being aligned and with gaps between, and instead naturally merge again.<\/p>\n<\/blockquote>\n<p>The merged mmap patch touches just a single line of code.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/softantenna.com\/blog\/wp-content\/uploads\/2024\/11\/s_20241113_103434.jpg\" alt=\"\" width=\"842\" height=\"250\" class=\"aligncenter size-full wp-image-126284\" 
\/><\/p>\n<p>Note, however, that this improvement was measured in a synthetic test; real-world workloads are unlikely to see gains anywhere near this large.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A change to just one line of the Linux kernel source code has been found to improve performance by roughly 4000%, and it is drawing attention (Phoronix). According to a report from the Linux kernel test robot operated by Intel, 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":113050,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"swell_btn_cv_data":"","footnotes":""},"categories":[75],"tags":[37],"class_list":["post-126283","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-software","tag-linux"],"_links":{"self":[{"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/posts\/126283","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/comments?post=126283"}],"version-history":[{"count":0,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/posts\/126283\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/media\/113050"}],"wp:attachment":[{"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/media?parent=126283"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/categories?post=126283"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/softantenna.com\/blog\/wp-json\/wp\/v2\/tags?post=126283"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}