Los procesos de vaciado consumen demasiada CPU

9

El servidor es una instancia EC2, significa guardar archivos en NAS (NFS) desde HTTPD.

Los procesos como flush-0: 32 consumen más del 90% de CPU y promedio de carga: 65.50, 64.02, 66.59.

Según el gráfico, aumenta cada día, mientras que el promedio de carga inicial fue de ~ 1.01, 2.02, 1.80 en 4 núcleos. He agregado otra instancia similar en Load Balancer y su utilización de CPU es solo de aproximadamente% 6 ATM.

¿Qué hacen exactamente estos procesos de descarga?

¿Quizás deberíamos desactivar la caché de atributos NFS si los clientes solo necesitan escribir datos?

¿Podría ser debido a la fragmentación de paquetes?

Aquí hay algunas estadísticas de nfsstat -s -4:

=================================================================
Server 0:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
715054137   0          0          0          0       

Server nfs v4:
null         compound     
993       0% 715053143 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 143229323  6% 78092765  3% 36693816  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
3486926   0% 0         0% 0         0% 679872421 28% 158406682  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 95872524  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
78173920  3% 0         0% 46442107  1% 1668      0% 715044032 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
7110      0% 42081145  1% 9116904   0% 0         0% 7026      0% 1257      0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
14        0% 81591622  3% 81659244  3% 0         0% 21028018  0% 3244      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3244      0% 0         0% 114086560  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 1:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
307172153   0          0          0          0       

Server nfs v4:
null         compound     
427       0% 307171725 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 58451998  5% 32717934  3% 15557564  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
1424670   0% 0         0% 0         0% 291829363 28% 67959378  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 41790934  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
32741459  3% 0         0% 18993781  1% 75        0% 307167167 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
3108      0% 18598329  1% 3892199   0% 0         0% 742       0% 2         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
3         0% 34142743  3% 34166131  3% 0         0% 8963430   0% 1449      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
1449      0% 0         0% 54628017  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 2:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
53026598   0          0          0          0       

Server nfs v4:
null         compound     
89        0% 53026508 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 9605179   5% 5472897   3% 2633853   1% 
create       delegpurge   delegreturn  getattr      getfh        link         
231276    0% 0         0% 0         0% 50395149 28% 11903036  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 7528324   4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
5477948   3% 0         0% 2760996   1% 15        0% 53025580 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
633       0% 3566001   2% 673466    0% 0         0% 11        0% 0         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
2         0% 5704253   3% 5709223   3% 0         0% 1526967   0% 292       0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
292       0% 0         0% 10439844  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 3:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
63045403   0          0          0          0       

Server nfs v4:
null         compound     
121       0% 63045280 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 11504749  5% 6504139   3% 3119453   1% 
create       delegpurge   delegreturn  getattr      getfh        link         
271128    0% 0         0% 0         0% 59865633 28% 14058385  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 8852565   4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
6521385   3% 0         0% 3365913   1% 15        0% 63043988 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
874       0% 4209822   2% 791702    0% 0         0% 6         0% 0         0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
9         0% 6775367   3% 6792509   3% 0         0% 1811226   0% 409       0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
409       0% 0         0% 12368747  5% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 4:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
817288490   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 817287285 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 164320609  6% 89471711  3% 42448842  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4101436   0% 0         0% 0         0% 778155935 28% 180629867  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 109104313  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
89598740  3% 0         0% 53534516  1% 9727      0% 817288175 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8020      0% 46348481  1% 10773529  0% 0         0% 100880    0% 12342     0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
80        0% 93709338  3% 93712518  3% 0         0% 24303185  0% 3352      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3352      0% 0         0% 127464001  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 5:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
804660319   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 804659114 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161331719  6% 88318366  3% 41571552  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4090384   0% 0         0% 0         0% 764533960 28% 177969216  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 107385644  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88321805  3% 0         0% 54307425  2% 444       0% 804647492 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8353      0% 45980723  1% 10410930  0% 0         0% 88471     0% 440       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
21        0% 92410970  3% 92412629  3% 0         0% 23733174  0% 3688      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3688      0% 0         0% 125268800  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 6:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
795385017   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 795383812 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 158633282  5% 87331357  3% 41400927  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4080179   0% 0         0% 0         0% 756063664 28% 176355823  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 106692513  4% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
87333398  3% 0         0% 53591273  2% 187       0% 795371861 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8367      0% 45030006  1% 10352133  0% 0         0% 80473     0% 151       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
28        0% 91411943  3% 91413728  3% 0         0% 23629833  0% 3707      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3707      0% 0         0% 124033436  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 7:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
801916264   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 801915059 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161888929  6% 88285947  3% 41212864  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4069479   0% 0         0% 0         0% 762072130 28% 177131560  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 106437411  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88288337  3% 0         0% 54779036  2% 191       0% 801903449 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8225      0% 45488312  1% 10259565  0% 0         0% 76243     0% 177       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
4         0% 92355900  3% 92357993  3% 0         0% 23515286  0% 3558      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3558      0% 0         0% 123741908  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 8:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
804732833   1          1          0          0       

Server nfs v4:
null         compound     
1204      0% 804731628 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161428939  6% 88340891  3% 41568432  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4085332   0% 0         0% 0         0% 764396486 28% 177796853  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 107176837  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88342790  3% 0         0% 54344886  2% 187       0% 804720008 29% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8226      0% 46219141  1% 10381361  0% 0         0% 83380     0% 160       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
0         0% 92426602  3% 92428282  3% 0         0% 23736349  0% 3554      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3554      0% 0         0% 125088530  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 

=================================================================
Server 9:

Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
800961003   0          0          0          0       

Server nfs v4:
null         compound     
1204      0% 800959798 99% 

Server nfs v4 operations:
op0-unused   op1-unused   op2-future   access       close        commit       
0         0% 0         0% 0         0% 161394704  6% 88264733  3% 41642226  1% 
create       delegpurge   delegreturn  getattr      getfh        link         
4098314   0% 0         0% 0         0% 761225542 28% 172733291  6% 0         0% 
lock         lockt        locku        lookup       lookup_root  nverify      
0         0% 0         0% 0         0% 102217363  3% 0         0% 0         0% 
open         openattr     open_conf    open_dgrd    putfh        putpubfh     
88272429  3% 0         0% 53937975  2% 467       0% 800948292 30% 0         0% 
putrootfh    read         readdir      readlink     remove       rename       
8312      0% 45893437  1% 10409370  0% 0         0% 83127     0% 478       0% 
renew        restorefh    savefh       secinfo      setattr      setcltid     
35        0% 92369729  3% 92371221  3% 0         0% 23772628  0% 3637      0% 
setcltidconf verify       write        rellockowner bc_ctl       bind_conn    
3637      0% 0         0% 124997490  4% 0         0% 0         0% 0         0% 
exchange_id  create_ses   destroy_ses  free_stateid getdirdeleg  getdevinfo   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
getdevlist   layoutcommit layoutget    layoutreturn secinfononam sequence     
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
set_ssv      test_stateid want_deleg   destroy_clid reclaim_comp 
0         0% 0         0% 0         0% 0         0% 0         0% 
Roman Newaza
fuente
¿Cuánto es demasiado?
Fred Foo
dado que la pregunta es sobre un error del kernel que informa que la versión del kernel habría ayudado.
Dmitri Chubarov

Respuestas:

6

Los procesos de vaciado son responsables de gestionar la reescritura de páginas sucias en el sistema de archivos del que provienen. Nunca deberían ocupar mucha CPU; la mayor parte de su tiempo se pasa esperando el disco (o la red para NFS y similares). Si ve un uso elevado de la CPU en un proceso de descarga, puede ser un error del kernel; intente reiniciar, esto debería borrar el estado.

bdonlan
fuente
¡Hola! ¿Qué son las páginas sucias? Sí, acabo de terminar la instancia fallida y las otras funcionan normalmente en cajeros automáticos.
Roman Newaza
Las páginas sucias son porciones de archivos que se almacenan en la memoria caché, se han modificado en la memoria mediante algún proceso, pero aún no se han vuelto a escribir en el disco. Normalmente, esta escritura ocurre en segundo plano, a menos que la cantidad de páginas sucias sea demasiado alta, la memoria libre sea demasiado baja o el programa pregunte explícitamente: cuando la escritura ocurre en segundo plano, el trabajo de Flush es activar esta escritura.
bdonlan
¿Qué echo 3 > /proc/sys/vm/drop_cachespasa si corro para vaciar la memoria caché de página, dentries e inodes?
Roman Newaza
@RomanNewaza esto solo soltará cachés que no estén bloqueados o sucios. flush generalmente no hace nada con cachés no sucios, por lo que dudo que haga una diferencia (y esto afectará gravemente el rendimiento en un servidor hasta que los cachés se vuelvan a calentar)
bdonlan
Veo. El problema parece haber desaparecido después de la actualización del kernel.
Roman Newaza