2022 Server Inspection Specification
Server Inspection Specification

I) Collecting server (application and database) software and hardware information

This is essentially a one-time task. Once the information is collected, compile it into a document and keep it for reference.

1. Collect the host name:
    hostname
2. Collect CPU count and model information:
    grep "model name" /proc/cpuinfo
3. Collect memory information:
    free
4. Collect disk information:
    fdisk -l
5. Collect disk usage:
    df -m
6. Collect system information:
    getconf LONG_BIT
    lsb_release -a
    cat /etc/issue
    cat /proc/version
7. Collect application installation paths:
    JDK installation path
    Tomcat installation path and startup.sh parameters
    trans installation path and startup.sh parameters
8. Collect database information:
    cat /home/oracle/.bash_profile    (database installation parameters)
   Run the following in PL/SQL Developer:
    select * from v$version;    (database version)
    select name, value from v$parameter
      where name in ('db_name', 'service_names', 'instance_name', 'processes',
                     'sga_max_size', 'db_cache_size', 'large_pool_size',
                     'shared_pool_size', 'java_pool_size', 'log_buffer',
                     'log_archive_dest', 'undo_management', 'undo_tablespace',
                     'undo_retention', 'db_recovery_file_dest_size',
                     'db_recovery_file_dest', 'pga_aggregate_target');    (database parameters)
    select * from v$nls_parameters;    (database character set)
    select * from v$controlfile;    (control files)
    select * from v$log;
    select * from v$logfile;    (online redo logs)
    archive log list    (archive log settings)
    select tablespace_name, file_name, autoextensible, bytes/1024/1024 "bytes(mb)"
      from dba_data_files order by tablespace_name, file_name;    (data files)
    select username, default_tablespace, temporary_tablespace
      from dba_users order by 1;    (user information)
   Expected result: depending on conditions at the site, an application user's default tablespace must not be SYSTEM, and its temporary tablespace must be TEMP.

II) Routine inspection

Each region must carry out this inspection once a week, handle any anomalies promptly, and report major problems to the supervisor. Once the results are collected, compile them into a document and retain it.

Application server:

1. Check disk usage:
    df -h
2. Check application processes:
    ps -ef | grep java
3. Check the size of the application and transfer back-end logs:
    ll -h ./logs/catalina.out
   If the file exceeds 2 GB, delete it; it is regenerated when the application restarts.
4. Monitor system performance:
    vmstat 2 10
5. Check server load (capture busy periods only):
    sar

Database server:

1. Check disk usage:
    df -h
2. Check database processes:
    ps -ef | grep ora
3. Check the database alert log (sid is the instance SID):
    tail -f $ORACLE_BASE/admin/sid/bdump/alert_.log
   Check whether the log contains errors:
    more $ORACLE_BASE/admin/sid/bdump/alert_.log
   Then press v to enter edit mode, press : to enter search mode, and type /ORA- to search for the ORA- keyword.
4. Monitor system performance:
    vmstat 2 10
5. Check server load (capture busy periods only):
    sar
6. Check the listener:
    lsnrctl status
7. Check the listener log:
    ll -h $ORACLE_HOME/network/log/listener.log
   Delete it promptly when it approaches 2 GB.
8. Check database instance status:
    select inst_id, instance_name, host_name, version,
           to_char(startup_time, 'yyyy-mm-dd hh24:mi:ss') startup_time,
           status, archiver, database_status
      from gv$instance;
9. Check database open time:
    select inst_id, dbid, name,
           to_char(created, 'yyyy-mm-dd hh24:mi:ss') created, log_mode,
           to_char(version_time, 'yyyy-mm-dd hh24:mi:ss') version_time, open_mode
      from gv$database;
10. Check the number of connections (the total should generally not exceed 150):
    select count(*) from v$session;
11. Check the number of concurrent active sessions:
    select count(*) from v$session where status = 'ACTIVE';
12. Check tablespace availability:
    select tablespace_name, status from dba_tablespaces;
13. Check tablespace usage:
    select a.tablespace_name,
           trunc(sum(a.bytes)/1024/1024, 2) total,
           trunc(sum(a.bytes)/1024/1024 - sum(b.bytes)/1024/1024, 2) used,
           trunc(sum(b.bytes)/1024/1024, 2) free,
           to_char(trunc((sum(a.bytes)/1024/1024 - sum(b.bytes)/1024/1024)/(sum(a.bytes)/1024/1024), 4)*100)||'%' pused,
           to_char(trunc((sum(b.bytes)/1024/1024)/(sum(a.bytes)/1024/1024), 4)*100)||'%' pfree
      from (select sum(bytes) bytes, tablespace_name from dba_data_files group by tablespace_name) a,
           (select sum(bytes) bytes, tablespace_name from dba_free_space group by tablespace_name) b
     where a.tablespace_name = b.tablespace_name(+)
     group by a.tablespace_name;
14. Check shared pool performance:
    select request_misses, request_failures from v$shared_pool_reserved;
   Expected result: request_misses and request_failures should both be close to 0.
   Inspection note: request_misses is the number of times the reserved list had no free memory chunk to satisfy a request, so that objects began to be flushed from the LRU list; request_failures is the number of times no memory could be found to satisfy a request.
15. Monitor the dictionary cache hit ratio in the SGA; it should be close to 1 (100 as reported by this query):
    select parameter, gets, getmisses,
           getmisses/(gets+getmisses)*100 "miss ratio",
           (1-(sum(getmisses)/(sum(gets)+sum(getmisses))))*100 "hit ratio"
      from v$rowcache
     where gets+getmisses <> 0
     group by parameter, gets, getmisses;
16. Check the database redo log buffer; the ratios should be below 1%:
    select name, gets, misses, immediate_gets, immediate_misses,
           decode(gets, 0, 0, misses/gets*100) ratio1,
           decode(immediate_gets+immediate_misses, 0, 0,
                  immediate_misses/(immediate_gets+immediate_misses)*100) ratio2
      from v$latch
     where name in ('redo allocation', 'redo copy');
17. Check job status:
    select job, schema_user, last_date, last_sec, next_date, total_time, broken, failures, what
      from dba_jobs;
18. Check SQL disk read frequency:
    select a.username, b.disk_reads, b.executions,
           round(b.disk_reads/decode(b.executions, 0, 1, b.executions), 2) disk_read_ratio,
           b.sql_text
      from dba_users a, v$sqlarea b
     where a.user_id = b.parsing_user_id and disk_reads > 90000;
19. Data file I/O frequency:
    select c.tablespace_name tbs, b.name, a.phyblkrd+a.phyblkwrt total,
           a.phyrds, a.phywrts, a.phyblkrd, a.phyblkwrt
      from v$filestat a, v$datafile b, dba_data_files c
     where b.file# = a.file# and b.file# = c.file_id
     order by tablespace_name, a.file#;
20. Disk I/O frequency:
    select substr(b.name, 1, 13) disk, c.tablespace_name, a.phyblkrd+a.phyblkwrt total,
           a.phyrds, a.phywrts, a.phyblkrd, a.phyblkwrt,
           (a.readtim/decode(a.phyrds, 0, 1, a.phyblkrd)/100) avg_rd_time,
           (a.writetim/decode(a.phywrts, 0, 1, a.phyblkwrt)/100) avg_wrt_time
      from v$filestat a, v$datafile b, dba_data_files c
     where b.file# = a.file# and b.file# = c.file_id
     order by disk, c.tablespace_name, a.file#;
21. Find the most fragmented table (replace the placeholder with the region's data user):
    select segment_name table_name, count(*) extents
      from dba_segments
     where owner in ('<data user for this region>')
     group by segment_name
    having count(*) = (select max(count(*)) from dba_segments group by segment_name);
22. Check sequence values:
    select seq_ywlsh.nextval seq_ywlsh from dual;    (ywlsh sequence)
    select seq_ryid.nextval seq_ryid from dual;    (ryid sequence)
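
The two log-size checks in the routine inspection (catalina.out above 2 GB, listener.log approaching 2 GB) are easy to get wrong when read by eye from a listing. The following is a minimal sketch of how that comparison could be scripted, not part of the original specification. The function name check_log_size, its OK/OVER/MISSING output format, and the reading of "2 GB" as 2 GiB are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: flag a log file once it reaches a size limit.
# check_log_size and its output format are hypothetical helpers,
# not something defined by the inspection specification.

LIMIT_BYTES=$((2 * 1024 * 1024 * 1024))   # assumption: "2 GB" means 2 GiB

check_log_size() {
    file=$1
    limit=${2:-$LIMIT_BYTES}              # optional second argument overrides the limit
    if [ ! -f "$file" ]; then
        echo "MISSING $file"
        return 2
    fi
    # wc -c prints the byte count; tr strips the padding some implementations add.
    size=$(wc -c < "$file" | tr -d ' ')
    if [ "$size" -ge "$limit" ]; then
        echo "OVER $file $size"
        return 1
    fi
    echo "OK $file $size"
    return 0
}
```

Under these assumptions, running check_log_size ./logs/catalina.out on the application server, or check_log_size "$ORACLE_HOME/network/log/listener.log" on the database server, prints one status line per file. The script only reports; whether and when to delete the file stays a manual decision, as in the checklist.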