[ show all running processes ]

(1) ps -aux | less

'ps' means: Process Status

The -a option tells ps to list the processes of all users on the system rather than just those of the current user, limited to processes that are attached to a terminal. Strictly speaking, these are BSD-style options and are written without a dash (ps aux); the POSIX reading of ps -aux is "processes owned by user x", and procps only falls back to the BSD meaning when no such user exists.

The -u option tells ps to provide detailed information about each process. The -x option adds to the list processes that have no controlling terminal, such as daemons, which are programs that are launched during booting (i.e., computer startup) and run unobtrusively in the background until they are activated by a particular event or condition.

As the list of processes can be quite long and occupy more than a single screen, the output of ps -aux can be piped (i.e., transferred) to the less command, which lets it be viewed one screenful at a time. The output can be advanced one screen forward by pressing the SPACE bar and one screen backward by pressing the b key.
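When only a few processes are of interest, filtering the listing is often quicker than paging through it. The following is a minimal sketch: ps_filter is a hypothetical helper (not part of ps), and sshd is just an example pattern. It keeps the header row and matches only against the COMMAND column.

```shell
# ps_filter PATTERN: hypothetical helper; reads `ps aux` output on stdin.
ps_filter() {
    # The pattern travels through the environment rather than the awk
    # command line, so the filter's own process never matches itself.
    PS_PATTERN=$1 awk '
        NR == 1 { print; next }                           # keep the header row
        {
            cmd = ""
            for (i = 11; i <= NF; i++) cmd = cmd " " $i   # COMMAND is column 11 onward
            if (index(cmd, ENVIRON["PS_PATTERN"]) > 0) print
        }'
}

ps aux | ps_filter sshd
```

The match is a plain substring test, so characters such as . or * in the pattern are taken literally.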

(2) top

top opens a live process monitor on the command line. By default its columns are, from left to right:

  • PID - Process ID
  • USER - User
  • PR - Priority
  • NI - Nice level
  • VIRT - Virtual memory used by the process
  • RES - Resident memory used by the process
  • SHR - Shareable memory
  • S - Process state
  • %CPU - CPU used by the process as a percentage
  • %MEM - Memory used by the process as a percentage
  • TIME+ - Total CPU time the process has used
  • COMMAND - Command

Some options:

  • -h - Show the version and usage help, then exit
  • -c - Toggle the command column between the full command line and the program name
  • -d - Specify the delay time between refreshing the screen
  • -o - Sorts by the named field
  • -p - Only show processes with specified process IDs
  • -u - Show only processes by the specified user
  • -i - Do not show idle tasks

[ create a new env with conda ]

conda create --name {ENV_NAME} python={PYTHON_VERSION}

The command has two placeholders, ENV_NAME and PYTHON_VERSION. Set them to suit your needs.

[ show the usage of the NVIDIA GPUs ]

(1) nvidia-smi

Some useful options:

-l : Loop, printing the status repeatedly at the given interval in seconds; if no interval is given, the default is 5 seconds. For example, nvidia-smi -l 10 refreshes every 10 seconds.

-i : Restrict the output to one GPU, for example nvidia-smi -i 0.

-f : Write the output to the given file instead of stdout.

For the official documentation, see: http://developer.download.nvidia.com/compute/DCGM/docs/nvidia-smi-367.38.pdf
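For scripts, the query mode of nvidia-smi is easier to parse than the default table. The following is a minimal sketch: gpu_memory_report is a hypothetical helper name, and it assumes the comma-separated rows produced by --query-gpu with --format=csv,noheader,nounits.

```shell
# gpu_memory_report: hypothetical helper; expects "index, used, total" rows on stdin.
gpu_memory_report() {
    awk -F', *' '{
        pct = ($3 + 0 > 0) ? 100 * $2 / $3 : 0     # avoid dividing by zero on odd rows
        printf "GPU %s: %d/%d MiB (%.0f%%)\n", $1, $2, $3, pct
    }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits | gpu_memory_report
else
    echo "nvidia-smi not found; skipping the live query"
fi
```

Feeding it a sample row such as printf '0, 2048, 8192\n' | gpu_memory_report is a quick way to try the helper on a machine without a GPU.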

(2) gpustat

gpustat is another tool for showing GPU status. Running it under the watch command is a convenient way to refresh the readings periodically.

gpustat [options]

options:

  • --color : Force colored output (even when stdout is not a tty)
  • --no-color : Suppress colored output
  • -u, --show-user : Display username of the process owner
  • -c, --show-cmd : Display the process name
  • -p, --show-pid : Display PID of the process
  • -P, --show-power : Display GPU power usage and/or limit (draw or draw,limit)
  • --watch, -i, --interval : Run in watch mode (equivalent to watch gpustat) if given. Denotes interval between updates.

[ watch ]

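A standard Linux utility (part of the procps package) that reruns a command at a fixed interval and shows its latest output full-screen.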

watch [options] {COMM}

options:

-n: interval between runs, in seconds (the default is 2);
-d: highlight the differences between successive outputs;
-t: do not show the header.

e.g. watch -n 1 -d gpustat
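watch redraws the whole terminal, which is awkward when the readings should go to a log file. In that case a plain loop can stand in for it. The following is a minimal sketch; repeat_every is a hypothetical helper, not an option of watch.

```shell
# repeat_every INTERVAL COUNT COMMAND...: hypothetical helper, not part of watch.
repeat_every() {
    interval=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"                          # run the command with its arguments
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"         # pause between runs, like watch -n
        fi
    done
}

# Three timestamps, one second apart.
repeat_every 1 3 date
```

Redirecting the call, for example repeat_every 1 3 date >> log.txt, appends each run to a file instead of overwriting the screen.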

[ show size of directories ]

du -h --max-depth=1

-h: human-readable sizes (K, M, G)

--max-depth: maximum recursion depth; 1 lists only the immediate subdirectories
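To see what --max-depth=1 keeps and what it folds away, it helps to run du on a small throwaway tree. The following is a minimal sketch assuming GNU du and sort; the directory names are made up for the demonstration.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/a/deep" "$demo/b"
printf 'hello\n' > "$demo/a/deep/file.txt"

# One line per entry at depth <= 1: a, b, and the directory itself.
# a/deep is counted in the size of a but gets no line of its own.
du -h --max-depth=1 "$demo" | sort -h

depth1_lines=$(du --max-depth=1 "$demo" | wc -l)
summary_lines=$(du -s "$demo" | wc -l)
echo "depth 1: $depth1_lines lines, summary: $summary_lines line"

rm -rf "$demo"
```

Piping through sort -h orders the entries by size, which puts the largest subdirectory next to the total on the last lines.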

[ show disk usage ]

df -h

-h: human-readable sizes (K, M, G)
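When a script only needs the fill level of one filesystem, the df output can be reduced to a single figure. The following is a minimal sketch; usage_percent is a hypothetical helper, and it assumes the filesystem name contains no spaces.

```shell
# usage_percent PATH: hypothetical helper that prints the Use% figure for PATH.
usage_percent() {
    # -P selects the POSIX layout: one line per filesystem, fixed columns.
    df -P "$1" | awk 'NR == 2 { print $5 }'
}

usage=$(usage_percent /)
echo "root filesystem: $usage used"
```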

[ count the lines in a file ]

wc [options] file

wc stands for word count. Without options it prints three numbers: the lines, words, and bytes in the file. Use wc -l file to print only the line count.
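The three counts are easy to check on a small file. The following is a minimal sketch using a temporary file with two lines, three words, and fourteen bytes.

```shell
tmp=$(mktemp)
printf 'one two\nthree\n' > "$tmp"

wc "$tmp"                    # lines, words, bytes, then the file name

# Reading from stdin leaves the file name out, which suits scripts.
lines=$(wc -l < "$tmp")
words=$(wc -w < "$tmp")
bytes=$(wc -c < "$tmp")
echo "lines=$lines words=$words bytes=$bytes"

rm -f "$tmp"
```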

[ download files and install ]

wget

Frequently used option: wget -O {NEW_NAME} {FILE_URL} saves the download as {NEW_NAME} instead of the name taken from the URL.

[ stop a process forcibly ]

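kill [options] {PID}

options:

-a: do not restrict the command-name-to-PID conversion to processes with the same UID as the current process;
-l [signal]: list signal names; without an argument, list all of them;
-p: only print the PIDs of the named processes, without sending any signal;
-s <signal name or number>: the signal to send;
-u: specify the user.

Option support differs between the shell built-in kill and the standalone kill program; check man kill on your system.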

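-s signals:

Name   Number   Meaning
HUP    1        terminal hangup
INT    2        interrupt (same as Ctrl + C)
QUIT   3        quit (same as Ctrl + \)
TERM   15       terminate (the default signal)
KILL   9        forced termination (cannot be caught or ignored)
CONT   18       continue (the opposite of STOP; used by fg/bg)
STOP   19       pause (cannot be caught; Ctrl + Z sends the similar TSTP signal)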

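Use ps -aux to find the PID, then kill the process. Try the default TERM first (kill {PID}) and fall back to signal 9 (kill -9 {PID}) only if the process does not exit.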
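The "TERM first, KILL as a last resort" routine can be wrapped in a small function. The following is a minimal sketch; stop_process and its grace period are hypothetical, and the demo target is a disposable sleep process.

```shell
# stop_process PID [GRACE_SECONDS]: hypothetical helper, not a standard command.
stop_process() {
    target=$1
    grace=${2:-5}
    kill -s TERM "$target" 2>/dev/null || return 0    # nothing to signal
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$target" 2>/dev/null || return 0     # gone after TERM
        sleep 1
        grace=$((grace - 1))
    done
    kill -s KILL "$target" 2>/dev/null || true        # last resort
}

# Demo on a disposable background process.
sleep 300 &
pid=$!
stop_process "$pid" 2

status=0
wait "$pid" || status=$?     # 128 + signal number when a signal ended the child
echo "sleep exited with status $status"
```

Here the reported status is 143, that is 128 + 15, which shows that TERM alone was enough and KILL was never needed.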

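[ free GPU memory that stays occupied after a process is killed ]

sudo fuser /dev/nvidia*

Lists the PIDs of the processes that are holding the NVIDIA devices.

ps -aux | less

Check the state, owner, and command of each of those processes first, so that you do not kill the wrong one.

kill -9 {PID}

Kills the leftover (zombie) process.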

[ get the information about the hardware ]

CPU

CPU model: cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c

Number of physical CPUs (sockets): cat /proc/cpuinfo| grep "physical id"| sort| uniq| wc -l

Cores per physical CPU: cat /proc/cpuinfo | grep "cpu cores"| uniq

Total number of logical CPUs (threads): cat /proc/cpuinfo | grep '^processor' | sort -u | wc -l
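The separate pipelines above can be folded into one pass over the file. The following is a minimal sketch; cpu_summary is a hypothetical helper, and it assumes the x86-style field names (processor, physical id, cpu cores), which some architectures and virtual machines do not report.

```shell
# cpu_summary [FILE]: hypothetical helper; FILE defaults to /proc/cpuinfo.
cpu_summary() {
    awk -F: '
        /^physical id/ { sockets[$2] = 1 }     # one entry per distinct socket id
        /^processor/   { threads++ }           # one line per logical CPU
        /^cpu cores/   { cores = $2 + 0 }      # cores in each physical CPU
        END {
            n = 0
            for (s in sockets) n++
            printf "sockets=%d cores_per_socket=%d threads=%d\n", n, cores, threads
        }' "${1:-/proc/cpuinfo}"
}

if [ -r /proc/cpuinfo ]; then
    cpu_summary
fi
```

A threads value larger than sockets times cores_per_socket indicates that hyper-threading is enabled.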

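Memory

Memory usage: free -h

Disk

Disk partitions: lsblk

lsblk -o {COLUMN_NAME} (-o selects the output columns)

e.g. lsblk -o NAME,ROTA

Prints the NAME and ROTA (rotational) columns. ROTA is 1 for spinning disks and 0 for non-rotational devices, so this is a quick way to find the SSDs.

Usage per partition: df -h

Space used by each directory under the current one: du -h

Size of a single file or directory: du -sh {DIR_NAME}

Equivalent to du -s -h {DIR_NAME}

-s: summarize, -h: human-readable

Count files: find {DIR_NAME} -type f | wc -l (du {DIR_NAME} | wc -l counts directories instead, because du prints one line per directory by default; du -a also lists files)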

Network card

Network card model: lspci | grep -i 'eth'

GPU

GPU usage: nvidia-smi

[ show image file information via command-line ]

identify is part of ImageMagick.

Concise version: identify {FILE_NAME}

Detailed version: identify -verbose {FILE_NAME}

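[ mount a disk ]

After the server reboots, the directory that a disk used to be mounted on still exists, but the data in it cannot be accessed. This is because the disk was not mounted again after the reboot.

sudo mount /dev/sdb /media/data/

The command itself is simple; working out which physical disk dropped off and which mount point it belongs to takes some experience.

Commands involved: lsblk (show how the disks are mounted, find the SSDs), mount (mount a disk).

* If mounting fails with mount: unknown filesystem type 'LVM2_member', mount the logical volume by its path instead.

Use sudo lvdisplay to look up the logical volume path (LV Path), then run mount {LV_path} {MOUNT_POINT} to complete the mount.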

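[ create users and grant superuser privileges ]

Create a user: adduser {USER_NAME}, then enter a password when prompted and press Enter to accept the defaults for the remaining questions.

Delete a user: log that user out first, then run userdel {USER_NAME}

To delete the user's home directory as well: userdel -r {USER_NAME} (on Debian/Ubuntu, deluser --remove-home {USER_NAME} does the same)

Grant superuser privileges: vim /etc/sudoers (sudo visudo is the safer way to edit this file, because it checks the syntax before saving)

Add the following line under "User privilege specification": {USER_NAME} ALL=(ALL:ALL) ALL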

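[ add a host name for a known IP ]

The /etc/hosts file maps known IP addresses to host names. To add one, put a line with the IP and the host name in the first part of the file.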
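Each entry is a single line: the IP address, whitespace, then one or more host names. The following is a minimal sketch that adds an entry only if the name is not listed yet. add_host is a hypothetical helper, 192.168.1.20 and gpu-server are made-up values, and the demo works on a temporary file rather than the real /etc/hosts (which needs root to edit).

```shell
# add_host FILE IP NAME: hypothetical helper; appends "IP<TAB>NAME" once.
add_host() {
    file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field on any non-comment line.
    if awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file"; then
        echo "$name is already listed in $file"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}

hosts=$(mktemp)
printf '127.0.0.1\tlocalhost\n' > "$hosts"

add_host "$hosts" 192.168.1.20 gpu-server
add_host "$hosts" 192.168.1.20 gpu-server    # the second call changes nothing

cat "$hosts"
entries=$(grep -c 'gpu-server' "$hosts")
total=$(wc -l < "$hosts")
rm -f "$hosts"
```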
