QED Computing (over SSH)

In the last week, two people, for entirely unrelated reasons, have asked me about how to do some basic computing tasks over SSH: connecting, checking which machines are in use, etc., so I thought I’d summarize my setup for connecting to the QED computers here so that next time it happens, I have somewhere to point.

Non-Windows only

Just a word of warning: I don’t use, and don’t care to use, Windows computers.  I expect there are equivalents to what I describe below for Windows, but I don’t know and, frankly, don’t care.  Windows is for playing games, this post is all about getting work done.

Most of this will also work just fine on a Mac; the parallel-ssh program I use below is probably available somewhere (and can most likely be built on a Mac if not), but I don’t use Mac systems on a regular basis, so I can’t help with that.

Another note: terminal commands are written like this:

$ some command

Here the “$” represents the prompt.  If copying and pasting, don’t copy the $.

SSH keys

If you’re typing in your login password every time you SSH somewhere, you’re doing it wrong.  SSH supports much more secure keys, which are sort of like passwords but much more clever: they are generated in pairs, a private key and a public key, the private key never leaving your computer.  Keys are considerably more secure than something that stores your actual password, and, because they are considerably longer than any password, are essentially impossible to crack.

At a conceptual level the server says “authenticate yourself” and your ssh client says “I’m going to use this key” to which the server says “okay, I know the public signature for your key, so prove that it’s really you by using this random data: XYZ123blahblahblah”.  Your SSH client then uses the random data and “signs” it with the private key then sends it back.  The important bit is that the public key lets the server verify the signature, but only the private key is actually capable of producing the signature.

You could think of this metaphorically as recognizing a person’s handwriting: if I have a sample of your handwriting, and ask you to write a few random words (of my choice), I can look at what you write and decide whether it’s really you.  Letting me choose the random words is important: otherwise someone else could just give me a copy of your handwriting that they took a picture of the last time you passed this handwriting check.  But if I tell you the words, that won’t work.
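To make the challenge/response idea concrete, here is a rough sketch using the openssl command-line tool.  This is only an illustration of the concept, not what SSH literally sends over the wire, and the file names are made up for the demo.  It assumes openssl is installed.

```shell
# Illustration only: mimics the challenge/response idea with openssl,
# not the real SSH protocol. Everything lives in a scratch directory.
demo=$(mktemp -d)

# "Your computer": create a private key and derive its public half.
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 \
    -out "$demo/private.pem" 2>/dev/null
openssl pkey -in "$demo/private.pem" -pubout -out "$demo/public.pem"

# "The server": pick fresh random data for this login attempt.
openssl rand -hex 32 > "$demo/challenge"

# "Your computer": sign the challenge with the private key.
openssl dgst -sha256 -sign "$demo/private.pem" \
    -out "$demo/signature" "$demo/challenge"

# "The server": check the signature using only the public key.
openssl dgst -sha256 -verify "$demo/public.pem" \
    -signature "$demo/signature" "$demo/challenge"

# A signature captured earlier does not match a new challenge.
openssl rand -hex 32 > "$demo/new_challenge"
if openssl dgst -sha256 -verify "$demo/public.pem" \
    -signature "$demo/signature" "$demo/new_challenge" >/dev/null 2>&1
then
    echo "replay accepted (should not happen)"
else
    echo "replay rejected"
fi
```

The last step is the “picture of your handwriting” attack from the metaphor: an old signature is useless once the server picks different random data.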

So how do I use an SSH key?

Generating an SSH key

First, you need to generate the public/private pair, if you don’t have one already.  (If you do, it’ll typically be in ~/.ssh/id_rsa and id_rsa.pub).

$ ssh-keygen -t rsa -b 4096

(the “-t rsa” tells it to generate an RSA key, and the “-b 4096” tells it to use a 4096-bit key)

This will ask you some questions:

Generating public/private rsa key pair.
Enter file in which to save the key (/home/blah/.ssh/id_rsa):

Just hit enter here: the default location is fine.

Created directory '/home/blah/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:

These should not be left empty.  The passphrase gives you an extra layer of security, so that even if someone manages to gain access to your system and copy your private key, they can’t use it without also having your passphrase.  Modern non-Windows systems have some sort of private key storage, however, so you will only have to unlock the key once, and this unlocking typically happens automatically when you log into your computer.  The only time I normally need to type this passphrase is when I’m already connected via SSH, e.g. from home to my office computer, and then try to SSH from my office computer somewhere else.  The key storage I mentioned is smart enough to know that my SSH connection is not part of the same session as my desktop login, and doesn’t unlock the key for me.

Now you’ll get confirmation along with a digital fingerprint and random ASCII art supposedly representing your key:

Your identification has been saved in /home/blah/.ssh/id_rsa.
Your public key has been saved in /home/blah/.ssh/id_rsa.pub.
The key fingerprint is:
a7:f7:b5:a7:7d:69:40:1d:30:57:9d:b1:a8:a0:58:d0 blah@loki
The key's randomart image is:
+---[RSA 4096]----+
|     ..      o.o*|
|      .E      +oo|
|       . .   ....|
|      o . . .. . |
|     . .S ...    |
|         o   .   |
|        . .   o .|
|         . . . =o|
|            . +oo|
+-----------------+
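If you want to experiment without touching your real ~/.ssh directory, something like the following sketch generates a throwaway key in a temporary directory and then prints its fingerprint again.  The passphrase and comment are placeholders; for a real key, let ssh-keygen prompt for the passphrase as shown above, since anything passed with -N ends up in your shell history.

```shell
# Throwaway key in a temporary directory, so nothing in ~/.ssh is touched.
# The passphrase and comment below are placeholders for the demo.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -b 4096 -N 'demo passphrase' -C 'demo@example' \
    -f "$keydir/id_rsa"

# Two files appear: the private key and the matching .pub file.
ls "$keydir"

# Print the size and fingerprint of a public key at any later time.
ssh-keygen -l -f "$keydir/id_rsa.pub"
```

Recent OpenSSH versions print a SHA256 fingerprint by default rather than the colon-separated form shown above, so don’t be alarmed if yours looks different.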

Installing your SSH key

Of course, the key is useless until we’ve actually distributed the public key to some remote machine we want to SSH to.

The QED workstations all share a common network-mounted home directory, which means anything you do on one machine is available on all the machines at once.  In practice, this makes life a little easier because we only need to install the SSH key on one of the machines to have it available on all of them.  So let’s install it to markov.econ.queensu.ca.  The easiest way to do this is to use the ‘ssh-copy-id’ command:

$ ssh-copy-id rhinelaj@markov.econ.queensu.ca
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
rhinelaj@markov.econ.queensu.ca's password:

With luck, this is the last time I’ll need to type in that password.  Type it in, then you should see:

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'rhinelaj@markov.econ.queensu.ca'"
and check to make sure that only the key(s) you wanted were added.

So let’s try what it suggests:

$ ssh rhinelaj@markov.econ.queensu.ca

At this point, you’ll be asked for the private key passphrase (not your remote SSH account password).  Different systems prompt in different ways.  On most Linux systems, you’ll typically see something asking you whether you want to add this to the keyring, and automatically unlock it when you log in.  Do that.

If all worked, you should find yourself logged in to the remote system, without being asked for your remote login password:

Last login: Thu Dec 4 12:44:27 2014 from <snip>
rhinelaj@markov:~$

Great, it worked!
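If you’re curious what ssh-copy-id did on the remote side, it amounts to roughly this: append your public key to ~/.ssh/authorized_keys on the server, unless it is already there, and keep the permissions tight.  The sketch below does the same thing against local scratch files; the function name and the fake key are made up, and the real tool handles more corner cases than this.

```shell
# Append a public key to an authorized_keys file unless it is already there.
# Both arguments are paths; on a real server the second would be
# ~/.ssh/authorized_keys. The function name is made up for this sketch.
install_pubkey() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    chmod 700 "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, so the key is compared literally.
    if ! grep -qxF -e "$(cat "$pubkey_file")" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Local dry run with a fake key, so nothing real is modified.
sandbox=$(mktemp -d)
echo 'ssh-rsa AAAAexamplekeydata blah@loki' > "$sandbox/id_rsa.pub"
install_pubkey "$sandbox/id_rsa.pub" "$sandbox/remote_home/.ssh/authorized_keys"
install_pubkey "$sandbox/id_rsa.pub" "$sandbox/remote_home/.ssh/authorized_keys"

# Installing twice still leaves a single entry.
wc -l < "$sandbox/remote_home/.ssh/authorized_keys"
```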

Embracing laziness

The next thing I usually do is make my life easier by adding things to the ~/.ssh/config file.  This file lets you add various SSH configuration settings based on the host you’re trying to connect to.

Here’s the relevant contents of that file for me (I have several other entries, which aren’t relevant here):

Host *
    HashKnownHosts no
    StrictHostKeyChecking no
Host qed sargan tobin rust markov gauss lovell waugh durbin frisch granger cox urquhart wolpin arrow
    HostName %h.econ.queensu.ca
    User rhinelaj

The first three lines define global options to apply when I SSH to any host–they turn off known host hashing and strict host key checking.

The former, set to no here (Debian and Ubuntu ship with it set to yes), tells SSH to put the actual host name instead of an irreversible hash of the hostname in the ~/.ssh/known_hosts file when connecting to a new computer.  I like this setting because it lets me do ssh host tab-completion in bash, which understands how to read the known_hosts file.  The disadvantage of disabling hashing is that, if someone gains access to my system, they can see what other systems I SSH to.  Of course, that information is also in my ~/.bash_history file, so it doesn’t seem like a big security hole to me (especially since my private key, of course, has a passphrase).

The third line, turning off StrictHostKeyChecking, gets rid of the prompt you get the first time you connect to a new machine:

The authenticity of host 'markov.econ.queensu.ca (130.15.74.64)' can't be established.
RSA key fingerprint is 40:a6:37:20:6a:5d:b2:9a:09:f2:b8:a5:66:6c:d3:35.
Are you sure you want to continue connecting (yes/no)? yes

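Turning the setting off is telling ssh to just automatically answer “yes” to that question.  (It still complains loudly if a host key changes, because that might indicate that some other machine is pretending to be “markov”.  Note, though, that with the setting at “no” ssh lets such a connection proceed, with password authentication and forwarding disabled, rather than refusing outright; OpenSSH 7.6 and later also accept the value “accept-new”, which auto-accepts new hosts but refuses changed keys.)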

The last three lines of the .ssh/config snippet above are specific to the QED servers listed (“qed”, “sargan”, etc.).  The HostName line indicates to ssh that when I say I want to connect to host “sargan”, what I really mean is “sargan.econ.queensu.ca” (and likewise for the other machine names).  The User line indicates that when connecting to markov (or whichever) I want to use the login name “rhinelaj” instead of my local system username (which is not “rhinelaj”).
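A handy way to check that a Host stanza does what you expect is to ask ssh to print the settings it would use, without actually connecting.  The sketch below writes a shortened copy of the stanza to a scratch file (so your real config isn’t touched) and queries it; the -G option needs OpenSSH 6.8 or newer.

```shell
# Write a shortened copy of the stanza to a scratch file, then ask ssh
# what it would do for "sargan" without connecting.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host qed sargan tobin rust markov
    HostName %h.econ.queensu.ca
    User rhinelaj
EOF

# Show only the two settings the stanza is supposed to change.
ssh -F "$cfg" -G sargan | grep -E '^(hostname|user) '
```

This should report the user as rhinelaj and the hostname as sargan.econ.queensu.ca.  To test your real configuration, drop the “-F” option and ssh will read ~/.ssh/config as usual.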

So now I’ve shortened this:

ssh rhinelaj@markov.econ.queensu.ca

into this:

ssh markov

Another option you might consider, particularly if you usually run graphical programs remotely, is the ForwardX11 directive (which corresponds to the “-X” command line argument).

So, by adding the following line (under the “Host qed sargan …” line), I can also tell ssh to always forward X11 connections:

    ForwardX11 yes

So now we’ve shortened

ssh -X rhinelaj@markov.econ.queensu.ca

into this:

ssh markov

If you add this, but sometimes want to connect without forwarding X11, you can use “ssh -x markov” (lower-case x) to disable the X11 forwarding on that connection.

I don’t have this line in my ~/.ssh/config because I usually just run command-line tools, not graphical tools, and having it slows down the initial connection a little (because it has to establish the X11 tunnel), so I typically leave it off.  If you’re usually using remote graphical tools, by all means turn it on.

Running remote commands

There’s a nifty command tool that I’ll get to in a bit that lets you run the same command on a bunch of different computers, but first you should know about running a single command on a single computer.  You can, of course, do this:

me@local$ ssh markov
me@markov$ echo hi
hi
me@markov$ exit
logout
Connection to markov.econ.queensu.ca closed.

but you can also stick a command on the ssh line to connect, run the command, then disconnect all at once, like this:

me@local$ ssh markov echo hi
hi
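One thing to watch with the one-line form is quoting.  ssh joins everything after the host name into a single string and hands it to the shell on the remote machine, but your local shell gets the first look at the line.  The sketch below imitates that with a local stand-in, so it runs without any remote machine; “fake_remote” and the WHERE variable are made up for the demonstration.

```shell
# Local stand-in for "ssh host ...": ssh joins its remaining arguments with
# spaces and hands the resulting string to the remote login shell.
# fake_remote does the same with a local sh, marking it via WHERE=remote.
WHERE=local

fake_remote() {
    env WHERE=remote sh -c "$*"
}

# Unquoted: the local shell expands $WHERE and splits the line at ";"
# before anything is sent, so both lines of output say "local".
fake_remote echo "$WHERE"; echo "$WHERE"

# Single-quoted: the text travels untouched and the "remote" shell does the
# expanding and the splitting, so both lines say "remote".
fake_remote 'echo "$WHERE"; echo "$WHERE"'
```

So whenever the remote command contains a “;”, a “|” or a variable, wrap the whole thing in single quotes.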

Determining system load

To figure out whether a machine is under load, you want to use the “uptime” command.  uptime gives you a single line containing various useful information: the current system time, how long the system has been running, the number of users logged in, and the “load average” of the system.  For example:

$ uptime
19:14:59 up 3 days, 6:10, 3 users, load average: 0.13, 0.16, 0.14

Another useful command is “w”, which gives you the same output as uptime and tells you who is logged in (i.e. who are those 3 users?).  In my case, one is my desktop session, and two are sessions for terminals I have open right now:

jagerman@loki:~$ w
 19:14:59 up 3 days,  6:10,  3 users,  load average: 0.13, 0.16, 0.14
USER     TTY    FROM     LOGIN@   IDLE  JCPU  PCPU   WHAT
jagerman :0     :0       Mon13   ?xdm?  6:52m 0.07s gdm-session-worker [pam/gdm-password]
jagerman pts/0  :0       Mon13    2:03  0.52s 0.52s bash
jagerman pts/2  :0       19:12    3.00s 0.12s 1:56 /usr/lib/gnome-terminal/gnome-terminal-server

This also has some other useful information: how long the users have been idle, how much CPU they have used, and what command is currently active.

So what exactly are those three load numbers?  Here’s the technical definition from the uptime manual, which is fairly clear:

System load averages is the average number of processes that are either in a runnable or uninterruptable state. A process in a runnable state is either using the CPU or waiting to use the CPU. A process in uninterruptable state is waiting for some I/O access, eg waiting for disk. The averages are taken over the three time intervals. Load averages are not normalized for the number of CPUs in a system, so a load average of 1 means a single CPU system is loaded all the time while on a 4 CPU system it means it was idle 75% of the time.

Load numbers certainly aren’t everything, but are a good indicator of whether a system is busy.  Seeing the 5 and 15 minute averages also gives you an indicator of whether this busyness is new (and perhaps temporary), which would show up as a high 1 minute average but lower 5 and 15 minute load averages, or whether it’s been busy for a while (generally all three loads are close to the same).
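Since load averages aren’t normalized for the number of CPUs, it can help to divide by the CPU count yourself.  Here is a small sketch of that arithmetic; “load_per_cpu” is a made-up helper name, and the live part assumes a Linux system with /proc/loadavg and the nproc command (which counts logical CPUs, so hyperthreads are included).

```shell
# Divide a load average by the number of CPUs; around 1.00 means every
# CPU is busy. load_per_cpu is a made-up helper name.
load_per_cpu() {
    awk -v load="$1" -v ncpu="$2" 'BEGIN { printf "%.2f\n", load / ncpu }'
}

# The manual's example: a load of 1 on a 4 CPU system.
load_per_cpu 1.00 4

# On a Linux machine, feed it the live 1-minute average and CPU count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one five fifteen rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one" "$(nproc)")"
fi
```

The first call prints 0.25, matching the manual’s “idle 75% of the time”.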

So, to find out how busy some system is, let’s say “rust”, and who is logged into it, use:

$ ssh rust w
 19:35:43 up 7 days, 3:29, 3 users, load average: 0.00, 0.01, 0.05
USER    TTY     FROM              LOGIN@ IDLE  JCPU  PCPU  WHAT
someone pts/8   somewhere.queensu Tue12  2days 0.10s 0.10s -bash
someone pts/10  somewhere.queensu Tue12  2days 7:00 7:00   top

(I edited the output a little for privacy reasons).

What CPU?

Something else I often want to know is how fast the CPU in a given machine is.  (It matters less these days, as most of the QED machines are pretty similar, but a couple years ago the CPU speeds differed more drastically).

Linux exposes CPU information in the virtual file /proc/cpuinfo.  (I won’t copy it out here as it’s rather long and boring).  You can extract that information with some shell commands, though.  In particular this command:

$ grep ^model\ name\\\|^cpu\ cores /proc/cpuinfo | sort -ru

grabs lines from /proc/cpuinfo starting with “model name” or “cpu cores”, then sorts it in reverse order (“sort -r”) and eliminates duplicate lines (“sort -u”), producing:

model name      : Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
cpu cores       : 4
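If all those backslashes make your eyes water, the same filter can be written with “grep -E” and single quotes.  The sketch below runs it against a small made-up sample instead of the real /proc/cpuinfo, so you can see the de-duplication at work; the CPU name in the sample is invented.

```shell
# Sample data standing in for /proc/cpuinfo with two processor entries
# (the real file has many more fields per processor).
sample=$(mktemp)
printf 'processor\t: 0\nmodel name\t: Example CPU @ 3.50GHz\ncpu cores\t: 4\n' >  "$sample"
printf 'processor\t: 1\nmodel name\t: Example CPU @ 3.50GHz\ncpu cores\t: 4\n' >> "$sample"

# Same filter as above, with -E and single quotes instead of backslashes.
# Four lines match; sort -ru reduces them to two, "model name" first.
grep -E '^(model name|cpu cores)' "$sample" | sort -ru
```

On a real machine, replace "$sample" with /proc/cpuinfo.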

Of course, knowing whether the CPUs are in use is nice too, so I’ll add the following two commands (to prefix the uptime line with a label matching the above):

$ echo -en "uptime\t\t:"; uptime

Now put it all together:

$ grep ^model\ name\\\|^cpu\ cores /proc/cpuinfo | sort -ru; echo -en "uptime\t\t:"; uptime

gives me the rather nice info:

model name      : Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
cpu cores       : 4
uptime          : 19:49:39 up 3 days, 6:45, 4 users, load average: 0.10, 0.11, 0.20

And I can run it on a remote system like this:

$ ssh markov 'grep ^model\ name\\\|^cpu\ cores /proc/cpuinfo | sort -ru; echo -en "uptime\t\t:"; uptime'
model name      : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
cpu cores       : 4
uptime          : 19:50:21 up 27 days, 22:34, 1 user, load average: 0.08, 0.03, 0.05

Multiple systems at once

The last piece of the puzzle is a nifty command called “parallel-ssh”, which comes in the “pssh” package on Debian and Ubuntu systems.  It lets you run the same command on several systems at once:

$ parallel-ssh -i -t 0 -p 20 -H "markov rust sargan" echo hi
[1] 19:52:40 [SUCCESS] sargan
hi
[2] 19:52:40 [SUCCESS] rust
hi
[3] 19:52:40 [SUCCESS] markov
hi

(It’ll be coloured, too: try it yourself to see).

Run parallel-ssh --help to see what all those command line arguments do.

Of course, that’s a rather lengthy thing to type out several times, so I did two things: first, I created a text file with all the QED workstations in it, one per line, and saved the file as ~/.ssh/d211.txt:

sargan
tobin
rust
#markov
gauss
lovell
waugh
durbin
frisch
granger
cox
wolpin

(Edit: 12 January 2015: Note that markov is commented out above, as the machine is currently out of commission).

Next I created a bash alias ‘msh’ that runs parallel-ssh with the various arguments I want, and gives it this file.  Add this line to the end of ~/.bashrc, so that it’ll get loaded for all new terminals:

alias msh='parallel-ssh -i -t 0 -p 20 -h ~/.ssh/d211.txt'

Now open up a new terminal, and run:

$ msh echo hi

to get the output:

[1] 19:57:21 [SUCCESS] sargan
hi
[2] 19:57:21 [SUCCESS] gauss
hi
[3] 19:57:21 [SUCCESS] granger
hi
[4] 19:57:21 [SUCCESS] markov
hi
[5] 19:57:21 [SUCCESS] waugh
hi
[6] 19:57:21 [SUCCESS] tobin
hi
[7] 19:57:21 [SUCCESS] frisch
hi
[8] 19:57:21 [SUCCESS] durbin
hi
[9] 19:57:21 [SUCCESS] cox
hi
[10] 19:57:21 [SUCCESS] wolpin
hi
[11] 19:57:21 [SUCCESS] rust
hi
[12] 19:57:21 [SUCCESS] lovell
hi

So now, putting the pieces together, we can use msh with the nifty give-me-your-cpu-and-system-load command string, like this:

$ msh 'grep ^model\ name\\\|^cpu\ cores /proc/cpuinfo | sort -ru; echo -en "uptime\t\t:"; uptime'

which gives a bunch of output (one per remote system) like this:

[5] 19:58:26 [SUCCESS] markov
model name      : Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz
cpu cores       : 4
uptime          : 19:58:27 up 27 days, 22:42, 1 user, load average: 0.00, 0.01, 0.05

Of course, you can use ‘msh command’ to run all sorts of other things on all of them, too.
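If you can’t get parallel-ssh installed (on a Mac, say), a plain loop over the same host file gets you most of the way, just one machine at a time instead of all at once.  This is a rough sketch, not a replacement for parallel-ssh: “each_host” and the SSH_CMD override are names I made up, and the override exists only so the loop can be tried out without connecting anywhere.

```shell
# Sequential stand-in for msh when parallel-ssh is not installed.
# each_host and the SSH_CMD override are made up for this sketch.
each_host() {
    hosts_file=$1
    shift
    # Drop comment lines and blank lines from the host list.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$hosts_file" |
    while read -r host; do
        echo "== $host"
        # -n keeps ssh from swallowing the rest of the host list on stdin.
        ${SSH_CMD:-ssh -n} "$host" "$@"
    done
}

# Dry run: replace ssh with echo to see what would be executed.
hosts=$(mktemp)
printf 'sargan\n#markov\n\nrust\n' > "$hosts"
SSH_CMD=echo
each_host "$hosts" uptime
```

For real use, leave SSH_CMD unset and point it at the host file from earlier, e.g. “each_host ~/.ssh/d211.txt uptime”.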
