Here in the Linux Laboratory at Northern Michigan University, we have quite a few users and quite a few computers for them to use. It is important for laboratories like ours to quantify usage. This data can be used to justify expanding a computer laboratory, to describe who is actually using the machines and which machines are being used, or just to satisfy simple curiosity.
Being the curious type, I sat down to write a program that would gather usage information. The information I wanted includes:
- How much time each user spends online.
- How much time each computer spends being used.
- How often the computer is up.
- User total usage time divided by weeks (to see long term trends).
- User total usage time divided by day for the last couple of days (to see current trends).
My first thought was to just stick my head in at odd times and count users. But for such a strategy to work, I would have to count users at various times in the day, including times I might not otherwise be inclined to visit the lab (like early mornings). Further, I would miss users using the lab remotely, over the internet.
My second thought was to use the « last » command. This command reads a log file (normally /var/log/wtmp) and produces a line of output for every past logon event, describing who was logged on and for how long. My hope was that a summary of this information would provide the usage statistics I was looking for. Unfortunately, this command does not produce foolproof output. If the machine crashes while someone is logged on, then « last » will sometimes report the wrong total time online. Even worse, if a person is logged on but idle, this idle time still counts as usage as computed by « last ».
Counting idle time was unacceptable to me. We have several users with computers in their offices, and they are essentially logged on 24 hours per day, 7 days per week. Their actual usage is nowhere near this level (yes, even college professors go to sleep!).
Luckily, there was an alternative to « last ». The easiest way to find out who is currently logged onto a computer is to use finger, a program designed for just this purpose. The command « finger @hostname » will describe who is logged on to « hostname », and how long it has been since each of them actually typed a command (i.e. finger knows their idle time).
Finger produces a header line, and then one line for every person logged on. Eliminating the users with a high idle time leaves a list of the users who are actually using the computer at that moment. A log file of such lists, gathered at regular intervals, will describe usage over the period the log file covers.
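The idle-time filter can be sketched as follows. This is an illustration in Python, not the author's Perl code. It assumes the idle field has already been cut out of each finger line, and that it follows the forms commonly printed by finger (blank for no idle time, a bare number for minutes, « h:mm » for hours and minutes, « Nd » for days). The names idle_minutes and active_users and the 15-minute cutoff are hypothetical; the article does not say what counts as a "high" idle time.

```python
def idle_minutes(field):
    """Convert a finger-style idle field to minutes (the formats are an assumption)."""
    field = field.strip()
    if not field:                      # blank: no idle time reported
        return 0
    if field.endswith("d"):            # e.g. "2d" means two days
        return int(field[:-1]) * 24 * 60
    if ":" in field:                   # e.g. "1:05" means one hour, five minutes
        hours, minutes = field.split(":")
        return int(hours) * 60 + int(minutes)
    return int(field)                  # a bare number means minutes


def active_users(entries, max_idle=15):
    """entries: (login, idle_field) pairs taken from finger's per-user lines.

    Returns the sorted logins whose idle time is at most max_idle minutes.
    The cutoff is a hypothetical choice, not a value from the article.
    """
    return sorted({login for login, idle in entries
                   if idle_minutes(idle) <= max_idle})


# Made-up sample data: two active users, two idle ones.
sample = [("randy", ""), ("root", "3"), ("prof", "2d"), ("bob", "1:05")]
print(active_users(sample))  # ['randy', 'root']
```

Finger output differs between systems, so the column positions and idle formats should be checked against the local finger before relying on a parser like this.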
There is an important statistical assumption here. We assume that a set of entries will accurately describe usage over the whole time period, not just the precise moments when those entries occur. For this assumption to be valid the entries should be gathered at regular intervals.
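In other words, each observation stands in for the whole interval around it. A minimal sketch of that estimate in Python (not the author's Perl code), assuming the fifteen-minute probe interval used in the cron line later in this article; the function name is hypothetical:

```python
def estimated_hours(observations, interval_minutes=15):
    """Estimate hours of use from the number of probes in which a user
    (or a computer) was seen active, treating each probe as representative
    of one full probe interval."""
    return observations * interval_minutes / 60


# 187 observations at 15-minute intervals stand for roughly 46.75 hours.
print(estimated_hours(187))  # 46.75
```

The estimate is only as good as the regularity of the probes: if the prober skips a stretch of time, the usage during that stretch is simply missing from the count.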
The other complicated issue is to define usage. Often a single computer will have several users logged on simultaneously, and often a single user will be logged on to multiple computers at once (as I am now). It becomes important to carefully define usage in these cases. I adopted the following definitions.
- A computer is in use if and only if there is at least one user using that computer.
- A user is logged on if and only if the user is logged onto at least one computer.
- A computer is up if and only if it responds to the finger command at all, and is otherwise down. Note that a computer that is currently running Windows will NOT respond, and will therefore be counted as down (which makes sense to me!).
Given these definitions, it becomes important not to double count users when they are logged in more than once, and not to double count computers when they have more than one user. Correct programming eliminates this double counting (see the source code below).
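One way to apply these definitions without double counting is to collect hosts and users into sets for each probe. The sketch below is in Python rather than the Perl of the actual program, and the function name, the input layout (host mapped to a list of active logins, or None when finger gets no answer) and the sample data are all assumptions made for illustration.

```python
def summarize_probe(results):
    """results maps host -> list of active logins, or None if the host
    did not answer finger at all.

    Returns (hosts_up, hosts_in_use, users_active) as sets, so a user on
    several hosts, or a host with several users, is counted only once.
    """
    hosts_up = {host for host, users in results.items() if users is not None}
    hosts_in_use = {host for host in hosts_up if results[host]}
    users_active = set()
    for host in hosts_in_use:
        users_active.update(results[host])
    return hosts_up, hosts_in_use, users_active


# Made-up probe: randy is on two hosts, bine is up but idle, nigig is down.
probe = {
    "ogimaa": ["randy", "root"],
    "ogaa": ["randy"],
    "bine": [],
    "nigig": None,
}
up, in_use, users = summarize_probe(probe)
print(sorted(up))      # ['bine', 'ogaa', 'ogimaa']
print(sorted(in_use))  # ['ogaa', 'ogimaa']
print(sorted(users))   # ['randy', 'root']
```

Because the counts come from set sizes, randy contributes one user observation for this probe even though he appears on two hosts.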
The Log file
The log file contains a series of records, each of which describes the results of running finger on the set of hosts. Each entry is kept as small as possible, since many entries will be gathered and the log file should remain modest in size. The top of each entry contains the date and time the entry was gathered, which is needed for time-based and date-based statistics. The log file entry below shows that it is 11:45 in the evening on 10/11/97, and that I am the only one logged in besides root. Root and I are using the computers ogaa and ogimaa. Also shown is that the computer nigig is down, since it is not listed at all.
Date 97 10 11 23 45
Host ogimaa 1
Host bine 0
Host gaag 0
Host makwa 0
Host mooz 0
Host zagime 0
Host ogaa 1
Host euclid 0
Host euler 0
Host fermat 0
User randy
User root
Total 2 users
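A record in this format can be read back with a small keyword-driven parser. The following Python sketch is not the author's Perl code; the function name and the dictionary layout are hypothetical, and it assumes the number after each host name is the count of active users on that host. It works on whitespace-separated tokens, so it does not depend on how the record is broken into lines.

```python
def parse_record(text):
    """Parse one log record into a dict (layout is a hypothetical choice)."""
    tokens = text.split()
    record = {"date": None, "hosts": {}, "users": [], "total": None}
    i = 0
    while i < len(tokens):
        key = tokens[i]
        if key == "Date":      # Date yy mm dd hh mm
            record["date"] = tuple(int(t) for t in tokens[i + 1:i + 6])
            i += 6
        elif key == "Host":    # Host name count
            record["hosts"][tokens[i + 1]] = int(tokens[i + 2])
            i += 3
        elif key == "User":    # User login
            record["users"].append(tokens[i + 1])
            i += 2
        elif key == "Total":   # Total n users
            record["total"] = int(tokens[i + 1])
            i += 3
        else:                  # unknown token: skip it
            i += 1
    return record


SAMPLE = ("Date 97 10 11 23 45 Host ogimaa 1 Host bine 0 Host gaag 0 "
          "Host makwa 0 Host mooz 0 Host zagime 0 Host ogaa 1 Host euclid 0 "
          "Host euler 0 Host fermat 0 User randy User root Total 2 users")
record = parse_record(SAMPLE)
print(record["date"])   # (97, 10, 11, 23, 45)
print(record["users"])  # ['randy', 'root']
```

A host that is missing from record["hosts"], such as nigig here, is one that did not answer the probe.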
The program is named fingersummarize, since its job is to summarize a set of results from the finger command. It is written in Perl, since Perl offers wonderful support for associative arrays (where the usage stats are stored) and working with strings (from the log file and the output of finger).
Fingersummarize has two basic tasks. They could easily be handled by two separate programs, but I find it easier to have one program with options than two executables.
- It should gather finger results, and store them in a log file. (fingersummarize -probe)
- It should read the log file and produce the usage statistics. (fingersummarize -print)
Fingersummarize can be installed easily. Just follow the instructions below.
- Copy the executable to someplace on your system, such as /usr/local/bin.
cp /tmp/fingersummarize /usr/local/bin; chmod 755 /usr/local/bin/fingersummarize
- Edit the top of the executable so that fingersummarize will probe your machines instead of mine. This should be very easy to do.
- Make a blank log file and put that log file somewhere. Often /var/log/fingersummarizelog is a reasonable place.
echo > /var/log/fingersummarizelog; chmod 600 /var/log/fingersummarizelog
- Install a line in cron so that fingersummarize will run in probe mode at regular intervals. Below is the line I use, which runs fingersummarize every fifteen minutes, around the clock.
0,15,30,45 * * * * /usr/local/bin/fingersummarize -probe >> /var/log/fingersummarizelog
That’s it. Now, whenever you want to see a current summary of the usage data, just run
fingersummarize -print < /var/log/fingersummarizelog
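Conceptually, print mode walks the log and counts how many records each user and each host appears in. A rough Python sketch of that tally follows. It is not the author's Perl code, and the record layout (a dict with a "users" list and a "hosts" mapping of host name to active-user count) and the sample records are assumptions made for illustration.

```python
from collections import Counter


def tally(records):
    """Count observations across log records.

    Returns three Counters: records in which each user was active,
    records in which each host answered, and records in which each
    host had at least one active user.
    """
    user_obs = Counter()
    host_up = Counter()
    host_used = Counter()
    for rec in records:
        user_obs.update(set(rec["users"]))   # one count per user per record
        for host, active in rec["hosts"].items():
            host_up[host] += 1
            if active > 0:
                host_used[host] += 1
    return user_obs, host_up, host_used


# Two made-up records, fifteen minutes apart.
records = [
    {"users": ["randy", "root"], "hosts": {"ogimaa": 1, "ogaa": 1, "bine": 0}},
    {"users": ["randy"], "hosts": {"ogimaa": 1, "bine": 0}},
]
user_obs, host_up, host_used = tally(records)
print(user_obs["randy"], user_obs["root"])   # 2 1
print(host_up["ogaa"], host_used["bine"])    # 1 0
```

Each count can then be turned into hours by multiplying by the probe interval.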
Here is some sample output. A current example for my lab can be had at http://euclid.nmu.edu/fingerprobe.txt . The executable itself can be had at http://euclid.nmu.edu/~randy/Papers/fingerprobe . Note that the total number of hours computers were in use (12.8 hours/week) exceeds the total number of hours that people were using computers (10.8 hours/week). This just means there were times when some person was using more than one computer at once. Also, note that the usage spikes at 10am, since a particular class sometimes meets in the lab at 10am.
Stats by User
User Name   Total Observ.   Usage Percent   Hours/Day
abasosh 47 4 0.42
agdgdfg 54 4.6 0.49
arnelso 7 0.6 0.06
bparton 2 0.1 0.01
bob 28 2.4 0.25
brandk 101 8.7 0.92
btsumda 37 3.2 0.33
chgijs 1 0 0
clntudp 1 0 0
daepke 2 0.1 0.01
dan 93 8 0.84
dfliter 17 1.4 0.15
gclas 43 3.7 0.39
goofy 15 1.3 0.13
gypsy 2 0.1 0.01
jadsjhf 2 0.1 0.01
jbsdjh 2 0.1 0.01
jdefgg 2 0.1 0.01
jeffpat 6 0.5 0.05
jpaulin 7 0.6 0.06
jstyle 4 0.3 0.03
jstamo 17 1.4 0.15
jwilpin 37 3.2 0.33
jwilpou 79 6.8 0.72
kangol 39 3.3 0.35
matt 58 5 0.52
mhgihjj 8 0.6 0.07
randy 187 16.2 1.7
rbush 2 0.1 0.01
root 22 1.9 0.2
rpijj 2 0.1 0.01
sbeyne 17 1.4 0.15
sdajani 1 0 0
sdalma 28 2.4 0.25
ship 1 0 0
skinny 48 4.1 0.43
stacey 2 0.1 0.01
tbutler 35 3 0.31
tmarsha 5 0.4 0.04
tpauls 34 2.9 0.31
vladami 30 2.6 0.27
xetroni 26 2.2 0.23
Overall 1151 10.24
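Judging from the numbers alone (this is an inference from the table, not something taken from the program's source), the Usage Percent column is each user's share of the 1151 total observations, cut off rather than rounded at one decimal place. For example, 54 of 1151 observations is about 4.69 percent, and the table shows 4.6. A small Python check, with a hypothetical function name:

```python
def percent_share(observations, total):
    """Share of all observations, truncated (not rounded) to one decimal.

    Integer arithmetic is used so the truncation is exact.
    """
    return (observations * 1000 // total) / 10


print(percent_share(187, 1151))  # 16.2
print(percent_share(54, 1151))   # 4.6
print(percent_share(2, 1151))    # 0.1
```

The 1151 in the Overall row is the sum of the per-user observation counts, so the percentages describe shares of total user activity rather than shares of wall-clock time.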
Stats by Host
Stats by the Week
97 10 04 74.5705816481128
97 09 28 55.9130434782609
97 09 21 64.7
97 09 14 113.023956442831
Last Two Weeks
Stats by the Hour