Systems Admin (Linux)

Up awake, breakfast, computer on while feeding cats.

First thing, it's just gone seven. I was not on call for alerts last night, but it is time to check what has been happening overnight.
Alerts? Patterns in support mails? Hammer out a few replies before realising that time is getting away from me... no seriously, it's much later now, cycling kit, hunting down stuff for the bag to head in with.

Arrival at work, it's 0820, ridden in from home, signed in at the front desk, sock-footed, shoes in hand, not wishing to go base over apex on the (curiously warm) slate floor. Breakfast is still bouncing around in my belly as I wave the RFID pass at the door. It is shower, freshen up, and retrieve what entered my trusty courier bag as a freshly ironed shirt and is now coming out looking considerably more “why did I bother”.

Heading up to the office, I am second in. Chief developer is in doing developer things.

Punching the power buttons to the respective pieces of hardware, I sit down, fill out the day's goals, things to remember, and glance over the few pieces of paper left from heated discussions on “this is why you have problem X” involving clouds, boxes, arrows, and the foreign scrawl of my boss, who is looking for complication in an error that I was not having, and was in no way trying to solve. These are temporarily shuffled, and filed on a far-flung corner of the desk.

Must focus, must not read email, must not read email yet, must n...

Open email client.

While I am waiting for it to start, I punch up a phalanx of comms tools: skype, pidgin, gtwitter, browser, console, console, console... you cannot have too many at the ready; rule of thumb – 'mine', 'theirs', and 'oh crap'.

The phone is ringing, something is broken. The answerphone message says phone support starts at 10am, mail in otherwise. It is ignored; you can hear them replace the handset, think, redial. Place handset in the drawer. Its muffled tones are still audible, but a small sense of relief washes over me.

Headphones on – today some Counting Crows. Click. Glance at empty tea cup for about the fourth time, as I start work on trying to figure out why our mail relay is not relaying. Clue in the name and all that, it should, it is not. There was an issue yesterday; we solved it by blacklisting a user's site / password. We were still looking at an 8000 mail backlog at 4pm, with spiralling load as the cron queue flushes ran into each other and made the whole situation a lot worse. Each flush starts at the top of the list with 'can I send that right now, no, next one, errr, no, hang on it's busy in here, can ... I.... send... the' – before cron spawns another one and it all repeats. Urgh.
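
Queue flushes tripping over each other is the sort of thing a lock sorts out. What follows is a minimal sketch, not what was actually running on the relay: `run_once` and the lock path are made-up names, and the flush command is whatever your MTA uses to run its queue.

```shell
#!/bin/sh
# run_once LOCKDIR COMMAND [ARGS...]  (hypothetical helper)
# mkdir is atomic, so only one caller at a time can create the lock
# directory; anyone else backs off instead of piling on.
run_once() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75   # "temporary failure, try later"
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}
```

Wrapped in a script and called from cron, that would be something like `run_once /var/run/queue-flush.lock <flush command>`. A lock directory left behind by a killed run has to be removed by hand; `flock -n` from util-linux avoids that, if it is installed.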

The queue is down at a comfortable 2000 and the load is minimal, however I must figure out where the same pattern of junk that is in there is coming from – notes into book.

Thankfully, due to the wonders of automated monitoring, this is something we know about, that and the influx of support, which will continue for a solid day after the issue has disappeared, gone, vanished, fixed... but will still be blamed for every form of user error under the sun. That is thanks in part to the open honesty of sticking the issues we are having online to let people know. Rather than “OMG a PR disaster” - we would rather people know that things are not so hot, and we are all over fixing it (so – shhh, don't pester).

Nonetheless, there is an issue, and it's bigger than I have seen. What was an issue, is not, it's something else, this is the tip of the iceberg.

There is much logging in, typing, scratching head, random ideas, log checking, process checking, scratching head.

Time to wake the cavalry. One text, one boss, one weary head appears on Skype.

Very often it's not the act of getting two people on the task that fixes it. It is the need for you to verbalise what you have done and what you have not done, and what you are seeing as the state of play, that unlocks the block in your head that stops you from making the pain go away. The downside is that unless you have a strong word with yourself this manifests itself as “you slave away at something for hours, ask a mate, and it's done in seconds, you are so thick”... which is not how it actually works, it just looks that way.

The boss arrives online, and the ongoing Skype conversation with the client is relayed back and forth.

The issues are located. Needless to say, as with when you lose things, they are always in the last place you would look. There is no point in looking in the least likely places first, as the laws that hold time, physics, magic together know this, and adjust the fault accordingly.

Schrödinger and his moggy step in, as you find things that should not be working because of X and Y, but on fixing those the issue does not go away.

Divine intervention is summoned. It is apparent that my subscription to that service is long overdue, and it takes an age for us to eventually spot two separate issues buried in the logs that are causing the hidden issues, and the upstream causes that are presenting to us and the users.

The playlist has stopped, and I had not noticed. Somehow it is over an hour later, and my grey matter is tired. After the rush to fix things it's awfully quiet, like the pain of silence after loudness. The cup is still empty, but at least the phone has stopped.

The status blog is updated, a deep breath is taken, and the wander to the facilities to empty me, and fill the kettle, occurs in the respite.

I return to find a developer unable to access some images, and he has resorted to 777. Never a good sign. Permissions, it is always permissions, and 777 is not a cure for all and just makes you look desperate ;)

If there is something that developers don't pay attention to, it is permissions, specifically special characters, and what happens if you stick (or don't) an execute bit onto a directory... it means you can change into it, whether you can read the contents or not. This has to hold for every directory in the whole traverse to where you need to be. Always overlooked, rarely grasped. Lecture over. Issue solved with a chmod and a small explanation. Developer unlikely to have taken it in, as he is overjoyed his images now render.
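
For the "whole traverse" point, a small sketch that lists the mode of every directory on the way to a file, so the one missing its execute bit stands out. `walk_path` is a made-up helper name, not a standard tool.

```shell
#!/bin/sh
# walk_path PATH  (hypothetical helper)
# Print the ls -ld line for PATH and for each parent directory up to /,
# so a missing x bit anywhere along the way is easy to spot.
walk_path() {
    p=$1
    while :; do
        ls -ld "$p"
        if [ "$p" = "/" ] || [ "$p" = "." ]; then
            break
        fi
        p=$(dirname "$p")
    done
}
```

Every directory in that list needs x for whoever is doing the reading (the web server user, say); the file itself only needs r. `namei -l` from util-linux prints much the same thing if it is installed. The fix is then usually an `a+x` or `o+x` on the one directory that lacks it, rather than 777 on everything.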

Right – two spam mail addresses I caught from the relay earlier, time to get a bit medieval with those. Take them to bits, find out where they were sent from, which authenticated account, or which web server was trying to spew out this torrent of crud, and action it.... after filling flask from tea pot, as another developer arrives.

Distraction – the name of the game.

Another hour has passed and nothing to show for it but a bunch of coloured tags in the support inbox, and a whole bunch of the green ones missing. Mostly saying the same thing; “yes we know, it's fixed, not yet, time, read the status blog”.

There are currently three terminal windows open on my desktop. The 'Top Trump' of these is the system one for me, with no fewer than 11 concurrent shells to remote servers. One of which is keen on telling me “squid[23672]: WARNING: Median response time is 288 milliseconds” - which results in another delay as I google to find out whether this is anything to be concerned about.... although to be fair 0.3 of a second does not feel like something that is going to spoil my day at this juncture, whatever it is.

Glances at cup, empty, flask, fills, switches to some James Yorkston.

Picks on one of the rogue emails, works way through headers, located, noted.

Picks second one, headers, located, noted. Same user, frowning face, account suspended, email sent out.
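
Working through headers mostly means reading the `Received:` lines, newest hop at the top. A small sketch to pull them out, one hop per line with the folded continuation lines joined up; `received_hops` is a made-up name, and the file is whatever you saved the sample message as.

```shell
#!/bin/sh
# received_hops FILE  (hypothetical helper)
# Print each Received: header of a saved message on a single line,
# in the order they appear (newest hop first).
received_hops() {
    awk '
        /^$/     { exit }                        # blank line: headers are over
        /^[ \t]/ { if (keep) { sub(/^[ \t]+/, ""); buf = buf " " $0 }; next }
                 { if (keep) print buf; keep = 0 }   # any new header flushes the last hop
        tolower($0) ~ /^received:/ { keep = 1; buf = $0 }
        END      { if (keep) print buf }
    ' "$1"
}
```

Anything below the hop added by your own relay is supplied by the sender and can be forged, so treat it as a hint rather than evidence.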

It is approaching midday. I am calm after the panic this morning, and looking over the tasks of administration and support that lie before me.

Support.

Need to purge all these spam mails really, hack about with a testing script, stop service, (brace for grief) run script (hope it did what it was meant to and tested to do) restart service (brace for needless complications).


# for foo in $(mailq | grep -v 8BITMIME | grep "Security-Services" | awk '{ print $1 }'); do find /var/spool/ -type f -name "*${foo}*" -exec rm -vf {} \; ; done
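
Since "hope it did what it was meant to" is the scary part, here is a sketch of a look-before-you-delete step. It makes the same assumption as the one-liner above, namely that the queue ID is the first field on the line carrying the matching text; `queue_ids` is a made-up helper name.

```shell
#!/bin/sh
# queue_ids PATTERN  (hypothetical helper)
# Read mailq-style output on stdin and print the first field of every line
# matching PATTERN. A trailing * or ! (status markers some MTAs append to
# the queue ID) is stripped. Nothing is deleted here.
queue_ids() {
    grep -v 8BITMIME | grep -e "$1" | awk '{ id = $1; sub(/[*!]$/, "", id); print id }'
}
```

Something like `mailq | queue_ids "Security-Services" > /tmp/purge.list` gives a list to eyeball before anything is fed to `rm`.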

Much comedy with the post. An R2D2 arrives with a display stand in it.

The monstrous erection in the office is suffering from too many cooks. I am not being a cook; I am pressing on with stuff and turning up the volume to reduce distraction.

Spurious error from a client, who is unwilling to leave the phone until it has been got to the bottom of – no amount of “we will need to look into it” is working, despite this being the truth. Thankfully this is not my ear being cooked, but a colleague's, and they are relaying their pain via IRC.
It is leaving errors in the Sendmail log, and we are not entirely sure what they mean. It's a one-off box, we don't use Sendmail anywhere else, so it's going to be a lot of legwork for no re-use. Obvious leads come to nothing. Client then reveals that they still had virus scanning switched on, and when they turned it off it started working again; after rebooting the machine it works with it on. None the wiser as to what the error was saying, and mildly peeved that (as with 90% of the cases) tail chasing takes up more of the day.

Annoying developer is, it seems, purposefully squeaking their chair as I try to keep track of what is where in this stack of console windows. It's not long until lunch now, and I have both grease and oil in my drawers here.

Lunch follows at almost two o'clock, as a consensus of three standing with consoles locked seems to be the rule of thumb. Lunchtime these days is down to the building canteen, where you can play 'jacket potato roulette' on size, or push your luck with panini with 'garnish' – where you help yourself, and avoid eye contact with the staff, so that it becomes 'bowl of salad' with panini.

The breeze is shot, and lies dead in the gutter, normally resulting in plans for something 'creative' to do, that will amuse us. The sign on the door suggests we are creatives, so that's 'okay' – we are creative people. However the building owners are aware of this, and see the fruits of our idle minds as “childish”. We beg to differ. Today's effort to raise a smile was to order rubber/plastic newts to push under the door of a company on the ground floor who have just had to outlay 20K in clearing their old factory site of newts. My last masterpiece was noticing how the lights shone through the glass brick walls revealing interesting shapes on the other side. I chose to create a shape of my own with some freehand curves, card, and some tape... now you can hear people's footsteps go past outside, stop, laugh, pause, move on. It keeps me sane.

Doing this job in a bank, or dull, monetary environment must be sooo uneventful, even if the pay is (much) higher. Web, media, creative, maybe it's just me, maybe it's just where the people with idle minds gather, who can say.

The afternoon picks up speed at an alarming rate. There are those days where the progress of a single minute seems like hours, and days when you look up and wonder who moved the hands on the clock. This was one of the latter today.

Firewalling is the task of the hour(s). Still getting fragmentation resulting in certain protocols crumbling. SMTP and FTP get going and fall on their faces horrifically. The issue being that the upstream connection we have bypasses the route it should take to get us onto the building leased line. However the box we connect to doesn't seem to be doing the path MTU discovery we need, so the window size is all over the place. As a result, you send a bunch, fragmentation reaches a threshold, and it harshly drops the connection. Data goes down the pan. Error message on screen for no apparent reason.

We have no control over or access to the PIX to which we connect, so it's an afternoon of read, try, upset people, read, try, progress, read, try, broken. There is no dev system when you are connecting out like this.

Various echo “1” > /proc/sys/network continues with tweakage... which as everyone knows goes better, better, better, broken, FFS.
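
The better, better, better, broken cycle hurts less if each change records what it replaced. A sketch only: `set_knob` is a made-up helper, and `tcp_mtu_probing` in the example is just one plausible knob for a path MTU problem; nothing above says which ones were actually poked.

```shell
#!/bin/sh
# set_knob FILE VALUE  (hypothetical helper)
# Write VALUE to FILE and print the value it replaced, so the change
# can be put back when "better" turns into "broken".
set_knob() {
    old=$(cat "$1") || return 1
    printf '%s\n' "$2" > "$1" || return 1
    printf '%s\n' "$old"
}

# Example (needs root; the choice of knob is an assumption):
#   prev=$(set_knob /proc/sys/net/ipv4/tcp_mtu_probing 1)
#   ... retry the SMTP / FTP transfer ...
#   set_knob /proc/sys/net/ipv4/tcp_mtu_probing "$prev" > /dev/null
```

Changes made this way do not survive a reboot, which at this stage of the afternoon is arguably a feature.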

Overall steps forward made in performance and knowledge, so boxes ticked.

Tech people tasked with selling tech. Never a good idea. Large erection back up in the office and trying to organise. Home time has passed by some 40 mins now and I am working obsessively, trying to get as much 'this could break things badly' done as I can with people wandering around. They are trying to figure out how wide the printing needs to be for the exhibition wall thing. Instructions translated beautifully from the Chinese. Excellent.

It's proper dark outside.

Time to organise my desk in a “you need to be doing this” for the morning. Write a few notes to self. Mail a few things home. Un-mount external drives. Logout of umpteen windows with various states of done things... and of course the one that you really didn't want to... every time. D'Oh. WHY OH WHY. Log back in, sort what you had not finished. Log out.

The day of distraction, being pulled five ways at once: this, that, the other, if not following each other then at the same time in a number of consoles. My head is tired.

I can almost feel the biting cold of outdoors as I prepare to shut down, go get changed, and take the back roads along the hillside winding slowly up to home. 30 mins bad day, 20 mins good... it's always four years better than the morning train commute to London. Although there is a lot to be said for the “we few, we happy few, we band of brothers” who fight their way in to the city from the more remote reaches... real belonging, humour, and despair in the train system.

The day finished at five officially; it will be gone six when I write my name next to the 0820 of arrival today.
