Wednesday, September 26, 2012

the account locking mystery

It started last Monday at 10.40 am when a user notified me that scampoint was down. We did the usual checks, but it was the first time we had encountered this symptom: the scampoint admin account kept getting locked out at irregular intervals, sometimes 30 seconds, 1 minute, 5 minutes, 10 minutes or 30 minutes, finally stopping at 2 pm. At 5 pm it started again, locked out a few times and stopped at 6 pm. For the whole day we were checking the various logs, and well... nobody could identify the problem. Not the application administrator, not the server administrators, not the application support vendor. The source IP locking the account was the scampoint server itself.

Tuesday was peaceful. We did a reboot of the servers just to make sure. The server administrator was telling me that the application administrator doesn't even know how to troubleshoot the problem. I told him that administrators are like operators, and if they can troubleshoot then they are better than the others, like himself: he is a server administrator, but he sometimes helps us troubleshoot our scampoint issues. So that was a psychological nudge that I was passing the problem to him.

After spending 2 days thinking, I asked these guys what they thought of my speculation that it's a human-triggered action, because it only happened during office hours, occurs irregularly, and anybody who tries to log in with our user name and a wrong password will lock us out. The user name is so easy to find out because our admin account name appears all over the place, if you know how to look. All I could think of was that just typing a wrong password for a particular user name will lock the user out. Everybody was still clueless, so I escalated to their infra manager, i.e. my infra guy, to ask him for ideas. He told me that it was impossible to prevent an account from getting locked out.

I went back to the administrators to tell them that we would be on our own, so I asked the server administrator whether he could write us a script to automatically check our account every minute, unlock it if it is locked, and then send an email notification. His incentive was to feel satisfied that he had contributed to improving the productivity of the team, to have lunch in peace, and not to have to take turns monitoring whether the account got locked; otherwise we would all have to write an incident report because the intranet is down when everybody is out for lunch. He was helpful enough to agree, and that took some stress off our plates. My agreement with him was that it's just to help us save time to solve other issues until we can think of something for this issue. Today we had about 10 burning scampoint issues on hand. It was one of those days where it feels like I am defusing a time bomb and cannot make any mistake. The scampoint administrator is already quite stressed out and tickets are piling up.
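
For the curious, the check-unlock-notify loop we asked for could be sketched roughly like this. This is not the script the server administrator wrote; it is a minimal Python sketch, and `is_locked`, `unlock` and `notify` are hypothetical callables standing in for whatever the directory service and mail system actually provide.

```python
import time
from typing import Callable, Optional


def check_and_unlock(
    account: str,
    is_locked: Callable[[str], bool],   # hypothetical: asks the directory if the account is locked
    unlock: Callable[[str], None],      # hypothetical: unlocks the account
    notify: Callable[[str], None],      # hypothetical: sends the email notification
) -> bool:
    """Run one check. Unlock and notify only if the account is locked.

    Returns True when an unlock was performed.
    """
    if not is_locked(account):
        return False
    unlock(account)
    notify(f"{account} was locked and has been unlocked")
    return True


def watch(
    account: str,
    is_locked: Callable[[str], bool],
    unlock: Callable[[str], None],
    notify: Callable[[str], None],
    interval_seconds: float = 60,       # "every minute", as in the request
    max_checks: Optional[int] = None,   # None means run until stopped
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Repeat the check at a fixed interval and return the number of unlocks."""
    unlocks = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if check_and_unlock(account, is_locked, unlock, notify):
            unlocks += 1
        checks += 1
        sleep(interval_seconds)
    return unlocks
```

A stopgap like this only hides the symptom, which is why the notification matters: each email is a timestamp we can line up against the logs later.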

Luckily for today, the account locking happened 3 times between 10.10 and 11 am, and for the rest of the day it was fine. I was trying to reproduce the problem, but couldn't; this is just self-affirmation that I won't pass as a hacker. Thanks to the guy who told me it's impossible to prevent the account from getting locked out, I turned to my friends, and one of them suggested that I lock down the account. I was so glad that he had been thinking about it for 2 days too! So we will try that tomorrow and if it works, we may never know what caused it, and it may be a pity not to know the answer to the mystery.

Tuesday, September 18, 2012

who wrote it?

Me: [addressing question to vendor project manager] why was this sentence written this way?
Vendor project manager: Hmm, I didn't write this, I am not sure.
Me: Who wrote this paragraph?
Vendor project manager: [vendor technical director] wrote it.
Me: [addressing question to vendor technical director] why was this sentence written this way?
Vendor technical director: I don't know why it was written this way, I didn't write this.
Me: Who wrote this?
Vendor technical director: [vendor project manager] wrote it.
Me: When? [vendor project manager] said he didn't write it.
Vendor technical director: he typed it after the meeting he had with your team.
Users and I all burst out laughing.

Friday, September 14, 2012

strangleton

I am usually patient, calm and good, but when I am sickish, or lack sleep, I become aggressive and anyone who annoys me becomes a strangle-ton, a word I adapted from singleton.


Me: have we settled the outstanding issues with the user?
Vendor: yes, we have already explained to the user that it cannot be done and customisation effort is required.
Me: and the user agreed?
Vendor: yes, the user agreed to have it in phase 2.
Me: there is no phase 2.
Vendor: yes, there is no phase 2
Me: if we make this application too difficult to use, they won't use it and we don't need to think of any phase 2.
Vendor: yes correct.
Me: the users asked for a breadcrumb trail, why can't we give it to them since it's out of the box?
Vendor: because we didn't give them rights to open the root level folder so the breadcrumb trail doesn't work.
Me: why aren't you giving them rights to the root folder?
Vendor: because giving them rights to open the folder means they will be able to see the files and delete.
Me: can't we just not give them rights to delete those files?
Vendor: no, because we don't want to break inheritance, and they have delete rights at the root.
Me: what are these files at the root folder?
Vendor: system files, the application pages that the user needs to use the system, that's why we cannot allow them to delete.
Me: why do you store the system files with the user files?
Vendor: because we only used one document library for our applications.
Me: how hard is it to separate the system and user files into two document libraries?
Vendor: it's a simple change of configuration.
Me: then can we do that and give users the breadcrumb?
Vendor: yes
Me: we need to give the users as many productivity features as we can within the cost and scope, and breadcrumbs are already there. We will be short-changing them because of your (lousy) application design.


Me: now to the next issue. Why are you telling users to move files one by one instead of using the file explorer bulk moving of files functionality?
Vendor: because we only allow admin to delete/move files in bulk. Users are not admin, so they can't use the function, that's a requirement.
Me: but there is still an option for the user to move files via file explorer.
Vendor: they won't know because we didn't tell them.
Me: they are intelligent people and you think they won't know just because you don't tell them?
Vendor: yes
Me: can we tell them that they can use it?
Vendor: no, they are able to delete files in the file explorer when they don't have rights, but if they use the application they will not be able to delete.
Me: that's a breach of the security model of the product, are you sure you can delete?

His colleague explained to him that the user can click delete but if he refreshes the window, the file will still be there because he only deleted his local cached copy.

Vendor: there is another thing, if they delete from the file explorer there is no audit trail.
Me: audit trail for file explorer is an out of the box feature, why isn't it in the audit trail?
Vendor: because we customised the audit trail, so there is no audit trail at the file explorer because we didn't want users to use that function.
Me: since you have already customised the audit trail, there is not much we can do for that, but tell the users that if they use the file explorer to move files, it will not be captured in the audit trail. I don't want the users to use the file explorer and then report the audit trail not capturing the move as a bug.
Vendor: ok.
Me: so think about how you want to tell the user everything we have discussed and we will meet the user next week.

Well the positive side of it is that I know we should be able to sign off the system next week after these issues are resolved. The bad part was that, to the user, I took 2 weeks to get round to this: firstly, because the vendors were supposed to be closing all the issues, and secondly, because I was away from work due to yaya's teacher's day school holiday, yaya's HFMD, a 1-day course, my own body hibernating, my more ferocious users hunting me down immediately on my return, and of course the 10 chess boards that I rotate playing every day. Another vendor I met today thought that I only have 1 project with them. Isn't that what every user thinks too? That they have tonnes of work whereas we are idling somewhere escaping from all the work. haha.

The trick is similar to multiplexing: getting the right sampling frequency of the user, and always appearing at a particular frequency to make them feel that you are full time with them, and always there for them, when you are not. And like in networks, you need to have a range of bosses/users/vendors with different frequencies, so that you don't miss any packet. If they have the same frequency, then we need to double the sampling frequency. Ok, digressed too much, the multiplexing bit was just a Friday crazy bit, can't really be applied at work, there is no trick. Just pure brain juice being used to play 10 boards of chess concurrently. It helps if your opponent is 10 times slower than you are.

Saturday, September 1, 2012

manual work rocks

This happened yesterday, a Friday. My colleague asked me to attend her migration meeting at 4 pm but I told her I wasn't free. My phone rang at 5 pm; she asked me to go to her meeting because there was an issue. So I went and it was over in 10 min.

Me: what is the issue?
Vendor 1: our migration service request only covers exporting of data to DVD but not importing.
Me: why not import?
Vendor 1: because there is no way to import a data table with attachments.
Me: so how is it supposed to be done?
Vendor 1: you will need to import the spreadsheet first, then manually attach the documents back one by one. It is not in our migration scope to do it. Users are also not going to do it.
Me: and why must I be the one doing it?
Vendor 1: that's the issue now.
Me: do you have any unique identifier to link the data row and attachments?
Vendor 1: no, you need to look at the title.
Me: what is the folder structure going to be like in the DVD?
Vendor 1: just 1, 2, 3, 4, 5, ...
Me: how do you get those numbers?
Vendor 1: we drag out one folder at a time, manually, the numbers are system generated.
Me: if the system has the number, can you export it out to the spreadsheet?
Vendor 1: yes.
Me: if you have the number, then you can create a link to the folder containing these files. We just upload all these numbered folders into a folder.
Vendor 2 (vendor 1's boss): oh...
Me: do you understand?
Vendor 1: we can't link to the folder.
Me: we manually prefix the URL of the root attachments folder to those unique numbers.
Vendor 2 explained to vendor 1 because he couldn't understand. At this point, the users already understood.
Users: we will use this method.
Vendor 1: we will just give you the spreadsheet and DVD and there will be no links to the attachments.
Me: never mind, I will show you how to do it when the time comes.
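
The linking idea from the meeting is small enough to sketch. This is only an illustration in Python, assuming the exported spreadsheet has been read into a list of rows; the column names `folder_number` and `attachments_link` and the root URL in the usage example are made up, not what the system actually uses.

```python
from typing import Dict, List


def attachment_link(root_url: str, folder_number: str) -> str:
    """Prefix the root attachments folder URL to a system-generated folder number."""
    return root_url.rstrip("/") + "/" + str(folder_number).strip()


def add_links(
    rows: List[Dict[str, str]],
    root_url: str,
    number_field: str = "folder_number",    # hypothetical column holding the exported number
    link_field: str = "attachments_link",   # hypothetical new column for the link
) -> List[Dict[str, str]]:
    """Return copies of the rows with a link to each row's attachments folder."""
    return [
        {**row, link_field: attachment_link(root_url, row[number_field])}
        for row in rows
    ]


if __name__ == "__main__":
    # Made-up rows and root URL, purely for illustration.
    exported = [
        {"title": "Budget 2011", "folder_number": "1"},
        {"title": "Minutes", "folder_number": "2"},
    ]
    for row in add_links(exported, "https://intranet.example/attachments/"):
        print(row["title"], row["attachments_link"])
```

With the number in the spreadsheet, nobody has to match attachments to rows by title or re-attach them one by one; the numbered folders are uploaded once under the root folder and each row points at its own.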

He must still be lost, but I think his boss should be able to explain it to him. I really don't know how they survive as vendors when they recommend manual work. Actually I don't really have any business in this, but that's how my work is.