It started last Monday at 10.40 am, when a user notified me that scampoint was down. We did the usual checks, but it was the first time we had encountered this symptom: the scampoint admin account kept getting locked out at irregular intervals, sometimes 30 seconds apart, then 1 minute, 5 minutes, 10 minutes, 30 minutes, finally stopping at 2 pm. At 5 pm it started again, locked us out a few times, and stopped at 6 pm. For the whole day we were checking the various logs, and well... nobody could identify the problem. Not the application administrator, not the server administrators, not the application support vendor. The source IP locking the account was the scampoint server itself.
Tuesday was peaceful. We rebooted the servers just to make sure. The server administrator told me that the application administrator doesn't even know how to troubleshoot the problem. I told him that administrators are like operators, and those who can troubleshoot are better than the others, like himself: he is a server administrator, but he sometimes helps us troubleshoot our scampoint issues. So that was a psychological nudge that I was passing problems to him.
After spending 2 days thinking, I asked these guys what they thought of my speculation: that it's a human-triggered action, because it only happened during office hours, occurred irregularly, and anybody who tries to log in with our user name and a wrong password will lock us out. The user name is easy to find out, because our admin account name appears all over the place, if you know how to look. All I could think of was that simply typing a wrong password for a particular user name locks that user out. Everybody was still clueless, so I escalated to their infra manager, i.e. my infra guy, to ask him for ideas. He told me that it was impossible to prevent an account from getting locked out.
I went back to the administrators to tell them that we would be on our own, so I asked the server administrator whether he could write us a script to automatically check our account every minute, unlock it if it is locked, and then send an email notification. His incentive was to feel satisfied that he had contributed to improving the productivity of the team, to have lunch in peace, and not to have to take turns monitoring whether the account got locked; otherwise we would all have to write an incident report because the intranet went down while everybody was out for lunch. He was helpful enough to agree, and that took some stress off our plates. My agreement with him was that the script is just to save us time for solving other issues until we can think of something for this one. Today we had about 10 burning scampoint issues on hand. It was one of those days where it feels like I am defusing a time bomb and cannot make any mistake. The scampoint administrator is already quite stressed out and tickets are piling up.
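For anyone curious what such a watchdog might look like, here is a minimal sketch in Python. It is not the script our server administrator wrote. The three callables `is_locked`, `unlock` and `notify` are hypothetical placeholders for whatever your directory and mail setup actually provide (for example, `notify` could wrap an email sender), and the account name `svc_admin` in the usage note is made up.

```python
import time
from datetime import datetime
from typing import Callable, Optional


def check_and_unlock(
    account: str,
    is_locked: Callable[[str], bool],   # hypothetical: asks the directory if the account is locked
    unlock: Callable[[str], None],      # hypothetical: unlocks the account
    notify: Callable[[str], None],      # hypothetical: sends the message, e.g. by email
    now: Optional[datetime] = None,
) -> bool:
    """Run one check. Return True if the account was locked and has been unlocked."""
    if not is_locked(account):
        return False
    unlock(account)
    # Include a timestamp so lockouts can later be matched against other logs.
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    notify(f"{stamp}: account '{account}' was locked out and has been unlocked")
    return True


def watch(
    account: str,
    is_locked: Callable[[str], bool],
    unlock: Callable[[str], None],
    notify: Callable[[str], None],
    interval_seconds: float = 60,
    max_checks: Optional[int] = None,   # None means run until the process is stopped
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Check the account every interval_seconds. Return how many unlocks were performed."""
    unlocks = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        if check_and_unlock(account, is_locked, unlock, notify):
            unlocks += 1
        checks += 1
        # Wait before the next check, but not after the final one.
        if max_checks is None or checks < max_checks:
            sleep(interval_seconds)
    return unlocks
```

Calling `watch("svc_admin", is_locked, unlock, notify)` with real implementations of the three callables would poll once a minute until stopped. Keeping the timestamp in every notification also leaves a record of exactly when each lockout happened, which is useful for checking whether they really only occur during office hours.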
Luckily, today the account locking happened 3 times between 10.10 and 11 am, and for the rest of the day it was fine. I tried to reproduce the problem but couldn't, which just confirms that I would not pass as a hacker. Thanks to the guy who told me it's impossible to prevent the account from getting locked out, I turned to my friends, and one of them suggested that I lock down the account. I was so glad that he had been thinking about it for 2 days too! So we will try that tomorrow. If it works, we may never know what caused the lockouts, and it would be a pity not to know the answer to the mystery.