 
       
 




 

Detailed Findings


May 15th Service Interruption Summary

In 2013, ITS implemented a new Identity Management (IDM) system using a product called “NetIQ.” IDM systems are designed to manage the life-cycle of individual identities, including authorization, personal information (such as name and contact information), and passwords, across all connected Smith systems. Connected systems include OneCard, Google, Moodle (LMS), and Active Directory (AD).

Banner, our current ERP and data authority on Smith employees, students and staff, tracks identity changes and feeds a database table called person_attribute, which the IDM monitors. This table changes over the course of any given day, and the IDM checks it 15 minutes after its previous check-in completes. That last part is an important distinction: the IDM takes a varying amount of time to process the table, so the “15 minute check-in” in fact drifts around the clock and does not occur at a predictable time.
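
The drift can be illustrated with a minimal sketch. The processing durations below are made up for illustration; only the "15 minutes after the previous check-in completes" behavior comes from the description above.

```python
from datetime import datetime, timedelta

CHECK_INTERVAL = timedelta(minutes=15)

def check_in_start_times(first_start, processing_times):
    """Return the start time of each check-in.

    Each check-in begins 15 minutes after the previous one *finishes*,
    so a variable processing time shifts every later start.
    """
    starts = [first_start]
    for took in processing_times:
        starts.append(starts[-1] + took + CHECK_INTERVAL)
    return starts

# Hypothetical processing durations, for illustration only.
durations = [timedelta(minutes=1), timedelta(minutes=3), timedelta(seconds=30)]
starts = check_in_start_times(datetime(2018, 5, 14, 21, 0), durations)
for s in starts:
    print(s.strftime("%H:%M:%S"))
```

With these assumed durations the check-ins start at 21:00:00, 21:16:00, 21:34:00 and 21:49:30, so the schedule never settles on fixed clock times.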

That drift allowed for an infrequent scheduling conflict between Banner and the IDM system, in which the IDM could check the Banner table at exactly the wrong time (more details on this below). This silently occurred on 12/29/2017 and then again on 5/14/2018. We don’t have records further back in time than December 2017, but it seems plausible that the scheduling conflict happened a couple of times a year and was not noticed.

On 4/19/2018 a rule called the “returning person rule” was put in place on the IDM system to address an issue where faculty, students or staff who had previously left the college would return and not know their password. The rule would detect a returning person, reactivate their account with a random password, and send their account information to their alternate email address.

At 11:00 pm each night Banner deletes the contents of the person_attribute table and repopulates it “fresh”. This whole process takes about 30 seconds but has several steps. The breakdown of the time elapsed for each step on the night of the incident (5/14/2018) was as follows:

delete from table took: Elapsed: 00:00:00.60
adding employees back: Elapsed: 00:00:04.14
adding students back: Elapsed: 00:00:24.34
adding admits back: Elapsed: 00:00:00.62
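
As a quick check of the "about 30 seconds" figure, the sketch below adds up the four reported durations. It assumes the steps run back to back with no gaps, which the log excerpt does not state.

```python
from datetime import timedelta

# Step durations reported for the 5/14/2018 rebuild (from the list above).
STEPS = [
    ("delete from table", timedelta(milliseconds=600)),
    ("adding employees back", timedelta(milliseconds=4140)),
    ("adding students back", timedelta(milliseconds=24340)),
    ("adding admits back", timedelta(milliseconds=620)),
]

def cumulative_finish_times(steps):
    """Return (step name, seconds elapsed when that step finishes)."""
    elapsed = timedelta()
    out = []
    for name, took in steps:
        elapsed += took
        out.append((name, elapsed.total_seconds()))
    return out

finish_times = cumulative_finish_times(STEPS)
for name, seconds in finish_times:
    print(f"{name}: done at +{seconds:.2f}s")
```

Under that assumption the table is empty for roughly the first 0.6 seconds and incomplete until about 29.7 seconds after the rebuild starts.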

On 5/14/2018 the IDM system checked the Banner table at exactly 11:00 pm and caught it after the delete but before employees, students or admits were added back in. It caught the table empty.

To understand the impact that had, we need to explain in more depth how changes are processed by the IDM. The IDM system keeps a local copy of the person_attribute table for change tracking purposes. During each check-in with Banner, the IDM compares its local copy against the Banner table to determine what has changed. For instance, if a person’s last name has changed, the IDM needs to detect this and trigger the rule that changes that person’s name in all IDM connected systems (like Shibboleth, OneCard, etc.). The IDM also looks for special codes to determine if a person has left the college and their account needs to be deactivated.
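
The comparison described above can be sketched as a simple snapshot diff. This is an illustration only, not the NetIQ implementation; the person IDs, the `last_name` field, and the function name are hypothetical.

```python
def diff_snapshots(local, banner):
    """Compare the local copy with the current Banner table.

    Both arguments map a person ID to a dict of attributes.
    Returns (added, removed, changed), where `changed` maps a person ID
    to the set of attribute names whose values differ.
    """
    added = sorted(banner.keys() - local.keys())
    removed = sorted(local.keys() - banner.keys())
    changed = {}
    for pid in local.keys() & banner.keys():
        fields = {
            f
            for f in local[pid].keys() | banner[pid].keys()
            if local[pid].get(f) != banner[pid].get(f)
        }
        if fields:
            changed[pid] = fields
    return added, removed, changed

# Hypothetical records: p1 changed last name, p3 is new in Banner.
local = {"p1": {"last_name": "Doe"}, "p2": {"last_name": "Lee"}}
banner = {
    "p1": {"last_name": "Roe"},
    "p2": {"last_name": "Lee"},
    "p3": {"last_name": "Kim"},
}
added, removed, changed = diff_snapshots(local, banner)
print(added, removed, changed)
```

Note that a diff of this kind trusts whatever the Banner table contains at the moment it is read, which is why the timing of the read matters.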

Another component of the IDM system is what is termed the “vault.” This is effectively a flat LDAP tree that holds every account ever created by the system.  This allows, for instance, a student to graduate, then years later return to Smith College as an employee and have the same account name. When a new record appears in the Banner person_attribute table, it either triggers the creation of a new account or, if the person exists in the vault, the returning person rule is executed.
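
The decision made for a newly appearing record can be sketched as follows. The vault is modeled here as a plain set of account names, and the function and action names are hypothetical; the real vault is an LDAP tree.

```python
def action_for_new_record(account_name, vault):
    """Decide what happens when a record newly appears in person_attribute.

    `vault` is the set of every account name ever created by the system.
    """
    if account_name in vault:
        # The person has had an account before: treat them as returning.
        return "returning_person_rule"
    return "create_account"

vault = {"jdoe", "asmith"}  # hypothetical account names
print(action_for_new_record("jdoe", vault))
print(action_for_new_record("newhire", vault))
```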

When the IDM found the Banner person_attribute table empty during its check-in, it simply zeroed out its local change tracking table. This process of zeroing out the local table took about 4 minutes, and nothing else happened at that point.

Fifteen minutes after that check-in completed, at 11:19 pm on 5/14/2018, the IDM checked the Banner person_attribute table again and found it fully populated with all 6691 accounts. Because the local copy was now empty, every record looked new, and because all of these accounts existed in the vault, the returning person rule was executed against every single account. As a result, every account was assigned a random password.
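
The two check-ins can be replayed in a small simulation. It is a simplified model under stated assumptions: records are reduced to hypothetical IDs, the vault is a set, and the random password is generated with Python's `secrets` module rather than whatever the IDM actually uses. Only the count of 6691 accounts comes from the incident.

```python
import secrets

def run_check_in(local_ids, banner_ids, vault, rule_enabled=True):
    """Simulate one check-in.

    Returns (new local copy, {account: new random password}) for every
    account that the returning person rule fired on.
    """
    resets = {}
    for pid in banner_ids - local_ids:       # records that look new
        if pid in vault and rule_enabled:    # known account: "returning"
            resets[pid] = secrets.token_urlsafe(12)
    return set(banner_ids), resets

everyone = {f"person{i}" for i in range(6691)}  # hypothetical IDs
vault = set(everyone)   # every account already exists in the vault
local = set(everyone)   # local copy is in sync before 11:00 pm

# 11:00 pm: the IDM reads the table while it is empty.
local, resets_first = run_check_in(local, set(), vault)
# 11:19 pm: the IDM reads the fully repopulated table.
local, resets_second = run_check_in(local, everyone, vault)

print(len(resets_first), len(resets_second))  # 0 6691
```

In this model the first check-in resets nothing but empties the local copy, and the second check-in resets every account. Calling `run_check_in` with `rule_enabled=False` produces no resets in either pass.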

The IDM system feeds many other systems via various integration paths. For instance, the Smith LMS, Moodle, has a database connector, whereas Google uses an API. Executing changes on all 6691 accounts bogged down the IDM, so the password resets were spread over hours. Accounts were affected at different times for different services, so user experience varied. One member of ITS was logged out of Google Mail at 11:30 pm that night but was able to log directly back in because her Shibboleth password was not changed until later in the night.

The returning person rule sends email to each person’s alternate email address with their new password. The IDM also notifies several ITS staff for tracking purposes, and this notification step executed first. As a result, ITS staff received thousands of “Account Created” emails in a very short period of time, which triggered Google’s spam filters and blocked subsequent emails coming from the IDM system. As far as ITS knows, no one received an email at their alternate email address because of this spam block; however, it is possible a few got through before Google blocked these emails.

To prevent this event from happening again, we have disabled the returning person rule and do not plan to re-enable it. The functionality of this rule can be handled through the ITS “Forgot my password” service or with help from the ITSC helpdesk. Critically, we also changed the system such that the IDM will not check the Banner person_attribute table while it is being rebuilt.
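
How the IDM avoids the rebuild is not described here. One possible approach, shown purely as a hypothetical sketch, is to skip any check-in that falls inside a window around the 11:00 pm rebuild. The second guard (refusing to process a table that has shrunk sharply) is an extra safeguard added for illustration and is not something the report says was implemented. The window length and threshold are assumed values.

```python
from datetime import datetime, timedelta

REBUILD_WINDOW = timedelta(minutes=5)  # hypothetical safety margin
MAX_DROP_FRACTION = 0.5                # hypothetical threshold

def should_skip_check_in(now, banner_rows, local_rows):
    """Return True if this check-in should be skipped.

    Guard 1: the check-in falls inside the nightly rebuild window.
    Guard 2 (illustrative extra): the Banner table has lost more than
    half of the rows the local copy knows about.
    """
    rebuild_start = now.replace(hour=23, minute=0, second=0, microsecond=0)
    if rebuild_start <= now < rebuild_start + REBUILD_WINDOW:
        return True
    if local_rows > 0 and banner_rows < local_rows * (1 - MAX_DROP_FRACTION):
        return True
    return False

print(should_skip_check_in(datetime(2018, 5, 14, 23, 0, 0), 0, 6691))
print(should_skip_check_in(datetime(2018, 5, 14, 23, 19, 0), 6691, 6691))
```

Either guard alone would have stopped the 11:00 pm check-in from emptying the local copy in the scenario described above.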

Restoration of service:  

We were able to restore account passwords as they stood on January 22nd. We discovered during the day of May 15th that the process we used to back up these servers didn’t allow us to restore the database where passwords were stored. Once this was discovered, we were able to find an older backup from a system update that had been done manually on the 22nd of January.

The method to restore the passwords required care. We could not simply put the whole database back and roll all user info back to January 22nd. Instead, we had to restore the full database to another server and write code to pull just the password field for each user out and push that into the production IDM system. It was in this step that the current password policy had to be enforced, and passwords that didn’t match our policy were rejected from being imported. In addition, the restoration process was programmed rapidly in order to get people back to work, and the output of the run didn’t capture specifically whose password didn’t meet the policy. (If it had, we likely would have created a communication program to alert those individuals.)
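
A selective restore of this kind can be sketched as below. This is not the code that was run; the data shapes, the `push_to_production` callable, and the minimum-length policy are all hypothetical stand-ins. Unlike the rapid restoration described above, the sketch records which accounts were rejected.

```python
def restore_passwords(backup_records, meets_policy, push_to_production):
    """Copy only the password field from a restored backup into production.

    backup_records: {username: password_field} read from the restored copy.
    meets_policy: callable returning True if the value passes current policy.
    push_to_production: callable(username, password_field) that writes it.
    Returns (restored usernames, rejected usernames).
    """
    restored, rejected = [], []
    for username, password_field in sorted(backup_records.items()):
        if meets_policy(password_field):
            push_to_production(username, password_field)
            restored.append(username)
        else:
            rejected.append(username)  # keep a record for follow-up
    return restored, rejected

# Hypothetical data and policy, for illustration only.
backup = {"asmith": "correct-horse-battery", "jdoe": "short"}
production = {}
restored, rejected = restore_passwords(
    backup,
    meets_policy=lambda value: len(value) >= 8,
    push_to_production=production.__setitem__,
)
print(restored, rejected)
```

Returning the rejected list makes it possible to notify the affected individuals afterwards.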

To resolve the backup problem, we have created a nightly job that cleanly copies the vault database using the same method that created the backup on January 22nd to supplement our standard backup procedures.




Seelye Hall B8 | Northampton, MA 01063 | 413.585.4487

Copyright © 2015 Smith College Information Technology Services  |  Last updated June 19, 2018
