Summary
Why Needed
The Student Success Dashboard (SSD) is an application that connects students with their school information and with service information from their service providers. It is therefore critical to have an accurate, up-to-date list of students from each school, along with key details about those students and their schools. This information is not part of SSD; it is maintained in each school's Student Information System (SIS).
What It Is
To get the needed student and school information into SSD, and to allow periodic updates of that data, the data has to be extracted, translated and loaded from SIS systems into SSD. This is accomplished via the SSD integration layer. The integration layer is designed to let schools set up their own migration processes with their SIS systems and write directly to database tables that act as queues for new data.
How It Works
The integration layer is set up as standalone tables in SQL Azure that act as queues and can be processed independently. These queues allow anything with proper access to insert student, school, teacher and class information from external systems such as a SIS. This will typically be done via an ETL-style tool or application.
Schools need to design and build their own ETL processes to extract school, student, teacher and class information from their student information systems (SIS) and insert it into the queue tables in SSD (see Figure 1 - Extract, Translate and Load School Data into SSD below).

FIGURE 1 - EXTRACT, TRANSLATE AND LOAD SCHOOL DATA INTO SSD
After data is loaded from a school's SIS, it needs to be merged with existing SSD data, and identity and relationships need to be maintained. This requires more complex processing and has to happen in a specific order so that dependencies exist when records are merged. For example, students have a school dependency, so schools have to be loaded first for foreign keys to be set up correctly with their SSD identifiers.
To accomplish this, stored procedures have been designed and built to take data loaded into the queues and insert or update it properly in the SSD data tables. After the ETL process has finished loading the queues, a stored procedure called “ProcessQueues” needs to be executed to insert and update the new data in SSD. This stored procedure calls several other stored procedures, in order, to insert new instances or update existing instances of the corresponding school information.
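Kicking off the merge is then a single call. A minimal sketch, assuming the procedure was installed in the default dbo schema:

```sql
-- Run the full integration pass; ProcessQueues calls the individual
-- Merge*ToSSD procedures in dependency order (schools before students, etc.).
EXEC dbo.ProcessQueues;
```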

FIGURE 2 - SSD INTEGRATION PROCESS
The merge process is a bulk mechanism that inserts data much faster than row-by-row inserts and updates. It evaluates all the queued data in one statement and handles each record separately: if a record is new, it is inserted; if it already exists based on its entity key, it is updated. The stored procedures also determine the internal SSD identifiers and convert SIS keys to SSD identifiers.
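Conceptually, each merge follows the pattern below. This is a simplified, hypothetical sketch of the technique; the table and column names are illustrative, not the actual SSD schema:

```sql
-- Illustrative MERGE: one statement evaluates every queued row and decides,
-- per record, whether to insert it or update the existing row.
MERGE dbo.School AS target                    -- hypothetical SSD table
USING Queue.SchoolQueue AS source             -- hypothetical queue table
    ON target.SchoolKey = source.SchoolKey    -- entity key from the SIS
WHEN MATCHED THEN
    UPDATE SET target.Name = source.Name
WHEN NOT MATCHED THEN
    INSERT (SchoolKey, Name)
    VALUES (source.SchoolKey, source.Name);
```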
Integration Layer Design
The integration layer is a basic information system that accepts data as standalone instances (objects) and processes each one individually into the SSD tables. This allows SSD to be developed independently of the data being uploaded and also enforces a standard interface for data being uploaded to SSD. There are several key aspects to the integration layer design: queues and ETL, stored procedures, and SSD loading.
Queues and ETL
Getting updated school and student information into SSD is accomplished via an ETL process built by the school that loads the queue tables in SSD (see Figure 3 - SSD Queue Tables for Integration below). Each table is a standalone table with a “key” that is the unique identifier as it is used and maintained in the source system. SSD maintains its own identity for these business objects, so these keys are critical for ensuring uniqueness and identity in SSD on incremental uploads and for maintaining relational dependencies.

FIGURE 3 - SSD QUEUE TABLES FOR INTEGRATION
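For illustration, loading one school into its queue might look like the following sketch. The table and column names are hypothetical (the actual ones appear in Figure 3); the important part is carrying the source system's key through:

```sql
-- Hypothetical load of one row into a queue table. SchoolKey is the
-- identifier as maintained in the source SIS; SSD resolves it to its
-- own internal identifier during the merge.
INSERT INTO Queue.SchoolQueue (SchoolKey, Name, District)
VALUES ('SIS-0042', 'Lincoln Elementary', 'Central District');
```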
Stored Procedures and Business Logic
The SSD tables being updated are designed around business objects: they are normalized, have their own identifiers, and carry constraints and indexes to properly support the SSD application (see Figure 4 below).

FIGURE 4 - SSD TABLES BEING UPDATED FOR INTEGRATION
Getting loaded data into SSD properly requires certain rules, lookups and relationships to be managed. The SSD integration layer is designed to allow data to be loaded at any time and to handle large amounts of data. The loaded data is intended to represent standalone business objects, which are then inserted into or updated in the normalized SSD database with its own constraints and rules. To simplify the integration process, all loaded data is expected to be translated to the needed data types prior to loading.
There is a set of stored procedures, one per queue business entity, that either inserts new instances of the “key” identifier or updates existing instances when the “key” matches. For entities such as students that have a school relationship, a special lookup has to be done to determine the SSD identifier of the school whose key is listed for that student (a sketch of this lookup follows the list below). Table 1 below lists the stored procedures used in the integration process.
ProcessQueues – Main stored procedure that calls all integration stored procedures in the correct order:
- MergeSchoolToSSD – Stored procedure to merge loaded school data into SSD properly.
- MergeTeacherToSSD – Stored procedure to merge loaded teacher data into SSD properly.
- MergeStudentToSSD – Stored procedure to merge loaded student data into SSD properly.
- MergeClassToSSD – Stored procedure to merge loaded class data into SSD properly.
- MergeStudentClassesToSSD – Stored procedure to merge loaded data for a specific student's classes into SSD properly.
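For entities with relationships, the lookup mentioned above resolves the SIS key to the SSD identifier inside the merge. A hypothetical sketch of this technique (not the actual MergeStudentToSSD body; table and column names are illustrative):

```sql
-- Hypothetical resolution of the SIS school key to the SSD school
-- identifier while merging students.
MERGE dbo.Student AS target
USING (
    SELECT q.StudentKey, q.Name, s.Id AS SchoolId
    FROM Queue.StudentQueue AS q
    JOIN dbo.School AS s
        ON s.SchoolKey = q.SchoolKey   -- convert SIS key to SSD identifier
) AS source
    ON target.StudentKey = source.StudentKey
WHEN MATCHED THEN
    UPDATE SET target.Name = source.Name, target.SchoolId = source.SchoolId
WHEN NOT MATCHED THEN
    INSERT (StudentKey, Name, SchoolId)
    VALUES (source.StudentKey, source.Name, source.SchoolId);
```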
Bulk Inserting and Merging
To handle large amounts of data for larger schools, the integration layer uses a bulk inserting mechanism rather than processing records individually. This is accomplished via the SQL MERGE statement. It dramatically improves speed and reduces database processing, which helps limit the impact on the SSD application. The cost of this approach is a rigid and fragile process: one bad record will cause all other records in the batch to fail, and the school has to determine what broke, fix it, and try again.
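Since one bad row fails the whole batch, it helps to surface the error when running the process so the offending queue data can be found. A minimal sketch, again assuming the dbo schema:

```sql
-- Run the integration pass and report which merge failed, so the bad
-- queue rows can be corrected and the process retried.
BEGIN TRY
    EXEC dbo.ProcessQueues;
END TRY
BEGIN CATCH
    SELECT ERROR_PROCEDURE() AS FailedProcedure,
           ERROR_MESSAGE()   AS ErrorMessage;
END CATCH;
```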
Integration Development Environment
The integration layer is a separate project in the downloaded SSD solution. It is a Visual Studio 2012 database project and has been designed and implemented as a standalone component within the SSD solution. The integration project itself is named “Strive.Integration.Database”. It is broken up into several folders to better organize the different parts of the integration layer and its development (see Figure 5 - VS Project).
FIGURE 5 - VS PROJECT
- dbo – Optional tables for testing (see Setting up Integration Testing)
- Queue – Tables for the queues
- Schemas – Schemas for separating SSD and Queues
- Scripts – SQL scripts for installing and testing.
- SSD – Copy of the actual SSD tables. These are a reference only, used to replicate the SSD tables when developing stored procedures.
- StoredProcedures – The stored procedures used in the integration process
- TestData – Real test data used to test and verify the integration layer.
Installation and Setup
Prerequisites
The integration layer for SSD has been built and tested on Azure SQL and SQL Server 2012. Both of these database platforms support the SQL MERGE statement, which is a key aspect of the integration architecture for bulk uploads.
Prior to installing and setting up the integration layer, SSD needs to be installed along with all the SSD tables and the database. After that is complete, you can install the integration layer as outlined below.
In order to install and set up the integration layer, you will also need to be able to access the database and run queries against it. This can be done via SQL Server Management Studio (SSMS) or the Azure portal. You will need privileges to add new entities to the SSD database.
Installing the Integration Layer
After installing SSD successfully, the following steps install and set up the integration layer:
- Set up a query window via SSMS or the Azure portal for the SSD database.
- Open the SQL script file “Install.sql” in the Scripts folder of the integration project.
- Copy the whole script and paste it into the query window.
- Execute the query and ensure it runs successfully.
- Verify the 6 tables for the Queue schema were added (see Figure 3 - SSD Queue Tables for Integration).
- Verify the stored procedures were also created (see Table 1 - List of Integration Stored Procedures). A quick verification sketch follows this list.
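As a quick check, the installed objects can be counted from the system catalog. A minimal sketch, relying on the Queue schema and the procedure names from Table 1:

```sql
-- Count the queue tables installed in the Queue schema (expect 6).
SELECT COUNT(*) AS QueueTables
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE s.name = 'Queue';

-- List the integration stored procedures from Table 1.
SELECT name
FROM sys.procedures
WHERE name = 'ProcessQueues' OR name LIKE 'Merge%ToSSD';
```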
Your integration layer should now be installed and ready to go.
Extract, Translate and Load the Integration Layer
Each school is required to design, build, implement and maintain its own ETL solution for migrating SIS data to SSD. All needed SSD data must be extracted from the SIS source system; this can be done in whatever format and with whatever process the school deems necessary. The data then needs to be separated into the 5 business object queues and translated to the data types required by each queue. Lastly, the extracted and translated data needs to be loaded directly into the queues, and the “ProcessQueues” stored procedure needs to be called.
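The translation step is where raw SIS values are coerced into the types each queue expects. A hypothetical sketch of translating during extraction (the sis_export source and column names are illustrative):

```sql
-- Hypothetical translation during extraction: cast raw SIS text values
-- to the types the queue table expects before loading.
SELECT student_id                              AS StudentKey,
       TRY_CONVERT(date, enrollment_date, 101) AS EnrollmentDate, -- mm/dd/yyyy
       TRY_CONVERT(int, grade_level)           AS GradeLevel
FROM sis_export.students;
```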
Development and Debugging
To be able to extend, customize or fix anything in the integration layer, you will need to set up your own development environment and tools.
Setting up Development Environment
The integration layer for SSD has been built with Visual Studio 2012 and is a database project within the SSD solution. Making and testing changes to the integration layer requires Visual Studio 2012 along with SQL Server Management Studio (SSMS) 2012. An SSD database will also need to be successfully installed and set up on Azure SQL or SQL Server 2008 R2 or later.
The integration layer project is called “Strive.Integration.Database” in the overall SSD solution. Most of the project contains the tables that are used in the integration process. There are also testing entities to help debug and test any changes made.
The Scripts folder contains the install script for installation, an install script for testing objects (InstallTestObjects.sql), and a separate file of useful SQL scripts used in testing.
The stored procedures described above are all contained in the StoredProcedures folder. They are designed to be deployed via the schema compare method described below or directly via an ALTER PROCEDURE script. Each stored procedure file has a commented-out ALTER PROCEDURE statement that can be copied and used for updates.
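The direct route looks like the skeleton below. This is illustrative only; the actual body must be pasted from the project file (running a bare skeleton like this would replace the procedure's real logic), and the dbo schema is an assumption:

```sql
-- Skeleton of a direct deployment: paste the updated body from the
-- Strive.Integration.Database project file in place of the placeholder.
ALTER PROCEDURE dbo.MergeSchoolToSSD
AS
BEGIN
    SET NOCOUNT ON;
    -- updated merge logic from the project file goes here
    RETURN;
END;
```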
Test data has also been included for each business object. It is stored in an Excel file in the TestData folder. There is validated data for 10,000 students, 100 schools, 750 teachers, 1,000 classes and 50,000 student classes (five classes per student).
Deploying Changes
The recommended way to deploy changes to an existing SSD database is to use the “Schema Compare” feature of the database project. To use it, right-click on the project and select “Schema Compare”. A new tab opens asking for a target database; select the database you are going to update. After the database is connected, click “Compare”, select the entities you want updated, and then click “Update” to deploy them.
Setting up Integration Testing
During development of the SSD integration layer, several entities were created to help with testing and resetting. This includes simple test tables that mimic the SSD queues and can be populated with the test data. The idea is that when the integration process runs (the ProcessQueues stored procedure), all the test data is moved from the queues to the SSD tables. Rerunning the process requires resetting everything, which includes removing the data in the SSD tables and repopulating the queues. To simplify this, the testing tables were created to store the test data, and several SQL scripts were added to remove SSD data and reload the queue tables.
To install the testing tables, there is a special install script called “InstallTestObjects” in the Scripts folder. Just run that script in a query window to install the testing tables shown in Figure 6.
Adding Test Data
The test data is all located in an MS Excel file called “SSD-TestData-All.xlsx” in the TestData folder. Each business entity queue has its own tab in the Excel file. To use this data, it must be uploaded to the new test tables created above.
Testing and Resetting
Several scripts have been included to make testing easier. The “UsefulScripts.sql” file in the Scripts folder contains several scripts that can be used for testing. The first set of scripts creates a report of the number of records in the SSD, queue and test tables for each business entity.
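Such a report can be as simple as one count per table. A hypothetical sketch for the school entity (the test table name is illustrative):

```sql
-- Hypothetical record-count report for one business entity.
SELECT 'SSD'   AS Source, COUNT(*) AS Records FROM dbo.School
UNION ALL
SELECT 'Queue',           COUNT(*)            FROM Queue.SchoolQueue
UNION ALL
SELECT 'Test',            COUNT(*)            FROM dbo.TestSchool;
```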
The second set of scripts copies data from the test tables to the queues. This allows a one-time upload of the test data and makes it much easier to reload the queues and retest. The last set of scripts clears out the SSD data; this is optional, depending on whether you want to test inserting or updating.
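A reset-and-reload pass of roughly this shape, again with hypothetical table names, shows the idea:

```sql
-- Hypothetical reset-and-reload: clear the SSD target (optional, for
-- testing inserts), then repopulate the queue from the test table.
DELETE FROM dbo.School;

INSERT INTO Queue.SchoolQueue (SchoolKey, Name)
SELECT SchoolKey, Name
FROM dbo.TestSchool;
```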
Extending the Integration Layer
Existing entities include school, teacher, student, classes and student classes. These are currently managed by a corresponding key for each object. Adding fields to these entities while keeping those keys for identity is the simplest way to extend the integration layer and add additional information to SSD.
To add fields to existing entities, the following needs to happen (a sketch of the column additions follows the list):
- Add the fields to the existing SSD tables in the database (see Figure 4 - SSD Tables being Updated for Integration)
- Add the same fields to the integration queue tables (see Figure 3 - SSD Queue Tables for Integration)
- Update the SSD schema in the Visual Studio integration project:
  - Right-click on the project (see figure)
  - Select “Schema Compare”
  - In the Schema Compare tab, set the database updated in steps 1 and 2 as the source
  - Select the current project as the target
  - Click the “Compare” button
  - Select only the tables updated in steps 1 and 2
  - Click “Update”; the integration project should now have the changes
- Update each stored procedure to add the new fields for each entity (see Table 1 - List of Integration Stored Procedures)
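Steps 1 and 2 are plain column additions. A hypothetical sketch (the PrincipalName column and the table names are illustrative only) that keeps the SSD table and its queue table in step:

```sql
-- Hypothetical steps 1 and 2: add the same field to the SSD table and
-- to its queue table; step 4 then carries the column through the MERGE.
ALTER TABLE dbo.School ADD PrincipalName nvarchar(100) NULL;
ALTER TABLE Queue.SchoolQueue ADD PrincipalName nvarchar(100) NULL;
```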
Return to the wiki home.