What makes our online exam platform so resilient & why it matters
Program Directors who run large assessment events want one thing above all else: for everything to go smoothly. No hitches.
They might have visions of hundreds of frozen screens with hundreds of anxious faces reflected in them, glitches that blast students’ answers into oblivion, or the building’s flimsy internet connection dropping again and again. Most of all, they wonder how they’ll recover from these problems.
The concerns are genuine. Large-scale online exams use complex software and hardware with many points of failure. To create a stable, reliable exam platform, engineers and architects must achieve something crucial: resilience.
A resilient online exam platform seldom freezes or crashes. It doesn’t lose students’ test data or have gaping security holes for hackers to exploit. It keeps working through internet dropouts. In essence: it allows smooth exams for students.
At Janison, the resilience of our test platform allows us to run some of the world’s largest exam events: NAPLAN, the Check-In assessment for the NSW Department of Education, ICAS, and more.
In this article, we explore the features and processes that contribute to our online exam platform’s resilience, why they’re important, and how they’ve helped us create one of the strongest exam platforms in the world.
1. Offline testing
Some people have wobbly internet connections with constant dropouts. It’s annoying for buying clothes or streaming movies, but for something significant like a high-stakes test, it’s devastating. A person’s internet connection shouldn’t prevent them from gaining qualifications.
To be accessible and equitable, a test platform must be resilient enough to work offline. Students should be able to complete their tests, undisturbed, through every dropout. Or even without a connection at all.
At Janison, we achieve this through our test application: Replay. Before the exam starts, we ask institutions (or students, depending on who’s providing the computers) to pre-install the app, which requires a brief internet connection to download. Once installed, students can complete the tests offline in their classrooms, test centres, at their homes – anywhere they like. The app saves their answers securely, and when they re-connect to the internet, it uploads them to Janison’s servers.
This powerful feature ensures that remote schools, institutions and companies – including those shrouded in red sand and cursed with awful internet – can run high-stakes tests for their students and help them build brighter futures.
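The save-locally, upload-later behaviour described above can be sketched as a simple store-and-forward queue. This is an illustration only, not Replay's actual code: the class name `OfflineAnswerQueue`, the `upload` callable and the use of `ConnectionError` to signal a dropout are all assumptions made for the example.

```python
from collections import deque


class OfflineAnswerQueue:
    """Hypothetical store-and-forward buffer for answers saved while offline."""

    def __init__(self, upload):
        # `upload` is an assumed callable that sends one answer to the server
        # and raises ConnectionError when there is no connection.
        self._upload = upload
        self._pending = deque()  # answers not yet confirmed by the server

    def save(self, question_id, answer):
        # Always record locally first, so saving never depends on the network.
        self._pending.append((question_id, answer))

    def sync(self):
        """Try to upload pending answers in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._upload(*self._pending[0])
            except ConnectionError:
                break  # still offline: keep the rest for the next attempt
            self._pending.popleft()  # remove only after a successful upload
            sent += 1
        return sent

    def pending_count(self):
        return len(self._pending)
```

An answer leaves the queue only after its upload succeeds, so a dropout in the middle of a sync loses nothing. A real application would also need to persist the queue to disk and encrypt it, which this sketch leaves out.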
2. Managing server loads
When a student accesses a test delivered by Janison, the data comes from various servers in their region. Put simply, if any of these servers fails and can no longer help run the test application, an automatic monitoring system catches the fault and swaps it out for a working server. This happens so quickly that students don’t notice.
The process is called “self-healing” and it’s crucial when you’re running large exams with inevitable problems. When they pop up, the system needs to handle them quickly and gracefully so students can continue their tests without being interrupted. We use Microsoft Azure and its accompanying services for this reason – it’s one of the world’s most reliable and resilient hosts of online exam platforms like ours.
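To make the idea of self-healing concrete, here is a minimal sketch of the swap-out step. It is not how Azure implements it. The function `heal`, the server names and the `is_healthy` check are hypothetical, chosen only to show the logic of replacing failed servers with working ones.

```python
def heal(active, standby, is_healthy):
    """Return a new active list with each failed server replaced by a spare.

    `active` and `standby` are lists of server names, and `is_healthy` is a
    callable returning True for a working server. All are hypothetical.
    """
    spares = [s for s in standby if is_healthy(s)]
    healed = []
    for server in active:
        if is_healthy(server):
            healed.append(server)
        elif spares:
            healed.append(spares.pop(0))  # swap in a working replacement
        # With no spares left, the failed server is simply dropped.
    return healed
```

In practice a monitoring system would run a check like this continuously, which is why the swap can happen before students notice anything.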
A different feature is needed to keep the servers within capacity: a load balancer. It constantly monitors the traffic loads of each server, and when students arrive for their tests, it assigns them to a server that can comfortably handle them. It balances traffic between all servers to prevent them from being overloaded.
When thousands of students try to access their tests at once, or when traffic spikes for another reason, the load balancer prevents freezes or crashes. It routes each student’s request to the server best placed to handle it at that moment, and moves requests between servers when necessary.
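One common way to balance traffic is to send each new arrival to the server with the most spare room. The sketch below shows that strategy under simple assumptions. The class `LeastLoadedBalancer` and its capacity figures are invented for illustration, and the strategy our platform actually uses on Azure may differ.

```python
class LeastLoadedBalancer:
    """Hypothetical balancer that picks the server with the lowest utilisation."""

    def __init__(self, capacities):
        # capacities: mapping of server name -> maximum concurrent students
        self.capacities = dict(capacities)
        self.load = {name: 0 for name in self.capacities}

    def assign(self):
        """Assign one student to the least-utilised server that has room."""
        candidates = [n for n in self.load if self.load[n] < self.capacities[n]]
        if not candidates:
            raise RuntimeError("all servers are at capacity")
        chosen = min(candidates, key=lambda n: self.load[n] / self.capacities[n])
        self.load[chosen] += 1
        return chosen

    def release(self, name):
        # Called when a student finishes, freeing a slot on that server.
        self.load[name] = max(0, self.load[name] - 1)
```

Comparing utilisation rather than raw counts lets larger servers take proportionally more students. When every server is full, the balancer has nothing left to work with.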
But what happens if the servers are nearly full and the load balancer has little to work with? Autoscaling kicks in – a feature that spins up more servers reserved for this moment. With extra servers in play, and more waiting in the wings, the servers running the tests work smoothly throughout the exam.
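The core of an autoscaling decision is a small calculation: how many servers are needed to keep each one comfortably below its limit? The sketch below is one way to express it. The function name, the 70% target and the minimum of two servers are illustrative assumptions, not Janison's or Azure's real settings.

```python
def servers_needed(active_students, students_per_server,
                   target_percent=70, min_servers=2):
    """Return how many servers keep average utilisation at or below the target.

    All parameters are hypothetical: `students_per_server` is the most one
    server can handle, and `target_percent` is the utilisation to aim for.
    """
    if students_per_server <= 0 or not 0 < target_percent <= 100:
        raise ValueError("capacity and target must be positive")
    usable = students_per_server * target_percent  # usable capacity x 100
    needed = -(-(active_students * 100) // usable)  # integer ceiling division
    return max(min_servers, needed)
```

For example, with 7,000 students and servers that can each handle 500, aiming for 70% utilisation gives 20 servers. An autoscaler would compare that figure with the number currently running and add or remove servers to match.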
3. Availability zones
Sometimes entire banks of servers can fail, often due to power outages. If this happens, Azure has another feature to allow students to continue their tests: availability zones.
Say a student is completing their test on a laptop in western Sydney, and the test is running on servers in a Microsoft data centre a few kilometres away. If there’s a power outage in the region, the test would typically fail. But Microsoft prevents this by having separate data centres within a region – “availability zones” with plenty of distance between each. So when the power fails in western Sydney, the system swaps over to another zone – say in eastern Sydney – that still has electricity. The zone has the test application installed and replicated on its servers, so the switch is seamless.
This is another form of server swapping but on a broader scale to protect against entire data centres going offline – something possible when you’re running thousands of tests across a large area. It’s another vital feature that adds to the resiliency of our system.
4. Exam monitoring
When thousands of students sit the same test at once, it’s possible to fully automate the exam with technology. But for massive exam events like these, we have our human engineers monitor the health of the system, tracking data like server response times and CPU usage to ensure everything is running smoothly and problems are pre-squashed.
At Janison, the bigger the exam event, the more intense the monitoring. Some exams are so big and important we assign entire teams of engineers to monitor the system’s health and anticipate and catch problems. We replicate this approach for any clients that run huge exam events.
For smaller exams or tests with lower stakes, our engineers still monitor the system’s health but rely more on alerts. For example, if the system has an unusually large traffic spike during an exam, which doesn’t correspond with the number of students sitting tests, an engineer is notified to investigate (if they haven’t already noticed). This approach allows them to crush problems quickly before they affect students’ tests.
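An alert rule like the traffic example above boils down to comparing observed traffic with what the number of students would justify. Here is a minimal sketch. The function name, the figure of six requests per student per minute and the threefold tolerance are made-up values for illustration.

```python
def traffic_is_suspicious(requests_per_minute, students_sitting,
                          expected_per_student=6, tolerance=3):
    """Flag traffic far above what the current number of students explains.

    `expected_per_student` and `tolerance` are hypothetical tuning values.
    """
    expected = max(1, students_sitting) * expected_per_student
    return requests_per_minute > expected * tolerance
```

With 1,000 students sitting tests, 7,000 requests a minute would pass quietly, while 50,000 would notify an engineer.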
5. Constant data uploads and version history
Imagine a student’s horror when scrolling back through a test and seeing blank boxes where their answers once were. Or incorrect answers they’d already fixed.
To prevent this from happening, our online exam platform saves copies of students’ answers as they type and regularly uploads them to the server where they’re safely stored. It also saves snapshots of a student’s test answers and actions every 30 seconds, so if they claim a question is missing, the test administrator can check what happened. They can even see if the student manually highlighted and removed their answer. This kind of precise investigation keeps testing fair, and the system resilient against data issues.
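The snapshot idea can be sketched as a small version history: keep a copy of the answers at fixed intervals, then look up what an answer was at a given moment. The 30-second interval comes from the description above. The class `AnswerHistory` and its methods are hypothetical and are not the platform's real data model.

```python
import copy

SNAPSHOT_INTERVAL_SECONDS = 30  # interval stated in the article


class AnswerHistory:
    """Hypothetical version history of one student's answers."""

    def __init__(self):
        self._snapshots = []  # list of (timestamp_seconds, answers_dict)

    def maybe_snapshot(self, now, answers):
        """Store a copy of `answers` if the interval has elapsed; return True if stored."""
        if self._snapshots and now - self._snapshots[-1][0] < SNAPSHOT_INTERVAL_SECONDS:
            return False
        # Deep-copy so later edits by the student cannot alter the history.
        self._snapshots.append((now, copy.deepcopy(answers)))
        return True

    def answer_at(self, question_id, when):
        """Return the answer from the latest snapshot taken at or before `when`."""
        result = None
        for taken_at, answers in self._snapshots:
            if taken_at > when:
                break
            result = answers.get(question_id)
        return result
```

If a student reports a missing answer, an administrator could compare `answer_at` for two moments and see whether the answer was present earlier and cleared later.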
6. Ironclad security
Institutions and companies that run online tests face many security threats: data breaches, phishing, and Distributed Denial-of-Service (DDoS) attacks, to name a few.
To protect our clients at Janison, we rely on the enterprise-grade security of Microsoft Azure, as well as these security protocols (and many others):
- Code and infrastructure scanning – regular checks to identify and fix weaknesses.
- Penetration testing – simulated attacks on the system to identify vulnerabilities, completed by an external security firm.
- Data encryption – encrypting data on students’ devices, in our databases, and while being sent between systems.
- Security monitoring – regular monitoring of our system for suspicious activity.
In addition, we’ve completed the stringent work needed to acquire and maintain ISO/IEC 27001 certification, widely regarded as the gold standard for information security management. This helps us keep our systems airtight.
The bigger the exam event, the bigger the possibility of problems. It goes with the territory. That’s why a resilient online exam platform is so important – it needs to squash problems before they pop up or immediately bounce back from them.
We know how important smooth exams are to our clients, which is why we’ve invested so much into making our platform resilient. It’s how we’re able to provide frictionless exams for their students, year after year, including the millions who have sat NAPLAN.
About the author
Janison
Unlocking the potential in every learner