Ensuring Student Safety: How PowerSchool Leverages AI-Powered Content Filtering with Amazon SageMaker

Enhancing Student Safety with Custom AI Content Filtering: A PowerSchool Initiative

This collaborative post outlines the development and implementation of a tailored content filtering solution designed for PowerSchool’s AI assistant, PowerBuddy, utilizing advanced capabilities from Amazon SageMaker.

Empowering Education with Safe AI: Building a Custom Content Filtering System at PowerSchool

This post is co-written with Gayathri Rengarajan and Harshit Kumar Nyati from PowerSchool.

PowerSchool is at the forefront of education technology, providing cloud-based solutions to over 60 million students across 90 countries and serving more than 18,000 clients, including over 90 of the largest U.S. school districts. With the launch of our AI assistant, PowerBuddy™, we faced a crucial challenge: developing advanced content filtering to differentiate between legitimate educational discussions and harmful content in student interactions.

In this post, we’ll share our journey in creating a custom content filtering solution using Amazon SageMaker AI, achieving improved accuracy while keeping false-positive rates low. We’ll detail our technical approach in fine-tuning Llama 3.1 8B, how we architected our deployment, and the results from our internal validations.

PowerSchool’s PowerBuddy

PowerBuddy is not just an AI assistant; it’s a personalized educational companion that enhances learning experiences. It integrates with various tools within the PowerSchool ecosystem—like Schoology Learning, Naviance CCLR, and Performance Matters—ensuring students and their support networks remain connected throughout their educational journey.

Our suite includes diverse AI solutions, such as:

  • PowerBuddy for Learning: A virtual tutor that aids in personalized study.
  • PowerBuddy for College and Career: Provides insights to assist in career exploration.
  • PowerBuddy for Community: Streamlines access to relevant district and school information.

With enhanced accessibility features like speech-to-text and text-to-speech, PowerBuddy is dedicated to supporting all students.

Content Filtering for PowerBuddy

As an organization serving millions of students, many of whom are minors, safety is our top priority. Alarmingly, national studies show that nearly 20% of students aged 12-17 experience bullying, and 16% of high school students seriously consider self-harm. The widespread adoption of PowerBuddy necessitated rigorous guardrails tailored for educational contexts.

Existing content filtering solutions were insufficient because they lacked the domain-specific awareness that sensitive educational discussions require. For instance, a classroom discussion of World War II should not be flagged as inappropriate, while signs of bullying or self-harm must still trigger timely alerts.

To meet these demands, we required a sophisticated system capable of discerning academic inquiries from harmful content—correctly identifying and blocking signs of bullying, hate speech, violence, and other unsuitable materials.

Choosing Amazon SageMaker AI

After evaluating multiple cloud providers, we selected Amazon SageMaker AI as our platform. The key criteria for this decision were:

  • Platform Stability: High reliability for our mission-critical service supporting millions of daily users.
  • Autoscaling Capabilities: Handling significant traffic spikes without performance loss.
  • Control of Model Weights: Owning the weights was essential so that we could keep refining our safety guardrails.
  • Incremental Training: The ability to improve our content-filtering model continually.
  • Cost-Effectiveness: Affordable solutions were necessary to maintain accessibility for schools.
  • Granular Control and Transparency: Visibility into how the model behaves was crucial for decisions that affect student safety.
  • Mature Managed Service: We wanted to focus on educational applications rather than infrastructure management.

Solution Overview

Our content filtering architecture consists of several components:

  1. Data Preparation Pipeline:

    • Curated datasets specific to educational contexts.
    • Anonymized, encrypted data stored securely in Amazon S3.
  2. Model Training Infrastructure:

    • Fine-tuning of Llama 3.1 8B on SageMaker.
  3. Inference Architecture:

    • Managed endpoints with auto-scaling.
    • Integration via Amazon API Gateway for real-time filtering, monitored by Amazon CloudWatch for quality assessment.
  4. Continuous Improvement Loop:

    • Feedback mechanisms for false positives/negatives.
    • Scheduled retraining to incorporate new data.
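
To make the inference path above more concrete, here is a minimal sketch of how a service behind the API layer might ask the fine-tuned model to classify a single message. The endpoint name, the prompt wording, the SAFE/UNSAFE labels, and the `inputs`/`parameters` payload shape are all assumptions made for illustration, not PowerSchool's actual interface. The only AWS operation involved is the SageMaker runtime `invoke_endpoint` call, and a stub client stands in for it so the sketch runs without AWS access.

```python
import io
import json

# Hypothetical names: the post does not give the real endpoint name or prompt.
ENDPOINT_NAME = "content-filter-endpoint"
PROMPT_TEMPLATE = (
    "Classify the student message as SAFE or UNSAFE.\n"
    "Message: {message}\nLabel:"
)


def classify_message(runtime_client, message, endpoint_name=ENDPOINT_NAME):
    """Send one message to the endpoint and return 'SAFE' or 'UNSAFE'.

    `runtime_client` should behave like boto3.client("sagemaker-runtime").
    """
    payload = {
        "inputs": PROMPT_TEMPLATE.format(message=message),
        "parameters": {"max_new_tokens": 5},  # assumed payload shape
    }
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    result = json.loads(response["Body"].read())
    # Text-generation containers may return a dict or a one-element list.
    if isinstance(result, list):
        result = result[0]
    text = result.get("generated_text", "").strip().upper()
    # Fail closed: anything that is not clearly SAFE is treated as UNSAFE.
    return "SAFE" if text.startswith("SAFE") else "UNSAFE"


class StubRuntimeClient:
    """Stand-in for the real client so the sketch runs without AWS access."""

    def __init__(self, generated_text):
        self._generated_text = generated_text

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        body = json.dumps({"generated_text": self._generated_text}).encode()
        return {"Body": io.BytesIO(body)}


if __name__ == "__main__":
    client = StubRuntimeClient(" SAFE")
    print(classify_message(client, "What caused World War II?"))
```

With AWS credentials configured, the stub would be replaced by `boto3.client("sagemaker-runtime")`. Treating any unrecognized model output as UNSAFE is a design choice of this sketch: it prefers an occasional false positive over letting an unparseable response through.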

Development Process

After trying several approaches, we opted to fine-tune Llama 3.1 8B using Amazon SageMaker JumpStart. This streamlined our development process and let us focus on curating high-quality training data rather than on infrastructure.
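
For readers unfamiliar with JumpStart instruction tuning, the training channel is typically a directory of JSON Lines files plus an optional `template.json` that describes how each record becomes a prompt and a completion. The sketch below writes such a directory locally. The field names (`message`, `label`), the two example records, and the prompt wording are invented for illustration; the post does not describe PowerSchool's actual dataset schema.

```python
import json
import tempfile
from pathlib import Path

# Invented examples; real training data would be curated and anonymized.
EXAMPLES = [
    {"message": "Why did World War II start?", "label": "SAFE"},
    {"message": "Write an insult I can send to a classmate.", "label": "UNSAFE"},
]

# Template mapping record fields to prompt and completion (assumed wording).
TEMPLATE = {
    "prompt": (
        "Classify the student message as SAFE or UNSAFE.\n"
        "Message: {message}\nLabel:"
    ),
    "completion": " {label}",
}


def write_training_dir(directory, examples=EXAMPLES, template=TEMPLATE):
    """Write train.jsonl and template.json into `directory` and return it."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    with open(directory / "train.jsonl", "w", encoding="utf-8") as f:
        for record in examples:
            f.write(json.dumps(record) + "\n")
    with open(directory / "template.json", "w", encoding="utf-8") as f:
        json.dump(template, f, indent=2)
    return directory


def render(record, template=TEMPLATE):
    """Return the (prompt, completion) pair a record would produce."""
    return (
        template["prompt"].format(**record),
        template["completion"].format(**record),
    )


if __name__ == "__main__":
    out_dir = write_training_dir(tempfile.mkdtemp())
    print(sorted(p.name for p in out_dir.iterdir()))
    print(render(EXAMPLES[0]))
```

The resulting directory would then be uploaded to Amazon S3, and its URI passed to the training job as the `training` channel.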

We fine-tuned the model with Low-Rank Adaptation (LoRA), which trains a small set of added weights while the base model stays frozen, and we kept full control over the training process. For deployment we used NVIDIA A10G GPUs, which offered a good balance of performance and cost.
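
As a rough illustration of the LoRA idea, the pure-Python sketch below wraps a frozen weight matrix W with a trainable low-rank update, so the layer computes W x + (alpha / r) B (A x). The rank, the scaling factor alpha, and the matrix sizes are illustrative assumptions; the post does not state the LoRA hyperparameters that were used, and a real training job would rely on a deep learning framework rather than Python lists.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen during fine-tuning
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the update is
        # initially zero and training begins from the base model's behavior.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        """Count the entries of A and B, the only weights that are trained."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def lora_fraction(d_out, d_in, rank):
    """Trainable LoRA parameters as a fraction of the full matrix."""
    return rank * (d_in + d_out) / (d_in * d_out)


if __name__ == "__main__":
    layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1, alpha=2)
    print(layer.forward([1.0, 1.0]))
    print(lora_fraction(4096, 4096, 8))
```

For a 4096 x 4096 matrix at an assumed rank of 8, `lora_fraction` comes to about 0.39% of the full matrix, which is why LoRA makes fine-tuning an 8B-parameter model far cheaper than updating every weight.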

Technical Implementation

A snippet of our fine-tuning code highlights how we tailored the model:

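from sagemaker.jumpstart.estimator import JumpStartEstimator

# model_id, session, and train_data_location are defined earlier in the
# training script (JumpStart model ID, SageMaker session, S3 URI of the data).
estimator = JumpStartEstimator(
    model_id=model_id,
    environment={"accept_eula": "true"},  # accept the model license terms
    disable_output_compression=True,  # keep model artifacts uncompressed
    hyperparameters={
        "instruction_tuned": "True",  # prompt/completion style training data
        "epoch": "5",
        "max_input_length": "1024",
        "chat_dataset": "False",
    },
    sagemaker_session=session,
    base_job_name="CF-M-0219251",
)

# Launch the fine-tuning job on the "training" channel.
estimator.fit({"training": train_data_location})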

After training, we deployed the model to SageMaker endpoints for evaluation, where we measured metrics such as recall and F1 score to confirm its reliability.

Validation of Solution

We conducted extensive testing and achieved about 93% accuracy in identifying harmful content, with a false-positive rate below 3.75%. Comprehensive load testing confirmed that the solution maintained its response times and reliability even under peak load.
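
The metrics mentioned in this post (accuracy, false-positive rate, precision, recall, and F1) can all be derived from the four cells of a confusion matrix. The sketch below shows the arithmetic, treating "harmful" as the positive class. The counts in the usage example are made up purely to demonstrate the calculation and are not PowerSchool's evaluation data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    tp: harmful messages correctly blocked
    fp: safe messages incorrectly blocked
    tn: safe messages correctly allowed
    fn: harmful messages incorrectly allowed
    """

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is absent.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


if __name__ == "__main__":
    # Made-up counts for a 100-message test set.
    for name, value in classification_metrics(tp=40, fp=5, tn=45, fn=10).items():
        print(f"{name}: {value:.3f}")
```

In a safety setting the false-positive rate and recall pull in opposite directions: a stricter filter catches more harmful content but blocks more legitimate schoolwork, so both need to be tracked together rather than accuracy alone.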

Schools reported a significant reduction in incidents of AI-enabled bullying thanks to our targeted content filtering capabilities.

Fine-Tuned Model Metrics

Our fine-tuned model performed markedly better than generic filtering solutions, with higher accuracy and better precision and recall, highlighting its efficacy in educational environments.

Future Plans

As PowerBuddy evolves, we plan to adapt our content filter for other products, integrating specialized adapters using SageMaker’s multi-adapter inference feature. This will facilitate cost-effective solutions tailored for specific educational challenges.

We aim to create an inclusive, secure AI learning environment that empowers students while safeguarding their well-being.

Conclusion

Implementing our specialized content filtering system has been transformative for PowerSchool. By developing robust safety mechanisms, we’ve addressed key concerns regarding AI in classrooms, ensuring that students have access to AI learning tools without exposure to harmful content.

Shivani Stumpf, our Chief Product Officer, notes, “We’re currently tracking around 500 school districts using PowerBuddy, reaching over 4.2 million students. Our content filtering technology enables students to benefit from AI-powered learning while ensuring a secure environment.”

Moving forward, we strive to maintain the balance between innovation and safety in education, fostering trust in AI systems that benefit educators and learners alike.

For those seeking domain-specific safety guardrails, consider how the fine-tuning capabilities of SageMaker AI can address your unique needs. Together, we can redefine the educational landscape, making AI not just a tool, but a trusted partner in learning.

About the Authors

Gayathri Rengarajan is the Associate Director of Data Science at PowerSchool, leading the PowerBuddy initiative.

Harshit Kumar Nyati is a Lead Software Engineer at PowerSchool, specializing in building generative AI applications.

Anjali Vijayakumar and Dmitry Soldatkin are AWS experts working to transform learning experiences.

Karan Jain is a Senior Machine Learning Specialist at AWS focused on customer solutions in generative AI.


We invite you to join us on this journey towards enhancing the learning experience safely and effectively!
