MLTHSC JavaScript Web App (v2)

Now works online!

This repository contains all files related to the rework of the JavaScript-based web app for the Multilabel Tagalog Hate Speech Classifier (MLTHSC). It also includes the Jupyter Notebooks written for retraining the classifier model in Python.

(v1): GitHub: syke9p3/mlthsc-thesis

The project started as a college thesis proposal - a hate speech classifier that classifies Tagalog text into different categories like age, gender, physical, etc.

Most of our time went into writing the manuscript, gathering text data, implementing the software architecture of our model, training, and testing. With the defense approaching, we needed to build something fast - a simple user interface that would demonstrate the functionality to the panelists. As a result, the v1 web app wasn't very polished, hence the need for a rework. The greatest challenge we had was getting the classifier to work once deployed online. The original model is large (more than 500 MB), and hosting sites could not accommodate it because they limit the size of uploaded files. That is why the model had to be retrained and quantized: the smaller model is lighter to load and performs inference faster, at the cost of some classification accuracy. The original model will still be available to try in a demo that I'll be deploying on Hugging Face Spaces soon.

What changed?

  • now works online (even on mobile)
  • deployed on GitHub Pages
  • hosted the classifier model on Hugging Face
  • quantized the model, because loading the original, larger model in the application took too long before any classification could run. This was the challenge we had from the start when trying to deploy the model online.
  • refactored JavaScript code to make adding features faster
  • changed the input limit from a word count (3 to 280 words) to a character count (15 to 280 characters)
  • added "last classified text" section
  • enhanced appearance of saved post cards
  • changed the appearance of the buttons
  • outlines for accessibility
  • FAQ section
    • overview of the tool
    • definitions for each label
    • how the classifier works
  • source code link
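
The character-based input limit listed above could be enforced with a small check like the one below. This is only a sketch, not the app's actual code: the names `MIN_CHARS`, `MAX_CHARS`, and `validateInput` are hypothetical, and trimming whitespace and counting by Unicode code point are assumptions about how characters might be counted.

```javascript
// Limits taken from the changelog entry: 15 to 280 characters.
// The constant and function names are hypothetical, not from the app.
const MIN_CHARS = 15;
const MAX_CHARS = 280;

// Checks a raw input string and returns { ok, length, reason }.
function validateInput(text) {
  // Assumption: leading and trailing whitespace does not count toward the limit.
  const trimmed = text.trim();

  // Assumption: count Unicode code points, so an emoji counts as one
  // character instead of two UTF-16 code units.
  const length = [...trimmed].length;

  if (length < MIN_CHARS) {
    return { ok: false, length, reason: `Input must be at least ${MIN_CHARS} characters.` };
  }
  if (length > MAX_CHARS) {
    return { ok: false, length, reason: `Input must be at most ${MAX_CHARS} characters.` };
  }
  return { ok: true, length, reason: null };
}
```

A form handler could call `validateInput(textarea.value)` before running the classifier and show `reason` to the user when `ok` is `false`.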

Still working on

  • features section in FAQ
  • filtering
  • pagination
  • dark mode (restructuring CSS)
  • implementing accessibility (ARIA) standards
  • (might try to rewrite this again in React)