
Evaluating automated approaches for detecting privacy regulation non-compliance

Desai, Devarsh (2025)



Master's thesis


School of Engineering Science, Information Technology

The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe2025062674147

Abstract

With the introduction of privacy regulations such as GDPR and CCPA, organisations' policy documents have become lengthy and legally complex. This raises transparency challenges for users and even for compliance auditors, creating a need for automated policy compliance solutions. This thesis tests the ability of natural language processing (NLP) models to classify policy segments and flag high-risk segments. The project uses two datasets, OPP-115 and the MAPP corpus, comprising 179 privacy policies combined, and trains and tests two NLP models, TF-IDF + Regression Testing and BERT, on them.

The core objective of the models was to analyse the policies and classify each policy segment into one of three categories: first-party, third-party, or both. The evaluation also includes cross-validation and risk flagging. The results show that BERT consistently outperforms TF-IDF on all tasks, although it requires significantly more computing time and resources.

The results also show that although a contextual model like BERT has the potential to support legal and compliance workflows, cross-dataset generalisation remains a major challenge for this project and the selected models.
LUT University
P.O. Box 20
53851 Lappeenranta