Design and implementation of an automated web scraping system : enhancing accuracy and efficiency for Nettileasing Finland Oy
Khan, Arsalan (2023)
Master's thesis
School of Engineering Science, Information Technology
All rights reserved.
The permanent address of this publication is
https://urn.fi/URN:NBN:fi-fe20231212153353
Abstract
The objective of this thesis is to address specific research questions through the development of an automated web scraping tool for Nettileasing Finland Oy, a startup based in Lahti. The study explores the implications of web scraping in a business context, evaluating its efficacy and potential for streamlining data acquisition, and aims to improve the efficiency and accuracy of data gathering from e-commerce platforms. It focuses on three websites: bikemarine.fi, venekauppa.com, and tietokonekauppa.fi. Product details are scraped with Python, Scrapy, and Playwright, which together can handle both static and dynamic web content.
The study covers the design, testing, and implementation of the system. It highlights the system’s ability to adapt to different web structures and demonstrates efficient data extraction. It also evaluates the system’s performance against manual methods, showing significant improvements in data accuracy and speed.
The thesis also considers how the system affects business operations, focusing on improved insights and possible revenue growth for Nettileasing. The study concludes by emphasizing the importance of automated web scraping in e-commerce and recommends using AI and machine learning for more in-depth data analysis in the future.
