Professional Web Scraping with Java

English | MP4 | AVC 1920×1080 | AAC 44KHz 2ch | 1h 07m | 773 MB

Learn how to scrape data from any static or dynamic / AJAX web page using Java in a short and concise way.

In this course you will learn everything you need to get started with web scraping using Java.

You will learn the concepts behind web scraping that you can apply to practically any web page (static AND dynamic / AJAX).

We start with an overview of what web scraping is and what you can do with it.

Then we explain the difference between scraping static pages and dynamic / AJAX pages. You learn how to classify a website into one of the two categories and then apply the right concept to scrape the data you want.
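One common way to make this classification, sketched here as an assumption rather than as the course's own method, is to check whether a value you can see in the browser also appears in the raw HTML the server returns. If it does, the page can be treated as static. If it does not, the value is probably loaded later by JavaScript. The class and method names below are hypothetical.

```java
public class PageClassifier {

    enum PageType { STATIC, DYNAMIC }

    // Heuristic (an assumption, not a guarantee): if the text visible in the
    // browser is already present in the raw HTML, treat the page as static;
    // otherwise assume it is filled in later via JavaScript / AJAX.
    static PageType classify(String rawHtml, String valueSeenInBrowser) {
        return rawHtml.contains(valueSeenInBrowser) ? PageType.STATIC : PageType.DYNAMIC;
    }

    public static void main(String[] args) {
        // Hypothetical raw responses, as they would arrive before any script runs.
        String staticHtml = "<html><body><h1>Price: 42</h1></body></html>";
        String dynamicHtml = "<html><body><div id=\"app\"></div><script src=\"app.js\"></script></body></html>";

        System.out.println(classify(staticHtml, "Price: 42"));
        System.out.println(classify(dynamicHtml, "Price: 42"));
    }
}
```

A plain substring check is deliberately simple. It can misclassify pages where the value is embedded in an inline script or formatted differently in the markup, so treat the result as a first hint.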

Next, you will learn how to export the scraped data as either CSV or JSON, two popular formats for further processing.
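As a rough illustration of the CSV half of that step, here is a minimal sketch that uses only the Java standard library. The class name, the sample rows, and the output file name `scraped.csv` are assumptions made for this example. The course may well use a dedicated library instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class CsvExport {

    // Quote a field when it contains a comma, a double quote, or a line break,
    // doubling any embedded quotes (the convention described in RFC 4180).
    static String escape(String field) {
        boolean needsQuoting = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields of one record into a single CSV line.
    static String toCsvLine(List<String> fields) {
        return fields.stream().map(CsvExport::escape).collect(Collectors.joining(","));
    }

    // Write all records to the target file, one line per record, as UTF-8.
    static void write(Path target, List<List<String>> rows) throws IOException {
        List<String> lines = rows.stream().map(CsvExport::toCsvLine).collect(Collectors.toList());
        Files.write(target, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scraped records: a header row followed by one data row.
        List<List<String>> rows = List.of(
                List.of("title", "url"),
                List.of("Example, Inc.", "https://example.com"));
        write(Path.of("scraped.csv"), rows);
    }
}
```

Quoting matters because scraped text often contains commas and quotation marks. Without it, a single title could be split across several columns.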

Unfortunately, many websites try to block scrapers, and sometimes you simply do not want to be detected. In the section on going undercover, you will learn how to stay undetected and avoid getting blocked.
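The description does not say which techniques that section covers. Two widely used ones are sending browser-like request headers and pausing for a random interval between requests. The sketch below shows both with the JDK's built-in `java.net.http` API (Java 11+). The class name, the User-Agent strings, and the delay bounds are example values chosen for illustration, and no request is actually sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class PoliteRequests {

    // Example browser-like User-Agent strings (illustrative values only).
    static final List<String> USER_AGENTS = List.of(
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0");

    // Pick one of the example User-Agent strings at random.
    static String randomUserAgent() {
        int index = ThreadLocalRandom.current().nextInt(USER_AGENTS.size());
        return USER_AGENTS.get(index);
    }

    // Build a GET request that carries headers a regular browser would send.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", userAgent)
                .header("Accept-Language", "en-US,en;q=0.9")
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
    }

    // Random pause length in milliseconds, inclusive of both bounds.
    static long randomDelayMillis(long minMillis, long maxMillis) {
        return ThreadLocalRandom.current().nextLong(minMillis, maxMillis + 1);
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("https://example.com", randomUserAgent());
        System.out.println(request.headers().map());
        System.out.println("Next pause (ms): " + randomDelayMillis(1000, 3000));
    }
}
```

In a real scraper the request would be passed to `HttpClient.send(...)`, followed by `Thread.sleep(randomDelayMillis(...))` before the next call. Randomizing the pause avoids the perfectly regular request rhythm that makes automated traffic easy to spot.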

At the end of the course you can download the full source code of all the lectures, and we discuss an outlook on some advanced topics (private proxies, cloud deployment, multithreading …). Those advanced topics are covered in a follow-up course I am going to teach.

What Will I Learn?

  • Have a solid understanding of web scraping with Java
  • Be able to scrape practically any web page (static AND dynamic / AJAX) because you learn the concepts behind web scraping
  • Download, parse and extract data from websites with Jsoup
  • Call web APIs in Java with Unirest
  • Export your data as CSV or JSON
  • Build web scrapers that stay undetected and do not get blocked or banned

Table of Contents

01 Promo
02 Introduction
03 What is a static web page
04 Concept how to scrape static web pages
05 Jsoup – the jQuery for Java
06 Example – Scraping Google
07 What is a dynamic web page
08 Unirest
09 Concept how to scrape dynamic web pages
10 Example – Scraping peoplescrapers
11 Export as CSV
12 Export as JSON
13 How to stay undetected
14 Conclusion