Anzo52/JCrawl

JCrawl

JCrawl - Java Website Crawler

JCrawl is a basic web crawler implemented in Java. It scrapes web pages starting from a given URL and extracts the links found on those pages. Web crawling is the process of navigating web pages and extracting information from them, a technique commonly used by search engines and web scrapers.
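
To illustrate what extracting links from a page can look like, here is a minimal sketch. It is not taken from JCrawl.java: the class name LinkExtractorSketch, the method extractLinks, and the regular expression are all assumptions made for this example. It only picks up absolute http(s) URLs in double-quoted href attributes and is not a full HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JCrawl.java.
final class LinkExtractorSketch {
    // Matches absolute http(s) URLs inside double-quoted href attributes.
    private static final Pattern HREF =
            Pattern.compile("href=\"(https?://[^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Returns every matching URL in the order it appears in the HTML text.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            links.add(matcher.group(1)); // group 1 is the URL without the quotes
        }
        return links;
    }

    public static void main(String[] args) {
        String html = "<a href=\"https://example.com/a\">A</a> <a href=\"/relative\">B</a>";
        // The relative link is skipped because the pattern requires http:// or https://.
        System.out.println(extractLinks(html)); // prints [https://example.com/a]
    }
}
```

A regular expression keeps the sketch dependency-free, at the cost of missing relative links and unusual attribute quoting. A real crawler may prefer a proper HTML parser.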

Features

  • Web crawling from a starting URL.
  • Specify the number of links to scrape using a breakpoint.
  • Extract links from web pages.
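
The features above can be sketched as a breadth-first loop that stops once a link limit is reached. This is a hedged illustration, not the code in JCrawl.java: the class CrawlLoopSketch, the method crawl, the breakpoint parameter name, and the fetchLinks function are assumptions. Page fetching is passed in as a function so the example runs without network access.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch for illustration only; not part of JCrawl.java.
final class CrawlLoopSketch {
    // Visits pages breadth-first from startUrl and stops once `breakpoint`
    // distinct links have been collected. `fetchLinks` is an assumed hook
    // that returns the links found on a given page.
    static List<String> crawl(String startUrl, int breakpoint,
                              Function<String, List<String>> fetchLinks) {
        List<String> collected = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        seen.add(startUrl);
        queue.add(startUrl);
        while (!queue.isEmpty() && collected.size() < breakpoint) {
            String page = queue.remove();
            for (String link : fetchLinks.apply(page)) {
                if (collected.size() >= breakpoint) {
                    break; // link limit reached
                }
                if (seen.add(link)) { // add() returns false for already-seen links
                    collected.add(link);
                    queue.add(link);
                }
            }
        }
        return collected;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" standing in for real HTTP requests.
        Map<String, List<String>> fakeWeb = Map.of(
                "https://example.com/", List.of("https://example.com/a", "https://example.com/b"),
                "https://example.com/a", List.of("https://example.com/b", "https://example.com/c"));
        List<String> links = crawl("https://example.com/", 3,
                url -> fakeWeb.getOrDefault(url, List.of()));
        System.out.println(links);
    }
}
```

Tracking visited URLs in a set prevents the crawler from looping when pages link back to each other, and checking the limit inside the inner loop ensures no more than the requested number of links is collected.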

Prerequisites

  • Java Development Kit (JDK) installed on your system.

Usage

  1. Clone or download this repository to your local machine.
  2. Compile the JCrawl.java file using javac: javac JCrawl.java
  3. Run the program: java JCrawl