Pakistan News Fetcher

CLI tool to scrape and organize Pakistani news articles into CSV format.

Client: Freelance

Date: 2024

Project Details

Pakistan News Fetcher is a lightweight command-line utility designed to crawl and fetch the latest headlines and article content from various Pakistani news websites.

Functionality & Flow:
  • Uses Axios to send HTTP requests to news websites
  • Uses Cheerio to parse the returned HTML and extract the relevant fields (titles, timestamps, links, summaries)
  • Cleans the fetched data and stores it in a structured CSV file for easy analysis or archiving
  • Supports multiple major Pakistani news sources (e.g., Dunya, Geo, Express, ARY News, and Bol News)
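
The flow above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the SOURCES entry, its URL, the CSS selectors, and the function names are all hypothetical placeholders, and each real site would need its own selectors.

```javascript
// Hypothetical source definition: the URL and selectors are placeholders,
// not taken from the actual tool or from any real site's markup.
const SOURCES = {
  example: {
    name: "Example News",
    baseUrl: "https://news.example.com",
    listPath: "/latest",
    item: "article", // one element per news item
    title: "h2 a",   // headline link inside the item
    summary: "p",    // first paragraph used as the summary
    time: "time",    // element carrying a datetime attribute
  },
};

// Collapse runs of whitespace left over from HTML formatting.
function cleanText(text) {
  return String(text || "").replace(/\s+/g, " ").trim();
}

// Resolve a possibly relative href against the site's base URL.
function toAbsoluteUrl(href, baseUrl) {
  if (!href) return "";
  try {
    return new URL(href, baseUrl).href;
  } catch {
    return ""; // malformed link: keep the row, leave the link empty
  }
}

// Fetch one listing page with Axios and extract rows with Cheerio.
async function fetchArticles(source) {
  // Required here so the helpers above can be used without the
  // dependencies installed (an assumption made for this sketch).
  const axios = require("axios");
  const cheerio = require("cheerio");

  const response = await axios.get(source.baseUrl + source.listPath, {
    timeout: 10000,
    responseType: "text",
  });
  const $ = cheerio.load(response.data);

  return $(source.item)
    .map((_, el) => {
      const link = $(el).find(source.title).first();
      return {
        headline: cleanText(link.text()),
        source: source.name,
        link: toAbsoluteUrl(link.attr("href"), source.baseUrl),
        date: cleanText($(el).find(source.time).first().attr("datetime")),
        summary: cleanText($(el).find(source.summary).first().text()),
      };
    })
    .get()
    .filter((row) => row.headline !== ""); // drop items with no headline
}
```

Keeping the per-site details in a plain object means that, in a design like this one, supporting another source would mostly be a matter of adding another entry with its own selectors.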
Key Features:
  • Automated scraping and extraction of daily news updates
  • Command-line interface with options to choose sources and output location
  • Built for scripting, automation, or personal research use
  • Organized CSV output with columns such as headline, source, link, date, and summary
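
As an illustration of the command-line options and CSV layout listed above, here is a minimal sketch using only Node.js built-ins. The flag names (--source, --out), the default file name, and the function names are assumptions made for this example; the original tool's actual options are not documented here.

```javascript
const { parseArgs } = require("node:util");
const fs = require("node:fs");

// Column order follows the CSV layout described above.
const COLUMNS = ["headline", "source", "link", "date", "summary"];

// Quote a field when it contains a comma, a double quote, or a line break,
// doubling any embedded quotes (the usual RFC 4180 convention).
function escapeCsv(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Turn an array of article objects into CSV text with a header row.
function toCsv(rows) {
  const lines = [COLUMNS.join(",")];
  for (const row of rows) {
    lines.push(COLUMNS.map((col) => escapeCsv(row[col])).join(","));
  }
  return lines.join("\r\n") + "\r\n";
}

// Hypothetical flags: --source/-s may be repeated, --out/-o sets the CSV path.
function parseCli(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      source: { type: "string", short: "s", multiple: true },
      out: { type: "string", short: "o" },
    },
  });
  return { sources: values.source ?? [], out: values.out ?? "news.csv" };
}

// Write the rows to disk as UTF-8 CSV.
function writeCsv(path, rows) {
  fs.writeFileSync(path, toCsv(rows), "utf8");
}
```

With a hypothetical entry script named fetch-news.js that passes process.argv.slice(2) to parseCli, an invocation might look like: node fetch-news.js --source geo --source dunya --out today.csv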
Technical Stack:
  • Node.js and Express for CLI structure and tool logic
  • Axios for HTTP fetching
  • Cheerio for HTML scraping and parsing
  • JavaScript for scripting and file handling
Outcome:
  • Streamlined tool for collecting structured news data
  • Useful for researchers, data journalists, or hobby projects
  • Designed for easy customization and source extension

Technologies Used

Node.js
Express
JavaScript
Axios
Cheerio