luminati-io/Instagram-dataset-samples

instagram-dataset-samples

A sample dataset of 2180 Instagram coding influencers

A sample dataset of over 2,000 leading Instagram coding influencers who use the "github" hashtag. The dataset was extracted using the Bright Data Collector.

Data points included in this free dataset:

  • followers count
  • profile type
  • account type
  • engagement score
  • categories
  • location
  • external/bio links
  • hashtags used
  • brand affiliation
  • bio
  • highlights
  • posts

This sample is a subset of the "All Instagram account, business & nonbusiness (public data)" dataset, which includes 614,000,000 Instagram profiles.

In this example, the full dataset was filtered down to a smaller subset using the smart filter queries available on the Bright Data control panel.

Queries used for filtering this subset:

  • $or: [{"post_hashtags":"github"},{"bio_hashtags":"github"}]
  • followers: {"$gt":100}

Additional filter query values include: post count, country, verified account, multiple hashtag combinations and more.
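
The two filter queries above can be approximated locally once the sample is loaded as a list of dictionaries. The following is a minimal Python sketch, not Bright Data's implementation. The field names `post_hashtags`, `bio_hashtags` and `followers` come from the queries. The value shapes (a list of strings or a single string) and the case-insensitive, "#"-stripping comparison are assumptions, and the helper names are hypothetical.

```python
from typing import Any


def has_hashtag(value: Any, tag: str) -> bool:
    """Return True if `tag` appears in a hashtag field.

    Assumption: the field holds either one string or a list of strings.
    Assumption: matching ignores case and a leading '#'.
    """
    if value is None:
        return False
    tags = [value] if isinstance(value, str) else list(value)
    return any(str(t).lstrip("#").lower() == tag for t in tags)


def matches_sample_filter(profile: dict) -> bool:
    """Mirror the two queries: ($or on the hashtag fields) and followers > 100."""
    mentions_github = (
        has_hashtag(profile.get("post_hashtags"), "github")
        or has_hashtag(profile.get("bio_hashtags"), "github")
    )
    followers = profile.get("followers") or 0  # treat a missing count as 0
    return mentions_github and followers > 100
```

For example, `[p for p in profiles if matches_sample_filter(p)]` keeps only the matching records, where `profiles` is any iterable of profile dictionaries.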

Available dataset file formats: JSON, NDJSON, JSON Lines, CSV, or Parquet.
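
NDJSON and JSON Lines both store one JSON object per line, so either can be read with Python's standard library alone. The sketch below assumes that layout. The function name `iter_json_lines` is hypothetical and the two inline records are made-up placeholders, not rows from the dataset.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of an NDJSON / JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Placeholder input standing in for a downloaded file (values are invented).
sample = io.StringIO('{"followers": 150}\n\n{"followers": 90}\n')
records = list(iter_json_lines(sample))
```

To read a real file, pass an open handle instead, for example `open(path, encoding="utf-8")`, where `path` points to the downloaded sample.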

Dataset delivery options: API download, Amazon S3, Google Cloud, Microsoft Azure, SFTP.

Data enrichment is available in addition to the extracted data points: average post engagement rate, brand affiliation and more.

Get the full Instagram dataset on Bright Data's page.

Additional Instagram datasets available:

  • 614,000,000 "Instagram profiles"
  • 21,100,000 "Instagram posts"
  • 88,800 "Instagram reels"

Free access to web scraping tools and datasets for academic researchers and NGOs

The Bright Initiative offers access to Bright Data's Web Scraper APIs to leading academic faculties and researchers, NGOs and NPOs promoting various environmental and social causes. You can submit an application here.