[EN] Collect metadata on the reports of the Armenian NGOs #8

ansakoy · 2023-06-05T13:31:40Z

Goal

The goal is to create a dataset containing the metadata on the reports of the Armenian NGOs.

Tasks

The metadata are published on an ASPX website (https://www.petekamutner.am/Reports_vh.aspx?rptid=1), so collecting them may require a browser emulation library. Beyond that, the task is to collect all the data from the paginated table on that page and store them in a machine-readable format such as JSON, XML, or CSV, in a flat structure. For example:

{
	"reg_num": STRING,
	"year": NUMBER,
	"taxpayer_id": STRING,
	"org_name": STRING,
	"org_type": STRING,
	"report_date": STRING,
	"report_url": STRING,  // the url to download the given report
	"date_update": STRING
}
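
For the collection step, a minimal sketch of how paging through a classic ASP.NET WebForms table is often handled without a full browser: the hidden form fields (such as `__VIEWSTATE`) are read from the current page and sent back together with the paging event. Whether this particular site pages its table this way has not been verified. The control name `ctl00$GridView1`, the argument `Page$2`, and the sample HTML are hypothetical placeholders, not values taken from the site.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback(html, event_target, event_argument):
    """Build the form payload that asks the server for another table page."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)  # __VIEWSTATE, __EVENTVALIDATION, etc.
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


# Illustrative fragment only; the real page markup has not been inspected.
sample_html = """
<form method="post" action="Reports_vh.aspx?rptid=1">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="search" value="" />
</form>
"""

# Hypothetical control name and paging argument.
payload = build_postback(sample_html, "ctl00$GridView1", "Page$2")
print(payload)
```

The resulting dictionary would then be sent as a POST form body to the same URL, and the hidden fields re-read from each response before requesting the next page. If the table turns out to be rendered by JavaScript instead, a browser emulation library remains the fallback.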

Please bear in mind that all numeric IDs, such as the taxpayer ID (ՀՎՀՀ), should always be stored as strings (characters) to preserve their exact value, including any leading zeros.
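
A minimal sketch of storing one record in the flat structure above while keeping IDs as strings. The sample row is made up for illustration, and the helper name `to_record` is hypothetical; only the field names come from the example structure.

```python
import csv
import io
import json

FIELDS = [
    "reg_num", "year", "taxpayer_id", "org_name",
    "org_type", "report_date", "report_url", "date_update",
]


def to_record(row):
    """Normalize one scraped table row (all values as text) into a flat record."""
    record = {name: str(row.get(name, "")).strip() for name in FIELDS}
    # "year" is the only numeric field; this raises ValueError if it is empty.
    record["year"] = int(record["year"])
    return record


# Made-up values, for illustration only.
row = {
    "reg_num": "0001",
    "year": "2022",
    "taxpayer_id": "00123456",  # the leading zeros must survive
    "org_name": "Example NGO",
    "org_type": "NGO",
    "report_date": "2023-05-30",
    "report_url": "https://example.org/report.pdf",
    "date_update": "2023-06-01",
}
record = to_record(row)

# JSON: ensure_ascii=False keeps Armenian text readable in the output.
json_text = json.dumps(record, ensure_ascii=False, indent=2)

# CSV: written to an in-memory buffer here; a real script would open a file
# with newline="" and encoding="utf-8".
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerow(record)

# Reading the CSV back with the csv module returns every value as a string.
restored = next(csv.DictReader(io.StringIO(buffer.getvalue())))
print(restored["taxpayer_id"])
```

Note that tools which infer column types when loading a CSV may still convert an ID column to a number unless it is explicitly declared as text.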

Context

The State Revenue Committee of the Republic of Armenia publishes the reports of NGOs, which may contain invaluable information on how the non-government and non-profit sector operates. The problem is that the reports themselves are published as PDF files, which are hard to process automatically. However, these files are accompanied by helpful metadata, including the organizations' names and IDs and the reports' years, which make it possible to quickly search for specific reports and to check whether an NGO has any such reports at all.

Requirements

A public GitHub repository should be created to store and publish the code and the data under one of the free and open licenses, such as Creative Commons or MIT.

Wishes

It would be best if your code is reusable, that is, if it can be run again by anyone who wants to update the dataset at a later point. For the same reason, we encourage you to comment your code, supplement it with at least a very brief README description, and specify the requirements and dependencies necessary to use the code.

Resources

https://www.petekamutner.am/Reports_vh.aspx?rptid=1

Prepared by

The Open Data Armenia team prepared this task

ivbeg added the extraction, topic-government, and topic-finances labels on Jun 5, 2023
