[HELP NEEDED] Cleaning database from some domains #525

Open
davidande opened this issue May 17, 2024 · 3 comments

Comments

@davidande

Hello,
I have been using parsedmarc for a long time, and today I log reports for almost 200 different domain names.
I no longer need to log some of these domains, and I would like to remove their data from the database.
Could anyone help me with this?
Is there an Elasticsearch command that would delete all entries generated by one specific domain?
Thanks for your help.

David

@Szasza
Contributor

Szasza commented May 30, 2024

Hi @davidande ,

Which data storage solution do you use?

@davidande
Author

Hi @Szasza
My database is stored in Elasticsearch 8.13.0

@Szasza
Contributor

Szasza commented Sep 22, 2024

@davidande you will need to write a script which lists all the indices in your ES installation and calls the Delete by query API on each relevant index with the following request:

POST /<index_name>/_delete_by_query
{
  "query": {
    "match": {
      "header_from": "<domain_here>"
    }
  }
}

Please note the following:

  • The reason why you have to iterate through the indices is that parsedmarc stores the reports in separate indices based on the date of the reports.
  • The aggregate reports' indices will have dmarc_aggregate in their names, while forensic reports will have dmarc_forensic.
  • The above query checks the header_from field which is only present in aggregate reports, but not in forensic reports.
  • You may want to check the envelope_from field as well (aggregate), and possibly envelope_to (aggregate), either separately or as part of a composite query. I am not sure what your deletion criterion "all entries generated by one specific domain" is meant to cover.
  • If you need to delete forensic reports too, then you also need to run a deletion on the forensic indices, matching the domain field, and possibly the dkim_domain one. Again, it depends on your use case.
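
The steps above could be scripted roughly as follows. This is only a minimal sketch, not a tested tool: it assumes Elasticsearch is reachable at http://localhost:9200 without TLS or authentication (an 8.x install usually has security enabled, so the request helper would need credentials added), and the function names build_domain_query, select_indices, es_request, list_indices and delete_domain are hypothetical names chosen for illustration. It combines several fields in one bool/should query and defaults to a dry run that only counts matching documents through the Count API, so the match can be checked before anything is deleted.

```python
import json
import urllib.request

# Assumption: local, unauthenticated HTTP endpoint. Adjust for your setup.
ES_URL = "http://localhost:9200"


def build_domain_query(domain, fields):
    """Build a composite query matching `domain` in any of `fields`."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: domain}} for field in fields],
                "minimum_should_match": 1,
            }
        }
    }


def select_indices(names, marker):
    """Keep only index names containing `marker`, e.g. 'dmarc_aggregate'."""
    return sorted(name for name in names if marker in name)


def es_request(method, path, body=None):
    """Send a JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_indices():
    """Return the names of all indices via the cat indices API."""
    rows = es_request("GET", "/_cat/indices?format=json&h=index")
    return [row["index"] for row in rows]


def delete_domain(domain, marker, fields, dry_run=True):
    """Count (dry run) or delete matching documents, index by index."""
    body = build_domain_query(domain, fields)
    results = {}
    for index in select_indices(list_indices(), marker):
        if dry_run:
            # Count API: reports how many documents the query matches.
            results[index] = es_request("POST", f"/{index}/_count", body)["count"]
        else:
            reply = es_request("POST", f"/{index}/_delete_by_query", body)
            results[index] = reply["deleted"]
    return results
```

A possible way to use it: call delete_domain("example.com", "dmarc_aggregate", ["header_from", "envelope_from"]) first and inspect the per-index counts, then repeat with dry_run=False once the numbers look right. For forensic reports, the same call with "dmarc_forensic" and the forensic field names mentioned above should work in the same way. Taking an index snapshot before deleting is worth considering, since a delete by query cannot be undone.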

I hope the above helps.
