Commit 2af81fc

New domain tracker (#892)
* feat: new domain tracker transformer
1 parent 0345d75 commit 2af81fc

13 files changed, +458 -13 lines changed

README.md

+3-2
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
<p align="center">
22
<img src="https://goreportcard.com/badge/github.com/dmachard/go-dns-collector" alt="Go Report"/>
33
<img src="https://img.shields.io/badge/go%20version-min%201.21-green" alt="Go version"/>
4-
<img src="https://img.shields.io/badge/go%20tests-513-green" alt="Go tests"/>
4+
<img src="https://img.shields.io/badge/go%20tests-516-green" alt="Go tests"/>
55
<img src="https://img.shields.io/badge/go%20bench-21-green" alt="Go bench"/>
6-
<img src="https://img.shields.io/badge/go%20lines-32126-green" alt="Go lines"/>
6+
<img src="https://img.shields.io/badge/go%20lines-32515-green" alt="Go lines"/>
77
</p>
88

99
<p align="center">
@@ -76,6 +76,7 @@
7676

7777
- **[Transformers](./docs/transformers.md)**
7878

79+
- Detect [Newly Observed Domains](docs/transformers/transform_newdomaintracker.md)
7980
- [Rewrite](docs/transformers/transform_rewrite.md) DNS messages or custom [Relabeling](docs/transformers/transform_relabeling.md) for JSON output
8081
- Add additionnal [Tags](docs/transformers/transform_atags.md) in DNS messages
8182
- Traffic [Filtering](docs/transformers/transform_trafficfiltering.md) and [Reducer](docs/transformers/transform_trafficreducer.md)

docs/_examples/use-case-31.yml

+33
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
global:
2+
trace:
3+
verbose: true
4+
5+
pipelines:
6+
- name: tap
7+
dnstap:
8+
listen-ip: 0.0.0.0
9+
listen-port: 6000
10+
transforms:
11+
normalize:
12+
qname-lowercase: true
13+
qname-replace-nonprintable: true
14+
routing-policy:
15+
forward: [ detect_new_domain ]
16+
dropped: [ ]
17+
18+
- name: detect_new_domain
19+
dnsmessage:
20+
matching:
21+
include:
22+
dnstap.operation: "CLIENT_QUERY"
23+
transforms:
24+
new-domain-tracker:
25+
ttl: 3600
26+
cache-size: 1000
27+
routing-policy:
28+
forward: [ console ]
29+
dropped: [ ]
30+
31+
- name: console
32+
stdout:
33+
mode: text

docs/collectors/collector_dnsmessage.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,6 @@ Finally a complete full example:
7171
atags:
7272
tags: [ "TXT:apple", "TXT:google" ]
7373
routing-policy:
74-
dropped: [ outputfile ]
74+
forward: [ outputfile ]
7575
default: [ console ]
7676
```

docs/examples.md

+1
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ You will find below some examples of configurations to manage your DNS logs.
77
- [x] [Advanced example with DNSmessage collector](./_examples/use-case-24.yml)
88
- [x] [How can I log only slow responses and errors?"](./_examples/use-case-25.yml)
99
- [x] [Filter DNStap messages where the response ip address is 0.0.0.0](./_examples/use-case-26.yml)
10+
- [x] [Detect Newly Observed Domains](./_examples/use-case-31.yml)
1011

1112
- **Capture DNS traffic from incoming DNSTap streams**
1213
- [x] [Read from UNIX DNSTap socket and forward it to TLS stream](./_examples/use-case-5.yml)

docs/transformers.md

+2-1
Original file line numberDiff line numberDiff line change
@@ -27,4 +27,5 @@ Transformers processing is currently in this order :
2727
| [Traffic Prediction](transformers/transform_trafficprediction.md) | Features to train machine learning models |
2828
| [Additionnal Tags](transformers/transform_atags.md) | Add additionnal tags |
2929
| [JSON relabeling](transformers/transform_relabeling.md) | JSON relabeling to rename or remove keys |
30-
| [DNS message rewrite](transformers/transform_rewrite.md) | Rewrite value for DNS messages structure |
30+
| [DNS message rewrite](transformers/transform_rewrite.md) | Rewrite value for DNS messages structure |
31+
| [Newly Observed Domains](transformers/transform_newdomaintracker.md) | Detect Newly Observed Domains |

docs/transformers/transform_newdomaintracker.md

+72
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# Transformer: New Domain Tracker
2+
3+
The **New Domain Tracker** transformer identifies domains that are newly observed within a configurable time window. It is particularly useful for detecting potentially malicious or suspicious domains in DNS traffic, such as those used for phishing, malware, or botnets.
4+
5+
## Features
6+
7+
- **Configurable Time Window**: Define how long a domain is considered new.
8+
- **LRU-based Memory Management**: Ensures efficient memory usage with a finite cache size.
9+
- **Persistence**: Optionally save the domain cache to disk for continuity after restarts.
10+
- **Whitelist Support**: Exclude specific domains or patterns from detection.
11+
12+
## How It Works
13+
14+
1. When a DNS query is processed, the transformer checks if the queried domain exists in its cache.
15+
2. If the domain is not in the cache or has not been seen within the specified TTL, it is marked as newly observed.
16+
3. The domain is added to the cache with a timestamp of when it was last seen.
17+
4. Whitelisted domains are ignored and never marked as new.
18+
19+
## Configuration:
20+
21+
* `ttl` (integer)
22+
> Time window, in seconds, during which a domain is remembered after being observed (default: 3600, i.e. 1 hour)
23+
24+
* `cache-size` (integer)
25+
> Maximum number of domains to track
26+
27+
* `white-domains-file` (string)
28+
> Path to the domain whitelist file; each line is a regular expression and may match only part of a domain name
29+
30+
31+
```yaml
32+
transforms:
33+
new-domain-tracker:
34+
ttl: 3600
35+
cache-size: 100000
36+
white-domains-file: ""
37+
persistence-file: ""
38+
```
39+
40+
## Cache
41+
42+
The New Domain Tracker uses an **LRU Cache** to manage memory consumption efficiently. You can configure the maximum number of domains stored in the cache using the `cache-size` parameter. Once the cache reaches its maximum size, the least recently used entries are removed to make room for new ones.
43+
The LRU Cache ensures finite memory usage but may cause some domains to be forgotten if the cache size is too small.
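
To make the eviction trade-off concrete, here is a small standard-library sketch of an LRU set. It is an illustration only: the `lru` type and its `seen` method are hypothetical and are not the cache library used by the transformer. It shows how a domain that falls out of an undersized cache is reported as new again.

```go
package main

import (
	"container/list"
	"fmt"
)

// lru is a hypothetical minimal LRU set of domain names.
type lru struct {
	max   int
	order *list.List               // front = most recently used
	items map[string]*list.Element // domain -> node in order
}

func newLRU(max int) *lru {
	return &lru{max: max, order: list.New(), items: make(map[string]*list.Element)}
}

// seen records domain and reports whether it was already cached.
func (c *lru) seen(domain string) bool {
	if el, ok := c.items[domain]; ok {
		c.order.MoveToFront(el) // refresh recency on a hit
		return true
	}
	c.items[domain] = c.order.PushFront(domain)
	if c.order.Len() > c.max {
		// Evict the least recently used entry to stay within max.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(string))
	}
	return false
}

func main() {
	c := newLRU(2)
	c.seen("a.example")
	c.seen("b.example")
	c.seen("c.example")              // cache is full: a.example is evicted
	fmt.Println(c.seen("b.example")) // true: still cached
	fmt.Println(c.seen("a.example")) // false: forgotten, so it looks new again
}
```

With a cache of two entries and three active domains, `a.example` is flagged a second time even though it was seen moments earlier, which is the false-positive risk of setting `cache-size` too low.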
44+
45+
46+
## Whitelist
47+
48+
Example of configuration to load a whitelist of domains to ignore.
49+
50+
```yaml
51+
transforms:
52+
new-domain-tracker:
53+
white-domains-file: /tmp/whitelist_domain.txt
54+
```
55+
56+
Example of content for the file `/tmp/whitelist_domain.txt`
57+
58+
```
59+
(mail|www).google.com
60+
github.com
61+
```
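
Because each whitelist line is compiled as a regular expression and matched with `MatchString` in the committed code, an unanchored pattern matches anywhere in the queried name, and an unescaped `.` matches any character. The sketch below (the `matches` helper is a hypothetical name) compares a loose pattern with an anchored, escaped one.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether the whitelist pattern matches the queried name.
func matches(pattern, qname string) bool {
	return regexp.MustCompile(pattern).MatchString(qname)
}

func main() {
	loose := `github.com`      // unanchored, "." matches any character
	strict := `^github\.com$` // anchored, literal dot
	for _, q := range []string{"github.com", "github.com.evil.example", "githubxcom.example"} {
		fmt.Println(q, "loose:", matches(loose, q), "strict:", matches(strict, q))
	}
}
```

If the intent is to exclude only an exact domain, anchoring with `^` and `$` and escaping dots avoids silently whitelisting look-alike names.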
62+
63+
## Persistence
64+
65+
To ensure continuity across application restarts, you can enable the persistence feature by specifying a file path with the `persistence-file` option.
66+
The transformer will save the domain cache to this file and reload it on startup.
67+
68+
```yaml
69+
transforms:
70+
new-domain-tracker:
71+
persistence-file: /tmp/nod-state.json
72+
```

go.mod

+1-1
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ require (
2525
github.com/google/uuid v1.6.0
2626
github.com/grafana/dskit v0.0.0-20240905221822-931a021fb06b
2727
github.com/grafana/loki/v3 v3.2.1
28+
github.com/hashicorp/golang-lru v0.6.0
2829
github.com/hashicorp/golang-lru/v2 v2.0.7
2930
github.com/hpcloud/tail v1.0.0
3031
github.com/influxdata/influxdb-client-go v1.4.0
@@ -92,7 +93,6 @@ require (
9293
github.com/hashicorp/go-rootcerts v1.0.2 // indirect
9394
github.com/hashicorp/go-sockaddr v1.0.6 // indirect
9495
github.com/hashicorp/go-uuid v1.0.3 // indirect
95-
github.com/hashicorp/golang-lru v0.6.0 // indirect
9696
github.com/hashicorp/memberlist v0.5.0 // indirect
9797
github.com/hashicorp/serf v0.10.1 // indirect
9898
github.com/huandu/xstrings v1.3.3 // indirect

pkgconfig/transformers.go

+7
Original file line numberDiff line numberDiff line change
@@ -95,6 +95,13 @@ type ConfigTransformers struct {
9595
Enable bool `yaml:"enable" default:"false"`
9696
Identifiers map[string]interface{} `yaml:"identifiers,flow"`
9797
} `yaml:"rewrite"`
98+
NewDomainTracker struct {
99+
Enable bool `yaml:"enable" default:"false"`
100+
TTL int `yaml:"ttl" default:"3600"`
101+
CacheSize int `yaml:"cache-size" default:"100000"`
102+
WhiteDomainsFile string `yaml:"white-domains-file" default:""`
103+
PersistenceFile string `yaml:"persistence-file" default:""`
104+
} `yaml:"new-domain-tracker"`
98105
}
99106

100107
func (c *ConfigTransformers) SetDefault() {
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,2 @@
1+
.*\.google\.com
2+
github\.com

transformers/newdomaintracker.go

+204
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,204 @@
1+
package transformers
2+
3+
import (
4+
"bufio"
5+
"encoding/json"
6+
"errors"
7+
"fmt"
8+
"os"
9+
"regexp"
10+
"strings"
11+
"time"
12+
13+
"github.com/dmachard/go-dnscollector/dnsutils"
14+
"github.com/dmachard/go-dnscollector/pkgconfig"
15+
"github.com/dmachard/go-logger"
16+
"github.com/hashicorp/golang-lru/v2/expirable"
17+
)
18+
19+
type NewDomainTracker struct {
20+
ttl time.Duration // Time window to consider a domain as "new"
21+
cache *expirable.LRU[string, struct{}] // Expirable LRU Cache
22+
whitelist map[string]*regexp.Regexp // Whitelisted domains
23+
persistencePath string
24+
logInfo func(msg string, v ...interface{})
25+
logError func(msg string, v ...interface{})
26+
}
27+
28+
func NewNewDomainTracker(ttl time.Duration, maxSize int, whitelist map[string]*regexp.Regexp, persistencePath string, logInfo, logError func(msg string, v ...interface{})) (*NewDomainTracker, error) {
29+
30+
if ttl <= 0 {
31+
return nil, fmt.Errorf("invalid TTL value: %v", ttl)
32+
}
33+
34+
cache := expirable.NewLRU[string, struct{}](maxSize, nil, ttl)
35+
36+
tracker := &NewDomainTracker{
37+
ttl: ttl,
38+
cache: cache,
39+
whitelist: whitelist,
40+
persistencePath: persistencePath,
41+
logInfo: logInfo,
42+
logError: logError,
43+
}
44+
// Load cache state from disk if persistence is enabled
45+
if persistencePath != "" {
46+
if err := tracker.loadCacheFromDisk(); err != nil {
47+
return nil, fmt.Errorf("failed to load cache state: %w", err)
48+
}
49+
}
50+
51+
return tracker, nil
52+
}
53+
54+
func (ndt *NewDomainTracker) isWhitelisted(domain string) bool {
55+
for _, d := range ndt.whitelist {
56+
if d.MatchString(domain) {
57+
return true
58+
}
59+
}
60+
return false
61+
}
62+
63+
func (ndt *NewDomainTracker) IsNewDomain(domain string) bool {
64+
// Check if the domain is whitelisted
65+
if ndt.isWhitelisted(domain) {
66+
return false
67+
}
68+
69+
// Check if the domain exists in the cache
70+
if _, exists := ndt.cache.Get(domain); exists {
71+
// Domain was recently seen, not new
72+
return false
73+
}
74+
75+
// Otherwise, mark the domain as new
76+
ndt.cache.Add(domain, struct{}{})
77+
return true
78+
}
79+
80+
func (ndt *NewDomainTracker) SaveCacheToDisk() error {
81+
keys := ndt.cache.Keys()
82+
data, err := json.Marshal(keys)
83+
if err != nil {
84+
return err
85+
}
86+
87+
return os.WriteFile(ndt.persistencePath, data, 0644)
88+
}
89+
90+
// loadCacheFromDisk loads the cache state from a file
91+
func (ndt *NewDomainTracker) loadCacheFromDisk() error {
92+
if ndt.persistencePath == "" {
93+
return errors.New("persistence filepath not set")
94+
}
95+
96+
data, err := os.ReadFile(ndt.persistencePath)
97+
if err != nil {
98+
if os.IsNotExist(err) {
99+
return nil // File does not exist, no previous state to load
100+
}
101+
return err
102+
}
103+
104+
var keys []string
105+
if err := json.Unmarshal(data, &keys); err != nil {
106+
return err
107+
}
108+
109+
for _, key := range keys {
110+
ndt.cache.Add(key, struct{}{})
111+
}
112+
113+
return nil
114+
}
115+
116+
// NewDomainTrackerTransform is the Transformer for DNS messages
117+
type NewDomainTrackerTransform struct {
118+
GenericTransformer
119+
domainTracker *NewDomainTracker
120+
listDomainsRegex map[string]*regexp.Regexp
121+
}
122+
123+
// NewNewDomainTrackerTransform creates a new instance of the transformer
124+
func NewNewDomainTrackerTransform(config *pkgconfig.ConfigTransformers, logger *logger.Logger, name string, instance int, nextWorkers []chan dnsutils.DNSMessage) *NewDomainTrackerTransform {
125+
t := &NewDomainTrackerTransform{GenericTransformer: NewTransformer(config, logger, "new-domain-tracker", name, instance, nextWorkers)}
126+
t.listDomainsRegex = make(map[string]*regexp.Regexp)
127+
return t
128+
}
129+
130+
// ReloadConfig reloads the configuration
131+
func (t *NewDomainTrackerTransform) ReloadConfig(config *pkgconfig.ConfigTransformers) {
132+
t.GenericTransformer.ReloadConfig(config)
133+
ttl := time.Duration(config.NewDomainTracker.TTL) * time.Second
134+
t.domainTracker.ttl = ttl
135+
t.LogInfo("new-domain-transformer configuration reloaded")
136+
}
137+
138+
func (t *NewDomainTrackerTransform) GetTransforms() ([]Subtransform, error) {
139+
subtransforms := []Subtransform{}
140+
if t.config.NewDomainTracker.Enable {
141+
// init whitelist
142+
if err := t.LoadWhiteDomainsList(); err != nil {
143+
return nil, err
144+
}
145+
146+
// Initialize the domain tracker
147+
ttl := time.Duration(t.config.NewDomainTracker.TTL) * time.Second
148+
maxSize := t.config.NewDomainTracker.CacheSize
149+
tracker, err := NewNewDomainTracker(ttl, maxSize, t.listDomainsRegex, t.config.NewDomainTracker.PersistenceFile, t.LogInfo, t.LogError)
150+
if err != nil {
151+
return nil, err
152+
}
153+
t.domainTracker = tracker
154+
155+
subtransforms = append(subtransforms, Subtransform{name: "new-domain-tracker:detect", processFunc: t.trackNewDomain})
156+
}
157+
return subtransforms, nil
158+
}
159+
160+
func (t *NewDomainTrackerTransform) LoadWhiteDomainsList() error {
161+
// Before loading, clear any previously loaded whitelist entries
162+
for key := range t.listDomainsRegex {
163+
delete(t.listDomainsRegex, key)
164+
}
165+
166+
if len(t.config.NewDomainTracker.WhiteDomainsFile) > 0 {
167+
file, err := os.Open(t.config.NewDomainTracker.WhiteDomainsFile)
168+
if err != nil {
169+
return fmt.Errorf("unable to open regex list file: %w", err)
170+
} else {
171+
172+
scanner := bufio.NewScanner(file)
173+
for scanner.Scan() {
174+
domain := strings.ToLower(scanner.Text())
175+
t.listDomainsRegex[domain] = regexp.MustCompile(domain)
176+
}
177+
t.LogInfo("loaded with %d domains in the whitelist", len(t.listDomainsRegex))
178+
}
179+
}
180+
return nil
181+
}
182+
183+
// trackNewDomain processes DNS messages and detects newly observed domains
184+
func (t *NewDomainTrackerTransform) trackNewDomain(dm *dnsutils.DNSMessage) (int, error) {
185+
// Return an error if the cache is full (checked before adding the new domain)
186+
if t.domainTracker.cache.Len() == t.config.NewDomainTracker.CacheSize {
187+
return ReturnError, fmt.Errorf("LRU cache is full. Consider increasing cache-size to avoid frequent evictions")
188+
}
189+
190+
// Check if the domain is newly observed
191+
if t.domainTracker.IsNewDomain(dm.DNS.Qname) {
192+
return ReturnKeep, nil
193+
}
194+
return ReturnDrop, nil
195+
}
196+
197+
func (t *NewDomainTrackerTransform) Reset() {
198+
if len(t.domainTracker.persistencePath) != 0 {
199+
if err := t.domainTracker.SaveCacheToDisk(); err != nil {
200+
t.LogError("failed to save cache state: %v", err)
201+
}
202+
t.LogInfo("cache content saved on disk with success")
203+
}
204+
}
