-
Notifications
You must be signed in to change notification settings - Fork 8
NetDB Home
NetDB has been tested on Debian, Ubuntu and CentOS, but it shouldn't matter since it's Perl. After porting from Sourceforge all the testing and script was performed on CentOS; therefore, CentOS will be the recommended OS for now.
I do not have an installer, but the install process is not complex for anyone who has ever installed a LAMP application before. I have all the commands you need below so you can mostly copy and paste. For most people, the install will take about 2 hours on a fresh server. Most of that time will be spent building your device list file and installing CPAN modules. Just follow the steps and you'll get there.
The app is installed in /opt/netdb and is controlled from cron. It relies on a few CPAN modules, so you will need to get CPAN going. I provide a full module list below that you need to install along with some automated cpan commands.
The database is runs on MySQL, and is well indexed with FK constraints. It's optimized for inserts but queries are also fast. The only potentially slow query is hostname searches since they are freetext searches. You may see degraded hostname queries above 500,000 ARP entries for hostnames.
New in version 1.10, there is a new plugin architecture for multi-vendor support. NetDB was originally written to support Cisco IOS devices but now supports other vendors. See /opt/netdb/netdbscraper/skeletonscraper.pl for information on how to add support for an unsupported device through SSH.
Plugin architecture for third-party additions (please share any new scrapers)
-
Cisco 12.2 - 15.0 IOS routers (2600/2800/2900) for ARP data
-
Cisco IOS 7600/6500 SXF and above plus VSS support
-
Cisco 4500 Family (4006/4506/4948/4900M/4500-X)
-
Cisco 3650/3750/3850/3560/2970/2960/2950 (anything using 12.2(35)+ will work for sure, most 12.2(25)SEE+ codes works, but really old versions have bugs in them)
-
Cisco Nexus NX-OS 7000/5000 Switches and Nexus 2000 FEXes
-
Cisco 2924/3500XL support for those of you with really old devices
-
Cisco ASA/FWSM ARP table support (SSH Access Only, no telnet support)
-
Cisco CatOS Support for 6500 and 2948-L3 (limited support)
-
JunOS Ethernet switches and routers including firewalls (SRX/EX/MX)
-
Foundry/Brocade Switch support
-
Cisco WLC Wireless Controllers
-
Aruba Wireless Controllers
-
Dell Powerconnect Switches
-
HP Procurve Switches
SSH WARNING: If you are using SSH with 12.2(25)SEE code on 3750s/3560s or below, there is an SSH memory leak that will eventually crash your switches. Please make sure you are using 12.2(35) or above on those devices if you plan to use SSH with them. They won't run out of memory for a few days, but then they will start dropping right and left and pulling the plug is the only way to fix them. Please take this in to account on older software.
SSH Memory Leak Update: There are SSH leaks throughout older IOS software. They don't show up in normal use, but NetDB opens many sessions a day to your devices. Make sure your software is no more than two years old if you want to use SSH. Otherwise, consider restricting telnet access to NetDB only. It is no less secure than SNMP v1/2 if properly configured.
-
NetDB was built with large networks in mind to take advantage of well designed networks where all equipment is known and managed already. There is no discovery of network devices via CDP, it expects you to have a list of switches already registered with DNS names. If you don't, that's ok, you'll just have to do some more work to get up and running.
-
Extensive reverse DNS infrastructure (updates will be extremely slow if reverse DNS queries are timing out). DDNS environments with reverse DNS work best with NetDB, but it's also possible to disable DNS lookups on IPs completely. At a minimum, set your DNS server to be authoritative for rfc1918 addresses and return nothing.
-
NetDB will likely be the largest source of DNS queries on your network, but this should not be a big deal for most people. I err on the side of accuracy rather than performance. If you don't care about absolute DNS accuracy, a local caching DNS server might be useful.
-
Robust TACACS/Radius infrastructure, be prepared for the increased logging on your TACACS server. Depending on the number of devices on your network and update interval, the increased logins can easily overrun your disk space with logging.
-
Up to date IOS code if you plan to use SSH, early IOS images have memory leaks in them, see below.
Cisco NX-OS devices now default to more appropriate ARP and MAC timers, so there's no need to adjust them. For Catalyst devices it is recommended for more accurate data.
Consider changing your ARP timers down to one hour from four hours on your routers, and syncing your mac aging timers to be the same as your NetDB update interval. This is not required, but helps to improve the accuracy of timestamps, while making sure you don't miss anything happening on the network. We use 15min MAC aging timers (default is 5 minutes) and 1 hour ARP timers (defaults to 4 hours) with no issues.
When a device goes offline, it's ARP entry can stick around for up to four hours, appearing to NetDB to still be online during that time (though the switchport entry may show it as offline sooner due to short mac aging). Reducing this to one hour instead of four is a good compromise between increasing ARP traffic on your network, and having accurate data for troubleshooting and auditing. Do not set the ARP timer too low, or you can increase the amount of broadcast ARP traffic on your subnets substantially.
Sometimes, new devices that don't do DHCP are plugged in, talk once and then go dormant. If you don't catch the first packet in the mac-table when NetDB runs, you may never see it again until it's rebooted. Increasing the default 5min mac age timers to match NetDB update cycle resolves this issue, This makes sure NetDB catches 100% of devices that plug in to the network, even if they were only plugged in for a few minutes.
Cisco 6500 example:
mac-address-table aging-time 900 int vlan70 arp timeout 3600
Caution: If you are running active/standby or active/active hsrp at the distribution level, consider syncing the MAC and ARP timers to be the same on each pair of switches to avoid flooding issues in general. This has nothing to do with NetDB but is something to be aware of if you are going to change your existing timers. If someone else has already customized the timers on your network, proceed with caution, especially at the distribution layer. In almost all other cases, you will be perfectly fine setting the suggested 1 hour ARP and 15min mac timers.
-
CentOS is recommended
-
If you have less than 150 cisco devices or less than 4000 nodes, any old server should work fine.
-
If you have a large or very large network, server performance becomes important, see next section.
-
When considering server performance, decisions should be made on how often you need to update the database and how important that information is to you. We update every 15 minutes, but the entire update process only takes 2.5min. We gather data on 25k nodes across 500+ devices, and import that data in to the database on a quadcore Dell 2950. The import is single threaded, and if you really want multi-process imports, let me know.
-
Unless you have over 10,000 nodes on your network, you can run NetDB on almost anything and skip this section.
-
There are two actions that take time to run, scraping the devices and importing the data. If you are having performance issues, first figure out which action is taking so long by running netdbctl -debug 2 manually.
-
The scraper is multi-process enabled and scales well if you tweak the scraper_procs and proc_delay settings in netdb.conf.
-
On a large install, you will want to have a well-configured DNS server. DNS timeouts are the single largest cause of slow import performance, see below.
-
For thousands of routers and switches and hundreds of thousands of nodes, you will want at least a quad-core server with 2gigs of ram and fast disk drives.
-
The database was designed to be highly transactional and disk write speed is the major bottleneck to import performance.
-
When importing into the database, DNS and MySQL performance will likely be your biggest hurdles. We can import 25k mac table entries in 20sec and 25k ARP entries in 50sec. The faster your database server the better if you are not happy with the speed. Importing the data on the same server as the database seems to improve performance given how network latency can slow the high number of transactions required.
-
To increase import performance, it is possible to run updates against MySQL in parallel, just email me if you need this level of performance.
-
Consider using the included my.cnf file. You want to make sure all tables/indexes exist in memory at all times for maximum database performance.
-
The only type of searches that are non-indexed (possibly slow) are hostname and vendor code searches since they require wildcards '%...%'. Once the number of ARP entries (more likely) or MAC entries in the database grows to over a million records (run netdbctl -st), hostname searches can take over one second or more to complete. Consider deleting older ARP data to improve performance on large databases.
-
You can now set regex=1 in netdb.conf and control hostname free-text searches with a wildcard character (partial_hostname) to improve query times with large dynamic ARP tables.
-
For those of you with millions of records, hopefully, future versions of MySQL will optimize full-text search for InnoDB. The application was designed to be ACID compliance, so MyISAM is not an option. If you must improve full-text performance and cannot delete old data, look at the SphinxSE plugin for MySQL.
In general, the app should be very fast. If searches take a long time something is wrong, and it's probably your database configuration. All searches should be on indexed data and the database was carefully designed for speed. Hostname searches are somewhat slower because they are free form, but they should not be extreme. One really large networks, you probably want to purge old ARP data on a regular basis. If an update takes over 10 minutes unless you have 100,000 devices in your ARP table, something is wrong.
NetDB tends to generate a lot of DNS queries. If it encounters an IP address with no reverse entry already in the database, it will try to do a lookup on it every time an update is run. If these lookups are timing out, import performance will slow to a crawl.
Talk to your DNS administrator and let them know to expect a lot of traffic. Modern DNS servers are more than capable of handling the load. Worst case scenario, take the size of your ARP table and multiply it times 4 for the number of reverse DNS requests per hour.
NetDB was built with Dynamic DNS in mind and works best when you generate reverse records for your hosts. Without DNS entries, there will be no hostnames associated with IPs in the database.
If your DNS server doesn't support reverse DNS or is a windows DNS server that takes forever to timeout, it will make ARP inserts super slow. You can disable DNS altogether through the config file, but a robust reverse DNS infrastructure adds tremendously to the value of NetDB. At the very least, you can get your administrator to create empty reverse zones for private address space to eliminate the delay.
Once a day, I recommend scheduling a full DNS refresh through cron (around 12 pm or so) to catch any hostname changes. This will do a loop on every entry currently in the ARP table and update it. If your network is constantly in flux, you might want to schedule updates more often, but just keep in mind the amount of DNS traffic that is being generated each time you run this.
New: See the skeletonscraper.pl in netdbscraper/ for a sample script for implementing NetDB on third-party devices.
You do not have to use the netdbscraper to get data into NetDB and can optionally use your own data sources. See the updatenetdb.pl script for information on the file formats you need to use to import your own data. You can choose to import just the arp table data and add the mac table data later.
I recommend you also import the interface status data if you are going to use your own mac table data, but it is optional. If you choose not to use the int status data, set the no_switchstatus variable in netdb.conf to 1 to enable switch reports.
If you have registration data per mac address, this can be imported into NetDB. You will need to get the registration data into a CSV file and have updatenetdb.pl import it with the -n command through netdbctl. See the header of the updatenetdb.pl file to find the format. You also need to set use_nac=1 in netdb-cgi.pl to enable NAC user reports and the display of registration data.
If you are using Bradford Network Access Control, see below.
NetDB has been tested on these platforms and likely works with many other Cisco devices:
-
Cisco 12.2 - 15.0 IOS routers for ARP data -- configure them with netdbnomac to avoid errors about missing mac tables
-
Cisco IOS 7600/6500 SXF and above plus VSS support (No CatOS Support)
-
Cisco 4500 Family (4006/4506/4948)
-
Cisco 3750/3560/2970/2960/2950 (anything using 12.2(35)+ will work for sure, some 12.2(25)SEE+ codes work, but really old versions have bugs in them)
-
Cisco Nexus NX-OS 5000/7000 Switches, 2000 FEXs and 1000v
-
Cisco 2924/3500XL support for those of you with really old devices
-
Cisco ASA/FWSM ARP table support (SSH Access Only, no telnet support)
SSH WARNING: If you are using SSH with 12.2(25)SEE code on 3750s/3560s or below, there is an SSH memory leak that will eventually crash your switches. Please make sure you are using 12.2(35) or above on those devices if you plan to use SSH with them. They won't run out of memory for a few days, but then they will start dropping right and left and pulling the plug is the only way to fix them. Please take this in to account on older software.
SSH Memory Leak Update: There are SSH leaks throughout older IOS software. They don't show up in normal use, but NetDB opens many sessions a day to your devices. Make sure your software is no more than two years old if you want to use SSH. Otherwise, consider restricting telnet access to NetDB only. It is no less secure than SNMP v1/2 if properly configured.
netdbctl.pl - Primary control script to update the database. Controls updates to the database, handles locking and coordination of configuration options of netdbscraper.pl and updatenetdb.pl, usually launched from cron. Reads configuration from /etc/netdb.conf, but options can be overwritten from the CLI options. See netdbctl output.
netdbscraper.pl - Gathers data from routers and switches and saves them to csv files in the data/ directory
updatenetdb.pl - Reads from files in the data/ directory and passes it to NetDB.pm to insert into the database. Also contains options to delete old data from the database.
netdb.pl - Command Line Interface to retrieve data from the database. See netdb output.
netdb.cgi.pl - Web interface, depends on /etc/netdb-cgi.conf and extra/depends/ directory, see install notes below.
NetDB.pm - Primary module with all get, insert and delete methods, used by all other scripts
NetDBHelper.pm - Contains helper routines including ssh/telnet login methods and various scraper functions.
See the netdb-uml.vsd visio file for an extensive overview of the database
Primary tables:
These are good to know about if you want to modify the database, but if you only want to query the database for data, see the section about the views below.
ip - IP is the primary key, holds other ip specific data mac - mac is the primary key, holds all data related specifically to a mac and timestamps ipmac - arp table, includes vlan and hostname data and timestamps switchports - holds switch,port,mac data and timestamps switchstatus - holds switch status data, purged when not seen for 7 days nacreg - holds optional nacreg data transactions - holds user transactions against database for security purposes
Database Views:
All views use one item as the primary key, such as the mac address, ip, switchport etc. Then they extrapolate data from other tables through joins. This makes it easy to get all the data you need with one query and keeps some of the complexity in the database. Since these are only used for reports, it doesn't affect update performance, which is highly transactional.
superarp - Contains all data for a specific IP address or ip,mac combo supermac - Contains all data for a specific mac address superswitch - Shows all data for a switch and/or port and/or mac address combination switchstatus - Not a view, but contains useful data for custom queries