tmcrawl

A Tendermint p2p crawling utility and API.

The tmcrawl utility will capture geolocation information and node metadata such as network name, node version, RPC information, and node ID for each crawled node. The utility will first start with a set of seeds and attempt to crawl as many nodes as possible from those seeds. When there are no nodes left to crawl, tmcrawl will pick a random set of nodes from the known list of nodes to reseed the crawl every crawl_interval seconds from the last attempted crawl finish.

Nodes are persisted in a key/value embedded database, by default BadgerDB. Saved nodes will also be periodically rechecked every recheck_interval. If any node cannot be reached, it'll be removed from the known set of nodes.

Things to Note

tmcrawl is a Tendermint p2p network crawler, it does not operate as a seed node or any other type of node.
Tendermint nodes can have peers across multiple networks, so there is no guarantee that all the nodes found by the tool run in the same network. For that, pay attention to the network attribute of the DefaultNodeInfo tendermint type.
tmcrawl uses the Tendermint RPC port (26657) to get ResultStatus, from where it extracts network and peer information. Since having the Tendermint RPC port open is optional, there is no guarantee that tmcrawl will crawl all the nodes in a cosmos network. The number of nodes crawled for a network depends on how many nodes have the RPC port open and which peers they connect to.
Independent development of tmcrawl can be found (here)[https://github.com/theSamPadilla/tmcrawl]. Original repo where can be found (here)[https://github.com/fissionlabsio/tmcrawl].

Install

tmcrawl takes a simple configuration. It needs to only know about a an ipstack API access key and an initial set of seed nodes. See config.toml for reference.

To install the binary:

$ make install

Note: Requires Go 1.13+

Usage

tmcrawl runs as a daemon process and exposes a RESTful JSON API.

To start the binary:

$ tmcrawl </path/to/config.toml> [flags]

The RESTful JSON API is served over listen_addr as provided in the configuration. See --help for further documentation.

API

All API documentation is hosted via Swagger UI under path /swagger/.

Future Improvements

Crawl integration tests
Front-end visualization

Contributing

Contributions are welcome! Please open an Issues or Pull Request for any changes.

License

CCC0 1.0 Universal

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
analysis		analysis
build		build
cmd		cmd
config		config
crawl		crawl
db		db
scripts		scripts
server		server
test		test
.gitignore		.gitignore
.travis.yml		.travis.yml
CHANGELOG.md		CHANGELOG.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
config.toml		config.toml
go.mod		go.mod
go.sum		go.sum
main.go		main.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

tmcrawl

Things to Note

Install

Usage

API

Future Improvements

Contributing

License

About

Releases

Packages

Languages

License

theSamPadilla/tmcrawl

Folders and files

Latest commit

History

Repository files navigation

tmcrawl

Things to Note

Install

Usage

API

Future Improvements

Contributing

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages