MSTICPy 2.0.0 pre-release 2
New Features
There are several new features in V 2.0.0 of MSTICPy. The major
items include:
- Folium map update - plot a map using multiple layers, custom
icons, colors and tooltips from a single function call.
- Time Series - calculate and display a Time Series anomalies
plot from a single function call.
- Threat Intelligence lookups - individual providers run asynchronously
(simultaneously) making it many times faster to perform lookups
across providers. Lookup progress is also displayed with a progress
bar.
Pre-release documentation for v2.0.0 is on ReadtheDocs
Note: API documentation should be up-to-date but user-guides for new features
are still TBD.
Folium map update
The Folium module in MSTICPy has always been a bit complex to use
since it normally required that you convert IP addresses to MSTICPy
IpAddress entities before adding them to the map. You can now
plot maps with a single function call from a DataFrame containing
IP addresses or location coordinates. You can group the data
into folium layers, specify columns to populate popups and tooltips
and to customize the icons and coloring.
plot_map
A new plot_map function (in the msticpy.vis.foliummap module)
lets you plot mapping points directly from a DataFrame. You can
specify either an ip_column or coordinate columns (lat_column and
long_column). In the former case, the geo location of the IP address
is looked up using the MaxMind GeoLiteLookup data.
You can also control the icons used for each marker with the
icon_column parameter. If you happen to have a column in your
data that contains names of FontAwesome or GlyphIcons icons
you can use that column directly.
More typically you would combine the icon_column parameter with the
icon_map parameter. For icon_map you can specify either a dictionary
or a function. For a dictionary, the value of the row in icon_column
is used as a key - the corresponding value is a dictionary of icon
parameters passed to the folium.Icon class. For a function, the
icon_column value is passed to the function as a single parameter and
the return value should be a dictionary of valid parameters for the
Icon class.
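As a minimal sketch of the two icon_map forms described above: the
column name "role" and its values are hypothetical, and the icon
parameters (color, icon) are keyword arguments of folium.Icon. The
plot_map call is shown only as a comment because it needs msticpy and
a DataFrame with matching columns.

```python
# Dictionary form: keys are values expected in the icon_column,
# values are keyword arguments for folium.Icon.
# The "role" values used here are hypothetical.
ICON_MAP = {
    "attacker": {"color": "red", "icon": "warning-sign"},
    "victim": {"color": "blue", "icon": "user"},
}

# Fallback icon for values that are not in the dictionary.
DEFAULT_ICON = {"color": "gray", "icon": "info-sign"}


def icon_for(value):
    """Function form: map one icon_column value to folium.Icon parameters."""
    return ICON_MAP.get(value, DEFAULT_ICON)


# Intended usage (assumes a DataFrame `df` with "ip" and "role" columns):
# from msticpy.vis.foliummap import plot_map
# plot_map(data=df, ip_column="ip", icon_column="role", icon_map=icon_for)
```

The function form is useful when you want a default icon for values
that are not listed in the dictionary.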
You can read the documentation for this function in the docs.
plot_map pandas accessor
Plot maps from the comfort of your own DataFrame!
Using the msticpy mp_plot accessor you can plot maps directly
from a DataFrame containing IP or location information.
The folium_map function has the same syntax as plot_map
except that you omit the data parameter.
df.mp_plot.folium_map(ip_column="ip", layer_column="CountryName")
Layering, Tooltips and Clustering support
In plot_map and .mp_plot.folium_map you can specify
a layer_column parameter. This will group the data
by the values in that column and create an
individually selectable/displayable layer in Folium for each value.
For performance and sanity reasons this should be a column with a
relatively small number of discrete values.
Clustering of markers in the same layer is also implemented by
default - this will collapse multiple closely located markers
into a cluster that you can expand by clicking or zooming.
You can also populate tooltips and popups with values
from one or more column names.
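One way to check that a column is a sensible choice for layer_column
is to count its distinct values before plotting. The sketch below does
this on plain Python rows; the helper name, the limit of 20 layers and
the sample rows are all assumptions for illustration and are not part
of MSTICPy.

```python
def layer_values(rows, layer_column, max_layers=20):
    """Return the distinct layer values, or raise if there are too many.

    max_layers is an arbitrary illustrative limit, not an MSTICPy setting.
    """
    values = sorted({row[layer_column] for row in rows})
    if len(values) > max_layers:
        raise ValueError(
            f"{layer_column!r} has {len(values)} distinct values; "
            f"expected at most {max_layers} for map layers"
        )
    return values


# Hypothetical rows using documentation-only IP addresses.
rows = [
    {"ip": "192.0.2.1", "CountryName": "Country A"},
    {"ip": "198.51.100.7", "CountryName": "Country B"},
    {"ip": "203.0.113.5", "CountryName": "Country A"},
]
layers = layer_values(rows, "CountryName")

# With a DataFrame, df["CountryName"].nunique() gives the same count, and
# the map itself would be plotted with:
# df.mp_plot.folium_map(ip_column="ip", layer_column="CountryName")
```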
Classic interface
The original FoliumMap class is still there for more manual
control. This has also been
enhanced to support direct plotting from IP, coordinates or GeoHash
in addition to the existing IpAddress and GeoLocation entities.
It also supports layering and clustering.
Threat Intelligence Providers - Async support
When you have configured more than one TI provider, MSTICPy will
execute requests to each of them asynchronously. This will bring big
performance benefits when querying IoCs from multiple providers.
Note: requests to individual providers are still executed synchronously
since we want to avoid swamping provider services with multiple
simultaneous requests.
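The concurrency model described above (providers in parallel, requests
to any one provider in sequence) can be illustrated with a small
asyncio sketch. This is not MSTICPy's implementation: the function
names are hypothetical and asyncio.sleep stands in for the network
request.

```python
import asyncio


async def query_provider(name, iocs, delay=0.01):
    """Look up each IoC in turn for one simulated provider."""
    results = []
    for ioc in iocs:
        # Requests to a single provider are awaited one at a time,
        # so the provider never sees simultaneous requests from us.
        await asyncio.sleep(delay)  # stands in for a network request
        results.append((name, ioc))
    return results


async def lookup_all(providers, iocs):
    """Run all providers concurrently and flatten their results."""
    per_provider = await asyncio.gather(
        *(query_provider(name, iocs) for name in providers)
    )
    return [item for results in per_provider for item in results]


results = asyncio.run(
    lookup_all(["OTX", "RiskIQ"], ["162.244.80.235", "82.118.21.1"])
)
```

With this structure the total time is roughly that of the slowest
provider rather than the sum of all providers.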
We've also implemented progress bar tracking for TILookups, giving a visual
indication of progress when querying multiple IoCs.
Combining the progress tracking with asynchronous operation means
that not only are lookups for lots of observables faster,
but you are also less likely to be left guessing whether or not your
kernel has hung.
TI Providers are now also loaded on demand - i.e. only when you have
a configuration entry in your msticpyconfig.yaml for that provider.
This prevents loading of code (and possibly import errors) due to providers
which you are not intending to use.
Finally, we've added functions to enable and disable providers
after loading TILookup. You can also restrict an individual lookup
to specific providers with the providers parameter:
from msticpy.context import TILookup
ti_lookup = TILookup()
iocs = ['162.244.80.235', '185.141.63.120', '82.118.21.1', '85.93.88.165']
ti_lookup.lookup_iocs(iocs, providers=["OTX", "RiskIQ"])
Time Series pandas accessor
Although the Time Series functionality was relatively simple to
use, it previously required several disconnected steps to compute
the time series, plot the data and extract the anomaly periods. Each of
these needed a separate function import. Now you can do all of these
from a DataFrame via pandas accessors.
(Currently there is a separate accessor, df.mp_timeseries, but we are
still working on consolidating our pandas accessors so this may change
before the final release.)
Because you typically still need these separate outputs, the accessor
has multiple methods:
- df.mp_timeseries.analyze - takes a time-summarized DataFrame
and returns the results of a time-series decomposition
- df.mp_timeseries.plot - takes a decomposed time-series and
plots the anomalies
- df.mp_timeseries.anomaly_periods - extracts anomaly periods
as a list of time ranges
- df.mp_timeseries.kql_periods - extracts anomaly periods
as a list of KQL query clauses
- df.mp_timeseries.apply_threshold - applies a new anomaly
threshold score and returns the results.
Analyze data to produce time series.
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze()
Analyze and plot time series anomalies
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze().mp_timeseries.plot()
Analyze and retrieve anomaly time ranges
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze().mp_timeseries.anomaly_periods()
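To illustrate what "anomaly periods as a list of time ranges" means,
here is a small standalone sketch that merges consecutive anomalous
timestamps into (start, end) ranges. It is not MSTICPy's
implementation: the function name, the one-hour step and the sample
flags are assumptions, and the input is expected to be sorted by time.

```python
from datetime import datetime, timedelta


def anomaly_ranges(points, step=timedelta(hours=1)):
    """Merge consecutive anomalous timestamps into (start, end) ranges.

    points is a time-ordered list of (timestamp, is_anomaly) tuples.
    """
    ranges = []
    for timestamp, is_anomaly in points:
        if not is_anomaly:
            continue
        if ranges and timestamp - ranges[-1][1] <= step:
            # Adjacent to the previous anomaly: extend the current range.
            ranges[-1] = (ranges[-1][0], timestamp)
        else:
            # Gap since the last anomaly: start a new range.
            ranges.append((timestamp, timestamp))
    return ranges


# Hypothetical hourly data with anomalies at hours 1, 2 and 5.
start = datetime(2022, 5, 1)
flags = [False, True, True, False, False, True]
points = [(start + timedelta(hours=i), flag) for i, flag in enumerate(flags)]
periods = anomaly_ranges(points)
```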
In next pre-release
Plot networks (graphs) directly from a DataFrame
One frequently-requested feature is the ability to easily plot
networks from data. For example, you may want to view the interactions
between account names and IP addresses. This feature uses
NetworkX to build the graph and
Bokeh to plot it.
Note: The graph has the usual Bokeh interactivity - zooming, panning, selecting,
hover-over tooltips. It does not allow you to move individual
nodes and interactively recalculate the layout. For the
latter, you can use this functionality to build a networkx graph
and plot using something like GraphViz or PyViz.
The network plot will give you two functions:
- df.mp.to_graph - converts a DataFrame to a networkx graph
- df.mp_plot.network - creates and plots the graph in a single step.
(There is also a separate function msticpy.vis.network_plot.plot_nx_graph
that will just do the NX -> plot operation.)
You can specify the columns to use as source and target. An edge
is created between source and target when the two occur
in the same row (or in more than one row). You can also
specify columns to use as node and edge attributes.
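The edge rule above can be sketched without any graph library: each
distinct (source, target) pair that appears in a row becomes one edge,
however many rows contain it. The sample rows and the helper name are
hypothetical, and counting the contributing rows is only an
illustration of how a row-count edge weight could be derived.

```python
from collections import Counter

# Hypothetical rows: account names as source, IP addresses as target.
rows = [
    {"account": "alice", "ip": "10.0.0.1"},
    {"account": "alice", "ip": "10.0.0.1"},
    {"account": "bob", "ip": "10.0.0.1"},
]


def edge_counts(rows, source_col, target_col):
    """Count how many rows contribute to each (source, target) edge."""
    return Counter((row[source_col], row[target_col]) for row in rows)


edges = edge_counts(rows, "account", "ip")
# Every value that appears at either end of an edge is a node.
nodes = {node for edge in edges for node in edge}
```

Here the two "alice" rows collapse into a single edge with a count of
2, and the graph has three nodes: two accounts and one IP address.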
To Do items
We intend to add the following before release:
- allow you to specify the networkx layout algorithm to use
(currently it uses the default spring_layout)
- assign edge weight attribute based on number of rows contributing
to an edge
MS Sentinel Workspaces API
This API lets you query and resolve details for Sentinel workspaces.
It is integrated into the MpConfigEdit and MpConfigFile utilities
to let you look up workspace details when you are editing your
settings:
- paste in a URL from the Sentinel Azure portal to populate workspace settings
- or resolve full details from partial workspace details such as the workspace ID.
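As a rough illustration of how workspace settings can be recovered
from a portal URL, the sketch below extracts the subscription ID,
resource group and workspace name from the standard Azure resource ID
path embedded in a URL. This is not the MSTICPy implementation: the
function name is hypothetical and the example URL is made up.

```python
import re
from urllib.parse import unquote

# Standard Azure resource ID layout for a Log Analytics workspace.
_WORKSPACE_RES_ID = re.compile(
    r"/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourcegroups/(?P<resource_group>[^/]+)"
    r"/providers/microsoft\.operationalinsights"
    r"/workspaces/(?P<workspace_name>[^/?#]+)",
    re.IGNORECASE,
)


def parse_workspace_url(url):
    """Return workspace details found in the URL, or None if absent."""
    # Portal URLs often carry the resource ID percent-encoded.
    match = _WORKSPACE_RES_ID.search(unquote(url))
    return match.groupdict() if match else None


# Made-up URL for illustration only.
url = (
    "https://portal.azure.com/#blade/Example/id/"
    "%2Fsubscriptions%2F00000000-0000-0000-0000-000000000000"
    "%2FresourceGroups%2Fmy-rg%2Fproviders%2FMicrosoft.OperationalInsights"
    "%2Fworkspaces%2Fmy-workspace"
)
details = parse_workspace_url(url)
```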
Other important fixes
The API details for most of the MSTICPy functions were not being
generated - this should now be fixed.
What's Changed (GitHub PR Summary)
- Added pd accessor for time series functions. by @ianhelle in #381
- Added new Sentinel Search Features - merge from main by @ianhelle in #380
- Ianhelle/ti async lookup 2022 04 27 by @ianhelle in #383
- Ianhelle/folium accessor 2022 04 30 by @ianhelle in #384
- Updated tweet action to include more details by @petebryan in #406
- Add Device Code fallback option for when interactive auth isn't available. by @petebryan in #401
- Adding OData Delegated Auth Support into 2.0 by @petebryan in #410
- Removed plaintext token cache from MSAL auth and replaced it with fall back to in memory caching by @petebryan in #414
- Ianhelle/kql nbinit fixes merge2.0 2022 05 18 by @ianhelle in #412
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #421
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #422
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #423
Full Changelog: v2.0.0.rc1...v2.0.0.rc2