Releases: microsoft/msticpy
Bokeh, ipywidgets version restrictions
The main driver for this release is to restrict versions of bokeh, ipywidgets and pandas.
- Version 3.0.0 of bokeh plots has some breaking changes that prevent it working with MSTICPy
- Version 8.0.0 of ipywidgets has changes that prevent some of the MSTICPy compound widgets displaying correctly.
We also decided to start restricting versions of some of our other dependencies to the current major version - to prevent unexpected breaking changes stopping MSTICPy from working. We have included pandas in this list and will expand it to cover more packages in future. We will combine this with an automated build job that has no version restrictions so that we're aware of version changes that we need to address. The intent here is to have MSTICPy have as broad a version range as possible for its dependencies while still avoiding failures due to breaking changes.
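A minimal sketch of how an exclusive upper bound on a dependency behaves, assuming plain numeric X.Y.Z version strings. The helper names and the CAPS mapping are hypothetical and only mirror the bokeh and ipywidgets limits described above. This is not MSTICPy's packaging code; real version specifiers are best left to pip or the packaging library.

```python
def parse_version(text):
    """Parse a plain 'X.Y.Z' version string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_allowed(installed, upper_bound):
    """Return True if the installed version is below the exclusive cap."""
    return parse_version(installed) < parse_version(upper_bound)


# Hypothetical caps mirroring the restrictions described in this release.
CAPS = {"bokeh": "3.0.0", "ipywidgets": "8.0.0"}

for package, installed in [("bokeh", "2.4.3"), ("bokeh", "3.0.0"), ("ipywidgets", "7.7.2")]:
    status = "ok" if is_allowed(installed, CAPS[package]) else "blocked"
    print(f"{package} {installed}: {status}")
```

Comparing tuples of integers avoids the string-comparison trap where "10.0.0" sorts before "9.0.0".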
Another small but important change is an update to the Process Tree viewer to allow process GUIDs as process IDs (rather than just hex or decimal format integers). Thanks to @nbareil for this change!
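To illustrate the three process ID formats mentioned above (decimal, hex and GUID), here is a rough sketch using only the standard library. The function name classify_process_id is hypothetical and this is not the code from the Process Tree viewer; it only shows one way such values could be told apart.

```python
import uuid


def classify_process_id(value):
    """Classify a process ID string as 'decimal', 'hex', 'guid' or 'unknown'."""
    text = str(value).strip()
    if text.isdigit():
        return "decimal"
    if text.lower().startswith("0x"):
        try:
            int(text, 16)  # base 16 accepts the 0x prefix
            return "hex"
        except ValueError:
            return "unknown"
    try:
        uuid.UUID(text)  # accepts hyphenated and brace-wrapped GUIDs
        return "guid"
    except ValueError:
        return "unknown"


for sample in ["1234", "0x4d2", "{B5D6F2A1-3C4E-4F5A-9B7C-1D2E3F4A5B6C}", "not-a-pid"]:
    print(sample, "->", classify_process_id(sample))
```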
What's Changed
- process_tree: Accept GUID format for ProcessID and ParentProcessID by @nbareil in #542
- Bump sphinx from 5.1.1 to 5.3.0 by @dependabot in #540
- Bump readthedocs-sphinx-ext from 2.1.9 to 2.2.0 by @dependabot in #545
- Update AzureBlobStorage.rst by @garybushey in #539
- Adding upper version restrictions to bokeh, pandas and ipywidgets deps by @ianhelle in #552
New Contributors
- @garybushey made their first contribution in #539
Full Changelog: v2.1.4...v2.1.5
Fixes for MS Sentinel API and configuration
Some minor fixes and improvements:
- MicrosoftSentinel class now defaults to "Default" workspace or workspace name supplied as workspace parameter when connecting.
sentinel = MicrosoftSentinel()
sentinel.connect() # connect to "Default" workspace
sentinel.connect(workspace="MyWorkspace") # connect to named workspace
- Sentinel create_* APIs now return ID of new item (incident, bookmark, analytic, watchlist)
- init_notebook - now accepts config parameter to use custom msticpyconfig.yaml for notebook session (overrides environment variable and other defaults)
import msticpy as mp
mp.init_notebook(config="~/configs/all_ti_provs.yaml") # use a custom msticpy config file.
- Sentinel configuration editor no longer throws an exception if named control not found
- Sentinel TI provider will not attempt lookups if ThreatIntelligenceIndicator table not found in the Sentinel data provider schema
- Support for Kusto/Azure Data explorer settings in Settings editor
- Added checked_kwargs decorator to utility/types.py
What's Changed
- Ianhelle/training hotfixes 2022 10 13 by @ianhelle in #543
- Updated ReadMe with Blackhat Arsenal Tag by @petebryan in #521
Full Changelog: v2.1.3...v2.1.4
Process Tree Viewer updates
Highlights
This is a minor release with some fixes and additions that enable broader functionality.
The biggest-impacting changes apply to the Process Tree visualization.
These changes allow it to work with broader types of Windows or Linux process data:
- Removed the following columns that were previously required: host_name, logon_id, user_name, cmd_line.
- Added auto-coloring by level if no legend is supplied.
- Fixed process sorting so that tree and peer groups in the tree are sorted by level, then timestamp.
- Added ability to supply schema as dictionary to the process tree APIs.
The changes are described in more detail below.
We've also added support for a new MS Sentinel API to retrieve queries stored in a Sentinel workspace
and fixed some issues in IP WhoIs lookups.
Process Tree changes
Reduced required column set
This allows you to use the process tree visualization and utilities with a minimal set of data fields:
- process_id
- parent_id
- process_name
- time_stamp
cust_schema = {
"process_name": "ImageFileName",
"process_id": "PID",
"parent_id": "PPID",
"time_stamp": "CreateTime",
}
df.mp_plot_process(schema=cust_schema)
Auto-coloring of tree plot
If you do not supply a legend_col parameter, the process objects will be
automatically colored by level in the hierarchy. This makes a basic tree more colorful and easier to navigate.
Processes are correctly sorted by process time
Previously, the code that builds the process tree left individual processes in an unintuitive order.
For a given level (e.g. parents) all of the processes will be displayed in time created order.
For example:
A \
- A.1
- A.2
B \
- B.1
- B.2
A will always have a timestamp less than or equal to B. All children of A (A.1, A.2...) and B will be shown in
time created order. However, across different levels and peer groups, there is no guarantee of time-ordering. In the above example, even though timestamp A is less than timestamp B, B.1 and B.2 could have timestamps earlier than either A.1 or A.2.
proc_key | path | ImageFileName | CreateTime |
---|---|---|---|
registry\|88\|2021-04-01 05:04:54.000000 | 116/0 | Registry | 2021-04-01 05:04:54+00:00 |
system\|4\|2021-04-01 05:04:58.000000 | 117/1 | System | 2021-04-01 05:04:58+00:00 |
smss.exe\|404\|2021-04-01 05:04:58.000000 | 117/1/2 | smss.exe | 2021-04-01 05:04:58+00:00 |
csrss.exe\|640\|2021-04-01 05:05:00.000000 | 118/3 | csrss.exe | 2021-04-01 05:05:00+00:00 |
winlogon.exe\|700\|2021-04-01 05:05:00.000000 | 118/4 | winlogon.exe | 2021-04-01 05:05:00+00:00 |
dwm.exe\|1028\|2021-04-01 05:05:02.000000 | 118/4/17 | dwm.exe | 2021-04-01 05:05:02+00:00 |
logonui.exe\|512\|2021-04-01 05:05:02.000000 | 118/4/21 | LogonUI.exe | 2021-04-01 05:05:02+00:00 |
fontdrvhost.ex\|960\|2021-04-01 05:05:01.000000 | 118/4/7 | fontdrvhost.ex | 2021-04-01 05:05:01+00:00 |
wininit.exe\|632\|2021-04-01 05:05:00.000000 | 119/5 | wininit.exe | 2021-04-01 05:05:00+00:00 |
lsass.exe\|776\|2021-04-01 05:05:01.000000 | 119/5/10 | lsass.exe | 2021-04-01 05:05:01+00:00 |
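The ordering rule described above (peers sorted by creation time within their own group, with no ordering guarantee across groups) can be sketched roughly as follows. This is an illustrative stand-alone example, not the MSTICPy implementation: order_process_tree and the integer timestamps are assumptions made for the sketch.

```python
from collections import defaultdict


def order_process_tree(procs):
    """Return (level, name) pairs depth-first, with siblings sorted by timestamp.

    procs: list of (name, parent_name, timestamp); parent_name is None for roots.
    """
    children = defaultdict(list)
    for name, parent, stamp in procs:
        children[parent].append((stamp, name))

    ordered = []

    def walk(parent, level):
        # Sort only within one peer group (children of the same parent).
        for _stamp, name in sorted(children[parent]):
            ordered.append((level, name))
            walk(name, level + 1)

    walk(None, 0)
    return ordered


# Illustrative data: B.1 and B.2 are older than A.1 and A.2.
procs = [
    ("B", None, 20), ("A", None, 10),
    ("A.2", "A", 40), ("A.1", "A", 30),
    ("B.2", "B", 25), ("B.1", "B", 22),
]
for level, name in order_process_tree(procs):
    print("  " * level + name)
```

In this sample A is listed before B because its timestamp is lower, yet B.1 (22) is older than A.1 (30), which matches the caveat about ordering across peer groups.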
mp_plot.process_tree and mp.build_process_tree support schema as dictionary
Previously these accessors and the underlying functions plot_process_tree and build_process_tree
would only accept msticpy.transform.process_tree_schema.ProcSchema instances.
These will now accept dictionaries with at least the minimum required attributes as keys.
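Before passing a dictionary schema, it can be handy to confirm that it covers the minimum fields listed earlier in these notes. The following is a small sketch of such a check; missing_schema_fields and REQUIRED_FIELDS are hypothetical names and are not part of the MSTICPy API.

```python
# Minimum fields listed in these release notes for the process tree.
REQUIRED_FIELDS = ("process_id", "parent_id", "process_name", "time_stamp")


def missing_schema_fields(schema):
    """Return the required schema keys that the dictionary does not supply."""
    return [field for field in REQUIRED_FIELDS if field not in schema]


cust_schema = {
    "process_name": "ImageFileName",
    "process_id": "PID",
    "parent_id": "PPID",
    "time_stamp": "CreateTime",
}
print(missing_schema_fields(cust_schema))
print(missing_schema_fields({"process_id": "PID"}))
```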
What's Changed
- Sentinel - Return all saved queries by @petebryan in #519
- Bump readthedocs-sphinx-ext from 2.1.8 to 2.1.9 by @dependabot in #507
- Bump respx from 0.19.2 to 0.20.0 by @dependabot in #512
- Allow process tree to work with more data sources. by @ianhelle in #513
- Fixed error in cell using non-existing column name by @ianhelle in #527
- Ianhelle/proc tree fixes 2022 09 16 by @ianhelle in #530
- Fixed issue with whois lookups on only local IPs by @petebryan in #506
Full Changelog: v2.1.2...v2.1.3
Hotfix - Azure authentication failure
A last-minute change before release of 2.1.0 introduced a critical bug in azure_auth_core.py.
This caused all Azure authentication to fail. It would also cause init_notebook()
to fail if the user had any Key Vault secrets referenced in their msticpyconfig.yaml.
Thanks to @FlorianBracq for spotting this independently (and before us) and submitting a PR with the fix.
The PR below is essentially the same fix as Florian's with a subtle change to allow an EnvironmentCredential of None to appear in the list of creds sent to ChainedTokenCredential. This is to cover an edge case where EnvironmentCredential is requested but the required environment variables are not set.
What's Changed
- [fix] bug in call to ChainTokenCredential breaks all authentication by @ianhelle in #505
- Rolling back change on _build_chained_creds by @FlorianBracq in #504
Full Changelog: v2.1.1...v2.1.2
Hotfix - missing beautifulsoup4 from requirements/dependencies
We inadvertently took a hard dependency on beautifulsoup4 but didn't have it in our dependencies.
Unfortunately, since bs4 is in our test dependencies, this passed all the tests, so we didn't spot it until later.
What's Changed
Full Changelog: v2.1.0...v2.1.1
IpWhois, Malware Bazaar, Azure Auth, Azure Synapse
Highlights
Replaced dependency on IPWhois with local code #479
The ipwhois package seems to be abandoned and was causing conflicts with dnspython. We've
created equivalent functionality in msticpy removing build warnings and (minutely) speeding
up install time. We've also added a MSTICPy pandas accessor df.mp.whois() so that you can
do bulk queries from a dataframe.
Malware Bazaar TI Provider #459
Many thanks to @fr0gger for this.
Check out the notebook MBLookup to
see how you can use this new provider.
Documentation on how to build a Data Provider #465
This was previously a blog post but we've added it to the official docs - Writing and Contributing a Data Provider
Updates to Azure authentication to support more authentication types #484
We've switched from using DefaultAzureCredential to supporting the native credential types.
This lets us support additional credential types such as Client Secret and Certificate authentication.
You can also create your own custom AzureCredential and pass this to az_connect.
Updates to SQL2Kql converter #488
This was really prompted by @tonybaloney in helping us get a build working on Python 3.8-3.11. This
module had a dependency on a now-deprecated moz_sql_parser. We've updated to use mo_sql_parsing - many thanks to @klahnakoski for work on keeping this alive and well.
Our module also contains some fixes and enhancements from the original.
Builds and tests now running on Python 3.8, 3.9, 3.10 #476
We were previously only building on Python 3.8. Huge thanks to @tonybaloney for working on this and bringing us
into the modern era. We still have some issues with Python 3.11dev - although this is due to SciPy breaking with
the last 3.11 version we tried. As soon as this is sorted we will add 3.11 back.
Added support for msticpy notebooks in Azure Synapse pipelines #493
This is mostly work done to support MS Sentinel running unattended notebooks in Synapse pipelines.
We've extended the mp.init_notebook() function so that it can correctly configure msticpy (looking for
msticpyconfig.yaml on a mounted blob storage container and persisting cached data there),
use the linked Key Vault to store secrets and supply service principal credentials to msticpy.
Important fixes
- Allow for missing columns in Folium map data frame #489
- Updated M365D/MDE driver to pass query request with JSON encoding #498. Defender have always supported this but we were sending a JSON string, which they recently stopped supporting. This should be working again.
- You can now see data query help before connecting to the data provider. It's also possible to dry run the query to see the full query with parameters replaced without needing to connect. #482
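The idea behind a dry run is that parameter substitution happens locally, so no connection is needed to see the final query text. The sketch below illustrates that idea only. render_query, TEMPLATE and DEFAULTS are hypothetical and do not reflect how MSTICPy query providers are implemented.

```python
def render_query(template, defaults, **params):
    """Fill a query template locally; nothing is sent to a data source."""
    values = {**defaults, **params}  # explicit parameters override defaults
    return template.format(**values)


# Hypothetical query template and defaults, for illustration only.
TEMPLATE = "SecurityEvent | where Computer == '{host_name}' | take {limit}"
DEFAULTS = {"limit": 100}

print(render_query(TEMPLATE, DEFAULTS, host_name="WORKSTATION1"))
```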
What's Changed
- Azure ML init fix by @FlorianBracq in #471
- Sumologic-DataConnector.ipynb: fix s/url=/connection_str=/ by @juju4 in #436
- Fix typo in parameter name by @FlorianBracq in #470
- Update jinja2 requirement from <3.1.0 to <3.2.0 by @dependabot in #450
- Update docutils requirement from <0.18.0 to <0.20.0 by @dependabot in #464
- Ianhelle/writing data provs doc 2022 03 14 by @ianhelle in #465
- IP Whois and Azure Auth Fixes by @petebryan in #479
- Bump sphinx from 5.0.2 to 5.1.1 by @dependabot in #478
- Update UploadData.rst with fix for import by @pensivepaddle in #483
- [update] DNS resolver return dataframe with one IP per row by @ianhelle in #485
- Adding Malware Bazaar module as TI provider by @fr0gger in #459
- Allow query help when qry provider not connected by @ianhelle in #482
- Adding all supported auth types to config UI mpconfig_defaults.yaml by @ianhelle in #484
- Fixing build issues with requirements-dev and doc by @ianhelle in #490
- [fix] Updated SQL to KQL converter to use mo_sql_parsing by @ianhelle in #488
- [fix] Allow for missing columns in Folium GeoIP data. by @ianhelle in #489
- Add support for Python 3.10, 3.11 and test in CI by @tonybaloney in #476
- Revert "Add support for Python 3.10, 3.11 and test in CI" by @petebryan in #494
- Add support for Python 3.10, 3.11 and test in CI by @petebryan in #495
- Added azure_synapse_tools to support notebooks in Synapse by @ianhelle in #493
- Changing the MDE/M365D request content to json encoding. by @ianhelle in #498
- Fixes and updates to support notebooklet updates by @petebryan in #497
- Fix breaking issues in Auth and Browshot by @petebryan in #499
New Contributors
- @juju4 made their first contribution in #436
- @tonybaloney made their first contribution in #476
Full Changelog: v2.0.0...v2.1.0
MSTICPy Version 2.0
MSTICPy Release 2.0
A notebook containing some of the features of MSTICPy 2.0
is available at What's new in MSTICPy 2.0
If you are new to MSTICPy or just want to catch up and get a quick
overview check out our new MSTICPy Quickstart Guide.
Contents
- Dropping Python 3.6 support
- Package re-organization and module search
- Simplifying imports in MSTICPy
- Folium map update - single function, layers, custom icons
- Threat Intelligence providers - async support
- Time Series simplified - analysis and plotting
- DataFrame to graph/network visualization
- Pivots - easy initialization/dynamic data pivots
- Consolidating Pandas accessors
- MS Sentinel workspace configuration
- MS Defender queries available in MS Sentinel QueryProvider
- Microsoft Sentinel QueryProvider
- New queries
- Documentation Additions and Improvements
- Miscellaneous improvements
- Previous feature changes since MSTICPy 1.0
Dropping Python 3.6 support
As of this release we only officially support Python 3.8 and above.
We will try to support Python 3.6 if the fixes required are small
and contained but make no guarantees of it working completely on
Python prior to 3.8.
Package re-organization and module search
One of our main goals for V2.0.0 was to re-organize MSTICPy to be more logical and easier to
use and maintain. Several years of organic growth had seen modules created in places that
seemed like a good idea at the time but did not age well.
The discussion about the V2 structure can be found here #320.
Due to the re-organization, many features are no longer in places
where they used to be imported from!
We have tried to maintain compatibility with old locations by adding "glue" modules.
These allow import of many modules from their previous locations but will issue a
Deprecation warning if loaded from the old location.
The warning will contain the new location of the module -
so you should update your code to point to this new location.
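As a rough illustration of what such a "glue" module does, the sketch below emits a DeprecationWarning that names the new location. The warn_moved helper is hypothetical and the module paths are examples chosen to match the folder table in these notes; the actual MSTICPy glue modules may be written differently.

```python
import warnings


def warn_moved(old_location, new_location):
    """Emit a DeprecationWarning pointing at a module's new location."""
    warnings.warn(
        f"'{old_location}' has moved to '{new_location}'. "
        "Please update your import to the new location.",
        DeprecationWarning,
        stacklevel=2,
    )


# Capture the warning so that it can be inspected rather than just printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_moved("msticpy.sectools.geoip", "msticpy.context.geoip")

message = str(caught[0].message)
print(caught[0].category.__name__, "-", message)
```

Python hides DeprecationWarning by default in many contexts, which is why the sketch switches the filter to "always" before triggering it.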
This table gives a quick overview of the V2.0 structure
folder | description |
---|---|
analysis | data analysis functions - timeseries, anomalies, clustering |
auth | authentication and secrets management |
common | commonly used utilities and definitions (e.g. exceptions) |
config | configuration and settings UI |
context | enrichment modules geoip, ip_utils, domaintools, tiproviders, vtlookup |
data | data acquisition/queries/storage/uploaders |
datamodel | entities, soc objects |
init | package loading and initialization - nbinit, pivot modules |
nbwidgets | nb widgets modules |
transform | simple data processing - decoding, reformatting, schema change, process tree |
vis | visualization modules including browsers |
Notable things that have moved:
- most things from the sectools folder have migrated to context, transform or analysis
- most things from the nbtools folder have migrated to:
  - msticpy.init (not to be confused with __init__) - package initialization
  - msticpy.vis - visualization modules
- pivot functionality has moved to msticpy.init
Module Search
If you are having trouble finding a module, we have added a simple search function:
import msticpy as mp
mp.search("riskiq")
Matches will be returned in a table with links to the module documentation
Modules matching 'riskiq'
Module | Help |
---|---|
msticpy.context.tiproviders.riskiq | msticpy.context.tiproviders.riskiq |
Simplifying imports in MSTICPy
The root module in MSTICPy now has several modules and
classes that can be directly accessed from it (rather than
having to import them individually).
We've also decided to adopt a new "house style" of importing msticpy as the alias mp.
Slavishly copying the idea from some of the admired packages that we use
(pandas -> pd, numpy -> np, networkx -> nx) we thought it would save
a bit of typing. You are free to adopt or ignore this style -
it obviously has no impact on the functionality.
import msticpy as mp
mp.init_notebook()
qry_prov = mp.QueryProvider("MDE")
ti = mp.TILookup()
Many commonly-used classes and functions are exposed as
attributes of msticpy (or mp).
Also a number of commonly-used classes are imported by default
by init_notebook, notably all of the entity classes.
This makes it easier to use pivot functions without any initialization
or import steps.
import msticpy as mp
mp.init_notebook()
# IpAddress can be used without having to import it.
IpAddress.whois("123.45.6.78")
init_notebook improvements
- You no longer need to supply the namespace=globals() parameter when calling from a notebook. init_notebook will automatically obtain the notebook global namespace and populate imports into it.
- The default verbosity of init_notebook is now 0, which produces minimal output - use verbosity=1 or verbosity=2 to get more detailed reporting.
- The Pivot subsystem is automatically initialized in init_notebook.
- All MSTICPy entities are imported automatically.
- All MSTICPy magics are initialized here.
- Most MSTICPy pandas accessors are initialized here (some, which require optional packages, such as the timeseries accessors are not initialized by default).
- init_notebook supports a config parameter - you can use this to provide a custom path to a msticpyconfig.yaml overriding the usual defaults.
- searching for a config.json file is only enabled if you are running MSTICPy in Azure Machine Learning.
Folium map update - single function, layers, custom icons
The Folium module in MSTICPy has always been a bit complex to use
since it normally required that you convert IP addresses to MSTICPy
IpAddress entities before adding them to the map.
You can now
plot maps with a single function call from a DataFrame containing
IP addresses or location coordinates. You can group the data
into folium layers, specify columns to populate popups and tooltips
and to customize the icons and coloring.
plot_map
A new plot_map function (in the msticpy.vis.foliummap module)
lets you plot mapping points directly from a DataFrame. You can
specify either an ip_column or coordinates columns (lat_column and long_column).
In the former case, the geo location of the IP address
is looked up using the MaxMind GeoLiteLookup data.
You can also control the icons used for each marker with the
icon_column parameter. If you happen to have a column in your
data that contains names of FontAwesome or GlyphIcons icons
you can use that column directly.
More typically, you would combine the icon_column with the icon_map
parameter. You can specify either a dictionary or a
function. For a dictionary, the value of the row in icon_column
is used as a key - the value is a dictionary of icon parameters
passed to the Folium.Icon class. For a function, the icon_column
value is passed to the function as a single parameter and the return value
should be a dictionary of valid parameters for the Icon class.
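The two forms of icon_map described above (dictionary or function) could be resolved along these lines. This is a stand-alone sketch of the lookup logic only: resolve_icon_params is a hypothetical helper, the sample values are invented, and returning an empty dictionary for an unknown key is an assumption, not documented MSTICPy behavior.

```python
def resolve_icon_params(icon_value, icon_map):
    """Turn one icon_column value into a dict of icon parameters."""
    if callable(icon_map):
        # Function form: the column value is passed as the single argument.
        return icon_map(icon_value)
    # Dictionary form: the column value is used as the lookup key.
    # Falling back to {} for unknown keys is an assumption of this sketch.
    return icon_map.get(icon_value, {})


# Hypothetical mapping from a severity-style column to icon parameters.
icon_dict = {
    "high": {"color": "red", "icon": "warning-sign"},
    "low": {"color": "green", "icon": "info-sign"},
}


def icon_func(value):
    """Hypothetical function form: choose a color from the column value."""
    return {"color": "red" if value == "high" else "blue"}


print(resolve_icon_params("high", icon_dict))
print(resolve_icon_params("low", icon_func))
```

The function form is useful when the column holds many distinct values that follow a rule, while the dictionary form suits a small fixed set of categories.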
You can read the documentation for this function in the docs.
plot_map pandas accessor
Plot maps from the comfort of your own DataFrame!
Using the msticpy mp_plot accessor you can plot maps directly
from a DataFrame containing IP or location information.
The folium_map function has the same syntax as plot_map
except that you omit the data parameter.
df.mp_plot.folium_map(ip_column="ip", layer_column="CountryName")
Layering, Tooltips and Clustering support
In plot_map and .mp_plot.folium_map you can specify
a layer_column parameter. This will group the data
by the values in that column and create an
individually selectable/displayable layer in Folium. For performance
and sanity reasons this should be a column with a relatively
small number of discrete values.
Clustering of markers in the same layer is also implemented by
default - this will collapse multiple closely located markers
into a cluster that you can expand by clicking or zooming.
You can also populate tooltips and popups with values
from one or more column names.
"Classic" interface
The original FoliumMap class is still there for more manual
control. This has also been
enhanced to support direct plotting from IP, coordinates or GeoHash
in addition to the existin...
MSTICPy 2.0 - Pre-release 3
New Features
A notebook containing some of the features of MSTICPy 2.0
is available at What's new in MSTICPy 2.0
Dropping Python 3.6 Support
As of this release we only officially support Python 3.8 and above.
We will try to support Python 3.6 if the fixes required are small
and contained but make no guarantees of it working completely on
Python prior to 3.8
DataFrame to Graph/Network
You can convert a pandas DataFrame into a NetworkX graph or
plot directly as a graph using Bokeh interactive plotting.
You pass the functions the column names for the source and target nodes to build a basic graph. You can also name other columns to be node or edge attributes. When displayed these attributes are visible as popup details courtesy of Bokeh’s Hover tool.
proc_df.head(100).mp_plot.network(
source_col="SubjectUserName",
target_col="Process",
source_attrs=["SubjectDomainName", "SubjectLogonId"],
target_attrs=["NewProcessName", "ParentProcessName", "CommandLine"],
edge_attrs=["TimeGenerated"],
)
Pivots without initialization/dynamic data query import
The pivot functionality has been overhauled - it is now initialized
automatically in init_notebook.
Previously queries from
data providers were added at initialization - meaning that you had
to create your query providers before starting pivot or re-initialize
pivot. Data providers now dynamically add relevant queries as pivot
functions when you authenticate. Also for some providers, such
as Azure Sentinel, that support multiple instances, pivot now
supports separate instance naming so that each Workspace has a
separate instance of a given pivot query.
The naming of the Threat Intelligence pivot functions has been
simplified considerably.
VirusTotal and RiskIQ relationships should now be available as
pivot functions (you need the VT 3 and PassiveTotal packages installed
respectively to enable this functionality).
Simplify imports in msticpy
The root module in msticpy now has several modules and
classes that can be directly accessed from it (rather than
having to import them)
import msticpy as mp
mp.init_notebook()
qry_prov = mp.QueryProvider("MDE")
ti = mp.TILookup()
Also a number of commonly-used classes are imported by default
by init_notebook, notably all of the entity classes.
This makes it easier to use pivot functions without any initialization
or import steps.
- entities
import msticpy as mp
mp.init_notebook()
IpAddress.whois("123.45.6.78")
Consolidation of Pandas accessors
Pandas accessors are extensions to DataFrames allowing you to
call custom functionality as a DataFrame method.
Almost all of the core MSTICPy functions previously available in
various accessors (plus a few new ones) are accessible in:
- df.mp - analysis and transform functions
- df.mp_plot - visualization functions
df.mp.ioc_extract(...)
df.mp.to_graph(...)
df.mp.mask(...)
df.mp_plot.timeline(...)
df.mp_plot.timeline_values(...)
df.mp_plot.process_tree(...)
df.mp_plot.network(...)
df.mp_plot.folium_map(...)
MS Defender Queries available to MS Sentinel Query Provider
Since Sentinel now has the ability to import Microsoft data, we've
made the Defender queries usable from the MS Sentinel provider.
Many of these queries are now available as Pivot functions.
ContiLeaks notebook added to MSTICPy Repo
We are privileged to host Thomas's awesome ContiLeaks notebook.
Thanks @fr0gger
New Queries added
Several new Sentinel and MS Defender queries have been added.
See the new built-in query list
Documentation Additions and Updates
The documentation for V2.0 is now live and available at https://msticpy.readthedocs.io
(Previous versions are still online and can be accessed through
the ReadTheDocs interface).
- New MSTICPy Quickstart Guide
- Updated Installing guide
- Updated Threat Intel Lookup documentation
- Updated Time Series analysis documentation
- New Plot Network Graph from DataFrame
- Updated Plotting Folium maps
- Updated Pivot functions
- Updated Jupyter and Sentinel
The API documentation has been split into separate modules to
make it easier to navigate. The API docs also now support "InterSphinx".
This means that MSTICPy references to objects in other packages (e.g. Python
standard library, pandas, Bokeh) have active links that will take you
to the native documentation for that item.
Also, the sample notebooks for most of these features have been updated
along the same lines. See MSTICPy Sample notebooks
Miscellaneous Improvements
- The MS Sentinel provider now supports a timeout parameter allowing you to lengthen or shorten the default.
- MSTICPy network requests use a custom User Agent header so that you can identify or track requests from MSTICPy/Notebooks.
Plus a lot more that I can't recall at the moment.
What's Changed - The gory detail of the PRs
- Sync changes to main into v2 branch by @ianhelle in #330
- Ianhelle/msticpy v2.0.0 merge updates 2022 03 14 by @ianhelle in #338
- Ianhelle/implement isort 2022 02 15 by @ianhelle in #327
- Ianhelle/implement isort branch post-fixes 2022 03 21 by @ianhelle in #346
- Ianhelle/pivot dataprov selfload 2022 03 15 by @ianhelle in #343
- Ianhelle/main mergeback 2022 04 05 by @ianhelle in #355
- Merging changes from main for geoip.py, config editor and kusto_driver by @ianhelle in #359
- Pebryan/2022 4 14 auth merge by @petebryan in #368
- Fixed minor issues by @petebryan in #372
- Ianhelle/v2 reorg directories 2 2022 04 12 by @ianhelle in #377
- Ianhelle/mpconfigedit fix from main 2022 05 22 by @ianhelle in #396
- Added pd accessor for time series functions. by @ianhelle in #381
- Added new Sentinel Search Features - merge from main by @ianhelle in #380
- Ianhelle/ti async lookup 2022 04 27 by @ianhelle in #383
- Ianhelle/folium accessor 2022 04 30 by @ianhelle in #384
- Updated tweet action to include more details by @petebryan in #406
- Add Device Code fallback option for when interactive auth isn't avaliable. by @petebryan in #401
- Adding OData Delegated Auth Support into 2.0 by @petebryan in #410
- Removed plaintext token cache from MSAL auth and replaced it with fall back to in memory caching by @petebryan in #414
- Ianhelle/kql nbinit fixes merge2.0 2022 05 18 by @ianhelle in #412
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #421
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #422
- Ianhelle/geoip init fix 2022 05 27 by @ianhelle in #423
- Ianhelle/read the docs fixes 2022 05 29 by @ianhelle in #424
- Ianhelle/sentinel workspace lookup 2022 05 19 by @ianhelle in #419
- Fix for list_hunting_queries function by @pensivepaddle in #417
- Update calls to credential.modern.get_token by @FlorianBracq in #429
- Adding ContiLeaks Analysis by @fr0gger in #428
- Networkx graphs from dataframe by @ianhelle in #427
- Ianhelle/msticpy init imports and Quickstart doc by @ianhelle in #435
- Ianhelle/main updates to msticpy v2.0.0 2022 06 14 by @ianhelle in #444
- [fix] Revert to Py 3.7 build with typing-extensions by @ianhelle in #448
- [fix] if AuthKey or ApiID is None by @ianhelle in #449
- Ianhelle/query pivot naming 2022 06 06 by @ianhelle in #437
- Ianhelle/folium update docs 2022 05 29 by @ianhelle in #438
- Ianhelle/timeline updates 2022 06 14 by @ianhelle in #441
*...
MSTICPy 2.0.0 pre-release 2
New Features
There are several new features in V 2.0.0 of MSTICPy. The major
items include:
- Folium map update - plot a map using multiple layers, custom icons, colors and tooltips from a single function call.
- Time Series - calculate and display a Time Series anomalies plot from a single function call.
- Threat Intelligence lookups - individual providers run asynchronously (simultaneously) making it many times faster to perform lookups across providers. Lookup progress is also displayed with a progress bar.
Pre-release documentation for v2.0.0 is on ReadtheDocs
Note: API documentation should be up-to-date but user-guides for new features
are still TBD.
Folium map update
The Folium module in MSTICPy has always been a bit complex to use
since it normally required that you convert IP addresses to MSTICPy
IpAddress entities before adding them to the map. You can now
plot maps with a single function call from a DataFrame containing
IP addresses or location coordinates. You can group the data
into folium layers, specify columns to populate popups and tooltips
and to customize the icons and coloring.
plot_map
A new plot_map function (in the msticpy.vis.foliummap module)
lets you plot mapping points directly from a DataFrame. You can
specify either an ip_column or coordinates columns (lat_column and long_column).
In the former case, the geo location of the IP address
is looked up using the MaxMind GeoLiteLookup data.
You can also control the icons used for each marker with the
icon_column parameter. If you happen to have a column in your
data that contains names of FontAwesome or GlyphIcons icons
you can use that column directly.
More typically you would combine the icon_column with the icon_map
parameter. You can specify either a dictionary or a
function. For a dictionary, the value of the row in icon_column
is used as a key - the value is a dictionary of icon parameters
passed to the Folium.Icon class. For a function, the icon_column
value is passed to the function as a single parameter and the return value
should be a dictionary of valid parameters for the Icon class.
You can read the documentation for this function in the docs.
plot_map pandas accessor
Plot maps from the comfort of your own DataFrame!
Using the msticpy mp_plot accessor you can plot maps directly
from a DataFrame containing IP or location information.
The folium_map function has the same syntax as plot_map
except that you omit the data parameter.
df.mp_plot.folium_map(ip_column="ip", layer_column="CountryName")
Layering, Tooltips and Clustering support
In plot_map and .mp_plot.folium_map you can specify
a layer_column parameter. This will group the data
by the values in that column and create an
individually selectable/displayable layer in Folium. For performance
and sanity reasons this should be a column with a relatively
small number of discrete values.
Clustering of markers in the same layer is also implemented by
default - this will collapse multiple closely located markers
into a cluster that you can expand by clicking or zooming.
You can also populate tooltips and popups with values
from one or more column names.
Classic interface
The original FoliumMap class is still there for more manual
control. This has also been
enhanced to support direct plotting from IP, coordinates or GeoHash
in addition to the existing IpAddress and GeoLocation entities.
It also supports layering and clustering.
Threat Intelligence Providers - Async support
When you have configured more than one TI provider, MSTICPy will
execute requests to each of them asynchronously. This will bring big
performance benefits when querying IoCs from multiple providers.
Note: requests to individual providers are still executed synchronously
since we want to avoid swamping provider services with multiple
simultaneous requests.
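The concurrency model described above (providers run side by side, while each provider handles its own requests one at a time) can be sketched with asyncio. This is an illustrative stand-in, not the TILookup implementation: query_provider and lookup_all are hypothetical names, and asyncio.sleep takes the place of a real network request.

```python
import asyncio


async def query_provider(name, iocs, delay=0.01):
    """Look up IoCs one at a time for a single provider (sequential)."""
    results = []
    for ioc in iocs:
        await asyncio.sleep(delay)  # stand-in for one network request
        results.append((name, ioc, "unknown"))
    return results


async def lookup_all(providers, iocs):
    """Run all providers concurrently; each one works through the IoCs in order."""
    per_provider = await asyncio.gather(
        *(query_provider(name, iocs) for name in providers)
    )
    # Flatten the per-provider lists into a single list of rows.
    return [row for rows in per_provider for row in rows]


results = asyncio.run(
    lookup_all(["OTX", "RiskIQ"], ["162.244.80.235", "85.93.88.165"])
)
for row in results:
    print(row)
```

With this arrangement the total time is roughly that of the slowest provider rather than the sum of all providers, while no single provider sees more than one request at a time.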
We've also implemented progress bar tracking for TILookups, giving a visual
indication of progress when querying multiple IoCs.
Combining the progress tracking with asynchronous operation means
that not only is performing lookups for lots of observables faster
but you will also be less likely to be left guessing whether or not your kernel
has hung.
TI Providers are now also loaded on demand - i.e. only when you have
a configuration entry in your msticpyconfig.yaml for that provider.
This prevents loading of code (and possibly import errors) due to providers
which you are not intending to use.
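A configuration-driven, on-demand import can be sketched as follows. This is a simplified illustration, not MSTICPy's loader: the registry `PROVIDER_MODULES`, the function `load_configured_providers` and the provider names are hypothetical, the `TIProviders` key is assumed to mirror the settings section, and standard-library modules stand in for real provider modules.

```python
import importlib

# Hypothetical registry mapping provider names to module paths.
# Standard-library modules stand in for real provider modules.
PROVIDER_MODULES = {
    "ProviderA": "json",
    "ProviderB": "csv",
    "ProviderC": "sqlite3",
}


def load_configured_providers(config):
    """Import only the provider modules that have a configuration entry."""
    loaded = {}
    for name in config.get("TIProviders", {}):
        module_path = PROVIDER_MODULES.get(name)
        if module_path is None:
            continue  # unknown provider name: skip it
        loaded[name] = importlib.import_module(module_path)
    return loaded


# Only ProviderA and ProviderB are configured, so ProviderC is never imported.
config = {"TIProviders": {"ProviderA": {}, "ProviderB": {}}}
providers = load_configured_providers(config)
print(sorted(providers))
```

Because the import happens inside the loop, a provider with a missing optional dependency only causes an import error if it is actually configured.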
Finally, we've added functions to enable and disable providers
after loading TILookup:
from msticpy.context import TILookup
ti_lookup = TILookup()
iocs = ['162.244.80.235', '185.141.63.120', '82.118.21.1', '85.93.88.165']
ti_lookup.lookup_iocs(iocs, providers=["OTX", "RiskIQ"])
Time Series pandas accessor
Although the Time Series functionality was relatively simple to
use, it previously required several disconnected steps to compute
the time series, plot the data and extract the anomaly periods. Each of
these needed a separate function import. Now you can do all of these
from a DataFrame via pandas accessors.
(Currently there is a separate accessor, df.mp_timeseries,
but we are
still working on consolidating our pandas accessors, so this may change
before the final release.)
Because you typically still need these separate outputs, the accessor
has multiple methods:
- df.mp_timeseries.analyze - takes a time-summarized DataFrame
  and returns the results of a time-series decomposition
- df.mp_timeseries.plot - takes a decomposed time-series and
  plots the anomalies
- df.mp_timeseries.anomaly_periods - extracts anomaly periods
  as a list of time ranges
- df.mp_timeseries.anomaly_periods - extracts anomaly periods
  as a list of KQL query clauses
- df.mp_timeseries.apply_threshold - applies a new anomaly
  threshold score and returns the results.
Analyze data to produce time series.
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze()
Analyze and plot time series anomalies
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze().mp_timeseries.plot()
Analyze and retrieve anomaly time ranges
df = qry_prov.get_networkbytes_per_hour(...)
ts_data = df.mp_timeseries.analyze().mp_timeseries.anomaly_periods()
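To show what "anomaly periods as a list of time ranges" can look like, here is a minimal standard-library sketch that merges consecutive flagged timestamps into ranges. It is an assumption-laden illustration, not the accessor's actual logic (which may, for example, pad the ranges): the function `anomaly_ranges`, the hourly step and the sample flags are hypothetical.

```python
from datetime import datetime, timedelta


def anomaly_ranges(points, step=timedelta(hours=1)):
    """Merge flagged timestamps at most one step apart into (start, end) ranges.

    `points` is a time-ordered list of (timestamp, is_anomaly) pairs.
    """
    ranges = []
    for ts, is_anomaly in points:
        if not is_anomaly:
            continue
        if ranges and ts - ranges[-1][1] <= step:
            # Adjacent to the previous anomaly: extend the current range.
            ranges[-1] = (ranges[-1][0], ts)
        else:
            # Gap since the last anomaly: start a new range.
            ranges.append((ts, ts))
    return ranges


start = datetime(2022, 1, 1)
flags = [False, True, True, False, False, True]  # made-up hourly anomaly flags
points = [(start + timedelta(hours=i), flag) for i, flag in enumerate(flags)]
periods = anomaly_ranges(points)
for begin, end in periods:
    print(begin.isoformat(), "->", end.isoformat())
```

Ranges in this form are convenient as time filters for follow-up queries over the same period.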
In next pre-release
Plot networks (graphs) directly from a DataFrame
One frequently-requested feature is the ability to easily plot
networks from data. For example you may want to view the interactions
between account names and IP addresses. This feature uses
Networkx to build the graph and
Bokeh to plot the graph.
Note: The graph has the usual Bokeh interactivity - zooming, panning, selecting,
hover-over tooltips. It does not allow you to move individual
nodes and interactively recalculate the layout. For the
latter, you can use this functionality to build a networkx graph
and plot using something like GraphViz or PyViz.
The network plot will give you two functions:
- df.mp.to_graph to convert a DataFrame to a networkx graph
- df.mp_plot.network to create and plot the graph in a single step.
(There is also a separate function msticpy.vis.network_plot.plot_nx_graph
that will just do the NX -> plot operation)
You can specify the columns to use as source and target. An edge
is created between source and target when the two occur
in the same row (or in more than one row). You can also
specify columns to use as node and edge attributes.
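The row-to-edge rule can be illustrated without any graph library. The sketch below is a plain-Python approximation rather than the networkx-based implementation: `rows_to_edges`, the column names and the sample rows are hypothetical. It keeps a row count per edge only to show that repeated rows collapse into a single edge.

```python
from collections import Counter


def rows_to_edges(rows, source_col, target_col):
    """Map each (source, target) pair to the number of rows containing it."""
    return Counter((row[source_col], row[target_col]) for row in rows)


# Made-up account/IP rows; the first pair appears in two rows.
rows = [
    {"Account": "alice", "IpAddress": "192.0.2.10"},
    {"Account": "alice", "IpAddress": "192.0.2.10"},
    {"Account": "bob", "IpAddress": "192.0.2.10"},
    {"Account": "bob", "IpAddress": "198.51.100.4"},
]

edges = rows_to_edges(rows, "Account", "IpAddress")
nodes = {node for edge in edges for node in edge}

for (source, target), row_count in edges.items():
    print(f"{source} -> {target} (rows: {row_count})")
```

Four rows yield three edges here, because the two identical alice rows contribute to the same edge.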
To Do items
We intend to add the following before release:
- allow you to specify the networkx layout algorithm to use
  (currently it uses the default spring_layout)
- assign edge weight attribute based on number of rows contributing
  to an edge
MS Sentinel Workspaces API
Lets you query and resolve details for Sentinel workspaces.
This is integrated into the MpConfigEdit and MpConfigFile utilities
to let you look up workspace details when you are editing your
settings:
- paste in a URL from the Sentinel Azure portal to populate workspace settings
- or resolve full details from partial workspace details such as the workspace ID.
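As an illustration of the first option, the sketch below pulls workspace details out of a URL, assuming the URL embeds a URL-encoded Azure resource ID of the form /subscriptions/.../resourcegroups/.../providers/microsoft.operationalinsights/workspaces/.... The function `parse_workspace_url`, the example host and all values are hypothetical, and this is not the MSTICPy API.

```python
import re
from urllib.parse import unquote

# Pattern for an Azure resource ID of a Log Analytics workspace.
RESOURCE_ID = re.compile(
    r"/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourcegroups/(?P<resource_group>[^/]+)"
    r"/providers/microsoft\.operationalinsights"
    r"/workspaces/(?P<workspace_name>[^/?#]+)",
    re.IGNORECASE,
)


def parse_workspace_url(url):
    """Return workspace details found in the URL, or None if there are none."""
    match = RESOURCE_ID.search(unquote(url))
    return match.groupdict() if match else None


# Placeholder URL: the host and all identifiers are made up for illustration.
url = (
    "https://portal.example.com/#blade/id/"
    "%2Fsubscriptions%2F00000000-0000-0000-0000-000000000000"
    "%2Fresourcegroups%2FMyResourceGroup"
    "%2Fproviders%2Fmicrosoft.operationalinsights"
    "%2Fworkspaces%2FMyWorkspace"
)
details = parse_workspace_url(url)
print(details)
```

Note that the workspace ID (a GUID) is not part of the resource ID, so resolving it would still require a lookup against the service.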
Other important fixes
The API details for most of the MSTICPy functions were not being
generated - this should now be fixed.
What's Changed (GitHub PR Summary)
- Added pd accessor for time series functions. by @ianhelle in https://githu...
Fixes for Linux auth, kql and nbinit initialization
Minor release fixing a few usability issues.
What's Changed
- Adding full Delegated Auth support to all OData Drivers by @petebryan in #409
  This allows MDE and Graph users to use User-delegated authentication rather than app ID/secrets
- Fixes for usability bugs in kql_driver, nbinit, user_config - added typing-extensions requirement by @ianhelle in #411
- Kql driver will revert to Kqlmagic-based device authentication if Azure Authentication fails
- Kql driver suppresses "missing PyGObject message" - a dependency that isn't required in this scenario
- init_notebook no longer produces a spurious error message about Virus Total libraries not being available when they are not used.
- User config no longer throws an error if the user has a partial auto-load configuration in msticpy
- Replace MSAL auth plaintext file cache with memory cache by @petebryan in #413
- removed ability to use plaintext token cache because of security concerns
- Update API version for list_alert_rules by @FlorianBracq in #399
- Updating Dockerfile source to mcr anaconda by @ianhelle in #397
  Docker source switched to trusted anaconda source for supply chain security
- Updated Tweet bot to include more context in the tweets by @petebryan in #403
- Updated tweet action to include more detail in the tweets by @petebryan in #405
- Adding Microsoft SECURITY.MD by @microsoft-github-policy-service in #407
- Bump readthedocs-sphinx-ext from 2.1.5 to 2.1.6 by @dependabot in #400
Full Changelog: v1.8.1...v1.8.2