User Session Management, MaxMind GeoLite fix, Extract nested dicts from Pandas

@ianhelle released this 21 Oct 19:03
deab7a5

User Session Configuration

Do you always have one or more data providers or other components that you need to load for every notebook you create?
I do, and got a bit fed up with typing the same lines of code over and over again.

User session configuration lets you specify which providers are loaded, whether or not to connect and which parameters
to supply at load and connect time. You put all of this into a straightforward YAML file and load it using the following:

import msticpy as mp   # you likely will already be doing this
mp.init_notebook()     # and this

mp.load_user_session("my_config.yaml")   # if you have a "mp_user_session.yaml" in the current directory
                                         # you can skip the parameter

This example shows the structure of the YAML:

QueryProviders:
  qry_prov_sent:
    DataEnvironment: MSSentinel
    InitArgs:
      debug: True
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']
  qry_prov_md:
    DataEnvironment: M365D
Components:
  mssentinel:
    Module: msticpy.context.azure
    Class: MicrosoftSentinel
    InitArgs:
    Connect: True
    ConnectArgs:
      workspace: MySoc
      auth_methods: ['cli', 'device_code']

The providers/components created (e.g. qry_prov_sent in this example)
are published back to your notebook Python namespace, so you'll see
these available as variables ready to use.

This configuration file is equivalent to the following code:

qry_prov_sent = mp.QueryProvider("MSSentinel", debug=True)   # InitArgs are passed at load time
qry_prov_sent.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])
qry_prov_md = mp.QueryProvider("M365D")

from msticpy.context.azure import MicrosoftSentinel
mssentinel = MicrosoftSentinel()
mssentinel.connect(workspace="MySoc", auth_methods=['cli', 'device_code'])

Not a huge saving, on the face of it, but if you create a lot of notebooks or want to use
msticpy in an automation scenario, it can be very helpful.
Pass verbose=True to load_user_session to see more detailed logging of what is going on.
See the full documentation for more details.
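
To make the mechanics concrete, here is a minimal sketch of how a loader could turn the Components section of such a file into live objects and publish them into a namespace. This is an illustration only, not msticpy's implementation: `load_components`, `DemoComponent` and the `demo_components` module name are all hypothetical, and the configuration is written as the dictionary a YAML parser would return so that the example has no third-party dependencies.

```python
import importlib
import sys
import types
from typing import Any, Dict


class DemoComponent:
    """Hypothetical stand-in for a class such as MicrosoftSentinel."""

    def __init__(self, **init_args: Any) -> None:
        self.init_args = init_args
        self.connect_args = None  # set once connect() is called

    def connect(self, **connect_args: Any) -> None:
        self.connect_args = connect_args


# Register the stand-in under a hypothetical module name so that it can be
# imported by name, just as the "Module" key would be.
_demo_module = types.ModuleType("demo_components")
_demo_module.DemoComponent = DemoComponent
sys.modules["demo_components"] = _demo_module


def load_components(config: Dict[str, Any], namespace: Dict[str, Any]) -> None:
    """Create each component in config["Components"] and publish it by name."""
    for name, settings in (config.get("Components") or {}).items():
        module = importlib.import_module(settings["Module"])
        cls = getattr(module, settings["Class"])
        # An empty "InitArgs:" key parses as None, so fall back to no arguments.
        instance = cls(**(settings.get("InitArgs") or {}))
        if settings.get("Connect"):
            instance.connect(**(settings.get("ConnectArgs") or {}))
        namespace[name] = instance


# The parsed form of a Components section, with the module and class
# swapped for the stand-in defined above.
session_config = {
    "Components": {
        "mssentinel": {
            "Module": "demo_components",
            "Class": "DemoComponent",
            "InitArgs": None,
            "Connect": True,
            "ConnectArgs": {
                "workspace": "MySoc",
                "auth_methods": ["cli", "device_code"],
            },
        }
    }
}

notebook_namespace: Dict[str, Any] = {}
load_components(session_config, notebook_namespace)
print(notebook_namespace["mssentinel"].connect_args)
```

In a notebook, passing `globals()` as the namespace would make `mssentinel` available as an ordinary variable, which is the effect described above.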

MaxMind GeoIPLite fix

Sometime recently (not too sure when) MaxMind changed their download procedure to use
a different URL and authentication mechanism, which caused the database auto-update to fail.
To use the new mechanism you need your MaxMind user Account ID (log in and look at your
account properties). Add it to your msticpyconfig.yaml as shown below.

OtherProviders:
  GeoIPLite:
    Args:
      AccountID: "1234567"
      AuthKey:
        EnvironmentVar: "MAXMIND_AUTH"
      DBFolder: "~/.msticpy"
    Provider: "GeoLiteLookup"
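
Because the AuthKey in this example is read from an environment variable, a missing variable is an easy way for the download to keep failing. The sketch below shows one way to check that such indirect settings resolve before relying on them. It is not msticpy code: `resolve_setting` and `find_unresolved` are hypothetical helpers, and the only assumption taken from the configuration above is that a value may be either a literal or a mapping with an `EnvironmentVar` key.

```python
import os
from typing import Any, Dict, List


def resolve_setting(value: Any) -> Any:
    """Return a literal as-is, or look up the value of an EnvironmentVar entry."""
    if isinstance(value, dict) and "EnvironmentVar" in value:
        # Returns None when the environment variable is not set.
        return os.environ.get(value["EnvironmentVar"])
    return value


def find_unresolved(args: Dict[str, Any]) -> List[str]:
    """List the setting names whose value could not be resolved."""
    return [name for name, value in args.items() if resolve_setting(value) is None]


# The Args section from the configuration above, as a parsed dictionary.
geoip_args = {
    "AccountID": "1234567",
    "AuthKey": {"EnvironmentVar": "MAXMIND_AUTH"},
    "DBFolder": "~/.msticpy",
}

unresolved = find_unresolved(geoip_args)
if unresolved:
    print("Settings with no value:", ", ".join(unresolved))
else:
    print("All GeoIPLite settings resolved")
```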

Extract nested dictionaries from pandas column to multiple rows/columns

@pioneerHitesh has added this as a new method in the mp_pivot pandas extension:

data_df.mp_pivot.dict_to_dataframe(col="my_nested_column")

It returns a dataframe with the column recursively expanded:

  • lists become new rows
  • dictionaries become new columns

So a column with the following structure:

   NCol
0  {'A': ['A1', 'A2', 'A3'], 'B': {'B1': 'B1-1', 'B2': 'B2-1'}}
1  {'A': ['A3', 'A4', 'A5'], 'B': {'B3': 'B3-1', 'B4': 'B4-1'}}

my_df = src_df.mp_pivot.dict_to_dataframe(col="NCol")
my_df

Would be unpacked to:

   A.0  A.1  A.2  B.B1  B.B2  B.B3  B.B4
0  A1   A2   A3   B1-1  B2-1  nan   nan
1  A3   A4   A5   nan   nan   B3-1  B4-1
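
To show where the dotted column names in this table come from, here is a minimal pure-Python sketch of recursive flattening. It is not the msticpy implementation, and `flatten` is a hypothetical helper: it only reproduces the naming seen in the example output, joining dictionary keys and list positions with dots, and does not attempt to expand lists into rows.

```python
from typing import Any, Dict


def flatten(value: Any, prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single dict with dotted keys."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        # List positions become key segments: "A.0", "A.1", ...
        items = enumerate(value)
    else:
        # A scalar ends the recursion and is stored under the accumulated key.
        return {prefix: value}

    flat: Dict[str, Any] = {}
    for key, child in items:
        child_prefix = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(child, child_prefix))
    return flat


# The two cells of the NCol column from the example above.
ncol = [
    {"A": ["A1", "A2", "A3"], "B": {"B1": "B1-1", "B2": "B2-1"}},
    {"A": ["A3", "A4", "A5"], "B": {"B3": "B3-1", "B4": "B4-1"}},
]

rows = [flatten(cell) for cell in ncol]
for row in rows:
    print(row)
```

Passing `rows` to `pandas.DataFrame` would give one column per distinct key, with NaN wherever a row lacks that key, which matches the `nan` entries in the table.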

What's Changed

  • Authentication module unit test by @ianhelle in #800
  • Use sessions config and GeoIP download failure by @ianhelle in #801
  • Added Inbuilt function to extract nested JSON by @pioneerHitesh in #798
  • Add max retry parameter to the execution prevent HTTP 429 by @vx3r in #802

New Contributors

Full Changelog: v2.13.1...v2.14.0