SAS Scripting Wrapper for Analytics Transfer (SWAT)

Overview

The SAS SWAT package is a Python interface to the SAS Cloud Analytic Services (CAS) engine (the centerpiece of the SAS Viya framework). With this package, you can load and analyze data sets of any size on your desktop or in the cloud. Since CAS can be used on a local desktop or in a hosted cloud environment, you can analyze extremely large data sets using as much processing power as you need, while still retaining the ease of use of Python on the client side.

Using SWAT, you can execute workflows of CAS analytic actions, then pull down the summarized data to further process on the client side in Python, or to merge with data from other sources using familiar Pandas data structures. In fact, the SWAT package mimics much of the API of the Pandas package so that using CAS should feel familiar to current Pandas users.

With the best-of-breed SAS analytics in the cloud and the use of Python and its large collection of open source packages, the SWAT package gives you access to the best of both worlds.

To view updates to this project see the Change Log.

Prerequisites

To access the CAS binary protocol (recommended), you need the following:

  • 64-bit Python 3.7 to 3.12 on Windows or Linux (see shared library notes below)

The binary protocol requires pre-compiled components found in the pip installer only. These pieces are not available as source code and are under a separate license (see documentation on SAS TK). The binary protocol offers better performance than REST, especially when transferring larger amounts of data. It also offers more advanced data loading from the client and data formatting features.

To access the CAS REST interface only, you can use the pure Python code which runs in Python 3.7 to 3.12 on all platforms. While not as fast as the binary protocol, the pure Python interface is more portable.
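The binary protocol requirements above can be checked before installing with a short script. This is only a sketch: the helper name `check_binary_protocol_prereqs` is hypothetical, and the version bounds are copied from the prerequisites in this section, so adjust them if the supported range changes.

```python
import platform
import struct
import sys

# Supported range taken from the prerequisites above (assumed inclusive).
MIN_VERSION = (3, 7)
MAX_VERSION = (3, 12)


def check_binary_protocol_prereqs(version=None, bits=None, system=None):
    """Return a list of human-readable problems; an empty list means OK."""
    version = tuple(version or sys.version_info)[:2]
    bits = bits or struct.calcsize("P") * 8  # pointer size in bits
    system = system or platform.system()

    problems = []
    if not (MIN_VERSION <= version <= MAX_VERSION):
        problems.append("Python %d.%d is outside 3.7 to 3.12" % version)
    if bits != 64:
        problems.append("interpreter is %d-bit, not 64-bit" % bits)
    if system not in ("Windows", "Linux"):
        problems.append("platform %s is not Windows or Linux" % system)
    return problems


if __name__ == "__main__":
    for problem in check_binary_protocol_prereqs():
        print("binary protocol unavailable:", problem)
```

If the list is non-empty, the REST interface may still be an option, since the pure Python code runs on all platforms.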

Linux Library Dependencies

Some Linux distributions may not install all of the needed shared libraries by default. Most notably, the shared library libnuma.so.1 is required to make binary protocol connections to CAS. If you do not have this library on your machine you can install the numactl package for your distribution to make it available to SWAT.
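One way to find out whether libnuma.so.1 is available is to ask the dynamic loader for it from Python. The sketch below uses the standard `ctypes` module; the function name `shared_library_available` is hypothetical and not part of SWAT.

```python
import ctypes


def shared_library_available(name="libnuma.so.1"):
    """Return True if the dynamic loader can find and load `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False
    return True


if __name__ == "__main__":
    if not shared_library_available():
        print("libnuma.so.1 not found; install the numactl package")
```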

Python Dependencies

The SWAT package uses many features of the Pandas Python package and other dependencies of Pandas. If you do not already have version 0.16.0 or greater of Pandas installed, pip will install or update it for you when you install SWAT.
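To see whether the installed Pandas already meets the 0.16.0 minimum, a check along these lines may help. It is a rough sketch: `version_tuple` and `pandas_meets_minimum` are hypothetical helpers, and the parsing only looks at the leading numeric components of the version string.

```python
import re


def version_tuple(version):
    """Parse up to three leading numeric components: '1.5.3' -> (1, 5, 3)."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def pandas_meets_minimum(minimum=(0, 16, 0)):
    """Return True if Pandas is importable and at least `minimum`."""
    try:
        import pandas
    except ImportError:
        return False
    return version_tuple(pandas.__version__) >= minimum


if __name__ == "__main__":
    print("Pandas requirement met:", pandas_meets_minimum())
```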

If you are using pip version 23.1 or later to install from a tar.gz file, the Python wheel package is required. If you do not have this package installed, you can install it using pip.

Installation

SWAT can be installed using pip:

pip install swat

You can also install from the files on the SWAT project releases page. Simply locate the file for your platform and install it using pip as follows:

pip install https://github.com/sassoftware/python-swat/releases/download/vX.X.X/python-swat-X.X.X-platform.tar.gz

Here, X.X.X is the release you want to install, and platform is the platform you are installing on. You can also use the source code distribution if you only want to use the CAS REST interface. It does not contain support for the binary protocol.

Getting Started

For the full documentation go to sassoftware.github.io/python-swat. A simple example is shown below.

Once you have SWAT installed and you have a CAS server to connect to, you can import swat and create a connection:

>>> import swat
>>> conn = swat.CAS(host, port, username, password)

Note that the default port for the Python SWAT connection is 5570.

If you are using python-swat version 1.8.0 or later to connect to a SAS Viya 3.5 CAS server using Kerberos, prior to connecting you must set the Service Principal Name (SPN) using the CASSPN environment variable. For SAS Viya 3.5, the SPN string must start with 'sascas@', followed by the hostname.

export CASSPN=sascas@host
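The same variable can also be set from within Python before the connection is created. The sketch below assumes that SWAT reads CASSPN at connection time; `set_cas_spn` is a hypothetical helper and `cas01.example.com` is a placeholder hostname.

```python
import os


def set_cas_spn(hostname, environ=os.environ):
    """Set CASSPN to 'sascas@<hostname>' and return the value."""
    spn = "sascas@" + hostname
    environ["CASSPN"] = spn
    return spn


if __name__ == "__main__":
    # Call this before swat.CAS(...) so the SPN is in place when connecting.
    print(set_cas_spn("cas01.example.com"))
```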

If you get an error message about the TCP/IP negClientSSL support routine, you likely have an issue with your SSL certificate configuration. See the Encryption documentation for more information.

If that is successful, you should be able to run an action on the CAS server:

>>> out = conn.serverstatus()
NOTE: Grid node action status report: 1 nodes, 6 total actions executed.
>>> print(out)
[About]

 {'CAS': 'Cloud Analytic Services',
  'Copyright': 'Copyright © 2014-2016 SAS Institute Inc. All Rights Reserved.',
  'System': {'Hostname': 'cas01',
   'Model Number': 'x86_64',
   'OS Family': 'LIN X64',
   'OS Name': 'Linux',
   'OS Release': '2.6.32-504.12.2.el6.x86_64',
   'OS Version': '#1 SMP Sun Feb 1 12:14:02 EST 2015'},
  'Version': '3.01',
  'VersionLong': 'V.03.01M0D08232016',
  'license': {'expires': '20Oct2016:00:00:00',
   'gracePeriod': 62,
   'site': 'SAS Institute Inc.',
   'siteNum': 1,
   'warningPeriod': 31}}

[server]

 Server Status

    nodes  actions
 0      1        6

[nodestatus]

 Node Status

     name        role  uptime  running  stalled
 0  cas01  controller   4.836        0        0

+ Elapsed: 0.0168s, user: 0.016s, sys: 0.001s, mem: 0.287mb

>>> conn.close()
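In a script, it can be convenient to make sure the connection is closed even when an action fails. A minimal sketch using the standard `contextlib.closing` follows; `fetch_server_status` and its `connect` argument are hypothetical names, and the only connection methods it relies on are `serverstatus()` and `close()` from the session above.

```python
from contextlib import closing


def fetch_server_status(connect):
    """Open a connection, run serverstatus, and always close the connection.

    `connect` is any zero-argument callable that returns a connection
    object with `serverstatus()` and `close()` methods.
    """
    with closing(connect()) as conn:
        return conn.serverstatus()
```

With SWAT installed, this could be called as `fetch_server_status(lambda: swat.CAS(host, port, username, password))`.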

Contributing

The Contributor Agreement details how contributions can be made to the project. The Contributing guide includes instructions and rules for making contributions to the project.

Licensing

LICENSE.md states how this package is released and licensed.

Additional Resources