funcional-health-analytics/aws-glue-docker

 
 

Supported tags and respective Dockerfile links

Simple Tags

Python Shell

Spark

You can use Python extension modules and libraries with your AWS Glue ETL scripts as long as they are written in pure Python. C libraries such as pandas are not supported at the present time, nor are extensions written in other languages.
-- AWS

AWS Glue Docker

AWS Glue development environment based on the svajiraya/aws-glue-libs fix.

Getting started

# install docker and configure aliases
curl -sSL https://raw.githubusercontent.com/webysther/aws-glue-docker/master/start.sh | sh

# to use pandas
glue

# or pyspark
glue-spark

# here you are inside docker

# Glue PySpark (REPL)
pyspark

# Glue PySpark
# /app is your current folder
glue-spark sparksubmit /app/spark_script.py

# Test
glue pytest

# aliases inside docker (backwards compatibility)
gluesparksubmit == sparksubmit
gluepyspark == pyspark
gluepytest == pytest
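
The aliases themselves are installed by start.sh and are not reproduced here. As a rough sketch only, assuming a wrapper such as `glue-spark` calls `docker run` and mounts the current folder at /app (as the comment above says), the command it issues might look like the one printed below. The image name `example/aws-glue:spark` and the helper `glue_spark_dry_run` are hypothetical and used for illustration; the function prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical image name: the real one is configured by start.sh and is not shown here.
GLUE_IMAGE="${GLUE_IMAGE:-example/aws-glue:spark}"

# Print (do not run) the docker command a wrapper like `glue-spark` might issue.
# Assumption: the current folder is mounted at /app inside the container.
glue_spark_dry_run() {
    printf 'docker run --rm -it -v %s:/app -w /app %s %s\n' "$PWD" "$GLUE_IMAGE" "$*"
}

# Example: show what submitting a script from the current folder could expand to.
glue_spark_dry_run sparksubmit /app/spark_script.py
```

Printing the command first is a cheap way to check the volume mount and working directory before starting a container.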

License

MIT License. Please see License File for more information.
