=========
This is a sample attachment server implementation for Specify. This implementation is targeted at Ubuntu distributions but will work with minor modifications on other Linux systems. It is not expected to work without extensive adaptation on Windows systems.
The Specify Collections Consortium is funded by its member institutions. The Consortium website is:
http://www.specifysoftware.org
Web Asset Server Copyright © 2021 Specify Collections Consortium.
Specify comes with ABSOLUTELY NO WARRANTY. This is free software licensed under the GNU General Public License 2 (GPL2).
Specify Collections Consortium
Biodiversity Institute
University of Kansas
1345 Jayhawk Blvd.
Lawrence, KS 66045 USA
- Web Asset Server
- New Features
- Default Rate Restrictions
- Deployment
- Detailed Docker Installation Instructions
- Running server.py locally via Python without Nginx
- HTTPS
- Specify Settings *Specify6 Settings
- Compatibility with older versions of Python
Full Docker Integration - A full docker network ecosystem image database, server and nginx containers defined in .yml file. - Improved dockerfiles for nginx and image-server for easy setup.
Integrated Metadata Management
- REST support for reading and writing image metadata. Includes a submodule MetadataTools for editing EXIF metadata.
Image Asset Organization
- Internal MySQL database tracks all imports and allows querying to map a URL back to an original filename.
- Tiered directory structure based on the first four characters of the MD5 hash to prevent large directory listings.
For example,236586c8a808832c794a525642f5cc42.jpg
is stored in23/65/236586c8a808832c794a525642f5cc42.jpg
. - Prevents duplicate filename imports in a given collection or namespace.
Redacted Image Support
- Supports redacted images with controlled access. Only authenticated users can view redacted images.
Performance and Security Enhancements
- Docker integration with Nginx for optimized performance and enhanced security.
- Multithreading support with associated Nginx configuration.
- Nginx rate restriction and CORS policy options for external IPs and Domains.
- Optional bot and scanner blocker (e.g., Ultimate Bot Blocker).
Continuous Integration and Testing
- Jenkins CI support for automated testing and deployment.
- Comprehensive test suite to ensure reliability across all features.
- For external users and IP addresses, the current limit is set to 10 requests per minute, with a burst limit of 2.
- No rate limits are imposed on internal users on networks with addresses in the following ranges:
- 24-bit block: 10.0.0.0
- 20-bit block: 172.16.0.0
- 16-bit block: 192.168.0.0
Copy docker-compose.template.yml
and settings.template.py
to their
respective filenames without 'template' and adjust settings accordingly. Note that you can run the
system without using docker; simply launch the database with the start_images_development_db.sh
script
and then run the server with "python3 server.py". This is recommended for initial setup and testing.
Note that for testing, you'll need to use a non-privileged port such as 8080.
Once testing is complete, stop the docker container running the database, and type docker-compose up -d
to start the full server.
It is important that the working directory is set to the path containing server.py
so that bottle.py can find the template files. See “TEMPLATE NOT FOUND” IN MOD_WSGI/MOD_PYTHON.
The following instructions from Cloning Web Asset Server Source Repository to Final docker-compose and setup are for a Docker installation. Instructions for running the server directly without Docker are provided at the bottom.
If running directly via Python, only the step Configuring settings.py
needs to be completed after running git clone [email protected]:calacademy-research/cas-web-asset-server.git
, then skip to Running server.py locally via python without nginx:.
Clone this repository.
git clone [email protected]:calacademy-research/cas-web-asset-server.git
Docker in the preferred installation method for Web Asset Server.
An example docker-compose.template.yml
is provided:
services:
nginx:
build:
context: ./
dockerfile: Dockerfile.nginx
container_name: bottle-nginx
ports:
- '80:80'
- '443:443'
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf
depends_on:
- image-server
logging:
driver: "json-file"
options:
max-size: "1m"
max-file: "5"
restart: always
image-server:
build:
context: ./
dockerfile: Dockerfile.images
container_name: image-server
environment:
- EXTERNAL_IP=institution.org
- PYTHONUNBUFFERED=1
- LANG=C.UTF-8
volumes:
- ./:/code
command: ['uwsgi', '--plugin', '/usr/lib/uwsgi/plugins/python312_plugin.so','--ini', 'uwsgi.ini']
restart: always
mysql-images:
restart: unless-stopped
image: mysql:8
container_name: mysql-images
command: "--default-authentication-plugin=mysql_native_password"
volumes:
- ./data:/var/lib/mysql:delegated
environment:
MYSQL_ROOT_PASSWORD: 'redacted'
MYSQL_DATABASE: 'images'
TZ: America/Los_Angeles
ports:
- "3306:3306"
expose:
- 3306
For production use, it is recommended to also add a nginx web server between
the web asset server and the outside world. An example nginx can be found at
nginx.conf
as well as an example docker-compose.yml for a nginx, image-server and image database at docker-compose.yml.
For development/evaluation, the web asset server can be exposed directly. To do so,
add the following lines to your docker-compose.yml
right after the asset-server:
line,
and comment out the nginx service:
ports:
- "80:80"
- "443:443"
Example settings.py at settings.py
- Copy settings.template.py to settings.py,
- Confirm your server HOST , PORT & SERVER_PROTOCOL match your server settings in docker-compose.yml.
- Confirm your SQL_HOST, SQL_PORT, SQL_PASSWORD and SQL_DATABASE match your image db settings.
- Confirm your BASE_DIR, THUMB_DIR, ORIG_DIR , match your attachments folder setup.
- Confirm other settings are to your preferences.
A nginx is recommended for controlling traffic to your server , a sample template is nginx.template.conf
Copy nginx.template.conf
to its respective filename and remove 'template'.
Edit the below nginx map variables to your preferences:
--$http_user_agent $is_bot
: contains a custom list bot name keywords to block.
--$remote_addr $is_blocked_ip
: contains a custom list of ip address ranges to block.
--$limited
: contains whitelisted ip ranges.
--$http_referer $cors_header
: contains whitelisted http_referrers for CORS policy.
(Optional) For running nginx locally on http:
- comment out the top server block which redirects port 80 with a 301 code.
- comment out the lines
ssl_certificate /etc/ssl/certs/name_of_ssl_cert.pem;
andssl_certificate_key /etc/ssl/private/name_of_ssl_cert.key;
. - replace the line
listen 443 ssl;
withlisten 80;
, replacing port 80 with any preferred port.
Botblocker is a comprehensive add-on that can be used to dynamically block harmful bots from accessing your server. Github
- Copy the templates in
botblocker-settings
forblacklist-user-agents
,whitelist-domains.conf
andwhitelist-ips.conf
. - Commented out examples are provided in each template. Setup desired whitelist exceptions for allowed ip ranges, user agents and/or domain names.
- uncomment or add the lines
include /etc/nginx/conf.d/globalblacklist.conf;
andinclude /etc/nginx/conf.d/botblocker-nginx-settings.conf;
inside your nginx http brackets. - uncomment or add the lines
include /etc/nginx/bots.d/blockbots.conf;
andinclude /etc/nginx/bots.d/ddos.conf;
inside your nginx server brackets. - add email to Dockerfile.nginx line
RUN echo "00 22 * * * /usr/local/sbin/update-ngxblocker -e [email protected]" >> /etc/crontab
for botblocker updates
For ubuntu here is a commented version of the install instructions, which are featured in Dockerfile.nginx
:
# Download and install the Nginx Ultimate Bad Bot Blocker 'install-ngxblocker' script
RUN wget https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/install-ngxblocker \
-O /usr/local/sbin/install-ngxblocker \
&& chmod +x /usr/local/sbin/install-ngxblocker
# Run install-ngxblocker and update botlist, telling it to place files in conf.d and bots.d
RUN /usr/local/sbin/install-ngxblocker -x -c /etc/nginx/conf.d -b /etc/nginx/bots.d
# Download and install the setup/update scripts
RUN wget https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/setup-ngxblocker \
-O /usr/local/sbin/setup-ngxblocker \
&& wget https://raw.githubusercontent.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker/master/update-ngxblocker \
-O /usr/local/sbin/update-ngxblocker \
&& chmod +x /usr/local/sbin/setup-ngxblocker \
&& chmod +x /usr/local/sbin/update-ngxblocker
# run any updates to botblocker
RUN /usr/local/sbin/update-ngxblocker -c /etc/nginx/conf.d -b /etc/nginx/bots.d
# adding custom whitelists to botblocker
COPY botblocker-settings/whitelist-domains.conf botblocker-settings/whitelist-ips.conf botblocker-settings/blacklist-user-agents.conf /etc/nginx/bots.d/
# (Optional) Add a crontab job to periodically update the blocker
RUN echo "00 22 * * * /usr/local/sbin/update-ngxblocker -e [email protected]" >> /etc/crontab
-- The image-server container runs on an ubuntu 24 docker image.
-- all setup and dependencies are installed automatically in the Dockerfiles. Dockerfile.nginx
here & Dockerfile.server
here.
-- If not using botblocker comment out all lines from the lines 19-42 containing the botblocker install steps
- Run docker compose.
docker compose up -d
- (Optional) deploy any sql image db backup by copying your backup .sql file to the mounted folder data/ via
cp image_db_backup.sql data/
and then logging into mysql and runningsource /var/lib/mysql/image_db_backup.sql
-- Copy and setup settings.py. See [Setting up settings.py](#Setting up settings.py) The dependencies are:
- Python 3.12 is known to work.
- Docker used to deploy the image database.
- ImageMagick for thumbnailing.
- ExifTool for editing exif metadata via the command line.
- Metadatatools submodule at https://github.com/calacademy-research/metadata_tools.git for python methods to edit exif metadata.
To install dependencies the following commands work on Ubuntu:
# install ubuntu dependencies
apt-get update && apt-get install -y \
ca-certificates tzdata wget curl python3-pip python3-setuptools \
build-essential libffi-dev imagemagick libimage-exiftool-perl \
gcc-aarch64-linux-gnu uwsgi uwsgi-plugin-python3 python3.12-venv\
&& apt-get clean && rm -rf /var/lib/apt/lists/*
# activate the Metadatatools submodule
git submodule update --init --recursive
# create venv
python3 -m venv venv
source venv/bin/activate
# install requirements
sudo pip install -r requirements.txt
sudo pip install -r metadata_tools/requirements.txt
## setup the image db docker container (check that the port and password are to your liking)
./start_images_dev.sh
## run the server
python3 server.py
## lastly to check that your setup is working
pytest tests/
--The command python server_ipup.py
automatically updates the host, .yml and settings ip addresses between uses when running server.py locally via python.
The easiest way to add HTTPS support, which is necessary to use the asset server with a Specify 7 server that is using HTTPS, is to place the asset server behind a reverse proxy such as Nginx. This also makes it possible to forego authbind and run the asset server on an unprivileged port. The proxy must be configured to rewrite the web_asset_store.xml
resource to adjust the links therein. An example configuration can be found in this gist.
You will generally want to add the asset server settings to the global Specify preferences so that all of the Specify clients obtain the same configuration.
The easiest way to do this is to open the database in Specify and navigate to the About option in the help menu.
In the resulting dialog double-click on the division name under System Information on the right hand side. This will open a properties editor for the global preferences.
You will need to set four properties to configure access to the asset server:
USE_GLOBAL_PREFS
-true
attachment.key
–##
- Replace
##
with the key from the following location:- obtain from asset server
settings.py
file if you have a local installation of 7 - obtain from
docker-compose.yml
file if you use a Docker deployment
- obtain from asset server
- Replace
attachment.url
http://[YOUR_SERVER]/web_asset_store.xml
attachment.use_path
false
If these properties do not already exist, they can be added using the Add Property button.
If you are using the Docker deployment method, you need to make sure that the attachment.key
and attachment.url
match the configuration in Specify 6.
For both the specify7
and specify7-worker
sections, you need to make sure that:
attachment.key
=ASSET_SERVER_KEY
attachment.url
=ASSET_SERVER_URL
specify7:
restart: unless-stopped
image: specifyconsortium/specify7-service:v7
init: true
volumes:
- "specify6:/opt/Specify:ro"
- "static-files:/volumes/static-files"
environment:
- DATABASE_HOST=mariadb
- DATABASE_PORT=3306
- DATABASE_NAME=specify
- MASTER_NAME=master
- MASTER_PASSWORD=master
- SECRET_KEY=change this to some unique random string
- ASSET_SERVER_URL=http://host.docker.internal/web_asset_store.xml
- ASSET_SERVER_KEY=your asset server access key
- REPORT_RUNNER_HOST=report-runner
- REPORT_RUNNER_PORT=8080
- CELERY_BROKER_URL=redis://redis/0
- CELERY_RESULT_BACKEND=redis://redis/1
- LOG_LEVEL=WARNING
- SP7_DEBUG=false
If you are using a local installation, in the settings.py
file, you need to make sure that:
attachment.key
=WEB_ATTACHMENT_KEY
attachment.url
=WEB_ATTACHMENT_URL
# The Specify web attachment server URL.
WEB_ATTACHMENT_URL = None
# The Specify web attachment server key.
WEB_ATTACHMENT_KEY = None
- convert to universal URLS (n2t.net) and database same (images.universal_urls). Our id=42754. http://n2t. net/e/n2t_apidoc.html
- Support invisible watermarks and add API for same
- support updating all cached resized images, not just thumbnails.