Maintenance

Configuration updates using ansible

The Data Library New Installation section explains how to create an ansible playbook and use it for the initial installation of the Data Library software. We recommend that you continue using ansible to manage configuration changes and software updates over time.

You must activate the python virtual environment in order to use the correct version of ansible when running the playbook. To always use this version of python, you may want to add this line to the end of your ~/.bash_profile file:

source /opt/datalib_venv3.12/bin/activate
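
For example, you can append that line with:

echo 'source /opt/datalib_venv3.12/bin/activate' >> ~/.bash_profile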

To make a configuration change:

  • Make sure your local copy of the dlconfig repository is up to date:

    cd dlconfig
    git pull --ff-only
    

    and read the output to be sure that the command succeeded. If there are conflicts, resolve them and pull again before proceeding.

  • Make your changes in playbook.yaml.

  • Run the playbook in check mode to verify that the changes that ansible is about to make are the ones you intended:

    ./run_ansible --check
    
  • After verifying the diff, run the playbook without --check to apply the change.

    ./run_ansible
    
  • If there have been changes to classic maprooms, run with --build to pull the new changes. (Rebuilding classic maprooms is skipped by default, because it is time-consuming.)

    ./run_ansible --build
    
  • Review, commit, and push your changes to your git host.

    git diff
    git commit -a -m "Description of the changes you made"
    git push
    

To update to a new version of the Data Library software, first check https://github.com/iridl/iridl-ansible for any backwards-compatibility warnings or manual migration steps. Then use ansible-galaxy to update the ansible_collections directory of your dlconfig repository:

git rm -rf ansible_collections
ansible-galaxy collection install -p . \
  git+https://github.com/iridl/iridl-ansible.git
git add ansible_collections
git commit -m "Update iridl ansible collection"
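
After updating the collection, apply the new version the same way as any other configuration change: run the playbook in check mode first, review the output, then run it for real. If your version of ansible supports it, ansible-galaxy collection list can confirm which version is now installed.

ansible-galaxy collection list -p .
./run_ansible --check
./run_ansible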

Adding user accounts

As described in User groups, users with accounts on the Data Library server can be divided into two groups: administrators and authors.

Administrator accounts should be created “by hand”, i.e. outside ansible’s control. Remember to add administrators to the wheel group so they will have sudo privileges.
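
A minimal sketch of creating an administrator account by hand; the username jdoe is a placeholder:

sudo useradd -m jdoe            # create the account with a home directory
sudo usermod -aG wheel jdoe     # wheel membership grants sudo privileges
sudo passwd jdoe                # set an initial password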

Author accounts should be created by adding the user to the datag_users list in playbook.yaml and running ansible (see Configuration updates using ansible). Ansible will create the user account, add the new user to the datag group, and create a personal data catalog directory (see Important file and directory paths) for the user. Ansible does not set the user’s password, so you should do that by hand after running the playbook. Remember to commit and push your playbook changes.
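
For example, after adding a hypothetical user jdoe to datag_users in playbook.yaml, the remaining steps might look like this:

./run_ansible --check                             # preview the account ansible will create
./run_ansible                                     # create the account and catalog directory
sudo passwd jdoe                                  # ansible does not set the password
git commit -a -m "Add author account for jdoe"
git push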

Debugging tips

  • To see what services are running under docker, use

    sudo docker ps
    
  • To start, stop, and restart services, for example:

    cd /usr/local/datalib
    sudo docker compose start squid
    sudo docker compose restart squid
    sudo docker compose up -d maproom
    
  • Most services output logs to stdout, which is captured by the docker daemon and routed to journald. You can read the logs by using journalctl, e.g.

    sudo journalctl CONTAINER_NAME=datalib_maproom_1 --since='1 hour ago'
    
  • squid produces two separate logs: the error log and the access log. The latter contains a line for each request served. The error log goes to journald and can be read with journalctl as above, while the access log is written to a docker volume. To read the access log, use docker exec to run a command in the squid container, e.g.

    sudo docker exec -it datalib_squid_1 tail -n 100 /var/log/squid/access.log
    
  • If an embedded image in a maproom is broken/empty, copy the image URL to a new browser window and remove the .gif at the end. Sometimes this will get you an informative error message instead of just an empty response.

  • If there’s an error message about a specific file or database table, look into that as described below. If no specific file or table is mentioned, identify the datasets that are used in the query, and read the catalog entries (dlentries) for those datasets to identify the files and/or tables that they use.

  • To check on a database table, exec into the postgres container and use psql. If the table doesn’t exist, run the sql script that creates it, or add a creation statement to the sql scripts if there isn’t one yet. If the table exists, check that the readonlyaccess role has select permission for it.
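
    A sketch of such a check, assuming the container is named datalib_postgres_1, the database superuser is postgres, and the table in question is some_table (adjust the names, and add -d with the right database, to match your installation):

    # show access privileges for the table
    sudo docker exec -it datalib_postgres_1 psql -U postgres -c '\dp some_table'
    # grant read access to the readonlyaccess role if it is missing
    sudo docker exec -it datalib_postgres_1 psql -U postgres -c 'GRANT SELECT ON some_table TO readonlyaccess;'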