Installation
This section provides instructions for installing the Data Library software on a server.
Installation of the Data Library software is automated using ansible, a configuration management tool.
Prepare the server
Before running the installation script, perform the following steps to prepare the server:
Install CentOS Stream 9 from https://www.centos.org/centos-stream/#download. Download the ISO that matches your server's architecture.
For more information on this choice of operating system, see Infrastructure.
To create a bootable CD of the Installation ISO, follow these instructions.
Installation Notes
Install as Server with GUI to make it easier to maintain.
When configuring your disk partitions:
Use LVM when creating the filesystems. This is the default.
Don’t modify /boot or /boot/efi.
Increase the root partition to 100 GB.
Create a /data partition large enough to hold all of your data now and in the future.
Leave 10% of the disk unallocated. This will allow you to create snapshots or increase the size of partitions later if necessary.
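Once the system is installed, the 10% guideline can be checked against what LVM reports. The following is a minimal sketch rather than part of the official procedure; the helper name has_free_headroom is hypothetical, and it only compares two numbers that you supply in the same unit.

```shell
# Succeeds when at least 10% of a volume group is still unallocated.
# Arguments: total size and free size, in the same unit (decimals allowed).
has_free_headroom() {
    awk -v total="$1" -v free="$2" 'BEGIN { exit !(free * 10 >= total) }'
}

# Example on the installed server (vgs ships with LVM and needs root):
#   sudo vgs --noheadings --units b --nosuffix -o vg_size,vg_free |
#       while read -r size free; do
#           has_free_headroom "$size" "$free" || echo "less than 10% unallocated"
#       done
```

Reporting in bytes keeps the values exact, so the comparison does not depend on how vgs rounds larger units.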
Create a user account for the system administrator who will be performing the installation. Make that user a member of the wheel group so that they will be able to run commands as root using sudo.
Install git and python3:
sudo yum install -y git python3 python3-pip
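Before moving on, it can help to confirm that the administrator account and the packages are in place. The sketch below assumes it is run as the administrator account; the function names in_group and have_commands are hypothetical helpers, not part of any installed tool.

```shell
# Succeeds if the named user belongs to the named group.
in_group() {
    for g in $(id -nG "$1"); do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# Succeeds only if every named command is found on PATH.
have_commands() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || return 1
    done
}

in_group "$(id -un)" wheel || echo "current user is not in the wheel group"
have_commands git python3 || echo "git or python3 is missing"
```

Group membership changes normally take effect at the next login, so log out and back in before re-running the check if the account was only just added to wheel.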
Install ansible with pip. The --user option causes it to be installed in ~/.local, to avoid interfering with Python packages provided by CentOS.
# Note: do not use sudo here.
python3 -m pip install --user --upgrade pip
python3 -m pip install --user ansible==4.5.0
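Because a --user installation places its command-line scripts under ~/.local/bin, the ansible commands are only found if that directory is on PATH. The following sketch checks for that; path_contains is a hypothetical helper name, and the suggestion to edit ~/.bash_profile is an assumption about your shell setup.

```shell
# Succeeds if the given directory is one of the entries in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_contains "$HOME/.local/bin"; then
    echo "$HOME/.local/bin is on PATH"
else
    echo "add $HOME/.local/bin to PATH, e.g. in ~/.bash_profile"
fi

# With PATH in order, the installed package can be inspected with:
#   python3 -m pip show ansible
# Note that "ansible --version" reports the ansible-core release,
# which is numbered differently from the ansible package.
```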
Disable SELinux and reboot:
sudo sed -i s/SELINUX=enforcing/SELINUX=disabled/ /etc/selinux/config
sudo shutdown -r now
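The sed command above only rewrites a line that reads SELINUX=enforcing; if the file was set to permissive, it is left unchanged. A quick way to see what the file now says is sketched below. The function name selinux_config_mode is hypothetical, and the optional argument exists only so the function can be tried on a copy of the file.

```shell
# Prints the SELINUX= value from an SELinux config file
# (default: /etc/selinux/config), ignoring comments and SELINUXTYPE=.
selinux_config_mode() {
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

# After the reboot, on the server:
#   selinux_config_mode    # should print: disabled
#   getenforce             # reports the mode the running system is in
```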
Note
If you feel SELinux is important to the security of your server, please start a conversation with us at help@iri.columbia.edu.
Create a configuration repository
Log back into the server after rebooting. Create a git repository to track your Data Library configuration. At IRI we call ours dlconfig.
mkdir dlconfig
cd dlconfig
git init
Inside the new git repository, install the IRIDL ansible collection and dependencies:
ansible-galaxy collection install \
-p . \
git+https://github.com/iridl/iridl-ansible.git \
community.docker
The previous command should have downloaded the collection to a subdirectory called ansible_collections. Add and commit that directory to your git repository, to ensure that you will use the same version of the collection every time you run the playbook.
git add ansible_collections
git commit -m "add iridl ansible collection"
Copy template configuration files from the collection to the top level of the repository:
cp ansible_collections/iridl/iridl/example/* .
Move secrets.yaml out of the git repository. For security reasons, unencrypted secrets should not be committed to version control.
mv secrets.yaml ..
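Since a stray copy of secrets.yaml inside the repository could be committed by accident, a simple location check may be worthwhile before the first commit of your configuration. This is a sketch only; secrets_outside_repo is a hypothetical helper, and it assumes the layout described above, with secrets.yaml one level above the repository.

```shell
# Succeeds when secrets.yaml sits next to the repository, not inside it.
# Argument: path to the configuration repository (default: current directory).
secrets_outside_repo() {
    repo="${1:-.}"
    [ ! -e "$repo/secrets.yaml" ] && [ -f "$repo/../secrets.yaml" ]
}

# Run from the top level of the configuration repository:
#   secrets_outside_repo || echo "check the location of secrets.yaml"
```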
Modify playbook.yaml and ../secrets.yaml to customize them to the specifics of your site. The files that you copied contain example configuration values that should be replaced with real email addresses, usernames, etc. The files include comments that explain the purpose of each configuration option. If you are not ready to set up your real Data Library server but merely want to practice the installation process, e.g. in a virtual machine, you can use the example files without modification.
Commit your customizations and push them to your git server for safe keeping; back up secrets.yaml by other means, such as copying it to another machine.
Never edit the contents of the ansible_collections directory. All customizations should be made in the configuration files that you copied from the template. In the future, when it comes time to upgrade to a newer version of the DL software, you will run the ansible-galaxy command again and commit the new version to your configuration repository.
Don’t upgrade without checking the release notes first, because in some cases an upgrade may require manual migration steps. (At this writing, there are no upgrade release notes because this is the playbook’s initial release.)
Run the ansible playbook
Now we are ready to run the playbook, which will download, configure, and install the Data Library software using the parameters you defined in the configuration files.
From the root directory of the configuration repository, run the following command:
ansible-playbook \
--ask-become-pass \
-i inventory.cfg \
-e @../secrets.yaml \
-e run_update_script=yes \
playbook.yaml
Each step of the installation will be printed to the terminal. At a site with a fast connection to the internet, the playbook generally finishes within ten minutes, but if bandwidth is limited it may take a few hours, as the installation process involves downloading several GB of software packages and container images.
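Because a run can take hours on a slow connection, it may be worth confirming beforehand that the three files named in the command above are where the command expects them. The following is a minimal sketch; playbook_inputs_present is a hypothetical helper, and the optional syntax check uses the standard --syntax-check option of ansible-playbook, which parses the playbook without executing it.

```shell
# Succeeds if the files named in the ansible-playbook command exist,
# relative to the given repository directory (default: current directory).
playbook_inputs_present() {
    repo="${1:-.}"
    status=0
    for f in "$repo/playbook.yaml" "$repo/inventory.cfg" "$repo/../secrets.yaml"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# From the root of the configuration repository, before the real run:
#   playbook_inputs_present && ansible-playbook --syntax-check \
#       -i inventory.cfg -e @../secrets.yaml playbook.yaml
```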
You should now be able to visit your Data Library server in a browser, but the maprooms are not yet functional because the data that underlies them has yet to be installed.
Install datasets
Among other things, the ansible playbook has created structures (directories, groups, a database, and permissions) to support the installation of datasets. You can now install your data as described in Installing data. A member of the IRI staff will typically be involved in this process, as it may involve copying large amounts of data from an IRI server to yours.
Use your new Data Library
You should now be able to visit your Data Library server in a browser. For next steps, see the Maintenance page of the current guide, and the User Guide.