Testing
=======

We use `pytest`_ for the test suite. The dependency is already listed in the ``requirements.txt`` file and should be installed automatically with aweSOM.

Functionality tests
-------------------

Run the tests for all modules from the root directory of the repository:

.. code-block:: bash

    python -m pytest

You can also run a specific test module by giving the path to the test file:

.. code-block:: bash

    python -m pytest tests/[module]_test.py

Or run a specific test function within a module:

.. code-block:: bash

    python -m pytest tests/[module]_test.py::test_[function]

If there is no GPU, or if the GPU is not CUDA-compatible, some tests in the ``sce_test.py`` module will fail. This is expected behavior, and the SCE computation should still fall back to the CPU.

Performance tests
-----------------

aweSOM includes many additional features compared to the original implementations of POPSOM and ensemble learning. It is therefore not possible to compare the performance of aweSOM directly with these legacy packages. However, we can still make a rough comparison by mimicking the original implementations of ensemble learning and POPSOM in aweSOM, and then benchmarking their performance.

First, install the dependencies for these legacy packages:

.. code-block:: bash

    pip install -r tests/performance/requirements.txt

Then run the performance tests inside the ``aweSOM/tests/performance/`` directory.

Benchmarking aweSOM against POPSOM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

    python popsom_bench.py --N 10000 --F 4

where ``N`` is the number of points and ``F`` is the number of features. The script trains a POPSOM map and an aweSOM map on the same mock dataset and compares the training time of the two algorithms.
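As a rough illustration of this kind of comparison, the sketch below times two callables on the same mock dataset and reports the ratio of their run times. It is not taken from ``popsom_bench.py``: ``make_mock_data``, ``time_call``, ``compare``, and the two stub functions are hypothetical names, and the stubs merely stand in for the POPSOM and aweSOM training calls.

.. code-block:: python

    import random
    import time


    def make_mock_data(n_points, n_features, seed=0):
        """Return n_points rows of n_features Gaussian random values."""
        rng = random.Random(seed)
        return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_points)]


    def time_call(func, data):
        """Return the wall-clock time in seconds taken by func(data)."""
        start = time.perf_counter()
        func(data)
        return time.perf_counter() - start


    def compare(reference, candidate, data):
        """Time both callables on the same data; return (t_ref, t_cand, speedup)."""
        t_ref = time_call(reference, data)
        t_cand = time_call(candidate, data)
        # Guard against a zero denominator on very coarse timers.
        speedup = t_ref / t_cand if t_cand > 0 else float("inf")
        return t_ref, t_cand, speedup


    # Hypothetical stand-ins for the two training routines being compared.
    def reference_stub(data):
        total = 0.0
        for row in data:
            for value in row:
                total += value
        return total


    def candidate_stub(data):
        return sum(sum(row) for row in data)


    if __name__ == "__main__":
        data = make_mock_data(n_points=10_000, n_features=4)
        t_ref, t_cand, speedup = compare(reference_stub, candidate_stub, data)
        print(f"reference: {t_ref:.4f} s, candidate: {t_cand:.4f} s, "
              f"speedup: {speedup:.1f}x")

Passing the same ``data`` object to both callables keeps the comparison fair, since neither routine benefits from a different random draw.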
``popsom_bench.py`` also accepts the following high-level controls: ``--nodes`` to specify the number of nodes in the lattice, which can be useful for isolated scaling tests; ``--procedure [training, mapping, both]`` to specify which part of the algorithm to benchmark; and ``--popsom`` or ``--awesom`` to benchmark only one of the two algorithms.

If you are running a long-duration test that requires dedicated node(s), you can refer to ``examples/slurm_scripts/submit_popsom_bench.cpu`` for an example SLURM script to run this benchmark.

For a personal computer, we recommend using a smaller number of points (:math:`N \sim 10^4`) and features (:math:`F < 5`) so that the test completes in a reasonable amount of time. More extensive tests can be run on a high-performance computing cluster. For example, one modern compute node with 40+ cores can run this benchmark with up to :math:`10^6` points and :math:`\sim 20` features. POPSOM generally cannot handle more than :math:`10^6` points, since training time can exceed 2 hours at these parameters and/or an out-of-memory error will be raised (even with 1 TB of memory per node).

The expected performance of aweSOM is a speedup of :math:`\sim 8\text{--}20 \times` compared to POPSOM, depending on the number of points and features.

Benchmarking aweSOM against ensemble learning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

    python sce_bench.py --N 100000 --R 20

where ``N`` is the number of points and ``R`` is the number of independent realizations. The script generates mock cluster IDs for the dataset and saves them as ``npy`` files. It then performs the SCE analysis on the dataset using both the aweSOM and legacy implementations, and compares the run time of the two implementations.

Additional high-level controls include ``--C`` to specify the number of clusters per realization, and ``--legacy`` or ``--awesom`` to benchmark only one of the two implementations.
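For scaling tests it can be convenient to assemble the ``sce_bench.py`` command lines programmatically. The following is a minimal sketch that uses only the flags documented in this section. The helper name ``build_sce_command`` is hypothetical, and the sketch assumes that ``--C`` takes an integer value and that the commands are launched from ``aweSOM/tests/performance/``.

.. code-block:: python

    import shlex
    import sys


    def build_sce_command(n_points, n_realizations, n_clusters=None,
                          implementation=None):
        """Return the argument list for one sce_bench.py run.

        implementation may be None (benchmark both), "legacy", or "awesom".
        """
        cmd = [sys.executable, "sce_bench.py",
               "--N", str(n_points), "--R", str(n_realizations)]
        if n_clusters is not None:
            # Assumption: --C expects an integer number of clusters.
            cmd += ["--C", str(n_clusters)]
        if implementation is not None:
            if implementation not in ("legacy", "awesom"):
                raise ValueError("implementation must be 'legacy' or 'awesom'")
            cmd.append(f"--{implementation}")
        return cmd


    if __name__ == "__main__":
        # Print one command per dataset size instead of running it.
        for n_points in (10**4, 10**5, 10**6):
            cmd = build_sce_command(n_points, 20, implementation="awesom")
            print(shlex.join(cmd))

Each returned list can be passed directly to ``subprocess.run`` once the printed commands look right.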
NOTE: If the test does not complete successfully, a directory named ``som_out`` will be left in the current working directory. It should be cleaned up manually.

If you are running a long-duration test that requires dedicated node(s), you can refer to ``examples/slurm_scripts/submit_sce_bench.cpu`` and ``examples/slurm_scripts/submit_sce_bench.gpu`` for example SLURM scripts to run this benchmark.

In general, the NumPy version of aweSOM is around :math:`2 \times` faster than the legacy implementation. For small datasets (:math:`N < 5\times10^4`), however, the GPU version of aweSOM is slower than the legacy implementation because of overhead. The GPU version is only faster for large datasets (:math:`N > 10^5`), and its advantage grows rapidly as you scale up beyond :math:`N \sim 10^6`.

We tested the performance of the SCE implementation on a single NVIDIA V100 GPU with 32 GB of memory. At :math:`N = 10^6` and :math:`R = 10`, aweSOM is faster than the legacy implementation by a factor of :math:`\sim 15`. At :math:`N = 10^7` and :math:`R = 10`, aweSOM is faster by a factor of :math:`\sim 60`. In high-resolution simulations where :math:`L^3 \gtrsim 500^3` (:math:`N \sim 10^8`), aweSOM is the only feasible option for performing the SCE analysis.

.. _pytest: https://docs.pytest.org/en/stable/