by Peter Winkler & Norman Koch
What is OmniOpt?
- A tool for hyperparameter optimization when you work with neural networks and big data.
- OmniOpt is applicable to a broad class of problems, from classical simulations to neural networks.
- OmniOpt is robust: it checks and installs all dependencies automatically and fixes many problems in the background without the user even noticing that they have occurred.
- While OmniOpt is optimizing, no further user intervention is required. You can follow the current stdout live in the console.
- OmniOpt's overhead is minimal and virtually imperceptible.
What can you use it for?
- Classical simulation methods as well as neural networks have a large number of hyperparameters that significantly determine the accuracy, efficiency, and transferability of these methods.
- In classical simulations, the hyperparameters are usually determined by fitting to measured values.
- In neural networks, the hyperparameters determine the network architecture: number and type of layers, number of neurons, activation functions, measures against overfitting, etc.
- The most common methods for determining hyperparameters are intuitive testing, grid search, and random search (contrasted in the sketch below).
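To make the contrast concrete, here is a minimal, self-contained Python sketch (not part of OmniOpt) of grid search versus random search; the toy objective and the value ranges are invented purely for illustration.

```python
# Toy comparison of grid search and random search over two hyperparameters.
import itertools
import random

def objective(lr, batch_size):
    # Stand-in for a real training run that would return a validation loss.
    return (lr - 0.01) ** 2 + (batch_size - 64) ** 2 / 1e4

# Grid search: evaluate every combination of a fixed set of candidate values.
grid = itertools.product([0.001, 0.01, 0.1], [32, 64, 128])
best_grid = min(grid, key=lambda p: objective(*p))

# Random search: the same budget of trials, sampled from continuous ranges.
random.seed(0)
trials = [(random.uniform(0.001, 0.1), random.randint(32, 128)) for _ in range(9)]
best_random = min(trials, key=lambda p: objective(*p))

print("grid search best:  ", best_grid)
print("random search best:", best_random)
```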
How does it work?
- The hyperparameters are determined with a parallelizable stochastic minimization algorithm, the Tree-structured Parzen Estimator (TPE; see the reference below), using the GPUs of the HPC system Taurus.
- The user has to provide their application, which can be either a neural network or a classical simulation, as a black box, together with a target data set representing the optimal result.
- In an .ini file, the hyperparameters to be optimized (e.g., number of epochs, number of hidden layers, batch sizes, ...) are defined and their limits (minimum, maximum) are specified; a generator for this config file is provided. A sketch of such a file follows this list.
- The number of hyperparameters is in principle arbitrary. In practice, up to ten parameters are currently recommended
(further tests required).
- In each optimization step, the Bayesian stochastic optimization algorithm TPE evaluates the objective function for a set of parameter combinations sampled from distributions over the parameter space (see the standalone TPE sketch after this list).
- The user must provide a version of their program that reads the values of the hyperparameters to be optimized and outputs the value of the objective function (see the example script after this list).
- For neural networks, either the loss or another (more descriptive) quantity (e.g., the F1 score) can be used as the objective function.
- The parallelization and distribution of the calculations across the Taurus GPUs are handled automatically.
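The exact key names and layout of OmniOpt's .ini format are not given in this text, so the following is a hypothetical sketch of what such a config might contain; the section names, keys, and ranges are illustrative assumptions, read here with Python's standard configparser.

```python
# Hypothetical sketch of a config defining hyperparameters and their limits.
# The real OmniOpt .ini schema may use different section and key names.
import configparser

CONFIG = """
[epochs]
min = 10
max = 200

[hidden_layers]
min = 1
max = 8

[batch_size]
min = 16
max = 256
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG)

# Each section names one hyperparameter; min/max give its search limits.
for name in parser.sections():
    lo, hi = parser.getint(name, "min"), parser.getint(name, "max")
    print(f"{name}: search in [{lo}, {hi}]")
```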
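How OmniOpt passes parameter values to the black box and collects the result is likewise not specified here; the sketch below assumes, purely for illustration, that the hyperparameters arrive as command-line flags and that the objective value is printed to stdout.

```python
# Hypothetical black-box script: reads hyperparameter values and outputs
# the objective. The actual calling convention OmniOpt uses may differ.
import argparse

def train_and_evaluate(epochs: int, hidden_layers: int, batch_size: int) -> float:
    # Stand-in for a real training run; return the quantity to minimize
    # (e.g., validation loss, or 1 - F1 score if F1 is to be maximized).
    return 1.0 / (epochs * hidden_layers) + batch_size * 1e-4

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--epochs", type=int, required=True)
    parser.add_argument("--hidden_layers", type=int, required=True)
    parser.add_argument("--batch_size", type=int, required=True)
    args = parser.parse_args()
    loss = train_and_evaluate(args.epochs, args.hidden_layers, args.batch_size)
    print(loss)  # the optimizer reads this value as the objective
```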
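To illustrate the optimization loop itself, here is a standalone TPE run using the hyperopt library, which implements the TPE algorithm described in the Bergstra et al. paper cited below. This shows the algorithm in isolation and says nothing about OmniOpt's internal implementation; the search space and toy objective are invented for the example.

```python
# Standalone TPE example with hyperopt: per step, TPE proposes new
# candidate points based on all previously evaluated trials.
import math
from hyperopt import fmin, tpe, hp

def objective(params):
    # Stand-in for one training run; must return the value to minimize.
    epochs = int(params["epochs"])
    lr = params["lr"]
    return (math.log10(lr) + 2) ** 2 + 1.0 / epochs

space = {
    "epochs": hp.quniform("epochs", 10, 200, 1),             # integer-valued range
    "lr": hp.loguniform("lr", math.log(1e-4), math.log(1e-1)),  # log-scale range
}

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)
```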
How can you use it?
More questions?
Bergstra, J., Yamins, D., & Cox, D. D. (2013). Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proc. of the 30th International Conference on Machine Learning (ICML 2013), June 2013, pp. I-115 to I-123.