Command Line Syntax¶
This library comes with a powerful command line syntax that makes it easy to change complex configuration options in a precise fashion.
Lists and dictionaries¶
Lists and dictionaries are written at the command line in Python form:
Example list:
--splits_to_eval [train, val, test]
Example nested dictionary:
--mining_funcs {tuple_miner: {MultiSimilarityMiner: {epsilon: 0.1}}}
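The library handles the parsing itself, but these strings also happen to be valid YAML flow syntax, so PyYAML offers a quick way to visualize the structures they describe. The snippet below is for illustration only and is not necessarily how the library parses its arguments.

# Illustration only: inspect the structures the command-line strings describe.
# This is not the library's parser; the strings just happen to be valid YAML.
import yaml

print(yaml.safe_load("[train, val, test]"))
# ['train', 'val', 'test']
print(yaml.safe_load("{tuple_miner: {MultiSimilarityMiner: {epsilon: 0.1}}}"))
# {'tuple_miner': {'MultiSimilarityMiner': {'epsilon': 0.1}}}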
Merge¶
Consider the following optimizer configuration.
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
At the command line, we can change lr to 0.01, and add alpha = 0.95 to the RMSprop parameters:
--optimizers {trunk_optimizer: {RMSprop: {lr: 0.01, alpha: 0.95}}}
So in effect, the config file now looks like this:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.01
      alpha: 0.95
In other words, we specify a dictionary at the command line using Python dictionary syntax. This dictionary is then merged into the one specified in the config file. Thus, adding keys is very straightforward:
--optimizers {embedder_optimizer: {Adam: {lr: 0.01}}}
Now the config file includes a specification for embedder_optimizer:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
  embedder_optimizer:
    Adam:
      lr: 0.01
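To make the merge behavior concrete, here is a minimal sketch of a recursive dictionary merge. It is for illustration only and is not the library's actual implementation; it just shows how existing keys get updated and new keys get added.

# A minimal sketch of recursive merging (illustration only, not the library's code).
def merge(base, update):
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)          # recurse into nested dictionaries
        else:
            base[key] = value                # update or add the value
    return base

config = {"optimizers": {"trunk_optimizer": {"RMSprop": {"lr": 0.000001}}}}
merge(config["optimizers"], {"trunk_optimizer": {"RMSprop": {"lr": 0.01, "alpha": 0.95}}})
merge(config["optimizers"], {"embedder_optimizer": {"Adam": {"lr": 0.01}}})
# config["optimizers"] now has the updated RMSprop settings for trunk_optimizer,
# plus the newly added embedder_optimizer.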
But what happens if we try to set the trunk_optimizer to Adam?
--optimizers {trunk_optimizer: {Adam: {lr: 0.01}}}
Now there's a problem with the config file, because two optimizer types are specified for a single optimizer:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
    Adam:
      lr: 0.01
How can we get around this? By using the Override syntax.
Override¶
Overriding simple options requires no special syntax. For example, the following will change save_interval from its default value of 2 to 5:
--save_interval 5
However, for complex options (i.e. nested dictionaries) the ~OVERRIDE~ flag is required to avoid merges. Let's consider the same optimizer config file from above:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
To instead use Adam with lr = 0.01:
--optimizers~OVERRIDE~ {trunk_optimizer: {Adam: {lr: 0.01}}}
Now the config file looks like this:
optimizers:
  trunk_optimizer:
    Adam:
      lr: 0.01
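As a rough illustration (not the library's code), attaching ~OVERRIDE~ to the flag itself can be thought of as plain replacement of the option's value, rather than a merge:

# Illustration only: flag-level ~OVERRIDE~ behaves like assignment, not merging.
config = {"optimizers": {"trunk_optimizer": {"RMSprop": {"lr": 0.000001}}}}
config["optimizers"] = {"trunk_optimizer": {"Adam": {"lr": 0.01}}}
# The old RMSprop entry is gone entirely.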
The ~OVERRIDE~ flag can be used at any level of the dictionary, which comes in handy for more complex config options. Consider this config file:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
  embedder_optimizer:
    Adam:
      lr: 0.01
We can make trunk_optimizer use Adam, but leave embedder_optimizer unchanged, by applying the ~OVERRIDE~ flag to trunk_optimizer:
--optimizers {trunk_optimizer~OVERRIDE~: {Adam: {lr: 0.01}}}
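A sketch of how a key tagged with ~OVERRIDE~ could be handled (illustration only, not the library's implementation): the tagged key's value replaces whatever is in the config, while untagged keys are merged as usual.

# Illustration only: keys tagged with ~OVERRIDE~ replace instead of merging.
def merge_with_override(base, update):
    for key, value in update.items():
        if key.endswith("~OVERRIDE~"):
            base[key.replace("~OVERRIDE~", "")] = value   # replace outright
        elif isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_with_override(base[key], value)
        else:
            base[key] = value
    return base

optimizers = {
    "trunk_optimizer": {"RMSprop": {"lr": 0.000001}},
    "embedder_optimizer": {"Adam": {"lr": 0.01}},
}
merge_with_override(optimizers, {"trunk_optimizer~OVERRIDE~": {"Adam": {"lr": 0.01}}})
# trunk_optimizer is now {"Adam": {"lr": 0.01}}; embedder_optimizer is untouched.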
Apply¶
Sometimes the merging and override capabilities don't offer enough flexibility. Consider this config file:
trainer:
  MetricLossOnly:
    dataloader_num_workers: 2
    batch_size: 32
If we want to change the batch size using merging:
--trainer {MetricLossOnly: {batch_size: 256}}
There are two problems with this:
- It's verbose. We only wanted to change batch_size, but we had to write out the name of the trainer.
- It requires knowledge of the trainer that is being used.
So instead, we can use the ~APPLY~ flag:
--trainer~APPLY~2 {batch_size: 256}
This syntax means that {batch_size: 256} will be applied to (i.e. merged into) all dictionaries at a depth of 2. So the trainer config now looks like:
trainer:
  MetricLossOnly:
    dataloader_num_workers: 2
    batch_size: 256
Here's another example with optimizers. The starting configuration looks like:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.000001
  embedder_optimizer:
    Adam:
      lr: 0.01
We can set both learning rates to 0.005:
--optimizers~APPLY~3 {lr: 0.005}
The new config file looks like:
optimizers:
  trunk_optimizer:
    RMSprop:
      lr: 0.005
  embedder_optimizer:
    Adam:
      lr: 0.005
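The ~APPLY~ idea can be sketched as follows. This is illustration only, not the library's code; the depth counting assumed here (the option's value is depth 1) matches the two examples above.

# Illustration only: merge an update into every dictionary at the given depth.
def apply_at_depth(node, depth, update):
    if depth == 1:
        node.update(update)                  # merge at the target depth
    else:
        for value in node.values():
            if isinstance(value, dict):
                apply_at_depth(value, depth - 1, update)

optimizers = {
    "trunk_optimizer": {"RMSprop": {"lr": 0.000001}},
    "embedder_optimizer": {"Adam": {"lr": 0.01}},
}
apply_at_depth(optimizers, 3, {"lr": 0.005})
# Both optimizers now have lr = 0.005.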
Swap¶
Consider the trainer config file again, but with more of its parameters listed:
trainer:
  MetricLossOnly:
    iterations_per_epoch: 100
    dataloader_num_workers: 2
    batch_size: 32
    freeze_trunk_batchnorm: True
    label_hierarchy_level: 0
    loss_weights: null
    set_min_label_to_zero: True
Let's say you write your own custom trainer, and it has the same set of initialization parameters. One way to use your custom trainer is to use the ~OVERRIDE~ flag:
--trainer~OVERRIDE~ {YourCustomTrainer: {iterations_per_epoch: 100, \
dataloader_num_workers: 2, \
batch_size: 32, \
freeze_trunk_batchnorm: True, \
label_hierarchy_level: 0, \
loss_weights: null, \
set_min_label_to_zero: True}}
Again, this is very verbose, considering we only wanted to change the trainer type. So instead, we can use the ~SWAP~ flag:
--trainer~SWAP~1 {YourCustomTrainer: {}}
This goes to a dictionary depth of 1, and swaps the only key, MetricLossOnly, with YourCustomTrainer, while leaving everything else unchanged. Now the config file looks like:
trainer:
  YourCustomTrainer:
    iterations_per_epoch: 100
    dataloader_num_workers: 2
    batch_size: 32
    freeze_trunk_batchnorm: True
    label_hierarchy_level: 0
    loss_weights: null
    set_min_label_to_zero: True
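A sketch of the ~SWAP~ idea (illustration only, not the library's implementation): walk to the requested depth and rename a key while keeping the nested parameters it points to. For the single-key trainer case above, the rename might look like this (the trainer parameters are abbreviated):

# Illustration only: rename a key at the given depth, keeping its value.
def swap_at_depth(node, depth, old_key, new_key):
    if depth == 1:
        if old_key in node:
            node[new_key] = node.pop(old_key)   # keep the nested parameter dict
    else:
        for value in node.values():
            if isinstance(value, dict):
                swap_at_depth(value, depth - 1, old_key, new_key)

trainer = {"MetricLossOnly": {"iterations_per_epoch": 100, "batch_size": 32}}  # abbreviated
swap_at_depth(trainer, 1, "MetricLossOnly", "YourCustomTrainer")
# trainer == {"YourCustomTrainer": {"iterations_per_epoch": 100, "batch_size": 32}}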
What if there are multiple keys at the specified depth? For example, consider this configuration for data transforms:
transforms:
  train:
    Resize:
      size: 256
    RandomResizedCrop:
      scale: 0.16 1
      ratio: 0.75 1.33
      size: 227
    RandomHorizontalFlip:
      p: 0.5
If we want to swap RandomHorizontalFlip out for RandomVerticalFlip, we need to explicitly indicate the mapping, because there are two other keys that could be swapped out (Resize and RandomResizedCrop):
--transforms~SWAP~2 {RandomHorizontalFlip: RandomVerticalFlip}
The new config file contains RandomVerticalFlip in place of RandomHorizontalFlip:
transforms:
  train:
    Resize:
      size: 256
    RandomResizedCrop:
      scale: 0.16 1
      ratio: 0.75 1.33
      size: 227
    RandomVerticalFlip:
      p: 0.5
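In terms of the sketch above, the explicit mapping simply supplies both the old and the new key names:

# Continuing the illustration: at depth 2 of the transforms config, rename
# RandomHorizontalFlip to RandomVerticalFlip and leave the other keys alone.
transforms = {
    "train": {
        "Resize": {"size": 256},
        "RandomResizedCrop": {"size": 227},      # other parameters omitted here
        "RandomHorizontalFlip": {"p": 0.5},
    }
}
swap_at_depth(transforms, 2, "RandomHorizontalFlip", "RandomVerticalFlip")
# RandomVerticalFlip now holds {"p": 0.5}.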
Delete¶
Consider this models config file:
models:
  trunk:
    bninception:
      pretrained: imagenet
  embedder:
    MLP:
      layer_sizes:
        - 128
Let's replace embedder with Identity(), which is essentially an empty PyTorch module:
--models {embedder~OVERRIDE~: {Identity: {}}}
But because embedder has no optimizable parameters, we need to get rid of the embedder_optimizer that is specified in the default config file. We can do this easily with the ~DELETE~ flag:
--optimizers {embedder_optimizer~DELETE~: {}}
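A sketch of the ~DELETE~ idea in the same style (illustration only, not the library's implementation): a key tagged with ~DELETE~ in the command-line dictionary removes that key from the config instead of merging anything into it.

# Illustration only: keys tagged with ~DELETE~ are removed from the config.
def merge_with_delete(base, update):
    for key, value in update.items():
        if key.endswith("~DELETE~"):
            base.pop(key.replace("~DELETE~", ""), None)   # drop the key entirely
        elif isinstance(value, dict) and isinstance(base.get(key), dict):
            merge_with_delete(base[key], value)
        else:
            base[key] = value
    return base

optimizers = {
    "trunk_optimizer": {"RMSprop": {"lr": 0.000001}},
    "embedder_optimizer": {"Adam": {"lr": 0.01}},
}
merge_with_delete(optimizers, {"embedder_optimizer~DELETE~": {}})
# Only trunk_optimizer remains.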