This guide explains how to set up the environment and run experiments for learning with interval-based targets.
1. **Create a virtual environment (recommended):**

   ```shell
   python -m venv venv
   source venv/bin/activate
   ```

2. **Install dependencies.** Install the required packages using the `requirements.txt` file:

   ```shell
   pip install -r requirements.txt
   ```
Run `main.py` from your terminal with your chosen arguments. The script automatically performs a grid search over any argument that is given multiple values.
Example: this command tests two learning rates on the abalone dataset.

```shell
python main.py \
    --data_names "abalone" \
    --method "projection" \
    --lr 0.01 0.001 \
    --num_epochs 100 \
    --name "abalone_test_run"
```

The results will be saved to `abalone_test_run.csv`.
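The cross-product expansion behind the grid search can be sketched as follows (an illustrative sketch, not the script's actual code; `expand_grid` is a hypothetical helper):

```python
import itertools

def expand_grid(args):
    """Expand any argument given multiple values into a list of
    single-valued configurations (a full cross-product)."""
    keys = list(args)
    combos = itertools.product(*(args[k] for k in keys))
    return [dict(zip(keys, combo)) for combo in combos]

# The example command above corresponds to two runs,
# one per learning rate:
configs = expand_grid({"lr": [0.01, 0.001], "num_epochs": [100]})
```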
The --method argument controls the core training strategy.
- `projection`: the model is only penalized if its prediction falls outside the target interval. If the prediction is inside, the loss is zero.
- `minmax`: the loss is calculated against the worst-case label within the interval. It can be shown that this worst-case label will always be one of the interval's two endpoints.
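The two strategies can be sketched as follows for a single prediction and an interval `[lo, hi]` (a minimal sketch assuming a squared base loss; the repository's implementation may differ):

```python
import numpy as np

def projection_loss(pred, lo, hi):
    # Zero inside the interval; squared distance to the
    # nearest endpoint outside it.
    return np.square(np.maximum(lo - pred, 0.0) + np.maximum(pred - hi, 0.0))

def minmax_loss(pred, lo, hi):
    # Worst-case squared loss over labels in [lo, hi]; the
    # maximizing label is always one of the two endpoints.
    return np.maximum(np.square(pred - lo), np.square(pred - hi))
```

For example, a prediction of 0.5 inside the interval [0, 1] incurs zero projection loss but a nonzero minmax loss, since the worst-case endpoint is at distance 0.5.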
These methods use a two-stage process:
1. **Generate pseudo-labels:** first, multiple models are trained using the `projection` loss. Their predictions on the training set become the "pseudo-labels." This approach incorporates the smoothness of the model class into the process of defining the target region.
2. **Train the final model:** a final model is then trained on these pseudo-labels using one of the following aggregation strategies:
   - `minmax_pl_mean`: minimizes the average loss calculated against all pseudo-labels.
   - `minmax_pl_max`: minimizes the maximum loss calculated against any of the pseudo-labels.
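For a single example with pseudo-labels from several first-stage models, the two aggregations can be sketched as (again assuming a squared base loss; the function names here are illustrative):

```python
import numpy as np

def pl_mean_loss(pred, pseudo_labels):
    # minmax_pl_mean: average squared loss over all pseudo-labels.
    return np.mean([(pred - y) ** 2 for y in pseudo_labels])

def pl_max_loss(pred, pseudo_labels):
    # minmax_pl_max: worst-case squared loss over the pseudo-labels.
    return np.max([(pred - y) ** 2 for y in pseudo_labels])
```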
To use a Lipschitz-constrained MLP (LipMLP) instead of a standard MLP, add the `--use_lip_mlp` flag, specifying the method it should apply to. This flag is currently supported for `projection` and all PL methods (but not `minmax`).
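One common way to bound a layer's Lipschitz constant is to rescale its weight matrix against a learnable per-layer bound. The sketch below shows this construction in isolation; the repository's actual LipMLP scheme may differ, and `lipschitz_normalize` is a hypothetical helper:

```python
import numpy as np

def lipschitz_normalize(W, c):
    """Rescale rows of W so the layer's inf-norm Lipschitz constant
    is at most softplus(c). softplus keeps the learnable bound
    positive; rows already under the bound are left unchanged."""
    softplus_c = np.log1p(np.exp(c))
    row_sums = np.abs(W).sum(axis=1)
    scale = np.minimum(1.0, softplus_c / row_sums)
    return W * scale[:, None]
```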
Example: the following command runs the projection loss with a LipMLP.

```shell
python main.py \
    --data_names "abalone" \
    --method "projection" \
    --lr 0.01 0.001 \
    --num_epochs 100 \
    --use_lip_mlp "projection" \
    --name "abalone_lipmlp_run"
```

For a dataset named `my_data`, place the file at `dataset/my_data/my_data.arff`. The script will find it automatically, assuming the target label is the first column. For example, you could download a `wine_quality` dataset.
For CSV files, the recommended approach is to register the dataset explicitly: add a new entry to the `DATASET_CONFIGS` dictionary in `main.py`. For guidance, refer to the existing configuration for the abalone dataset. Download the abalone dataset and place the `abalone` folder inside your main `dataset` folder.
Example entry:

```python
# In main.py, inside the DATASET_CONFIGS dictionary
"new_dataset": {
    "path": "dataset/new_dataset/data.csv",
    "loader_func": pd.read_csv,
    "target_col": "name_of_target_column",
    "drop_cols": ["id_column_to_drop"],
},
```

You can now use `--data_names "new_dataset"` in your commands.
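A config entry of this shape could be consumed roughly as follows (an illustrative sketch; `load_dataset` is a hypothetical helper and the script's actual loader may differ):

```python
import pandas as pd

def load_dataset(config):
    """Load a DATASET_CONFIGS-style entry into features X and target y."""
    df = config["loader_func"](config["path"])
    # Drop identifier or other unwanted columns first.
    df = df.drop(columns=config.get("drop_cols", []))
    y = df[config["target_col"]]
    X = df.drop(columns=[config["target_col"]])
    return X, y
```

Note that `loader_func` lets one dictionary handle multiple file formats: any callable that returns a `DataFrame` (such as `pd.read_csv`) will work.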