On the trade-off between number of examples and precision of supervision in regression / Gnecco, Giorgio Stefano; Nutarelli, Federico. - 1:(2019). [10.1007/978-3-030-16841-4_1]
On the trade-off between number of examples and precision of supervision in regression
Gnecco, Giorgio; Nutarelli, Federico
2019
Abstract
We investigate regression problems in which the conditional variance of the output given the input can be controlled by varying the computational time dedicated to supervising each example. For a given upper bound on the total computational time, we optimize the trade-off between the number of examples and their precision by formulating and solving a suitable optimization problem, based on a large-sample approximation of the output of the ordinary least squares algorithm. Considering a specific functional form for that precision, we prove that there are cases in which “many but bad” examples provide a smaller generalization error than “few but good” ones, but also that the converse can occur, depending on the “returns to scale” of the precision with respect to the computational time assigned to supervise each example. Hence, the results of this study highlight that increasing the size of the dataset is not always beneficial when one can instead collect a smaller number of more reliable examples.
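The trade-off described in the abstract can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, since the abstract does not state the functional form used in the paper: the noise variance is taken to follow a hypothetical power law `k * dt**(-alpha)` in the supervision time per example `dt`, the number of examples is `total_time // dt`, and the large-sample excess generalization error of ordinary least squares is approximated by the textbook expression `p * variance / n` for `p` parameters. The function names and all parameter values are likewise hypothetical.

```python
import random


def noise_variance(dt, k=1.0, alpha=0.5):
    """Assumed precision model: conditional variance k * dt**(-alpha)."""
    return k * dt ** (-alpha)


def approx_excess_error(total_time, dt, p=1, k=1.0, alpha=0.5):
    """Large-sample OLS approximation: excess error ~ p * variance / n."""
    n = int(total_time // dt)  # examples affordable within the time budget
    return p * noise_variance(dt, k, alpha) / n


def simulated_excess_error(total_time, dt, alpha, k=1.0, beta=2.0,
                           trials=200, seed=0):
    """Monte Carlo check for scalar regression y = beta * x + noise.

    With x ~ N(0, 1), the excess generalization error of an estimate
    beta_hat equals (beta_hat - beta)**2, averaged here over trials.
    """
    rng = random.Random(seed)
    n = int(total_time // dt)
    sd = noise_variance(dt, k, alpha) ** 0.5
    total = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [beta * x + rng.gauss(0.0, sd) for x in xs]
        # Closed-form OLS estimate for a single regressor without intercept.
        beta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
        total += (beta_hat - beta) ** 2
    return total / trials


if __name__ == "__main__":
    for alpha in (0.5, 1.5):
        for dt in (1, 10):
            approx = approx_excess_error(200, dt, alpha=alpha)
            sim = simulated_excess_error(200, dt, alpha)
            print(f"alpha={alpha} dt={dt} approx={approx:.5f} sim={sim:.5f}")
```

Under these assumptions the approximate error is proportional to `dt**(1 - alpha)` for a fixed budget, so an exponent below 1 favors many cheaply supervised examples and an exponent above 1 favors fewer, more carefully supervised ones. This mirrors the qualitative dependence on "returns to scale" stated in the abstract, without claiming to reproduce the paper's exact model.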

