Questions tagged [bayesian-optimization]
Bayesian optimization is a family of global optimization methods which use information about previously-computed values of the function to make inference about which function values are plausibly optima. Its applications include computer experiments and hyper-parameter optimization in some machine learning models.
196 questions
1
vote
0
answers
61
views
Best approaches for multiple root finding when functions are not differentiable
I have a problem similar to one I posted about recently but sufficiently different to warrant its own discussion I think.
I have k functions, each of the same k-dimensional vector x, and I want to ...
1
vote
0
answers
74
views
Suggestions for constrained optimization with noisy observations
For $N$ correlated Ornstein-Uhlenbeck processes, I want to find $N$ absorption boundaries, $\mathbf{A}\in\mathbb{R}^{N}$, such that the expected value of the summed $N$ processes is maximized, while the ...
1
vote
1
answer
111
views
What Constrained Optimization method to use when my objective isn't strictly differentiable
I'm trying to find the vector of parameters x which gets me the optimal reward, subject to a couple of constraints like $f(x)=k$ and $g(x) \geq C $.
I have lower and upper bounds for each component of ...
0
votes
0
answers
50
views
Questions about calculating uncertainty and correlation matrix of model parameters from optimization
I am running a nonlinear earth system model to optimize 42 parameters p with 7 different kinds of observations $O_j$ where ...
1
vote
0
answers
42
views
Using limited labeling effort to estimate the proportion of positives
I have an (uncalibrated) binary image classifier. I want to use this classifier to estimate the proportion of positives $p_i$ in a dataset $D_i$. I have multiple datasets, each of which is drawn from ...
3
votes
1
answer
110
views
Standardizing data in Bayesian optimization
I am implementing a very basic Bayesian optimization algorithm in Matlab. It is generally recommended to standardize both the inputs (sampling points) and the outputs (black-box objective function ...
1
vote
0
answers
60
views
Best way to tackle SVM fine-tuning
I'm encountering a multiclass classification problem where I'm trying to predict 4 categories using SVM. I'm trying to fine-tune its hyperparameters using Bayesian Optimization to speed up the ...
2
votes
1
answer
116
views
Posterior estimation using VAE
Using normalizing flows, we can model a model's posterior $p(\theta|D)$ by feeding Gaussian noise $z$ to the NF (parametrized with $\phi$), using the output of the NF $\theta$ as model parameters, and ...
1
vote
1
answer
123
views
How to determine the optimal exploitation-exploration trade-off for a fixed number of objective function evaluations
In Bayesian optimization, we guess the next sampling point by finding $x = \textrm{argmax}_x \alpha(x)$, where $\alpha(x)$ is the acquisition function. For simplicity, let us consider the upper ...
0
votes
0
answers
68
views
Learning a probability distribution from samples drawn from unknown function
I want to learn some probability distribution $p$ from data (using, e.g., Kernel Density Estimation, a Normalizing Flow, or whatever your favourite machine learning model is).
If I had a dataset $D =...
1
vote
0
answers
64
views
Understanding Bayesian Optimal Experiment Design
I read this tutorial on Bayesian Experimentation Design (https://pyro.ai/examples/working_memory.html) and I'm trying to wrap my head around it.
Suppose you have data (X,y).
You're thinking about ...
0
votes
1
answer
89
views
Maximum likelihood estimation and bayesian inference of variance given multiple datasets
I'm currently working on a problem where I have multiple normally distributed data sets $X_1, \dotsc,X_n$, with each data set having its own mean $\bar x_i$ but all having the same variance $\sigma$. The ...
1
vote
1
answer
147
views
What exactly are we training across different iterations in the Gaussian Process Regression example in GPyTorch?
I am following this tutorial to implement GP Regression using GPyTorch.
Based on my understanding of GP Regression, given the training data we can compute the posterior mean and covariance using the ...
2
votes
0
answers
245
views
Bayesian optimization for parametric curve fitting?
I am relatively new to Gaussian Processes and Bayesian Optimization. My question is very simple:
Suppose I am trying to learn a function from a parametric family of curves which best describes the ...
5
votes
4
answers
326
views
How to choose a point that has both optimal value and low variance
I have a Gaussian Process Regression model that models the cost of a certain process. Once trained, I want to find the point $x$ at which the regression predicts the lowest cost.
Simply ...
3
votes
1
answer
301
views
Bayesian optimization for solving least squares
Bayesian optimization with Gaussian processes (GPs) is an effective minimization methodology when the evaluation of the function to minimize, say $f(a)$, is computationally expensive.
Loosely speaking,...
3
votes
0
answers
592
views
Bayesian Optimization: number of iterations as function of search space dimensionality?
I am performing Bayesian Optimization to select a hyperparameter configuration for my supervised learning model. I understand that with each additional hyperparameter that I choose to optimize, the ...
1
vote
0
answers
69
views
Rounding Approximation for Blackbox Integer Optimization
I am working on a black-box optimization that involves surrogate modeling. Some of my decision variables are integers, but I doubt a MIP approach would work for my case.
My advisor told me that it is ...
1
vote
0
answers
67
views
Expected improvement for bayesian linear regression with unknown noise variance
My question is basically if the expected improvement for a bayesian linear regression with unknown noise variance, i.e. we place a prior on the noise variance -> predictive distribution may not be ...
12
votes
2
answers
2k
views
Why go through the trouble of expectation maximization and not use gradient descent?
In expectation maximization first a lower bound of the likelihood is found and then a 2 step iterative algorithm kicks in where first we try to find the weights (the probability that a data point ...
0
votes
0
answers
107
views
Tuning Random Forest results in max_features parameter taking a value of 1. Why?
I ran Bayesian optimization to tune the parameters of a random forest. With 200 iterations, it seems like 70% of the time, very low values (read 1 or 2) of max_features seem to produce better (...
4
votes
1
answer
234
views
Terms and assumptions in trans-dimensional MCMC (RJ-MCMC) for Green 1995 paper
I want to use trans-dimensional MCMC in my research, and for a fundamental understanding I am trying to learn from the Green (1995) paper, which is the foundation of RJ-MCMC.
In part of 3.3 'switching between ...
1
vote
0
answers
49
views
With multi-dimensional input vectors, what are the dimensions of the covariance matrix elements? (Gaussian Process)
I am trying to create a Bayesian Optimisation code with a Gaussian Process. My input data, $\vec{X}_i$ is 8-dimensional, where each dimension corresponds to a feature of my data,
$\vec{X}_i = [\...
0
votes
0
answers
60
views
Hyperparameter tuning and initialisation doubts in multivariate gaussian process model
I'm trying to train a multivariate Gaussian Process model using the code here https://github.com/Magica-Chen/gptp_multi_output. However, I noticed how problematic it is to initialise the length scales of ...
6
votes
2
answers
2k
views
Estimating a "most likely" distribution from min, max, mean, median, standard deviation
I'm a physics undergrad who started becoming curious about this question after exam season. After any exam, we're typically given the following parameters: Min, max, mean, median, std. deviation, ...
2
votes
1
answer
1k
views
Difference between Bayesian optimization and multi-armed bandit optimization
What are the differences between Bayesian optimization and multi-armed bandit optimization? Are the problems equivalent when multi-armed bandit's action space is infinite?
0
votes
0
answers
97
views
Best function value for Expected Improvement computation for Gaussian Process Regression
I want to implement Gaussian Process regression in the context of active learning, in which interpolation is performed with the best interpolating points, selected at each step iteratively. At every ...
1
vote
0
answers
102
views
Best way to fit a Gaussian Process surrogate model to an RL Reward function
Is there a way to get an estimate of good scaling parameters (namely mean and variance) for a Gaussian Process kernel serving as a surrogate model to a Reinforcement Learning reward function for ...
2
votes
0
answers
40
views
Constrained optimization between two Bayesian variables
I have 2 separate Bayesian networks and I was hoping to maximize Value within the constraint of the Cost. What is a good way ...
5
votes
1
answer
914
views
Deriving Log Marginal Likelihood for Gaussian Process
I am trying to evaluate the following integral marginalized across all possible functions.
$$\mathbb{P}(y|X,\theta) = \int \mathbb{P}(y|f)\ \mathbb{P}(f|X,\theta) \ df$$
In G.P. we assume prior to be ...
3
votes
1
answer
311
views
Bayesian optimization on a polynomial regression as the surrogate model
My understanding of Bayesian optimization is that it is generally used in conjunction with Gaussian process (GP) as the surrogate model. Because GP inherently produces an uncertainty of estimate, the ...
1
vote
0
answers
112
views
Monte Carlo Dropout as surrogate model for Bayesian Optimization
I am interested in using Monte Carlo Dropout as a surrogate model for Bayesian optimization. I noticed that the paper states:
The use of dropout (and its variants) in NNs can be interpreted as a ...
3
votes
1
answer
528
views
Role of standard deviation in Bayesian optimization using GP
I am new to GP and BO and I have been playing with the two in a simple 1D context which happens to be practically relevant to what I am working on. Essentially, I am trying to find a peak (modeled as ...
1
vote
1
answer
85
views
Bayesian terminology for experiments
I am learning about Bayesian experimental design and am confused about the "Bayesian" terminology. Multi-armed bandits are normally in the syllabus of Bayesian experiments. But there are ...
4
votes
0
answers
406
views
Choice of model surrogate for bayesian optimization
I am running BayesSearchCV to optimize the hyperparameters of my machine learning model. This particular procedure allows the user to choose the surrogate model. The options are Gaussian Process, ...
0
votes
1
answer
162
views
Bayesian optimization with constraints
I want to perform Bayesian optimization for a certain physical task but with additional requirements. We have access to a set of variables and want to maximize (multiple) signal outputs from an ...
1
vote
0
answers
118
views
Why was Bayes better on exponential model rather than log-linear model?
I wanted to train a Bayesian version of this model which we can consider to be this log-linear form.
$$\ln \text{PSI} = \alpha \text{Time} + \beta.$$
Here are the priors I guessed for $\alpha$ and $\...
2
votes
0
answers
90
views
How to apply Bayesian Optimization to a function depending only on categorical variables?
I would like to apply Bayesian Optimization (BO) to a black-box function depending only on multiple categorical variables. In my application, each categorical variable has 3 possible categories. I ...
0
votes
0
answers
242
views
Expected Improvement (EI) in Bayesian Optimization failed to converge
I am not sure if "converge" is a proper description of my problem. I'll state it in detail below.
I'm working on a CFD problem which means each sample is expensive. The framework I choose to ...
1
vote
1
answer
164
views
Minimum sampling for maximising the prediction accuracy [closed]
Suppose that I'm training a machine learning model to predict people's age from a picture of their faces. Let's say that I have a dataset of people from 1-year-olds to 100-year-olds. But I want to choose ...
1
vote
0
answers
70
views
How to solve this type of multi-task Bayesian optimization problem?
Let us consider a collection of local Bayesian optimization tasks, each of which employs a Gaussian Process model to find the local optimum (i.e. the global optimum of that task). The goal is to design a ...
4
votes
1
answer
304
views
What is global concavity of the (log-)likelihood worth in Bayesian estimation?
In maximum likelihood estimation there is a big emphasis on finding the global maximum, which is why likelihood functions that are provably globally (log-)concave are desirable (despite often being ...
2
votes
1
answer
84
views
Is it correct to replace the oldest data when using Gaussian Process in blackbox hyper-parameter optimization?
I am trying to apply GPR to a black-box HPO problem. My input has 6 dimensions, like X=[x1,...x6]. The implementation is quite straightforward with sklearn with a ...
3
votes
2
answers
2k
views
How to run Bayesian optimization experiments in parallel?
Suppose I have the following hyperparameters for tuning:
learning_rate: [0.00001, 0.1]
epochs: [200,300,400,....,1000]
batch_size: [16,32,64,128]
If I want to run experiments using 4 parallel jobs, ...
2
votes
0
answers
102
views
How much reduction of hyperparameter experiments can I get using Bayesian optimization vs Grid search?
I have 5 hyperparameters for tuning and the number of combinations of all possible values is 9,360. This means if I want to find the optimal parameter setting using Grid Search, I need to do 9,360 ...
0
votes
0
answers
516
views
Why is it desirable to standardize the inputs of a Gaussian Process Regression?
I read this question (Should we standardize the data while doing Gaussian process regression?) and wondered, why do we need to normalize the inputs of a Gaussian Process?
In my case, I want to use ...
2
votes
1
answer
51
views
Clarification on Bayesian ensembling
I was reading a paper https://arxiv.org/pdf/2007.06823.pdf and at the end of page 3 the author presents the technique called "ensembling" for the estimation of the expected outputs and the ...
4
votes
1
answer
782
views
Is it possible to pick more than 1 sample point in each iteration of bayesian optimisation?
I want to use Bayesian optimisation for my project and I plan to build a closed-loop system, such that there is a model, robot to conduct experiments, measurement of experimental data which updates ...
4
votes
1
answer
263
views
How does removal of symmetry (e.g. via constraints) in a Bayesian optimization search space affect search efficiency?
There are many examples of search space symmetry in real-world optimization problems in the physical sciences. To motivate this, here are some that come to mind:
When optimizing a formulation such as ...
2
votes
1
answer
796
views
Package for hyperparameter optimization with categorical values
Context
I'm trying to solve a black-box optimization problem, and I can "reformulate" parts of the problem in different ways that may lead to lower or higher costs, and which can interact ...