Yaroslav Bulatov – Medium

Introduction to residual correction

Better formatted version (LaTeX) of this post on Ghost

7 min read · Oct 14, 2023

Gradient descent under harmonic eigenvalue decay

Better formatted version on Ghost (due to LaTeX support)

4 min read · Feb 16, 2023

Critical batch-size and effective dimension in Ordinary Least Squares

Note: a better formatted version (due to lack of LaTeX support on Medium) is here

5 min read · Jan 30, 2023

optimal learning rate for Gradient Descent on a high-dimensional quadratic

A better formatted version of this article is on Ghost, which has proper LaTeX support…

2 min read · Dec 27, 2021

How many matmuls are needed to compute Hessian-vector products?

Suppose you have a simple composition of d dense functions. Computing the Jacobian needs d matrix multiplications. What about computing Hessian…

2 min read · Dec 15, 2021

How to do matrix derivatives

Suppose you have the following scalar function of a matrix variable W.

3 min read · Jul 9, 2021

Using “Evolved Notation” to derive the Hessian of cross-entropy loss

I was recently reminded of a lesson learned at Stephen Boyd’s Convex Optimization class at Stanford a few years ago, back when Google was…

3 min read · Aug 30, 2019

Large-scale AI and sharing of models

Background

3 min read · Jul 21, 2019

ICLR Optimization papers III

Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions

4 min read · Jun 25, 2019

ICLR Optimization papers II

Continued from part 1

5 min read · Jun 11, 2019
