========
Prologue
========

:Author: Guanqun Yang

Who am I
========

My name is Guanqun Yang, and I am a second-year master's student in Electrical and Computer Engineering at UCLA. My research interests include statistical machine learning and its applications. You can learn more about me `here `__.

Motivation
==========

Past experience has taught me how important it is to cherish what I learn. Learning here is not limited to school education; it also covers what I pick up purely out of curiosity. Either way, it is worth organizing this information, in the belief that it could serve a purpose one day.

For the formal courses I take, I **always** keep some sort of organization in physical or digital notes (many thanks to `BoostNote `__, the best note-taking software I have used). However, such organization still looks messy to me, so I started looking for a way to **clearly** organize what I have learned over a period (say, a quarter).

There are many tools I could use, and I tried several of them, but they were all somehow disappointing:

- WordPress/Wix/Weebly: These are the go-to choices for personal blogging, and there are some amazing examples, like `Terence Tao’s blog `__ built with WordPress. However, these services cater to very diverse purposes, look less professional, and charge for some advanced features, so I gave up after some effort.

- A website from scratch: Since professional technical blogging is my ultimate goal, I considered this option because it is highly customizable. However, it turned out to be much more expensive than the previous choice and far more time-consuming.

- GitBook: This is a service for hosting software documentation, and one of my friends uses it as his technical blog (see `CharlesNotes `__). Even though he claims that this is the optimal way to run a personal blog, I still find it disappointing because of the limitations GitBook imposes on its users (again, advanced features are non-free).

Despite all of these disappointments, I kept looking for alternatives and finally found `this site `__, where the author hosts notes for a statistical physics course he or she took. I also realized that I could host my notes the same way software documentation is hosted. Combining these two ideas, I finished all the configuration and deployment in less than one day, and the result is the site you see here.

Plan
====

The following notes will be compiled and hosted on this site:

- Linear algebra
- Convex optimization
- Statistical machine learning
- Deep learning
- My current research

Notation
========

In the past, I have spent a lot of time resolving notation differences that arise from learning from multiple sources. In the posts I am going to write, I will try to keep the notation consistent. Here are some guidelines I will abide by:

- Matrices, probability, and expectation all use square brackets. Note that the more formal symbols for probability and expectation, i.e. :math:`\mathbb{P}` and :math:`\mathbb{E}`, are not used, for brevity.

  - Example (Markov inequality for a nonnegative random variable :math:`X`)

    .. math::

       P[X \geq a] \leq \frac{E[X]}{a}\ (a>0)

- Feature vectors appear as row vectors in the feature matrix. In classification problems, the dataset (both features and labels) can be arranged as a matrix. Because using :math:`m` and :math:`n` as subscripts simultaneously can be confusing, only one of :math:`m` or :math:`n` is used to denote the number of examples in the dataset. The number of features in a feature vector is denoted by :math:`p`.

  - Example (dataset arranged as a matrix)

    .. math::

       \begin{bmatrix}
       \mathbf{x}_1^T & y_1\\
       \mathbf{x}_2^T & y_2\\
       \vdots & \vdots\\
       \mathbf{x}_m^T & y_m
       \end{bmatrix}

.. list-table:: Notation used throughout this set of notes
   :header-rows: 1

   * - Notation
     - Meaning
   * - :math:`\mathcal{X}`
     - domain set
   * - :math:`\mathcal{Y}`
     - label set
   * - :math:`\mathcal{D}: \mathcal{X}\times \mathcal{Y}`
     - underlying distribution
   * - :math:`D=\{(\mathbf{x}_1,y_1), (\mathbf{x}_2, y_2),\cdots, (\mathbf{x}_m,y_m) \}`
     - dataset
   * - :math:`\mathbf{x}_i:=\left[x_{i1};x_{i2};\cdots;x_{ip}\right]:=\begin{bmatrix}x_{i1}\\x_{i2}\\\vdots\\x_{ip} \end{bmatrix}`
     - feature vector
   * - :math:`\mathbf{X}:=\begin{bmatrix}\mathbf{x}_1^T\\\mathbf{x}_2^T\\\vdots\\\mathbf{x}_m^T \end{bmatrix}`
     - feature matrix
   * - :math:`\mathcal{H}`
     - hypothesis class
   * - :math:`h`
     - hypothesis
   * - :math:`X`
     - random variable
   * - :math:`\mathbb{R}^{m\times p}`
     - :math:`m\times p` real space
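
As a rough illustration of the conventions above, the following NumPy sketch builds a feature matrix whose rows are the transposed feature vectors and then appends the labels as a last column. The use of NumPy, the toy sizes ``m = 4`` and ``p = 3``, the random features, and the binary labels are all assumptions made for this example; they are not part of the notes themselves.

.. code-block:: python

   import numpy as np

   # Hypothetical toy sizes: m examples, p features per example.
   m, p = 4, 3
   rng = np.random.default_rng(0)

   # Feature matrix X in R^{m x p}: row i holds the transposed feature vector x_i^T.
   X = rng.normal(size=(m, p))

   # One label per example (binary labels chosen arbitrarily for illustration).
   y = np.array([0, 1, 1, 0])

   # Dataset arranged as a matrix: each row is [x_i^T, y_i].
   dataset = np.column_stack([X, y])

   # Recover a single feature vector as a column vector of shape (p, 1).
   i = 0
   x_i = X[i].reshape(p, 1)

   print(X.shape)        # (4, 3)
   print(dataset.shape)  # (4, 4)
   print(x_i.shape)      # (3, 1)

Storing examples as rows means that ``X[i]`` picks out one example and ``X[:, j]`` picks out one feature across all examples, which matches the row-vector convention stated in the guidelines.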