Jekyll2020-08-09T13:05:53+00:00https://ryanzhang.info/feed.xmlRen’s Cabinet of CuriositiesLearning is never cumulative, it is a movement of knowing which has no beginning and no end. – Bruce LeeRen ZhangLearning to rank2019-10-31T15:43:43+00:002019-10-31T15:43:43+00:00https://ryanzhang.info/post/2019/10/31/learning-to-rank<h1 id="notes-on-learning-to-rank">Notes on Learning To Rank</h1> <h2 id="task">Task</h2> <p>We want to learn a function $$f(q, D)$$ which takes in a query $$q$$ and a list of documents $$D=\{d_1, d_2, ..., d_n\}$$, and produces scores with which we can rank/order the list of documents.</p> <h2 id="types">Types</h2> <p>There are multiple ways we can formulate the problem:</p> <ol> <li>Pointwise</li> <li>Pairwise</li> <li>Listwise</li> </ol> <h3 id="pointwise">Pointwise</h3> <p>In this approach we learn $$f(q,d)$$, which scores how well each document matches the query, independently of the other documents. When scoring a data point, the function does not take the other documents in the list into consideration.</p> <p>To train a model in this approach, the data would be in the long format where each row contains a $$(q,d)$$ pair, and we need a label for every row: either a binary label (classification) or a relevance score (regression).</p> <h3 id="pairwise">Pairwise</h3> <p>In this approach we learn $$Pr(rank(d_i,q)\succ rank(d_j,q))$$, that is, we learn to determine the relative preference between two documents given a query.</p> <p>It can be treated as a binary classification problem: the data would be in the format where each row contains a triplet $$(q,d_i,d_j)$$, and we need a binary label for each row. We can hand-craft features that capture the difference between $$d_i$$ and $$d_j$$ with respect to $$q$$ and feed that difference to a binary classifier.</p> <p>Or, more often, we learn it in a pointwise fashion, by learning an intermediate rank function. Let $$rank(d_i,q)=s_i$$; the pairwise classification problem then becomes classification on the difference between rank scores. 
That is:</p> $Pr(rank(d_i,q) &gt; rank(d_j,q))=\frac{1}{1+exp(-(s_i-s_j))}$ <p>The loss would be the negative log of this likelihood, which is: $$L_{ij}=log(1+exp(s_j-s_i))$$ and we can train the rank function to minimize this loss.</p> <p>If we work out the gradient with respect to the parameter in the rank function, it is:</p> \begin{aligned} \frac{\partial L_{ij}}{\partial \theta}&amp;=\frac{\partial L_{ij}}{\partial s_i}\frac{\partial s_i}{\partial \theta} + \frac{\partial L_{ij}}{\partial s_j}\frac{\partial s_j}{\partial \theta} \\ &amp;=-\frac{1}{1 + exp(s_i-s_j)}(\frac{\partial s_i}{\partial \theta} - \frac{\partial s_j}{\partial \theta}) \\ &amp;=\lambda_{ij}(\frac{\partial s_i}{\partial \theta} - \frac{\partial s_j}{\partial \theta}) \end{aligned} <p>As a result, a single gradient descent step with this gradient performs gradient ascent on $$s_i$$ and gradient descent on $$s_j$$, both weighted by $$\lambda_{ij}$$. That is, for a given pair of documents where $$d_i$$ is the more relevant one, we push the score of the more relevant document up and the score of the less relevant document down, and the size of the update is determined by the score difference: the more wrongly ordered the pair is, the larger $$\lvert\lambda_{ij}\rvert$$ and hence the larger the update.</p> <h3 id="listwise">Listwise</h3> <p>To be continued.</p>Ren ZhangJust when you think you know shake up enough2019-10-25T18:48:00+00:002019-10-25T18:48:00+00:00https://ryanzhang.info/post/2019/10/25/just-when-you-think-you-know-shake-up-enough<p><img src="/assets/images/just_when_you_think_you_know_shake_up_enough.png" alt="image" /></p> <p>Yesterday I experienced my biggest shakeup since I started participating on Kaggle: I dropped from the silver zone to nowhere. The competition was an image segmentation competition, and based on past experience, CV competitions have relatively low shakeups (compared to tabular/transaction/time-series predictions). I have seen CV competitions with low shakeups even when there is an obvious train/test distribution difference. 
So this came as a bit of a surprise, but honestly, it is not all that bad.</p> <p>I had chosen a risky strategy: training models on the full training set instead of using cross validation, and relying solely on the public leaderboard for feedback, in order to fit 5 times more models from different architectures. That part was actually fine; my ensemble of 7 models, with no threshold tuning based on the public board, could still have retained a silver medal position. It was the thresholding that sank my rank. A lesson well learnt.</p>Ren ZhangIterate over an iterable multiple times2019-08-22T19:50:00+00:002019-08-22T19:50:00+00:00https://ryanzhang.info/post/2019/08/22/iterate-over-an-iterable-multiple-times<p>I was working on a piece of code today and needed to iterate over an iterable multiple times to do some computation. The body of the code is the same for all passes. One obvious thing I can do is a double loop:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">results</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_repeats</span><span class="p">):</span> <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">get_iterable</span><span class="p">():</span> <span class="n">results</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">process</span><span class="p">(</span><span class="n">item</span><span class="p">))</span> </code></pre></div></div> <p>The reason that I need to loop over it multiple times rather than loop once and duplicate <code class="language-plaintext highlighter-rouge">results</code> is that <code class="language-plaintext highlighter-rouge">f</code> has an internal state: it gives a different output based on 
the number of times it has seen the input. The above code works, but is not as nice as I’d like. I used <code class="language-plaintext highlighter-rouge">chain</code> and <code class="language-plaintext highlighter-rouge">repeat</code> from <code class="language-plaintext highlighter-rouge">itertools</code> to clean it up a bit:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">chain</span><span class="p">,</span> <span class="n">repeat</span> <span class="n">results</span> <span class="o">=</span> <span class="p">[]</span> <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">chain</span><span class="p">(</span><span class="o">*</span><span class="n">repeat</span><span class="p">(</span><span class="n">get_iterable</span><span class="p">(),</span> <span class="n">n_repeats</span><span class="p">)):</span> <span class="n">results</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">f</span><span class="p">.</span><span class="n">process</span><span class="p">(</span><span class="n">item</span><span class="p">))</span> </code></pre></div></div> <p>It does exactly the same thing (one caveat: <code class="language-plaintext highlighter-rouge">repeat</code> yields the same object every time and <code class="language-plaintext highlighter-rouge">get_iterable()</code> is only called once here, so this works when it returns a re-iterable object such as a list, not a one-shot generator), but the code reads more like straight English.</p>Ren ZhangClean python code to get a reverse mapping2019-08-21T15:40:00+00:002019-08-21T15:40:00+00:00https://ryanzhang.info/post/2019/08/21/clean-code-to-get-a-reverse-mapping-in-python<p>When working with machine learning problems, I often use a Python dictionary to map categorical values to their integer-encoded values. 
Something like:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">string</span> <span class="n">feature_encoder</span> <span class="o">=</span> <span class="p">{</span><span class="n">v</span><span class="p">:</span><span class="n">i</span> <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">string</span><span class="p">.</span><span class="n">ascii_lowercase</span><span class="p">)}</span> </code></pre></div></div> <p>To get back the original values, I need a reverse mapping, which I used to create with:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">feature_decoder</span> <span class="o">=</span> <span class="p">{</span><span class="n">v</span><span class="p">:</span><span class="n">k</span> <span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">feature_encoder</span><span class="p">.</span><span class="n">items</span><span class="p">()}</span> </code></pre></div></div> <p>This is fine, but I dislike that I have to spend some mental effort reading it to figure out what I was doing the next time I come back to my code. 
And today I found a nicer way to get the reverse mapping.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">feature_decoder</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="nb">reversed</span><span class="p">,</span> <span class="n">feature_encoder</span><span class="p">.</span><span class="n">items</span><span class="p">()))</span> </code></pre></div></div> <p>It does exactly the same thing, but I can just read it as an English sentence to know what is going on.</p>Ren ZhangHello Jekyll on Github Pages2019-05-22T02:39:43+00:002019-05-22T02:39:43+00:00https://ryanzhang.info/post/2019/05/22/welcome-to-jekyll<p>My current website hosting costs me $7.99 a month plus ~$20 per year for the domain name. I have been paying this amount for a little more than 3 years, and finally decided to cut the hosting fee by migrating over to GitHub Pages.</p>Ren Zhang