Use KaTeX for client-side math rendering instead of MathJax
parent fe6d8d5839
commit 633507e193
26 changed files with 241 additions and 177 deletions
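For orientation, here is a minimal standalone page sketching the KaTeX setup that the hunks below add to each `<head>`. The stylesheet and script URLs and their integrity hashes are copied from the diff. Everything else is an assumption made for illustration and is not part of the commit: the `DOMContentLoaded` listener (the commit calls `renderMathInElement(document.body)` from an `onload` attribute instead), the explicit `delimiters` list, the `throwOnError: false` option, and the sample body.

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>KaTeX auto-render sketch</title>

  <!-- KaTeX CSS styles (URL and hash as in the diff) -->
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">

  <!-- Deferred scripts run in document order, after parsing and before DOMContentLoaded -->
  <script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
  <script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous"></script>

  <script>
    // Assumption: render once the DOM is ready, rather than from the onload
    // attribute used in the commit. Both deferred scripts have run by then.
    document.addEventListener("DOMContentLoaded", function () {
      renderMathInElement(document.body, {
        // Assumption: list only the \(...\) and \[...\] delimiters that
        // appear in the generated pages.
        delimiters: [
          {left: "\\[", right: "\\]", display: true},
          {left: "\\(", right: "\\)", display: false}
        ],
        // Assumption: show unsupported TeX as highlighted source text
        // instead of throwing an exception.
        throwOnError: false
      });
    });
  </script>
</head>
<body>
  <!-- Sample content mirroring the markup of the generated pages -->
  <p>Inline math: <span class="math inline">\(0+a = a+0\)</span>.</p>
  <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k \]</span>
</body>
</html>
```

Opening this file in a browser with network access should typeset both expressions. With `throwOnError: false`, any input KaTeX cannot parse is left as visible source text, which makes unsupported constructs easy to spot after a migration from MathJax.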
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Archives</title>
|
<title>Dimitri Lozeve - Archives</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -135,26 +135,13 @@ then <span class="math inline">\(\varphi(n)\)</span> is true for every natural n
|
||||||
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
||||||
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p>
|
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
0 + s(a) &= s(0+a)\\
|
|
||||||
&= s(a+0)\\
|
|
||||||
&= s(a)\\
|
|
||||||
&= s(a) + 0.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<p>We can now prove the main proposition:</p>
|
<p>We can now prove the main proposition:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
||||||
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p>
|
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
a + s(b) &= s(a+b)\\
|
|
||||||
&= s(b+a)\\
|
|
||||||
&= s(b) + a.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
||||||
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
||||||
|
@ -230,31 +217,15 @@ then <span class="math inline">\(\varphi(n)\)</span> is true for every natural n
|
||||||
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
||||||
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
||||||
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{R} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s', r \;|\; s, a) &:= \mathbb{P}(S_t=s', R_t=r \;|\; S_{t-1}=s, A_{t-1}=a),
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
||||||
</ul>
|
</ul>
|
||||||
</div>
|
</div>
|
||||||
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
||||||
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s' \;|\; s, a) &:= \mathbb{P}(S_t=s' \;|\; S_{t-1}=s, A_{t-1}=a) \\
|
|
||||||
&= \sum_r p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
r &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
|
||||||
&= \sum_r r \sum_{s'} p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
||||||
|
@ -264,33 +235,14 @@ r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
||||||
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
\pi &: \mathcal{A} \times \mathcal{S} \mapsto [0,1] \\
|
|
||||||
\pi(a \;|\; s) &:= \mathbb{P}(A_t=a \;|\; S_t=s).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>In order to compare policies, we need to associate values to them.</p>
|
<p>In order to compare policies, we need to associate values to them.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
v_{\pi} &: \mathcal{S} \mapsto \mathbb{R} \\
|
|
||||||
v_{\pi}(s) &:= \text{expected return when starting in $s$ and following $\pi$} \\
|
|
||||||
v_{\pi}(s) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s\right] \\
|
|
||||||
v_{\pi}(s) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
q_{\pi} &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
q_{\pi}(s,a) &:= \text{expected return when starting from $s$, taking action $a$, and following $\pi$} \\
|
|
||||||
q_{\pi}(s,a) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s, A_t=a \right] \\
|
|
||||||
q_{\pi}(s,a) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s, A_t=a\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
||||||
<h1 id="references">References</h1>
|
<h1 id="references">References</h1>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Contact</title>
|
<title>Dimitri Lozeve - Contact</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Curriculum Vitæ</title>
|
<title>Dimitri Lozeve - Curriculum Vitæ</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Home</title>
|
<title>Dimitri Lozeve - Home</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Random matrices from the Ginibre ensemble</title>
|
<title>Dimitri Lozeve - Random matrices from the Ginibre ensemble</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Ising model simulation in APL</title>
|
<title>Dimitri Lozeve - Ising model simulation in APL</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Ising model simulation</title>
|
<title>Dimitri Lozeve - Ising model simulation</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Generating and representing L-systems</title>
|
<title>Dimitri Lozeve - Generating and representing L-systems</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Peano Axioms</title>
|
<title>Dimitri Lozeve - Peano Axioms</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
@ -81,26 +90,13 @@ then <span class="math inline">\(\varphi(n)\)</span> is true for every natural n
|
||||||
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
||||||
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p>
|
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
0 + s(a) &= s(0+a)\\
|
|
||||||
&= s(a+0)\\
|
|
||||||
&= s(a)\\
|
|
||||||
&= s(a) + 0.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<p>We can now prove the main proposition:</p>
|
<p>We can now prove the main proposition:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
||||||
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p>
|
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
a + s(b) &= s(a+b)\\
|
|
||||||
&= s(b+a)\\
|
|
||||||
&= s(b) + a.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
||||||
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Quick Notes on Reinforcement Learning</title>
|
<title>Dimitri Lozeve - Quick Notes on Reinforcement Learning</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
@ -51,31 +60,15 @@
|
||||||
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
||||||
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
||||||
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{R} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s', r \;|\; s, a) &:= \mathbb{P}(S_t=s', R_t=r \;|\; S_{t-1}=s, A_{t-1}=a),
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
||||||
</ul>
|
</ul>
|
||||||
</div>
|
</div>
|
||||||
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
||||||
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s' \;|\; s, a) &:= \mathbb{P}(S_t=s' \;|\; S_{t-1}=s, A_{t-1}=a) \\
|
|
||||||
&= \sum_r p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
r &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
|
||||||
&= \sum_r r \sum_{s'} p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
||||||
|
@ -85,33 +78,14 @@ r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
||||||
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
\pi &: \mathcal{A} \times \mathcal{S} \mapsto [0,1] \\
|
|
||||||
\pi(a \;|\; s) &:= \mathbb{P}(A_t=a \;|\; S_t=s).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>In order to compare policies, we need to associate values to them.</p>
|
<p>In order to compare policies, we need to associate values to them.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
v_{\pi} &: \mathcal{S} \mapsto \mathbb{R} \\
|
|
||||||
v_{\pi}(s) &:= \text{expected return when starting in $s$ and following $\pi$} \\
|
|
||||||
v_{\pi}(s) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s\right] \\
|
|
||||||
v_{\pi}(s) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
q_{\pi} &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
q_{\pi}(s,a) &:= \text{expected return when starting from $s$, taking action $a$, and following $\pi$} \\
|
|
||||||
q_{\pi}(s,a) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s, A_t=a \right] \\
|
|
||||||
q_{\pi}(s,a) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s, A_t=a\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
||||||
<h1 id="references">References</h1>
|
<h1 id="references">References</h1>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Mindsay: Towards Self-Learning Chatbots</title>
|
<title>Dimitri Lozeve - Mindsay: Towards Self-Learning Chatbots</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Projects</title>
|
<title>Dimitri Lozeve - Projects</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - ADS-B data visualization</title>
|
<title>Dimitri Lozeve - ADS-B data visualization</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Civilisation</title>
|
<title>Dimitri Lozeve - Civilisation</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Community Detection</title>
|
<title>Dimitri Lozeve - Community Detection</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Ising model simulation</title>
|
<title>Dimitri Lozeve - Ising model simulation</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - L-systems</title>
|
<title>Dimitri Lozeve - L-systems</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Orbit</title>
|
<title>Dimitri Lozeve - Orbit</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Satrap</title>
|
<title>Dimitri Lozeve - Satrap</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Topological Data Analysis of time-dependent networks</title>
|
<title>Dimitri Lozeve - Topological Data Analysis of time-dependent networks</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - WWII bombings visualization</title>
|
<title>Dimitri Lozeve - WWII bombings visualization</title>
|
||||||
<link rel="stylesheet" href="../css/default.css" />
|
<link rel="stylesheet" href="../css/default.css" />
|
||||||
<link rel="stylesheet" href="../css/syntax.css" />
|
<link rel="stylesheet" href="../css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
|
@ -131,26 +131,13 @@ then <span class="math inline">\(\varphi(n)\)</span> is true for every natural n
|
||||||
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>First, we prove that every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
<li><span class="math inline">\(0+0 = 0+0\)</span>.</li>
|
||||||
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p>
|
<li><p>For every natural number <span class="math inline">\(a\)</span> such that <span class="math inline">\(0+a = a+0\)</span>, we have:</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
0 + s(a) &= s(0+a)\\
|
|
||||||
&= s(a+0)\\
|
|
||||||
&= s(a)\\
|
|
||||||
&= s(a) + 0.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
<p>By Axiom 5, every natural number commutes with <span class="math inline">\(0\)</span>.</p>
|
||||||
<p>We can now prove the main proposition:</p>
|
<p>We can now prove the main proposition:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
<li><span class="math inline">\(\forall a,\quad a+0=0+a\)</span>.</li>
|
||||||
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p>
|
<li><p>For all <span class="math inline">\(a\)</span> and <span class="math inline">\(b\)</span> such that <span class="math inline">\(a+b=b+a\)</span>,</p></li>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
a + s(b) &= s(a+b)\\
|
|
||||||
&= s(b+a)\\
|
|
||||||
&= s(b) + a.
|
|
||||||
\end{align}
|
|
||||||
\]</span></li>
|
|
||||||
</ul>
|
</ul>
|
||||||
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
<p>We used the opposite of the second rule for <span class="math inline">\(+\)</span>, namely <span class="math inline">\(\forall a,
|
||||||
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
\forall b,\quad s(a) + b = s(a+b)\)</span>. This can easily be proved by another induction.</p>
|
||||||
|
@ -226,31 +213,15 @@ then <span class="math inline">\(\varphi(n)\)</span> is true for every natural n
|
||||||
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
\mathcal{S}\)</span> to a set <span class="math inline">\(\mathcal{A}(s)\)</span> of possible <em>actions</em> for this state. In this post, we will often simplify by using <span class="math inline">\(\mathcal{A}\)</span> as a set, assuming that all actions are possible for each state,</li>
|
||||||
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
<li><span class="math inline">\(\mathcal{R} \subset \mathbb{R}\)</span> is a set of <em>rewards</em>,</li>
|
||||||
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
<li><p>and <span class="math inline">\(p\)</span> is a function representing the <em>dynamics</em> of the MDP:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{R} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s', r \;|\; s, a) &:= \mathbb{P}(S_t=s', R_t=r \;|\; S_{t-1}=s, A_{t-1}=a),
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
<p>such that <span class="math display">\[ \forall s \in \mathcal{S}, \forall a \in \mathcal{A},\quad \sum_{s', r} p(s', r \;|\; s, a) = 1. \]</span></p></li>
|
||||||
</ul>
|
</ul>
|
||||||
</div>
|
</div>
|
||||||
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
<p>The function <span class="math inline">\(p\)</span> represents the probability of transitioning to the state <span class="math inline">\(s'\)</span> and getting a reward <span class="math inline">\(r\)</span> when the agent is at state <span class="math inline">\(s\)</span> and chooses action <span class="math inline">\(a\)</span>.</p>
|
||||||
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
<p>We will also use occasionally the <em>state-transition probabilities</em>:</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
p &: \mathcal{S} \times \mathcal{S} \times \mathcal{A} \mapsto [0,1] \\
|
|
||||||
p(s' \;|\; s, a) &:= \mathbb{P}(S_t=s' \;|\; S_{t-1}=s, A_{t-1}=a) \\
|
|
||||||
&= \sum_r p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
<h2 id="rewarding-the-agent">Rewarding the agent</h2>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
<p>The <em>expected reward</em> of a state-action pair is the function</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
r &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
|
||||||
&= \sum_r r \sum_{s'} p(s', r \;|\; s, a).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
<p>The <em>discounted return</em> is the sum of all future rewards, with a multiplicative factor to give more weights to more immediate rewards: <span class="math display">\[ G_t := \sum_{k=t+1}^T \gamma^{k-t-1} R_k, \]</span> where <span class="math inline">\(T\)</span> can be infinite or <span class="math inline">\(\gamma\)</span> can be 1, but not both.</p>
|
||||||
|
@ -260,33 +231,14 @@ r(s,a) &:= \mathbb{E}[R_t \;|\; S_{t-1}=s, A_{t-1}=a] \\
|
||||||
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
<p>A <em>policy</em> is a way for the agent to choose the next action to perform.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
<p>A <em>policy</em> is a function <span class="math inline">\(\pi\)</span> defined as</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
\pi &: \mathcal{A} \times \mathcal{S} \mapsto [0,1] \\
|
|
||||||
\pi(a \;|\; s) &:= \mathbb{P}(A_t=a \;|\; S_t=s).
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>In order to compare policies, we need to associate values to them.</p>
|
<p>In order to compare policies, we need to associate values to them.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>state-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
v_{\pi} &: \mathcal{S} \mapsto \mathbb{R} \\
|
|
||||||
v_{\pi}(s) &:= \text{expected return when starting in $s$ and following $\pi$} \\
|
|
||||||
v_{\pi}(s) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s\right] \\
|
|
||||||
v_{\pi}(s) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
<p>We can also compute the value starting from a state <span class="math inline">\(s\)</span> by also taking into account the action taken <span class="math inline">\(a\)</span>.</p>
|
||||||
<div class="definition">
|
<div class="definition">
|
||||||
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
<p>The <em>action-value function</em> of a policy <span class="math inline">\(\pi\)</span> is</p>
|
||||||
<span class="math display">\[\begin{align}
|
|
||||||
q_{\pi} &: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R} \\
|
|
||||||
q_{\pi}(s,a) &:= \text{expected return when starting from $s$, taking action $a$, and following $\pi$} \\
|
|
||||||
q_{\pi}(s,a) &:= \mathbb{E}_{\pi}\left[ G_t \;|\; S_t=s, A_t=a \right] \\
|
|
||||||
q_{\pi}(s,a) &= \mathbb{E}_{\pi}\left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \;|\; S_t=s, A_t=a\right]
|
|
||||||
\end{align}
|
|
||||||
\]</span>
|
|
||||||
</div>
|
</div>
|
||||||
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
<h2 id="the-quest-for-the-optimal-policy">The quest for the optimal policy</h2>
|
||||||
<h1 id="references">References</h1>
|
<h1 id="references">References</h1>
|
||||||
|
|
|
@ -7,7 +7,16 @@
|
||||||
<title>Dimitri Lozeve - Skills in Statistics, Data Science and Machine Learning</title>
|
<title>Dimitri Lozeve - Skills in Statistics, Data Science and Machine Learning</title>
|
||||||
<link rel="stylesheet" href="./css/default.css" />
|
<link rel="stylesheet" href="./css/default.css" />
|
||||||
<link rel="stylesheet" href="./css/syntax.css" />
|
<link rel="stylesheet" href="./css/syntax.css" />
|
||||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML" async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous" onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
2
site.hs
2
site.hs
|
@ -123,7 +123,7 @@ customPandocCompiler =
|
||||||
newExtensions = defaultExtensions `mappend` customExtensions
|
newExtensions = defaultExtensions `mappend` customExtensions
|
||||||
writerOptions = defaultHakyllWriterOptions
|
writerOptions = defaultHakyllWriterOptions
|
||||||
{ writerExtensions = newExtensions
|
{ writerExtensions = newExtensions
|
||||||
, writerHTMLMathMethod = MathJax ""
|
, writerHTMLMathMethod = KaTeX ""
|
||||||
}
|
}
|
||||||
readerOptions = defaultHakyllReaderOptions
|
readerOptions = defaultHakyllReaderOptions
|
||||||
in do
|
in do
|
||||||
|
|
|
@ -7,7 +7,17 @@
|
||||||
<title>Dimitri Lozeve - $title$</title>
|
<title>Dimitri Lozeve - $title$</title>
|
||||||
<link rel="stylesheet" href="/css/default.css" />
|
<link rel="stylesheet" href="/css/default.css" />
|
||||||
<link rel="stylesheet" href="/css/syntax.css" />
|
<link rel="stylesheet" href="/css/syntax.css" />
|
||||||
<script src='https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=TeX-MML-AM_CHTML' async></script>
|
|
||||||
|
<!-- KaTeX CSS styles -->
|
||||||
|
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.css" integrity="sha384-BdGj8xC2eZkQaxoQ8nSLefg4AV4/AwB3Fj+8SUSo7pnKP6Eoy18liIKTPn9oBYNG" crossorigin="anonymous">
|
||||||
|
|
||||||
|
<!-- The loading of KaTeX is deferred to speed up page rendering -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/katex.min.js" integrity="sha384-JiKN5O8x9Hhs/UE5cT5AAJqieYlOZbGT3CHws/y97o3ty4R7/O5poG9F3JoiOYw1" crossorigin="anonymous"></script>
|
||||||
|
|
||||||
|
<!-- To automatically render math in text elements, include the auto-render extension: -->
|
||||||
|
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous"
|
||||||
|
onload="renderMathInElement(document.body);"></script>
|
||||||
|
|
||||||
</head>
|
</head>
|
||||||
<body>
|
<body>
|
||||||
<header>
|
<header>
|
||||||
|
|
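A note on the auto-render call added to the template above: `renderMathInElement` also accepts an options object as a second argument. The sketch below is a hypothetical variant, not part of this commit. It assumes the generated pages keep math wrapped in `\(...\)` and `\[...\]`, and that silently skipping unparseable expressions is acceptable. It restricts the delimiters to those two pairs and sets `throwOnError: false` so that one failing expression does not stop the rest of the page from rendering.

```html
<!-- Hypothetical variant of the auto-render include; the options object is an assumption, not part of the commit -->
<script defer src="https://cdn.jsdelivr.net/npm/katex@0.11.0/dist/contrib/auto-render.min.js" integrity="sha384-kWPLUVMOks5AQFrykwIup5lo0m3iMkkHrD0uJ4H5cjeGihAutqP0yW0J6dpFiVkI" crossorigin="anonymous"
        onload="renderMathInElement(document.body, {
          // Only scan for the LaTeX-style delimiters; '$$' is left out on purpose
          delimiters: [
            {left: '\\[', right: '\\]', display: true},
            {left: '\\(', right: '\\)', display: false}
          ],
          // Leave an expression unrendered instead of throwing when KaTeX cannot parse it
          throwOnError: false
        });"></script>
```

Whether this is needed depends on how the pages are actually emitted after the `writerHTMLMathMethod` change. If the default delimiters already match the generated markup, the plain `renderMathInElement(document.body);` call from the commit is enough.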