Han's Blog

Thoughts on the normative and descriptive perspectives of human decision-making behavior.

  • Home
  • Categories
  • Archives
  • Tags
  • Scribbles

A Discussion on GAN-based IRL Methods

Posted on 2018-08-31 | In Study Notes

Introduction

Inverse reinforcement learning (IRL) is the dual of reinforcement learning (RL) and aims at recovering the reward function that the agent optimizes with. Compared with supervised learning (behavior cloning, BC), IRL imposes more restrictive regularity conditions on the set of policies and is more appropriate in cases with inadequate expert demonstrations and for robust estimations. For example, consider a routing problem on a grid network: the optimal actions taken by the agent are geographically correlated. In this case, compared with BC, IRL has a higher chance of recovering the internal reward structure and leads to more consistent actions across space.

In this post, I will briefly introduce several recent works using the generative adversarial network (GAN) for IRL and discuss their connections.

Read more »

关于共享出行

Posted on 2017-12-27 | In Comment

交通即人与物在时空上的流动。从本质来看,交通系统的设计是为这种流动提供可靠的载体。在交通系统设计不能得到改进的情况下,在提高出行能力(即满足更多的出行需求)与缓解城市拥堵之间必然存在着取舍。这种在出行能力与拥堵之间的取舍平衡,在下面的讨论中中将统一称为系统供需平衡。我这一年在交通方面的研究,主要就是关注于各类共享出行服务(shared mobility)对这种系统供需平衡的影响。

Read more »

Reflection on Simple Queues

Posted on 2016-12-17 | In Reflection

This semester, when I was working on homework of 1.200, there was a frequently asked question: under what situation would the queue system have a steady state? It is claimed that system with finite capacity always have a steady state, but no proof is given in the class. In this post, I will discuss the dynamical system approach for this problem, and suggest several extensions.

Read more »

Case Study: Latent Classes in Quebec Energy Consumption

Posted on 2016-11-06 | In Case Study

Last semester, in course 1.202 Demand Modeling, there is an interesting case study of Quebec energy consumption. Exploratory data analysis indicates that there are potential market segments, and subsequent computation shows that multinomial logit models estimated from separate datasets have a better total log likelihood than the one from the whole dataset. An interesting question then arises: can we discover latent classes efficiently on high dimensional data?

Read more »

Case Study: Delay in Mexico City

Posted on 2016-10-23 | In Case Study

This June, when traveling in Mexico, I encountered one of the most unexpected congestion in my life. The horrible performance of Google Maps was annoying and during the summer I kept thinking about how to model and solve this problem. In last several days, I realized the tools we needed were no more than high school arithmetics.

Read more »

Integer Programming Note V: Decomposition

Posted on 2016-10-04 | In Study Notes

Inner Approximation: $V$-representation, Dantzig-Wolfe decomposition, and Branch and Price

Assume we are given problem

$$\begin{aligned} z_I =\max & \ c^Tx \\ s.t. & \ A_1 x \leqslant b_1 \\ & \ A_2 x \leqslant b_2\\ & \ x \in R^{n_1} \times Z^{n_2} \end{aligned}$$

and we know that LP relaxation for $Q=\{A_2 x \leqslant b_2, x \in R^{n_1} \times Z^{n_2}\}$ is tight but that for the whole problem is weak. One can then think of dual approach utilizing the tighter local constraint structure:

Read more »

Integer Programming Note IV: Cuts

Posted on 2016-10-04 | In Study Notes

Cuts are important tools to strengthen your formulation and improve computation efficiency. In this post, I will introduce two kind of cuts: geometric cuts, which consider the geometric property of the polyhedron and have general application, and implicit cuts, which are more problem specific but potentially can lead to very strong formulation.

Read more »

Integer Programming Note III: Encoding & Embedding

Posted on 2016-10-04 | In Study Notes

Unbalanced Branching and Encoding

Consider constraint $\sum_i x_i= 1$ with $x_i\in \{0,1\}$, which arises naturally in problem with selection (e.g. union of polytope $x\in \cup_i P_i$). Recall that, in standard MIP solver, $x_i$ is first relaxed into continuous variable, and then forced to be integer with B&B. For the constraint above, when we branch on a fractional $x_i$, we have two choices, $x_i=1$ or $x_i=0$, but the result of each branch is different. With $x_i=1$, we immediate know $x_j=0, \forall j\neq i$, even though they are still assumed continuous. But with $x_i=0$, the optimal solution might still contain another fractional $x_j$ and further branching is needed. In extreme case, we need to branch $n-1$ times to locate the optimal integer solution.

Read more »

Integer Programming Note II: Formulation Strength & Size

Posted on 2016-10-04 | In Study Notes

Assume we need to solve a mixed integer linear program (MILP), and the feasible region is $S\subseteq R^{n_1}\times Z^{n_2}$.  For some reasons we don’t have a simple formulation of $S$, and we want to use some polyhedron $P=\{(x,y)\in R^{n_1+n_2}\times R^{m_1+m_2}: Ax+By\leqslant b\}$ and its subset $P_I=P\cap R^{n_1}\times Z^{n_2}\times R^{m_1}\times Z^{m_2}$ to represent $S$. Notice that we can use auxiliary variables $y$; these representation are called extended (sometimes we MUST have some integer variables $y$ to construct a valid formulation; in this case, the formulation is not extended). One important question is: what is a good representation?

Read more »

Integer Programming Note I: Polyhedral Theory

Posted on 2016-10-04 | In Study Notes

In this post, I will briefly review some basic ideas in polyhedral theory, which are the building blocks for more advanced MIP techniques.

Read more »
<i class="fa fa-angle-left"></i>1…34

40 posts
5 categories
11 tags
Homepage Github E-Mail
© 2016 — 2024 Han Qiu
Powered by Hexo
|
Theme — NexT.Muse v5.1.4