Reflections on 2009

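The first half of the year went smoothly: I put in the effort and was rewarded, and that good form lasted until the end of my summer internship. In the second half the pressure rose sharply, mainly from job hunting and from my thesis project. The former is because competition in the domestic job market is fierce, the latter because my thesis topic is a very challenging one.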

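Pressure. Pressure is a good thing: at the right moment it turns into motivation, keeps my behavior in check, and indirectly helps me improve. But it should be met with a calm mind; too much anxiety or gloom leads to a very passive position. I once read that feeling heavy pressure simply means lacking self-confidence, and self-confidence is built on hard work. So the best attitude is still to take it easy, find where effort is most needed, and be a bit more diligent and hard-working. Putting in more than others naturally brings rich rewards, and as the small gains accumulate, the confidence comes by itself.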

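Positioning. First, find your points of interest and keep stimulating them; doing what you genuinely love and find meaningful is often the first step to success. Don't be led by the nose by the latest "hype": it scatters your attention, and things you are not truly interested in are usually not a good fit anyway. The best state for study and work is to open your eyes each morning excited about what you are going to do that day; whether it has a positive effect on others or sharpens your own professional skills, it brings satisfaction and a sense of achievement. The worst is short-lived enthusiasm that is gone after a few days or abandoned at the first difficulty, which usually ends with nothing to show for it. Second, prescribe the right medicine: find your weakest areas first and actively work on them. Don't dabble left and right, only to find that the skills you need most are not ready when the problem arrives. This brings up another topic: focus.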

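Focus. The most effective way to improve learning efficiency is to focus. Start with daily habits: setting aside a few large, uninterrupted blocks of study time each day helps maintain focus. Concentrate your energy on skills that raise your core competitiveness rather than on showy tricks that look novel but are of little use. Also, remember that a task switch (context switch) costs time and energy. In studying it looks like this: partway through you suddenly feel like going online for a moment, and that moment becomes ten or twenty minutes; when you switch back to the "study" process, you need another ten minutes or so to get back into it before you can continue, and thirty precious minutes are gone. The remedy is to keep the number of context switches as low as possible.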

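Planning. In the short term, prioritize tasks and make a plan for each day's work; in the long term, plan the career. This effectively cuts idle time during study (see the article 暗时间, "Dark Time") and raises productivity. I used not to believe that I could do two or three big things well at the same time, but in 2010 I want to challenge myself: starting from my daily schedule, I will try to get two or three things done well at once, efficiently and at a fast pace, because that is how my supervisor works. I once asked him whether he manages to keep everything in good order despite being so busy (doing research, attending meetings, and running a company) because he has successfully parallelized himself. He laughed and said he wished he had one million cores so that he could handle everything. In fact, I found that one important reason is his strict time management: he plans each day's schedule in detail, hour by hour.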

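Thinking. If you are merely busy every day without thinking deeply enough and summing up the lessons learned, progress will be limited. Getting up, eating, studying, and sleeping every day without thinking leaves you stuck at one level of thought, and big breakthroughs rarely come. The blog of Liu Weipeng (刘未鹏) has many good posts about "thinking"; my goal this year is to think more and summarize more, and so improve more.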

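Equanimity. A mature mind shows in "不以物喜,不以己悲" (taking neither delight in external gains nor sorrow in personal losses), and in the ability to withstand pressure and to adjust. At my current stage, moving from campus into society, pressure will come from every direction. This is when I most need a good attitude: to adjust actively, to defuse pressure, to motivate myself, to keep hold of my goals at all times, and not to lose my direction.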

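Vision. Look further ahead: next year's goal is only the first step, and the five-year or even ten-year goals deserve more attention. Of course, a high enough starting point for that first step helps a lot.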

Key events of 2009:

09 Jan – 09 Mar

TDA297 Distributed Systems II, EDA281 Parallel Computer Organization and Design

09 Mar – 09 May

TIN092 Algorithm, EDA203 Unix Internal, Internship Applications & Interviews

09 Jun – 09 Jul

Summer Intern at Nema Labs

09 Aug

Summer vacation in China

09 Sep – 09 Oct

TDA381 Concurrent Programming (pending), DAT145 Advanced topic in NDS

09 Oct – 09 Dec

DAT105 Computer Architecture, Master Thesis

Plans for 2010:

10 Jan – 10 Mar

TDA231 – Algorithms for machine learning and inference

10 Jan – 10 Sep

Master Thesis

10 Jun – 10 Aug

Summer Internship

10 Fall

To be continued.

Proposal for the “Search and sort” competition of Findwise

This April I took part in a competition on search technology held by Findwise and Mriday.

Search and Sort | Findwise

Current, Search and Sort | Findwise April 25th, 2009

We are constantly acquiring innovative ideas and solutions in the field of search technology. Therefore we have created the following contest to discover people who are interested in joining Findwise and build next generation’s search technology.

Project overview

The name of this contest is Search and Sort. We can start by looking at an example which everybody is familiar with, Google. The Google search engine is the most used search engine on the Web. The search results generated from it include webpages, PDF, Word documents, Excel spreadsheets, Flash, videos etc. For any query, up to the first 1000 results can be shown with a maximum of 100 displayed per page.

Despite all this power, it is still sometimes time consuming to find the exact piece of information you are looking for. This is because although the different results are ranked, they are not well organized. For the average user, wouldn’t it be neat if different types of search results are categorized and displayed in different groups?

Submission

Your submission for this contest should contain two parts

Think of a search engine based on the concept of Search and Sort. Come up with a user interface design including two pages. One welcome page with the search box (and whatever else you think is suitable), and one page with the different types of results categorized, sorted, and presented in a user-friendly fashion. There are no strict requirements on exactly how the results will be categorized. It is entirely up to you to decide the types of categories. In fact, this will be a key deciding factor when your contest submission is being reviewed.

The second part of the contest is to discuss the framework behind your graphical user interface. What programming language and platform do you suggest for building the system? How would you extract information from different types of results and use that information to categorize them? Describe the plan to develop and implement it. The key for this part is to show a good understanding of the basics of a search engine, and a passion to innovate new ideas.

Searching has become the standard way for Internet users to find information. This contest gives you the chance to take Searching to the next level. If you are interested in search technology and would like to join the leading vendor independent company within this segment in Sweden, then send us your ideas. We will carefully review your submission and provide feedback. Your submission should be in PDF or DOC format.

The deadline for submission is 2009-04-30

Reward

The top 5 submissions will be invited to Findwise and receive a learning session about the company and the future of search technology. The most outstanding submission will receive a monetary reward of 10 000 SEK. Job offers will be presented to qualified individuals if requirements are met.

I spent a whole Sunday writing a proposal on this topic. It was the first time I had written anything about “search technology”, which is a very interesting and hot area nowadays. Even though the paper looks a little bit naive now, I still like it, because I very much enjoy the feeling of writing down something interesting.

You can find and download my proposal from the link below:

Search and sort.pdf

Getting the H1N1 flu vaccine in Sweden

If you want to get the H1N1 flu vaccine in Gothenburg, you can refer to this page.

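The left sidebar of that page lists the Vårdcentraler (health centres) in the different districts of Gothenburg; the ones closest to Chalmers belong to the Centrum district. I went to Vårdcentralen Gibraltargatan, right next to the library. Its page has a dedicated section with vaccination times and related information: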

Våra vaccinationstider

För information om influensan och vaccineringen

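They usually open at 8 a.m., but the vaccine is still in high demand: when I arrived at 11 a.m. yesterday it had already run out, so today I arrived at 8 a.m., got queue number 24, and waited a little over half an hour for my turn. While waiting I filled in a personal information form, then went in for the injection; afterwards the doctor fills in a small card for you to keep as a record. The form is in Swedish, though, so here is a rough translation (for reference only):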

underlag för pandemivaccination (pandemrix)

för patienten

inför vaccinationen mot den pandemiska influensan ber vi dig svara på följande frågor

1. har du tidigare fått någon allvarlig reaktion (som yrsel, svimning, andnöd eller utslag)

ja/nej/vet ej

2. är du allergisk mot ägg

3. medicinerar du med någon blodförtunnande medicin, t.ex. Waran eller Fragmin? (gäller ej Trombyl)

4. har du någon sjukdom eller medicin som påverkar ditt immunförsvar

5. har du nyligen (inom 1 månad) fått något annat vaccin

6. tillhör du någon medicinsk riskgrupp för den pandemiska flu

———————————

Basis for pandemic vaccination (Pandemrix)
For the patient

Before the vaccination against pandemic influenza, please answer the following questions
1. Have you ever had any severe reaction (such as dizziness, fainting, shortness of breath or a rash)?
yes/no/do not know
2. Are you allergic to eggs?
3. Are you taking any blood-thinning medicine, such as Waran (warfarin) or Fragmin? (Does not apply to Trombyl)
4. Do you have any illness or medication that affects your immune system?
5. Have you recently (within 1 month) received any other vaccine?
6. Do you belong to any medical risk group for pandemic flu?

An interesting algorithm problem: the longest plateau

Recently I came across an interesting algorithm problem:

Problem:

Given an array, develop an efficient algorithm that computes the length of the longest plateau. A plateau is a consecutive segment of an array with equal contents. For example, if x[] = {1, 2, 3, 4, 4, 4, 5, 5, 6}, then we have six plateaus, which are 1, 2, 3, 4-4-4, 5-5 and 6, and the length of the longest plateau is 3.

Analysis:

Well, a straightforward idea is to compute the lengths of all plateaus from left to right and keep the longest one. The pseudo-code looks like this:

for each element a[i] in the array a[], starting from the second one
     if a[i] is equal to a[i-1]
          add 1 to the length for the current plateau
          check whether the current length is the longest one so far
     else
          reset length to 1 // plateau length is at least 1

How lines 5 and 6 are implemented depends on whether we need to store the length of every plateau. If we only want the longest length, we can keep the code as it is and use “length” as a temporary variable that is simply reset whenever a new plateau starts. If, on the other hand, we need to keep track of the lengths of all plateaus, we need an array “length[]” to store them, and instead of resetting we move on to its next entry.

/*
 * input: an array a[], the length of the array n
 * output: the length of the longest plateau
 */
int longestPlateau (int a[], int n)
{
	if (n == 0)
		return 0;

	int length = 1;         /* length of the plateau ending at a[i] */
	int longestLength = 1;  /* longest plateau seen so far */

	for (int i = 1; i < n; i++)
	{
		if (a[i] == a[i-1])
		{
			length++;
			/* plain comparison instead of max(), which standard C does not provide */
			if (length > longestLength)
				longestLength = length;
		}
		else
			length = 1;
	}
	return longestLength;
}
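The second case mentioned above, keeping the length of every plateau, could be sketched roughly as follows. This is only an illustration: the function name plateauLengths, its signature, and the assumption that the caller provides an output array with room for n entries are all my own choices, not part of the original problem statement.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (name and signature are assumptions):
 * writes the length of every plateau of a[0..n-1] into lengths[]
 * and returns the number of plateaus found.
 * lengths[] must have room for n entries (worst case: no two
 * neighbouring elements are equal).
 */
size_t plateauLengths(const int a[], size_t n, size_t lengths[])
{
    assert(n == 0 || (a != NULL && lengths != NULL));

    if (n == 0)
        return 0;

    size_t count = 0;   /* index of the plateau currently being extended */
    lengths[0] = 1;     /* the first element starts the first plateau */

    for (size_t i = 1; i < n; i++) {
        if (a[i] == a[i - 1])
            lengths[count]++;       /* same value: extend current plateau */
        else
            lengths[++count] = 1;   /* new value: start the next plateau */
    }
    return count + 1;
}
```

For x[] = {1, 2, 3, 4, 4, 4, 5, 5, 6} this should report six plateaus with lengths 1, 1, 1, 3, 2, 1; the longest length is then just the maximum entry of lengths[].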

Some more:

What if the given array is already sorted (in increasing order)?

Actually if the array is sorted, the algorithm can be much simpler:

Assume the longest length so far is L. Then we only need to compare a[i] and a[i-L]: if they are equal, all the elements between them are equal too (in a sorted array every element between them is at least a[i-L] and at most a[i]), so we have found a plateau of length L+1 and can add 1 to the current longest length. The code looks like this:

/*
 * input: a sorted array a[] (increasing order), the length of the array n
 * output: the length of the longest plateau
 */
int longestPlateau (int a[], int n)
{
	if (n == 0)
		return 0;

	int length = 1;  /* longest plateau found so far */
	for (int i = 1; i < n; i++)
	{
		/* equal ends imply an equal run of length+1, because a[] is sorted */
		if (a[i] == a[i-length])
			length++;
	}
	return length;
}
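To see why the shortcut really depends on the array being sorted, the two versions can be put side by side. In the sketch below they are renamed longestPlateauGeneral and longestPlateauSorted (names I made up so that both can live in one file); the logic is otherwise the same as in the two functions above.

```c
#include <assert.h>

/* General version: correct for any array (hypothetical name). */
int longestPlateauGeneral(const int a[], int n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;

    int length = 1;
    int longest = 1;
    for (int i = 1; i < n; i++) {
        if (a[i] == a[i - 1]) {
            length++;
            if (length > longest)
                longest = length;
        } else {
            length = 1;
        }
    }
    return longest;
}

/* Shortcut version: only correct when a[] is sorted (hypothetical name). */
int longestPlateauSorted(const int a[], int n)
{
    assert(n >= 0);
    if (n == 0)
        return 0;

    int length = 1;
    for (int i = 1; i < n; i++) {
        /* i - length never goes below 0, since length grows by at most 1 per step */
        if (a[i] == a[i - length])
            length++;
    }
    return length;
}
```

On the sorted example {1, 2, 3, 4, 4, 4, 5, 5, 6} both versions return 3. On an unsorted array such as {1, 1, 2, 1}, the general version returns 2, while the shortcut compares a[3] with a[1], finds them equal, and wrongly returns 3, because the 2 in between is never looked at.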

Launched my master thesis finally

Today I talked with my supervisor again and finally, officially, launched my Master Thesis!

Over the next 12 months I will focus on developing a methodology for analyzing the performance bottlenecks of large-scale multi-threaded software. I am really excited about the challenging problem of how to find performance bottlenecks as we move to many-core systems with hundreds of threads running concurrently. It will be a lot of fun, and I believe I can learn a lot from it.

Another fun thing: in the project plan written by my supervisor, the total budget is incorrectly given as “81 000 KSEK”. I hope Ericsson does not notice the mistake; if they don't, they will have to pay us a lot of money, since they have already approved the project :)