Sunday, January 07, 2007

面试编程题的一些经验

发信人: starstorm (尘影), 信区: JobHunting
标 题: 面试编程题的一些经验
发信站: BBS 未名空间站 (Sat Jan 6 18:53:16 2007)


我几个月前开始找工作的时候,一直在这个版上潜水,得到了很多有价值的信息。我总
想有机会回报一下大家。我很早就想把我的一些心得写下来,给大家参考。但因为我不
习惯灌水,刚换了工作,搬了家又比较忙,一直没有动笔。现在比较闲了,于是决定把
它写下来。LP马上也要找工作了,也算是给她攒RP吧。

关于找CS的工作,网上的资料和经验极多。我相信大家对算法,数据结构,面向对象之
类的肯定很熟悉了。我在这里不想重复别人已经讲过的东西。我只是想强调一下解答编
程题的一些原则,这是我一些经验和教训的总结,希望对大家有所帮助。除了举例以外
,我不准备Post我碰到的题目,因为绝大部分的编程题在网上都可以找到。

本人当时有一年多的工作经验,找的是Develper的工作,算是Entry Level或稍微高一
点的级别。所以我碰到的问题,大概和本版上活动的大多数人差不多。我不知道更
Senior的问题会怎么样,但一个朋友告诉我,只要是申请Develper,而不是Manager,
编程的问题是必不可少的。

我两年前找过工作,我发现这两次找工作有一个极其明显的区别。两年前的面世,编程
题强调算法,他们往往会要你写一个非常复杂的算法,但并不要求程序很准确。而现在
的面世,算法的要求大为降低,但其实更有挑战性,因为-面世人要求准确无误。一个
Manager给我出了一道很简单的编程题,写一个函数,我最后用C写完不到50行(还包括{
})。简单吗?不。因为他说,这个程序必须能编译通过,运行无误。

如果你还觉得简单,那我只能佩服。我学了十几年计算机,写的code不算少,特别是工
作一年多来,每天都在写。但我觉得这实在是不容易。

首先,仔细地考虑问题。一道题哪怕做过100遍,也要再三考虑。Do not work too
fast, do not try to solve a lot of questions. That does not give you any BIG
CREDIT! 记住,安全第一,言多必失。If the interviewer expects you to solve 3
problems in 45 mins, and you solve them within 30 mins. She may have to ask
you two more. If you solve them correctly in 15 mins, it may increase your
credit a little bit. But if you make one single mistake, it will be used
against you GREATLY! So, think of the benefit and risk, and consider whether
it is worthy to take the risk. Remember, for a entry-level programmer, the
company just expects you to pass the bar. So do not take risk impressing
them. And for a senior programmer, I also believe it will not increase your
credit a lot just because you can solve some simple programming problems
quicklier than others.

我们碰到的题目,一般会有两类,一类是写一个函数,另一类是设计一个类。还有一类
是设计多个类,以我的经验(包括从bbs上看到的),没有太多的公司考这类问题。

If the task is to implement a function, you need to ask to clarify:
(1) What is the task?
(2) What is the input?
(3) What is the output? How to return normally? How to return an error?
(4) What is the data structure?

First, you may ask the interviewer to give you some input example, then
write some more input example (test case) yourself. 如果要写出没有错误的程序
,就必须象tester那样思考问题,那样给自己挑错。我在后面会介绍一些容易出错的地
方。Test-driven的程序设计方法现在非常流行,我们准备面世当然不用搞得太复杂。
但一些testing的感觉会非常有帮助。Think it this way. What is the purpose of
your program? It is nothing but to process the input data! So when you write
code, you MUST bear it in mind what kind of data this statement, this code
snippet will process? So why not just enumerate all the test case before you
write your function? In my experience, writing test case can help me write
the program easier by just considering how to deal with those cases, and it
makes the program need less modification later!

Test cases for single function are typically a discussion of all the
possible input parameters, good input, boundary and bad input. You need to
pay special attention to boundary and bad input, which mean a branch in your
code.

举例

Implement function
char *strtoken(char *s1, const char *s2)

Strtoken() considers the string s1 to consist of a sequence of zero or more
text tokens separated by single characters from the separator string s2.

First you need to ask the interviewer how to return the result. She may say
the first call, returns the first token, second call returns the second
token, blah... If all the tokens have been returned, the call return NULL.
Then you need to ask her if it OK to change the input string s1, and she may
say it is fine (Actually this is implicited because s1 does not have const
while s2 does, pay attention to such signals. And remember, when you write
function prototype yourself, add const in the parameters whenever possible).
This means you can use a NULL to replace separators in s1 and return a NULL
-ended token, and this returned pointers will point to some place in the
string s1. Then you may ask as this function only returns tokens, not the
pointer to a position in string s1 that is to be searched inthe next call,
how to distinguish each calls. She may ask you to think it yourself (
probably) or she may tell you only the first call has input s1, the
subsequent calls set s1 NULL. Then you should know you need to record the
information of the current position in the function, which means a local
static variable.

Test case:

s2: NULL, empty (*s2 ==\0), one or more chars
use D to represent separator, N non-separator
s1: NULL, empty, all N, all D, starting with N, starting with D, ending with
N, ending with D
A typical s1: NNNNDDDNNNDDDNNND....

After you lists all the test cases, ask the interviewer to review it and
whether she has more to say before you proceed to implement the function.

After you finish the function and the interviewer accept it. You may mention
risks, performance bottlenecks, possible optimizations, and alternative
algorithms.

In this function, you may say the bottleneck is the function to search
whether a char in s1 is in s2, so you may optimize the function find(char ch
, const char * separators). (Suppose you use such a function, and I would
recommend this) Several ways, such as sort the string beforehand and use
binary search, use advanced datastructure such as std:set, or hashtable, or
use an array of size 256 to represent whether an ASCII charactar is in
separators.

Also you may want to mention that it is not thread-safe because of the local
static variable. And propose some solutions such as use a class to wrap the
function.

举例

double atof(const char * str)

Test case:
str: NULL, empty
typical: [+-]?\d+{\.+\d*}? 我用regular expression 表示。这个题如果不写出所有
的test case,很难做对。

Some questions should be asked?
How to deal with non-digit, friendly (ignore) or non-friendly (error), how
to return an error?
How to deal with special case: str is NULL or empty
empty: 0 obvious
NULL? 0 or error?
How to deal with overflow/underflow

举例

walkclock(int hour, int min) return the angel between hour hand and minute
hand.

Questions:
What exactly the function is expected to do? Is the direction a factor (
clockwise or counter-clockwise?) From hour hand to minute handor or the
contrary?
Boundy condition. If the two hands overlap, how to do with it?
What is 值域 (domain), [0, 360) or [-180, 180). I think you must explicitly
write this otherwise it will be very confusing.
Boundary condition, 12:00 (or 0:00)

这三道题都是很简单的题目,但我在interview中都出过错。

另一类题是设计一个class完成某个任务。它的出现频率比第一类少。我只是想简单的
谈一下一些基本的原则。我不准备涉及多个类的design,它们的出现频率更低,而且我
自己也没有把握。

(1) Ask the requirements/responsibilities of the class, a class should have
small set of responsibilities
(2) Write interface (public method to be called by the client) first, do not
forget write constructor if something need to be initialized
(3) Make sure what exactly a class, and its interface do, write some example
(test cases)
(4) For every public method, design and implement following the discuss on
function above
(5) Pay attention to the signature of methods. Use const wheneven possible.
(6) The signature in implementation shoud be the same with the signature in
declaration
(7) Be prepared to deal with errors if anything related to memory allocation
or network conection or file I/O
(8) If constructor use new, or create other types of resource, need to write
destructor, need to override copy constructor and assignment constructor.
(9) If a class is possible to be used as a base class, make destructor
virtual
(10) Make every member as private as possible.

这些原则很简单,但记住它们可以避免很多最愚蠢的错误。我会在心里记住它们,然后
在写程序的时候一条条地检查。

有很多好书介绍C++的设计,所以我就不用再多说了。

下面是我对各种容易导致错误的地方的一个分类总结。当你在编程时看到它们,心里就
要想起它们可能会导致某种类型的错误。

NULL-ended char string (char *)
(1) NULL
(2) empty, pointing to \0
(3) The char array that can accomondate string is strlen()+1
(4) Some function, like atof(), input string has very complicated form.
(5) More than one string, consider the relationship between two strings, e.g
., strtoken()

Function
(1) Return pointer to local variable
(2) Return reference to local variable
(3) Static local variable (not thread safe, can use a class to replace it)
(4) Use const as much as possible
(5) Return type? Value, reference, const reference, pointer?
(6) How to return error?
(7) If the input parameter is an object, better use reference or pointer

Array (vector), pointer to array
Out of band
(1) Index is result of an operation
(2) Index must be unsigned!!! Otherwise need to gurantee it is > 0.
Especially when char is used as index. char is [-128, 127]
(3) For vector, use iterator as much as possible
(4) Must check bound before access. Otherwise, explain to interviewer why
the check is not necessary. e.g., size is 256, index is unsigned char [0,255
].

Link list
(1) what is data structure. Typically
struct Node
{
struct Node * mNext;
int value;
};
(2) Empty list
struct Node * head = NULL
(3) Only one Node in the list
(4) Pay special attention to operation on head or tail
(5) Whether the list has loop

Type declaration, their value ranges
In different architectures, this is not the same. In X86, int is 32 bit.

Expensive initialization
If a varialbe need a loop to intialize, try to delay it as late as possible

Bit operation
Use unsigned char (int, long)
Size of int, char, long

Tree
Is a general tree or binary search tree, or balanced binary search tree
Is it an empty tree

Arithmetic operator
Overflow/underflow, especially pay attention when cast another type to
interger/double. E.g., atof(), must mention the possibility of overflow/
underflow.

Pointer
当你在使用指针时,除非有充分的理由,必须检查它是否为空。例如,double atof(
const char * str)。绝对不要对interviewer说,"How about assuming the input
str is not NULL?" She may say "That is OK". But remember, probably she would
forget it when she criticizes your program in the end during the interview,
or when she discusses your case in the meeting of the hiring committee, she
may submit her note which is the exact copy of what you have written, or
even pictures she took on your code in the board. So DO NOT TRUST her. Just
write two simple lines:
if ( NULL == str)
return 0;
That is simple, and safe.

这是我在找工作中一些经验和教训的总结,希望对大家有所帮助。

最后感谢在这个版上活动的所有的朋友,尽管我在这里从没有说过话,问过问题,但我
的确从这里得到了很大的帮助。祝愿大家在新的一年里,都能找到理想的工作。

No comments: