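In an academic paper, the abstract makes the first impression, but the introduction in Chapter 1 may equally decide whether the paper is accepted. Many people habitually leave the introduction until last. My advice is to write it at the very beginning and hand it to your advisor for revision as soon as it is done, because the introduction already defines the problem and sketches the general direction you will take to address it. Most students writing technical papers have little trouble describing methods, running experiments, or analyzing data, but they may struggle to organize all of this into an academic paper. This is the part to discuss with your advisor several times.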

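How should an introduction be written? The four characters 起承轉合 (opening, development, turn, synthesis) can give the introduction of your paper a framework: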

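**Opening (起):** Problem description. Moving **from the general to the specific**, lay out the field of the problem your paper sets out to solve.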

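**Development (承):** Related work. State concretely which related studies exist in this problem area. You can divide the related work into two or three major categories (subdividing a category one more level if necessary), and then, within each category, introduce the methods of the related studies and their possible shortcomings in roughly chronological order.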

**Turn (轉):** Necessity of the research. Given that related work already exists, explain why you still need to do this research.

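**Synthesis (合):** Research purpose. Define your research objective in very explicit terms; sometimes you can also add an assessment of the value of your own method.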

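**Second turn (再轉):** Within a complete paper, the introduction is itself the "opening," so in a typical thesis or long journal paper the last paragraph of the introduction also gives a brief description of the chapters and sections that follow.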

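Below I use the paper Discriminative Common Vectors for Face Recognition, published in 2006 in IEEE Trans. on Pattern Analysis and Machine Intelligence, as an illustration. Because this paper on face recognition is a long paper, its introduction is somewhat long: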

**Opening (起): Problem description**

RECENTLY, due to military, commercial, and law enforcement applications, there has been much interest in automatically recognizing faces in still and video images. This research spans several disciplines such as image processing, pattern recognition, computer vision, and neural networks. The data come from a wide variety of sources. One group of sources is the relatively controlled format images such as passports, credit cards, photo IDs, drivers’ licenses, and mug shots.

The introduction of this paper opens with applications. The statements above are fairly general; the one that follows narrows the scope: A more challenging class of application imagery includes real-time detection and recognition of faces in surveillance video images, which present additional constraints in terms of speed and processing requirements [1].

Next, the technical problem is defined: Face recognition can be defined as the identification of individuals from images of their faces by using a stored database of faces labeled with people’s identities.

Why face recognition is difficult, and which factors affect it: This task is complex and can be decomposed into the smaller steps of detection of faces in a cluttered background, localization of these faces followed by extraction of features from the face regions, and, finally, recognition and verification [2].

It is a difficult problem as there are numerous factors such as 3D pose, facial expression, hair style, make up, etc., which affect the appearance of an individual’s facial features. In addition to these varying factors, lighting, background, and scale changes make this task even more challenging. Additional problematic conditions include noise, occlusion, and many other possible factors.

**Development (承): Related work**

Many methods have been proposed for face recognition within the last two decades [1], [3]. (Statements that a method has been proposed are all expressed in the present perfect tense.)

The related methods now appear: Among these methods, appearance-based approaches operate directly on images or appearances of face objects, and process the images as two-dimensional holistic patterns. In these approaches, a two-dimensional image of size w by h pixels is represented by a vector in a wh-dimensional space. Therefore, each facial image corresponds to a point in this space. This space is called the sample space or the image space, and its dimension typically is very high [4]. Note: these methods are described entirely in the present tense, because this is how the methods still operate today (a statement of fact).

The problem this class of methods runs into: However, since face images have similar structure, the image vectors are correlated, and any image in the sample space can be represented in a lower-dimensional subspace without losing a significant amount of information.

Method 1 for solving the problem: The Eigenface method has been proposed for finding such a lower-dimensional subspace [5]. The key idea behind the Eigenface method, which uses Principal Component Analysis (PCA), is to find the best set of projection directions in the sample space that will maximize the total scatter across all images such that (mathematical expression omitted) is maximized. Here, ST is the total scatter matrix of the training set samples, and W is the matrix whose columns are the orthonormal projection vectors. The projection directions are also called the eigenfaces. Any face image in the sample space can be approximated by a linear combination of the significant eigenfaces. The sum of the eigenvalues that correspond to the eigenfaces not used in reconstruction gives the mean square error of reconstruction.
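To make the quoted Eigenface description concrete, here is a minimal NumPy sketch of PCA on flattened images. It is an illustration under stated assumptions, not the paper's implementation: the function name `eigenfaces`, the random toy data, and the choice of an unnormalized total scatter matrix are all hypothetical. With that choice, the total squared reconstruction error equals the sum of the discarded eigenvalues; dividing both by the number of samples gives the mean square error the passage mentions.

```python
import numpy as np

def eigenfaces(X, k):
    """Return the sample mean, the top-k projection directions, and all eigenvalues.

    X has one flattened image per row, shape (n_samples, d).
    """
    mean = X.mean(axis=0)
    A = X - mean                      # center the samples
    S_T = A.T @ A                     # unnormalized total scatter matrix, (d, d)
    eigvals, eigvecs = np.linalg.eigh(S_T)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mean, eigvecs[:, :k], eigvals

# Toy data standing in for face images (assumption: 20 samples, 6 "pixels").
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))

mean, W, eigvals = eigenfaces(X, k=3)
Y = (X - mean) @ W                    # coordinates in the eigenface subspace
X_hat = mean + Y @ W.T                # reconstruction from 3 eigenfaces
sq_error = np.sum((X - X_hat) ** 2)   # total squared reconstruction error
discarded = eigvals[3:].sum()         # eigenvalues of the unused directions
```

For real images, d is far larger than the number of samples, so practical code usually avoids forming the d-by-d matrix directly; the sketch keeps it only because it mirrors the quoted definition.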

Drawback 1.1 of the method: This method is an unsupervised technique since it does not consider the classes within the training set data. In choosing a criterion that maximizes the total scatter, this approach tends to model unwanted within-class variations such as those resulting from the differences in lighting, facial expression, and other factors [6], [7].

Drawback 1.2 of the method: Additionally, since the criterion does not attempt to minimize the within-class variation, the resulting classes may tend to have more overlap than other approaches. Thus, the projection vectors chosen for optimal reconstruction may obscure the existence of the separate classes.

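Method 2: This method is related to the previous one; it was introduced to remedy the drawbacks of Method 1. In general, if you cover two classes of methods, you need to explain where they differ.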

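The Linear Discriminant Analysis (LDA) method is proposed in [6] and [7]. (The present tense here is not quite right. I would suggest the past tense, because the publication of these two papers is already in the past.)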

A rough description of Method 2: This method overcomes the limitations of the Eigenface method by applying the Fisher’s Linear Discriminant criterion. This criterion tries to maximize the ratio (mathematical expression omitted) where SB is the between-class scatter matrix, and SW is the within-class scatter matrix.

Characteristics and advantages of Method 2: Thus, by applying this method, we find the projection directions that on one hand maximize the Euclidean distance between the face images of different classes and on the other minimize the distance between the face images of the same class. This ratio is maximized when the column vectors of the projection matrix W are the eigenvectors of ....

The problem with Method 2: In face recognition tasks, this method cannot be applied directly since the dimension of the sample space is typically larger than the number of samples in the training set. As a consequence, SW is singular in this case. This problem is also known as the “small sample size problem” [8].
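The small sample size problem is easy to reproduce numerically. The sketch below is only an illustration: the helper `within_class_scatter` and the toy sizes (3 classes, 4 samples each, 50 dimensions) are assumptions, not values from the paper. Each class contributes at most (samples in the class minus 1) to the rank, so with N samples in C classes the rank of the within-class scatter matrix is at most N - C, which is below the dimension d whenever d is large.

```python
import numpy as np

def within_class_scatter(X, labels):
    """Sum, over classes, of the scatter of each class around its own mean."""
    d = X.shape[1]
    S_W = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        centered = Xc - Xc.mean(axis=0)
        S_W += centered.T @ centered
    return S_W

# Toy setup (assumption): far fewer samples than dimensions.
n_classes, per_class, d = 3, 4, 50
rng = np.random.default_rng(0)
X = rng.normal(size=(n_classes * per_class, d))
labels = np.repeat(np.arange(n_classes), per_class)

S_W = within_class_scatter(X, labels)
rank = np.linalg.matrix_rank(S_W)
n_samples = n_classes * per_class
# rank cannot exceed n_samples - n_classes, so S_W (50 x 50) is singular
# and has no ordinary inverse.
```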

Four related papers on Method 2: In the last decade numerous methods have been proposed to solve this problem. Tian et al. [9] used the Pseudoinverse method by replacing ... with its pseudoinverse. The Perturbation method is used in [2] and [10], where a small perturbation matrix is added to SW in order to make it nonsingular. Cheng et al. [11] proposed the Rank Decomposition method based on successive eigendecompositions of the total scatter matrix ST and the between-class scatter matrix SB.

Drawbacks of the related work: However, the above methods are typically computationally expensive since the scatter matrices are very large (e.g., images of size 256 by 256 yield scatter matrices of size 65,536 by 65,536). Swets and Weng [7] proposed a two stage PCA+LDA method, also known as the Fisherface method, in which PCA is first used for dimension reduction so as to make SW nonsingular before the application of LDA.
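The 256 by 256 example in the passage above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the function name `scatter_matrix_cost` and the assumption of 8 bytes per entry (double precision, dense storage) are mine, not the paper's.

```python
def scatter_matrix_cost(width, height, bytes_per_entry=8):
    """Return (dimension, number of entries, size in GiB) of a dense scatter matrix."""
    d = width * height                 # each image becomes a d-dimensional vector
    entries = d * d                    # the scatter matrix is d by d
    gib = entries * bytes_per_entry / 2**30
    return d, entries, gib

d, entries, gib = scatter_matrix_cost(256, 256)
# d is 65,536, matching the figure quoted from the paper.
```

Under those assumptions a single scatter matrix holds about 4.3 billion entries and would take 32 GiB to store densely, which is why methods that manipulate it directly are described as computationally expensive.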

Improvements addressing the drawbacks above: However, in order to make SW nonsingular, some directions corresponding to the small eigenvalues of ST are thrown away in the PCA step. Thus, applying PCA for dimensionality reduction has the potential to remove dimensions that contain discriminative information [12], [13], [14], [15], [16]. Chen et al. [17] proposed the Null Space method based on the modified Fisher’s Linear Discriminant criterion (mathematical expression omitted). This method has been proposed to be used when the dimension of the sample space is larger than the rank of the within-class scatter matrix SW.

Further improvement 1: It has been shown that the original Fisher’s Linear Discriminant criterion can be replaced by the modified Fisher’s Linear Discriminant criterion in the course of solving the discriminant vectors of the optimal set in [18].

Procedure: In this method, all image samples are first projected onto the null space of SW, resulting in a new within-class scatter that is a zero matrix. Then, PCA is applied to the projected samples to obtain the optimal projection vectors.
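The two steps just quoted can be sketched in NumPy as follows. This is a rough illustration under stated assumptions, not the algorithm as published by Chen et al.: the function names, the tolerance `tol` used to decide which eigenvalues count as zero, and the random toy data are all hypothetical, and the dense eigendecomposition is only workable for small dimensions.

```python
import numpy as np

def within_class_scatter(X, labels):
    """Sum, over classes, of the scatter of each class around its own mean."""
    d = X.shape[1]
    S_W = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]
        centered = Xc - Xc.mean(axis=0)
        S_W += centered.T @ centered
    return S_W

def null_space_method(X, labels, n_components, tol=1e-8):
    """Project onto the null space of S_W, then apply PCA to the projected samples."""
    S_W = within_class_scatter(X, labels)
    eigvals, eigvecs = np.linalg.eigh(S_W)
    # Eigenvectors with (numerically) zero eigenvalue span the null space of S_W.
    null_basis = eigvecs[:, eigvals < tol * eigvals.max()]
    X_null = X @ null_basis @ null_basis.T        # step 1: projection
    centered = X_null - X_null.mean(axis=0)
    S_T = centered.T @ centered                   # total scatter after projection
    vals, vecs = np.linalg.eigh(S_T)
    W = vecs[:, np.argsort(vals)[::-1][:n_components]]   # step 2: PCA directions
    return X_null, W

# Toy setup (assumption): 3 classes, 4 samples each, 50 dimensions.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 4)
X = rng.normal(size=(12, 50))

X_null, W = null_space_method(X, labels, n_components=2)
S_W_after = within_class_scatter(X_null, labels)  # numerically a zero matrix
Y = X_null @ W                                    # final low-dimensional features
```

After the projection, every training sample of a class lands on the same point, so the within-class scatter vanishes and only between-class differences remain for PCA to pick up.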

Further improvement 2: Chen et al. also proved that by applying this method, the modified Fisher’s Linear Discriminant criterion attains its maximum.

**Turn (轉): Necessity of the research**

However, they did not propose an efficient algorithm for applying this method in the original sample space. Instead, a pixel grouping method is applied to extract geometric features and reduce the dimension of the sample space. Then, they applied the Null Space method in this new reduced space.

Some observations: In our experiments, we observed that the performance of the Null Space method depends on the dimension of the null space of SW in the sense that larger dimension provides better performance. Thus, any kind of preprocessing that reduces the original sample space should be avoided.

**Back to development (承) again:**

Method 3.1: Another novel method, the PCA+Null Space method, was proposed by Huang et al. in [15] for dealing with the small sample size problem.

Method description: In this method, at first, PCA is applied to remove the null space of ST, which contains the intersection of the null spaces of SB and SW. Then, the optimal projection vectors are found in the remaining lower-dimensional space by using the Null Space method. The difference between the Fisherface method and the PCA+Null Space method is that for the latter, the within-class scatter matrix in the reduced space is typically singular. This occurs because all eigenvectors corresponding to the nonzero eigenvalues of ST are used for dimension reduction.

Method 3.2: Yang et al. applied a variation of this method in [16]. After dimension reduction, they split the new within-class scatter matrix, ... (where PPCA is the matrix whose columns are the orthonormal eigenvectors corresponding to the nonzero eigenvalues of ST), into its null space .... Then, all the projection vectors that maximize the between-class scatter in the null space are chosen. If, according to some criterion, more projection vectors are needed, the remaining projection vectors are obtained from the range space.

Drawbacks of Methods 3.1 and 3.2: Although, the PCA+Null Space method and the variation proposed by Yang et al., use the original sample space, applying PCA and using all eigenvectors corresponding to the nonzero eigenvalues make these methods **impractical** for face recognition applications when the training set size is large. This is due to the fact that the computational expense of training becomes very large.

Method 4: Last, the Direct-LDA method is proposed in [12].

Method description: This method uses the simultaneous diagonalization method [8]. First, the null space of SB is removed and, then, the projection vectors that minimize the within-class scatter in the transformed space are selected from the range space of SB.

Drawbacks of the method: However, removing the null space of SB by dimensionality reduction will also remove part of the null space of SW and may result in the loss of important discriminative information [13], [15], [16]. Furthermore, SB is whitened as a part of this method. This whitening process can be shown to be redundant and, therefore, should be skipped. (The whole point of describing a method's drawbacks is to "turn" toward your own research.)

**Synthesis (合): Research purpose**

In this paper, a new method is proposed which addresses the limitations of other methods that use the null space of SW to find the optimal projection vectors.

Limitation of this research: Thus, the proposed method can be only used when the dimension of the sample space is larger than the rank of SW.

**Second turn (再轉):**

The remainder of the paper is organized as follows: (This sentence is a stock formula of technical writing; just write it this way. Pay attention to the punctuation as well.)

In Section 2, the Discriminative Common Vector approach is introduced. In Section 3, we describe the data sets and experimental results. Finally, we formulate our conclusions in Section 4.

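I had intended to dissect the introduction of another paper, but this post is already getting long, so I will stop here. If you have a paper to write, take the four-step opening, development, turn, and synthesis approach above, find two or three journal papers to dissect, analyze the similarities and differences in how each one is written, and record what you find. Your own writing will then go much more smoothly.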

(Postscript: it took me more than two hours to put this post together.)
