题目大意:对于给定的多个字符串,分别输出他们的最长回文字符串长度,一组一行。
Description
Andy the smart computer science student was attending an algorithms class when the professor asked the students a simple question, “Can you propose an efficient algorithm to find the length of the largest palindrome in a string?”
A string is said to be a palindrome if it reads the same both forwards and backwards, for example “madam” is a palindrome while “acm” is not.
The students recognized that this is a classical problem but couldn’t come up with a solution better than iterating over all substrings and checking whether they are palindrome or not, obviously this algorithm is not efficient at all, after a while Andy raised his hand and said “Okay, I’ve a better algorithm” and before he starts to explain his idea he stopped for a moment and then said “Well, I’ve an even better algorithm!”.
If you think you know Andy’s final solution then prove it! Given a string of at most 1000000 characters find and print the length of the largest palindrome inside this string.
Input
Your program will be tested on at most 30 test cases, each test case is given as a string of at most 1000000 lowercase characters on a line by itself. The input is terminated by a line that starts with the string “END” (quotes for clarity).
Output
For each test case in the input print the test case number and the length of the largest palindrome.
Sample Input
abcbabcbabcba abacacbaaaab END
Sample Output
Case 1: 13 Case 2: 6
解题思路
回文串,就是顺着和倒着来看都一样的字符串,是对称的,其对称中心可以是字符串中的字符,如aba,也可以是字符之间的空闲,如abba。为了减少麻烦,我们可以在所有空隙位置(包括首尾)加入特殊字符(原串不存在的字符)“#”,这样对称中心就一定是字符了(如果是空闲,那么对称中心是#)。
怎么计算以每个字符为中心的最长回文串的长度呢?枚举每个字符,暴力扫描,复杂度是O(n*n)的,很容易超时。Manacher算法的复杂度却是O(n)的,因为他利用了对称性原理,避免了很多重复计算。如字符串abcdcba,以s[4](字符d)为中心,回文长度是7,回文覆盖区域是1到7;本实例中,s[5]和s[3]的回文长度是一样的,因为以d为中心折叠一些,两边的字符完全一样。又如字符串ababbaba,以s[4]和s[5]的空隙为中心,回文长度是8,s[4]和s[5]、s[3]和s[6]、s[2]和s[7]、s[1]和s[8]的回文长度也一样,因为这个字符串是对称的。再如字符串cabababab,以s[5]为中心,回文长度是7,回文区域是s[2]到s[8],但s[3]和s[7]的回文长度却不一样了,分别是3和5,这是因为s[9]超出了前一个回文区域,没有对称性了。这是右边多的情况,左边也可以多,如字符串babababac,以s[5]为中心,覆盖区间[2, 8],s[3]的回文长度是5,而s[7]的回文长度是3,因为s[1]不在回文区域内。
Manacher算法是这样的,从左到右依次计算以每个字符为对称中心的最长回文长度,如果下一个字符包含在前一个回文串中,那么由对称性可知其在回文区域内的最长回文长度。如对称中心是i,回文向右扩展的长度是c[i](包括s[i]),如果j在[i, i+c[i])范围内,那么cspan style="color: #ff0000;">j = cstrong>i*2-j或者c[j] = i+c[i] – j,即要么是跟前面一样多,要么是超出回文区域被截取了。当然,这只是至少的情况,后面还需要继续扩展,当s[i+c[i == s[i-c[i时,回文区域就会扩大。由于回文区域右边界最大就扩展到n,区域内的可以根据对称性O(1)求出回文长度,因此整体复杂度是O(n)。
代码中,a记录原串,b记录加入“#”的字符串,c记录向右(向左)扩展的长度(含中心点),k记录回文扩展的最右边,j记录扩展最右边的那个回文中心。这样从左往右一次计算每个点为中心的回文长度:第一个点肯定大于k,从1开始扩展;如果在回文串内,根据对称性得到,如果能继续扩展就继续扩展。计算完i为中心,更新最右边的位置以及中心点。因为加入了#,回文串中有一半+1个是#,回文长度就是c[i]-1。
小实例:211121112111,模拟一遍,可了解流程。