[CF 1270F]Awesome substrings


Difficulty Ratings

Idea Creativity: 3/5
Techniques Used: 2/5
Implementation : 3/5
#beattheproblemsetter

Problem Link

Link by antontrygubO_o.

$O(N\sqrt{N})$ Solution

Not going to describe much because editorial talked about this. Basically, you seperate the awesome substrings into 2 cases, one is number of '1's less than $K$, and one is number of '1's more than or equal to $K$. Then you can do $NK$ on case 1 and $\frac{N^2}{K}$ on case 2.

$O(Nlog^2(N))$ Solution by gs12117

Consider you plot a path in a coordinate plane, where you start at origin, walk rightwards if you encounter a '1' or upwards otherwise. Then we are finding the number of pairs of points that have an integer slope. So assume we are enumerating $K$ from 0 to $N$, the slope and count the number of valid pairs which has slope equals to $K$. Lets see what happens when we fix $K$.

Imagine we building a graph, where each $x$ coordinate is a node, and an edge is built between $x_i$ and $x_j$ if there exists point on path $(x_i,y_i)$ and $(x_j,y_j)$ such that $\frac{y_j-y_i}{x_j-x_i} \ge K$. We can see that each connected component must be an interval. To see that, imagine 3 points $(x_i,y_i)$,$(x_k,y_k)$,$(x_j,y_j)$ where $\frac{y_j-y_i}{x_j-x_i} \ge K$ and $x_i \lt x_k \lt x_j$ holds. Then, at least one of $\frac{y_j-y_k}{x_j-x_k} \ge K$ and $\frac{y_k-y_i}{x_k-x_i} \ge K$ holds. Obviously, we only need to care about pairs that lie on the same component. Lets look at one of the connected components $[x_l,x_r]$.

Here comes the catch. Lets analysize the number of 0 that appeared in $[x_l,x_r]$. Lets define a good pair to be $(i,j)$ where $\frac{y_j-y_i}{x_j-x_i} \ge K$ holds. Lets say if we take a pair, we cover the whole segment of the path from $(x_i,y_i)$ to $(x_j,y_j)$. Now, lets try and take minimal pairs to cover the subpath in $[x_l,x_r]$. Firstly, as $[x_l,x_r]$ is connected, there exists a way to cover the whole path. Then, an important claim is: there are no parts of the subpath in the optimal solution that is covered by at least 3 pairs. Lets assume the claim is not true and imagine 3 pairs (illustrated as segments here) $[l_x,r_x],[l_y,r_y],[l_z,r_z]$ where $l_x,l_y,l_z < p$ and $r_x,r_y,r_z > p$ and that we have chosen all three. Call the $w$ such that $l_w$ is minimal (if tie pick any) to be unavailable. Similarly, call the $v$ where $r_v$ is maximal unavailable. So, one of $x,y,z$ must still be available. We remove that pair and then it will cover the whole segment. Therefore, the assumption that the set of pairs we take is minimal is not correct and contradiction occurs.

You might wonder, why are we doing all this stupid construction. Why is it related to the problem. Lets try to draw conclusions from the above construction. Let $c_0$ be the number of zeroes in the subpath $[x_l,x_r]$ and $c_1$ similarly. Let $d_0$ be the number of $0$ that is covered (if you have covered a $0$ twice then it is counted twice), and $d_1$ similarly (remember each part of the path represents one digit). We have $d_0 \ge d_1 \times K$, $c_0 \ge \frac{d_0}{2}$, $d_1 \ge c_1$ and $c_1 = x_r-x_l$, so $c_1\times(x_r-x_l)\times \frac{K}{2} \le c_0$. However, the total sum of $c_0+c_1$ is $N$. So sum of $x_r-x_l$ for all possible connected components in all possible $K$ is less than $\frac{N}{1}+\frac{N}{2}+...+\frac{N}{N}$ which is $O(Nlog(N))$.

So it remains to 1) Find all connected components $[x_l,x_r]$ where $x_l \neq x_r$ for eack $K$ and 2) Count the number of valid pairs in a connected component in $O((x_r-x_l) \times log(N))$. To do 1), we notice that if an edge $x_i - x_j$ exists during some $K$, it must exist for smaller values of $K$ too. Furthermore, it is easy to calculate, using convex hull, at which $K$ do $x_i$ stop having neighbours. If $x_i$ has a neighbour, it must belong to a connected component with size at least 2. Therefore, we are left with a total of $O(Nlog(N))$ checks to find the neighbours of some $x_i$. Also, obviously you only care about its leftmost neighbour and rightmost neighbour. To check if $x_i$ and $x_j$ are connected for a specific $K$ ( $i < j$ ), it actually means to compare whether $pf(j)-pf(i)$ is $\ge 0$, where $pf(x)$ is defined to be the prefix sum when $1$ has value $-K$ and $0$ has value $1$. Thus, you can find its useful neighbours in $O(log(N))$ using binary search, giving $O(Nlog^2(N))$ total.

To do 2), we can sort the $pf$ values and count the number of pairs of equal value, which is same as the editorial. So it also takes $O(Nlog^2(N))$.

Faster???

Yes, it is possible to make it $O(Nlog(N))$. For 1), instead of finding neighbours and build component, you realise that components are not important, and you can just compute answer for every continuous segment that has a neighbour. That means, even when the connected components are $[1,4]$ and $[5,8]$, you can check $[1,8]$ directly, and it hs no harm. So the binary search is redundant and we only need $O(N)$ for the convex hull. For 2), instead of sorting standardly, we can use radix sort. You can dump all the arrays you need to sort in to one bin, then you run radix sort, and do your tricks. So we removed the comparison sorting and made it $O(Nlog(N))$. Of course, its just for theoretical satisfaction as the $O(Nlog^2(N))$ works as like $O(Nlog(N))$ anyway.

See also

BtOI18 Alternating which (my solution) uses the same idea as the proof of the claim above. It is quite interesting.

Comments

Anton Trygub setting cool problems without himself noticing... Dude, I wrote an $O(N\sqrt{N})$ with bad constant and didn't succeed in squeezing through TL even with 1 hour of hard work. But now that a solution with faster time complexity exist, I cannot blame...

留言