Merge branch 'master' of https://github.com/algorithmica-org/algorithmica

sslotin · sslotin · commit 1305a253bdc0 · 2022-08-17T09:40:58.000+03:00
diff --git a/content/english/hpc/arithmetic/integer.md b/content/english/hpc/arithmetic/integer.md
@@ -93,7 +93,7 @@ This seems like an important architecture aspect, but in most cases, it doesn't
 - Little-endian has the advantage that you can cast a value to a smaller type (e.g., `long long` to `int`) by just loading fewer bytes, which in most cases means doing nothing — thanks to *register aliasing*, `eax` refers to the first 4 bytes of `rax`, so conversion is essentially free. It is also easier to read values in a variety of type sizes — while on big-endian architectures, loading an `int` from a `long long` array would require shifting the pointer by 2 bytes.
 - Big-endian has the advantage that higher bytes are loaded first, which in theory can make highest-to-lowest routines such as comparisons and printing faster. You can also perform certain checks such as finding out whether a number is negative by only loading its first byte.
 
-Big-endian is also more "natural" — this is how we write binary numbers on paper — but the advantage of having faster type conversions outweigh it. For this reason, little-endian is used by default on most hardware, although some CPUs are "bi-endian" and can be configured to switch modes on demand.
+Big-endian is also more "natural" — this is how we write binary numbers on paper — but the advantage of having faster type conversions outweighs it. For this reason, little-endian is used by default on most hardware, although some CPUs are "bi-endian" and can be configured to switch modes on demand.
 
 ### 128-bit Integers
 
diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md
@@ -1,6 +1,7 @@
 ---
 title: Getting Accurate Results
 weight: 10
+published: true
 ---
 
 It is not uncommon for there to be two library algorithm implementations, each maintaining its own benchmarking code, and each claiming to be faster than the other. This confuses everyone involved, especially the users, who have to somehow choose between the two.
@@ -111,7 +112,7 @@ for (int i = 0; i < N; i++)
     checksum ^= lower_bound(q[i]);
 ```
 
-It is also sometimes convenient to combine the warm-up run with answer validation, it if is more complicated than just computing some sort of checksum.
+It is also sometimes convenient to combine the warm-up run with answer validation, if it is more complicated than just computing some sort of checksum.
 
 **Over-optimization.** Sometimes the benchmark is outright erroneous because the compiler just optimized the benchmarked code away. To prevent the compiler from cutting corners, you need to add checksums and either print them somewhere or add the `volatile` qualifier, which also prevents any sort of interleaving of loop iterations.
 
@@ -127,10 +128,10 @@ https://github.com/sosy-lab/benchexec
 
 The issues we've described produce *bias* in measurements: they consistently give advantage to one algorithm over the other. There are other types of possible problems with benchmarking that result in either unpredictable skews or just completely random noise, thus increasing *variance*.
 
-These type of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling:
+These types of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling:
 
 - If you benchmark a compute-bound algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations of which are usually the main source of noise.
-- Otherwise, set core frequency to the what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e.g., `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it.
+- Otherwise, set core frequency to what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e.g., `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it.
 - If applicable, turn hyper-threading off and attach jobs to specific cores. Make sure no other jobs are running on the system, turn off networking and try not to fiddle with the mouse.
 
 You can't remove noises and biases completely. Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries.
diff --git a/content/russian/cs/decomposition/scanline.md b/content/russian/cs/decomposition/scanline.md
@@ -1,11 +1,12 @@
 ---
 title: Сканирующая прямая
 authors:
-- Сергей Слотин
+  - Сергей Слотин
 prerequisites:
-- /cs/range-queries
-- /cs/segment-tree
+  - /cs/range-queries
+  - /cs/segment-tree
 weight: 1
+published: true
 ---
 
 Метод сканирующей прямой (англ. *scanline*) заключается в сортировке точек на координатной прямой либо каких-то абстрактных «событий» по какому-то признаку и последующему проходу по ним.
@@ -22,7 +23,7 @@ weight: 1
 
 Это решение можно улучшить. Отсортируем интересные точки по возрастанию координаты и пройдем по ним слева направо, поддерживая количество отрезков `cnt`, которые покрывают данную точку. Если в данной точке начинается отрезок, то надо увеличить `cnt` на единицу, а если заканчивается, то уменьшить. После этого пробуем обновить ответ на задачу текущим значением `cnt`. 
 
-Как такое писать: нужно представить интересные точки в виде структур с полями «координата» и «тип» (начало / конец) и отсортировать со своим компаратором. Удобно начало отрезка обозначать +1, а конец -1, чтобы просто прибавлять к `cnt` это значение и на разбирать случае.
+Как такое писать: нужно представить интересные точки в виде структур с полями «координата» и «тип» (начало / конец) и отсортировать со своим компаратором. Удобно начало отрезка обозначать +1, а конец -1, чтобы просто прибавлять к `cnt` это значение и не разбивать на случаи.
 
 Единственный нюанс — если координаты двух точек совпали, чтобы получить правильный ответ, сначала надо рассмотреть все начала отрезков, а только потом концы (чтобы при обновлении ответа в этой координате учлись и правые, и левые граничные отрезки).
 
diff --git a/content/russian/cs/persistent/persistent-array.md b/content/russian/cs/persistent/persistent-array.md
@@ -2,8 +2,9 @@
 title: Структуры с откатами
 weight: 1
 authors:
-- Сергей Слотин
-date: 2021-09-12
+  - Сергей Слотин
+date: {}
+published: true
 ---
 
 Состояние любой структуры как-то лежит в памяти: в каких-то массивах, или в более общем случае, по каким-то определенным адресам в памяти. Для простоты, пусть у нас есть некоторый массив $a$ размера $n$, и нам нужно обрабатывать запросы присвоения и чтения, а также иногда откатывать изменения обратно.
@@ -20,7 +21,7 @@ int a[N];
 stack< pair<int, int> > s;
 
 void change(int k, int x) {
-    l.push({k, a[k]});
+    s.push({k, a[k]});
     a[k] = x;
 }
 
@@ -84,7 +85,7 @@ void rollback() {
 
 ```cpp
 int t = 0;
-vector<int> versions[N];
+vector< pair<int, int> > versions[N];
 
 void change(int k, int x) {
     versions[k].push_back({t++, x});
diff --git a/content/russian/cs/range-queries/sqrt-structures.md b/content/russian/cs/range-queries/sqrt-structures.md
@@ -1,10 +1,10 @@
 ---
 title: Корневые структуры
 authors:
-- Сергей Слотин
-- Иван Сафонов
+  - Сергей Слотин
+  - Иван Сафонов
 weight: 6
-date: 2021-09-13
+date: 2022-08-16
 ---
 
 Корневые оптимизации можно использовать много для чего, в частности в контексте структур данных.
@@ -23,16 +23,15 @@ date: 2021-09-13
 ```c++
 // c это и количество блоков, и также их размер; оно должно быть чуть больше корня
 const int maxn = 1e5, c = 330;
-int a[maxn], b[c];
-int add[c];
+int a[maxn], b[c], add[c];
 
 for (int i = 0; i < n; i++)
     b[i / c] += a[i];
 ```
 
-Заведем также массив `add` размера $\sqrt n$, который будем использовать для отложенной операции прибавления на блоке. Будем считать, что реальное значение $i$-го элемента равно `a[i] + add[i / c]`.
+Заведем также массив `add` размера $\sqrt n$, который будем использовать для отложенной операции прибавления на блоке: будем считать, что реальное значение $i$-го элемента равно `a[i] + add[i / c]`.
 
-Теперь мы можем отвечать на запросы первого типа за $O(\sqrt n)$ на запрос:
+Теперь мы можем отвечать на запросы первого типа за $O(\sqrt n)$ операций на запрос:
 
 1. Для всех блоков, лежащих целиком внутри запроса, просто возьмём уже посчитанные суммы и сложим.
 2. Для блоков, пересекающихся с запросом только частично (их максимум два — правый и левый), проитерируемся по нужным элементам и поштучно прибавим к ответу.
@@ -68,6 +67,7 @@ void upd(int l, int r, int x) {
             l += c;
         }
         else {
+            b[l / c] += x;
             a[l] += x;
             l++;
         }
@@ -111,8 +111,8 @@ vector< vector<int> > blocks;
 // возвращает индекс блока и индекс элемента внутри блока
 pair<int, int> find_block(int pos) {
     int idx = 0;
-    while (blocks[idx].size() >= pos)
-        pos -= blocks[idx--].size();
+    while (blocks[idx].size() <= pos)
+        pos -= blocks[idx++].size();
     return {idx, pos};
 }
 ```
diff --git a/content/russian/cs/sorting/selection.md b/content/russian/cs/sorting/selection.md
@@ -1,6 +1,7 @@
 ---
 title: Сортировка выбором
 weight: 2
+published: true
 ---
 
 Похожим методом является **сортировка выбором** (минимума или максимума).
@@ -10,7 +11,7 @@ weight: 2
 ```cpp
 void selection_sort(int *a, int n) {
     for (int k = 0; k < n - 1; k++)
-        for (j = k + 1; j < n; j++)
+        for (int j = k + 1; j < n; j++)
             if (a[k] > a[j])
                 swap(a[j], a[k]);
 }
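
Two of the code fixes in this diff (the loop direction in `find_block` and the missing `int` in `selection_sort`) are easy to check in isolation. Below is a minimal sketch that copies both functions as they read after the commit. The explicit `(int)` cast and the requirement that `pos` be a valid index into the concatenated blocks are assumptions of this sketch, not something the diff states.

```cpp
#include <utility>
#include <vector>

// Global block list, as in sqrt-structures.md.
std::vector< std::vector<int> > blocks;

// Returns {block index, index inside that block} for a global position.
// Assumption of this sketch: 0 <= pos < total number of elements,
// otherwise the loop would run past the last block.
std::pair<int, int> find_block(int pos) {
    int idx = 0;
    // Skip whole blocks while pos still lies beyond the current one.
    while ((int) blocks[idx].size() <= pos)
        pos -= (int) blocks[idx++].size();
    return {idx, pos};
}

// Selection sort from selection.md; `j` is now declared in the loop header.
void selection_sort(int *a, int n) {
    for (int k = 0; k < n - 1; k++)
        for (int j = k + 1; j < n; j++)
            if (a[k] > a[j])
                std::swap(a[j], a[k]);
}
```

With `blocks = {{1, 2, 3}, {4, 5}, {6}}`, calling `find_block(4)` walks past the first block and stops in the second one at offset 1. The pre-commit version (`>=` with `idx--`) would instead decrement `idx` below zero on the first iteration for any `pos` that falls inside the first block.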