In number theoretic algorithmic problems, there arises cases where you need to find all primes below a certain limit, say N. The improved trial division primality check algorithm can be used to find all primes below a certain limit that isn’t too large. But with larger limits, there is a need for a more efficient way. Sieve of Eratosthenes is a 2000 year old algorithm for doing just that, in an efficient manner.
- Create list of all numbers from 2 to N.
- Let p initially equal to 2. The first prime.
- Strike all multiples of p from the list less than or equal to N.
- Find the next number after p, that’s not yet crossed out on the list. This is the next prime.
- Repeat step 3 and 4 for the new p, unless p2 is greater than N.
- Numbers not crossed out on the list are primes below the limit N.
Why does it work:
The reason why Sieve of Eratosthenes works is because each time we start with a prime number ‘p’, and cross out all of its multiples because they can’t be prime as they have ‘p’ as their divisor. To see why the next number in each iteration that isn’t crossed out is a prime, realize that every composite number (a number that isn’t prime) has a prime factor less then itself. But since our next number that we chose as ‘p’ was left uncrossed after we crossed out all multiples of prime numbers less than this new ‘p’, thus it hadn’t any of those primes as its factor, and as a result is a prime itself. As an example, when we chose 7, we know that it wasn’t crossed out as multiple of any of the primes below 7, i.e. 2, 3, and 5, and thus doesn’t have any prime factor less than itself, and is thus a prime.
Implementation in C++:
A simple implementation of Sieve of Eratosthenes in C++ is as following:
using namespace std;
long long limit;
long long sqrtlimit = sqrt(limit);
vector<bool> sieve(limit+1, false);
for (int n=4;n<=limit;n=n+2)
1- We start with an array of size N+1. This is done for ease of understanding so that for each number ‘i’, we can look at array index ‘i’, sieve[i], and know whether it’s a prime or not. We ignore the first two elements in the array, as both 0 and 1 aren’t of interest to us in primality check.
2- We first start by striking out all multiples of two, which is essentially crossing out all even numbers. Since our array type is of bool, setting any index to true means it has been struck out.
3- The outer loop looks for next values of ‘p’ in our algorithm. If ‘n’ isn’t struck out, it’s a prime, and we move into the inner loop.
4- The inner loop strikes out all the multiples of our chosen prime in the outer loop. There are two important points to note here. We start striking out numbers starting with m=n2. This is because any multiple of ‘n’ smaller than n2 would have already been struck out as an nth multiple of numbers smaller than ‘n’. For example when n=5, we know that any multiple of 5 less than m=52=25 is already struck out as a 5th multiple of numbers 2, 3, and 4. (2*5=10, 3*5=15, 4*5=2*10=20). This is also essentially the reason why we are running the outer loop to the sqrt of limit, instead of the limit itself. Second point to note is that the inner loop progresses with a step of 2n. This is because every ‘m’ is going to be odd (as no prime other than 2 is even), and by progressing through a step of 2n, we are essentially ignoring all even multiples of m since they would have already been crossed out.
5- Finally we print the numbers that haven’t been struck out in the array as our left over prime numbers.