Generating a random integer from a range


procodes 2020. 6. 8. 22:02


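I need a function which generates a random integer in a given range (boundary values included). I don't have unreasonable quality/randomness requirements; I have four requirements:

I need it to be fast. My project needs to generate millions (sometimes even tens of millions) of random numbers, and my current generator function has proven to be a bottleneck.
I need it to be reasonably uniform (using rand() is perfectly fine).
The min-max range can be anything from <0, 1> to <-32727, 32727>.
It has to be seedable.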

I currently have the following C++ code:

output = min + (rand() * (int)(max - min) / RAND_MAX)

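The problem is that it is not really uniform: max is returned only when rand() = RAND_MAX (about 1 in 32768 on Visual C++, where RAND_MAX is 32767). This is a major issue for small ranges like <-1, 1>, where the last value is almost never returned.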
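
To see the skew concretely, the formula can be evaluated once for every value rand() could return instead of sampling. This is a minimal sketch, assuming RAND_MAX is 32767 (the Visual C++ value); `kRandMax` and `count_outputs` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>

// Assumption: RAND_MAX == 32767, as on Visual C++.
const int kRandMax = 32767;

// Applies output = min + r * (max - min) / RAND_MAX to every possible r
// and counts how many r values land on each output.
std::map<int, int> count_outputs(int min, int max) {
    assert(min <= max);
    std::map<int, int> counts;
    for (int r = 0; r <= kRandMax; ++r) {
        // long long avoids overflow in the multiplication for wide ranges
        long long scaled = static_cast<long long>(r) * (max - min) / kRandMax;
        ++counts[min + static_cast<int>(scaled)];
    }
    return counts;
}
```

For the range <-1, 1> this gives 16384 inputs that map to -1, 16383 that map to 0, and a single input (r = 32767) that maps to 1.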

So I grabbed pen and paper and came up with the following formula (which builds on the (int)(n + 0.5) integer rounding trick):

[Image: the formula; not preserved in this copy]

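But it still doesn't give me a uniform distribution. Repeated runs with 10000 samples give me a ratio of 37:50:13 for the values -1, 0, 1.

Could you please suggest a better formula? (Or even a whole pseudo-random number generator function.)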

A solution that is fast and somewhat better than yours, but still not properly uniform, is

output = min + (rand() % static_cast<int>(max - min + 1))

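Except when the size of the range is a power of 2, this method produces biased, non-uniformly distributed numbers regardless of the quality of rand(). For a comprehensive test of the quality of this method, see the write-up linked in the original answer (the link is not preserved here).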
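
The bias is a matter of counting: when the number of values rand() can return is not a multiple of the range size, some remainders occur one more time than others. The sketch below tallies this exactly for a given generator maximum. `modulo_counts` is a hypothetical helper, and it is meant for small maxima such as 32767.

```cpp
#include <cassert>
#include <vector>

// Counts how often each remainder occurs when every value in
// [0, rand_max] is reduced with % range_size.
// Intended for small rand_max (e.g. 32767), since it enumerates every value.
std::vector<int> modulo_counts(int rand_max, int range_size) {
    assert(range_size > 0 && rand_max >= 0);
    std::vector<int> counts(range_size, 0);
    for (int r = 0; r <= rand_max; ++r)
        ++counts[r % range_size];
    return counts;
}
```

With rand_max = 32767, a range of 3 values gives counts of 10923, 10923 and 10922, while a range of 4 values gives exactly 8192 each. With 30000 values, the first 2768 remainders occur twice and all the others once.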

The simplest (and hence best) C++ answer (using the 2011 standard) is:

#include <random>

std::random_device rd;     // only used once to initialise (seed) engine
std::mt19937 rng(rd());    // random-number engine used (Mersenne-Twister in this case)
std::uniform_int_distribution<int> uni(min,max); // guaranteed unbiased

auto random_integer = uni(rng);

No need to reinvent the wheel. No need to worry about bias. No need to worry about using time as the random seed.
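
Since the question also asks for speed and seedability, one possible way to package the snippet above is a small class that constructs the engine once and reuses it. This is only a sketch: the name `RangeRng` and its interface are made up for illustration, and the explicit-seed constructor is an assumption about how seeding would be used.

```cpp
#include <cassert>
#include <random>

// Hypothetical wrapper: owns one engine so it is not re-created per call.
class RangeRng {
public:
    // Seed explicitly so that runs can be reproduced.
    explicit RangeRng(std::mt19937::result_type seed) : engine_(seed) {}

    // Returns an integer in [min, max], both bounds included.
    int next(int min, int max) {
        assert(min <= max);
        std::uniform_int_distribution<int> dist(min, max);
        return dist(engine_);
    }

private:
    std::mt19937 engine_;
};
```

Two objects built with the same seed produce the same sequence. Passing std::random_device{}() as the seed gives a different sequence on each run instead.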

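If your compiler supports C++0x and using it is an option for you, then the new standard <random> header is likely to meet your needs. It has a high-quality uniform_int_distribution which accepts minimum and maximum bounds (inclusive, as you need), and you can choose among various random number generators to plug into that distribution.

Here is code that generates ten million random ints uniformly distributed in [-57, 365]. I've used the new standard <chrono> facilities to time it, as you mentioned that performance is a major concern.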

#include <iostream>
#include <random>
#include <chrono>

int main()
{
    typedef std::chrono::high_resolution_clock Clock;
    typedef std::chrono::duration<double> sec;
    Clock::time_point t0 = Clock::now();
    const int N = 10000000;
    typedef std::minstd_rand G;
    G g;
    typedef std::uniform_int_distribution<> D;
    D d(-57, 365);
    int c = 0;
    for (int i = 0; i < N; ++i) 
        c += d(g);
    Clock::time_point t1 = Clock::now();
    std::cout << N/sec(t1-t0).count() << " random numbers per second.\n";
    return c;
}

For me (2.8 GHz Intel Core i5) this prints out:

2.10268e+07 random numbers per second.

You can seed the generator by passing an int to its constructor:

    G g(seed);

If you later find that int doesn't cover the range you need for your distribution, this can be remedied by changing the uniform_int_distribution like so (e.g. to long long):

    typedef std::uniform_int_distribution<long long> D;

If you later find that the minstd_rand isn't a high enough quality generator, that can also easily be swapped out. E.g.:

    typedef std::mt19937 G;  // Now using mersenne_twister_engine

Having separate control over the random number generator, and the random distribution can be quite liberating.

I've also computed (not shown) the first 4 "moments" of this distribution (using minstd_rand) and compared them to the theoretical values in an attempt to quantify the quality of the distribution:

min = -57
max = 365
mean = 154.131
x_mean = 154
var = 14931.9
x_var = 14910.7
skew = -0.00197375
x_skew = 0
kurtosis = -1.20129
x_kurtosis = -1.20001

(The x_ prefix refers to "expected")
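
The expected values in the listing match the closed forms for a discrete uniform distribution over n = max - min + 1 integers: mean (min + max) / 2, variance (n^2 - 1) / 12, skewness 0 and excess kurtosis -6(n^2 + 1) / (5(n^2 - 1)). The measurement code itself is not shown in the answer, so the following is only a sketch of how such a comparison might be done; `Moments`, `expected_moments` and `sample_moments` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

struct Moments { double mean, var, skew, kurtosis; };

// Closed-form moments of the discrete uniform distribution on [min, max].
Moments expected_moments(int min, int max) {
    assert(min < max);
    const double n = static_cast<double>(max) - min + 1;
    return { (static_cast<double>(min) + max) / 2.0,
             (n * n - 1) / 12.0,
             0.0,
             -6.0 * (n * n + 1) / (5.0 * (n * n - 1)) };
}

// Sample moments of `count` draws (population variance, excess kurtosis).
Moments sample_moments(int min, int max, int count, unsigned seed) {
    assert(count > 1);
    std::minstd_rand gen(seed);
    std::uniform_int_distribution<> dist(min, max);
    std::vector<int> xs(count);
    double sum = 0;
    for (int& x : xs) { x = dist(gen); sum += x; }
    const double mean = sum / count;
    double m2 = 0, m3 = 0, m4 = 0;   // central moments, accumulated in a second pass
    for (int x : xs) {
        const double d = x - mean;
        m2 += d * d;
        m3 += d * d * d;
        m4 += d * d * d * d;
    }
    m2 /= count; m3 /= count; m4 /= count;
    return { mean, m2, m3 / std::pow(m2, 1.5), m4 / (m2 * m2) - 3.0 };
}
```

For [-57, 365], expected_moments gives a mean of 154 and a variance of about 14910.67, in line with the x_ values listed above.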

Let's split the problem into two parts:

Generate a random number n in the range 0 through (max-min).
Add min to that number

The first part is obviously the hardest. Let's assume that the return value of rand() is perfectly uniform. Using modulo will add bias to the first (RAND_MAX + 1) % (max-min+1) numbers. So if we could magically change RAND_MAX to RAND_MAX - (RAND_MAX + 1) % (max-min+1), there would no longer be any bias.

It turns out that we can use this intuition if we are willing to allow pseudo-nondeterminism into the running time of our algorithm. Whenever rand() returns a number which is too large, we simply ask for another random number until we get one which is small enough.

The running time is now geometrically distributed, with expected value 1/p where p is the probability of getting a small enough number on the first try. The number of rejected values, (RAND_MAX + 1) % (max-min+1), is always less than (RAND_MAX + 1) / 2 as long as the range holds no more than RAND_MAX + 1 values, so we know that p > 1/2 and the expected number of iterations will always be less than two for any such range. It should be possible to generate tens of millions of random numbers in less than a second on a standard CPU with this technique.
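
A minimal sketch of the retry loop described above, assuming the range holds at most RAND_MAX + 1 values. `rand_uniform` is a hypothetical name, and the standard rand() is used only because that is what this discussion assumes.

```cpp
#include <cassert>
#include <cstdlib>

// Returns an integer in [min, max] without modulo bias by discarding
// rand() results that fall in the incomplete last block.
int rand_uniform(int min, int max) {
    assert(min <= max);
    const long long n = static_cast<long long>(max) - min + 1;   // size of the target range
    const long long m = static_cast<long long>(RAND_MAX) + 1;    // number of values rand() can return
    assert(n <= m);                                              // assumption stated in the lead-in
    const long long limit = m - m % n;                           // largest multiple of n that is <= m
    long long r;
    do {
        r = std::rand();
    } while (r >= limit);                                        // too large: ask again
    return static_cast<int>(min + r % n);
}
```

Seeding works as usual through srand(). Every accepted value of r lies below a multiple of n, so each remainder r % n is produced by the same number of rand() results.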

EDIT:

Although the above is technically correct, DSimon's answer is probably more useful in practice. You shouldn't implement this stuff yourself. I have seen a lot of implementations of rejection sampling and it is often very difficult to see if it's correct or not.

How about the Mersenne Twister? The boost implementation is rather easy to use and is well tested in many real-world applications. I've used it myself in several academic projects such as artificial intelligence and evolutionary algorithms.

Here's their example where they make a simple function to roll a six-sided die:

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int.hpp>
#include <boost/random/variate_generator.hpp>

boost::mt19937 gen;

int roll_die() {
    boost::uniform_int<> dist(1, 6);
    boost::variate_generator<boost::mt19937&, boost::uniform_int<> > die(gen, dist);
    return die();
}

Oh, and here's some more pimping of this generator just in case you aren't convinced you should use it over the vastly inferior rand():

The Mersenne Twister is a "random number" generator invented by Makoto Matsumoto and Takuji Nishimura; their website includes numerous implementations of the algorithm.

Essentially, the Mersenne Twister is a very large linear-feedback shift register. The algorithm operates on a 19,937-bit seed, stored in a 624-element array of 32-bit unsigned integers. The value 2^19937-1 is a Mersenne prime; the technique for manipulating the seed is based on an older "twisting" algorithm -- hence the name "Mersenne Twister".

An appealing aspect of the Mersenne Twister is its use of binary operations -- as opposed to time-consuming multiplication -- for generating numbers. The algorithm also has a very long period, and good granularity. It is both fast and effective for non-cryptographic applications.

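#include <cstdlib>

int RandU(int nMin, int nMax)
{
    // RAND_MAX + 1.0 keeps the addition in double; RAND_MAX + 1 overflows int when RAND_MAX == INT_MAX
    return nMin + (int)((double)rand() / (RAND_MAX + 1.0) * (nMax - nMin + 1));
}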

This is a mapping of 32768 integers to (nMax-nMin+1) integers. The mapping will be quite good if (nMax-nMin+1) is small (as in your requirement). Note however that if (nMax-nMin+1) is large, the mapping won't work (For example - you can't map 32768 values to 30000 values with equal probability). If such ranges are needed - you should use a 32-bit or 64-bit random source, instead of the 15-bit rand(), or ignore rand() results which are out-of-range.

Here is an unbiased version that generates numbers in [low, high]:

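#include <cstdlib>

// Wrapper name is illustrative; assumes high - low < RAND_MAX.
int rand_unbiased(int low, int high) {
  int r;
  do {
    r = rand();  // retry while r is among the lowest (RAND_MAX + 1) % range values
  } while (r < ((unsigned int)(RAND_MAX) + 1) % (high + 1 - low));
  return r % (high + 1 - low) + low;
}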

If your range is reasonably small, there is no reason to cache the right-hand side of the comparison in the do loop.

I recommend the Boost.Random library. It's super detailed and well-documented, lets you explicitly specify what distribution you want, and in non-cryptographic scenarios can actually outperform a typical C library rand implementation.

Assume min and max are int values. [ and ] mean the endpoint is included; ( and ) mean it is excluded. With that notation, the expressions below give a value in each kind of interval using C++ rand().

For the interval notation, see:

https://en.wikipedia.org/wiki/Interval_(mathematics)

For rand, srand and RAND_MAX, see:

http://en.cppreference.com/w/cpp/numeric/random/rand

[min, max]

int randNum = rand() % (max - min + 1) + min;

(min, max]

int randNum = rand() % (max - min) + min + 1;

[min, max)

int randNum = rand() % (max - min) + min;

(min, max)

int randNum = rand() % (max - min - 1) + min + 1;  // requires max - min >= 2

In this thread rejection sampling was already discussed, but I wanted to suggest one optimization based on the fact that rand() % 2^something does not introduce any bias as already mentioned above.

The algorithm is really simple:

Calculate the smallest power of 2 greater than or equal to the interval length.
Draw one random number in that "new" interval.
Return that number if it is less than the length of the original interval; otherwise reject it and try again.

Here's my sample code:

#include <cmath>
#include <cstdlib>

int randInInterval(int min, int max) {
    int intervalLen = max - min + 1;
    //now calculate the smallest power of 2 that is >= `intervalLen`
    //(assumes the result does not exceed RAND_MAX + 1)
    int ceilingPowerOf2 = (int)std::pow(2.0, std::ceil(std::log2((double)intervalLen)));

    int randomNumber = rand() % ceilingPowerOf2; //this is "as uniform as rand()"

    if (randomNumber < intervalLen)
        return min + randomNumber;      //ok!
    return randInInterval(min, max);    //reject sample and try again
}

This works well especially for small intervals, because the power of 2 will be "nearer" to the real interval length, and so the number of misses will be smaller.

PS
Obviously avoiding the recursion would be more efficient (no need to calculate over and over the log ceiling..) but I thought it was more readable for this example.
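
For reference, a loop-based variant of the same idea might look like the sketch below. It computes the power of two once with integer arithmetic and uses a bit mask in place of %. It assumes RAND_MAX + 1 is itself a power of two no smaller than the one needed (true for the common values 32767 and 2147483647); `randInIntervalLoop` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdlib>

int randInIntervalLoop(int min, int max) {
    assert(min <= max);
    const long long intervalLen = static_cast<long long>(max) - min + 1;
    // Smallest power of 2 that is >= intervalLen, computed once.
    long long ceilingPowerOf2 = 1;
    while (ceilingPowerOf2 < intervalLen)
        ceilingPowerOf2 *= 2;
    assert(ceilingPowerOf2 <= static_cast<long long>(RAND_MAX) + 1);
    const long long mask = ceilingPowerOf2 - 1;
    for (;;) {
        const long long candidate = std::rand() & mask;  // same as rand() % ceilingPowerOf2
        if (candidate < intervalLen)
            return static_cast<int>(min + candidate);     // accept
        // otherwise reject the sample and draw again
    }
}
```

Because the power of two is less than twice the interval length, more than half of the draws are accepted, so the loop needs fewer than two iterations on average.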

The formula for this is very simple, so try this expression:

 int num = rand() % (max - min + 1) + min;
 // rand() returns an integer between 0 and RAND_MAX; the + 1 makes max inclusive

Unless I am mistaken, the following expression should be unbiased:

std::floor( ( max - min + 1.0 ) * rand() ) + min;

I am assuming here that rand() gives a random value in the range between 0.0 and 1.0, not including 1.0, and that max and min are integers with min < max.

Reference URL: https://stackoverflow.com/questions/5008804/generating-random-integer-from-a-range
