- add_docstring(...)
- add_docstring(obj, docstring)
Add a docstring to a built-in obj if possible.
If the obj already has a docstring, raise a RuntimeError.
If this routine does not know how to add a docstring to the object,
raise a TypeError.
- add_newdoc_ufunc(...)
- add_newdoc_ufunc(ufunc, new_docstring)
Replace the docstring for a ufunc with new_docstring.
This method will only work if the current docstring for
the ufunc is NULL. (At the C level, i.e. when ufunc->doc is NULL.)
Parameters
----------
ufunc : numpy.ufunc
A ufunc whose current doc is NULL.
new_docstring : string
The new docstring for the ufunc.
Notes
-----
This method allocates memory for new_docstring on
the heap. Technically this creates a memory leak, since this
memory will not be reclaimed until the end of the program
even if the ufunc itself is removed. However this will only
be a problem if the user is repeatedly creating ufuncs with
no documentation, adding documentation via add_newdoc_ufunc,
and then throwing away the ufunc.
- alterdot(...)
- Change `dot`, `vdot`, and `inner` to use accelerated BLAS functions.
Typically, as a user of Numpy, you do not explicitly call this function. If
Numpy is built with an accelerated BLAS, this function is automatically
called when Numpy is imported.
When Numpy is built with an accelerated BLAS like ATLAS, these functions
are replaced to make use of the faster implementations. The faster
implementations only affect float32, float64, complex64, and complex128
arrays. Furthermore, the BLAS API only includes matrix-matrix,
matrix-vector, and vector-vector products. Products of arrays with larger
dimensionalities use the built in functions and are not accelerated.
See Also
--------
restoredot : `restoredot` undoes the effects of `alterdot`.
- arange(...)
- arange([start,] stop[, step,], dtype=None)
Return evenly spaced values within a given interval.
Values are generated within the half-open interval ``[start, stop)``
(in other words, the interval including `start` but excluding `stop`).
For integer arguments the function is equivalent to the Python built-in
`range <http://docs.python.org/lib/built-in-funcs.html>`_ function,
but returns an ndarray rather than a list.
When using a non-integer step, such as 0.1, the results will often not
be consistent. It is better to use ``linspace`` for these cases.
Parameters
----------
start : number, optional
Start of interval. The interval includes this value. The default
start value is 0.
stop : number
End of interval. The interval does not include this value, except
in some cases where `step` is not an integer and floating point
round-off affects the length of `out`.
step : number, optional
Spacing between values. For any output `out`, this is the distance
between two adjacent values, ``out[i+1] - out[i]``. The default
step size is 1. If `step` is specified, `start` must also be given.
dtype : dtype
The type of the output array. If `dtype` is not given, infer the data
type from the other input arguments.
Returns
-------
arange : ndarray
Array of evenly spaced values.
For floating point arguments, the length of the result is
``ceil((stop - start)/step)``. Because of floating point round-off,
this rule may result in the last element of `out` being greater
than `stop`.
See Also
--------
linspace : Evenly spaced numbers with careful handling of endpoints.
ogrid : Arrays of evenly spaced numbers in N-dimensions.
mgrid : Grid-shaped arrays of evenly spaced numbers in N-dimensions.
Examples
--------
>>> np.arange(3)
array([0, 1, 2])
>>> np.arange(3.0)
array([ 0., 1., 2.])
>>> np.arange(3,7)
array([3, 4, 5, 6])
>>> np.arange(3,7,2)
array([3, 5])
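The caution about non-integer steps can be illustrated with a minimal sketch. The values of `start`, `stop` and `step` below are chosen only for illustration and do not come from the text above; the exact length returned by ``arange`` depends on floating point round-off on the platform.

```python
import math
import numpy as np

start, stop, step = 1.0, 1.3, 0.1  # illustrative values, not from the docstring

# arange derives its length from ceil((stop - start) / step), so round-off
# in that division decides whether an extra element is produced.
stepped = np.arange(start, stop, step)
expected_len = math.ceil((stop - start) / step)

# linspace takes the number of samples directly, so its length does not
# depend on round-off in a step value.
spaced = np.linspace(start, stop, num=3, endpoint=False)

print(len(stepped), expected_len)
print(stepped)
print(spaced)
```

When the number of samples is known in advance, ``linspace`` with an explicit `num` is usually the more predictable choice.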
- array(...)
- array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)
Create an array.
Parameters
----------
object : array_like
An array, any object exposing the array interface, an
object whose __array__ method returns an array, or any
(nested) sequence.
dtype : data-type, optional
The desired data-type for the array. If not given, then
the type will be determined as the minimum type required
to hold the objects in the sequence. This argument can only
be used to 'upcast' the array. For downcasting, use the
.astype(t) method.
copy : bool, optional
If true (default), then the object is copied. Otherwise, a copy
will only be made if __array__ returns a copy, if obj is a
nested sequence, or if a copy is needed to satisfy any of the other
requirements (`dtype`, `order`, etc.).
order : {'C', 'F', 'A'}, optional
Specify the order of the array. If order is 'C' (default), then the
array will be in C-contiguous order (last-index varies the
fastest). If order is 'F', then the returned array
will be in Fortran-contiguous order (first-index varies the
fastest). If order is 'A', then the returned array may
be in any order (either C-, Fortran-contiguous, or even
discontiguous).
subok : bool, optional
If True, then sub-classes will be passed-through, otherwise
the returned array will be forced to be a base-class array (default).
ndmin : int, optional
Specifies the minimum number of dimensions that the resulting
array should have. Ones will be pre-pended to the shape as
needed to meet this requirement.
Returns
-------
out : ndarray
An array object satisfying the specified requirements.
See Also
--------
empty, empty_like, zeros, zeros_like, ones, ones_like, fill
Examples
--------
>>> np.array([1, 2, 3])
array([1, 2, 3])
Upcasting:
>>> np.array([1, 2, 3.0])
array([ 1., 2., 3.])
More than one dimension:
>>> np.array([[1, 2], [3, 4]])
array([[1, 2],
[3, 4]])
Minimum dimensions 2:
>>> np.array([1, 2, 3], ndmin=2)
array([[1, 2, 3]])
Type provided:
>>> np.array([1, 2, 3], dtype=complex)
array([ 1.+0.j, 2.+0.j, 3.+0.j])
Data-type consisting of more than one element:
>>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])
>>> x['a']
array([1, 3])
Creating an array from sub-classes:
>>> np.array(np.mat('1 2; 3 4'))
array([[1, 2],
[3, 4]])
>>> np.array(np.mat('1 2; 3 4'), subok=True)
matrix([[1, 2],
[3, 4]])
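The `copy`, `order` and `ndmin` parameters described above can be checked with a short sketch. The variable names are illustrative, and ``np.shares_memory`` is used here only as a convenient way to observe whether a copy was made.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)

# Default copy=True: the result owns its own data.
copied = np.array(base)

# copy=False: no copy is needed here, so the result refers to the same data.
aliased = np.array(base, copy=False)

# order='F' requests Fortran-contiguous memory (first index varies fastest).
fortran = np.array(base, order='F')

# ndmin=3 prepends ones to the shape until there are three dimensions.
padded = np.array([1, 2, 3], ndmin=3)

print(np.shares_memory(base, copied), np.shares_memory(base, aliased))
print(fortran.flags['F_CONTIGUOUS'], padded.shape)
```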
- beta(...)
- beta(a, b, size=None)
The Beta distribution over ``[0, 1]``.
The Beta distribution is a special case of the Dirichlet distribution,
and is related to the Gamma distribution. It has the probability
density function
.. math:: f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)} x^{\alpha - 1}
(1 - x)^{\beta - 1},
where :math:`\alpha = a`, :math:`\beta = b`, and the normalisation, B, is the beta function,
.. math:: B(\alpha, \beta) = \int_0^1 t^{\alpha - 1}
(1 - t)^{\beta - 1} dt.
It is often seen in Bayesian inference and order statistics.
Parameters
----------
a : float
Alpha, positive (> 0).
b : float
Beta, positive (> 0).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : ndarray
Array of the given shape, containing values drawn from a
Beta distribution.
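This entry has no example, so here is a minimal sketch of drawing Beta samples. The parameter values, the seed and the sample count are assumptions made for illustration. The comparison with ``a / (a + b)`` relies on the standard formula for the mean of a Beta distribution, which is not stated in the text above.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

a, b = 2.0, 5.0  # illustrative shape parameters
samples = np.random.beta(a, b, size=(1000, 100))

# Every draw lies in [0, 1]; the sample mean should be near a / (a + b).
print(samples.shape)
print(samples.min(), samples.max())
print(samples.mean(), a / (a + b))
```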
- bincount(...)
- bincount(x, weights=None, minlength=None)
Count number of occurrences of each value in array of non-negative ints.
The number of bins (of size 1) is one larger than the largest value in
`x`. If `minlength` is specified, there will be at least this number
of bins in the output array (though it will be longer if necessary,
depending on the contents of `x`).
Each bin gives the number of occurrences of its index value in `x`.
If `weights` is specified the input array is weighted by it, i.e. if a
value ``n`` is found at position ``i``, ``out[n] += weight[i]`` instead
of ``out[n] += 1``.
Parameters
----------
x : array_like, 1 dimension, nonnegative ints
Input array.
weights : array_like, optional
Weights, array of the same shape as `x`.
minlength : int, optional
.. versionadded:: 1.6.0
A minimum number of bins for the output array.
Returns
-------
out : ndarray of ints
The result of binning the input array.
The length of `out` is equal to ``np.amax(x)+1``.
Raises
------
ValueError
If the input is not 1-dimensional, or contains elements with negative
values, or if `minlength` is non-positive.
TypeError
If the type of the input is float or complex.
See Also
--------
histogram, digitize, unique
Examples
--------
>>> np.bincount(np.arange(5))
array([1, 1, 1, 1, 1])
>>> np.bincount(np.array([0, 1, 1, 3, 2, 1, 7]))
array([1, 3, 1, 1, 0, 0, 0, 1])
>>> x = np.array([0, 1, 1, 3, 2, 1, 7, 23])
>>> np.bincount(x).size == np.amax(x)+1
True
The input array needs to be of integer dtype, otherwise a
TypeError is raised:
>>> np.bincount(np.arange(5, dtype=float))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: array cannot be safely cast to required type
A possible use of ``bincount`` is to perform sums over
variable-size chunks of an array, using the ``weights`` keyword.
>>> w = np.array([0.3, 0.5, 0.2, 0.7, 1., -0.6]) # weights
>>> x = np.array([0, 1, 1, 2, 2, 2])
>>> np.bincount(x, weights=w)
array([ 0.3, 0.7, 1.1])
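The examples above do not exercise `minlength`. The following sketch, with made-up labels and amounts, shows it padding the output and combining with `weights`.

```python
import numpy as np

labels = np.array([0, 1, 1, 3])             # illustrative group labels
amounts = np.array([2.0, 1.5, 0.5, 4.0])    # illustrative per-item weights

# Without minlength the output would have max(labels) + 1 == 4 bins;
# minlength=6 pads it with empty bins up to length 6.
counts = np.bincount(labels, minlength=6)

# With weights, each bin holds the sum of the weights that fall into it.
totals = np.bincount(labels, weights=amounts, minlength=6)

print(counts)
print(totals)
```

Padding with `minlength` is useful when several inputs must produce count arrays of the same length, for instance so they can be stacked or compared.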
- binomial(...)
- binomial(n, p, size=None)
Draw samples from a binomial distribution.
Samples are drawn from a Binomial distribution with specified
parameters, n trials and p probability of success, where
n is an integer >= 0 and p is in the interval [0,1]. (n may be
input as a float, but it is truncated to an integer in use.)
Parameters
----------
n : float (but truncated to an integer)
parameter, >= 0.
p : float
parameter, >= 0 and <=1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : {ndarray, scalar}
where the values are all integers in [0, n].
See Also
--------
scipy.stats.distributions.binom : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability mass function of the Binomial distribution is
.. math:: P(N) = \binom{n}{N}p^N(1-p)^{n-N},
where :math:`n` is the number of trials, :math:`p` is the probability
of success, and :math:`N` is the number of successes.
When estimating the standard error of a proportion in a population by
using a random sample, the normal distribution works well unless the
product p*n <=5, where p = population proportion estimate, and n =
number of samples, in which case the binomial distribution is used
instead. For example, a sample of 15 people shows 4 who are left
handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4,
so the binomial distribution should be used in this case.
References
----------
.. [1] Dalgaard, Peter, "Introductory Statistics with R",
Springer-Verlag, 2002.
.. [2] Glantz, Stanton A. "Primer of Biostatistics.", McGraw-Hill,
Fifth Edition, 2002.
.. [3] Lentner, Marvin, "Elementary Applied Statistics", Bogden
and Quigley, 1972.
.. [4] Weisstein, Eric W. "Binomial Distribution." From MathWorld--A
Wolfram Web Resource.
http://mathworld.wolfram.com/BinomialDistribution.html
.. [5] Wikipedia, "Binomial-distribution",
http://en.wikipedia.org/wiki/Binomial_distribution
Examples
--------
Draw samples from the distribution:
>>> n, p = 10, .5 # number of trials, probability of each trial
>>> s = np.random.binomial(n, p, 1000)
# result of flipping a coin 10 times, tested 1000 times.
A real world example. A company drills 9 wild-cat oil exploration
wells, each with an estimated probability of success of 0.1. All nine
wells fail. What is the probability of that happening?
Let's do 20,000 trials of the model, and count the number that
generate zero positive results.
>>> sum(np.random.binomial(9,0.1,20000)==0)/20000.
# answer = 0.38885, or 38%.
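The simulated answer for the oil-well question can be compared with the closed-form value from the formula in the Notes, where N = 0 gives (1 - p)^n. This is a minimal sketch; the seed is an assumption added so that the run is repeatable.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

n_wells, p_success, n_trials = 9, 0.1, 20000

# Exact probability of zero successes: binom(n, 0) * p^0 * (1 - p)^n.
exact = (1.0 - p_success) ** n_wells

# Monte Carlo estimate: fraction of simulated campaigns with no success.
draws = np.random.binomial(n_wells, p_success, n_trials)
estimate = np.mean(draws == 0)

print(exact, estimate)
```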
- busday_count(...)
- busday_count(begindates, enddates, weekmask='1111100', holidays=[], busdaycal=None, out=None)
Counts the number of valid days between `begindates` and
`enddates`, not including the day of `enddates`.
If ``enddates`` specifies a date value that is earlier than the
corresponding ``begindates`` date value, the count will be negative.
.. versionadded:: 1.7.0
Parameters
----------
begindates : array_like of datetime64[D]
The array of the first dates for counting.
enddates : array_like of datetime64[D]
The array of the end dates for counting, which are excluded
from the count themselves.
weekmask : str or array_like of bool, optional
A seven-element array indicating which of Monday through Sunday are
valid days. May be specified as a length-seven list or array, like
[1,1,1,1,1,0,0]; a length-seven string, like '1111100'; or a string
like "Mon Tue Wed Thu Fri", made up of 3-character abbreviations for
weekdays, optionally separated by white space. Valid abbreviations
are: Mon Tue Wed Thu Fri Sat Sun
holidays : array_like of datetime64[D], optional
An array of dates to consider as invalid dates. They may be
specified in any order, and NaT (not-a-time) dates are ignored.
This list is saved in a normalized form that is suited for
fast calculations of valid days.
busdaycal : busdaycalendar, optional
A `busdaycalendar` object which specifies the valid days. If this
parameter is provided, neither weekmask nor holidays may be
provided.
out : array of int, optional
If provided, this array is filled with the result.
Returns
-------
out : array of int
An array with a shape from broadcasting ``begindates`` and ``enddates``
together, containing the number of valid days between
the begin and end dates.
See Also
--------
busdaycalendar : An object that specifies a custom set of valid days.
is_busday : Returns a boolean array indicating valid days.
busday_offset : Applies an offset counted in valid days.
Examples
--------
>>> # Number of weekdays in January 2011
... np.busday_count('2011-01', '2011-02')
21
>>> # Number of weekdays in 2011
... np.busday_count('2011', '2012')
260
>>> # Number of Saturdays in 2011
... np.busday_count('2011', '2012', weekmask='Sat')
53
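The `holidays` parameter and the negative count for reversed dates are not covered by the examples above. In this sketch the holiday date is an arbitrary Monday picked for illustration. Only the sign of the reversed count is shown, since the text does not state its exact value.

```python
import numpy as np

# Weekdays in January 2011, with and without one illustrative holiday
# (2011-01-17 is a Monday).
plain = np.busday_count('2011-01', '2011-02')
with_holiday = np.busday_count('2011-01', '2011-02', holidays=['2011-01-17'])

# When the end date is earlier than the begin date the count is negative.
backward = np.busday_count('2011-02', '2011-01')

print(plain, with_holiday, backward)
```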
- busday_offset(...)
- busday_offset(dates, offsets, roll='raise', weekmask='1111100', holidays=None, busdaycal=None, out=None)
First adjusts the date to fall on a valid day according to
the ``roll`` rule, then applies offsets to the given dates
counted in valid days.
.. versionadded:: 1.7.0
Parameters
----------
dates : array_like of datetime64[D]
The array of dates to process.
offsets : array_like of int
The array of offsets, which is broadcast with ``dates``.
roll : {'raise', 'nat', 'forward', 'following', 'backward', 'preceding', 'modifiedfollowing', 'modifiedpreceding'}, optional
How to treat dates that do not fall on a valid day. The default
is 'raise'.
* 'raise' means to raise an exception for an invalid day.
* 'nat' means to return a NaT (not-a-time) for an invalid day.
* 'forward' and 'following' mean to take the first valid day
later in time.
* 'backward' and 'preceding' mean to take the first valid day
earlier in time.
* 'modifiedfollowing' means to take the first valid day
later in time unless it is across a Month boundary, in which
case to take the first valid day earlier in time.
* 'modifiedpreceding' means to take the first valid day
earlier in time unless it is across a Month boundary, in which
case to take the first valid day later in time.
weekmask : str or array_like of bool, optional
A seven-element array indicating which of Monday through Sunday are
valid days. May be specified as a length-seven list or array, like
[1,1,1,1,1,0,0]; a length-seven string, like '1111100'; or a string
like "Mon Tue Wed Thu Fri", made up of 3-character abbreviations for
weekdays, optionally separated by white space. Valid abbreviations
are: Mon Tue Wed Thu Fri Sat Sun
holidays : array_like of datetime64[D], optional
An array of dates to consider as invalid dates. They may be
specified in any order, and NaT (not-a-time) dates are ignored.
This list is saved in a normalized form that is suited for
fast calculations of valid days.
busdaycal : busdaycalendar, optional
A `busdaycalendar` object which specifies the valid days. If this
parameter is provided, neither weekmask nor holidays may be
provided.
out : array of datetime64[D], optional
If provided, this array is filled with the result.
Returns
-------
out : array of datetime64[D]
An array with a shape from broadcasting ``dates`` and ``offsets``
together, containing the dates with offsets applied.
See Also
--------
busdaycalendar : An object that specifies a custom set of valid days.
is_busday : Returns a boolean array indicating valid days.
busday_count : Counts how many valid days are in a half-open date range.
Examples
--------
>>> # First business day in October 2011 (not accounting for holidays)
... np.busday_offset('2011-10', 0, roll='forward')
numpy.datetime64('2011-10-03','D')
>>> # Last business day in February 2012 (not accounting for holidays)
... np.busday_offset('2012-03', -1, roll='forward')
numpy.datetime64('2012-02-29','D')
>>> # Third Wednesday in January 2011
... np.busday_offset('2011-01', 2, roll='forward', weekmask='Wed')
numpy.datetime64('2011-01-19','D')
>>> # 2012 Mother's Day in Canada and the U.S.
... np.busday_offset('2012-05', 1, roll='forward', weekmask='Sun')
numpy.datetime64('2012-05-13','D')
>>> # First business day on or after a date
... np.busday_offset('2011-03-20', 0, roll='forward')
numpy.datetime64('2011-03-21','D')
>>> np.busday_offset('2011-03-22', 0, roll='forward')
numpy.datetime64('2011-03-22','D')
>>> # First business day after a date
... np.busday_offset('2011-03-20', 1, roll='backward')
numpy.datetime64('2011-03-21','D')
>>> np.busday_offset('2011-03-22', 1, roll='backward')
numpy.datetime64('2011-03-23','D')
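The `roll` rules that involve a month boundary are easier to follow on a concrete date. This sketch uses 2011-04-30, a Saturday at the end of a month, chosen only for illustration. ``np.isnat`` is used to test for NaT and is assumed to be available in the installed NumPy.

```python
import numpy as np

saturday = '2011-04-30'  # illustrative date: a Saturday at month end

# 'following' moves to the next valid day, which is in May.
following = np.busday_offset(saturday, 0, roll='following')

# 'modifiedfollowing' would cross the month boundary, so it steps back.
modified = np.busday_offset(saturday, 0, roll='modifiedfollowing')

# 'nat' returns NaT instead of raising for an invalid day.
missing = np.busday_offset(saturday, 0, roll='nat')

print(following, modified, missing)
```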
- can_cast(...)
- can_cast(from, totype, casting = 'safe')
Returns True if cast between data types can occur according to the
casting rule. If from is a scalar or array scalar, also returns
True if the scalar value can be cast without overflow or truncation
to an integer.
Parameters
----------
from : dtype, dtype specifier, scalar, or array
Data type, scalar, or array to cast from.
totype : dtype or dtype specifier
Data type to cast to.
casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
Controls what kind of data casting may occur.
* 'no' means the data types should not be cast at all.
* 'equiv' means only byte-order changes are allowed.
* 'safe' means only casts which can preserve values are allowed.
* 'same_kind' means only safe casts or casts within a kind,
like float64 to float32, are allowed.
* 'unsafe' means any data conversions may be done.
Returns
-------
out : bool
True if cast can occur according to the casting rule.
Notes
-----
Starting in NumPy 1.9, can_cast function now returns False in 'safe'
casting mode for integer/float dtype and string dtype if the string dtype
length is not long enough to store the max integer/float value converted
to a string. Previously can_cast in 'safe' mode returned True for
integer/float dtype and a string dtype of any length.
See Also
--------
dtype, result_type
Examples
--------
Basic examples
>>> np.can_cast(np.int32, np.int64)
True
>>> np.can_cast(np.float64, complex)
True
>>> np.can_cast(complex, float)
False
>>> np.can_cast('i8', 'f8')
True
>>> np.can_cast('i8', 'f4')
False
>>> np.can_cast('i4', 'S4')
False
Casting scalars
>>> np.can_cast(100, 'i1')
True
>>> np.can_cast(150, 'i1')
False
>>> np.can_cast(150, 'u1')
True
>>> np.can_cast(3.5e100, np.float32)
False
>>> np.can_cast(1000.0, np.float32)
True
Array scalar checks the value, array does not
>>> np.can_cast(np.array(1000.0), np.float32)
True
>>> np.can_cast(np.array([1000.0]), np.float32)
False
Using the casting rules
>>> np.can_cast('i8', 'i8', 'no')
True
>>> np.can_cast('<i8', '>i8', 'no')
False
>>> np.can_cast('<i8', '>i8', 'equiv')
True
>>> np.can_cast('<i4', '>i8', 'equiv')
False
>>> np.can_cast('<i4', '>i8', 'safe')
True
>>> np.can_cast('<i8', '>i4', 'safe')
False
>>> np.can_cast('<i8', '>i4', 'same_kind')
True
>>> np.can_cast('<i8', '>u4', 'same_kind')
False
>>> np.can_cast('<i8', '>u4', 'unsafe')
True
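One possible use of the casting rules is to guard a conversion before performing it. The helper `safe_astype` below is hypothetical, introduced only for this sketch, and is not part of NumPy.

```python
import numpy as np

def safe_astype(arr, target):
    """Hypothetical helper: convert only if the cast preserves values."""
    if not np.can_cast(arr.dtype, target, casting='safe'):
        raise TypeError("cannot safely cast %s to %s" % (arr.dtype, np.dtype(target)))
    return arr.astype(target)

widened = safe_astype(np.arange(3, dtype='i4'), 'i8')  # int32 -> int64 is safe

try:
    safe_astype(np.arange(3, dtype='i8'), 'i4')        # int64 -> int32 is not
    narrowing_rejected = False
except TypeError:
    narrowing_rejected = True

print(widened.dtype, narrowing_rejected)
```

Passing dtypes rather than values keeps the check independent of the data, which matches the "array does not" check the value behaviour shown above.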
- chisquare(...)
- chisquare(df, size=None)
Draw samples from a chi-square distribution.
When `df` independent random variables, each with standard normal
distributions (mean 0, variance 1), are squared and summed, the
resulting distribution is chi-square (see Notes). This distribution
is often used in hypothesis testing.
Parameters
----------
df : int
Number of degrees of freedom.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
output : ndarray
Samples drawn from the distribution, packed in a `size`-shaped
array.
Raises
------
ValueError
When `df` <= 0 or when an inappropriate `size` (e.g. ``size=-1``)
is given.
Notes
-----
The variable obtained by summing the squares of `df` independent,
standard normally distributed random variables:
.. math:: Q = \sum_{i=1}^{\mathtt{df}} X^2_i
is chi-square distributed with :math:`k = \mathtt{df}` degrees of freedom, denoted
.. math:: Q \sim \chi^2_k.
The probability density function of the chi-squared distribution is
.. math:: p(x) = \frac{(1/2)^{k/2}}{\Gamma(k/2)}
x^{k/2 - 1} e^{-x/2},
where :math:`\Gamma` is the gamma function,
.. math:: \Gamma(x) = \int_0^{\infty} t^{x - 1} e^{-t} dt.
References
----------
`NIST/SEMATECH e-Handbook of Statistical Methods
<http://www.itl.nist.gov/div898/handbook/eda/section3/eda3666.htm>`_
Examples
--------
>>> np.random.chisquare(2,4)
array([ 1.89920014, 9.00867716, 3.13710533, 5.62318272])
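The construction in the Notes, summing the squares of `df` standard normal variables, can be compared numerically with direct sampling. This is a sketch under assumed values for `df`, the sample count and the seed. It relies on the standard fact that the mean of a chi-square distribution equals its degrees of freedom, which is not stated in the text above.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

df, n_samples = 3, 100000  # illustrative values

# Build chi-square variates by hand: sum of df squared standard normals.
normals = np.random.standard_normal((n_samples, df))
summed = (normals ** 2).sum(axis=1)

# Draw from the distribution directly.
direct = np.random.chisquare(df, n_samples)

# Both sample means should be close to df.
print(summed.mean(), direct.mean())
```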
- compare_chararrays(...)
- concatenate(...)
- concatenate((a1, a2, ...), axis=0)
Join a sequence of arrays together.
Parameters
----------
a1, a2, ... : sequence of array_like
The arrays must have the same shape, except in the dimension
corresponding to `axis` (the first, by default).
axis : int, optional
The axis along which the arrays will be joined. Default is 0.
Returns
-------
res : ndarray
The concatenated array.
See Also
--------
ma.concatenate : Concatenate function that preserves input masks.
array_split : Split an array into multiple sub-arrays of equal or
near-equal size.
split : Split array into a list of multiple sub-arrays of equal size.
hsplit : Split array into multiple sub-arrays horizontally (column wise)
vsplit : Split array into multiple sub-arrays vertically (row wise)
dsplit : Split array into multiple sub-arrays along the 3rd axis (depth).
hstack : Stack arrays in sequence horizontally (column wise)
vstack : Stack arrays in sequence vertically (row wise)
dstack : Stack arrays in sequence depth wise (along third dimension)
Notes
-----
When one or more of the arrays to be concatenated is a MaskedArray,
this function will return a MaskedArray object instead of an ndarray,
but the input masks are not preserved. In cases where a MaskedArray
is expected as input, use the ma.concatenate function from the masked
array module instead.
Examples
--------
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
[3, 4],
[5, 6]])
>>> np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
[3, 4, 6]])
This function will not preserve masking of MaskedArray inputs.
>>> a = np.ma.arange(3)
>>> a[1] = np.ma.masked
>>> b = np.arange(2, 5)
>>> a
masked_array(data = [0 -- 2],
mask = [False True False],
fill_value = 999999)
>>> b
array([2, 3, 4])
>>> np.concatenate([a, b])
masked_array(data = [0 1 2 2 3 4],
mask = False,
fill_value = 999999)
>>> np.ma.concatenate([a, b])
masked_array(data = [0 -- 2 2 3 4],
mask = [False True False False False False],
fill_value = 999999)
- copyto(...)
- copyto(dst, src, casting='same_kind', where=None)
Copies values from one array to another, broadcasting as necessary.
Raises a TypeError if the `casting` rule is violated, and if
`where` is provided, it selects which elements to copy.
.. versionadded:: 1.7.0
Parameters
----------
dst : ndarray
The array into which values are copied.
src : array_like
The array from which values are copied.
casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
Controls what kind of data casting may occur when copying.
* 'no' means the data types should not be cast at all.
* 'equiv' means only byte-order changes are allowed.
* 'safe' means only casts which can preserve values are allowed.
* 'same_kind' means only safe casts or casts within a kind,
like float64 to float32, are allowed.
* 'unsafe' means any data conversions may be done.
where : array_like of bool, optional
A boolean array which is broadcast to match the dimensions
of `dst`, and selects elements to copy from `src` to `dst`
wherever it contains the value True.
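This entry has no example, so the following minimal sketch shows the `where` mask, broadcasting, and the TypeError raised when the default casting rule is violated. All array contents are made up for illustration.

```python
import numpy as np

# Copy only where the mask is True; other elements of dst are left alone.
dst = np.zeros(5, dtype=np.int64)
src = np.arange(5, dtype=np.int64) + 10
mask = np.array([True, False, True, False, True])
np.copyto(dst, src, where=mask)

# src is broadcast against dst: one row is copied into every row.
grid = np.empty((2, 3))
np.copyto(grid, [1.0, 2.0, 3.0])

# float64 -> int64 is not a 'same_kind' cast, so the default rule rejects it.
int_dst = np.zeros(3, dtype=np.int64)
floats = np.array([1.5, 2.5, 3.5])
try:
    np.copyto(int_dst, floats)
    rejected = False
except TypeError:
    rejected = True

# With casting='unsafe' the copy goes through and the values are truncated.
np.copyto(int_dst, floats, casting='unsafe')

print(dst, rejected, int_dst)
```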
- count_nonzero(...)
- count_nonzero(a)
Counts the number of non-zero values in the array ``a``.
Parameters
----------
a : array_like
The array for which to count non-zeros.
Returns
-------
count : int or array of int
Number of non-zero values in the array.
See Also
--------
nonzero : Return the coordinates of all the non-zero values.
Examples
--------
>>> np.count_nonzero(np.eye(4))
4
>>> np.count_nonzero([[0,1,7,0,0],[3,0,0,2,19]])
5
- datetime_as_string(...)
- datetime_data(...)
- digitize(...)
- digitize(x, bins, right=False)
Return the indices of the bins to which each value in input array belongs.
Each index ``i`` returned is such that ``bins[i-1] <= x < bins[i]`` if
`bins` is monotonically increasing, or ``bins[i-1] > x >= bins[i]`` if
`bins` is monotonically decreasing. If values in `x` are beyond the
bounds of `bins`, 0 or ``len(bins)`` is returned as appropriate. If right
is True, then the right bin is closed so that the index ``i`` is such
that ``bins[i-1] < x <= bins[i]`` or ``bins[i-1] >= x > bins[i]`` if `bins`
is monotonically increasing or decreasing, respectively.
Parameters
----------
x : array_like
Input array to be binned. It has to be 1-dimensional.
bins : array_like
Array of bins. It has to be 1-dimensional and monotonic.
right : bool, optional
Indicating whether the intervals include the right or the left bin
edge. Default behavior is (right==False), indicating that the interval
does not include the right edge. The left bin edge is closed in this
case, i.e., ``bins[i-1] <= x < bins[i]`` is the default behavior for
monotonically increasing bins.
Returns
-------
out : ndarray of ints
Output array of indices, of same shape as `x`.
Raises
------
ValueError
If the input is not 1-dimensional, or if `bins` is not monotonic.
TypeError
If the type of the input is complex.
See Also
--------
bincount, histogram, unique
Notes
-----
If values in `x` are such that they fall outside the bin range,
attempting to index `bins` with the indices that `digitize` returns
will result in an IndexError.
Examples
--------
>>> x = np.array([0.2, 6.4, 3.0, 1.6])
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
>>> inds = np.digitize(x, bins)
>>> inds
array([1, 4, 3, 2])
>>> for n in range(x.size):
... print(bins[inds[n]-1], "<=", x[n], "<", bins[inds[n]])
...
0.0 <= 0.2 < 1.0
4.0 <= 6.4 < 10.0
2.5 <= 3.0 < 4.0
1.0 <= 1.6 < 2.5
>>> x = np.array([1.2, 10.0, 12.4, 15.5, 20.])
>>> bins = np.array([0,5,10,15,20])
>>> np.digitize(x,bins,right=True)
array([1, 2, 3, 4, 4])
>>> np.digitize(x,bins,right=False)
array([1, 3, 3, 4, 5])
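The Notes warn that out-of-range values produce indices that cannot be used directly on `bins`. A minimal sketch of detecting them, with made-up data, follows.

```python
import numpy as np

bins = np.array([0.0, 1.0, 2.5])
x = np.array([-1.0, 0.5, 3.0])   # first and last values lie outside bins

inds = np.digitize(x, bins)

# Index 0 means "below bins[0]"; index len(bins) means "at or above bins[-1]".
in_range = (inds > 0) & (inds < len(bins))

# Only in-range indices are safe to use as bins[inds] and bins[inds - 1].
lower_edges = bins[inds[in_range] - 1]
upper_edges = bins[inds[in_range]]

print(inds, in_range, lower_edges, upper_edges)
```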
- dot(...)
- dot(a, b, out=None)
Dot product of two arrays.
For 2-D arrays it is equivalent to matrix multiplication, and for 1-D
arrays to inner product of vectors (without complex conjugation). For
N dimensions it is a sum product over the last axis of `a` and
the second-to-last of `b`::
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
Parameters
----------
a : array_like
First argument.
b : array_like
Second argument.
out : ndarray, optional
Output argument. This must have the exact kind that would be returned
if it was not used. In particular, it must have the right type, must be
C-contiguous, and its dtype must be the dtype that would be returned
for `dot(a,b)`. This is a performance feature. Therefore, if these
conditions are not met, an exception is raised, instead of attempting
to be flexible.
Returns
-------
output : ndarray
Returns the dot product of `a` and `b`. If `a` and `b` are both
scalars or both 1-D arrays then a scalar is returned; otherwise
an array is returned.
If `out` is given, then it is returned.
Raises
------
ValueError
If the last dimension of `a` is not the same size as
the second-to-last dimension of `b`.
See Also
--------
vdot : Complex-conjugating dot product.
tensordot : Sum products over arbitrary axes.
einsum : Einstein summation convention.
Examples
--------
>>> np.dot(3, 4)
12
Neither argument is complex-conjugated:
>>> np.dot([2j, 3j], [2j, 3j])
(-13+0j)
For 2-D arrays it's the matrix product:
>>> a = [[1, 0], [0, 1]]
>>> b = [[4, 1], [2, 2]]
>>> np.dot(a, b)
array([[4, 1],
[2, 2]])
>>> a = np.arange(3*4*5*6).reshape((3,4,5,6))
>>> b = np.arange(3*4*5*6)[::-1].reshape((5,4,6,3))
>>> np.dot(a, b)[2,3,2,1,2,2]
499128
>>> sum(a[2,3,2,:] * b[1,2,:,2])
499128
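The N-dimensional rule, a sum product over the last axis of `a` and the second-to-last axis of `b`, can be checked on smaller arrays. The shapes and the index chosen below are arbitrary, and the comparison with ``tensordot`` is an added illustration of the same contraction.

```python
import numpy as np

a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.arange(5 * 4 * 6).reshape(5, 4, 6)

# Result shape is a.shape[:-1] + b.shape[:-2] + b.shape[-1:].
product = np.dot(a, b)

# One element, computed by hand from the rule in the description.
i, j, k, m = 1, 2, 3, 4
manual = np.sum(a[i, j, :] * b[k, :, m])

# The same contraction expressed with tensordot.
via_tensordot = np.tensordot(a, b, axes=([-1], [-2]))

print(product.shape, product[i, j, k, m], manual)
```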
- einsum(...)
- einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')
Evaluates the Einstein summation convention on the operands.
Using the Einstein summation convention, many common multi-dimensional
array operations can be represented in a simple fashion. This function
provides a way to compute such summations. The best way to understand this
function is to try the examples below, which show how many common NumPy
functions can be implemented as calls to `einsum`.
Parameters
----------
subscripts : str
Specifies the subscripts for summation.
operands : list of array_like
These are the arrays for the operation.
out : ndarray, optional
If provided, the calculation is done into this array.
dtype : data-type, optional
If provided, forces the calculation to use the data type specified.
Note that you may have to also give a more liberal `casting`
parameter to allow the conversions.
order : {'C', 'F', 'A', 'K'}, optional
Controls the memory layout of the output. 'C' means it should
be C contiguous. 'F' means it should be Fortran contiguous,
'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise.
'K' means it should be as close to the layout of the inputs as
is possible, including arbitrarily permuted axes.
Default is 'K'.
casting : {'no', 'equiv', 'safe', 'same_kind', 'unsafe'}, optional
Controls what kind of data casting may occur. Setting this to
'unsafe' is not recommended, as it can adversely affect accumulations.
* 'no' means the data types should not be cast at all.
* 'equiv' means only byte-order changes are allowed.
* 'safe' means only casts which can preserve values are allowed.
* 'same_kind' means only safe casts or casts within a kind,
like float64 to float32, are allowed.
* 'unsafe' means any data conversions may be done.
Returns
-------
output : ndarray
The calculation based on the Einstein summation convention.
See Also
--------
dot, inner, outer, tensordot
Notes
-----
.. versionadded:: 1.6.0
The subscripts string is a comma-separated list of subscript labels,
where each label refers to a dimension of the corresponding operand.
Repeated subscript labels in one operand take the diagonal. For example,
``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``.
Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)``
is equivalent to ``np.inner(a,b)``. If a label appears only once,
it is not summed, so ``np.einsum('i', a)`` produces a view of ``a``
with no changes.
The order of labels in the output is by default alphabetical. This
means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while
``np.einsum('ji', a)`` takes its transpose.
The output can be controlled by specifying output subscript labels
as well. This specifies the label order, and allows summing to
be disallowed or forced when desired. The call ``np.einsum('i->', a)``
is like ``np.sum(a, axis=-1)``, and ``np.einsum('ii->i', a)``
is like ``np.diag(a)``. The difference is that `einsum` does not
allow broadcasting by default.
To enable and control broadcasting, use an ellipsis. Default
NumPy-style broadcasting is done by adding an ellipsis
to the left of each term, like ``np.einsum('...ii->...i', a)``.
To take the trace along the first and last axes,
you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix
product with the left-most indices instead of rightmost, you can do
``np.einsum('ij...,jk...->ik...', a, b)``.
When there is only one operand, no axes are summed, and no output
parameter is provided, a view into the operand is returned instead
of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)``
produces a view.
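The view behaviour described above can be checked with a small sketch. It assumes a NumPy version that provides ``np.shares_memory``; the array ``a`` here is an illustrative input, not one of the arrays used in the Examples section.

```python
import numpy as np

# Illustrative 3x3 input.
a = np.arange(9).reshape(3, 3)

# One operand, no summed axes, no `out` argument: a view is returned.
diag = np.einsum('ii->i', a)

# True when the result is backed by the memory of `a` rather than a copy.
shares = np.shares_memory(diag, a)
```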
An alternative way to provide the subscripts and operands is as
``einsum(op0, sublist0, op1, sublist1, ..., [sublistout])``. The examples
below have corresponding `einsum` calls with the two parameter methods.
Examples
--------
>>> a = np.arange(25).reshape(5,5)
>>> b = np.arange(5)
>>> c = np.arange(6).reshape(2,3)
>>> np.einsum('ii', a)
60
>>> np.einsum(a, [0,0])
60
>>> np.trace(a)
60
>>> np.einsum('ii->i', a)
array([ 0, 6, 12, 18, 24])
>>> np.einsum(a, [0,0], [0])
array([ 0, 6, 12, 18, 24])
>>> np.diag(a)
array([ 0, 6, 12, 18, 24])
>>> np.einsum('ij,j', a, b)
array([ 30, 80, 130, 180, 230])
>>> np.einsum(a, [0,1], b, [1])
array([ 30, 80, 130, 180, 230])
>>> np.dot(a, b)
array([ 30, 80, 130, 180, 230])
>>> np.einsum('...j,j', a, b)
array([ 30, 80, 130, 180, 230])
>>> np.einsum('ji', c)
array([[0, 3],
[1, 4],
[2, 5]])
>>> np.einsum(c, [1,0])
array([[0, 3],
[1, 4],
[2, 5]])
>>> c.T
array([[0, 3],
[1, 4],
[2, 5]])
>>> np.einsum('..., ...', 3, c)
array([[ 0, 3, 6],
[ 9, 12, 15]])
>>> np.einsum(3, [Ellipsis], c, [Ellipsis])
array([[ 0, 3, 6],
[ 9, 12, 15]])
>>> np.multiply(3, c)
array([[ 0, 3, 6],
[ 9, 12, 15]])
>>> np.einsum('i,i', b, b)
30
>>> np.einsum(b, [0], b, [0])
30
>>> np.inner(b,b)
30
>>> np.einsum('i,j', np.arange(2)+1, b)
array([[0, 1, 2, 3, 4],
[0, 2, 4, 6, 8]])
>>> np.einsum(np.arange(2)+1, [0], b, [1])
array([[0, 1, 2, 3, 4],
[0, 2, 4, 6, 8]])
>>> np.outer(np.arange(2)+1, b)
array([[0, 1, 2, 3, 4],
[0, 2, 4, 6, 8]])
>>> np.einsum('i...->...', a)
array([50, 55, 60, 65, 70])
>>> np.einsum(a, [0,Ellipsis], [Ellipsis])
array([50, 55, 60, 65, 70])
>>> np.sum(a, axis=0)
array([50, 55, 60, 65, 70])
>>> a = np.arange(60.).reshape(3,4,5)
>>> b = np.arange(24.).reshape(4,3,2)
>>> np.einsum('ijk,jil->kl', a, b)
array([[ 4400., 4730.],
[ 4532., 4874.],
[ 4664., 5018.],
[ 4796., 5162.],
[ 4928., 5306.]])
>>> np.einsum(a, [0,1,2], b, [1,0,3], [2,3])
array([[ 4400., 4730.],
[ 4532., 4874.],
[ 4664., 5018.],
[ 4796., 5162.],
[ 4928., 5306.]])
>>> np.tensordot(a,b, axes=([1,0],[0,1]))
array([[ 4400., 4730.],
[ 4532., 4874.],
[ 4664., 5018.],
[ 4796., 5162.],
[ 4928., 5306.]])
>>> a = np.arange(6).reshape((3,2))
>>> b = np.arange(12).reshape((4,3))
>>> np.einsum('ki,jk->ij', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
>>> np.einsum('ki,...k->i...', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
>>> np.einsum('k...,jk', a, b)
array([[10, 28, 46, 64],
[13, 40, 67, 94]])
- empty(...)
- empty(shape, dtype=float, order='C')
Return a new array of given shape and type, without initializing entries.
Parameters
----------
shape : int or tuple of int
Shape of the empty array
dtype : data-type, optional
Desired output data-type.
order : {'C', 'F'}, optional
Whether to store multi-dimensional data in C (row-major) or
Fortran (column-major) order in memory.
Returns
-------
out : ndarray
Array of uninitialized (arbitrary) data with the given
shape, dtype, and order.
See Also
--------
empty_like, zeros, ones
Notes
-----
`empty`, unlike `zeros`, does not set the array values to zero,
and may therefore be marginally faster. On the other hand, it requires
the user to manually set all the values in the array, and should be
used with caution.
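A minimal sketch of the usage pattern the note above implies: allocate with `empty`, then overwrite every element before reading any of them. The shape and fill value are arbitrary choices for illustration.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point.
out = np.empty((2, 3), dtype=float)

# Assign every element before the array is read anywhere.
out[...] = 0.5
```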
Examples
--------
>>> np.empty([2, 2])
array([[ -9.74499359e+001, 6.69583040e-309],
[ 2.13182611e-314, 3.06959433e-309]]) #random
>>> np.empty([2, 2], dtype=int)
array([[-1073741821, -1067949133],
[ 496041986, 19249760]]) #random
- empty_like(...)
- empty_like(a, dtype=None, order='K', subok=True)
Return a new array with the same shape and type as a given array.
Parameters
----------
a : array_like
The shape and data-type of `a` define these same attributes of the
returned array.
dtype : data-type, optional
.. versionadded:: 1.6.0
Overrides the data type of the result.
order : {'C', 'F', 'A', or 'K'}, optional
.. versionadded:: 1.6.0
Overrides the memory layout of the result. 'C' means C-order,
'F' means F-order, 'A' means 'F' if ``a`` is Fortran contiguous,
'C' otherwise. 'K' means match the layout of ``a`` as closely
as possible.
subok : bool, optional.
If True, then the newly created array will use the sub-class
type of 'a', otherwise it will be a base-class array. Defaults
to True.
Returns
-------
out : ndarray
Array of uninitialized (arbitrary) data with the same
shape and type as `a`.
See Also
--------
ones_like : Return an array of ones with shape and type of input.
zeros_like : Return an array of zeros with shape and type of input.
empty : Return a new uninitialized array.
ones : Return a new array setting values to one.
zeros : Return a new array setting values to zero.
Notes
-----
This function does not initialize the returned array; to do that use
`zeros_like` or `ones_like` instead. It may be marginally faster than
the functions that do set the array values.
Examples
--------
>>> a = ([1,2,3], [4,5,6]) # a is array-like
>>> np.empty_like(a)
array([[-1073741821, -1073741821, 3], #random
[ 0, 0, -1073741821]])
>>> a = np.array([[1., 2., 3.],[4.,5.,6.]])
>>> np.empty_like(a)
array([[ -2.00000715e+000, 1.48219694e-323, -2.00000572e+000],#random
[ 4.38791518e-305, -2.00000715e+000, 4.17269252e-309]])
- exponential(...)
- exponential(scale=1.0, size=None)
Exponential distribution.
Its probability density function is
.. math:: f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),
for ``x > 0`` and 0 elsewhere. :math:`\beta` is the scale parameter,
which is the inverse of the rate parameter :math:`\lambda = 1/\beta`.
The rate parameter is an alternative, widely used parameterization
of the exponential distribution [3]_.
The exponential distribution is a continuous analogue of the
geometric distribution. It describes many common situations, such as
the size of raindrops measured over many rainstorms [1]_, or the time
between page requests to Wikipedia [2]_.
Parameters
----------
scale : float
The scale parameter, :math:`\beta = 1/\lambda`.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
References
----------
.. [1] Peyton Z. Peebles Jr., "Probability, Random Variables and
Random Signal Principles", 4th ed, 2001, p. 57.
.. [2] "Poisson Process", Wikipedia,
http://en.wikipedia.org/wiki/Poisson_process
.. [3] "Exponential Distribution", Wikipedia,
http://en.wikipedia.org/wiki/Exponential_distribution
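As a hedged illustration of the scale parameterization, the sketch below draws from a seeded ``np.random.RandomState`` (the seed, the scale value and the sample count are arbitrary assumptions) and compares the sample mean with :math:`\beta`, which is the mean of the distribution.

```python
import numpy as np

# Seeded generator so the run is repeatable; the seed is arbitrary.
rs = np.random.RandomState(0)

scale = 2.0                                   # beta = 1 / lambda
samples = rs.exponential(scale=scale, size=100000)

# The mean of an exponential distribution equals its scale parameter.
sample_mean = samples.mean()
```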
- f(...)
- f(dfnum, dfden, size=None)
Draw samples from an F distribution.
Samples are drawn from an F distribution with specified parameters,
`dfnum` (degrees of freedom in numerator) and `dfden` (degrees of freedom
in denominator), where both parameters should be greater than zero.
The random variate of the F distribution (also known as the
Fisher distribution) is a continuous probability distribution
that arises in ANOVA tests, and is the ratio of two chi-square
variates.
Parameters
----------
dfnum : float
Degrees of freedom in numerator. Should be greater than zero.
dfden : float
Degrees of freedom in denominator. Should be greater than zero.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : {ndarray, scalar}
Samples from the Fisher distribution.
See Also
--------
scipy.stats.distributions.f : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The F statistic is used to compare in-group variances to between-group
variances. Calculating the distribution depends on the sampling, and
so it is a function of the respective degrees of freedom in the
problem. The variable `dfnum` is the number of samples minus one, the
between-groups degrees of freedom, while `dfden` is the within-groups
degrees of freedom, the sum of the number of samples in each group
minus the number of groups.
References
----------
.. [1] Glantz, Stanton A. "Primer of Biostatistics.", McGraw-Hill,
Fifth Edition, 2002.
.. [2] Wikipedia, "F-distribution",
http://en.wikipedia.org/wiki/F-distribution
Examples
--------
An example from Glantz[1], pp 47-40.
Two groups, children of diabetics (25 people) and children from people
without diabetes (25 controls). Fasting blood glucose was measured,
case group had a mean value of 86.1, controls had a mean value of
82.2. Standard deviations were 2.09 and 2.49 respectively. Are these
data consistent with the null hypothesis that the parents' diabetic
status does not affect their children's blood glucose levels?
Calculating the F statistic from the data gives a value of 36.01.
Draw samples from the distribution:
>>> dfnum = 1. # between group degrees of freedom
>>> dfden = 48. # within groups degrees of freedom
>>> s = np.random.f(dfnum, dfden, 1000)
The lower bound for the top 1% of the samples is:
>>> np.sort(s)[-10]
7.61988120985
So there is about a 1% chance that the F statistic will exceed 7.62.
The measured value is 36, so the null hypothesis is rejected at the 1%
level.
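With only 1000 draws the estimated cutoff is fairly noisy. A sketch of the same estimate using a larger, seeded sample and ``np.percentile`` follows; the seed and sample count are assumptions made for illustration.

```python
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed, for repeatability

dfnum, dfden = 1.0, 48.0               # between- and within-group degrees of freedom
s = rs.f(dfnum, dfden, 200000)

# Empirical 99th percentile: the lower bound of the top 1% of samples.
cutoff = np.percentile(s, 99)

# The measured statistic from the example lies far above the cutoff.
measured = 36.01
rejected = measured > cutoff
```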
- fastCopyAndTranspose = _fastCopyAndTranspose(...)
- _fastCopyAndTranspose(a)
- frombuffer(...)
- frombuffer(buffer, dtype=float, count=-1, offset=0)
Interpret a buffer as a 1-dimensional array.
Parameters
----------
buffer : buffer_like
An object that exposes the buffer interface.
dtype : data-type, optional
Data-type of the returned array; default: float.
count : int, optional
Number of items to read. ``-1`` means all data in the buffer.
offset : int, optional
Start reading the buffer from this offset; default: 0.
Notes
-----
If the buffer has data that is not in machine byte-order, this should
be specified as part of the data-type, e.g.::
>>> dt = np.dtype(int)
>>> dt = dt.newbyteorder('>')
>>> np.frombuffer(buf, dtype=dt)
The data of the resulting array will not be byteswapped, but will be
interpreted correctly.
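A self-contained sketch of the byte-order note: the bytes below are a hand-written big-endian encoding of two 32-bit integers (an assumption for illustration), and the byte order is declared in the dtype rather than by swapping the data.

```python
import numpy as np

# Two 32-bit integers, 1 and 2, stored most-significant byte first.
buf = b'\x00\x00\x00\x01\x00\x00\x00\x02'

# '>' marks the dtype as big-endian; the buffer itself is left untouched.
dt = np.dtype('>i4')
arr = np.frombuffer(buf, dtype=dt)
```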
Examples
--------
>>> s = 'hello world'
>>> np.frombuffer(s, dtype='S1', count=5, offset=6)
array(['w', 'o', 'r', 'l', 'd'],
dtype='|S1')
- fromfile(...)
- fromfile(file, dtype=float, count=-1, sep='')
Construct an array from data in a text or binary file.
A highly efficient way of reading binary data with a known data-type,
as well as parsing simply formatted text files. Data written using the
`tofile` method can be read using this function.
Parameters
----------
file : file or str
Open file object or filename.
dtype : data-type
Data type of the returned array.
For binary files, it is used to determine the size and byte-order
of the items in the file.
count : int
Number of items to read. ``-1`` means all items (i.e., the complete
file).
sep : str
Separator between items if file is a text file.
Empty ("") separator means the file should be treated as binary.
Spaces (" ") in the separator match zero or more whitespace characters.
A separator consisting only of spaces must match at least one
whitespace.
See also
--------
load, save
ndarray.tofile
loadtxt : More flexible way of loading data from a text file.
Notes
-----
Do not rely on the combination of `tofile` and `fromfile` for
data storage, as the binary files generated are not platform
independent. In particular, no byte-order or data-type information is
saved. Data can be stored in the platform independent ``.npy`` format
using `save` and `load` instead.
Examples
--------
Construct an ndarray:
>>> dt = np.dtype([('time', [('min', int), ('sec', int)]),
... ('temp', float)])
>>> x = np.zeros((1,), dtype=dt)
>>> x['time']['min'] = 10; x['temp'] = 98.25
>>> x
array([((10, 0), 98.25)],
dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
Save the raw data to disk:
>>> import os
>>> fname = os.tmpnam()
>>> x.tofile(fname)
Read the raw data from disk:
>>> np.fromfile(fname, dtype=dt)
array([((10, 0), 98.25)],
dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
The recommended way to store and load data:
>>> np.save(fname, x)
>>> np.load(fname + '.npy')
array([((10, 0), 98.25)],
dtype=[('time', [('min', '<i4'), ('sec', '<i4')]), ('temp', '<f8')])
- fromiter(...)
- fromiter(iterable, dtype, count=-1)
Create a new 1-dimensional array from an iterable object.
Parameters
----------
iterable : iterable object
An iterable object providing data for the array.
dtype : data-type
The data-type of the returned array.
count : int, optional
The number of items to read from iterable. The default is -1,
which means all data is read.
Returns
-------
out : ndarray
The output array.
Notes
-----
Specify `count` to improve performance. It allows ``fromiter`` to
pre-allocate the output array, instead of resizing it on demand.
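A short sketch of passing `count` when the number of items is known in advance; the generator and the length ``n`` are illustrative.

```python
import numpy as np

n = 5
squares = (x * x for x in range(n))

# With `count` given, the output can be allocated once up front.
arr = np.fromiter(squares, dtype=float, count=n)
```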
Examples
--------
>>> iterable = (x*x for x in range(5))
>>> np.fromiter(iterable, float)
array([ 0., 1., 4., 9., 16.])
- frompyfunc(...)
- frompyfunc(func, nin, nout)
Takes an arbitrary Python function and returns a Numpy ufunc.
Can be used, for example, to add broadcasting to a built-in Python
function (see Examples section).
Parameters
----------
func : Python function object
An arbitrary Python function.
nin : int
The number of input arguments.
nout : int
The number of objects returned by `func`.
Returns
-------
out : ufunc
Returns a Numpy universal function (``ufunc``) object.
Notes
-----
The returned ufunc always returns PyObject arrays.
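Because the result is an object array, a conversion step is usually needed before numeric work. The sketch below uses a hypothetical helper ``add_one`` built from a lambda and converts the result with ``astype``.

```python
import numpy as np

# Hypothetical ufunc wrapping a plain Python function.
add_one = np.frompyfunc(lambda x: x + 1, 1, 1)

result = add_one(np.arange(3))      # dtype is object
as_int = result.astype(int)         # explicit conversion to a numeric dtype
```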
Examples
--------
Use frompyfunc to add broadcasting to the Python function ``oct``:
>>> oct_array = np.frompyfunc(oct, 1, 1)
>>> oct_array(np.array((10, 30, 100)))
array([012, 036, 0144], dtype=object)
>>> np.array((oct(10), oct(30), oct(100))) # for comparison
array(['012', '036', '0144'],
dtype='|S4')
- fromstring(...)
- fromstring(string, dtype=float, count=-1, sep='')
A new 1-D array initialized from raw binary or text data in a string.
Parameters
----------
string : str
A string containing the data.
dtype : data-type, optional
The data type of the array; default: float. For binary input data,
the data must be in exactly this format.
count : int, optional
Read this number of `dtype` elements from the data. If this is
negative (the default), the count will be determined from the
length of the data.
sep : str, optional
If not provided or, equivalently, the empty string, the data will
be interpreted as binary data; otherwise, as ASCII text with
decimal numbers. Also in this latter case, this argument is
interpreted as the string separating numbers in the data; extra
whitespace between elements is also ignored.
Returns
-------
arr : ndarray
The constructed array.
Raises
------
ValueError
If the string is not the correct size to satisfy the requested
`dtype` and `count`.
See Also
--------
frombuffer, fromfile, fromiter
Examples
--------
>>> np.fromstring('\x01\x02', dtype=np.uint8)
array([1, 2], dtype=uint8)
>>> np.fromstring('1 2', dtype=int, sep=' ')
array([1, 2])
>>> np.fromstring('1, 2', dtype=int, sep=',')
array([1, 2])
>>> np.fromstring('\x01\x02\x03\x04\x05', dtype=np.uint8, count=3)
array([1, 2, 3], dtype=uint8)
- gamma(...)
- gamma(shape, scale=1.0, size=None)
Draw samples from a Gamma distribution.
Samples are drawn from a Gamma distribution with specified parameters,
`shape` (sometimes designated "k") and `scale` (sometimes designated
"theta"), where both parameters are > 0.
Parameters
----------
shape : scalar > 0
The shape of the gamma distribution.
scale : scalar > 0, optional
The scale of the gamma distribution. Default is equal to 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : ndarray, float
Returns one sample unless `size` parameter is specified.
See Also
--------
scipy.stats.distributions.gamma : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Gamma distribution is
.. math:: p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},
where :math:`k` is the shape and :math:`\theta` the scale,
and :math:`\Gamma` is the Gamma function.
The Gamma distribution is often used to model the times to failure of
electronic components, and arises naturally in processes for which the
waiting times between Poisson distributed events are relevant.
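The density above can be sanity-checked numerically. This sketch evaluates it on a grid with the standard library's ``math.gamma`` and a simple Riemann sum; the helper name ``gamma_pdf``, the grid and the parameter values are assumptions for illustration. The total area should be close to 1 and the mean close to :math:`k\theta`.

```python
import math
import numpy as np

def gamma_pdf(x, k, theta):
    """Density x^(k-1) exp(-x/theta) / (theta^k Gamma(k))."""
    return x ** (k - 1) * np.exp(-x / theta) / (theta ** k * math.gamma(k))

k, theta = 2.0, 2.0
x = np.linspace(0.0, 100.0, 200001)   # the tail beyond 100 is negligible here
dx = x[1] - x[0]
pdf = gamma_pdf(x, k, theta)

area = np.sum(pdf) * dx               # approximates the integral of the density
mean = np.sum(x * pdf) * dx           # approximates k * theta
```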
References
----------
.. [1] Weisstein, Eric W. "Gamma Distribution." From MathWorld--A
Wolfram Web Resource.
http://mathworld.wolfram.com/GammaDistribution.html
.. [2] Wikipedia, "Gamma-distribution",
http://en.wikipedia.org/wiki/Gamma-distribution
Examples
--------
Draw samples from the distribution:
>>> shape, scale = 2., 2. # mean=4, std=2*sqrt(2)
>>> s = np.random.gamma(shape, scale, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
>>> count, bins, ignored = plt.hist(s, 50, normed=True)
>>> y = bins**(shape-1)*(np.exp(-bins/scale) /
... (sps.gamma(shape)*scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')
>>> plt.show()
- geometric(...)
- geometric(p, size=None)
Draw samples from the geometric distribution.
Bernoulli trials are experiments with one of two outcomes:
success or failure (an example of such an experiment is flipping
a coin). The geometric distribution models the number of trials
that must be run in order to achieve success. It is therefore
supported on the positive integers, ``k = 1, 2, ...``.
The probability mass function of the geometric distribution is
.. math:: f(k) = (1 - p)^{k - 1} p
where `p` is the probability of success of an individual trial.
Parameters
----------
p : float
The probability of success of an individual trial.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : ndarray
Samples from the geometric distribution, shaped according to
`size`.
Examples
--------
Draw ten thousand values from the geometric distribution,
with the probability of an individual success equal to 0.35:
>>> z = np.random.geometric(p=0.35, size=10000)
How many trials succeeded after a single run?
>>> (z == 1).sum() / 10000.
0.34889999999999999 #random
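The same comparison can be extended to several values of ``k`` by evaluating the probability mass function :math:`(1 - p)^{k - 1} p` directly. The seed and sample size below are arbitrary assumptions.

```python
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed
p = 0.35
z = rs.geometric(p=p, size=200000)

ks = np.array([1, 2, 3])
# Fraction of draws that needed exactly k trials.
empirical = np.array([(z == k).mean() for k in ks])
# Probability mass function evaluated at the same k values.
expected = (1 - p) ** (ks - 1) * p
```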
- get_state(...)
- get_state()
Return a tuple representing the internal state of the generator.
For more details, see `set_state`.
Returns
-------
out : tuple(str, ndarray of 624 uints, int, int, float)
The returned tuple has the following items:
1. the string 'MT19937'.
2. a 1-D array of 624 unsigned integer keys.
3. an integer ``pos``.
4. an integer ``has_gauss``.
5. a float ``cached_gaussian``.
See Also
--------
set_state
Notes
-----
`set_state` and `get_state` are not needed to work with any of the
random distributions in NumPy. If the internal state is manually altered,
the user should know exactly what they are doing.
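A typical use of the saved state is replaying a sequence of draws. The sketch below works on a private ``np.random.RandomState`` instance so that the global generator is not disturbed; the seed is arbitrary.

```python
import numpy as np

rs = np.random.RandomState(42)

state = rs.get_state()          # snapshot before drawing
first = rs.uniform(size=3)

rs.set_state(state)             # rewind to the snapshot
second = rs.uniform(size=3)     # repeats the same three values
```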
- getbuffer(...)
- getbuffer(obj [,offset[, size]])
Create a buffer object from the given object referencing a slice of
length size starting at offset.
Default is the entire buffer. A read-write buffer is attempted followed
by a read-only buffer.
Parameters
----------
obj : object
offset : int, optional
size : int, optional
Returns
-------
buffer_obj : buffer
Examples
--------
>>> buf = np.getbuffer(np.ones(5), 1, 3)
>>> len(buf)
3
>>> buf[0]
'\x00'
>>> buf
<read-write buffer for 0x8af1e70, size 3, offset 1 at 0x8ba4ec0>
- geterrobj(...)
- geterrobj()
Return the current object that defines floating-point error handling.
The error object contains all information that defines the error handling
behavior in Numpy. `geterrobj` is used internally by the other
functions that get and set error handling behavior (`geterr`, `seterr`,
`geterrcall`, `seterrcall`).
Returns
-------
errobj : list
The error object, a list containing three elements:
[internal numpy buffer size, error mask, error callback function].
The error mask is a single integer that holds the treatment information
on all four floating point errors. The information for each error type
is contained in three bits of the integer. If we print it in base 8, we
can see what treatment is set for "invalid", "under", "over", and
"divide" (in that order). The printed string can be interpreted with
* 0 : 'ignore'
* 1 : 'warn'
* 2 : 'raise'
* 3 : 'call'
* 4 : 'print'
* 5 : 'log'
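The encoding described above can be decoded in plain Python. The helper ``decode_error_mask`` is hypothetical (it is not part of NumPy); it assumes, as the text states, three bits per error type, with "divide" in the lowest bits and "invalid" in the highest.

```python
TREATMENTS = ['ignore', 'warn', 'raise', 'call', 'print', 'log']
# Lowest three bits first, i.e. the reverse of the printed octal digits.
ERROR_TYPES = ['divide', 'over', 'under', 'invalid']

def decode_error_mask(mask):
    """Map each error type to its treatment (hypothetical helper)."""
    return dict(
        (name, TREATMENTS[(mask >> (3 * i)) & 0b111])
        for i, name in enumerate(ERROR_TYPES)
    )

# Octal 4351 reads invalid=4, under=3, over=5, divide=1.
settings = decode_error_mask(0o4351)
```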
See Also
--------
seterrobj, seterr, geterr, seterrcall, geterrcall
getbufsize, setbufsize
Notes
-----
For complete documentation of the types of floating-point exceptions and
treatment options, see `seterr`.
Examples
--------
>>> np.geterrobj() # first get the defaults
[10000, 0, None]
>>> def err_handler(type, flag):
... print("Floating point error (%s), with flag %s" % (type, flag))
...
>>> old_bufsize = np.setbufsize(20000)
>>> old_err = np.seterr(divide='raise')
>>> old_handler = np.seterrcall(err_handler)
>>> np.geterrobj()
[20000, 2, <function err_handler at 0x91dcaac>]
>>> old_err = np.seterr(all='ignore')
>>> np.base_repr(np.geterrobj()[1], 8)
'0'
>>> old_err = np.seterr(divide='warn', over='log', under='call',
... invalid='print')
>>> np.base_repr(np.geterrobj()[1], 8)
'4351'
- gumbel(...)
- gumbel(loc=0.0, scale=1.0, size=None)
Gumbel distribution.
Draw samples from a Gumbel distribution with specified location and scale.
For more information on the Gumbel distribution, see Notes and References
below.
Parameters
----------
loc : float
The location of the mode of the distribution.
scale : float
The scale parameter of the distribution.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : ndarray
The samples
See Also
--------
scipy.stats.gumbel_l
scipy.stats.gumbel_r
scipy.stats.genextreme
probability density function, distribution, or cumulative density
function, etc. for each of the above
weibull
Notes
-----
The Gumbel (or Smallest Extreme Value (SEV) or the Smallest Extreme Value
Type I) distribution is one of a class of Generalized Extreme Value (GEV)
distributions used in modeling extreme value problems. The Gumbel is a
special case of the Extreme Value Type I distribution for maximums from
distributions with "exponential-like" tails.
The probability density for the Gumbel distribution is
.. math:: p(x) = \frac{e^{-(x - \mu)/ \beta}}{\beta} e^{ -e^{-(x - \mu)/
\beta}},
where :math:`\mu` is the mode, a location parameter, and :math:`\beta` is
the scale parameter.
The Gumbel (named for German mathematician Emil Julius Gumbel) was used
very early in the hydrology literature, for modeling the occurrence of
flood events. It is also used for modeling maximum wind speed and rainfall
rates. It is a "fat-tailed" distribution - the probability of an event in
the tail of the distribution is larger than if one used a Gaussian, hence
the surprisingly frequent occurrence of 100-year floods. Floods were
initially modeled as a Gaussian process, which underestimated the frequency
of extreme events.
It is one of a class of extreme value distributions, the Generalized
Extreme Value (GEV) distributions, which also includes the Weibull and
Frechet.
The function has a mean of :math:`\mu + 0.57721\beta` and a variance of
:math:`\frac{\pi^2}{6}\beta^2`.
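The stated mean and variance can be checked against a large seeded sample. The parameter values, seed and sample size in this sketch are arbitrary assumptions.

```python
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed
mu, beta = 1.0, 2.0                    # location and scale
s = rs.gumbel(mu, beta, 200000)

expected_mean = mu + 0.57721 * beta
expected_var = (np.pi ** 2 / 6) * beta ** 2

sample_mean = s.mean()
sample_var = s.var()
```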
References
----------
Gumbel, E. J., Statistics of Extremes, New York: Columbia University
Press, 1958.
Reiss, R.-D. and Thomas, M., *Statistical Analysis of Extreme Values from
Insurance, Finance, Hydrology and Other Fields*, Basel: Birkhauser Verlag,
2001.
Examples
--------
Draw samples from the distribution:
>>> mu, beta = 0, 0.1 # location and scale
>>> s = np.random.gumbel(mu, beta, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, normed=True)
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)
... * np.exp( -np.exp( -(bins - mu) /beta) ),
... linewidth=2, color='r')
>>> plt.show()
Show how an extreme value distribution can arise from a Gaussian process
and compare to a Gaussian:
>>> means = []
>>> maxima = []
>>> for i in range(0,1000) :
... a = np.random.normal(mu, beta, 1000)
... means.append(a.mean())
... maxima.append(a.max())
>>> count, bins, ignored = plt.hist(maxima, 30, normed=True)
>>> beta = np.std(maxima)*np.pi/np.sqrt(6)
>>> mu = np.mean(maxima) - 0.57721*beta
>>> plt.plot(bins, (1/beta)*np.exp(-(bins - mu)/beta)
... * np.exp(-np.exp(-(bins - mu)/beta)),
... linewidth=2, color='r')
>>> plt.plot(bins, 1/(beta * np.sqrt(2 * np.pi))
... * np.exp(-(bins - mu)**2 / (2 * beta**2)),
... linewidth=2, color='g')
>>> plt.show()
- hypergeometric(...)
- hypergeometric(ngood, nbad, nsample, size=None)
Draw samples from a Hypergeometric distribution.
Samples are drawn from a Hypergeometric distribution with specified
parameters, ngood (ways to make a good selection), nbad (ways to make
a bad selection), and nsample = number of items sampled, which is less
than or equal to the sum ngood + nbad.
Parameters
----------
ngood : int or array_like
Number of ways to make a good selection. Must be nonnegative.
nbad : int or array_like
Number of ways to make a bad selection. Must be nonnegative.
nsample : int or array_like
Number of items sampled. Must be at least 1 and at most
``ngood + nbad``.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
The values are all integers in [0, n].
See Also
--------
scipy.stats.distributions.hypergeom : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability mass function of the Hypergeometric distribution is
.. math:: P(x) = \frac{\binom{g}{x}\binom{b}{n-x}}{\binom{g+b}{n}},
where :math:`0 \le x \le n` and :math:`n-b \le x \le g`
for P(x) the probability of x good results in the drawn sample,
g = ngood, b = nbad, and n = nsample.
Consider an urn with black and white marbles in it, ngood of them
black and nbad of them white. If you draw nsample marbles without
replacement, then the Hypergeometric distribution describes the
distribution of black marbles in the drawn sample.
Note that this distribution is very similar to the Binomial
distribution, except that in this case, samples are drawn without
replacement, whereas in the Binomial case samples are drawn with
replacement (or the sample space is infinite). As the sample space
becomes large, this distribution approaches the Binomial.
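As a sketch, the standard hypergeometric probability mass function, C(ngood, x) * C(nbad, nsample - x) / C(ngood + nbad, nsample), can be evaluated exactly and compared with sampled frequencies. The helpers ``comb`` and ``hypergeom_pmf`` are hypothetical names written for this illustration; the urn sizes match the 15 white and 15 black marbles question posed in the Examples section, and the seed is arbitrary.

```python
from math import factorial
import numpy as np

def comb(n, k):
    """Binomial coefficient, 0 outside the valid range (hypothetical helper)."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))

def hypergeom_pmf(x, ngood, nbad, nsample):
    """Probability of exactly x good items among nsample draws."""
    num = comb(ngood, x) * comb(nbad, nsample - x)
    return float(num) / comb(ngood + nbad, nsample)

ngood, nbad, nsample = 15, 15, 15
# Exact probability that 12 or more of the drawn marbles share one color.
exact = sum(hypergeom_pmf(x, ngood, nbad, nsample)
            for x in range(nsample + 1) if x >= 12 or x <= 3)

rs = np.random.RandomState(0)          # arbitrary seed
s = rs.hypergeometric(ngood, nbad, nsample, 100000)
empirical = np.mean((s >= 12) | (s <= 3))
```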
References
----------
.. [1] Lentner, Marvin, "Elementary Applied Statistics", Bogden
and Quigley, 1972.
.. [2] Weisstein, Eric W. "Hypergeometric Distribution." From
MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/HypergeometricDistribution.html
.. [3] Wikipedia, "Hypergeometric-distribution",
http://en.wikipedia.org/wiki/Hypergeometric-distribution
Examples
--------
Draw samples from the distribution:
>>> import matplotlib.pyplot as plt
>>> ngood, nbad, nsamp = 100, 2, 10
>>> # number of good, number of bad, and number of samples
>>> s = np.random.hypergeometric(ngood, nbad, nsamp, 1000)
>>> plt.hist(s)
>>> # note that it is very unlikely to grab both bad items
Suppose you have an urn with 15 white and 15 black marbles.
If you pull 15 marbles at random, how likely is it that
12 or more of them are one color?
>>> s = np.random.hypergeometric(15, 15, 15, 100000)
>>> sum(s>=12)/100000. + sum(s<=3)/100000.
# answer = 0.003 ... pretty unlikely!
- inner(...)
- inner(a, b)
Inner product of two arrays.
Ordinary inner product of vectors for 1-D arrays (without complex
conjugation), in higher dimensions a sum product over the last axes.
Parameters
----------
a, b : array_like
If `a` and `b` are nonscalar, their last dimensions must match.
Returns
-------
out : ndarray
`out.shape = a.shape[:-1] + b.shape[:-1]`
Raises
------
ValueError
If the last dimensions of `a` and `b` have different sizes.
See Also
--------
tensordot : Sum products over arbitrary axes.
dot : Generalised matrix product, using second last dimension of `b`.
einsum : Einstein summation convention.
Notes
-----
For vectors (1-D arrays) it computes the ordinary inner-product::
np.inner(a, b) = sum(a[:]*b[:])
More generally, if `ndim(a) = r > 0` and `ndim(b) = s > 0`::
np.inner(a, b) = np.tensordot(a, b, axes=(-1,-1))
or explicitly::
np.inner(a, b)[i0,...,ir-1,j0,...,js-1]
= sum(a[i0,...,ir-1,:]*b[j0,...,js-1,:])
In addition `a` or `b` may be scalars, in which case::
np.inner(a,b) = a*b
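The three formulations above can be compared directly on small arrays. The shapes in this sketch are arbitrary, chosen so that the last dimensions match.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
b = np.arange(8).reshape(2, 4)

via_inner = np.inner(a, b)                          # shape (2, 3, 2)
via_tensordot = np.tensordot(a, b, axes=(-1, -1))

# Explicit sum over the last axis of both operands.
explicit = np.array([[[np.sum(a[i, j, :] * b[k, :]) for k in range(2)]
                      for j in range(3)]
                     for i in range(2)])
```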
Examples
--------
Ordinary inner product for vectors:
>>> a = np.array([1,2,3])
>>> b = np.array([0,1,0])
>>> np.inner(a, b)
2
A multidimensional example:
>>> a = np.arange(24).reshape((2,3,4))
>>> b = np.arange(4)
>>> np.inner(a, b)
array([[ 14, 38, 62],
[ 86, 110, 134]])
An example where `b` is a scalar:
>>> np.inner(np.eye(2), 7)
array([[ 7., 0.],
[ 0., 7.]])
- int_asbuffer(...)
- is_busday(...)
- is_busday(dates, weekmask='1111100', holidays=None, busdaycal=None, out=None)
Calculates which of the given dates are valid days, and which are not.
.. versionadded:: 1.7.0
Parameters
----------
dates : array_like of datetime64[D]
The array of dates to process.
weekmask : str or array_like of bool, optional
A seven-element array indicating which of Monday through Sunday are
valid days. May be specified as a length-seven list or array, like
[1,1,1,1,1,0,0]; a length-seven string, like '1111100'; or a string
like "Mon Tue Wed Thu Fri", made up of 3-character abbreviations for
weekdays, optionally separated by white space. Valid abbreviations
are: Mon Tue Wed Thu Fri Sat Sun
holidays : array_like of datetime64[D], optional
An array of dates to consider as invalid dates. They may be
specified in any order, and NaT (not-a-time) dates are ignored.
This list is saved in a normalized form that is suited for
fast calculations of valid days.
busdaycal : busdaycalendar, optional
A `busdaycalendar` object which specifies the valid days. If this
parameter is provided, neither weekmask nor holidays may be
provided.
out : array of bool, optional
If provided, this array is filled with the result.
Returns
-------
out : array of bool
An array with the same shape as ``dates``, containing True for
each valid day, and False for each invalid day.
See Also
--------
busdaycalendar: An object that specifies a custom set of valid days.
busday_offset : Applies an offset counted in valid days.
busday_count : Counts how many valid days are in a half-open date range.
Examples
--------
>>> # The weekdays are Friday, Saturday, and Monday
... np.is_busday(['2011-07-01', '2011-07-02', '2011-07-18'],
... holidays=['2011-07-01', '2011-07-04', '2011-07-17'])
array([False, False, True], dtype='bool')
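A further sketch showing how `weekmask` and `holidays` change the result for the same three dates (a Friday, a Saturday and a Monday). The 'Sat Sun' mask is an illustrative choice, not a recommended setting.

```python
import numpy as np

dates = np.array(['2011-07-01', '2011-07-02', '2011-07-18'],
                 dtype='datetime64[D]')

# Default weekmask '1111100': Monday through Friday are valid days.
default_mask = np.is_busday(dates)

# Only Saturday and Sunday count as valid days.
weekend_only = np.is_busday(dates, weekmask='Sat Sun')

# Default weekmask, with the Friday marked as a holiday.
with_holiday = np.is_busday(dates, holidays=['2011-07-01'])
```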
- laplace(...)
- laplace(loc=0.0, scale=1.0, size=None)
Draw samples from the Laplace or double exponential distribution with
specified location (or mean) and scale (decay).
The Laplace distribution is similar to the Gaussian/normal distribution,
but is sharper at the peak and has fatter tails. It represents the
difference between two independent, identically distributed exponential
random variables.
Parameters
----------
loc : float
The position, :math:`\mu`, of the distribution peak.
scale : float
:math:`\lambda`, the exponential decay.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Notes
-----
It has the probability density function
.. math:: f(x; \mu, \lambda) = \frac{1}{2\lambda}
\exp\left(-\frac{|x - \mu|}{\lambda}\right).
The first law of Laplace, from 1774, states that the frequency of an error
can be expressed as an exponential function of the absolute magnitude of
the error, which leads to the Laplace distribution. For many problems in
Economics and Health sciences, this distribution seems to model the data
better than the standard Gaussian distribution.
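The relationship to the exponential distribution can be illustrated numerically: the difference of two independent exponential variates with scale :math:`\lambda` should have the same variance, :math:`2\lambda^2`, as direct Laplace draws. The seed, scale and sample size are assumptions for this sketch.

```python
import numpy as np

rs = np.random.RandomState(0)          # arbitrary seed
lam = 1.5                              # scale (exponential decay)
n = 200000

# Difference of two independent exponential variates with the same scale.
diff = rs.exponential(lam, n) - rs.exponential(lam, n)

# Direct draws from the Laplace distribution centred at 0.
direct = rs.laplace(0.0, lam, n)

expected_var = 2 * lam ** 2
```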
References
----------
.. [1] Abramowitz, M. and Stegun, I. A. (Eds.). Handbook of Mathematical
Functions with Formulas, Graphs, and Mathematical Tables, 9th
printing. New York: Dover, 1972.
.. [2] The Laplace distribution and generalizations
By Samuel Kotz, Tomasz J. Kozubowski, Krzysztof Podgorski,
Birkhauser, 2001.
.. [3] Weisstein, Eric W. "Laplace Distribution."
From MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/LaplaceDistribution.html
.. [4] Wikipedia, "Laplace distribution",
http://en.wikipedia.org/wiki/Laplace_distribution
Examples
--------
Draw samples from the distribution
>>> loc, scale = 0., 1.
>>> s = np.random.laplace(loc, scale, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, normed=True)
>>> x = np.arange(-8., 8., .01)
>>> pdf = np.exp(-abs(x-loc)/scale)/(2.*scale)
>>> plt.plot(x, pdf)
Plot Gaussian for comparison:
>>> g = (1/(scale * np.sqrt(2 * np.pi)) *
... np.exp(-(x - loc)**2 / (2 * scale**2)))
>>> plt.plot(x,g)
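The statement that the Laplace distribution is the difference of two independent, identically distributed exponential variables can be checked numerically. The following is a minimal sketch; the seed, sample size and scale value are arbitrary choices that do not come from the original example.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
scale, n = 1.5, 200000

# Difference of two independent exponentials with the same scale
diff = rng.exponential(scale, n) - rng.exponential(scale, n)
# Direct Laplace samples centred at zero
lap = rng.laplace(0.0, scale, n)

# A Laplace(0, scale) variable has variance 2*scale**2 and E|x| = scale
print(np.var(diff), np.var(lap), 2 * scale**2)
print(np.mean(np.abs(diff)), np.mean(np.abs(lap)), scale)
```

Both sample sets should give values close to the analytic ones printed in the last column.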
- lexsort(...)
- lexsort(keys, axis=-1)
Perform an indirect sort using a sequence of keys.
Given multiple sorting keys, which can be interpreted as columns in a
spreadsheet, lexsort returns an array of integer indices that describes
the sort order by multiple columns. The last key in the sequence is used
for the primary sort order, the second-to-last key for the secondary sort
order, and so on. The keys argument must be a sequence of objects that
can be converted to arrays of the same shape. If a 2D array is provided
for the keys argument, its rows are interpreted as the sorting keys and
sorting is according to the last row, second last row etc.
Parameters
----------
keys : (k, N) array or tuple containing k (N,)-shaped sequences
The `k` different "columns" to be sorted. The last column (or row if
`keys` is a 2D array) is the primary sort key.
axis : int, optional
Axis to be indirectly sorted. By default, sort over the last axis.
Returns
-------
indices : (N,) ndarray of ints
Array of indices that sort the keys along the specified axis.
See Also
--------
argsort : Indirect sort.
ndarray.sort : In-place sort.
sort : Return a sorted copy of an array.
Examples
--------
Sort names: first by surname, then by name.
>>> surnames = ('Hertz', 'Galilei', 'Hertz')
>>> first_names = ('Heinrich', 'Galileo', 'Gustav')
>>> ind = np.lexsort((first_names, surnames))
>>> ind
array([1, 2, 0])
>>> [surnames[i] + ", " + first_names[i] for i in ind]
['Galilei, Galileo', 'Hertz, Gustav', 'Hertz, Heinrich']
Sort two columns of numbers:
>>> a = [1,5,1,4,3,4,4] # First column
>>> b = [9,4,0,4,0,2,1] # Second column
>>> ind = np.lexsort((b,a)) # Sort by a, then by b
>>> print ind
[2 0 4 6 5 3 1]
>>> [(a[i],b[i]) for i in ind]
[(1, 0), (1, 9), (3, 0), (4, 1), (4, 2), (4, 4), (5, 4)]
Note that sorting is first according to the elements of ``a``.
Secondary sorting is according to the elements of ``b``.
A normal ``argsort`` would have yielded:
>>> [(a[i],b[i]) for i in np.argsort(a)]
[(1, 9), (1, 0), (3, 0), (4, 4), (4, 2), (4, 1), (5, 4)]
Structured arrays are sorted lexically by ``argsort``:
>>> x = np.array([(1,9), (5,4), (1,0), (4,4), (3,0), (4,2), (4,1)],
... dtype=np.dtype([('x', int), ('y', int)]))
>>> np.argsort(x) # or np.argsort(x, order=('x', 'y'))
array([2, 0, 4, 6, 5, 3, 1])
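The key ordering described above (last key is primary) can be illustrated by reproducing the result with two stable sorts. This is only a sketch of the idea and not a description of how `lexsort` is implemented internally; the variable names `order` and `ind_2d` are introduced here for illustration.

```python
import numpy as np

a = np.array([1, 5, 1, 4, 3, 4, 4])   # primary key
b = np.array([9, 4, 0, 4, 0, 2, 1])   # secondary key

ind = np.lexsort((b, a))               # last key is the primary one

# Equivalent: stable sort on the secondary key, then on the primary key
order = np.argsort(b, kind='mergesort')
order = order[np.argsort(a[order], kind='mergesort')]

# A 2-D array is read row by row, with the last row as primary key
ind_2d = np.lexsort(np.array([b, a]))

print(ind, order, ind_2d)
```

All three index arrays should agree with the ``[2 0 4 6 5 3 1]`` result shown in the example above.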
- loads(...)
- loads(string) -- Load a pickle from the given string
- logistic(...)
- logistic(loc=0.0, scale=1.0, size=None)
Draw samples from a Logistic distribution.
Samples are drawn from a Logistic distribution with specified
parameters, loc (location or mean, also median), and scale (>0).
Parameters
----------
loc : float
scale : float > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : {ndarray, scalar}
The drawn samples.
See Also
--------
scipy.stats.distributions.logistic : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Logistic distribution is
.. math:: P(x) = \frac{e^{-(x-\mu)/s}}{s(1+e^{-(x-\mu)/s})^2},
where :math:`\mu` = location and :math:`s` = scale.
The Logistic distribution is used in Extreme Value problems where it
can act as a mixture of Gumbel distributions, in Epidemiology, and by
the World Chess Federation (FIDE) where it is used in the Elo ranking
system, assuming the performance of each player is a logistically
distributed random variable.
References
----------
.. [1] Reiss, R.-D. and Thomas M. (2001), Statistical Analysis of Extreme
Values, from Insurance, Finance, Hydrology and Other Fields,
Birkhauser Verlag, Basel, pp 132-133.
.. [2] Weisstein, Eric W. "Logistic Distribution." From
MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/LogisticDistribution.html
.. [3] Wikipedia, "Logistic-distribution",
http://en.wikipedia.org/wiki/Logistic-distribution
Examples
--------
Draw samples from the distribution:
>>> loc, scale = 10, 1
>>> s = np.random.logistic(loc, scale, 10000)
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, bins=50)
>>> # plot against distribution
>>> def logist(x, loc, scale):
...     return np.exp((loc-x)/scale)/(scale*(1+np.exp((loc-x)/scale))**2)
>>> plt.plot(bins, logist(bins, loc, scale)*count.max()/\
...          logist(bins, loc, scale).max())
>>> plt.show()
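As a numerical cross-check of the density given in the Notes, the sketch below integrates it with a simple Riemann sum and compares sample moments with the known mean (``loc``) and variance (``(scale*pi)**2 / 3``) of the logistic distribution. The helper name `logistic_pdf`, the grid, the seed and the sample size are illustrative assumptions.

```python
import numpy as np

def logistic_pdf(x, loc, scale):
    # Density from the Notes section, with mu = loc and s = scale
    z = np.exp(-(x - loc) / scale)
    return z / (scale * (1.0 + z) ** 2)

loc, scale = 10.0, 1.0
x = np.linspace(loc - 30 * scale, loc + 30 * scale, 60001)
dx = x[1] - x[0]
area = np.sum(logistic_pdf(x, loc, scale)) * dx   # Riemann sum, close to 1

rng = np.random.RandomState(0)   # fixed seed, arbitrary
s = rng.logistic(loc, scale, 200000)

# The logistic distribution has mean loc and variance (scale*pi)**2 / 3
print(area, s.mean(), s.var(), (scale * np.pi) ** 2 / 3)
```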
- lognormal(...)
- lognormal(mean=0.0, sigma=1.0, size=None)
Return samples drawn from a log-normal distribution.
Draw samples from a log-normal distribution with specified mean,
standard deviation, and array shape. Note that the mean and standard
deviation are not the values for the distribution itself, but of the
underlying normal distribution it is derived from.
Parameters
----------
mean : float
Mean value of the underlying normal distribution
sigma : float, > 0.
Standard deviation of the underlying normal distribution
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or float
The desired samples. An array of the same shape as `size` if given,
if `size` is None a float is returned.
See Also
--------
scipy.stats.lognorm : probability density function, distribution,
cumulative density function, etc.
Notes
-----
A variable `x` has a log-normal distribution if `log(x)` is normally
distributed. The probability density function for the log-normal
distribution is:
.. math:: p(x) = \frac{1}{\sigma x \sqrt{2\pi}}
e^{(-\frac{(ln(x)-\mu)^2}{2\sigma^2})}
where :math:`\mu` is the mean and :math:`\sigma` is the standard
deviation of the normally distributed logarithm of the variable.
A log-normal distribution results if a random variable is the product
of a large number of independent, identically-distributed variables in
the same way that a normal distribution results if the variable is the
sum of a large number of independent, identically-distributed
variables.
References
----------
Limpert, E., Stahel, W. A., and Abbt, M., "Log-normal Distributions
across the Sciences: Keys and Clues," BioScience, Vol. 51, No. 5,
May, 2001. http://stat.ethz.ch/~stahel/lognormal/bioscience.pdf
Reiss, R.D. and Thomas, M., Statistical Analysis of Extreme Values,
Basel: Birkhauser Verlag, 2001, pp. 31-32.
Examples
--------
Draw samples from the distribution:
>>> mu, sigma = 3., 1. # mean and standard deviation
>>> s = np.random.lognormal(mu, sigma, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 100, normed=True, align='mid')
>>> x = np.linspace(min(bins), max(bins), 10000)
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))
... / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, linewidth=2, color='r')
>>> plt.axis('tight')
>>> plt.show()
Demonstrate that taking the products of random samples from a uniform
distribution can be fit well by a log-normal probability density function.
>>> # Generate a thousand samples: each is the product of 100 random
>>> # values, drawn from a uniform distribution.
>>> b = []
>>> for i in range(1000):
...     a = 10. + np.random.random(100)
...     b.append(np.prod(a))
>>> b = np.array(b) / np.min(b) # scale values to be positive
>>> count, bins, ignored = plt.hist(b, 100, normed=True, align='center')
>>> sigma = np.std(np.log(b))
>>> mu = np.mean(np.log(b))
>>> x = np.linspace(min(bins), max(bins), 10000)
>>> pdf = (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2))
... / (x * sigma * np.sqrt(2 * np.pi)))
>>> plt.plot(x, pdf, color='r', linewidth=2)
>>> plt.show()
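Because `mean` and `sigma` describe the underlying normal distribution rather than the log-normal one, it can help to see both sets of moments side by side. A minimal sketch, with an arbitrary seed and parameter values chosen only for illustration:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
mu, sigma = 3.0, 0.5
s = rng.lognormal(mu, sigma, 200000)

# Moments of log(s) recover the parameters that were passed in
log_s = np.log(s)
print(log_s.mean(), log_s.std())

# The mean of the log-normal samples themselves is exp(mu + sigma**2 / 2)
print(s.mean(), np.exp(mu + sigma**2 / 2))
```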
- logseries(...)
- logseries(p, size=None)
Draw samples from a Logarithmic Series distribution.
Samples are drawn from a Log Series distribution with specified
parameter, p (probability, 0 < p < 1).
Parameters
----------
p : float
Probability parameter, 0 < p < 1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : {ndarray, scalar}
The drawn samples, which are positive integers.
See Also
--------
scipy.stats.distributions.logser : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Log Series distribution is
.. math:: P(k) = \frac{-p^k}{k \ln(1-p)},
where p = probability.
The Log Series distribution is frequently used to represent species
richness and occurrence, first proposed by Fisher, Corbet, and
Williams in 1943 [2]. It may also be used to model the numbers of
occupants seen in cars [3].
References
----------
.. [1] Buzas, Martin A.; Culver, Stephen J., Understanding regional
species diversity through the log series distribution of
occurrences: BIODIVERSITY RESEARCH Diversity & Distributions,
Volume 5, Number 5, September 1999 , pp. 187-195(9).
.. [2] Fisher, R.A., A.S. Corbet, and C.B. Williams. 1943. The
relation between the number of species and the number of
individuals in a random sample of an animal population.
Journal of Animal Ecology, 12:42-58.
.. [3] D. J. Hand, F. Daly, D. Lunn, E. Ostrowski, A Handbook of Small
Data Sets, CRC Press, 1994.
.. [4] Wikipedia, "Logarithmic-distribution",
http://en.wikipedia.org/wiki/Logarithmic-distribution
Examples
--------
Draw samples from the distribution:
>>> a = .6
>>> s = np.random.logseries(a, 10000)
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s)
>>> # plot against distribution
>>> def logseries(k, p):
...     return -p**k/(k*np.log(1-p))
>>> plt.plot(bins, logseries(bins, a)*count.max()/
...          logseries(bins, a).max(), 'r')
>>> plt.show()
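The probability mass function from the Notes can be checked without plotting. The sketch below sums it over ``k = 1, 2, ...`` and compares the sample mean with the analytic mean ``-p / ((1 - p) * ln(1 - p))``. The helper name `logseries_pmf`, the seed and the sample size are assumptions made for illustration.

```python
import numpy as np

def logseries_pmf(k, p):
    # P(k) = -p**k / (k * ln(1 - p)), for k = 1, 2, 3, ...
    return -p**k / (k * np.log(1 - p))

p = 0.6
k = np.arange(1, 200)
total = logseries_pmf(k, p).sum()               # close to 1

rng = np.random.RandomState(0)   # fixed seed, arbitrary
s = rng.logseries(p, 200000)
expected_mean = -p / ((1 - p) * np.log(1 - p))  # analytic mean

print(total, s.min(), s.mean(), expected_mean)
```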
- may_share_memory(...)
- Determine if two arrays can share memory
The memory-bounds of a and b are computed. If they overlap then
this function returns True. Otherwise, it returns False.
A return of True does not necessarily mean that the two arrays
share any element. It just means that they might.
Parameters
----------
a, b : ndarray
Returns
-------
out : bool
Examples
--------
>>> np.may_share_memory(np.array([1,2]), np.array([5,8,9]))
False
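A return value of True only means that the memory bounds overlap. The sketch below, using interleaved views chosen for illustration, shows a case where the function reports True although the two views have no element in common.

```python
import numpy as np

a = np.arange(10)
evens, odds = a[::2], a[1::2]   # interleaved views, no common element

bounds_overlap = np.may_share_memory(evens, odds)   # bounds overlap
same_buffer = np.may_share_memory(a, a[3:])         # genuine overlap
separate = np.may_share_memory(a, a.copy())         # independent buffers

print(bounds_overlap, same_buffer, separate)
```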
- min_scalar_type(...)
- min_scalar_type(a)
For scalar ``a``, returns the data type with the smallest size
and smallest scalar kind which can hold its value. For non-scalar
array ``a``, returns the vector's dtype unmodified.
Floating point values are not demoted to integers,
and complex values are not demoted to floats.
Parameters
----------
a : scalar or array_like
The value whose minimal data type is to be found.
Returns
-------
out : dtype
The minimal data type.
Notes
-----
.. versionadded:: 1.6.0
See Also
--------
result_type, promote_types, dtype, can_cast
Examples
--------
>>> np.min_scalar_type(10)
dtype('uint8')
>>> np.min_scalar_type(-260)
dtype('int16')
>>> np.min_scalar_type(3.1)
dtype('float16')
>>> np.min_scalar_type(1e50)
dtype('float64')
>>> np.min_scalar_type(np.arange(4,dtype='f8'))
dtype('float64')
- multinomial(...)
- multinomial(n, pvals, size=None)
Draw samples from a multinomial distribution.
The multinomial distribution is a multivariate generalisation of the
binomial distribution. Take an experiment with one of ``p``
possible outcomes. An example of such an experiment is throwing a die,
where the outcome can be 1 through 6. Each sample drawn from the
distribution represents `n` such experiments. Its values,
``X_i = [X_0, X_1, ..., X_p]``, represent the number of times the outcome
was ``i``.
Parameters
----------
n : int
Number of experiments.
pvals : sequence of floats, length p
Probabilities of each of the ``p`` different outcomes. These
should sum to 1 (however, the last element is always assumed to
account for the remaining probability, as long as
``sum(pvals[:-1]) <= 1``).
size : tuple of ints
Given a `size` of ``(M, N, K)``, then ``M*N*K`` samples are drawn,
and the output shape becomes ``(M, N, K, p)``, since each sample
has shape ``(p,)``.
Examples
--------
Throw a die 20 times:
>>> np.random.multinomial(20, [1/6.]*6, size=1)
array([[4, 1, 7, 5, 2, 1]])
It landed 4 times on 1, once on 2, etc.
Now, throw the dice 20 times, and 20 times again:
>>> np.random.multinomial(20, [1/6.]*6, size=2)
array([[3, 4, 3, 3, 4, 3],
[2, 4, 3, 4, 0, 7]])
For the first run, we threw 3 times 1, 4 times 2, etc. For the second,
we threw 2 times 1, 4 times 2, etc.
A loaded die is more likely to land on its last face. Here there are five
faces, and the last one absorbs the remaining probability of 3/7:
>>> np.random.multinomial(100, [1/7.]*5)
array([13, 16, 13, 16, 42])
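The output shape rule and the handling of the last probability can be verified directly. A minimal sketch, assuming an arbitrary seed and a three-outcome `pvals` chosen only for illustration:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
pvals = [0.2, 0.3, 0.5]          # p = 3 outcomes
draws = rng.multinomial(20, pvals, size=(2, 4))

print(draws.shape)          # size plus one trailing axis of length p
print(draws.sum(axis=-1))   # every sample's counts add up to n = 20

# The last entry absorbs whatever probability is left over
loaded = rng.multinomial(100, [1/7.] * 5)
print(loaded, loaded.sum())
```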
- multivariate_normal(...)
- multivariate_normal(mean, cov[, size])
Draw random samples from a multivariate normal distribution.
The multivariate normal, multinormal or Gaussian distribution is a
generalization of the one-dimensional normal distribution to higher
dimensions. Such a distribution is specified by its mean and
covariance matrix. These parameters are analogous to the mean
(average or "center") and variance (standard deviation, or "width,"
squared) of the one-dimensional normal distribution.
Parameters
----------
mean : 1-D array_like, of length N
Mean of the N-dimensional distribution.
cov : 2-D array_like, of shape (N, N)
Covariance matrix of the distribution. Must be symmetric and
positive-semidefinite for "physically meaningful" results.
size : int or tuple of ints, optional
Given a shape of, for example, ``(m,n,k)``, ``m*n*k`` samples are
generated, and packed in an `m`-by-`n`-by-`k` arrangement. Because
each sample is `N`-dimensional, the output shape is ``(m,n,k,N)``.
If no shape is specified, a single (`N`-D) sample is returned.
Returns
-------
out : ndarray
The drawn samples, of shape size, if that was provided. If not,
the shape is ``(N,)``.
In other words, each entry ``out[i,j,...,:]`` is an N-dimensional
value drawn from the distribution.
Notes
-----
The mean is a coordinate in N-dimensional space, which represents the
location where samples are most likely to be generated. This is
analogous to the peak of the bell curve for the one-dimensional or
univariate normal distribution.
Covariance indicates the level to which two variables vary together.
From the multivariate normal distribution, we draw N-dimensional
samples, :math:`X = [x_1, x_2, ... x_N]`. The covariance matrix
element :math:`C_{ij}` is the covariance of :math:`x_i` and :math:`x_j`.
The element :math:`C_{ii}` is the variance of :math:`x_i` (i.e. its
"spread").
Instead of specifying the full covariance matrix, popular
approximations include:
- Spherical covariance (cov is a multiple of the identity matrix)
- Diagonal covariance (cov has non-negative elements, and only on
the diagonal)
This geometrical property can be seen in two dimensions by plotting
generated data-points:
>>> mean = [0,0]
>>> cov = [[1,0],[0,100]] # diagonal covariance, points lie on x or y-axis
>>> import matplotlib.pyplot as plt
>>> x,y = np.random.multivariate_normal(mean,cov,5000).T
>>> plt.plot(x,y,'x'); plt.axis('equal'); plt.show()
Note that the covariance matrix must be non-negative definite.
References
----------
Papoulis, A., Probability, Random Variables, and Stochastic Processes,
3rd ed., New York: McGraw-Hill, 1991.
Duda, R. O., Hart, P. E., and Stork, D. G., Pattern Classification,
2nd ed., New York: Wiley, 2001.
Examples
--------
>>> mean = (1,2)
>>> cov = [[1,0],[0,1]]
>>> x = np.random.multivariate_normal(mean,cov,(3,3))
>>> x.shape
(3, 3, 2)
The following is likely, though not guaranteed, to be true, since each
component has unit standard deviation and therefore falls below
``mean + 0.6`` with a probability of about 0.73:
>>> print list( (x[0,0,:] - mean) < 0.6 )
[True, True]
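To see how `mean` and `cov` shape the samples, one can compare the sample mean and sample covariance of a large draw with the inputs. This is a sketch with an arbitrary seed and an illustrative covariance matrix, not part of the original examples.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
mean = [1.0, 2.0]
cov = [[1.0, 0.5],
       [0.5, 2.0]]               # symmetric, positive-definite

x = rng.multivariate_normal(mean, cov, 100000)
print(x.shape)               # one row per sample, N = 2 columns
print(x.mean(axis=0))        # close to mean
print(np.cov(x.T))           # close to cov; the diagonal holds the variances
```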
- negative_binomial(...)
- negative_binomial(n, p, size=None)
Draw samples from a negative binomial distribution.
Samples are drawn from a negative binomial distribution with specified
parameters, `n` trials and `p` probability of success where `n` is an
integer > 0 and `p` is in the interval [0, 1].
Parameters
----------
n : int
Parameter, > 0.
p : float
Parameter, >= 0 and <=1.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : int or ndarray of ints
Drawn samples.
Notes
-----
The probability density for the Negative Binomial distribution is
.. math:: P(N;n,p) = \binom{N+n-1}{n-1}p^{n}(1-p)^{N},
where :math:`n-1` is the number of successes, :math:`p` is the probability
of success, and :math:`N+n-1` is the number of trials.
The negative binomial distribution gives the probability of n-1 successes
and N failures in N+n-1 trials, and success on the (N+n)th trial.
If one throws a die repeatedly until the third time a "1" appears, then the
probability distribution of the number of non-"1"s that appear before the
third "1" is a negative binomial distribution.
References
----------
.. [1] Weisstein, Eric W. "Negative Binomial Distribution." From
MathWorld--A Wolfram Web Resource.
http://mathworld.wolfram.com/NegativeBinomialDistribution.html
.. [2] Wikipedia, "Negative binomial distribution",
http://en.wikipedia.org/wiki/Negative_binomial_distribution
Examples
--------
Draw samples from the distribution:
A real world example. A company drills wild-cat oil exploration wells, each
with an estimated probability of success of 0.1. What is the probability
of having one success for each successive well, that is what is the
probability of a single success after drilling 5 wells, after 6 wells,
etc.?
>>> s = np.random.negative_binomial(1, 0.1, 100000)
>>> for i in range(1, 11):
...     probability = sum(s<i) / 100000.
...     print i, "wells drilled, probability of one success =", probability
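For ``n = 1`` the simulated probabilities in the well-drilling example have a closed form: the chance of at least one success in ``i`` wells is ``1 - (1 - p)**i``. The sketch below compares the two; the seed is arbitrary and the variable names `simulated` and `exact` are introduced here for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
p, n_draws = 0.1, 100000
# Number of failures before the first success (n = 1)
s = rng.negative_binomial(1, p, n_draws)

for i in range(1, 11):
    simulated = np.mean(s < i)     # at most i - 1 failures come first
    exact = 1 - (1 - p) ** i       # at least one success in i wells
    print(i, simulated, exact)
```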
- nested_iters(...)
- newbuffer(...)
- newbuffer(size)
Return a new uninitialized buffer object.
Parameters
----------
size : int
Size in bytes of returned buffer object.
Returns
-------
newbuffer : buffer object
Returned, uninitialized buffer object of `size` bytes.
- noncentral_chisquare(...)
- noncentral_chisquare(df, nonc, size=None)
Draw samples from a noncentral chi-square distribution.
The noncentral :math:`\chi^2` distribution is a generalisation of
the :math:`\chi^2` distribution.
Parameters
----------
df : int
Degrees of freedom, should be >= 1.
nonc : float
Non-centrality, should be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Notes
-----
The probability density function for the noncentral Chi-square distribution
is
.. math:: P(x;df,nonc) = \sum^{\infty}_{i=0}
\frac{e^{-nonc/2}(nonc/2)^{i}}{i!}P_{Y_{df+2i}}(x),
where :math:`Y_{q}` is the Chi-square with q degrees of freedom.
In Delhi (1970) [1]_, it is noted that the noncentral chi-square is useful
in bombing and coverage problems, where the probability of killing a point
target is given by the noncentral chi-squared distribution.
References
----------
.. [1] Delhi, M.S. Holla, "On a noncentral chi-square distribution in the
analysis of weapon systems effectiveness", Metrika, Volume 15,
Number 1 / December, 1970.
.. [2] Wikipedia, "Noncentral chi-square distribution"
http://en.wikipedia.org/wiki/Noncentral_chi-square_distribution
Examples
--------
Draw values from the distribution and plot the histogram
>>> import matplotlib.pyplot as plt
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),
... bins=200, normed=True)
>>> plt.show()
Draw values from a noncentral chisquare with very small noncentrality,
and compare to a chisquare.
>>> plt.figure()
>>> values = plt.hist(np.random.noncentral_chisquare(3, .0000001, 100000),
... bins=np.arange(0., 25, .1), normed=True)
>>> values2 = plt.hist(np.random.chisquare(3, 100000),
... bins=np.arange(0., 25, .1), normed=True)
>>> plt.plot(values[1][0:-1], values[0]-values2[0], 'ob')
>>> plt.show()
Demonstrate how large values of non-centrality lead to a more symmetric
distribution.
>>> plt.figure()
>>> values = plt.hist(np.random.noncentral_chisquare(3, 20, 100000),
... bins=200, normed=True)
>>> plt.show()
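The comparisons above can also be made numerically. The noncentral chi-square distribution has mean ``df + nonc`` and variance ``2*(df + 2*nonc)``, so a large sample should reproduce both. A minimal sketch with an arbitrary seed and sample size:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
df, nonc, n = 3, 20.0, 200000

s = rng.noncentral_chisquare(df, nonc, n)
print(s.mean(), df + nonc)            # mean is df + nonc
print(s.var(), 2 * (df + 2 * nonc))   # variance is 2*(df + 2*nonc)

# With a tiny non-centrality the mean approaches that of chi-square, df
t = rng.noncentral_chisquare(df, 1e-7, n)
print(t.mean(), df)
```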
- noncentral_f(...)
- noncentral_f(dfnum, dfden, nonc, size=None)
Draw samples from the noncentral F distribution.
Samples are drawn from an F distribution with specified parameters,
`dfnum` (degrees of freedom in numerator) and `dfden` (degrees of
freedom in denominator), where both parameters > 1.
`nonc` is the non-centrality parameter.
Parameters
----------
dfnum : int
Parameter, should be > 1.
dfden : int
Parameter, should be > 1.
nonc : float
Parameter, should be >= 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : scalar or ndarray
Drawn samples.
Notes
-----
When calculating the power of an experiment (power = probability of
rejecting the null hypothesis when a specific alternative is true) the
non-central F statistic becomes important. When the null hypothesis is
true, the F statistic follows a central F distribution. When the null
hypothesis is not true, then it follows a non-central F distribution.
References
----------
Weisstein, Eric W. "Noncentral F-Distribution." From MathWorld--A Wolfram
Web Resource. http://mathworld.wolfram.com/NoncentralF-Distribution.html
Wikipedia, "Noncentral F distribution",
http://en.wikipedia.org/wiki/Noncentral_F-distribution
Examples
--------
In a study, testing for a specific alternative to the null hypothesis
requires use of the Noncentral F distribution. We need to calculate the
area in the tail of the distribution that exceeds the value of the F
distribution for the null hypothesis. We'll plot the two probability
distributions for comparison.
>>> dfnum = 3 # between group deg of freedom
>>> dfden = 20 # within groups degrees of freedom
>>> nonc = 3.0
>>> nc_vals = np.random.noncentral_f(dfnum, dfden, nonc, 1000000)
>>> NF = np.histogram(nc_vals, bins=50, normed=True)
>>> c_vals = np.random.f(dfnum, dfden, 1000000)
>>> F = np.histogram(c_vals, bins=50, normed=True)
>>> import matplotlib.pyplot as plt
>>> plt.plot(F[1][1:], F[0])
>>> plt.plot(NF[1][1:], NF[0])
>>> plt.show()
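The tail area mentioned above can be estimated by simulation instead of being read off the plot. In this sketch the 5% significance level, the seed and the sample size are assumptions chosen for illustration: the critical value is taken from the central F samples, and the power is the fraction of noncentral samples that exceed it.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
dfnum, dfden, nonc, n = 3, 20, 3.0, 200000

c_vals = rng.f(dfnum, dfden, n)                      # null hypothesis
nc_vals = rng.noncentral_f(dfnum, dfden, nonc, n)    # specific alternative

# Critical value: the point that the central F exceeds 5% of the time
critical = np.percentile(c_vals, 95)

size = np.mean(c_vals > critical)     # about 0.05 by construction
power = np.mean(nc_vals > critical)   # tail area under the alternative
print(critical, size, power)
```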
- normal(...)
- normal(loc=0.0, scale=1.0, size=None)
Draw random samples from a normal (Gaussian) distribution.
The probability density function of the normal distribution, first
derived by De Moivre and 200 years later by both Gauss and Laplace
independently [2]_, is often called the bell curve because of
its characteristic shape (see the example below).
The normal distribution occurs often in nature. For example, it
describes the commonly occurring distribution of samples influenced
by a large number of tiny, random disturbances, each with its own
unique distribution [2]_.
Parameters
----------
loc : float
Mean ("centre") of the distribution.
scale : float
Standard deviation (spread or "width") of the distribution.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
See Also
--------
scipy.stats.distributions.norm : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Gaussian distribution is
.. math:: p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }}
e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },
where :math:`\mu` is the mean and :math:`\sigma` the standard deviation.
The square of the standard deviation, :math:`\sigma^2`, is called the
variance.
The function has its peak at the mean, and its "spread" increases with
the standard deviation (the function reaches 0.607 times its maximum at
:math:`\mu + \sigma` and :math:`\mu - \sigma` [2]_). This implies that
`numpy.random.normal` is more likely to return samples lying close to the
mean, rather than those far away.
References
----------
.. [1] Wikipedia, "Normal distribution",
http://en.wikipedia.org/wiki/Normal_distribution
.. [2] P. R. Peebles Jr., "Central Limit Theorem" in "Probability, Random
Variables and Random Signal Principles", 4th ed., 2001,
pp. 51, 51, 125.
Examples
--------
Draw samples from the distribution:
>>> mu, sigma = 0, 0.1 # mean and standard deviation
>>> s = np.random.normal(mu, sigma, 1000)
Verify the mean and the standard deviation:
>>> abs(mu - np.mean(s)) < 0.01
True
>>> abs(sigma - np.std(s, ddof=1)) < 0.01
True
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, normed=True)
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
... np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
... linewidth=2, color='r')
>>> plt.show()
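The 0.607 figure quoted in the Notes is ``exp(-1/2)``, the ratio of the density one standard deviation from the mean to the density at the mean. The sketch below evaluates that ratio and also estimates the fraction of samples within one standard deviation; the helper name `normal_pdf`, the seed and the sample size are illustrative assumptions.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    # Gaussian density from the Notes section
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

mu, sigma = 0.0, 0.1
ratio = normal_pdf(mu + sigma, mu, sigma) / normal_pdf(mu, mu, sigma)
print(ratio)                         # exp(-1/2), about 0.607

rng = np.random.RandomState(0)       # fixed seed, arbitrary
s = rng.normal(mu, sigma, 200000)
within = np.mean(np.abs(s - mu) < sigma)
print(within)                        # about 0.68
```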
- orbit_fits_plot(datafile, outplot, cols=None)
- Plot the semi-major axis of the HST orbit vs time.
- orbit_fits_report(output, test_date_spss, nohead)
- Generate the report of the orbit fits by querying the database.
- packbits(...)
- packbits(myarray, axis=None)
Packs the elements of a binary-valued array into bits in a uint8 array.
The result is padded to full bytes by inserting zero bits at the end.
Parameters
----------
myarray : array_like
An integer type array whose elements should be packed to bits.
axis : int, optional
The dimension over which bit-packing is done.
``None`` implies packing the flattened array.
Returns
-------
packed : ndarray
Array of type uint8 whose elements represent bits corresponding to the
logical (0 or nonzero) value of the input elements. The shape of
`packed` has the same number of dimensions as the input (unless `axis`
is None, in which case the output is 1-D).
See Also
--------
unpackbits : Unpacks elements of a uint8 array into a binary-valued output
array.
Examples
--------
>>> a = np.array([[[1,0,1],
... [0,1,0]],
... [[1,1,0],
... [0,0,1]]])
>>> b = np.packbits(a, axis=-1)
>>> b
array([[[160],[64]],[[192],[32]]], dtype=uint8)
Note that in binary 160 = 1010 0000, 64 = 0100 0000, 192 = 1100 0000,
and 32 = 0010 0000.
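The zero padding matters when the number of bits is not a multiple of eight. A short sketch of a round trip through `unpackbits`, using a ten-bit input chosen for illustration:

```python
import numpy as np

bits = np.array([1, 0, 1, 1, 0, 0, 1, 1, 1, 0], dtype=np.uint8)  # 10 bits

packed = np.packbits(bits)
print(packed)                 # [179 128]: 10110011, then 10 padded to 10000000

# unpackbits always returns whole bytes, so trim the padding afterwards
restored = np.unpackbits(packed)[:bits.size]
print((restored == bits).all())
```

The original length has to be stored separately, since the packed array does not record how many padding bits were added.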
- pareto(...)
- pareto(a, size=None)
Draw samples from a Pareto II or Lomax distribution with specified shape.
The Lomax or Pareto II distribution is a shifted Pareto distribution. The
classical Pareto distribution can be obtained from the Lomax distribution
by adding 1 and multiplying by the scale parameter ``m`` (see Notes).
The smallest value of the Lomax distribution is zero while for the
classical Pareto distribution it is ``mu``, where the standard Pareto
distribution has location ``mu = 1``. Lomax can also be considered as a
simplified version of the Generalized Pareto distribution (available in
SciPy), with the scale set to one and the location set to zero.
The Pareto distribution must be greater than zero, and is unbounded above.
It is also known as the "80-20 rule". In this distribution, 80 percent of
the weights are in the lowest 20 percent of the range, while the other 20
percent fill the remaining 80 percent of the range.
Parameters
----------
a : float, > 0.
Shape of the distribution.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
See Also
--------
scipy.stats.distributions.lomax.pdf : probability density function,
distribution or cumulative density function, etc.
scipy.stats.distributions.genpareto.pdf : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Pareto distribution is
.. math:: p(x) = \frac{am^a}{x^{a+1}}
where :math:`a` is the shape and :math:`m` the scale.
The Pareto distribution, named after the Italian economist Vilfredo Pareto,
is a power law probability distribution useful in many real world problems.
Outside the field of economics it is generally referred to as the Bradford
distribution. Pareto developed the distribution to describe the
distribution of wealth in an economy. It has also found use in insurance,
web page access statistics, oil field sizes, and many other problems,
including the download frequency for projects in Sourceforge [1]_. It is
one of the so-called "fat-tailed" distributions.
References
----------
.. [1] Francis Hunt and Paul Johnson, On the Pareto Distribution of
Sourceforge projects.
.. [2] Pareto, V. (1896). Course of Political Economy. Lausanne.
.. [3] Reiss, R.D., Thomas, M.(2001), Statistical Analysis of Extreme
Values, Birkhauser Verlag, Basel, pp 23-30.
.. [4] Wikipedia, "Pareto distribution",
http://en.wikipedia.org/wiki/Pareto_distribution
Examples
--------
Draw samples from the distribution:
>>> a, m = 3., 2. # shape and mode
>>> s = (np.random.pareto(a, 1000) + 1) * m
Display the histogram of the samples, along with the probability
density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, _ = plt.hist(s, 100, normed=True)
>>> fit = a*m**a / bins**(a+1)
>>> plt.plot(bins, max(count)*fit/max(fit), linewidth=2, color='r')
>>> plt.show()
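The relationship between the Lomax samples returned by `pareto` and the classical Pareto distribution can be checked on the supports and on the median, which for the classical Pareto is ``m * 2**(1/a)``. A minimal sketch with an arbitrary seed and sample size:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
a, m, n = 3.0, 2.0, 200000

lomax = rng.pareto(a, n)          # Lomax / Pareto II, support starts at 0
classical = (lomax + 1) * m       # classical Pareto, support starts at m

print(lomax.min(), classical.min())
# Median of the classical Pareto is m * 2**(1/a)
print(np.median(classical), m * 2 ** (1 / a))
```

The median is used here rather than the mean because it is less sensitive to the heavy upper tail.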
- permutation(...)
- permutation(x)
Randomly permute a sequence, or return a permuted range.
If `x` is a multi-dimensional array, it is only shuffled along its
first index.
Parameters
----------
x : int or array_like
If `x` is an integer, randomly permute ``np.arange(x)``.
If `x` is an array, make a copy and shuffle the elements
randomly.
Returns
-------
out : ndarray
Permuted sequence or array range.
Examples
--------
>>> np.random.permutation(10)
array([1, 7, 4, 3, 0, 9, 2, 5, 8, 6])
>>> np.random.permutation([1, 4, 9, 12, 15])
array([15, 1, 9, 4, 12])
>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.permutation(arr)
array([[6, 7, 8],
[0, 1, 2],
[3, 4, 5]])
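Two properties described above are easy to confirm: the input array is copied rather than modified, and a multi-dimensional array is shuffled only along its first index, so rows stay intact. A sketch with an arbitrary seed:

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed, arbitrary
arr = np.arange(9).reshape((3, 3))
original = arr.copy()

shuffled = rng.permutation(arr)

# The input is left untouched; only the order of the rows changes
print((arr == original).all())
print(sorted(map(tuple, shuffled)) == sorted(map(tuple, arr)))

# An integer argument permutes np.arange(x)
perm = rng.permutation(10)
print(sorted(perm) == list(range(10)))
```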
- poisson(...)
- poisson(lam=1.0, size=None)
Draw samples from a Poisson distribution.
The Poisson distribution is the limit of the Binomial
distribution for large N.
Parameters
----------
lam : float or sequence of float
Expectation of interval, should be >= 0. A sequence of expectation
intervals must be broadcastable over the requested size.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Notes
-----
The Poisson distribution
.. math:: f(k; \lambda)=\frac{\lambda^k e^{-\lambda}}{k!}
For events with an expected separation :math:`\lambda` the Poisson
distribution :math:`f(k; \lambda)` describes the probability of
:math:`k` events occurring within the observed interval :math:`\lambda`.
Because the output is limited to the range of the C long type, a
ValueError is raised when `lam` is within 10 sigma of the maximum
representable value.
References
----------
.. [1] Weisstein, Eric W. "Poisson Distribution." From MathWorld--A Wolfram
Web Resource. http://mathworld.wolfram.com/PoissonDistribution.html
.. [2] Wikipedia, "Poisson distribution",
http://en.wikipedia.org/wiki/Poisson_distribution
Examples
--------
Draw samples from the distribution:
>>> import numpy as np
>>> s = np.random.poisson(5, 10000)
Display histogram of the sample:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 14, normed=True)
>>> plt.show()
Draw 100 values each for lambda 100 and 500:
>>> s = np.random.poisson(lam=(100., 500.), size=(100, 2))
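As a quick sanity check on the examples above, the sketch below relies on the fact that a Poisson distribution has mean and variance both equal to `lam`. The local `RandomState` and the seed value 0 are assumptions made only for repeatability.

```python
import numpy as np

# Local generator with an arbitrary seed (an assumption, for repeatability).
rng = np.random.RandomState(0)

lam = 5.0
s = rng.poisson(lam, 100000)

# For a Poisson distribution, mean and variance both equal lam.
sample_mean = s.mean()
sample_var = s.var()
print(sample_mean, sample_var)

# Broadcasting: column 0 uses lam=100, column 1 uses lam=500.
pair = rng.poisson(lam=(100., 500.), size=(100, 2))
print(pair.mean(axis=0))
```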
- power(...)
- power(a, size=None)
Draws samples in [0, 1] from a power distribution with positive
exponent a - 1.
Also known as the power function distribution.
Parameters
----------
a : float
parameter, > 0
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : {ndarray, scalar}
The returned samples lie in [0, 1].
Raises
------
ValueError
If a <= 0.
Notes
-----
The probability density function is
.. math:: P(x; a) = ax^{a-1}, 0 \le x \le 1, a>0.
The power function distribution is just the inverse of the Pareto
distribution. It may also be seen as a special case of the Beta
distribution.
It is used, for example, in modeling the over-reporting of insurance
claims.
References
----------
.. [1] Christian Kleiber, Samuel Kotz, "Statistical size distributions
in economics and actuarial sciences", Wiley, 2003.
.. [2] Heckert, N. A. and Filliben, James J. (2003). NIST Handbook 148:
Dataplot Reference Manual, Volume 2: Let Subcommands and Library
Functions", National Institute of Standards and Technology Handbook
Series, June 2003.
http://www.itl.nist.gov/div898/software/dataplot/refman2/auxillar/powpdf.pdf
Examples
--------
Draw samples from the distribution:
>>> a = 5. # shape
>>> samples = 1000
>>> s = np.random.power(a, samples)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, bins=30)
>>> x = np.linspace(0, 1, 100)
>>> y = a*x**(a-1.)
>>> normed_y = samples*np.diff(bins)[0]*y
>>> plt.plot(x, normed_y)
>>> plt.show()
Compare the power function distribution to the inverse of the Pareto.
>>> from scipy import stats
>>> rvs = np.random.power(5, 1000000)
>>> rvsp = np.random.pareto(5, 1000000)
>>> xx = np.linspace(0,1,100)
>>> powpdf = stats.powerlaw.pdf(xx,5)
>>> plt.figure()
>>> plt.hist(rvs, bins=50, normed=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('np.random.power(5)')
>>> plt.figure()
>>> plt.hist(1./(1.+rvsp), bins=50, normed=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('inverse of 1 + np.random.pareto(5)')
>>> plt.figure()
>>> plt.hist(1./(1.+rvsp), bins=50, normed=True)
>>> plt.plot(xx,powpdf,'r-')
>>> plt.title('inverse of stats.pareto(5)')
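The relationships in the Notes can also be checked numerically without plotting. The sketch below assumes a local `RandomState` with an arbitrary seed. It draws power-distributed values three ways: directly, by inverse transform (the CDF of ``a*x**(a-1)`` on [0, 1] is ``x**a``), and as the inverse of ``1 + pareto(a)``. It then compares each sample mean with the theoretical mean ``a / (a + 1)``.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability
a = 5.0
n = 200000

# Direct draw from the library routine.
direct = rng.power(a, n)

# Inverse-transform draw: U**(1/a) with U uniform on [0, 1).
u = rng.random_sample(n)
inverse = u ** (1.0 / a)

# Inverse of (1 + pareto), as in the comparison plots above.
inv_pareto = 1.0 / (1.0 + rng.pareto(a, n))

expected_mean = a / (a + 1.0)
print(direct.mean(), inverse.mean(), inv_pareto.mean(), expected_mean)
```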
- promote_types(...)
- promote_types(type1, type2)
Returns the data type with the smallest size and smallest scalar
kind to which both ``type1`` and ``type2`` may be safely cast.
The returned data type is always in native byte order.
This function is symmetric and associative.
Parameters
----------
type1 : dtype or dtype specifier
First data type.
type2 : dtype or dtype specifier
Second data type.
Returns
-------
out : dtype
The promoted data type.
Notes
-----
.. versionadded:: 1.6.0
Starting in NumPy 1.9, `promote_types` returns a valid string
length when given an integer or float dtype as one argument and a string
dtype as another argument. Previously it always returned the input string
dtype, even if it wasn't long enough to store the max integer/float value
converted to a string.
See Also
--------
result_type, dtype, can_cast
Examples
--------
>>> np.promote_types('f4', 'f8')
dtype('float64')
>>> np.promote_types('i8', 'f4')
dtype('float64')
>>> np.promote_types('>i8', '<c8')
dtype('complex128')
>>> np.promote_types('i4', 'S8')
dtype('S11')
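A short sketch of two properties stated above, symmetry and native byte order of the result. The variable names are illustrative only.

```python
import numpy as np

# Symmetry: argument order does not change the result.
p1 = np.promote_types('i8', 'f4')
p2 = np.promote_types('f4', 'i8')

# The result is in native byte order even for non-native inputs.
p3 = np.promote_types('>i8', '<c8')

# Both inputs can be safely cast to the promoted type.
safe = np.can_cast('i8', p1) and np.can_cast('f4', p1)

print(p1, p2, p3, p3.isnative, safe)
```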
- putmask(...)
- putmask(a, mask, values)
Changes elements of an array based on conditional and input values.
Sets ``a.flat[n] = values[n]`` for each n where ``mask.flat[n]==True``.
If `values` is not the same size as `a` and `mask` then it will repeat.
This gives behavior different from ``a[mask] = values``.
Parameters
----------
a : array_like
Target array.
mask : array_like
Boolean mask array. It has to be the same shape as `a`.
values : array_like
Values to put into `a` where `mask` is True. If `values` is smaller
than `a` it will be repeated.
See Also
--------
place, put, take, copyto
Examples
--------
>>> x = np.arange(6).reshape(2, 3)
>>> np.putmask(x, x>2, x**2)
>>> x
array([[ 0, 1, 2],
[ 9, 16, 25]])
If `values` is smaller than `a` it is repeated:
>>> x = np.arange(5)
>>> np.putmask(x, x>1, [-33, -44])
>>> x
array([ 0, 1, -33, -44, -33])
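The difference from ``a[mask] = values`` can be sketched as follows. The manual equivalent using `np.resize` is an illustration of the rule ``a.flat[n] = values[n]`` with repetition, not the library's implementation.

```python
import numpy as np

values = [-33, -44]

# putmask indexes `values` by position in `a`, repeating it as needed.
x = np.arange(5)
np.putmask(x, x > 1, values)

# Manual equivalent: repeat `values` to the size of `a`, then select by mask.
y = np.arange(5)
mask = y > 1
tiled = np.resize(values, y.shape)
y[mask] = tiled[mask]

# Plain boolean assignment needs one value per True entry instead.
z = np.arange(5)
try:
    z[z > 1] = values        # 3 slots, 2 values
    raised = False
except ValueError:
    raised = True

print(x, y, raised)
```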
- rand(...)
- rand(d0, d1, ..., dn)
Random values in a given shape.
Create an array of the given shape and populate it with
random samples from a uniform distribution
over ``[0, 1)``.
Parameters
----------
d0, d1, ..., dn : int, optional
The dimensions of the returned array, should all be positive.
If no argument is given a single Python float is returned.
Returns
-------
out : ndarray, shape ``(d0, d1, ..., dn)``
Random values.
See Also
--------
random
Notes
-----
This is a convenience function. If you want an interface that
takes a shape-tuple as the first argument, refer to
`np.random.random_sample`.
Examples
--------
>>> np.random.rand(3,2)
array([[ 0.14022471, 0.96360618], #random
[ 0.37601032, 0.25528411], #random
[ 0.49313049, 0.94909878]]) #random
- randint(...)
- randint(low, high=None, size=None)
Return random integers from `low` (inclusive) to `high` (exclusive).
Return random integers from the "discrete uniform" distribution in the
"half-open" interval [`low`, `high`). If `high` is None (the default),
then results are from [0, `low`).
Parameters
----------
low : int
Lowest (signed) integer to be drawn from the distribution (unless
``high=None``, in which case this parameter is the highest such
integer).
high : int, optional
If provided, one above the largest (signed) integer to be drawn
from the distribution (see above for behavior if ``high=None``).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : int or ndarray of ints
`size`-shaped array of random integers from the appropriate
distribution, or a single such random int if `size` not provided.
See Also
--------
random.random_integers : similar to `randint`, only for the closed
interval [`low`, `high`], and 1 is the lowest value if `high` is
omitted. In particular, this other one is the one to use to generate
uniformly distributed discrete non-integers.
Examples
--------
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Generate a 2 x 4 array of ints between 0 and 4, inclusive:
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
[3, 2, 2, 0]])
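The half-open interval can be confirmed empirically. This is a sketch with an assumed seed; the dice example shows how to include the upper bound by passing ``high + 1``.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

draws = rng.randint(2, 5, size=10000)    # values from [2, 5)
one_arg = rng.randint(5, size=10000)     # values from [0, 5)
dice = rng.randint(1, 6 + 1, size=10000) # closed range 1..6 via high + 1

print(np.unique(draws), np.unique(one_arg), np.unique(dice))
```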
- randn(...)
- randn(d0, d1, ..., dn)
Return a sample (or samples) from the "standard normal" distribution.
If positive, int_like or int-convertible arguments are provided,
`randn` generates an array of shape ``(d0, d1, ..., dn)``, filled
with random floats sampled from a univariate "normal" (Gaussian)
distribution of mean 0 and variance 1 (if any of the :math:`d_i` are
floats, they are first converted to integers by truncation). A single
float randomly sampled from the distribution is returned if no
argument is provided.
This is a convenience function. If you want an interface that takes a
tuple as the first argument, use `numpy.random.standard_normal` instead.
Parameters
----------
d0, d1, ..., dn : int, optional
The dimensions of the returned array, should be all positive.
If no argument is given a single Python float is returned.
Returns
-------
Z : ndarray or float
A ``(d0, d1, ..., dn)``-shaped array of floating-point samples from
the standard normal distribution, or a single such float if
no parameters were supplied.
See Also
--------
random.standard_normal : Similar, but takes a tuple as its argument.
Notes
-----
For random samples from :math:`N(\mu, \sigma^2)`, use:
``sigma * np.random.randn(...) + mu``
Examples
--------
>>> np.random.randn()
2.1923875335537315 #random
Two-by-four array of samples from N(3, 6.25):
>>> 2.5 * np.random.randn(2, 4) + 3
array([[-4.49401501, 4.00950034, -1.81814867, 7.29718677], #random
[ 0.39924804, 4.68456316, 4.99394529, 4.84057254]]) #random
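A minimal check of the scaling rule from the Notes, assuming a seeded local generator: samples built as ``sigma * randn(...) + mu`` should have mean near ``mu`` and variance near ``sigma**2``.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

mu, sigma = 3.0, 2.5
z = sigma * rng.randn(200000) + mu   # samples from N(3, 6.25)

print(z.mean(), z.var())
```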
- random = random_sample(...)
- random_sample(size=None)
Return random floats in the half-open interval [0.0, 1.0).
Results are from the "continuous uniform" distribution over the
stated interval. To sample :math:`Unif[a, b), b > a` multiply
the output of `random_sample` by `(b-a)` and add `a`::
(b - a) * random_sample() + a
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray of floats
Array of random floats of shape `size` (unless ``size=None``, in which
case a single float is returned).
Examples
--------
>>> np.random.random_sample()
0.47108547995356098
>>> type(np.random.random_sample())
<type 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984],
[-2.99091858, -0.79479508],
[-1.23204345, -1.75224494]])
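The rescaling to ``Unif[a, b)`` can be sketched as below. The bounds and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

a, b = -5.0, 0.0
u = (b - a) * rng.random_sample(100000) + a

# All values fall in [a, b); the mean is near the midpoint (a + b) / 2.
print(u.min(), u.max(), u.mean())
```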
- random_integers(...)
- random_integers(low, high=None, size=None)
Return random integers between `low` and `high`, inclusive.
Return random integers from the "discrete uniform" distribution in the
closed interval [`low`, `high`]. If `high` is None (the default),
then results are from [1, `low`].
Parameters
----------
low : int
Lowest (signed) integer to be drawn from the distribution (unless
``high=None``, in which case this parameter is the highest such
integer).
high : int, optional
If provided, the largest (signed) integer to be drawn from the
distribution (see above for behavior if ``high=None``).
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : int or ndarray of ints
`size`-shaped array of random integers from the appropriate
distribution, or a single such random int if `size` not provided.
See Also
--------
random.randint : Similar to `random_integers`, only for the half-open
interval [`low`, `high`), and 0 is the lowest value if `high` is
omitted.
Notes
-----
To sample from N evenly spaced floating-point numbers between a and b,
use::
a + (b - a) * (np.random.random_integers(N) - 1) / (N - 1.)
Examples
--------
>>> np.random.random_integers(5)
4
>>> type(np.random.random_integers(5))
<type 'int'>
>>> np.random.random_integers(5, size=(3, 2))
array([[5, 4],
[3, 3],
[4, 5]])
Choose five random numbers from the set of five evenly-spaced
numbers between 0 and 2.5, inclusive (i.e., from the set
:math:`\{0, 5/8, 10/8, 15/8, 20/8\}`):
>>> 2.5 * (np.random.random_integers(5, size=(5,)) - 1) / 4.
array([ 0.625, 1.25 , 0.625, 0.625, 2.5 ])
Roll two six sided dice 1000 times and sum the results:
>>> d1 = np.random.random_integers(1, 6, 1000)
>>> d2 = np.random.random_integers(1, 6, 1000)
>>> dsums = d1 + d2
Display results as a histogram:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(dsums, 11, normed=True)
>>> plt.show()
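The formula from the Notes for sampling N evenly spaced floats can be checked against `np.linspace`. This sketch substitutes ``randint(1, N + 1)`` for ``random_integers(N)``; both cover the closed interval [1, N]. The seed is an assumption.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

a, b, N = 0.0, 2.5, 5
k = rng.randint(1, N + 1, size=1000)          # integers in [1, N]
vals = a + (b - a) * (k - 1) / (N - 1.)

# Every sampled value should coincide with a point of the even grid.
grid = np.linspace(a, b, N)
on_grid = np.isclose(vals[:, None], grid[None, :]).any(axis=1)
print(grid, on_grid.all())
```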
- random_sample(...)
- random_sample(size=None)
Return random floats in the half-open interval [0.0, 1.0).
Results are from the "continuous uniform" distribution over the
stated interval. To sample :math:`Unif[a, b), b > a` multiply
the output of `random_sample` by `(b-a)` and add `a`::
(b - a) * random_sample() + a
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray of floats
Array of random floats of shape `size` (unless ``size=None``, in which
case a single float is returned).
Examples
--------
>>> np.random.random_sample()
0.47108547995356098
>>> type(np.random.random_sample())
<type 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984],
[-2.99091858, -0.79479508],
[-1.23204345, -1.75224494]])
- ranf = random_sample(...)
- random_sample(size=None)
Return random floats in the half-open interval [0.0, 1.0).
Results are from the "continuous uniform" distribution over the
stated interval. To sample :math:`Unif[a, b), b > a` multiply
the output of `random_sample` by `(b-a)` and add `a`::
(b - a) * random_sample() + a
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray of floats
Array of random floats of shape `size` (unless ``size=None``, in which
case a single float is returned).
Examples
--------
>>> np.random.random_sample()
0.47108547995356098
>>> type(np.random.random_sample())
<type 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984],
[-2.99091858, -0.79479508],
[-1.23204345, -1.75224494]])
- ravel_multi_index(...)
- ravel_multi_index(multi_index, dims, mode='raise', order='C')
Converts a tuple of index arrays into an array of flat
indices, applying boundary modes to the multi-index.
Parameters
----------
multi_index : tuple of array_like
A tuple of integer arrays, one array for each dimension.
dims : tuple of ints
The shape of array into which the indices from ``multi_index`` apply.
mode : {'raise', 'wrap', 'clip'}, optional
Specifies how out-of-bounds indices are handled. Can specify
either one mode or a tuple of modes, one mode per index.
* 'raise' -- raise an error (default)
* 'wrap' -- wrap around
* 'clip' -- clip to the range
In 'clip' mode, a negative index which would normally
wrap will clip to 0 instead.
order : {'C', 'F'}, optional
Determines whether the multi-index should be viewed as indexing in
C (row-major) order or FORTRAN (column-major) order.
Returns
-------
raveled_indices : ndarray
An array of indices into the flattened version of an array
of dimensions ``dims``.
See Also
--------
unravel_index
Notes
-----
.. versionadded:: 1.6.0
Examples
--------
>>> arr = np.array([[3,6,6],[4,5,1]])
>>> np.ravel_multi_index(arr, (7,6))
array([22, 41, 37])
>>> np.ravel_multi_index(arr, (7,6), order='F')
array([31, 41, 13])
>>> np.ravel_multi_index(arr, (4,6), mode='clip')
array([22, 23, 19])
>>> np.ravel_multi_index(arr, (4,4), mode=('clip','wrap'))
array([12, 13, 13])
>>> np.ravel_multi_index((3,1,4,1), (6,7,8,9))
1621
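For C order, the flat index is a Horner-style accumulation over the dimensions. The helper `ravel_c` below is a hypothetical illustration for in-bounds scalar indices, not NumPy's implementation.

```python
import numpy as np

def ravel_c(multi_index, dims):
    """Flat index for C (row-major) order, for in-bounds scalar indices."""
    flat = 0
    for idx, dim in zip(multi_index, dims):
        flat = flat * dim + idx
    return flat

dims = (6, 7, 8, 9)
flat = ravel_c((3, 1, 4, 1), dims)
same = np.ravel_multi_index((3, 1, 4, 1), dims)

# unravel_index is the inverse operation.
back = tuple(int(i) for i in np.unravel_index(flat, dims))
print(flat, same, back)
```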
- rayleigh(...)
- rayleigh(scale=1.0, size=None)
Draw samples from a Rayleigh distribution.
The :math:`\chi` and Weibull distributions are generalizations of the
Rayleigh.
Parameters
----------
scale : scalar
Scale, also equals the mode. Should be >= 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Notes
-----
The probability density function for the Rayleigh distribution is
.. math:: P(x;scale) = \frac{x}{scale^2}e^{\frac{-x^2}{2 \cdotp scale^2}}
The Rayleigh distribution arises, for example, if the East and North
components of the wind velocity are independent, identically distributed
zero-mean Gaussian variables; the wind speed then follows a Rayleigh
distribution. The Rayleigh distribution is used to model the expected
output from wind turbines.
References
----------
.. [1] Brighton Webs Ltd., Rayleigh Distribution,
http://www.brighton-webs.co.uk/distributions/rayleigh.asp
.. [2] Wikipedia, "Rayleigh distribution"
http://en.wikipedia.org/wiki/Rayleigh_distribution
Examples
--------
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt
>>> values = plt.hist(np.random.rayleigh(3, 100000), bins=200, normed=True)
Wave heights tend to follow a Rayleigh distribution. If the mean wave
height is 1 meter, what fraction of waves are likely to be larger than 3
meters?
>>> meanvalue = 1
>>> modevalue = np.sqrt(2 / np.pi) * meanvalue
>>> s = np.random.rayleigh(modevalue, 1000000)
The percentage of waves larger than 3 meters is:
>>> 100.*sum(s>3)/1000000.
0.087300000000000003
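Two facts used above can be checked in a short sketch, assuming a seeded local generator. First, the magnitude of a 2-D vector with independent zero-mean Gaussian components is Rayleigh distributed, with mean ``scale * sqrt(pi / 2)``. Second, the wave-height percentage has a closed form, since the Rayleigh survival function is ``exp(-x**2 / (2 * scale**2))``.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

scale = 3.0
n = 200000

# Magnitude of a 2-D vector with independent Gaussian components.
east = rng.normal(0.0, scale, n)
north = rng.normal(0.0, scale, n)
speed = np.hypot(east, north)

direct = rng.rayleigh(scale, n)
expected_mean = scale * np.sqrt(np.pi / 2)
print(speed.mean(), direct.mean(), expected_mean)

# Exact percentage of waves above 3 m when the mean height is 1 m.
modevalue = np.sqrt(2 / np.pi) * 1.0
exact_pct = 100 * np.exp(-3.0 ** 2 / (2 * modevalue ** 2))
print(exact_pct)
```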
- restoredot(...)
- Restore `dot`, `vdot`, and `innerproduct` to the default non-BLAS
implementations.
Typically, the user will only need to call this when troubleshooting an
installation problem, reproducing the conditions of a build without an
accelerated BLAS, or when being very careful about benchmarking linear
algebra operations.
See Also
--------
alterdot : `restoredot` undoes the effects of `alterdot`.
- result_type(...)
- result_type(*arrays_and_dtypes)
Returns the type that results from applying the NumPy
type promotion rules to the arguments.
Type promotion in NumPy works similarly to the rules in languages
like C++, with some slight differences. When both scalars and
arrays are used, the array's type takes precedence and the actual value
of the scalar is taken into account.
For example, calculating 3*a, where a is an array of 32-bit floats,
intuitively should result in a 32-bit float output. If the 3 is a
32-bit integer, the NumPy rules indicate it can't convert losslessly
into a 32-bit float, so a 64-bit float should be the result type.
By examining the value of the constant, '3', we see that it fits in
an 8-bit integer, which can be cast losslessly into the 32-bit float.
Parameters
----------
arrays_and_dtypes : list of arrays and dtypes
The operands of some operation whose result type is needed.
Returns
-------
out : dtype
The result type.
See Also
--------
dtype, promote_types, min_scalar_type, can_cast
Notes
-----
.. versionadded:: 1.6.0
The specific algorithm used is as follows.
Categories are determined by first checking which of boolean,
integer (int/uint), or floating point (float/complex) the maximum
kind of all the arrays and the scalars are.
If there are only scalars or the maximum category of the scalars
is higher than the maximum category of the arrays,
the data types are combined with :func:`promote_types`
to produce the return value.
Otherwise, `min_scalar_type` is called on each array, and
the resulting data types are all combined with :func:`promote_types`
to produce the return value.
The set of int values is not a subset of the uint values for types
with the same number of bits, something not reflected in
:func:`min_scalar_type`, but handled as a special case in `result_type`.
Examples
--------
>>> np.result_type(3, np.arange(7, dtype='i1'))
dtype('int8')
>>> np.result_type('i4', 'c8')
dtype('complex128')
>>> np.result_type(3.0, -2)
dtype('float64')
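The ``3*a`` case described above can be reproduced directly. This is a small sketch: the Python scalar 3 does not force an upcast of a float32 array, whereas combining the int32 and float32 dtypes themselves promotes to float64.

```python
import numpy as np

a = np.ones(4, dtype=np.float32)

# A Python scalar combined with a float32 array keeps float32.
r1 = (3 * a).dtype

# Two dtypes with no values involved are promoted as with promote_types.
r2 = np.result_type(np.int32, np.float32)
r3 = np.result_type('i4', 'c8')

print(r1, r2, r3)
```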
- run(*args)
- Creates a report of all existing long term fit to HST predictive data.
This report calculates the ascending node for the fit and the given
test date using a constant rate (rate * t + <starting point>).
Usage:
do list_hst_orbit_fits test_date [-output=<output_filename>
-plot=<output_filename_for_plot> -nohead]
For example:
do list_hst_orbit_fits 2006.154 -out=myoutput -plot=myplot.ps -nohead
test_date (YYYY.DDD) is the date for which the ascending node will be
calculated
-output, if included, writes the output to the named file. The default
is to write the output to a file called list_hst_orbit_fits.rpt in
the current working directory.
-plot, if included, causes a plot of the semimajor axis vs time to be
produced. The value of -plot is the filename of the plot. PostScript
works if the file suffix is '.ps'. Other formats have not been tested.
-nohead, if included, causes no headers to be written in the output.
This option is useful if the output is to be ingested into Excel.
- sample = random_sample(...)
- random_sample(size=None)
Return random floats in the half-open interval [0.0, 1.0).
Results are from the "continuous uniform" distribution over the
stated interval. To sample :math:`Unif[a, b), b > a` multiply
the output of `random_sample` by `(b-a)` and add `a`::
(b - a) * random_sample() + a
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray of floats
Array of random floats of shape `size` (unless ``size=None``, in which
case a single float is returned).
Examples
--------
>>> np.random.random_sample()
0.47108547995356098
>>> type(np.random.random_sample())
<type 'float'>
>>> np.random.random_sample((5,))
array([ 0.30220482, 0.86820401, 0.1654503 , 0.11659149, 0.54323428])
Three-by-two array of random numbers from [-5, 0):
>>> 5 * np.random.random_sample((3, 2)) - 5
array([[-3.99149989, -0.52338984],
[-2.99091858, -0.79479508],
[-1.23204345, -1.75224494]])
- seed(...)
- seed(seed=None)
Seed the generator.
This method is called when `RandomState` is initialized. It can be
called again to re-seed the generator. For details, see `RandomState`.
Parameters
----------
seed : int or array_like, optional
Seed for `RandomState`.
Must be convertible to 32 bit unsigned integers.
See Also
--------
RandomState
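A minimal sketch of re-seeding: calling `seed` with the same value restarts the same sequence. The seed value 12345 is arbitrary.

```python
import numpy as np

rs = np.random.RandomState(12345)
first = rs.random_sample(3)

# Re-seeding with the same value restarts the same sequence.
rs.seed(12345)
second = rs.random_sample(3)

print(first, second)
```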
- set_numeric_ops(...)
- set_numeric_ops(op1=func1, op2=func2, ...)
Set numerical operators for array objects.
Parameters
----------
op1, op2, ... : callable
Each ``op = func`` pair describes an operator to be replaced.
For example, ``add = lambda x, y: np.add(x, y) % 5`` would replace
addition by modulus 5 addition.
Returns
-------
saved_ops : list of callables
A list of all operators, stored before making replacements.
Notes
-----
.. WARNING::
Use with care! Incorrect usage may lead to memory errors.
A function replacing an operator cannot make use of that operator.
For example, when replacing add, you may not use ``+``. Instead,
directly call ufuncs.
Examples
--------
>>> def add_mod5(x, y):
... return np.add(x, y) % 5
...
>>> old_funcs = np.set_numeric_ops(add=add_mod5)
>>> x = np.arange(12).reshape((3, 4))
>>> x + x
array([[0, 2, 4, 1],
[3, 0, 2, 4],
[1, 3, 0, 2]])
>>> ignore = np.set_numeric_ops(**old_funcs) # restore operators
- set_state(...)
- set_state(state)
Set the internal state of the generator from a tuple.
For use if one has reason to manually (re-)set the internal state of the
"Mersenne Twister"[1]_ pseudo-random number generating algorithm.
Parameters
----------
state : tuple(str, ndarray of 624 uints, int, int, float)
The `state` tuple has the following items:
1. the string 'MT19937', specifying the Mersenne Twister algorithm.
2. a 1-D array of 624 unsigned integers ``keys``.
3. an integer ``pos``.
4. an integer ``has_gauss``.
5. a float ``cached_gaussian``.
Returns
-------
out : None
Returns 'None' on success.
See Also
--------
get_state
Notes
-----
`set_state` and `get_state` are not needed to work with any of the
random distributions in NumPy. If the internal state is manually altered,
the user should know exactly what they are doing.
For backwards compatibility, the form (str, array of 624 uints, int) is
also accepted although it is missing some information about the cached
Gaussian value: ``state = ('MT19937', keys, pos)``.
References
----------
.. [1] M. Matsumoto and T. Nishimura, "Mersenne Twister: A
623-dimensionally equidistributed uniform pseudorandom number
generator," ACM Trans. on Modeling and Computer Simulation,
Vol. 8, No. 1, pp. 3-30, Jan. 1998.
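A sketch of a save-and-restore round trip with `get_state` and `set_state`, assuming a local `RandomState` with an arbitrary seed. Restoring the saved tuple replays the same draws.

```python
import numpy as np

rs = np.random.RandomState(0)   # arbitrary seed, assumed for repeatability

state = rs.get_state()          # ('MT19937', keys, pos, has_gauss, cached_gaussian)
a = rs.standard_normal(4)

# Restoring the saved state replays exactly the same draws.
rs.set_state(state)
b = rs.standard_normal(4)

print(state[0], len(state[1]), a, b)
```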
- seterrobj(...)
- seterrobj(errobj)
Set the object that defines floating-point error handling.
The error object contains all information that defines the error handling
behavior in Numpy. `seterrobj` is used internally by the other
functions that set error handling behavior (`seterr`, `seterrcall`).
Parameters
----------
errobj : list
The error object, a list containing three elements:
[internal numpy buffer size, error mask, error callback function].
The error mask is a single integer that holds the treatment information
on all four floating point errors. The information for each error type
is contained in three bits of the integer. If we print it in base 8, we
can see what treatment is set for "invalid", "under", "over", and
"divide" (in that order). The printed string can be interpreted with
* 0 : 'ignore'
* 1 : 'warn'
* 2 : 'raise'
* 3 : 'call'
* 4 : 'print'
* 5 : 'log'
See Also
--------
geterrobj, seterr, geterr, seterrcall, geterrcall
getbufsize, setbufsize
Notes
-----
For complete documentation of the types of floating-point exceptions and
treatment options, see `seterr`.
Examples
--------
>>> old_errobj = np.geterrobj() # first get the defaults
>>> old_errobj
[10000, 0, None]
>>> def err_handler(type, flag):
... print "Floating point error (%s), with flag %s" % (type, flag)
...
>>> new_errobj = [20000, 12, err_handler]
>>> np.seterrobj(new_errobj)
>>> np.base_repr(12, 8) # int for divide=4 ('print') and over=1 ('warn')
'14'
>>> np.geterr()
{'over': 'warn', 'divide': 'print', 'invalid': 'ignore', 'under': 'ignore'}
>>> np.geterrcall() is err_handler
True
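The error mask can be decoded three bits at a time. The helper `decode_error_mask` below is hypothetical, and its bit layout (with "divide" in the lowest three bits) is inferred from the example above, where mask 12 prints as '14' in base 8.

```python
# Treatment codes, indexed by the 3-bit value listed above.
TREATMENTS = ['ignore', 'warn', 'raise', 'call', 'print', 'log']

# Error types from the lowest three bits upward (assumed from the example).
ERROR_TYPES = ['divide', 'over', 'under', 'invalid']

def decode_error_mask(mask):
    """Map an integer error mask to a dict like the one geterr() returns."""
    settings = {}
    for position, name in enumerate(ERROR_TYPES):
        code = (mask >> (3 * position)) & 0b111
        settings[name] = TREATMENTS[code]
    return settings

decoded = decode_error_mask(12)
print(decoded)
```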
- shuffle(...)
- shuffle(x)
Modify a sequence in-place by shuffling its contents.
Parameters
----------
x : array_like
The array or list to be shuffled.
Returns
-------
None
Examples
--------
>>> arr = np.arange(10)
>>> np.random.shuffle(arr)
>>> arr
array([1, 7, 5, 2, 9, 4, 3, 6, 0, 8])
This function only shuffles the array along the first index of a
multi-dimensional array:
>>> arr = np.arange(9).reshape((3, 3))
>>> np.random.shuffle(arr)
>>> arr
array([[3, 4, 5],
[6, 7, 8],
[0, 1, 2]])
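A small check of the two points above, assuming a seeded local generator: `shuffle` returns None and reorders whole rows, leaving the contents of each row intact.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability

arr = np.arange(9).reshape((3, 3))
result = rng.shuffle(arr)       # shuffles in place, returns None

# Sorting the rows recovers the original array: rows moved, not elements.
rows = sorted(arr.tolist())
print(result, arr, rows)
```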
- standard_cauchy(...)
- standard_cauchy(size=None)
Standard Cauchy distribution with mode = 0.
Also known as the Lorentz distribution.
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
The drawn samples.
Notes
-----
The probability density function for the full Cauchy distribution is
.. math:: P(x; x_0, \gamma) = \frac{1}{\pi \gamma \bigl[ 1+
(\frac{x-x_0}{\gamma})^2 \bigr] }
and the Standard Cauchy distribution just sets :math:`x_0=0` and
:math:`\gamma=1`.
The Cauchy distribution arises in the solution to the driven harmonic
oscillator problem, and also describes spectral line broadening. It
also describes the distribution of values at which a line tilted at
a random angle will cut the x axis.
When studying hypothesis tests that assume normality, seeing how the
tests perform on data from a Cauchy distribution is a good indicator of
their sensitivity to a heavy-tailed distribution, since the Cauchy looks
very much like a Gaussian distribution, but with heavier tails.
References
----------
.. [1] NIST/SEMATECH e-Handbook of Statistical Methods, "Cauchy
Distribution",
http://www.itl.nist.gov/div898/handbook/eda/section3/eda3663.htm
.. [2] Weisstein, Eric W. "Cauchy Distribution." From MathWorld--A
Wolfram Web Resource.
http://mathworld.wolfram.com/CauchyDistribution.html
.. [3] Wikipedia, "Cauchy distribution"
http://en.wikipedia.org/wiki/Cauchy_distribution
Examples
--------
Draw samples and plot the distribution:
>>> s = np.random.standard_cauchy(1000000)
>>> s = s[(s>-25) & (s<25)] # truncate distribution so it plots well
>>> import matplotlib.pyplot as plt
>>> plt.hist(s, bins=100)
>>> plt.show()
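The "line tilted at a random angle" description can be sketched numerically. A line through the point (0, 1), tilted from the vertical by an angle uniform on (-pi/2, pi/2), cuts the x axis at ``tan(angle)``. The quartiles of a standard Cauchy distribution are -1, 0 and 1, which avoids relying on the undefined mean. The seed is an assumption.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability
n = 200000

# Intercepts of randomly tilted lines through (0, 1).
angle = np.pi * (rng.random_sample(n) - 0.5)
via_angle = np.tan(angle)

direct = rng.standard_cauchy(n)

q_angle = np.percentile(via_angle, [25, 50, 75])
q_direct = np.percentile(direct, [25, 50, 75])
print(q_angle, q_direct)
```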
- standard_exponential(...)
- standard_exponential(size=None)
Draw samples from the standard exponential distribution.
`standard_exponential` is identical to the exponential distribution
with a scale parameter of 1.
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray
Drawn samples.
Examples
--------
Output a 3x8000 array:
>>> n = np.random.standard_exponential((3, 8000))
- standard_gamma(...)
- standard_gamma(shape, size=None)
Draw samples from a Standard Gamma distribution.
Samples are drawn from a Gamma distribution with specified parameters,
shape (sometimes designated "k") and scale=1.
Parameters
----------
shape : float
Parameter, should be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
The drawn samples.
See Also
--------
scipy.stats.distributions.gamma : probability density function,
distribution or cumulative density function, etc.
Notes
-----
The probability density for the Gamma distribution is
.. math:: p(x) = x^{k-1}\frac{e^{-x/\theta}}{\theta^k\Gamma(k)},
where :math:`k` is the shape and :math:`\theta` the scale,
and :math:`\Gamma` is the Gamma function.
The Gamma distribution is often used to model the times to failure of
electronic components, and arises naturally in processes for which the
waiting times between Poisson distributed events are relevant.
References
----------
.. [1] Weisstein, Eric W. "Gamma Distribution." From MathWorld--A
Wolfram Web Resource.
http://mathworld.wolfram.com/GammaDistribution.html
.. [2] Wikipedia, "Gamma-distribution",
http://en.wikipedia.org/wiki/Gamma-distribution
Examples
--------
Draw samples from the distribution:
>>> shape, scale = 2., 1. # mean and width
>>> s = np.random.standard_gamma(shape, 1000000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
>>> count, bins, ignored = plt.hist(s, 50, normed=True)
>>> y = bins**(shape-1) * ((np.exp(-bins/scale))/ \
... (sps.gamma(shape) * scale**shape))
>>> plt.plot(bins, y, linewidth=2, color='r')
>>> plt.show()
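The waiting-time interpretation in the Notes can be sketched as follows. For integer shape ``k``, the sum of ``k`` independent standard exponential waiting times follows a standard Gamma distribution with shape ``k``, whose mean and variance both equal ``k``. The seed and sample size are assumptions.

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, assumed for repeatability
k, n = 2, 200000

# Time until the k-th event: sum of k exponential waiting times.
waits = rng.standard_exponential((n, k)).sum(axis=1)

direct = rng.standard_gamma(k, n)

print(waits.mean(), waits.var(), direct.mean(), direct.var())
```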
- standard_normal(...)
- standard_normal(size=None)
Returns samples from a Standard Normal distribution (mean=0, stdev=1).
Parameters
----------
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : float or ndarray
Drawn samples.
Examples
--------
>>> s = np.random.standard_normal(8000)
>>> s
array([ 0.6888893 , 0.78096262, -0.89086505, ..., 0.49876311, #random
-0.38672696, -0.4685006 ]) #random
>>> s.shape
(8000,)
>>> s = np.random.standard_normal(size=(3, 4, 2))
>>> s.shape
(3, 4, 2)
- standard_t(...)
- standard_t(df, size=None)
Standard Student's t distribution with df degrees of freedom.
A special case of the hyperbolic distribution.
As `df` gets large, the result resembles that of the standard normal
distribution (`standard_normal`).
Parameters
----------
df : int
Degrees of freedom, should be > 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
Drawn samples.
Notes
-----
The probability density function for the t distribution is
.. math:: P(x, df) = \frac{\Gamma(\frac{df+1}{2})}{\sqrt{\pi df}
\Gamma(\frac{df}{2})}\Bigl( 1+\frac{x^2}{df} \Bigr)^{-(df+1)/2}
The t test is based on an assumption that the data come from a Normal
distribution. The t test provides a way to test whether the sample mean
(that is the mean calculated from the data) is a good estimate of the true
mean.
The derivation of the t-distribution was first published in 1908 by William
Gosset while working for the Guinness Brewery in Dublin. Due to proprietary
issues, he had to publish under a pseudonym, and so he used the name
Student.
References
----------
.. [1] Dalgaard, Peter, "Introductory Statistics With R",
Springer, 2002.
.. [2] Wikipedia, "Student's t-distribution"
http://en.wikipedia.org/wiki/Student's_t-distribution
Examples
--------
From Dalgaard page 83 [1]_, suppose the daily energy intake for 11
women in kJ is:
>>> intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515, \
... 7515, 8230, 8770])
Does their energy intake deviate systematically from the recommended
value of 7725 kJ?
With 11 observations we have 10 degrees of freedom. Is the difference
between the sample mean and the recommended value significant?
>>> s = np.random.standard_t(10, size=100000)
>>> np.mean(intake)
6753.636363636364
>>> intake.std(ddof=1)
1142.1232221373727
Calculate the t statistic, setting the ddof parameter to the unbiased
value so the divisor in the standard deviation will be degrees of
freedom, N-1.
>>> t = (np.mean(intake)-7725)/(intake.std(ddof=1)/np.sqrt(len(intake)))
>>> import matplotlib.pyplot as plt
>>> h = plt.hist(s, bins=100, normed=True)
For a one-sided t-test, how far out in the distribution does the t
statistic appear?
>>> np.sum(s<t) / float(len(s))
0.0090699999999999999 #random
So the p-value is about 0.009: if the null hypothesis were true, a t
statistic this extreme would occur less than 1% of the time, which is
strong evidence against the null hypothesis.
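The one-sided test above can be collected into a single runnable sketch. The seed, the variable names `std_err` and `p_one_sided`, and the use of `RandomState` are assumptions made for reproducibility, not part of the original example:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed, for reproducibility only

intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515,
                   7515, 8230, 8770])
recommended = 7725.0

# t statistic: (sample mean - hypothesised mean) / standard error,
# with ddof=1 so the standard deviation divides by N - 1.
std_err = intake.std(ddof=1) / np.sqrt(len(intake))
t = (intake.mean() - recommended) / std_err

# Monte Carlo estimate of the one-sided p-value with N - 1 degrees of freedom.
df = len(intake) - 1
s = rng.standard_t(df, size=100000)
p_one_sided = np.mean(s < t)
```

A small `p_one_sided` means that a t statistic this far below zero would rarely arise if the true mean were the recommended value.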
- triangular(...)
- triangular(left, mode, right, size=None)
Draw samples from the triangular distribution.
The triangular distribution is a continuous probability distribution with
lower limit left, peak at mode, and upper limit right. Unlike the other
distributions, these parameters directly define the shape of the pdf.
Parameters
----------
left : scalar
Lower limit.
mode : scalar
The value where the peak of the distribution occurs.
The value should fulfill the condition ``left <= mode <= right``.
right : scalar
Upper limit, should be larger than `left`.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
The returned samples all lie in the interval [left, right].
Notes
-----
The probability density function for the Triangular distribution is
.. math:: P(x;l, m, r) = \begin{cases}
\frac{2(x-l)}{(r-l)(m-l)}& \text{for $l \leq x \leq m$},\\
\frac{2(r-x)}{(r-l)(r-m)}& \text{for $m \leq x \leq r$},\\
0& \text{otherwise}.
\end{cases}
The triangular distribution is often used in ill-defined problems where the
underlying distribution is not known, but some knowledge of the limits and
mode exists. Often it is used in simulations.
References
----------
.. [1] Wikipedia, "Triangular distribution"
http://en.wikipedia.org/wiki/Triangular_distribution
Examples
--------
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt
>>> h = plt.hist(np.random.triangular(-3, 0, 8, 100000), bins=200,
... normed=True)
>>> plt.show()
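The density from the Notes can be checked numerically. The helper `triangular_pdf` below is a hypothetical illustration (not a NumPy function) and assumes ``left < mode < right``. The expected mean ``(left + mode + right) / 3`` is a standard property of the distribution rather than something stated above:

```python
import numpy as np

def triangular_pdf(x, left, mode, right):
    """Triangular density; assumes left < mode < right."""
    x = np.asarray(x, dtype=float)
    rising = 2 * (x - left) / ((right - left) * (mode - left))
    falling = 2 * (right - x) / ((right - left) * (right - mode))
    pdf = np.where(x <= mode, rising, falling)
    # The density is zero outside [left, right].
    return np.where((x < left) | (x > right), 0.0, pdf)

left, mode, right = -3.0, 0.0, 8.0

# Riemann sum of the density over its support; should be close to 1.
x = np.linspace(left, right, 100001)
dx = x[1] - x[0]
area = np.sum(triangular_pdf(x, left, mode, right)) * dx

rng = np.random.RandomState(0)  # arbitrary seed
samples = rng.triangular(left, mode, right, 100000)
sample_mean = samples.mean()
expected_mean = (left + mode + right) / 3.0
```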
- uniform(...)
- uniform(low=0.0, high=1.0, size=None)
Draw samples from a uniform distribution.
Samples are uniformly distributed over the half-open interval
``[low, high)`` (includes low, but excludes high). In other words,
any value within the given interval is equally likely to be drawn
by `uniform`.
Parameters
----------
low : float, optional
Lower boundary of the output interval. All values generated will be
greater than or equal to low. The default value is 0.
high : float
Upper boundary of the output interval. All values generated will be
less than high. The default value is 1.0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
out : ndarray
Drawn samples, with shape `size`.
See Also
--------
randint : Discrete uniform distribution, yielding integers.
random_integers : Discrete uniform distribution over the closed
interval ``[low, high]``.
random_sample : Floats uniformly distributed over ``[0, 1)``.
random : Alias for `random_sample`.
rand : Convenience function that accepts dimensions as input, e.g.,
``rand(2,2)`` would generate a 2-by-2 array of floats,
uniformly distributed over ``[0, 1)``.
Notes
-----
The probability density function of the uniform distribution is
.. math:: p(x) = \frac{1}{b - a}
anywhere within the interval ``[a, b)``, and zero elsewhere.
Examples
--------
Draw samples from the distribution:
>>> s = np.random.uniform(-1,0,1000)
All values are within the given interval:
>>> np.all(s >= -1)
True
>>> np.all(s < 0)
True
Display the histogram of the samples, along with the
probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 15, normed=True)
>>> plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
>>> plt.show()
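The same comparison can be made without plotting. This sketch (seed, sample count, and bin count are arbitrary assumptions) compares a normalized histogram with the constant density ``1 / (high - low)``:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
low, high = -1.0, 0.0
s = rng.uniform(low, high, 100000)

# Every sample lies in the half-open interval [low, high).
in_interval = bool(np.all((s >= low) & (s < high)))

# Normalized histogram heights should all be close to 1 / (high - low).
density = 1.0 / (high - low)
heights, edges = np.histogram(s, bins=10, range=(low, high), density=True)
max_dev = np.abs(heights - density).max()
```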
- unpackbits(...)
- unpackbits(myarray, axis=None)
Unpacks elements of a uint8 array into a binary-valued output array.
Each element of `myarray` represents a bit-field that should be unpacked
into a binary-valued output array. The shape of the output array is either
1-D (if `axis` is None) or the same shape as the input array with unpacking
done along the axis specified.
Parameters
----------
myarray : ndarray, uint8 type
Input array.
axis : int, optional
Unpacks along this axis.
Returns
-------
unpacked : ndarray, uint8 type
The elements are binary-valued (0 or 1).
See Also
--------
packbits : Packs the elements of a binary-valued array into bits in a uint8
array.
Examples
--------
>>> a = np.array([[2], [7], [23]], dtype=np.uint8)
>>> a
array([[ 2],
[ 7],
[23]], dtype=uint8)
>>> b = np.unpackbits(a, axis=1)
>>> b
array([[0, 0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 1, 0, 1, 1, 1]], dtype=uint8)
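As a brief sketch of what the unpacked rows mean, each row lists the bits of one input byte with the most significant bit first, so weighting the bits by powers of two recovers the original values, and `packbits` reverses the operation:

```python
import numpy as np

a = np.array([[2], [7], [23]], dtype=np.uint8)
bits = np.unpackbits(a, axis=1)

# Most significant bit first: weights are 128, 64, ..., 2, 1.
weights = 2 ** np.arange(7, -1, -1)
values = (bits * weights).sum(axis=1)

# packbits along the same axis restores the original array.
repacked = np.packbits(bits, axis=1)
```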
- unravel_index(...)
- unravel_index(indices, dims, order='C')
Converts a flat index or array of flat indices into a tuple
of coordinate arrays.
Parameters
----------
indices : array_like
An integer array whose elements are indices into the flattened
version of an array of dimensions ``dims``. Before version 1.6.0,
this function accepted just one index value.
dims : tuple of ints
The shape of the array to use for unraveling ``indices``.
order : {'C', 'F'}, optional
.. versionadded:: 1.6.0
Determines whether the indices should be viewed as indexing in
C (row-major) order or FORTRAN (column-major) order.
Returns
-------
unraveled_coords : tuple of ndarray
Each array in the tuple has the same shape as the ``indices``
array.
See Also
--------
ravel_multi_index
Examples
--------
>>> np.unravel_index([22, 41, 37], (7,6))
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index([31, 41, 13], (7,6), order='F')
(array([3, 6, 6]), array([4, 5, 1]))
>>> np.unravel_index(1621, (6,7,8,9))
(3, 1, 4, 1)
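For a 2-D shape in C order, the unraveled coordinates are simply the quotient and remainder of the flat index by the number of columns. A minimal sketch of that arithmetic, with a round trip through `ravel_multi_index`:

```python
import numpy as np

dims = (7, 6)
flat = np.array([22, 41, 37])

rows, cols = np.unravel_index(flat, dims)

# Row-major (C) order: index = row * n_cols + col.
manual_rows = flat // dims[1]
manual_cols = flat % dims[1]

# ravel_multi_index is the inverse operation.
flat_again = np.ravel_multi_index((rows, cols), dims)
```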
- vdot(...)
- vdot(a, b)
Return the dot product of two vectors.
The vdot(`a`, `b`) function handles complex numbers differently than
dot(`a`, `b`). If the first argument is complex the complex conjugate
of the first argument is used for the calculation of the dot product.
Note that `vdot` handles multidimensional arrays differently than `dot`:
it does not perform a matrix product, but flattens input arguments
to 1-D vectors first. Consequently, it should only be used for vectors.
Parameters
----------
a : array_like
If `a` is complex the complex conjugate is taken before calculation
of the dot product.
b : array_like
Second argument to the dot product.
Returns
-------
output : ndarray
Dot product of `a` and `b`. Can be an int, float, or
complex depending on the types of `a` and `b`.
See Also
--------
dot : Return the dot product without using the complex conjugate of the
first argument.
Examples
--------
>>> a = np.array([1+2j,3+4j])
>>> b = np.array([5+6j,7+8j])
>>> np.vdot(a, b)
(70-8j)
>>> np.vdot(b, a)
(70+8j)
Note that higher-dimensional arrays are flattened!
>>> a = np.array([[1, 4], [5, 6]])
>>> b = np.array([[4, 1], [2, 2]])
>>> np.vdot(a, b)
30
>>> np.vdot(b, a)
30
>>> 1*4 + 4*1 + 5*2 + 6*2
30
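The two behaviours described above (conjugating the first argument and flattening the inputs) can be sketched in terms of `dot`:

```python
import numpy as np

a = np.array([1+2j, 3+4j])
b = np.array([5+6j, 7+8j])

via_vdot = np.vdot(a, b)
via_dot = np.dot(np.conj(a), b)  # conjugate the first argument explicitly

m = np.array([[1, 4], [5, 6]])
n = np.array([[4, 1], [2, 2]])

flattened = np.dot(m.ravel(), n.ravel())  # what vdot computes for 2-D input
matrix_product = np.dot(m, n)             # what dot computes instead
```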
- vonmises(...)
- vonmises(mu, kappa, size=None)
Draw samples from a von Mises distribution.
Samples are drawn from a von Mises distribution with specified mode
(mu) and dispersion (kappa), on the interval [-pi, pi].
The von Mises distribution (also known as the circular normal
distribution) is a continuous probability distribution on the unit
circle. It may be thought of as the circular analogue of the normal
distribution.
Parameters
----------
mu : float
Mode ("center") of the distribution.
kappa : float
Dispersion of the distribution, has to be >=0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : scalar or ndarray
The returned samples, which are in the interval [-pi, pi].
See Also
--------
scipy.stats.distributions.vonmises : probability density function,
distribution, or cumulative density function, etc.
Notes
-----
The probability density for the von Mises distribution is
.. math:: p(x) = \frac{e^{\kappa \cos(x-\mu)}}{2\pi I_0(\kappa)},
where :math:`\mu` is the mode and :math:`\kappa` the dispersion,
and :math:`I_0(\kappa)` is the modified Bessel function of order 0.
The von Mises distribution is named for Richard Edler von Mises, who was
born in Austria-Hungary, in what is now Ukraine. He fled to the United
States in 1939 and became a professor at Harvard. He worked in
probability theory, aerodynamics, fluid mechanics, and philosophy of
science.
References
----------
Abramowitz, M. and Stegun, I. A. (ed.), *Handbook of Mathematical
Functions*, New York: Dover, 1965.
von Mises, R., Mathematical Theory of Probability and Statistics,
New York: Academic Press, 1964.
Examples
--------
Draw samples from the distribution:
>>> mu, kappa = 0.0, 4.0 # mean and dispersion
>>> s = np.random.vonmises(mu, kappa, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
>>> count, bins, ignored = plt.hist(s, 50, normed=True)
>>> x = np.arange(-np.pi, np.pi, 2*np.pi/50.)
>>> y = np.exp(kappa*np.cos(x-mu))/(2*np.pi*sps.i0(kappa))
>>> plt.plot(x, y/max(y), linewidth=2, color='r')
>>> plt.show()
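The density in the Notes can also be checked without SciPy or a plot. This sketch uses `np.i0` for the modified Bessel function of order 0. The helper name `vonmises_pdf`, the seed, and the grid size are assumptions made for illustration:

```python
import numpy as np

def vonmises_pdf(x, mu, kappa):
    """von Mises density on [-pi, pi] with mode mu and dispersion kappa."""
    norm = 2 * np.pi * float(np.i0(kappa))  # I_0 is the modified Bessel function
    return np.exp(kappa * np.cos(x - mu)) / norm

mu, kappa = 0.0, 4.0

# Trapezoid rule over one full period; the result should be close to 1.
x = np.linspace(-np.pi, np.pi, 20001)
y = vonmises_pdf(x, mu, kappa)
area = np.sum(0.5 * (y[1:] + y[:-1])) * (x[1] - x[0])

rng = np.random.RandomState(0)  # arbitrary seed
s = rng.vonmises(mu, kappa, 1000)
```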
- wald(...)
- wald(mean, scale, size=None)
Draw samples from a Wald, or Inverse Gaussian, distribution.
As the scale approaches infinity, the distribution becomes more like a
Gaussian.
Some references claim that the Wald is an Inverse Gaussian with mean=1, but
this is by no means universal.
The Inverse Gaussian distribution was first studied in relationship to
Brownian motion. In 1956 M.C.K. Tweedie used the name Inverse Gaussian
because there is an inverse relationship between the time to cover a unit
distance and distance covered in unit time.
Parameters
----------
mean : scalar
Distribution mean, should be > 0.
scale : scalar
Scale parameter, should be >= 0.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : ndarray or scalar
Drawn sample, all greater than zero.
Notes
-----
The probability density function for the Wald distribution is
.. math:: P(x;mean,scale) = \sqrt{\frac{scale}{2\pi x^3}}
e^{\frac{-scale(x-mean)^2}{2\cdotp mean^2x}}
As noted above the Inverse Gaussian distribution first arose from attempts
to model Brownian Motion. It is also a competitor to the Weibull for use in
reliability modeling and modeling stock returns and interest rate
processes.
References
----------
.. [1] Brighton Webs Ltd., Wald Distribution,
http://www.brighton-webs.co.uk/distributions/wald.asp
.. [2] Chhikara, Raj S., and Folks, J. Leroy, "The Inverse Gaussian
Distribution: Theory, Methodology, and Applications", CRC Press,
1988.
.. [3] Wikipedia, "Wald distribution"
http://en.wikipedia.org/wiki/Wald_distribution
Examples
--------
Draw values from the distribution and plot the histogram:
>>> import matplotlib.pyplot as plt
>>> h = plt.hist(np.random.wald(3, 2, 100000), bins=200, normed=True)
>>> plt.show()
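A numerical check of the density in the Notes, as a sketch: `wald_pdf` is a hypothetical helper, and the integration range ``(0, 200]``, grid spacing, and seed are assumptions chosen so that the neglected tail is negligible for these parameters:

```python
import numpy as np

def wald_pdf(x, mean, scale):
    """Wald (inverse Gaussian) density for x > 0."""
    x = np.asarray(x, dtype=float)
    return (np.sqrt(scale / (2 * np.pi * x**3))
            * np.exp(-scale * (x - mean)**2 / (2 * mean**2 * x)))

mean, scale = 3.0, 2.0

# Riemann sum of the density; should be close to 1.
x = np.linspace(1e-6, 200.0, 400001)
area = np.sum(wald_pdf(x, mean, scale)) * (x[1] - x[0])

rng = np.random.RandomState(0)  # arbitrary seed
s = rng.wald(mean, scale, 200000)
sample_mean = s.mean()
all_positive = bool(np.all(s > 0))
```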
- weibull(...)
- weibull(a, size=None)
Weibull distribution.
Draw samples from a 1-parameter Weibull distribution with the given
shape parameter `a`.
.. math:: X = (-\ln(U))^{1/a}
Here, U is drawn from the uniform distribution over (0,1].
The more common 2-parameter Weibull, including a scale parameter
:math:`\lambda` is just :math:`X = \lambda(-\ln(U))^{1/a}`.
Parameters
----------
a : float
Shape of the distribution.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
See Also
--------
scipy.stats.distributions.weibull_max
scipy.stats.distributions.weibull_min
scipy.stats.distributions.genextreme
gumbel
Notes
-----
The Weibull (or Type III asymptotic extreme value distribution for smallest
values, SEV Type III, or Rosin-Rammler distribution) is one of a class of
Generalized Extreme Value (GEV) distributions used in modeling extreme
value problems. This class includes the Gumbel and Frechet distributions.
The probability density for the Weibull distribution is
.. math:: p(x) = \frac{a}
{\lambda}(\frac{x}{\lambda})^{a-1}e^{-(x/\lambda)^a},
where :math:`a` is the shape and :math:`\lambda` the scale.
The function has its peak (the mode) at
:math:`\lambda(\frac{a-1}{a})^{1/a}`.
When ``a = 1``, the Weibull distribution reduces to the exponential
distribution.
References
----------
.. [1] Waloddi Weibull, Professor, Royal Technical University, Stockholm,
1939 "A Statistical Theory Of The Strength Of Materials",
Ingeniorsvetenskapsakademiens Handlingar Nr 151, 1939,
Generalstabens Litografiska Anstalts Forlag, Stockholm.
.. [2] Waloddi Weibull, 1951 "A Statistical Distribution Function of Wide
Applicability", Journal Of Applied Mechanics ASME Paper.
.. [3] Wikipedia, "Weibull distribution",
http://en.wikipedia.org/wiki/Weibull_distribution
Examples
--------
Draw samples from the distribution:
>>> a = 5. # shape
>>> s = np.random.weibull(a, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> x = np.arange(1,100.)/50.
>>> def weib(x,n,a):
...     return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)
>>> count, bins, ignored = plt.hist(np.random.weibull(5.,1000))
>>> x = np.arange(1,100.)/50.
>>> scale = count.max()/weib(x, 1., 5.).max()
>>> plt.plot(x, weib(x, 1., 5.)*scale)
>>> plt.show()
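The transform ``X = (-ln(U))^(1/a)`` given at the top of this entry can be tried directly. In this sketch, `U` is built as ``1 - random_sample()`` so that it lies in ``(0, 1]``. The comparison value ``Gamma(1 + 1/a)`` is the standard mean of a unit-scale Weibull, an outside fact rather than something stated above, and the seed and sample count are arbitrary:

```python
import math
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
a = 5.0
n = 100000

# random_sample() is in [0, 1), so 1 - random_sample() is in (0, 1].
u = 1.0 - rng.random_sample(n)
x_inverse = (-np.log(u)) ** (1.0 / a)

# Samples from the library routine, for comparison.
x_builtin = rng.weibull(a, n)

# Mean of a Weibull with shape a and scale 1.
expected_mean = math.gamma(1.0 + 1.0 / a)
```

Multiplying either sample by a scale ``lambda`` gives the 2-parameter form.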
- where(...)
- where(condition, [x, y])
Return elements, either from `x` or `y`, depending on `condition`.
If only `condition` is given, return ``condition.nonzero()``.
Parameters
----------
condition : array_like, bool
When True, yield `x`, otherwise yield `y`.
x, y : array_like, optional
Values from which to choose. `x` and `y` need to have the same
shape as `condition`.
Returns
-------
out : ndarray or tuple of ndarrays
If both `x` and `y` are specified, the output array contains
elements of `x` where `condition` is True, and elements from
`y` elsewhere.
If only `condition` is given, return the tuple
``condition.nonzero()``, the indices where `condition` is True.
See Also
--------
nonzero, choose
Notes
-----
If `x` and `y` are given and input arrays are 1-D, `where` is
equivalent to::
[xv if c else yv for (c,xv,yv) in zip(condition,x,y)]
Examples
--------
>>> np.where([[True, False], [True, True]],
... [[1, 2], [3, 4]],
... [[9, 8], [7, 6]])
array([[1, 8],
[3, 4]])
>>> np.where([[0, 1], [1, 0]])
(array([0, 1]), array([1, 0]))
>>> x = np.arange(9.).reshape(3, 3)
>>> np.where( x > 5 )
(array([2, 2, 2]), array([0, 1, 2]))
>>> x[np.where( x > 3.0 )] # Note: result is 1D.
array([ 4., 5., 6., 7., 8.])
>>> np.where(x < 5, x, -1) # Note: broadcasting.
array([[ 0., 1., 2.],
[ 3., 4., -1.],
[-1., -1., -1.]])
Find the indices of elements of `x` that are in `goodvalues`.
>>> goodvalues = [3, 4, 7]
>>> ix = np.in1d(x.ravel(), goodvalues).reshape(x.shape)
>>> ix
array([[False, False, False],
[ True, True, False],
[False, True, False]], dtype=bool)
>>> np.where(ix)
(array([1, 1, 2]), array([0, 1, 1]))
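The 1-D equivalence from the Notes, written out as a short sketch with made-up input values:

```python
import numpy as np

condition = np.array([True, False, True, True])
x = np.array([1, 2, 3, 4])
y = np.array([9, 8, 7, 6])

via_where = np.where(condition, x, y)

# Element-by-element version of the same selection.
via_loop = [xv if c else yv for (c, xv, yv) in zip(condition, x, y)]

# With only a condition, where returns the indices of the nonzero entries.
rows, cols = np.where(np.array([[0, 1], [1, 0]]))
```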
- zeros(...)
- zeros(shape, dtype=float, order='C')
Return a new array of given shape and type, filled with zeros.
Parameters
----------
shape : int or sequence of ints
Shape of the new array, e.g., ``(2, 3)`` or ``2``.
dtype : data-type, optional
The desired data-type for the array, e.g., `numpy.int8`. Default is
`numpy.float64`.
order : {'C', 'F'}, optional
Whether to store multidimensional data in C- or Fortran-contiguous
(row- or column-wise) order in memory.
Returns
-------
out : ndarray
Array of zeros with the given shape, dtype, and order.
See Also
--------
zeros_like : Return an array of zeros with shape and type of input.
ones_like : Return an array of ones with shape and type of input.
empty_like : Return an empty array with shape and type of input.
ones : Return a new array setting values to one.
empty : Return a new uninitialized array.
Examples
--------
>>> np.zeros(5)
array([ 0., 0., 0., 0., 0.])
>>> np.zeros((5,), dtype=int)
array([0, 0, 0, 0, 0])
>>> np.zeros((2, 1))
array([[ 0.],
[ 0.]])
>>> s = (2,2)
>>> np.zeros(s)
array([[ 0., 0.],
[ 0., 0.]])
>>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
array([(0, 0), (0, 0)],
dtype=[('x', '<i4'), ('y', '<i4')])
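The effect of the `order` argument is visible in the array's flags and strides. A small sketch, using a one-byte dtype so the strides are easy to read:

```python
import numpy as np

c = np.zeros((2, 3), dtype=np.int8)             # default, C (row-major)
f = np.zeros((2, 3), dtype=np.int8, order='F')  # Fortran (column-major)

# In C order the last axis is contiguous; in Fortran order the first is.
c_strides = c.strides
f_strides = f.strides
```

Both arrays compare equal element by element; only the memory layout differs.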
- zipf(...)
- zipf(a, size=None)
Draw samples from a Zipf distribution.
Samples are drawn from a Zipf distribution with specified parameter
`a` > 1.
The Zipf distribution (also known as the zeta distribution) is a
discrete probability distribution that satisfies Zipf's law: the
frequency of an item is inversely proportional to its rank in a
frequency table.
Parameters
----------
a : float > 1
Distribution parameter.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. Default is None, in which case a
single value is returned.
Returns
-------
samples : scalar or ndarray
The returned samples are greater than or equal to one.
See Also
--------
scipy.stats.distributions.zipf : probability density function,
distribution, or cumulative density function, etc.
Notes
-----
The probability density for the Zipf distribution is
.. math:: p(x) = \frac{x^{-a}}{\zeta(a)},
where :math:`\zeta` is the Riemann Zeta function.
It is named for the American linguist George Kingsley Zipf, who noted
that the frequency of any word in a sample of a language is inversely
proportional to its rank in the frequency table.
References
----------
Zipf, G. K., *Selected Studies of the Principle of Relative Frequency
in Language*, Cambridge, MA: Harvard Univ. Press, 1932.
Examples
--------
Draw samples from the distribution:
>>> a = 2. # parameter
>>> s = np.random.zipf(a, 1000)
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> import scipy.special as sps
Truncate s values at 50 so plot is interesting
>>> count, bins, ignored = plt.hist(s[s<50], 50, normed=True)
>>> x = np.arange(1., 50.)
>>> y = x**(-a)/sps.zeta(a, 1)
>>> plt.plot(x, y/max(y), linewidth=2, color='r')
>>> plt.show()
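For ``a = 2`` the normalizing constant has the closed form ``zeta(2) = pi**2 / 6``, so the formula in the Notes can be checked without SciPy. This is a sketch: the seed, sample count, and the choice to compare only the first five values are arbitrary:

```python
import numpy as np

rng = np.random.RandomState(0)  # arbitrary seed
a = 2.0
s = rng.zipf(a, 100000)

# Riemann zeta(2) in closed form; valid for a = 2 only.
zeta_2 = np.pi**2 / 6

# Theoretical probabilities p(k) = k**(-a) / zeta(a) for k = 1..5.
k = np.arange(1.0, 6.0)
pmf = k**(-a) / zeta_2

# Observed relative frequencies of the same values.
empirical = np.array([np.mean(s == i) for i in k])
max_dev = np.abs(empirical - pmf).max()
```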
|