SELECT id, random()
FROM generate_series(0, 1000000) g (id);
There seems to be an equivalent in SQLite: https://sqlite.org/series.html

https://www.postgresql.org/docs/current/functions-srf.html

([]x?x:1000000)

Which gives a table of a million rows like:

x
------
877095
265141
935540
49015
...

?~1000000

or, if you don't need elements to be unique:

?$~1000000

Though it's just an array, not a persistent table - I don't know much about Jd :(

xxd -ps -c 16 -l $bignum < /dev/urandom | (echo '\copy random_data from stdin'; cat) | psql
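For comparison, the xxd pipeline hex-encodes raw bytes from /dev/urandom, 16 per row; a rough Python equivalent might look like this (`random_hex_keys` is a made-up helper name, not part of any library):

```python
import os

def random_hex_keys(count, width=16):
    """Yield `count` random keys, each `width` random bytes rendered
    as hex - similar to what `xxd -ps -c 16` does with /dev/urandom."""
    for _ in range(count):
        yield os.urandom(width).hex()

keys = list(random_hex_keys(5))  # each key is 32 lowercase hex characters
```

These could be fed to `\copy` the same way as the shell version.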
Depending on how you want your random keys formatted.

shuf -i 1-$bignum

or for random numbers with replacement:

shuf -i 1-$bignum -r -n $bignum

These give you a sample of random integers. `shuf` is part of GNU coreutils, so present in most Linux installs (but not on macOS).

WITH RECURSIVE
t AS (
SELECT 0 id
UNION ALL
SELECT id + 1
FROM t
WHERE id < 1000000
)
SELECT id, random() FROM t;
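For what it's worth, this recursive CTE runs unchanged in SQLite as well; a quick sketch via Python's bundled sqlite3 module, with the bound cut to 1000 so it runs fast (note SQLite's random() returns a signed 64-bit integer rather than PostgreSQL's 0-1 float):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same shape as the recursive CTE above; the endpoint is inclusive,
# so a bound of 1000 yields 1001 rows (0 through 1000).
rows = conn.execute("""
    WITH RECURSIVE t AS (
        SELECT 0 AS id
        UNION ALL
        SELECT id + 1 FROM t WHERE id < 1000
    )
    SELECT id, random() FROM t
""").fetchall()
print(len(rows))  # 1001
```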
It returns 1000001 rows, but so does cribwi's.

SELECT n FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS d(n)
for instance:

SELECT seq = units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000
FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS units(n)
CROSS JOIN (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS tens(n)
CROSS JOIN (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS hundreds(n)
CROSS JOIN (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS thousands(n)
WHERE units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000 BETWEEN 1 AND 1337
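As a sketch, the same cross-join trick works in SQLite too, though it lacks the T-SQL `seq =` alias form and the `AS d(n)` derived-column list, so this version (run via Python's sqlite3) uses standard `AS` aliasing and builds the digit table as a UNION chain:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no `AS d(n)` syntax, so the ten-row digit table is a UNION chain.
digits = "(SELECT 0 AS n " + " ".join(
    f"UNION ALL SELECT {i}" for i in range(1, 10)) + ")"
# Four crossed digit tables cover 0..9999; the WHERE trims that to 1..1337.
rows = conn.execute(f"""
    SELECT units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000 AS seq
    FROM {digits} AS units
    CROSS JOIN {digits} AS tens
    CROSS JOIN {digits} AS hundreds
    CROSS JOIN {digits} AS thousands
    WHERE units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000 BETWEEN 1 AND 1337
    ORDER BY seq
""").fetchall()
print(len(rows))  # 1337
```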
You can make it neater still by making the sub-table with a CTE:

WITH Digits AS (SELECT n FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS d(n))
SELECT seq = units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000
FROM Digits AS units
CROSS JOIN Digits AS tens
CROSS JOIN Digits AS hundreds
CROSS JOIN Digits AS thousands
WHERE units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000 BETWEEN 1 AND 1337
or

WITH Digits AS (SELECT n FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS d(n))
, Thousands AS (SELECT seq = units.n + tens.n*10 + hundreds.n*100 + thousands.n*1000
FROM Digits AS units
CROSS JOIN Digits AS tens
CROSS JOIN Digits AS hundreds
CROSS JOIN Digits AS thousands
)
SELECT seq FROM Thousands
WHERE seq BETWEEN 1 AND 1337
ORDER BY seq
though at that point you probably have the recursive option available, so you'd need to test to see which is more efficient. As a further hack you can combine this and the recursive CTE to extend the number of rows you can return when recursion limits would otherwise be a problem.

You can also use simple window functions to make a sequence from a system table that you know will have a few thousand rows:
SELECT n=ROW_NUMBER() OVER (ORDER BY object_id) FROM sys.all_columns
Or if you just need a few thousand rows to hang randomly generated data from, you don't even need the function.

All a bit hacky compared to some DBs that have sequence support, but useful for generating test data or numbers tables where that isn't supported.
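sys.all_columns is SQL Server-specific, but ROW_NUMBER() itself is widely supported (SQLite since 3.25, for instance). A sketch via Python's sqlite3, using a made-up throwaway table as a stand-in for a system table you know has enough rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in table: in SQL Server you'd number an existing big catalog view
# like sys.all_columns instead of creating anything.
conn.execute("CREATE TABLE stuff (name TEXT)")
conn.executemany("INSERT INTO stuff VALUES (?)", [("a",), ("b",), ("c",)])
rows = conn.execute(
    "SELECT ROW_NUMBER() OVER (ORDER BY name) AS n, name FROM stuff"
).fetchall()
print(rows)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```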
If you need to create test data with complex business logic, referential integrity and constraints, we've been working on a declarative data generator that is built exactly for this: https://github.com/openquery-io/synth
Needs a little Groovy but very convenient for generating random (or non-random) data.
While this looks like a good way to generate simple data, practical applications are more involved.
E.g. I have a table structure but not a lot of rows in it, so I can use this to get a lot of rows in and check how fast queries get processed?
You could create a new database and restore a production backup instead of using fake data but that might not be allowed or require some kind of approval due to rules protecting the privacy of customers or employees.
The other is for populating "numbers tables" which can be very useful in reporting contexts and/or dealing with certain gaps/islands type problems.
You can even use the techniques dynamically rather than a stored numbers table, if you need it in a DB that doesn't have one and you don't have schema permissions to create one, though this is generally less efficient.
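As a sketch of the gaps use case: the numbers can be generated dynamically with a recursive CTE (no schema permissions needed) and anti-joined against the data to list the missing ids. Table and column names here (orders, id) are invented for illustration, shown via Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders VALUES (?)", [(1,), (2,), (4,), (7,)])
# Dynamic numbers "table" spanning the data's id range,
# anti-joined against the data to surface the gaps.
gaps = [r[0] for r in conn.execute("""
    WITH RECURSIVE nums(n) AS (
        SELECT (SELECT MIN(id) FROM orders)
        UNION ALL
        SELECT n + 1 FROM nums WHERE n < (SELECT MAX(id) FROM orders)
    )
    SELECT n FROM nums
    LEFT JOIN orders ON orders.id = nums.n
    WHERE orders.id IS NULL
    ORDER BY n
""")]
print(gaps)  # [3, 5, 6]
```

A stored numbers table would replace the CTE with a plain table scan, which is generally cheaper.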
select top 1000000 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) from sys.objects a, sys.objects b
or you can use INFORMATION_SCHEMA, which is more portable across different RDBMS engines.
A much more performant and naturally relational way of generating data than looping recursively.
Gotta select row number on some big enough table!
A table of which DBMSes support this would be useful.
[1]: http://dcx.sap.com/index.html#1001/en/dbrfen10/rf-select-sta...
[2]: http://dcx.sap.com/index.html#sqla170/en/html/8190aea36ce210...
Microsoft SQL Server, Firebird 2.1, PostgreSQL, SQLite, IBM Informix, CUBRID, MariaDB and MySQL
spark.sparkContext.range(1_000_000, numSlices=200).take(20)

select dbms_random.value from dual connect by level < 1000001;
edit: 1,000,001 as level starts with 1