I have a table with two fields:
- id(UUID) that is primary Key and
- description (
var255)
I want to insert random data with SQL sentence. I would like that description would be something random.
PS: I am using PostgreSQL.
I dont know exactly if this fits the requirement for a "random description", and it's not clear if you want to generate the full data: but, for example, this generates 10 records with consecutive ids and random texts:
test=# SELECT generate_series(1,10) AS id, md5(random()::text) AS descr; id | descr ----+---------------------------------- 1 | 65c141ee1fdeb269d2e393cb1d3e1c09 2 | 269638b9061149e9228d1b2718cb035e 3 | 020bce01ba6a6623702c4da1bc6d556e 4 | 18fad4813efe3dcdb388d7d8c4b6d3b4 5 | a7859b3bcf7ff11f921ceef58dc1e5b5 6 | 63691d4a20f7f23843503349c32aa08c 7 | ca317278d40f2f3ac81224f6996d1c57 8 | bb4a284e1c53775a02ebd6ec91bbb847 9 | b444b5ea7966cd76174a618ec0bb9901 10 | 800495c53976f60641fb4d486be61dc6 (10 rows) SELECT generate_series(1,10) AS id, md5(random()::text) AS descr;?The following worked for me:
create table t_random as select s, md5(random()::text) from generate_Series(1,5) s; Here it is a more elegant way using the latest features. I will use the Unix dictionary (/usr/share/dict/words) and copy it into my PostgreSQL data:
cp /usr/share/dict/words data/pg95/words.list Then, you can easily create a ton of no sense description BUT searchable using dictionary words with the following steps:
1) Create table and function. getNArrayS gets all the elements in an array and teh number of times it needs to concatenate.
CREATE TABLE randomTable(id serial PRIMARY KEY, description text); CREATE OR REPLACE FUNCTION getNArrayS(el text[], count int) RETURNS text AS $$ SELECT string_agg(el[random()*(array_length(el,1)-1)+1], ' ') FROM generate_series(1,count) g(i) $$ VOLATILE LANGUAGE SQL; Once you have all in place, run the insert using CTE:
WITH t(ray) AS( SELECT (string_to_array(pg_read_file('words.list')::text,E'\n')) ) INSERT INTO randomTable(description) SELECT getNArrayS(T.ray, 3) FROM T, generate_series(1,10000); And now, select as usual:
postgres=# select * from randomtable limit 3; id | description ----+--------------------------------------------- 1 | ultracentenarian splenodiagnosis manurially 2 | insequent monopolarity funipendulous 3 | ruminate geodic unconcludable (3 rows) I assume sentance == statement? You could use perl or plperl as perl has some good random data generators. Check out perl CPAN module Data::Random to start.
Here's a sample of a perl script to generate some different random stuff taken from CPAN.
use Data::Random qw(:all); my @random_words = rand_words( size => 10 ); my @random_chars = rand_chars( set => 'all', min => 5, max => 8 ); my @random_set = rand_set( set => \@set, size => 5 ); my $random_enum = rand_enum( set => \@set ); my $random_date = rand_date(); my $random_time = rand_time(); my $random_datetime = rand_datetime(); open(FILE, ">rand_image.png") or die $!; binmode(FILE); print FILE rand_image( bgcolor => [0, 0, 0] ); close(FILE); Generating random strings with the specified characters and length
The fact that md5(random()::text) only has hex digits and a fixed length saddens me a bit, which made me look for a more elegant function.
This question has some good answers: Generate random String in PostgreSQL I think I'll end up going with the following function:
CREATE OR REPLACE FUNCTION random_string(int) RETURNS TEXT as $$ select string_agg(substr(characters, (random() * length(characters) + 1)::integer, 1), '') as random_word from (values('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789- ')) as symbols(characters) join generate_series(1, $1) on 1 = 1 $$ language sql; which can then be used as:
CREATE TABLE "mytable" ("i" INTEGER, "j" INTEGER, "s" TEXT, "t" TEXT); INSERT INTO "mytable" SELECT i, i*2, random_string(10), random_string(20) FROM generate_series(1, 10) as s(i); to produce data like:
i | j | s | t ----+----+------------+---------------------- 1 | 2 | VgXg6 9c9T | 3SbRe0CmyDzL6tbIi s8 2 | 4 | 6Nqx ynI7 | nQhOZrnXI9HdwBlNzFB 3 | 6 | FXGl4Wn4va | RZ2zdOLFqErJ44MhoQVm 4 | 8 | 6i IrvUVE | bY7Gta1 c8tVdDpMTiJ2 5 | 10 | aU-30dwGVl | mzZPchylUL6FgQ04N Fy 6 | 12 | yQMnXFPIt | 1 fMb7wQE-gqCzkFoVD 7 | 14 | Okj7XypLbQ | TosYftFbbRSD4GnQOELV 8 | 16 | 1w P3zUW | kY1c kU0mJNURCCw1na 9 | 18 | vYcFmC8W82 | R6Omwp KcL H1MNJKzR 10 | 20 | v0I fPg32 | 8 cjc0q8DCOczxNNvF4N Performance-wise it is not amazing however, with about 12s for 1 million rows. For comparison:
INSERT INTO "tmp" (i, j, s, t) SELECT i, i*2, md5(random()::text), md5(random()::text) took 1.8 s, and:
INSERT INTO "tmp" (i, j, s, t) SELECT i, i*2, 'a', 'a' took 0.5 s.
Generate a CSV with an external tool and import it with \copy FROM STDIN
Since every pure PostgreSQL option is a bit awkward and/or slow, perhaps the best option is to just use an external tool to generate the data, and then feed it to PostgreSQL as a CSV from stdin.
For example, given the table:
CREATE TABLE "mytable" ("i" INTEGER, "j" INTEGER, "s" TEXT, "t" TEXT); a potentially slow but convenient option would be to just use a Python script such as:
generate_data.py
import csv import random import string import sys CHARS = string.ascii_letters + string.digits + ' ' def rands(n): return ''.join(random.choice(CHARS) for _ in range(n)) if len(sys.argv) > 1: n = int(sys.argv[1]) else: n = 10 writer = csv.writer(sys.stdout) for i in range(n): writer.writerow((i, random.randrange(100), rands(10), rands(20))) and then use it as:
python generate_data.py 10 | psql mydb -c '\copy "mytable" FROM STDIN' Related:
Tested on Ubuntu 24.04, PostgreSQL 16.6, Python 3.12.3, Lenovo ThinkPad P14s.