Return to Question

Tweeted twitter.com/#!/StackDBAs/status/575041420232409089

occurred Mar 9, 2015 at 21:12

improved formatting

Source Link

edited Sep 8, 2012 at 11:37

user1822

Set up two nodes:
Set up two nodes:
```
 initdb -D $PGDATA initdb -D $STANDBY
```

In the master node create a user for replication. I do that in pgAdmin (it does have superuser privileges)

In the master node in pg_hba.conf add the part that allows standby to connect: host replication repuser 127.0.0.1/0 md5

In the master node in postgresql.conf set:

max_wal_senders = 1 archive_mode = on archive_command = 'cp %p ~/postgresql/backup/archivedir/%f' wal_level = archive wal_keep_segments = 32

Start the master node and do the base backup:

psql -d dellstore2 -c "SELECT pg_start_backup('backup for replication', true)" rsync -av ${PGDATA}/ $STANDBY --exclude postmaster.pid psql -d dellstore2 -c "select pg_stop_backup()"

pg_stop_backup says that everything is fine, all the WAL files were archived

In the standby (data2) node I create recovery.conf with:

standby_mode = 'on' primary_conninfo = 'host=127.0.0.1 port=5432 user=repuser password=haslo' trigger_file = '/home/michau/postgresql/replication.trigger' restore_command = 'cp /home/michau/postgresql/backup/archivedir/%f "%p"'

Start the master node, then start the standby node - replication should start and standby should catch up with the master. That was exactly what happened the first time. Now when I start the standby I get: "Address already in use" error. Of course both standby and master have the same port specified in postgresql.conf (they have exactly the same postgresql.conf files). If I change the port in standby to let's say 5433 then I get:

 LOG: database system was shut down in recovery at 2012-06-12 19:48:01 CEST LOG: entering standby mode cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory LOG: consistent recovery state reached at 0/7000070 LOG: record with zero length at 0/7000070 cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory LOG: streaming replication successfully connected to primary LOG: redo starts at 0/7000070

initdb -D $PGDATA

initdb -D $STANDBY

In the master node create a user for replication. I do that in pgAdmin (it does have superuser privileges)

In the master node in pg_hba.conf add the part that allows standby to connect:

host replication repuser 127.0.0.1/0 md5

In the master node in postgresql.conf set:

max_wal_senders = 1

archive_mode = on

archive_command = 'cp %p ~/postgresql/backup/archivedir/%f'

wal_level = archive

wal_keep_segments = 32

Start the master node and do the base backup:

psql -d dellstore2 -c "SELECT pg_start_backup('backup for replication', true)"

rsync -av ${PGDATA}/ $STANDBY --exclude postmaster.pid

psql -d dellstore2 -c "select pg_stop_backup()"

pg_stop_backup says that everything is fine, all the WAL files were archived

In the standby (data2) node I create recovery.conf with:

standby_mode = 'on'

primary_conninfo = 'host=127.0.0.1 port=5432 user=repuser password=haslo'

trigger_file = '/home/michau/postgresql/replication.trigger'

restore_command = 'cp /home/michau/postgresql/backup/archivedir/%f "%p"'

Start the master node, then start the standby node - replication should start and standby should catch up with the master. That was exactly what happened the first time. Now when I start the standby I get: "Address already in use" error. Of course both standby and master have the same port specified in postgresql.conf (they have exactly the same postgresql.conf files). If I change the port in standby to let's say 5433 then I get:

LOG: database system was shut down in recovery at 2012-06-12 19:48:01 CEST

LOG: entering standby mode

cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory

LOG: consistent recovery state reached at 0/7000070

LOG: record with zero length at 0/7000070

cp: cannot stat /home/michau/postgresql/backup/archivedir/000000010000000000000007: No such file or directory

LOG: streaming replication successfully connected to primary

LOG: redo starts at 0/7000070

And it just hangs here. Running ps -ef | grep postgresql yields:

michau 2491 1898 0 19:46 pts/0 00:00:00 postgres -D /home/michau/postgresql/9.1/data

michau 2493 2491 0 19:46 ? 00:00:01 postgres: writer process

michau 2494 2491 0 19:46 ? 00:00:00 postgres: wal writer process

michau 2495 2491 0 19:46 ? 00:00:00 postgres: autovacuum launcher process

michau 2496 2491 0 19:46 ? 00:00:00 postgres: archiver process last was 000000010000000000000008

michau 2497 2491 0 19:46 ? 00:00:00 postgres: stats collector process

michau 2571 2214 0 19:49 pts/1 00:00:00 postgres -D /home/michau/postgresql/9.1/data2

michau 2572 2571 0 19:49 ? 00:00:01 postgres: startup process recovering 000000010000000000000009

michau 2575 2571 0 19:49 ? 00:00:01 postgres: writer process

michau 2578 2571 0 19:49 ? 00:00:02 postgres: wal receiver process streaming 0/99782DC

michau 2579 2491 0 19:49 ? 00:00:00 postgres: wal sender process repuser 127.0.0.1(42142) streaming 0/99782DC

michau 2586 2491 0 19:51 ? 00:00:00 postgres: michau postgres ::1(49941) idle

michau 2587 2491 0 19:51 ? 00:00:01 postgres: michau dellstore2 ::1(49942) idle

 michau 2491 1898 0 19:46 pts/0 00:00:00 postgres -D /home/michau/postgresql/9.1/data michau 2493 2491 0 19:46 ? 00:00:01 postgres: writer process michau 2494 2491 0 19:46 ? 00:00:00 postgres: wal writer process michau 2495 2491 0 19:46 ? 00:00:00 postgres: autovacuum launcher process michau 2496 2491 0 19:46 ? 00:00:00 postgres: archiver process last was 000000010000000000000008 michau 2497 2491 0 19:46 ? 00:00:00 postgres: stats collector process michau 2571 2214 0 19:49 pts/1 00:00:00 postgres -D /home/michau/postgresql/9.1/data2 michau 2572 2571 0 19:49 ? 00:00:01 postgres: startup process recovering 000000010000000000000009 michau 2575 2571 0 19:49 ? 00:00:01 postgres: writer process michau 2578 2571 0 19:49 ? 00:00:02 postgres: wal receiver process streaming 0/99782DC michau 2579 2491 0 19:49 ? 00:00:00 postgres: wal sender process repuser 127.0.0.1(42142) streaming 0/99782DC michau 2586 2491 0 19:51 ? 00:00:00 postgres: michau postgres ::1(49941) idle michau 2587 2491 0 19:51 ? 00:00:01 postgres: michau dellstore2 ::1(49942) idle

The recovering 0000000010000009 where changing for a while, but for half an hour it doesn't anymore.

Set up two nodes:

initdb -D $PGDATA

initdb -D $STANDBY

In the master node create a user for replication. I do that in pgAdmin (it does have superuser privileges)

In the master node in pg_hba.conf add the part that allows standby to connect:

host replication repuser 127.0.0.1/0 md5

In the master node in postgresql.conf set:

max_wal_senders = 1

archive_mode = on

archive_command = 'cp %p ~/postgresql/backup/archivedir/%f'

wal_level = archive

wal_keep_segments = 32

Start the master node and do the base backup:

psql -d dellstore2 -c "SELECT pg_start_backup('backup for replication', true)"

rsync -av ${PGDATA}/ $STANDBY --exclude postmaster.pid

psql -d dellstore2 -c "select pg_stop_backup()"

pg_stop_backup says that everything is fine, all the WAL files were archived

In the standby (data2) node I create recovery.conf with:

standby_mode = 'on'

primary_conninfo = 'host=127.0.0.1 port=5432 user=repuser password=haslo'

trigger_file = '/home/michau/postgresql/replication.trigger'

restore_command = 'cp /home/michau/postgresql/backup/archivedir/%f "%p"'

Start the master node, then start the standby node - replication should start and standby should catch up with the master. That was exactly what happened the first time. Now when I start the standby I get: "Address already in use" error. Of course both standby and master have the same port specified in postgresql.conf (they have exactly the same postgresql.conf files). If I change the port in standby to let's say 5433 then I get: