鼎鼎小筑

PG热迁移问题解决过程

2017/11/21 Share

说明

  1. 迁移过程参考:http://www.jianshu.com/p/2d07339774c0
  2. 遇到问题:当配置完从库,然后启动时发生如下错误:

    “FATAL: requested WAL segment 0000000800002A0000000000 has already been removed”

原因

  1. PostgreSQL的流复制的原理是通过传递主机(master)上的wal日志信息到备机(slave)然后恢复,这中间就有一个潜在的问题,如果主机端比较忙,wal日志被覆盖了,而从机可能因为网络或者其他原因没有接收到该日志,就会造成主从不一致,这时主从就断了,并且在主机端留下很多的流复制错误信息;通常这种情况不影响主库的使用,但如果有在备机上做一些查询,就需要注意了。出现了这个问题,需要到归档的日志文件里把文件拷贝到备机的pg_xlog下面。 如果主机的wal日志已经循环覆盖了,而且没有做wal的归档,那出现这种情况只能重新做流复制了;
  2. 在重新做流复制的时候,我也调大了postgresql.conf中wal_keep_segments的值(wal_keep_segments=16)
  • 官方对wal_keep_segments的解释:

    Specifies the minimum number of past log file segments kept in the pg_xlog directory, in case a standby server needs to fetch them for streaming replication. Each segment is normally 16 megabytes. If a standby server connected to the sending server falls behind by more than wal_keep_segments segments, the sending server might remove a WAL segment still needed by the standby, in which case the replication connection will be terminated. Downstream connections will also eventually fail as a result. (However, the standby server can recover by fetching the segment from archive, if WAL archiving is in use.)

    This sets only the minimum number of segments retained in pg_xlog; the system might need to retain more segments for WAL archival or to recover from a checkpoint. If wal_keep_segments is zero (the default), the system doesn’t keep any extra segments for standby purposes, so the number of old WAL segments available to standby servers is a function of the location of the previous checkpoint and status of WAL archiving. This parameter can only be set in the postgresql.conf file or on the server command line.

CATALOG
  1. 1. 说明
  2. 2. 原因